Does scaling matter?
In most situations, scaling is really not all that important. The overall shape of the spectrum matters much more than the absolute scale.
What are the conventions?
But if you really are worried about it, there are several different conventions from which you can choose (see definitions below):
I generally use either option #1 or option #2 depending on my mood and whether it's raining outside.
Here I am assuming that I have a discrete-time signal x represented as an M x N matrix, where M is the number of samples and N is the number of channels.
[M,N] = size(x);
Furthermore, I am assuming that the sampling rate is Fs and that I have defined the time increment as
dt = 1/Fs;
and the frequency increment as:
dF = Fs/M;
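As a minimal sketch of this setup, the snippet below builds the time and frequency axes that go with dt and dF. The values of Fs, M, and N and the random test signal are assumptions chosen only for illustration.

```matlab
% Hypothetical example values; Fs and the test signal are assumptions.
Fs = 1000;                 % sampling rate in Hz (assumed)
x  = randn(8, 2);          % placeholder signal: 8 samples, 2 channels

[M, N] = size(x);          % M samples, N channels
dt = 1/Fs;                 % time increment in seconds
dF = Fs/M;                 % frequency increment in Hz

t = (0:M-1)' * dt;         % time axis, starting at 0
f = (0:M-1)' * dF;         % frequency axis of the raw fft output, 0 to Fs - dF

% The two increments are tied together: dt * dF * M equals 1.
check = dt * dF * M;
```

With x stored as M x N, fft(x) transforms each column (channel) separately, so no loop over channels is needed.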
What do all these conventions have in common?
All of these conventions have one thing in common: the product of the two scaling factors (the one applied to the output of fft and the one applied to the output of ifft) is always 1. Please note that the ifft function in MATLAB already includes a scaling factor of 1/M as part of its computation, so the overall round-trip scaling, counting that built-in factor, is 1/M (as it should be).
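The round-trip property can be checked numerically. The scaling pairs in this sketch are commonly used ones and are listed here as assumptions; they are not necessarily the numbered options referred to above, and the signal and sizes are hypothetical.

```matlab
% Sketch only: commonly used forward/inverse scaling pairs (assumed, not
% necessarily the numbered options discussed in the text).
Fs = 100;  M = 16;  N = 3;             % hypothetical values
x  = randn(M, N);                      % placeholder signal
dt = 1/Fs;

fwdScale = [dt, 1/M, 1/sqrt(M), 1];    % factor applied to fft(x)
invScale = [Fs, M,   sqrt(M),   1];    % factor applied to ifft(X)

maxErr = zeros(size(fwdScale));
for k = 1:numel(fwdScale)
    X = fwdScale(k) * fft(x);          % forward transform, column by column
    y = invScale(k) * ifft(X);         % ifft already divides by M internally
    maxErr(k) = max(abs(y(:) - x(:))); % round-trip error for this pair
end

products = fwdScale .* invScale;       % each entry is 1 up to rounding
```

Every pair recovers x to within rounding error, because the two user-applied factors cancel and the 1/M built into ifft completes the inverse transform.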
You are right about scaling being unimportant if only the shape of the spectrum is desired. However, if it is necessary that the amplitudes in the frequency spectrum be correct, then there is only one option for scaling - your option #1. Parseval's theorem requires the energy in the time domain to equal the energy in the frequency domain. With the time-domain energy computed as a sum weighted by dt and the frequency-domain energy as a sum weighted by df, this holds when the fft output is multiplied by dt. The example below demonstrates this:
> N = 8;
> dt = 0.1;
> df = 1/(dt*N)
> a = randn(N,1)
0.70154
-2.0518
-0.35385
-0.82359
-1.5771
0.50797
0.28198
0.03348
> b = fft(a)*dt
-0.32813
0.10746 + 0.30519i
-0.080365 + 0.075374i
0.34826 + 0.17802i
0.13866
0.34826 - 0.17802i
-0.080365 - 0.075374i
0.10746 - 0.30519i
> energy_a = sum(a.*conj(a) * dt) % Not necessary to use conj here
> energy_b = sum(b.*conj(b) * df) % Necessary to use conj here
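For anyone who wants to reproduce the check, here is a self-contained sketch that uses the rounded values of a printed above instead of a fresh randn draw. The variable names relErr, b2, and energy_b2 are introduced here for illustration, and the comparison with 1/N scaling is an added assumption about what an alternative convention would look like.

```matlab
% Values of a copied from the session above (rounded as printed).
a  = [0.70154; -2.0518; -0.35385; -0.82359; -1.5771; 0.50797; 0.28198; 0.03348];
N  = numel(a);
dt = 0.1;
df = 1/(dt*N);

b = fft(a) * dt;                      % fft scaled by the time increment

energy_a = sum(abs(a).^2) * dt;       % time-domain energy
energy_b = sum(abs(b).^2) * df;       % frequency-domain energy
relErr   = abs(energy_a - energy_b) / energy_a;   % rounding-level difference

% For comparison (assumed alternative): scaling by 1/N instead of dt does
% not reproduce energy_a under the same df-weighted sum.
b2        = fft(a) / N;
energy_b2 = sum(abs(b2).^2) * df;
```

Using abs(...).^2 gives the same result as multiplying by conj and works for both the real vector a and the complex vector b.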