Tomek wrote:
> I've tried to run it on this sequence
> TZ = 100;
> v = [repmat([ones(1,7) zeros(1,130)],1,10) ones(1,7) zeros(1,TZ)];
> and the computed runlength is extremely dependent on the number of
> leading/trailing zeros
You are right; I have done some testing, and finding the peak fft
coefficient is *not* a good way of detecting the period of a binary
signal. For example, fft([0 ones(1,63)]) is completely flat (and
real-valued) after the DC component: every coefficient is -1, i.e.
magnitude 1, which would mean an equal-weight sum of all harmonics,
cos(x)+cos(2*x)+cos(3*x)+cos(4*x) and so on. For [ones(1,63) 0], on the
other hand, the real part of the coefficients is similar to a sine
curve, peaking in the middle, starting from negative coefficients and
going to positive coefficients. As humans, we can recognize that however
we want to define the period of such a vector, by symmetry it should be
the same in the two cases.
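To make the comparison concrete, here is a minimal sketch that checks both
vectors numerically. The variable names are mine, and the closed-form
expressions in the comments come from working the DFT sum by hand, so
treat them as a cross-check rather than anything authoritative.

```matlab
% Compare the FFTs of the two 64-sample vectors discussed above.
a = [0 ones(1,63)];
b = [ones(1,63) 0];
A = fft(a);
B = fft(b);
k = 1:63;                          % non-DC frequency numbers

% A(2:end) is flat: every non-DC coefficient equals -1 up to rounding.
maxDevA = max(abs(A(2:end) - (-1)));

% B(2:end) equals -exp(2i*pi*k/64), so its real part is -cos(2*pi*k/64):
% close to -1 at the first bin, +1 at the middle bin, then back down.
maxDevB = max(abs(B(2:end) - (-exp(2i*pi*k/64))));

% b is a circular shift of a, so the magnitudes of the two FFTs agree.
maxMagDiff = max(abs(abs(A) - abs(B)));

fprintf('%g %g %g\n', maxDevA, maxDevB, maxMagDiff);
```

All three printed deviations should be at rounding-error level. Since the
magnitudes are flat in both cases, any "peak" found after the DC term is
decided by rounding noise rather than by the signal.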
Each zero padded onto the end of the data adds one more sine convolution
(of the fft coefficients). And repeating the signal, fft([x x]), can have
the effect of moving the detected peak from location n to location
2*n+1... Oh wait, that one actually makes sense: you doubled the length,
so length / location would still be either the same or a hair more than
before (increased accuracy).
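For the repetition case, a small sketch of what happens with an exact
copy and no padding. The test vector x is an arbitrary choice of mine. In
this idealized case the coefficients of fft(x) reappear, doubled, at every
other bin of fft([x x]), and the bins in between are zero. With MATLAB's
1-based indexing that puts index n at index 2*n-1, so the exact offset you
observe depends on the indexing convention and on any padding.

```matlab
% Effect of repeating a signal on its FFT (exact copy, no zero padding).
x = [1 1 0 1 0 0 0 0 0 1 0 0];     % arbitrary illustrative binary vector
X = fft(x);
Y = fft([x x]);

% Bins 1,3,5,... of Y (1-based) hold 2*X; bins 2,4,6,... are zero.
keptBinErr = max(abs(Y(1:2:end) - 2*X));
emptyBinMax = max(abs(Y(2:2:end)));

fprintf('%g %g\n', keptBinErr, emptyBinMax);
```

Both printed values should be at rounding-error level, which is why
length / location comes out essentially unchanged after doubling.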
Anyhow, I do not have any FFT-based suggestions on how to proceed.
Tideman hinted at other methods; going along with what he was saying,
one approach might be to histogram the detected gap lengths. If you get
a notably multimodal distribution in the histogram, that would be a
reflection of internal transitions within the periodic binary signal...
e.g., [1 1 0 1 0 0 0 0 0] repeated (without error) many times would have
as many counts in the "gap length 1" bin as in the "gap length 5" bin.
It might be a bit tricky to reconstruct the periodic signal from the bin
counts... for example, 0 1 0 1 0 0 0 0 1 would have twice as many zero
gaps of length 1 as of length 4. I guess you could look for the longest
runs of zeros that have significant counts, and then look for other bins
whose counts are close to multiples of that count... but that won't tell
you the order, and you will need to adapt this (perhaps with a second
histogram counting runs of 1's) to figure out the likely total signal
length.
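Here is a rough sketch of that run-length histogram idea, using the first
example pattern. The variable names are mine, and the period estimate at
the end assumes that the longest zero gap occurs exactly once per period,
which holds for this pattern but need not hold for real data.

```matlab
% One period from the example above, repeated 20 times without error.
p = [1 1 0 1 0 0 0 0 0];
v = repmat(p, 1, 20);

% Pad both ends with the opposite value so the first and last runs are
% delimited too (in real data those edge runs may be truncated).
padded = [~v(1), v, ~v(end)];
edges  = find(diff(padded) ~= 0);  % edges(i) = start index of run i in v
runLen = diff(edges);              % length of each run
runVal = v(edges(1:end-1));        % value (0 or 1) of each run

% Histograms: element L holds the number of runs of length L.
zeroCounts = accumarray(runLen(runVal == 0).', 1).';
oneCounts  = accumarray(runLen(runVal == 1).', 1).';

% ASSUMPTION: the longest zero gap appears once per period, so its count
% approximates the number of periods in the data.
longestGap = find(zeroCounts > 0, 1, 'last');
nPeriods   = zeroCounts(longestGap);

% Total samples covered by all runs, divided by the number of periods.
totalZeros = sum((1:numel(zeroCounts)) .* zeroCounts);
totalOnes  = sum((1:numel(oneCounts))  .* oneCounts);
estPeriod  = (totalZeros + totalOnes) / nPeriods;

disp(zeroCounts); disp(oneCounts); disp(estPeriod);
```

With noisy data the counts will not be exact multiples of each other, so
"significant count" would need some threshold, and as noted this gives
the period length but not the order of the runs within it.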
This signal... when there are varying runs of 1's, is there a known (or
reasonably estimated) probability distribution for the number of 1's?
Poisson perhaps? Or does it represent something like "work load varying
over time in a way whose prediction is beyond the scope of this
project"? Or is it just random measurement error, or a relatively
accurate measurement of a process that is prone to minor errors?
