version 1.0.0.0 (2.05 KB) by
R P

Optimization of parameters of the sine function to time series

Syntax:

[param]=sine_fit(x,y)

which is the same that

[param]=sine_fit(x,y,[],[],[]) % no fixed_params, automatic initial_params

[param]=sine_fit(x,y,fixed_params) % automatic initial_params

[param]=sine_fit(x,y,[],initial_params) % use it when the estimation is poor

[param]=sine_fit(x,y,fixed_params,initial_params,plot_flag)

param = [offset, amplitude, phaseshift, frequency]

if fixed_params=[NaN, NaN , NaN , NaN] % or fixed_params=[]

optimization of offset, amplitude, phase shift and frequency (default)

if fixed_params=[NaN, 1 , NaN , 1/(2*pi)]

optimization of offset and phase shift of a sine of amplitude=1 and frequency=1/(2*pi)

if fixed_params=[0, NaN , NaN , NaN]

optimization of amplitude, phase shift and frequency of a sine of offset=0

Example:

%% generate data vectors (x and y)

fsine = @(param,timeval) param(1) + param(2) * sin( param(3) + 2*pi*param(4)*timeval );

param=[0 1 0 1/(2*pi)]; % offset, amplitude, phaseshift, frequency

timevec=0:0.1:10*pi;

x=timevec;

y=fsine(param,timevec) + 0.1*randn(size(x));

%% standard parameter estimation

[estimated_params]=sine_fit(x,y)

%% parameter estimation with forced 1.5 fixed amplitude

[estimated_params]=sine_fit(x,y,[NaN 1.5 NaN NaN])

%% parameter estimation without plotting

[estimated_params]=sine_fit(x,y,[],[],0)

R P (2021). Sine function fit (https://www.mathworks.com/matlabcentral/fileexchange/41246-sine-function-fit), MATLAB Central File Exchange. Retrieved .

Created with
R13

Compatible with any release

**Inspired:**
sigm_fit

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!Create scripts with code, output, and formatted text in a single executable document.

John D'ErricoI have gotten a comment that my suggested code fails when c is not in the interval [0,pi]. The fix is simple enough. Use the 4 quadrant version of atan2. Thus, in the code I wrote, it would be

c = atan2(abb(2),abb(3));

raffaeleJan GerkenSiyuan LinJohn D'ErricoGood point by Alex to recover c.

HalluIt doesn't fit plots that have a lot of points near the sine parts, and only a few points but that are quite far from the sine parts. It only works when the points are really clustered together, which makes the program quite useless unfortunately, as most people would be able to do it themselves. It needs some bells and whistles. If you know a better sinus fit program, please post it here.

Alex BurkaJohn D'Errico, thanks for these notes. I am using this fitting method. One correction: the use of acos() to find c loses some quadrants. It works better to do

c = atan2(abb(3), abb(2));

GaryMissing a couple functions: minmax and smooth.

John D'ErricoPeople should realize that a sine fit requires nothing special, not even any special nonlinear optimization, like the call to nlinfit as this uses. fminbnd will suffice.

Suppose you want to know the parameters of this 4 parameter model:

y = a + b*sin(c+d*x)

where x and y are vectors, with presumably additive Gaussian noise in y?

The trick is to first convert the problem to a related one, using a sin identity.

sin(u+v) = sin(u)*cos(v) + cos(u)*sin(v)

So we can convert the model to a related one, that probably looks more complicated.

y = a + b*sin(c )*cos(d*x) + b*cos(c )*sin(d*x)

We don't know the phase, c, nor do we know the parameter b. So let us now rewrite the model as

y = a + b1*cos(d*x) + b2*sin(d*x)

This is starting to look a bit simpler. The next trick is to recognize that a,b1, and b2 are what some would call conditionally linear parameters in a regression model. If we KNEW what the value of d was, then we could estimate the values of a, b1, and b2 using LINEAR regression.

So the trick is to use fminbnd to estimate d, at each iteration our "objective" function can then estimate a,b1,b2. Once we have a, b1, and b2, then we will recover the original parameters a,b,c,d.

To show how this all works, I'll need to create some fake data.

x = 10*rand(10,1);

y = 1 + 2*sin(3 + 4*x) + randn(size(x))/10;

I've picked some pretty simple coefficients there so we will see if the result was good or not when all done.

Create a couple of necessary function handles. The first one will generate linear regression coefficients, given d. Note that it is presumed that x and y are column vectors. The function handle automatically builds x and y into the definition.

cfun = @(d) [ones(size(x)),sin(d*x),cos(d*x)]\y;

This next function computes a sum of squares of the residual errors, as a function of the as yet unknown d.

sumerr2 = @(d) sum((y- ...

[ones(size(x)),sin(d*x),cos(d*x)]*cfun(d)).^2);

So now just minimize the objective function, using fminbnd. Be careful not to give too wide a range for the value of d, as we can easily get stuck in a local minimizer. That is true if you give poor starting values for ANY parameter however. Choose poor starting values, and expect to get poor results.

dopt = fminbnd(sumerr2,2,7)

dopt =

3.99774832748898

So we have recovered the value of d. How about the other parameters?

abb = cfun(dopt)

abb =

1.00007930296214

-1.99239863846747

0.264147541820845

a = abb(1);

Recall that a was 1 and d was 4 when we created this fake data. So we have done ok so far. Can we recover b and c though? Of course. We get b from another trig identity.

sin(u)^2 + cos(u)^2 = 1

Thus, we know that

sqrt(b1^2 + b2^2) = b

This gives us the simple formula for b as:

b = norm(abb([2 3]))

b =

2.00983239560343

Not bad, since b was originally 2. How about c? Given that we have now recovered b, the phase angle is simply

c = acos(abb(2)/b)

c =

3.00978367214213

All very simple, requiring no starting values other than a range for d, the period of the sine wave.

Why is the scheme I suggest above a good one, far better than nlinfit? It is more robust to poor starting values. It will surely be faster, requiring fewer function evaluations.

So in the end, is the code the author has provided in this submission overtly a poor one? Well, the help is reasonably good.

It uses eval to create the objective function, IMHO, a poor choice. That will be slow.

It requires starting values for all of the parameters, and as we all should know, nonlinear optimizations with poor starting values tend to yield poor results.

It uses the findpeaks code from the file exchange to choose a starting value for the period. While this is arguably a good idea, the author never tells you that was done, so anyone who downloads this code may be surprised. However, a nice code would reference findpeaks, or even if the author of findpeaks permitted it, this code could include findpeaks as a subfunction. A really nice code would allow the user to provide a range for d, and if none was provided, then use findpeaks to infer a good starting value for d for the scheme I've suggested to use.

Overall, the problems with this code are fixable. I'd give it 3 stars, hoping the author was willing to improve their work.

Mathis MörkeAurangozeb Rahman