MATLAB Answers


Fitting scattered data to multiple cosine functions

Asked by Ahmad Hamad on 11 Jul 2018
Latest activity: commented on by Matt J on 12 Jul 2018
I have data that represent 16 cosine-shaped curves, but the data are in the form of scattered points (x_i, y_i), i = 1, 2, 3, ..., N; please refer to the attached plot. The points are not associated with the functions. Furthermore, I don't have the exact functions, only their model: each function has the form f_k = A_k cos(phi_k + omega*x), where omega is the same for all the functions and only the amplitude A_k and the phase shift phi_k are specific to each function. Is there an easy way to associate each data point with one of the 16 curves?
Thanks in advance.


Two or more curves intersect at multiple points, and the data are a bit noisy in some places. The combined effect is that it will not be possible to assign every data point to a single curve. Unique assignments can be made for the majority of the data points, but not for those that lie close to the points of intersection. Even so, doing this will not be "easy", as you will first have to use nonlinear least squares to estimate the parameters (i.e., amplitudes and phase shifts) of the 16 curves.
@Anton Semechko, the nonlinear least squares has to be carried out on what exactly?
Well, that's the issue; if you just throw all the data at a NLLSQ algorithm trying to estimate 32 parameters (mag and phase for sixteen functions), it'll fail miserably unless you have some way a priori to associate which point(s) belong to which term; otherwise it just looks basically like random noise.
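For reference, once a subset of points has been associated with a single curve and omega is known, the fit for that curve is no longer nonlinear. Expanding A cos(phi + omega*x) = a*cos(omega*x) + b*sin(omega*x), with a = A*cos(phi) and b = -A*sin(phi), makes the problem linear in (a, b). A minimal sketch, assuming omega = 3 and noise-free synthetic points (both chosen only for illustration; the variable names are hypothetical):

```matlab
% Illustrative values; x and y stand in for the points already assigned to one curve.
w = 3;                         % assumed known angular frequency
x = linspace(0, pi, 50)';
y = 0.8*cos(w*x + 1.0);        % synthetic curve with A = 0.8, phi = 1.0

% Linear least squares for a and b in y = a*cos(w*x) + b*sin(w*x).
c = [cos(w*x), sin(w*x)] \ y;

% Convert back to amplitude and phase.
A_est   = hypot(c(1), c(2));
phi_est = atan2(-c(2), c(1));
```

The hard part of the question remains the association step, which this sketch takes as given.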


2 Answers

Answer by Anton Semechko on 11 Jul 2018
Edited by Anton Semechko on 11 Jul 2018
 Accepted Answer

Below is an example where I use a brute-force search to find the set of sinusoid parameters that best fits an unorganized dataset like the one you have. The fitted model parameters can subsequently be used to classify the individual data points. I omitted that latter part, but it won't be too difficult to implement.
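function brute_force_sinusiod_fit_demo
% NOTE: parts of this listing were lost when the post was captured. Lines
% marked "assumed" are a reconstruction inferred from the surviving
% comments, not the original code.
%
% Generate a sample data set composed of N sinusoids with the same angular
% frequency but varying amplitudes and phases
% -------------------------------------------------------------------------
N=16; % # of sinusoids
w=3; % angular frequency
A_o=(1+rand(N,1))/2; % amplitudes in the range [0.5 1]
phi_o=rand(N,1)*pi; % phases in the range [0 pi]
f=@(x,A,phi) A*cos(w*x+phi);
[X,F]=deal(cell(N,1)); % assumed: preallocation
for n=1:N
    X{n}=sort(pi*rand(1E3,1)); % unevenly spaced samples in time
    F{n}=f(X{n},A_o(n),phi_o(n)); % measured signal
end
% Scramble the order of samples so it becomes difficult to tell which point
% came from what signal
X=cell2mat(X); F=cell2mat(F); % assumed: pool the samples of all curves
id=randperm(numel(X)); % assumed
X=X(id); F=F(id); % assumed
% Attempt to recover parameters of the N curves from simulated data
% -------------------------------------------------------------------------
A_rng=[0.4 1.2]; % expected range of amplitudes
phi_rng=[0 pi]; % expected range of phase shifts
% Search grid
dA=0.01; dphi=pi/200; % assumed grid spacing
[phi,A]=meshgrid(phi_rng(1):dphi:phi_rng(2),A_rng(1):dA:A_rng(2)); % assumed
Ng=numel(A); % assumed: number of grid points
% Search A-phi parameter space
MAE=median(abs(F))*ones(size(A)); % expected error
for i=1:Ng
    Fi=f(X,A(i),phi(i)); % output of the model
    dF=abs(F-Fi); % absolute residuals
    dF(dF>5*dA)=[]; % remove data points that deviate from Fi by more than 5*dA
    if isempty(dF), continue; end
    MAE(i)=median(dF); % quality of the fit
end
% Extract N best fits
[A_fit,phi_fit]=deal(zeros(N,1)); % assumed
E=MAE; % assumed: working copy of the error map
for n=1:N
    % Absolute minimum
    [~,i_min]=min(E(:)); % assumed
    A_fit(n)=A(i_min); phi_fit(n)=phi(i_min); % assumed
    % Set neighbourhood around absolute minimum to Inf
    D=((A-A_fit(n))/dA).^2 + ((phi-phi_fit(n))/dphi).^2;
    E(D<=5^2)=Inf; % assumed: radius of 5 grid cells
end
% Visualize
% -------------------------------------------------------------------------
figure('color','w') % assumed
subplot(1,2,1) % assumed
imagesc(phi_rng,A_rng,MAE) % assumed: error map over the search grid
hold on
plot(phi_fit,A_fit,'.k','MarkerSize',20) % assumed
set(gca,'YDir','normal','XLim',phi_rng+dphi*[-1 1],'YLim',A_rng+dA*[-1 1],'FontSize',15)
axis equal
title('Best Fit Model Parameters','FontSize',25)
subplot(1,2,2) % assumed
imagesc(phi_rng,A_rng,MAE) % assumed
hold on
plot(phi_o,A_o,'.k','MarkerSize',20) % assumed
set(gca,'YDir','normal','XLim',phi_rng+dphi*[-1 1],'YLim',A_rng+dA*[-1 1],'FontSize',15)
axis equal
title('Actual Model Parameters','FontSize',25)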
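The classification step that the answer leaves out could look roughly like the sketch below. It is not the author's code: it assigns each point to the curve with the smallest absolute residual and flags a point as ambiguous when the runner-up curve is almost as close, which is what happens near intersections. The two-curve data, the `margin` threshold, and all variable names other than `A_fit` and `phi_fit` are assumptions made for illustration. Implicit expansion (MATLAB R2016b or later) is also assumed.

```matlab
% Stand-in fitted parameters (illustrative values, column vectors).
w = 3; A_fit = [1.0; 0.6]; phi_fit = [0.5; 2.0];

% Noise-free synthetic points from the two curves, pooled into row vectors.
x = linspace(0, pi, 400);
X = [x, x];
F = [A_fit(1)*cos(w*x + phi_fit(1)), A_fit(2)*cos(w*x + phi_fit(2))];
truth = [ones(1,400), 2*ones(1,400)];   % known origin, used only for checking

% Residual of every point to every curve: R is K-by-M.
R = abs(A_fit .* cos(w*X + phi_fit) - F);

% Nearest curve per point, plus an ambiguity flag based on the runner-up.
[Rs, order] = sort(R, 1);
label = order(1,:);
margin = 0.05;                           % assumed threshold
ambiguous = (Rs(2,:) - Rs(1,:)) < margin;
```

With noisy data the margin would need to be tied to the noise level; points flagged as ambiguous are the ones near curve intersections that cannot be assigned uniquely.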


@Anton Semechko,
Thank you for the code. A Hough transform seems to be the way to go, but the parameter space is huge. Keep in mind that in your example you assumed w is given, but that is not the case: it is common to all the signals, that is true, but it is also unknown.
Yeah, that problem is significantly more challenging, but can be tackled using a similar approach.
There are various heuristics you can use along the way to speed up the search. Assuming the sinusoids have different amplitudes, you can fit them in decreasing order of amplitude, removing the data points best explained by each fitted model along the way. The maximum amplitude can be estimated directly from the data without any optimization. The frequency and phase of the signal with maximum amplitude can be optimized using the same general strategy I used in the demo above. The frequency found for this first signal can either be fixed when fitting the remaining sinusoids, or you can implement a more robust strategy that allows you to test this solution on some subset of the data.
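A rough sketch of this peel-off strategy follows. It is one possible reading of the heuristics, not the author's implementation: the largest remaining amplitude is read off as max(|y|), frequency and phase are searched on a grid for the first curve only, the frequency is then held fixed, and points within a tolerance of each fitted curve are removed. The three-curve data set, grids, tolerance `tol`, and variable names are all assumptions made for illustration.

```matlab
% Synthetic data: 3 sinusoids sharing one angular frequency (illustrative values).
w_true = 3; A_true = [1.1 0.8 0.5]; phi_true = [0.3 1.4 2.5];
x = linspace(0, pi, 400)';
X = repmat(x, 3, 1);
Y = [A_true(1)*cos(w_true*x + phi_true(1));
     A_true(2)*cos(w_true*x + phi_true(2));
     A_true(3)*cos(w_true*x + phi_true(3))];

w_grid   = linspace(2, 4, 41);     % assumed frequency search range
phi_grid = linspace(0, 3.1, 32);   % assumed phase grid
tol = 0.02;                        % assumed inlier tolerance
K = 3;
A_hat = zeros(1,K); phi_hat = zeros(1,K); w_hat = NaN;
Xr = X; Yr = Y;                    % points not yet explained
for k = 1:K
    % Largest remaining amplitude, estimated directly from the data.
    A_hat(k) = max(abs(Yr));
    % Search frequency only for the first curve; keep it fixed afterwards.
    if k == 1, w_cand = w_grid; else, w_cand = w_hat; end
    bestCount = -1;
    for w = w_cand
        for phi = phi_grid
            c = sum(abs(Yr - A_hat(k)*cos(w*Xr + phi)) < tol);
            if c > bestCount
                bestCount = c; w_best = w; phi_best = phi;
            end
        end
    end
    if k == 1, w_hat = w_best; end
    phi_hat(k) = phi_best;
    % Remove the points explained by this curve before fitting the next one.
    inlier = abs(Yr - A_hat(k)*cos(w_hat*Xr + phi_hat(k))) < tol;
    Xr(inlier) = []; Yr(inlier) = [];
end
```

Reading the amplitude from max(|y|) only works if each curve is sampled near one of its extrema and the noise is small; with noisy data a robust percentile would be a safer estimate.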


Answer by Matt J on 11 Jul 2018

You need to implement a sort of sinusoidal Hough transform. Bin the 2D space of coordinates (A,phi) into cells and loop over the cells. For each combination (A,phi), generate the appropriate curve y=f(x) and see how many points lie within a tolerance region of that curve. This will give you a tableau Counts(A,phi). The top 16 peaks in that tableau will give you your 16 sinusoids.
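The voting scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a tuned implementation: omega is taken as known (w = 3), only three synthetic noise-free curves are used instead of 16, and the grid resolution, tolerance `tol`, and the 5-by-5 peak-suppression window are arbitrary choices. `Counts` follows the name used in the answer; the other names are hypothetical.

```matlab
% Synthetic data: 3 sinusoids with a shared, known angular frequency (illustrative).
w = 3; A_true = [0.5 0.8 1.1]; phi_true = [0.3 1.4 2.5];
x = linspace(0, pi, 400)';
X = repmat(x, 3, 1);
Y = [A_true(1)*cos(w*x + phi_true(1));
     A_true(2)*cos(w*x + phi_true(2));
     A_true(3)*cos(w*x + phi_true(3))];

% Bin the (A,phi) plane into cells.
A_grid   = linspace(0.4, 1.2, 17);
phi_grid = linspace(0, 3.1, 32);
tol = 0.02;                               % half-width of the tolerance band

% Vote: count the points lying within tol of each candidate curve.
Counts = zeros(numel(A_grid), numel(phi_grid));
for i = 1:numel(A_grid)
    for j = 1:numel(phi_grid)
        r = abs(Y - A_grid(i)*cos(w*X + phi_grid(j)));
        Counts(i,j) = sum(r < tol);
    end
end

% Take the K strongest peaks, suppressing a small window around each one
% so that neighbouring cells of the same peak are not picked again.
K = 3;
C = Counts; [nA, nP] = size(C);
A_hat = zeros(1,K); phi_hat = zeros(1,K);
for k = 1:K
    [~, idx] = max(C(:));
    [i, j] = ind2sub(size(C), idx);
    A_hat(k) = A_grid(i); phi_hat(k) = phi_grid(j);
    C(max(i-2,1):min(i+2,nA), max(j-2,1):min(j+2,nP)) = -Inf;
end
```

The peaks come out in order of vote count, not in any particular curve order. With noisy data, `tol` has to be at least comparable to the noise level, and the suppression window should cover the width of a peak in `Counts`.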


Ah...that's not a bad idea at all...
I thought about a Hough transform, but with three parameters (amplitude, phase, and frequency, which is common but unknown) the parameter space will be huge. Or is it manageable?
You could start with coarse sampling, e.g., 10x10x10 and gradually refine.
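A coarse-to-fine search of this kind might be sketched as below. To keep it short, the sketch fits a single noise-free synthetic curve and scores each cell by mean squared residual; for the scattered multi-curve data the score would instead be the number of points within a tolerance of the candidate curve. The search box, the 12 refinement levels, the shrink factor of 0.6, and all names are assumptions chosen for illustration.

```matlab
% Synthetic single-curve data (illustrative values, noise-free).
x = linspace(0, pi, 200)';
y = 0.8*cos(3*x + 1.0);

lo = [0.4 0  2];          % assumed lower bounds for [A phi w]
hi = [1.2 pi 4];          % assumed upper bounds
lo0 = lo; hi0 = hi;       % remember the initial box
nGrid   = 10;             % 10x10x10 samples per level
nLevels = 12;
shrink  = 0.6;            % each level keeps 60% of the previous range

best = (lo + hi)/2; bestErr = Inf;
errHist = zeros(1, nLevels);
for level = 1:nLevels
    Ag = linspace(lo(1), hi(1), nGrid);
    pg = linspace(lo(2), hi(2), nGrid);
    wg = linspace(lo(3), hi(3), nGrid);
    for i = 1:nGrid
        for j = 1:nGrid
            for k = 1:nGrid
                r = y - Ag(i)*cos(wg(k)*x + pg(j));
                e = mean(r.^2);
                if e < bestErr          % keep the best cell seen so far
                    bestErr = e;
                    best = [Ag(i) pg(j) wg(k)];
                end
            end
        end
    end
    errHist(level) = bestErr;
    % Re-centre a smaller box on the current best, clipped to the initial box.
    halfWidth = shrink*(hi - lo)/2;
    lo = max(best - halfWidth, lo0);
    hi = min(best + halfWidth, hi0);
end
```

Each level costs the same 1000 evaluations, so the total work grows linearly with the number of levels rather than with the final resolution. The caveat is that a coarse first level can miss a narrow peak, in which case later levels refine the wrong region; shrinking more slowly reduces that risk.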
