This function removes spike noise from data. It was originally written to remove spike noise from time-series water velocity data, but it can be used for general purposes. The basic idea comes from Goring and Nikora (2002), who consider the first and second derivatives of the time-series signal. See the reference for details.
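The phase-space idea can be sketched as follows. This is a minimal illustration, not the code of the submitted functions: the test signal, the variable names, and the axis-aligned ellipsoid check are assumptions made for the example, and the rotation of the ellipse in the u-d2u plane used by Goring and Nikora (2002) is left out for brevity. The central differences and the universal threshold sqrt(2*log(n)) follow my reading of that paper.

```matlab
% Hypothetical test signal: a sine wave with one artificial spike.
n = 200;
t = (0:n-1)';
u = sin(2*pi*t/50);
u(100) = u(100) + 5;               % inject a spike at sample 100

u = u - mean(u);                   % work with fluctuations about the mean

% Surrogate first and second derivatives by central differences
% (end points are left at zero).
du = zeros(n,1);
du(2:n-1) = (u(3:n) - u(1:n-2))/2;
d2u = zeros(n,1);
d2u(2:n-1) = (du(3:n) - du(1:n-2))/2;

% Universal threshold and the three half-axes lambda*sigma.
lambda = sqrt(2*log(n));
limits = lambda*[std(u), std(du), std(d2u)];

% Simplified check (assumption): axis-aligned ellipsoid, no rotation.
outside = (u/limits(1)).^2 + (du/limits(2)).^2 + (d2u/limits(3)).^2 > 1;
spikeIndex = find(outside);        % contains sample 100 and its neighbours
```

In a real despiking loop the flagged samples would be replaced (for example by interpolation) and the test repeated until no new points are flagged.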
Reference
- Mori, N., T. Suzuki and S. Kakuno (2007) Noise of acoustic Doppler velocimeter data in bubbly flow, Journal of Engineering Mechanics, American Society of Civil Engineers, Volume 133, Issue 1, pp.122-125.
When looking at these scripts, I found a difference from the algorithm of Goring and Nikora (2002), and I wonder whether there is a particular reason for it.
In these scripts, an ellipsoid is constructed with axes lambda*sigma_u, lambda*sigma_du and lambda*sigma_d2u. In the article of Goring and Nikora, however, the ellipsoid has axes 'a', lambda*sigma_du and 'b', with 'a' and 'b' defined by their equations 9 and 10. The axes 'a' and 'b' are chosen so that the extrema of the ellipse, i.e. the points with extreme values of u and d2u, are equal to lambda*sigma_u and lambda*sigma_d2u. Note that these extreme points do not, in general, coincide with the principal axes.
It is a very subtle but fundamental difference, and I wonder whether there is a particular reason why this change was made.
A good thing about these scripts is that they circumvent a problem with the algorithm of Goring and Nikora: if sigma_u and sigma_d2u differ substantially, the latter algorithm cannot define a proper ellipsoid. I face this problem with my ADV data, and that is why I am interested in your method. The algorithm in these scripts can always produce an ellipsoid.
I would like to stress that the alternative in these scripts is not 'erroneous', since it follows a certain logic, just as the algorithm of Goring and Nikora follows a logic. Since despiking is not an exact science, it is still open for debate which logic is best.
One should only be aware that these scripts do not implement the method of Goring and Nikora (2002), but an alternative.
I look forward to your comments and ideas.
Kind regards,
Laurent Schindfessel
Ghent University
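The axis question raised in the comment above can be illustrated numerically. The sketch below assumes only the standard geometry of a rotated ellipse: with semi-axes a and b rotated by theta, the half-extents along the coordinate directions are sqrt(a^2*cos(theta)^2 + b^2*sin(theta)^2) and sqrt(a^2*sin(theta)^2 + b^2*cos(theta)^2). Requiring those extents to equal lambda*sigma_u and lambda*sigma_d2u gives a 2x2 linear system for a^2 and b^2. All numbers are hypothetical, and theta is an arbitrary illustrative value that is not computed from data.

```matlab
% Hypothetical values, for illustration only.
lambda   = 3.0;
sigma_u  = 1.0;
sigma_d2 = 0.2;
theta    = 30*pi/180;    % arbitrary rotation in the u-d2u plane

ex = lambda*sigma_u;     % required half-extent along u
ez = lambda*sigma_d2;    % required half-extent along d2u

% Extent conditions for a rotated ellipse with semi-axes a and b:
%   ex^2 = a^2*cos(theta)^2 + b^2*sin(theta)^2
%   ez^2 = a^2*sin(theta)^2 + b^2*cos(theta)^2
% Solve the linear system for a^2 and b^2.
c2 = cos(theta)^2;
s2 = sin(theta)^2;
ab2 = [c2 s2; s2 c2] \ [ex^2; ez^2];
a2 = ab2(1);
b2 = ab2(2);

if a2 > 0 && b2 > 0
    fprintf('a = %.3f, b = %.3f\n', sqrt(a2), sqrt(b2));
else
    fprintf('No real ellipse: a^2 = %.3f, b^2 = %.3f\n', a2, b2);
end
```

With these particular numbers b^2 comes out negative, so no real ellipse satisfies the extent conditions, whereas using lambda*sigma_u and lambda*sigma_d2u directly as semi-axes always yields a valid ellipse.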
The despiking process is 3 times faster if lines 100-106 in function "func_excludeoutlier_ellipsoid3d.m"
>> z2 = -sqrt(zt);
elseif z1 > 0
z2 = sqrt(zt);
else
z2 = 0;
end
Ignore previous post. I have now got this to work by installing the statistics toolbox. Results are rather worrying, however, as it is filtering out 100% of my data points even though WinADV rates were only ~5% using the same filter!
I created my own nanmean function as suggested by Georg Stillfried above. I then tried running func_despike_phasespace3d and found I needed a nanstd function. I tried this:
function m = nanstd (x,dim)
if nargin<2, dim=1; end
nans = isnan(x);
x(nans) = 0;
sumx = sum(x,dim);
m = sqrt((sumx./sum(~nans))/sum(~nans));
...but now getting back:
Error using *
Inner matrix dimensions must agree.
Error in func_excludeoutlier_ellipsoid3d (line 97)
x2 = a*b*c*x1/sqrt((a*c*y1)^2+b^2*(c^2*x1^2+a^2*z1^2));
Error in func_despike_phasespace3d (line 101)
[xp,yp,zp,ip,coef] = func_excludeoutlier_ellipsoid3d(f,f_t,f_tt,theta);
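For comparison, here is a minimal sketch of a NaN-aware standard deviation. The name nanstd_sketch is hypothetical, and the function is not the Statistics Toolbox implementation; it only assumes the usual definition (normalisation by n-1 over the non-NaN samples). Note that the attempt above computes sqrt(mean/n) rather than a standard deviation, and with the default dim=1 a row-vector input is returned as a vector instead of a scalar, which might explain the "Inner matrix dimensions must agree" error, although that is a guess.

```matlab
function s = nanstd_sketch(x, dim)
% Hypothetical helper: sample standard deviation along dim, ignoring NaNs.
if nargin < 2
    dim = find(size(x) ~= 1, 1);    % first non-singleton dimension
    if isempty(dim), dim = 1; end
end
nans = isnan(x);
n = sum(~nans, dim);                % number of valid samples along dim
x(nans) = 0;                        % NaNs contribute nothing to the sum
m = sum(x, dim) ./ n;               % NaN-aware mean
d = bsxfun(@minus, x, m);           % deviations from the mean
d(nans) = 0;                        % ignore NaN positions again
s = sqrt(sum(d.^2, dim) ./ max(n - 1, 1));
s(n == 0) = NaN;                    % no valid samples: result undefined
end
```

Saved as nanstd_sketch.m on the MATLAB path, it returns a scalar for a vector input and one value per column (or per slice along dim) for an array.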
I am using this function set in a custom MATLAB-based ADV signal processing toolset I made. Exactly what I needed. So glad I didn't have to write this myself. Works great!
Thx!
So far it works fairly well for my seismic data, even though it might flag some valid signals as spikes as well. Still, I am amazed by how accurately this function targets spikes. Excellent work!
P.S. I just noticed that this works only if x is a vector. If x is an array, the "official" nanmean calculates the mean columnwise or along a specified dimension. What they do there is set the NaNs to zero, sum up the columns and divide by the number of non-NaN values:
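function m = nanmean (x,dim)
% Mean along dimension dim, ignoring NaNs.
if nargin<2, dim=1; end
nans = isnan(x);
x(nans) = 0;               % NaNs contribute nothing to the sum
sumx = sum(x,dim);
m = sumx./sum(~nans,dim);  % divide by the count of non-NaN values along dim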
Save the file as nanmean.m in a directory on the MATLAB path.