calculate the isotropic and anisotropic experimental (semi) variogram
variogram calculates the isotropic and anisotropic experimental variogram in various dimensions.
Syntax:
d = variogram(x,y)
d = variogram(x,y,'pn','pv',...)
The function uses parseargs (objectId=10670) by Malcolm wood as subfunction.
Currently, the function calculates all variogram values at one step. While this is fast for small data sets (n<2000), it may fail on large data sets owing to memory constraints. For large datasets you may thus want to use the 'subsample' option.
1.4  Bug removed that occurred when calculating the distances for more than two dimensions. 

1.3  removed a bug in calculating the number of angular bins for anisotropic variograms. 

1.2  Removed dependency from IPDM. Fortunately, it is even faster now on my machine. The relatively large memory overhead is still unresolved. Added subsampling option. 

1.1  dependence on consolidator was removed,

Inspired by: parseargs: Simplifies input processing for functions with multiple options
Inspired: variogramfit, Ordinary Kriging, petersontimj/GroundwaterStatisticsToolbox
yjing ning (view profile)
haamii (view profile)
Wolfgang,
I tested your suggestion for 3d variogram (variogram([x(:) y(:) z(:)], values, ...)), but, it do not work very well.
Could you please any another suggestion for 3d variogram?
cheers
Carlos (view profile)
Wolfgang,
Thanks for this awesome submission. It was very helpful. I however have a questions..
On line 165 you perform the following operation
edges(end) = inf;
Can you explain the logic behind it?
The reason that I ask it that when the data is uniformly distributed in space the number of data pairs separated by a lag (h) decreases as h increases.
However, as written line 165 then makes later lines (l78 and 179)
S.val = accumarray(ixedge,...
lam,[numel(edges) 1],fvar,nan);
S.num = accumarray(ixedge,ones(size(lam))...
,[numel(edges) 1],@sum,nan);
Calculate the gamma value and the number of pairs using the the rest of the data set. You then do a final modification of the arrays but that does eliminate the extra data poins used int he Variogram calculation..
Let me give you an example.
My edges vector before line 165
250 500 750 1000 1250 1500 1750 2000 2250 2500 2750 3000
My edges vector after line 165
250 500 750 1000 1250 1500 1750 2000 2250 2500 2750 Inf
Then on lines
178,179 and 186, 187
you produce the S.distance,S.num and S.val struct arrays
This is what the S.distance says:
625
875
1125
1375
1625
1875
2125
2375
2625
2875
Then this is what the S.num says:
41
40
39
38
37
36
35
34
33
32
31
59
You see That 59 should not be there there are no 59 number of data pairs with an average lag distance of 2875. In fact there are only 30.
If I comment line 165 then I get the expected result of monotonically decreasing number of pairs as lag distance increases.
Let me know what you think...
Carlos Soto
kaiba Wong (view profile)
Fantastic code  just what I needed. However, I am trying to modify it a bit for crossvariogram. Can I do it this way: save the
lam = (y(iid(:,1))y(iid(:,2)));
for two separate variable (say A and B) Combine=bsxfun(@times, lamA,lamB);
Combine_cross = accumarray([ixedge ixtheta],Combine,...
[21 7],fvar,nan);
Combine_cross(:,end)=Combine_cross(:,1);
Is this approach correct?
Thanks
Zoltán (view profile)
Hi Wolfgang!
Thank you for this, it is a very good one!
Could you please explain me, where is your algorithm calculate the separation (or lag) vector(h). Theoretically, if I am calculating a variogram, I use this formula:
gamma^2(h)=(1/2N(h))*sum(f(u+h)f(u))^2
Did you used the same? If yes, how do you calculated h? (in which row?)
Thank you!
Zoltán
Wolfgang Schwanghart (view profile)
@JG: Yes, [x y] can also be a nx1 vector with n number of points in time. Use datenum to calculate a numeric vector from your dates. Variogram doesn't do the conversion for you.
J G (view profile)
Thanks Wolfgang. Also, can this be used for temporal separation too? i.e., can [x y] be time?
Wolfgang Schwanghart (view profile)
@JG: This syntax is correct. Take care, however, if latitude and longitude refer to geographic coordinates... The function doesn't calculate distances on a sphere.
J G (view profile)
Thanks for this!
I just want to make sure I'm using it correctly;
If I have latitude and longitude coordinates ([x y]), and my data (z),
I use: variogram([x y],z)
Wolfgang Schwanghart (view profile)
@Edgar.
Hi Edgar, yes, this seems to be an error in the function. Thanks for reporting it! I hardly needed those plot options so I probably forgot to care for them when adding anisotropy. Regarding your earlier question, I suppose that this should be ok.
I'll see if I can make a update in the next couple of days. Unfortunately, my time to support these functions at the moment is quite scarce...
Edgar (view profile)
Hi Wolfang,
When types "cloud1" and "cloud2" are used in combination with anisotropy, the function does not return an array (a column for each theta tested). I'm guessing this is an error in the function?
Best,
Edgar
Edgar (view profile)
Hi Wolfang,
I'm working with meteorological data, I can see a mountain range drives the spatial variation of the meteorological fields. The mountain range has an orientation (larger axis) at 110° (calculated clockwise from top, i.e. positive latitud axis). The corresponding semivariogram from your script would be the one at 20°? Thanks for your help. Really nice algorithm, congrats!
Ricardo (view profile)
Hi Wolfgang,
Yes, that's a decision that concerns me these days, because either I use one variogram per time sample or as i have a lot of timesamples, I can do a training phase for kriging and I get a variogram characteristic from the system, which I will use in the test phase. I am currently working on my own 3d kriging and i wanted to compare results.
Thank you very much!
Wolfgang Schwanghart (view profile)
Hi Ricardo, I am afraid my kriging won't work in three dimensions. However, it shouldn't be a very big deal do modify the function to do so. However, I am skeptical how you want to handle time and space dimensions in calculating one variogram.
Ricardo (view profile)
Hi again,
I just realize the NaN make sense in my data.
But still i have some questions.
Once I already know that variogram can work with 3d input, I tryed the kriging script but it seems like the input is 2dimensional, is there a way to use 3D data on the kriging script?
I am working with temperature samples along a period of time, and i wonder if i can in the variogram function give as values the entire time samples matrix, instead only one time sample only.
I hope i am being clear enough.
Thank you so much
Ricardo (view profile)
Ricardo (view profile)
Thank you so much Wolfgang!
I tried and at the beginning i had an error but i changed what @Shen said
d = sqrt(sum((X(i,:)X(j,:)).^2,2));
and it worked, but the d.val has a lot of NaN, is that OK? because I have very bad fits on the variogramfit script.
Thanks again
Wolfgang Schwanghart (view profile)
@Riccardo: variogram supports higher dimensional data. In your case, just call the function like
variogram([x(:) y(:) z(:)], values, ...)
Ricardo (view profile)
Hello Wolfgang,
I am trying to apply Kriging to a 3d mesh but first of all i don't know if i can use your variogram from a 3d data, it seems that only accepts X ans Y, not Z coordinates. Is there any way to use it with 3d input?
Thank you very much
Ricardo
Wolfgang Schwanghart (view profile)
Hi Shen,
you are right! Thanks for pointing out this bug. Seems I have not thoroughly checked the multidimensional support since I've never needed it. I'll correct for it as soon as possible.
Thanks again,
Wolfgang
Shen Liu (view profile)
Hi Wolfgang,
I am not sure if the following had been spotted:
In Line 253, when calculating the distances for more than two dimensions, the code was
d = sqrt(sum((X(i,:)X(j,:)).^2));
I think it should be
d = sqrt(sum((X(i,:)X(j,:)).^2,2));
as we need the sums of the rows but not the columns. After this correction, everything works perfectly in calculating 3dimensional variograms.
Many thanks for your function. It is absolutely fabulous!
Best regards,
Shen
Wolfgang Schwanghart (view profile)
@vipul utsav: the nugget variance is the variogram value at zero lag distance. You can estimate it visually from the experimental variogram or (better) use variogramfit (http://www.mathworks.de/matlabcentral/fileexchange/25948) to estimate it from the data.
vipul utsav (view profile)
nugget variance from variogram ,how it is possible?
Wolfgang Schwanghart (view profile)
@dd: Right now, this is not possible, but with a minor modification of the function, it is not a problem. Just enter following line in the section where the output for the type 'cloud1' is calculated (Line 202)
S.ij = iid(:,[1 2]);
This gives you the indices for each point pair. You can now take these to calculate e.g. the correlation coefficient for a binned lag distance.
x = rand(1000,1)*42;
y = rand(1000,1)*42;
z = 3*sin(x*15)+ randn(size(x));
d = variogram([x y],z,'type','cloud1');
% should give you this
d =
distance: [382476x1 double]
val: [382476x1 double]
ij: [382476x2 double]
distbins = unique(d.distance);
% calculate the correlation coefficient
% for the second distance bin
I = d.distance == distbins(2);
corrcoef(z(d.ij(I,1)),z(d.ij(I,2)))
ans =
1.0000 0.2306
0.2306 1.0000
Hope it works out for you.
Cheers, W.
dd (view profile)
@ Wolfgang:
I would need it mostly for didactic purpose: how gamma, corr. and covar. are evolving along with lag distance.
Another question: how can I produce a crossplot of a variable between two lag distances?
Many thanks
Wolfgang Schwanghart (view profile)
@dd: the covariance (C(h)) as a function of lag distance h is directly related to the semivariogram (gamma(h)) and can be calculated by
gamma(h) = C(0)C(h)
You can take the sill variance as C(0) but this only works for bounded variograms.
Why do you need the correlation or covariance?
dd (view profile)
Hi Wolfgang,
I was wondering if there would be a simple way to include correlation and covariance in the analysis?
Thanks
dd (view profile)
Chris Sherwood (view profile)
Great function...helped me do a quick analysis of spatial correlation. One enhancement would be to return a handle to the plotted data, so users could change the color, symbols, etc.
dd (view profile)
Hi Wolfgang,
I keep on getting these error messages when running this code. There's however nothing special to my data:
'Error using set
Bad property value found.
Object Name : axes
Property Name : 'XLim'
Values must be increasing and nonNaN.
Error in axis>LocSetLimits (line 208)
set(ax,...
Error in axis (line 94)
LocSetLimits(ax(j),cur_arg);
Error in variogram (line 217)
axis([0 params.maxdist 0 max(S.val)*1.1]);'
Any suggestion?
Cheers
kamila benaissa (view profile)
kamila benaissa (view profile)
kamila benaissa (view profile)
Adam (view profile)
Wolfgang Schwanghart (view profile)
Hi Jorge,
are those data gridded? If yes, you should call the function like this:
d = variogram([x(:) y(:)], impedance(:), 'nrbins',50,'anisotropy',true,'thetastep',30);
Best regards,
Wolfgang
jorge parra (view profile)
Thank you Wolfgang ,
I got both plots OK(for theta = 0 and theta = 90 degrees). Now I am trying to apply your program to a real data set but I am having some problems. The data set consist of 200 x 50 points (impedances) contained in a 2D rectangular cross section. The coordinates x and y and the impedance are arranged in three vectors, each of them of dimension 200x 50 = 10000. The function in the script file is given by:
d = variogram([x y], impedance, 'nrbins',50,'anisotropy',true, 'thetastep',30);
The error message: “ Anisotropy is only supported for 2D",
What am I doing wrong?
Jorge
Wolfgang Schwanghart (view profile)
Hi Jorge. I have just uploaded a new version with a bug removed. Please take this one as soon as it is available (probably by 25 Nov 2011). Then, in order to obtain variograms for different directions, just calculate the anisotropic variogram. E.g. for obtaining variograms in 0 and 90° direction, you can do following:
d = variogram([x y],z,...
'nrbins',50,'anisotropy',true,...
'thetastep',30);
plot(d.distance,...
d.val(:,find(d.theta==0  d.theta==pi/2)));
You may choose tighter angular bins. In this case it will be 0+15 and 90+15.
HTH, Wolfgang
jorge parra (view profile)
Wolfgang,
Very good function; is possible calculate variograms at different angles, for example vertical and horizontal variograms? If you can do this how do you specify the angles and how do you plot them,
Thank you
Jorge
Amy (view profile)
Wolfgang Schwanghart (view profile)
@ Marian and others
I am sorry that IPDM is currently not available on the FEX. You can send me a mail via the contact formular with your email and I'll send you the function then. I'll be working on a workaround next week.
Thanks for your interest,
Wolfgang
Marian Axente (view profile)
Is there anything else we can use instead of IPDM since there is no such function on Matlab FEX
Wolfgang Schwanghart (view profile)
@Kibre. T.
Nugget effect and sill are required when you fit a variogram model to the experimental variogram and, hence, they are not accepted as parameters here. However, you may want to try variogramfit, which enables to set initial values for a nugget model and sill.
Kibre. T. (view profile)
Very good. Does this code accept the other parameters such as nugget effect and sill?
oskenund (view profile)
Patrick A. (view profile)
Like Variogramfit, it is for me an invaluable contribution. Very nice code, very nice comments, well I've learned a lot with these two pieces of code, Thanks a lot !
Anthony Kendall (view profile)
Thanks a lot for this function, it works very well (though as you mentioned it is a bit of a memory hog, I had to subsample my data but the variogram was robust).
Shazux Gharasoo (view profile)
sorry, I found out that these values are saved already in variable 'd' as a structure. thanks though
Shazux Gharasoo (view profile)
that was what I meant, thank you :)
one more question, how to pass the data that has been plot? i mean here:
plot(out.distance,out.val,marker)
I need to do some curve fitting on those values...
Wolfgang Schwanghart (view profile)
Hi Shazux,
the distance measure used here is the euclidean distance. I guess that is what you mean by direct distance, do you?
Best regards,
Wolfgang
Shazux Gharasoo (view profile)
very nice program indeed. I had a question by the way. how the distance is calculated? direct distance or along X and Y streets?