Approach to computing statistics on a latitude/longitude grid

I'm looking for suggestions on the best approach to computing statistics on a latitude/longitude grid in MATLAB.
I have a non-gridded dataset with global wind observations (wind magnitude in knots) based on latitude and longitude over 30 years. I want to average that data on some latitude/longitude grid (say 1°x1° grid) then create a density-like plot of those average wind values on a map.
I'm trying to find the most efficient approach to doing this analysis in MATLAB. Thanks.

Answers (3)

I'd probably convert the (lat, lon, speed) into a double matrix (image). Then I'd sum up all the matrices to get the overall average. See attached scatteredInterpolant demo.
It would probably be best to use Mapping Toolbox functions for this, since on a world map, the latitude and longitude distances would be different at different locations.
It has several interpolation functions, geointerp may be closest to what you want to do, and mapinterp could also be useful. There are also others. (I do not have the Mapping Toolbox, so I have little experience with it.)
Calculating spatial statistics could be a challenge.
There are contour functions (specifically contour3m) that could help to depict those values.

11 Comments

I use a free mapping toolbox (m_map) that does a decent job at contouring gridded lat/long data. It's very similar to contourm in MATLAB's mapping toolbox.
I'm mainly interested on the best way to get the data averaged to a defined grid (i.e., 1°x1° grid) that I can then plot using m_map. I suppose just create a 180x360 matrix with a series of for loops and then compute the statistics for each grid that way?
As I envision it, the purpose of interpolating your data to a grid would be to approximate the values from the non-gridded data to the grid, so whatever was on the grid would be an interpolated approximation of the (nearest) values of the actual data. You would then interpolate each year to the same grid, then average those data over the 30-year time span, over the collected ('stacked') grids. (I believe we already did something like that a short while ago.)
That would be relatively straightforward. The problem then becomes projecting those data to a map grid to provide a reasonable interpretation of the data. I defer to the mapping routines for that. (The projection is obviously important, since all projections of a sphere to a flat plane leave something to be diesired.)
The 'm_map' tooolbox sounds interesting. I will investigate it to see what it needs in terms of computaational requirements, and what routines it has. I rarely do any sort of mapping, so I do not routinesly need that capability, however if it would help you solve this, I will download and install it and become familiar with it.
If you want me to help with that, let me know and I will learn 'm_map'. Just now, I have no experience with it.
Yes, you're correct. I need some kind of interpolation scheme to get the native data points to the new grid.
Let's say I choose bilinear interpolation for the regridding. I've not used interp2, but wondering if that's an option outside of the mapping toolbox?
The dataset I have now are winds by latitude/longitude to the nearest 0.1 degree. The first step is to get these datapoints to the 1°x1° grid using bilinear or cubic interpolation. Like you say, from there, computing statistics is fiarly straightforward.
I've been using M_Map for years and feel confident in its approach to projecting to a map grid.
You'll have to look up the documentation for interp2 and scatteredInterpolant (or any of the mapping interpolation functions) to learn the differences. I don't know what they might be. I just always use scatteredInterpolant but I'm not saying that is the best in all situations.
Rather than interp2 I would suggest scatteredInterpolant for the simple reason that it does not require that the input data be gridded. It takes scattered data and interpolates it on a grid. That part is relatively straightforward. The problem is that it is not going to work with geographic data.
While your data care absolutely nothing about map coordinates, it is necessary to interpolate them with respect to map coordinates. Obviously, is going to be different at the equator than the Arctic Circle. A grid that treats both of those as equal is not going to interpolate to reliable data. At worst (consider the Mercator porjection) it is going to distort them.
Doing an Interweb search on 'interpolating geographic coordinates' brings up several citations. Apparently, Python has a set of these (Spatial Interpolation) however I am relatively new to Python (just starting to learn it), so I am not certain how to integrate these with MATLAB code. This is obviously not a new problem, and people more knowledgable than I am, have solved it. Another citation is Interpolate irregularly sampled data to a regular grid in the Stack Exchange Geographic Information Systems section.
I looked at M_Map, and while it has several great features, it has a paucity of functions (at least thaat I can see) and no list of them and their documentation. It apparently has no interpolation functions of its own. If I am wrong about this (you obviously have more experience with it than I do), please provide links to the appropriate M_Map functions.
This seems to be much more challenging than I at first thought it would be. Getting the descriptive statistics on the data is going to be relatively straightforward. Interpolating them to a geographic coordiates grid first is not.
.
@John Cruce what is the maximum extents of your data? If it's over a small portion of the globe, not much affected by the curvature, then it might be good enough to assume they're linear coordinates, but if it's over something like a whole state, then that might not be so accurate to use linear coordinates like scatteredInterpolant assumes.
I think I'll probably need something with a projection for the interpolation. The data are over about 60 degrees of latitude and about 100 degrees of longitude.
I have experience with ArcGIS, so I may just need to do the interpolation outside of MATLAB. I was hoping for an all-in-one solution, but it sounds like that's going to be an uphill climb with MATLAB alone for geospatial data.
Wondering if I should consider the mapping toolbox or if that could get me to the solution I need?
I looked up ArcGIS. It does not go into any detail on its website with respect to what it offers. I'm not certain it can do the sort of spatial interpolation you need. It's also apparently not free.
I mentioned a few Python resources. Coding all this in MATLAB would be time-consuming. I would not be averse to doing that, however it would take me a while to learn the essential information and assumptions first.
The Mapping Toolbox appears to have relevant interpolation functions. The interpm, geointerp, and mapinterp functions seem to be appropriate, however I have no experience with any of these, and do not know which would be best for what you want to do.
With respect to wind magnitude, I assume that also includes direction. It would likely be best to decompose the wind magnitude and direction to individual east-west and north-south components before doing any statistics on them. That would avoid the problem of angle dscontinuities, for example between and that if averaged would be , however averaging their components and using atan2d would produce , as expected.
I might be overthinking this whole thing. My scattered datapoints are measuring different dates/times. What I might need to do is simply bin the datapoints in the 1°x1° grid then compute statistics. Interpolation would be more important to regridding but that's not what I'm aiming to do here.
At a basic level, I want to capture all wind observations (taken to the nearest 0.1 degree) in a 1°x1° grid and compute the mean.
Maybe I create a 4d matrix for the new 1°x1° grid, with lat, long, total wind, and total count using a series of for loops? From there, the average would be simply dividing the total wind by the count for each grid.
I'm not certain that I completely understand.
If your data are in the same locations and simply gathered at different dates and times, and all you want are the statistics on those specific locations (that don't change), then using the mean function (and related functions) will work. There would be no specific need to interpolate them.
Since I doubt that any of the data are ever negative and are all probably greater than zero, they would most likely be lognormally distributed. If so, use the lognormal distribution functions to estimate their means and variances or standard deviations. You can get estimates of the parameters in the original data units from:
lognparms = @(mu,sigma) [exp(mu + sigma^2/2); exp(2*mu + sigma^2) * (exp(sigma^2)-1); exp(mu); exp(mu-sigma.^2)]; % [mean; var; median; mode]
using the returned by the lognormal distribution fitting functions.
.
"With respect to wind magnitude, I assume that also includes direction. It would likely be best to decompose the wind magnitude and direction to individual east-west and north-south components before doing any statistics on them."
Aside from issues related to continuity, this comment from Star Strider is especially important if your are planning on using the summarized data for computing atmospheric mixing/transport/etc. (e.g., in a Eulerian model). Imagine a trivial world where winds go east to west half the time and west to east the other half. Averaging your data in this situation would lead to a wind velocity of 0 (or close), that is no mixing and no transport. In this context, it is often preferable to decompose wind speed+directions or velocities into components u+, u-, v+, v-, before averaging (otherwise we underestimate mixing/transport).

Sign in to comment.

[Edit: I am replacing the script with a revised version that corrects a mistake in the previous version. The error was in the calculation of X and Y components of velocity for the quiver plot. Wind directions in this simulated data file, and as recorded by data buoys, are compass headings. When computing x and y components of wind vectors, do x=-r* cosd(theta) and y=-r*sind(theta), where theta=90-compass heading. In the earlier version, I did x=-r*cosd(compass heading), etc. (The minus signs are because wind is "out of", or from, the specified direction.)]
I've been trying to submit an answer but I keep getting an error. I will try with text, and without the figures I had intended to attach. I will attach data files and script separately.
------------
Here is a file with 180K simulated observations of wind direction and speed at random latitudes and longitudes.
I was unable to upload the full file of wind observations because it exceeds 5 MB, even when zipped. Therefore I saved the top and bottom halves of the array separately, and attached them. Do the following to reconstruct the full file:
load('windObs3a');
load('windObs3b');
windObs3=[windObs3a;windObs3b];
save('windObs3',windObs3');
Then you will be able to run the attached script, which loads data from file windObs3.mat.
The latitudes are distributed in proportion to the area (circumference) at each latitude, up to some random flucuations. If you didn't do that, you'd get much more dense sampling at the poles.
Find the mean of direction by taking the cosine and sine of each direction, then find the mean values of the cosines and sines, then using atan2d() to find the mean direction:
meanX=mean(cosd(wdir));
meanY=mean(sind(wdir));
meanwDir=atan2d(meanY,meanX);
This approach prevents the problems that otherwise occur for values near the 0 to 360 boundary. (You don't want the mean direction of 359 and 1 to be 180; you want it to be 0...)
Use logical indexing to find the means for each latitude range and for each latitude, longitude block. For example,
LatEdges=-90:3:90; % edges of 3-degree latitude bands
LatMid=(LatEdges(1:end-1)+LatEdges(2:end))/2; % middle of each band (used for plotting)
M1=length(LatMid);
LongEdges=0:3:360; % edges of 3-degree longitude bands
LongMid=(LongEdges(1:end-1)+LongEdges(2:end))/2; % middle of each band (used for plotting)
M2=length(LongMid);
% ...
% Compute mean speed in each latitude,longitude block
meanWspd2=zeros(M1,M2);
for i=1:M1
for j=1:M2
meanWspd2(i,j)=mean(wspd(Lat>=LatEdges(i) & Lat<LatEdges(i+1) & ...
Long>=LongEdges(j) & Long<LongEdges(j+1)));
end
end
The attached script makes 5 figures.
Figure 1 shows the number of observations in each band of latitude and longitude. It shows that the number of samples at each latitude is approximately proportional to cos(Lat), and the number at each longitude is approximately the same at all longitudes. This is what we expect, if the sampling is random with the same likelihood over the whole globe.
Figure 2 shows the mean wind speed and mean wind direction as functions of latitude. The winds vary across the tropics (defined here at 0 to +-30), mid-latitudes (+-30 to 60), and polar regions (+-60 to 90). The mean wind speed is near 0 at the equator, the poles, and the boundaries between topics, mid-latitudes, and polar regions. The mean wind speed is ~10 knots in the mid-tropics, ~15 in the mid-mid-latitudes, and ~10 in the mid-polar regions. The prevailing wind direction is S at S pole, SE at Lat=-75, E at Lat=-60, W at Lat=-59, NW at Lat=-45, N at Lat=-30, S at Lat=-29, SE at Lat=-15, and E at the equator. In the northern hemisphere the directions are reflected in the N versus S direction, so wdir=NE at Lat=+15, SW at Lat=+45, NE at Lat=+75, and N at N pole.
Figure 3 shows the histogram of wind speed around latitudes +15, +45, and +75.
Figure 4 shows the histogram of wind direction around latitudes +15, +45, and +75. The plots shows that the most likely direction is NE at Lat=+15, SW at Lat=+45, and NE at Lat=+75, and it shows that the direction varies significantly.
Figure 5 shows mean wind speed as a function of latitude and longitude, using 6x6 degree blocks. There are some blocks near the poles where there are no samples. This is an equirectangular projection of the Earth's surface.
Combine mean wind speed and direction data to get the x and y components of wind velocity.
meanWindX=-cosd(meanWdir2).*meanWspd2;
meanWindY=-sind(meanWdir2).*meanWspd2;
Plot the data with quiver():
quiver(LongMid,LatMid,meanWindX,meanWindY)
Good luck with your research.

1 Comment

If you run the script attached to the answer I offered, you should get 6 figures. They correspond to the figures I refer to in my answer as Figures 1-5, plus quiver plot.

Sign in to comment.

Products

Release

R2019a

Asked:

on 20 Oct 2025

Edited:

on 27 Oct 2025

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!