ERA5 average hourly data and yearly data

44 views (last 30 days)
I have data netcdf from ERA hourly, i choose 1940-2024 for all of days ech month and also because of value of data i consider just 4 hours 0:00, 6:00, 12:00, 18:00 for every grid point and my parameters are u10 & v10. my goal is to have in a map average time series that compute total days we have wind speed more than 15 m/s, and we khnow that wind speed = sqrt (u^2+v^2),
my data time is for example for jan is as 1940-01-01-0:00, 1940-01-01-6:00, ....., 2024-01-31-18:00. i need atfirst get average hourly to be daily data for every grid point every year and than compute for every grid point total days have wind speed more than 15 m/s , and after that for every grid point compute average 85 yerly to show on the map , please guide me
my code is
clear
clc
year = 1940:2024;
filename1 = 'uv1.nc';filename2 = 'uv2.nc';filename3 = 'uv3.nc';filename4 = 'uv4.nc';filename5 = 'uv5.nc';filename6 = 'uv6.nc';
filename7 = 'uv7.nc';filename8 = 'uv8.nc';filename9 = 'uv9.nc';filename10 = 'uv10.nc';filename11 = 'uv11.nc';filename12 = 'uv12.nc';
u1 = ncread(filename1, 'u10');u2 = ncread(filename2, 'u10');u3 = ncread(filename3, 'u10');u4 = ncread(filename4, 'u10');
u5 = ncread(filename5, 'u10');u6 = ncread(filename6, 'u10');u7 = ncread(filename7, 'u10');u8 = ncread(filename8, 'u10');
u9 = ncread(filename9, 'u10');u10 = ncread(filename10, 'u10');u11 = ncread(filename11, 'u10');u12 = ncread(filename12, 'u10');
v1 = ncread(filename1, 'v10');v2 = ncread(filename2, 'v10');v3 = ncread(filename3, 'v10');v4 = ncread(filename4, 'v10');
v5 = ncread(filename5, 'v10');v6 = ncread(filename6, 'v10');v7 = ncread(filename7, 'v10');v8 = ncread(filename8, 'v10');
v9 = ncread(filename9, 'v10');v10 = ncread(filename10, 'v10');v11 = ncread(filename11, 'v10'); v12 = ncread(filename12, 'v10');
time1 = double(ncread(filename1, 'valid_time'));time2 = double(ncread(filename2, 'valid_time'));time3 = double(ncread(filename3, 'valid_time'));
time4 = double(ncread(filename4, 'valid_time'));time5 = double(ncread(filename5, 'valid_time'));time6 = double(ncread(filename6, 'valid_time'));
time7 = double(ncread(filename7, 'valid_time'));time8 = double(ncread(filename8, 'valid_time'));time9 = double(ncread(filename9, 'valid_time'));
time10 = double(ncread(filename10, 'valid_time'));time11 = double(ncread(filename11, 'valid_time'));time12 = double(ncread(filename12, 'valid_time'));
lat = double(ncread(filename1,'latitude'));lon = double(ncread(filename1,'longitude'));[Lat,Lon] = meshgrid(lat,lon);
% % compute land-sea mask and wind speed for every month Jan to dec
land = island(Lat,Lon);
ws1 = sqrt(u1.^2 + v1.^2);ws2 = sqrt(u2.^2 + v2.^2);ws3 = sqrt(u3.^2 + v3.^2);ws4 = sqrt(u4.^2 + v4.^2);
ws5 = sqrt(u5.^2 + v5.^2);ws6 = sqrt(u6.^2 + v6.^2);ws7 = sqrt(u7.^2 + v7.^2);ws8 = sqrt(u8.^2 + v8.^2);
ws9 = sqrt(u9.^2 + v9.^2);ws10 = sqrt(u10.^2 + v10.^2);ws11 = sqrt(u11.^2 + v11.^2);ws12 = sqrt(u12.^2 + v12.^2);
ws1(land==0)=NaN;ws2(land==0)=NaN;ws3(land==0)=NaN;ws4(land==0)=NaN;ws5(land==0)=NaN;ws6(land==0)=NaN;
ws7(land==0)=NaN;ws8(land==0)=NaN;ws9(land==0)=NaN;ws10(land==0)=NaN;ws11(land==0)=NaN;ws12(land==0)=NaN;
  3 Comments
dpb
dpb on 22 Apr 2025
Edited: dpb on 22 Apr 2025
Please format your code (select and use the "Code" button or "Ctrl-E") for legibility...
Simply cut your script down to a couple of years and a couple of months and then upload that file and somebody will almost certainly be willing to look at it. One can develop the code logic without analyzing all the data, but it's essentially impossible to do anything without some of it.
Certainly one could operate by date/time over a 3D array, but having a dataset in hand to work with would make it far simpler to illustrate.
dpb
dpb on 22 Apr 2025
"... date for example for jan is a vector 1940/01/01 0:00, 1940/01/01 6:00,..."
The first next thing to do is to convert the date to datetime

Sign in to comment.

Answers (2)

dpb
dpb on 17 Apr 2025
Edited: dpb on 18 Apr 2025
Use arrays and/or tables to hold multiple sets of data, not sequentially-named individual variables, then one can either loop or use MATLAB vector syntax and/or <builtin grouping and processing tools> or groupsummary
As a starting point, simply building all the filenames could be more like
yrs = 1940:2024; % 'year' and 'years' are important builtin MATLAB functions, don't alias them
BASEDIR='The Base Directory of the Data files'; % set where the files are located so can use fully-qualified name
duv=dir(fullfile(BASEDIR,'uv*.nc')); % return directory structure of filenames
data=[]; % an empty variable to load into dynamically
for i=1:numel(duv) % iterate over the files found
u=ncread(fullfile(duv(i).folder,duv(i).name),'u10');
v=ncread(fullfile(duv(i).folder,duv(i).name),'v10');
t=double(ncread(fullfile(duv(i).folder,duv(i).name),'valid_time'));
ws=hypot(u,v); % compute windspeed this set
data=[data;[t u v ws]]; % combine all into an array
end
tData=array2table(data,'VariableNames',{'Time','U','V','WS'}); % convert to a table
lat=double(ncread(fullfile(duv(1).folder,duv(1).name),'latitude'));
lon=double(ncread(fullfile(duv(1).folder,duv(1).name),'longitude'));
[Lat,Lon] = meshgrid(lat,lon);
% % compute land-sea mask and wind speed for every month Jan to dec
land = island(Lat,Lon);
...
The best way forward depends somewhat on the actual data structure to associate the windspeeds with the locations and as far as summarizing because we don't know the details of that structure nor what, specifically, the time data are in order to use them as a grouping variable...
Attaching one or two of the files would go a long ways towards clarification...not having used netcdf files other than as finding them here, I don't have a lot of familiarity with them; it does seem as though it's a shortcoming in the ncrread function can only get one variable at a time...
  5 Comments
dpb
dpb on 18 Apr 2025
Thanks for the compliment, @William Rose; I recall some interesting threads, indeed! <g>
It's not an afront to @Mahboubeh Molavi-Arabshahi, when one is working on a specific problem and either has the data at hand or knows where it is if downloading, it is very easy to make the assumption it's no problem for someone else...but, the TMW Answers forum, while supported and maintained by Mathworks, is exclusively a volunteer action (even TMW Staff who monitor it are almost always doing so as additional effort above and beyond their daily tasks). Given that and that virtually all are quite involved either with a "real" job or other activities, time available needs to be able to be spent effectively and many (if not most) will find having to go search for and obtain data to be able to solve a problem as more than are willing to commit.
dpb
dpb on 19 Apr 2025
Edited: dpb on 19 Apr 2025
ADDENDUM: I did go looking for and apparently <found> the data repository...but it requires quite a lot of messing around including signup...not enough time nor ambition to try to deal with directly to decipher the format...but, above would appear to be the link if somebody else has the time and inclination if @Mahboubeh Molavi-Arabshahi isn't motivated enough in his own Q? to help us out here...

Sign in to comment.


Kris Fedorenko
Kris Fedorenko about 4 hours ago
I was curious to understand this workflow, so I took a stab at it. To me, using datetime and timetable and groupsummary / rowfun seemed like the most intuitive way to approch this problem.
If I understood correctly, the data is organized such that each file stores data for a specific month over all the years (e.g., “uv1.nc” stores the data for all the Januaries 1940-2024, "uv2.nc" for all the Februaries, etc.). I think it might be possible to download the required data from https://cds.climate.copernicus.eu/datasets/reanalysis-era5-single-levels as one file or organized in a different way (e.g. grouped by year, etc.), but I downloaded it in the same way to match the setup of this question (but only for years 2020-2025). It would have indeed been helpful to have example data attached to the question.
I attempt to filter out the data for over-land coordinates, calculate daily windspeed averages for each point of the lat-lon grid, and calculate the total number of days in 2020-2025 with wind speeds over 15 m/s for each point of the lat-lon grid. I am not a timetable or mapping expert, so there might be better ways.
removeLand = true;
% get all the files
ncFiles = dir("ERAS_byMonth/*.nc");
% the time in the files is in 'seconds since 1970-01-01'
refDate = datetime('1970-01-01');
% read lat and lon from the first file (should be the same for all)
firstFile = fullfile(ncFiles(1).folder, ncFiles(1).name);
lat = ncread(firstFile, "latitude");
lon = ncread(firstFile, "longitude");
% for removing the land values later
if removeLand
coast = load('coastlines.mat');
[lon2D, lat2D] = ndgrid(lon, lat);
isLand = inpolygon(lon2D, lat2D, coast.coastlon, coast.coastlat);
end
% setup the timetable for daily average wind
% (will contain data from all files)
variableNames = ["Lat", "Lon", "MeanWindSpeed"]; % apart from Time
% create a zero-rows table to be concatenated with tables from each file
dailyWindTable = timetable(Size=[0, numel(variableNames)],...
VariableTypes=["double", "double", "double"],...
VariableNames=variableNames, TimeStep=days(1),...
StartTime=datetime("2020-01-01"));
% loop over files
for i=1:numel(ncFiles)
% read in raw data
curFile = fullfile(ncFiles(i).folder, ncFiles(i).name);
u = ncread(curFile, "u10"); % lon x lat x time
v = ncread(curFile, "v10"); % lon x lat x time
secsSince1Jan70 = ncread(curFile, "valid_time");
latFile = ncread(curFile, "latitude");
lonFile = ncread(curFile, "longitude");
% make sure the lat-lon grid is indeed the same for all files
if any(latFile ~= lat)
error("Latitudes are different in " + ncFiles(i).name)
end
if any(lonFile ~= lon )
error("Longitudes are different in " + ncFiles(i).name)
end
% filter out data over the land
if removeLand
thirdDimLength = size(u, 3);
u(repmat(isLand, 1, 1, thirdDimLength)) = NaN;
v(repmat(isLand, 1, 1, thirdDimLength)) = NaN;
end
% convert to datetime
time = refDate + seconds(secsSince1Jan70);
% roll out this file's data into a timetable
% (each wind speed measurement for unique lon-lat-time combination will
% have its own row)
[lonGrid, latGrid, timeGrid] = ndgrid(lon, lat, time);
fileTable = timetable(timeGrid(:), latGrid(:), lonGrid(:), u(:), v(:),...
VariableNames=["Lat", "Lon", "u", "v"]);
% compute wind speed as another column of the table
fileTable.WindSpeed = sqrt(fileTable.u.^2 + fileTable.v.^2);
% average daily wind speed for each cell of the grid
dailyAverageByLocation = groupsummary(fileTable,...
["Time", "Lat", "Lon"],... % grouping variables
["day", "none", "none"],...% group Time by day
"mean", "WindSpeed");
% make it into a timetable again
dailyAverageWindTimeTable = timetable(...
datetime(string(dailyAverageByLocation.day_Time)), ...
dailyAverageByLocation.Lat, ...
dailyAverageByLocation.Lon, ...
dailyAverageByLocation.mean_WindSpeed, ...
VariableNames = variableNames);
% add it to the big table
dailyWindTable = [dailyWindTable; dailyAverageWindTimeTable];
end
After this we have a timetable with a row for each unique day-lat-lon combination and the corresponding wind speed (the lat-lon that are over land will have NaN wind speed values):
dailyWindTable =
5394657×3 timetable
Time Lat Lon MeanWindSpeed
___________ ___ _____ _____________
01-Jan-2020 35 45 NaN
01-Jan-2020 35 45.25 NaN
01-Jan-2020 35 45.5 NaN
01-Jan-2020 35 45.75 NaN
01-Jan-2020 35 46 NaN
: : : :
30-Sep-2025 50 54 NaN
30-Sep-2025 50 54.25 NaN
30-Sep-2025 50 54.5 NaN
30-Sep-2025 50 54.75 NaN
30-Sep-2025 50 55 NaN
Display all 5394657 rows.
We can plot the daily average wind for any day, e.g.:
% plot average daily wind for one specific day
oneDate = datetime("2021-12-31");
oneDateTable = dailyWindTable(oneDate,:);
plotMapData(unique(oneDateTable.Lat), unique(oneDateTable.Lon),...
oneDateTable.MeanWindSpeed, ...
"Average daily wind on " + string(oneDate));
We can now calculate the number of days with wind > 15m/s in 2020-2025 for each point of the lat-lon grid:
% calculate number of days with high wind per each grid cell
windThresh = 15;
highWindDays = rowfun(@(wind) sum(wind>windThresh),...
dailyWindTable,...
InputVariables=["MeanWindSpeed"],...
GroupingVariables=["Lat", "Lon"], OutputVariableNames=["NumWindyDays"])
The resulting table looks like this (a lot of rows are zeros because we made the wind data over land to be NaN):
highWindDays =
2501×4 timetable
Time Lat Lon GroupCount NumWindyDays
___________ ___ _____ __________ ____________
01-Jan-2020 35 45 2157 0
01-Jan-2020 35 45.25 2157 0
01-Jan-2020 35 45.5 2157 0
01-Jan-2020 35 45.75 2157 0
01-Jan-2020 35 46 2157 0
: : : : :
01-Jan-2020 50 54 2157 0
01-Jan-2020 50 54.25 2157 0
01-Jan-2020 50 54.5 2157 0
01-Jan-2020 50 54.75 2157 0
01-Jan-2020 50 55 2157 0
Display all 2501 rows.
And we can plot it:
% plot number of days with high wind
plotMapData(unique(highWindDays.Lat), unique(highWindDays.Lon),...
highWindDays.NumWindyDays,...
"Number of days with average wind > 15 m/s")
Here is the function I used above for plotting the map data:
function plotMapData(latVec, lonVec, rolledOutData, plotTitle)
% back to non-rolled-out data
gridData = reshape(rolledOutData, ...
numel(lonVec), numel(latVec), []);
% coordinate limits
[latmin,latmax] = bounds(latVec);
[lonmin,lonmax] = bounds(lonVec);
figure
ax = worldmap([latmin,latmax], [lonmin,lonmax]);
pcolorm(latVec, lonVec, gridData')
% add coastlines
hold on
coast = load('coastlines.mat');
plotm(coast.coastlat,coast.coastlon)
colorbar
title(plotTitle, FontSize=16)
hold off
end
  1 Comment
William Rose
William Rose about 1 hour ago
Thank you for a great demonstration of how to organize data and map it. I am summarizing, to help me learn.
You used isLand=inpolygon() to find which points on the lat-lon grid are land. Then you did something clever with repmat() to set the wind velocities u and v equal to NaN, for land locations. The land points outside the Caspian Sea polygon in the plots are correctly recognized as "inside" (i.e. land) by inpolygon(...). I am not exactly sure how inpolygon() knew which sides of the (global) coastlines files were land, and which sides were water, but it worked.
You used datetime() to create datetime variables, and you used functions that work on datetimes. You used ndgrid() to make a 3D array. You used timetable() to organize data. You used groupsummary() in a clever way to compute the mean wind speed at each grid location in a table, for a specified date. You used rowfun() in a clever way to compute the number of windy days at each grid location.
You rolled out the gridded data for calculations. You used unique() to make vectors of the unique latitudes and longitudes, from grid arrays with many repeated values.
You made a custom map-plotting function. You used reshape() to change rolled-out data back to gridded arrays, for plotting. You used ax=worldmap() to set up the plot axes, and pcolorm() to plot gridded data (mean wind speed, or number of windy days, at each grid point) and plotm() to plot the coastline.
Question related to result of code below:
ax=worldmap('World');
coast = load('coastlines.mat');
plotm(coast.coastlat,coast.coastlon)
disp(ax.Projection)
orthographic
I am surprised that Matlab calls this an orthographic projection. A standard orthographic projection shows (at most) one hemisphere. Matlab's own documentation for the orthographic projection says "Parallels: Unequally spaced circles centered on the central pole. Spacing decreases away from this pole. The opposite hemisphere cannot be shown." This map shows the entire Earth, so how can it be called orthographic?
Thanks for your instructive code!

Sign in to comment.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!