How to extract specific rows with requirements from multiple CSV files?

17 views (last 30 days)
Ryan Wang
Ryan Wang on 28 Aug 2017
Commented: MUKESH KUMAR on 3 Jun 2020
Hi,
I am having a question regarding to data extracting and need your help.
I have roughly 700 CSV files under three folders with the same format. Each CSV contains 40 columns, and the columns I am interested in are the 32th and 33th column, which are "Latitude" and "Longitude".
Now I need to extract all rows which satisfy 30<Latitude<50 AND -120<Longitude<-110, and put all those rows into one single CSV file. What I have known so far is to open the file, textscan those data into 40 different cells, and close the file.
Could someone help me on this issue? Thanks!
Typical CSV files are like:
Date Time ...... Latitude Longitude ......
08/27 7:00 ...... 33.44 110.43 ......
08/28 7:00 ...... 35.44 160.43 ......
08/29 7:00 ...... 33.44 120.48 ......
08/29 7:00 ...... 32.44 150.42 ......
08/29 7:00 ...... 38.44 130.46 ......
08/29 7:00 ...... 53.44 110.63 ......
......
Two out of 600 sample CSV files have been attached.
  2 Comments
Ryan Wang
Ryan Wang on 28 Aug 2017
Thank you very much Akira. I just uploaded two csv files, where I have nearly 600 of them. Specifically, the latitude and longitude columns are the 32th and the 33th, and their requirements are 33.975~33.982, and -117.373~-117.331, respectively.

Sign in to comment.

Accepted Answer

Akira Agata
Akira Agata on 28 Aug 2017
Hi Ziran-san, thank you for uploading some sample csv files. Thanks to these files, I could understand the point!
Based on these files, I believe the solution for your problem will be like the following. In this code, I assumed that your csv files are stored in ./data folder.
But looking at your sample csv files, there are no rows that matches your condition. So, using your two csv files, the output of the following code is empty (I hope this is simply due to these two sample files...). Anyway, I would be happy if my answer could be your help! :-)
% List of CSV files
dataFolder = './data/'
list = dir([dataFolder,'*.csv']);
% CSV data format
format = ['%{MM/dd/yy HH:mm:ss}D',repmat('%f',1,39)];
% Extraction condition
latLim = [33.975, 33.982];
lonLim = [-117.373, -117.331];
% Read each CSV file, extract rows and store them to 'Output'
Output = table();
for kk = 1:numel(list)
data = readtable(...
fullfile(list(kk).folder, list(kk).name),...
'HeaderLines',2,...
'Format', format);
idxLat = (data{:,32} >= latLim(1)) & (data{:,32} <= latLim(2));
idxLon = (data{:,33} >= lonLim(1)) & (data{:,33} <= lonLim(2));
idx = idxLat & idxLon;
Output = [Output; data(idx,:)];
end
  3 Comments

Sign in to comment.

More Answers (0)

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!