How to parse a large dataset and convert it into a 3xn matrix
- Un,sensor_data,unix_epoch_time
- Here Un is U1, U2, U3, …, U7, the port designation on the logger. The sensor_data field is whatever lies between the port designator and the epoch time, and it varies from sensor to sensor, so you need to know which sensor is on which port to parse the data. This information can usually be obtained from the data file. In the file you have, U4 is the AML xchange pH sensor. You can see the preamble from this sensor in the data logged at startup; it gives all the sensor information, such as serial number and sensor type. After that it just prints the pH value: ‘U4,pH,epoch_time’. On U6 was a temperature sensor, which outputs more data: ‘U6, aanderaa_sensor_type serial_number temperature raw_value,epoch_time’.
- The values in the data fields vary depending on the setup, so it would be smart to be able to configure this manually before running your script.
- This particular data set has 2 different sampling rates: the 4060 at 1 s and pH at 50 ms. If you read about the idronaut pH sensor, you will find that it has a settling time of about 4 s. I would suggest averaging all the pH data between temperature measurements. That way, for every second you have 1 temperature and 1 pH measurement at one specific time.
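For illustration, here is a minimal sketch of how the configurable port mapping described above might look. The sample lines are made up, the variable names (ports, quantity, valueIndex, rawLines) are hypothetical, and it assumes that sensor_data is whitespace-separated with the temperature as the third token on U6, which should be checked against the real file.

```matlab
% Hypothetical port configuration; edit to match the sensors on your logger.
% valueIndex is the position of the wanted number inside sensor_data after
% splitting it on whitespace (assumed layout, based on the description above).
ports      = ["U4", "U6"];
quantity   = ["pH", "temperature"];   % documentation only, not used below
valueIndex = [1, 3];

% Made-up sample lines in the Un,sensor_data,unix_epoch_time layout.
rawLines = [ ...
    "U4,7.91,1600000000.05"
    "U6,4060 123 21.50 1045,1600000001.00"];

n     = numel(rawLines);
port  = strings(n, 1);
value = nan(n, 1);
t     = nan(n, 1);

for k = 1:n
    parts = strtrim(split(rawLines(k), ","));  % port, sensor_data, epoch time
    if numel(parts) ~= 3
        continue                               % skip preamble or malformed lines
    end
    idx = find(ports == parts(1), 1);
    if isempty(idx)
        continue                               % port is not configured
    end
    tokens = strsplit(parts(2));               % split sensor_data on whitespace
    if numel(tokens) < valueIndex(idx)
        continue                               % fewer fields than expected
    end
    port(k)  = parts(1);
    value(k) = str2double(tokens(valueIndex(idx)));
    t(k)     = str2double(parts(3));
end

parsed = table(port, value, t);
```

Lines that are skipped keep an empty port and NaN values, so they can be dropped afterwards with something like parsed(parsed.port ~= "", :).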
2 Comments
To avoid ambiguity, supply a datetime format using SETVAROPTS, e.g.
opts = setvaropts(opts,varname,'InputFormat','MM/dd/uuuu');
Answers (1)
- Read the text file as a table, setting the delimiter to ‘,’ and the variable type to string in the import options before reading the file.
- Delete the rows that have a non-empty entry in the fourth column, as these may be irrelevant to your result. Then delete the fourth (last) column.
- Delete the rows for the U4 port that contain anything other than the pH value in the sensor_data column.
- Parse the sensor_data for the rows for the U6 port and extract the temperature value from it.
- Delete the rows with missing values in the second column, and delete the duplicate rows.
- Find the average of all the pH values between temperature entries and create a new table with the following:
  - the unix_epoch_time in the U6 port row immediately succeeding the averaged pH values
  - the averaged pH value
  - the temperature in the U6 port row immediately succeeding the averaged pH values
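The averaging step can be sketched as follows. This is only one possible approach, not necessarily what the answer had in mind: it starts from made-up, already-cleaned columns in time order (port, value, t are hypothetical names) and uses cumsum and accumarray to group each pH sample with the next temperature reading.

```matlab
% Made-up, already-parsed readings in time order.
port  = ["U4"; "U4"; "U6"; "U4"; "U4"; "U4"; "U6"];
value = [7.90; 7.92; 21.50; 7.94; 7.96; 7.98; 21.60];
t     = [0.05; 0.10; 1.00; 1.05; 1.10; 1.15; 2.00];

isTemp = port == "U6";
isPH   = port == "U4";
nT     = nnz(isTemp);

% Group number of each row: the index of the next temperature reading at or
% after it. pH rows after the last temperature reading land in group nT+1.
group = cumsum([0; isTemp(1:end-1)]) + 1;

% Mean pH per group; groups without any pH sample are filled with NaN.
phMean = accumarray(group(isPH), value(isPH), [nT + 1, 1], @mean, NaN);
phMean = phMean(1:nT);          % drop trailing samples with no temperature

% 3xn matrix: epoch time, averaged pH, temperature.
result = [t(isTemp).'; phMean.'; value(isTemp).'];
```

Each column of result corresponds to one temperature reading, so n equals the number of U6 rows. If real epoch seconds are wanted as dates, datetime(result(1,:), 'ConvertFrom', 'posixtime') converts them.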