How to search for previous values in a table?

In the above table, I have GPS time inputs as well as sensor readout values. Every value in column 1 named '$GPRMC' has a year timestamp in column 10. And after that command there are a bunch of sensor readouts until the next GPS timestamp updates.
I want to pick out all of the values in column 5 that have a 'T' (column 3, 'sensor'). I also want to add the nearest year timestamp from the '$GPRMC' reading nearest every 'T' sensor value and append that year to a new arrray for every sensor value until a new '$GPRMC' value comes in. This way there are an equal number of year values and sensor values so I can plot them against each other.
I've managed to pick out all 'T' sensor values by using:
trows = data.sensor=='T';
But I have no idea how to search for the nearest row with column 1 value of '$GPRMC' and append to an array the year timestamp in column 10 for every row entry with 'T' in it.
I guess basically I want to find out how to search above in the file until it finds a '$GPRMC' entry.
How are some different ways I can achieve this?

6 Comments

You will have better luck getting a solution if you attach a sample of your table saved in a .mat file.
One thought: why not loop through the rows until you find a $GPRMC record and then start saving 'T' records (along with the latest timestamp) until you find another $GPRMC record, instead of searching from the location of the 'T' records?
That was my initial plan, but this file has 3.8million entries, and looping seems like it would take forever.
Updated post with sample table.
This is what I came up with for a loop to add arrays for year, t sensor values, and time stamps:
time =[];
tilt=[];
year=[];
x_test =[];
n=0;
for n=1:height(data)
if data{n,1} == '$GPRMC'
yearStamp = (data{n,10});
n+1;
elseif data{n,3} == 'T'
x_test = [x_test data{n,5}];
year = [year yearStamp];
time = [time data{n,4}];
n+1;
else
n+1
end
end
But obviously it takes an ungodly amount of time to run for an 8 million line file, haha.
You can shorten the table considerably to start with before anything else --
>> data1=data1(data1.sep=='$GPRMC'|data1.sensor=='T',:)
tD =
11×10 table
sep VarName2 sensor time X Y Z VarName8 VarName9 VarName10
______ _________ ______ __________ ____ _____ ____ ___________ ________ __________
$GPRMC 171647.00 V NaN NaN NaN NaN <undefined> NaN 3.0082e+05
$PAAG DATA T 1.7165e+05 -128 -416 7424 A NaN NaN
$PAAG DATA T 1.7165e+05 224 -352 8320 A 7 NaN
$PAAG DATA T 1.7165e+05 -512 -1920 9664 A 11 NaN
$PAAG DATA T 1.7165e+05 256 -1152 7936 A 31 NaN
$PAAG DATA T 1.7165e+05 3904 -192 6880 A 39 NaN
$PAAG DATA T 1.7165e+05 352 288 6720 A 22 NaN
$PAAG DATA T 1.7165e+05 -128 -2496 8768 A 17 NaN
$PAAG DATA T 1.7165e+05 1664 -2432 9600 A 5 NaN
$PAAG DATA T 1.7165e+05 1696 -1856 8256 A NaN NaN
$GPRMC 171648.00 V NaN NaN NaN NaN <undefined> NaN 3.0082e+05
>> tD.Properties.VariableNames(end)={'Date'};
>> tD.Date(1)==tD.Date(end)
ans =
logical
1
>> whos tD
The date value in both '$GPRMC' records shown is the same so what does it mean to add the "nearest" date? What values are of interest and what would it mean to plot them all at the same time?
I don't understand why this little table is taking 10MB of memory, either, that seems really, really peculiar -- unless one of the categorical variables was defined for the entire file and contains as many different element values as elements (or close)?
Oh, well, we can test that...the categories will be imbedded even if not used...
>> numel(categories(tD.VarName2))
ans =
80459
>> numel(categories(tD.VarName8))
ans =
3238
>> numel(categories(tD.sensor))
ans =
5285
>>
Indeed; those probably would be better off NOT being categorical -- are the floating point values for the sensor an actual sensor ID or data? We seem to have mixed metaphors in the data file -- the same variable seems to mean different things???
What to do next would depend on answer to the first Q?; we'll definitely have to have more than just one set of data to work with and a better definition of the end objective to do anything specifically useful.
To add more context:
The date will eventually update, so even if the date is the same, it needs to pull the "nearest' date so it can capture an update appropriately.
This is a very small sample of ~3 millionx16 size array that takes up a huge amount of memory. True I can probably filter out a good amount of the file though. The rest of the file looks exactly like this just repeated for 24 hours worth of time.
Ideally my output would be this:
The date stamp from data1.VarName10 under a '$GPRMC' to be appended with the hour timestamp from data1.time under a 'T' sensor value row.
So I end up with one array of [append(data1.VarName10,data1.time)] for every sensor "X" value. With the VarName10 data just being a repeat for every sensor value until the year timestamp comes back in.

Sign in to comment.

 Accepted Answer

This should do what you want. Note that the cell array in this case only has one cell because there are only two timestamp rows in your sample data and all of the sensor == T rows are between those two timestamps.
load('sample table.mat');
idx_T = find(data1.sensor == 'T') % find all of the sensor == T indices
idx_T = 9×1
9 14 19 24 29 34 39 44 49
idx_timestamps = find(data1.sep == '$GPRMC') % find all of the timestamp indices
idx_timestamps = 2×1
6 53
timestamps = data1.VarName10(idx_timestamps); % extract the timestamps
% Extract the data and timestamps into a cell array
for i = 1:numel(timestamps)-1
idx_current_data = idx_T > idx_timestamps(i) & idx_T < idx_timestamps(i+1);
for k = 1:numel(idx_current_data)
% get the data for the current timestamp
current_data(k,:) = [timestamps(i) data1.X(idx_current_data(i))];
end
data{i} = current_data;
end
data
data = 1×1 cell array
{9×2 double}

2 Comments

Ideally, I want the date stamp from data1.VarName10 under a '$GPRMC' to be appended with the hour timestamp from data1.time under a 'T' sensor value row.
So I end up with one array of [append(data1.VarName10,data1.time)] for every sensor value. With the VarName10 data just being a repeat for every sensor value until the year timestamp comes back in.
Can I just add:
current_data(k,:) = [timestamps(i) append(data1.time(idx_current_data(i),data1.X(idx_current_data(i)))];
Since data1.time and data1.VarName10 are both doubles (numbers), append would not be the right way to combine them. You will have to figure out what those numbers represent to figure out how to add them together.

Sign in to comment.

More Answers (0)

Categories

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!