MATLAB Answers

Seeking suggestions to speed up JSON parsing to table.

Roger Pierson on 28 Dec 2019
Commented: Roger Pierson on 2 Jan 2020
I am seeking assistance in speeding up the processing time needed to parse very large JSON files and end up with a flat table of values. I have a solution that works, but it takes on the order of 3-6 minutes to process even a relatively small (for the sensor data in question) file of 57,360 JSON strings.
The original file contains many thousands of individual track reports from a radar system. To aid further processing and analysis work, I want to get these track reports into a flat table of 130 variables. jsondecode faithfully decodes the strings and produces a struct with all the data. Unfortunately, the data itself includes many nested structures, so simply calling struct2table isn't the whole answer.
for i = 1:lengthOfArray
% Decode the current row of the character array containing the JSON
currentRow = jsondecode(importedFile{1,1}{i,1});
end
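One way to cut per-call overhead in a loop like the one above is to join the JSON strings into a single JSON array and decode it with one jsondecode call. This is only a minimal sketch: it assumes every element is a complete JSON object and that all objects share the same field names (in that case jsondecode returns a struct array, otherwise a cell array). The sample strings and the name jsonLines are hypothetical stand-ins for importedFile{1,1}. Whether this is faster for a given file should be confirmed with the profiler.

```matlab
% Hypothetical stand-in for importedFile{1,1}: a cell array of JSON strings
jsonLines = {'{"id":1,"data":{"q":0.9}}'; '{"id":2,"data":{"q":0.7}}'};

% Join the strings with commas and wrap them in brackets to form one JSON array
jsonArray = ['[' strjoin(jsonLines(:)', ',') ']'];

% Decode everything in a single call; identical field names give a struct array
allRows = jsondecode(jsonArray);
```

Each track report is then available as allRows(i), with the same nested layout that the per-row decode produces.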
In fact, trying different variations of converting each sub-structure to a table and concatenating them into one final flat table appears to be even more time-consuming than my current solution, which is brute-force reading each field of each structure/sub-structure and assigning it to a flat temporary holding structure, then converting that temporary structure to a table at the end. Here is a short excerpt to show you what I mean:
tempStruct(i).trackQuality = currentRow.data.Track_Data.Q_Value; %Track quality
tempStruct(i).covarianceType = currentRow.data.Track_Data.Covariance_Data.x_discriminator; %Discriminator = Cartesian or spherical
tempStruct(i).coVarCartesian_varX = currentRow.data.Track_Data.Covariance_Data.Cartesian_Data.Var_X; % Track Variance for X. Meters^2
tempStruct(i).coVarCartesian_covXY = currentRow.data.Track_Data.Covariance_Data.Cartesian_Data.Cov_X_Y; % Track Covariance for X & Y
tempStruct(i).coVarCartesian_covXZ = currentRow.data.Track_Data.Covariance_Data.Cartesian_Data.Cov_X_Z; % Track Covariance for X & Z
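As an alternative to listing every field by hand, the flattening step can be sketched as a small recursive helper. This is an illustration under assumptions, not a drop-in replacement: flattenStruct is a hypothetical function (not part of MATLAB), the sample JSON uses made-up values, and nested names are simply joined with underscores. It may well be slower than the hand-written assignments above, so it is worth profiling before adopting it.

```matlab
% Hypothetical sample row with a nested layout similar to the excerpt above
row = jsondecode('{"data":{"Track_Data":{"Q_Value":5,"Covariance_Data":{"Var_X":2.5}}}}');
flat = flattenStruct(row);
disp(fieldnames(flat));

function flat = flattenStruct(s, prefix)
%FLATTENSTRUCT Flatten a nested scalar struct into a single-level struct.
%   Hypothetical helper. Nested names are joined with underscores, so
%   s.a.b becomes flat.a_b.
    if nargin < 2
        prefix = '';
    end
    flat = struct();
    names = fieldnames(s);
    for k = 1:numel(names)
        value = s.(names{k});
        if isempty(prefix)
            key = names{k};
        else
            key = [prefix '_' names{k}];
        end
        if isstruct(value) && isscalar(value)
            % Recurse into the sub-structure and copy its flattened fields
            sub = flattenStruct(value, key);
            subNames = fieldnames(sub);
            for j = 1:numel(subNames)
                flat.(subNames{j}) = sub.(subNames{j});
            end
        else
            % Leaf value (number, text, array): store under the joined name
            flat.(key) = value;
        end
    end
end
```

Two caveats with this sketch: joined names longer than namelengthmax (63 characters) are not valid field names, and two different paths could collide on the same joined name, so real data may need a shorter or explicit naming scheme.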
Before I continue, here are some relevant profiler results:
[Attached image: JsonParser_Profile.png]
I think the tabular display must be the final output of the struct2table function to the command window. I can't figure out how to suppress that.
trackReportTable = struct2table(tempStruct);
Time conversions are killing me. The radar reports seconds and microseconds in separate fields every time a timestamp is required. There are 25 timestamps in each JSON string. So for every iteration of the loop that parses the JSON file, I have to call a custom function, ambTime2mat, to convert the timestamps to datenum.
function [serialDateNum] = ambTime2mat (epochSeconds,microSeconds)
%This function returns date and time in MATLAB date serial number format
%from a given input of seconds since 1/1/1970 and microseconds since the
%last second.
%
% **** NOTE: Due to the mechanism used to combine the fields, the result only
% has millisecond resolution. ****
%
%INPUTS:
% epochSeconds = Seconds since January 1st, 1970
% microSeconds = Microseconds since value in seconds.
%
%OUTPUTS:
% serialDateNum = date and time in MATLAB date serial number format.
%
%EXAMPLE 01: ambTime2mat(timeInEpochSeconds,microseconds);
%% CONVERT
    %Check to see that something was passed.
    if nargin == 0
        error('No data passed, nothing to convert');
    end
    % Epoch seconds to date serial
    dnum = datenum(1970,1,1,0,0,epochSeconds);
    % No apparent way to add microseconds to datenum, so convert to milliseconds
    % and accept loss of resolution :(
    partialSeconds = round(microSeconds/1000);
    % Add the milliseconds to the date serial number
    serialDateNum = addtodate(dnum,partialSeconds,'millisecond');
end
I think there is substantial room for improvement here, but I can't identify it. The goal is to get seconds and microseconds from separate fields converted into a single serial date number, losing some resolution if necessary.
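One possible direction for the conversion described above is to skip the per-timestamp function call and do the arithmetic on whole vectors after the parsing loop. The sketch below assumes the raw seconds and microseconds have first been collected into numeric column vectors; the sample values are hypothetical. A serial date number is a count of days, so the seconds and microseconds can be combined, divided by 86400, and added to the serial date of the epoch. Leap seconds are ignored.

```matlab
% Hypothetical sample values: seconds since 1970-01-01 and trailing microseconds
epochSeconds = [1577836800; 1577836801];
microSeconds = [250000; 500];

% Vectorized conversion: serial date of the epoch plus elapsed fractional days
secondsPerDay = 86400;
serialDateNum = datenum(1970,1,1) + (epochSeconds + microSeconds/1e6) / secondsPerDay;

% Alternative route via datetime, converted back to datenum for downstream tools
dt = datetime(epochSeconds + microSeconds/1e6, 'ConvertFrom', 'posixtime');
serialFromDatetime = datenum(dt);
```

For present-day dates, adjacent double-precision serial date numbers are roughly 10 microseconds apart, so this keeps more resolution than rounding to milliseconds, although not the full microsecond.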
Thoughts, opinions, suggestions all welcome. Keep in mind this parser is part of a larger set of analysis tools and so doing things like creating a flat table or converting to datenum is simply to get the data into a common format the other tool components expect.
Thanks,
Roger
  4 Comments
Roger Pierson on 2 Jan 2020
Breakthrough on the issue with struct2table outputting results to the command window and taking a significant amount of time to do so. Like most problems in computing, this one came down to user error.
The output wasn't coming from struct2table at all. It was coming from the result of the very function I was running.
I was using the 'run' command in the editor with this command string:
trackTable=parseDi20_01('/MATLAB/InputFiles/TRACKS.json')
OF COURSE the command window received a bunch of data - from assigning the output of the function to the variable trackTable. There is no ; at the end!
In short - I didn't even think about needing a semi-colon at the end of the command string tucked away up in that little run button on the menu. Out of sight, out of mind I guess. Sure enough
trackTable=parseDi20_01('/MATLAB/InputFiles/TRACKS.json');
Results in no unintended output to the command window.
I just didn't realize what was going on because struct2table is literally the last line in the function, so it took the blame.
D'oh.
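The echo behavior described in this comment can be reproduced in isolation. This is a small sketch using evalc, which captures whatever a statement would print to the command window; the variable names and the magic(3) call are arbitrary placeholders.

```matlab
% Without a trailing semicolon, the assigned value is echoed
echoed = evalc('x = magic(3)');

% With a trailing semicolon, nothing is printed
silent = evalc('x = magic(3);');

fprintf('Characters printed without semicolon: %d\n', numel(echoed));
fprintf('Characters printed with semicolon: %d\n', numel(silent));
```

For a large table, that echoed text is what takes the time, not the assignment itself.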


Answers (0)
