Datastore - support for mat-files

MSTK on 14 Jun 2017
Commented: Pouya Aminaie on 19 Mar 2022
I am trying to use the tall array functionality to sort a very large table and to run other calculations on it. The tables are currently stored in several MAT-files. When I try to create a datastore object as input to the tall arrays, it does not seem possible to use one or more MAT-files as the source.
Am I missing something? Do I have to export the data to some other format to use as input for the datastore object? Any input would be appreciated.

Answers (4)

Anandan Rangasamy on 17 Oct 2017

If you are using MATLAB R2017b, you can set the UniformRead option to true when creating the fileDatastore, which saves you from having to use cell2underlying. This lets you create a tall table directly. Here is the documentation: FileDatastore

>> fds = fileDatastore('patients.mat', 'ReadFcn', @(x)struct2table(load(x)), 'UniformRead', true);
>> t = tall(fds)
t =
Mx10 tall table
     Gender      LastName     Age    Weight    Smoker    Systolic    Diastolic    Height             Location              SelfAssessedHealthStatus
    ________    __________    ___    ______    ______    ________    _________    ______    ___________________________    ________________________
    'Male'      'Smith'       38      176      true        124          93          71      'County General Hospital'            'Excellent'
    'Male'      'Johnson'     43      163      false       109          77          69      'VA Hospital'                        'Fair'
    'Female'    'Williams'    38      131      false       125          83          64      'St. Mary's Medical Center'          'Good'
    'Female'    'Jones'       40      133      false       117          75          67      'VA Hospital'                        'Fair'
    'Female'    'Brown'       49      119      false       122          80          64      'County General Hospital'            'Good'
    'Female'    'Davis'       46      142      false       121          70          68      'St. Mary's Medical Center'          'Good'
    'Female'    'Miller'      33      142      true        130          88          64      'VA Hospital'                        'Good'
    'Male'      'Wilson'      40      180      false       115          82          68      'VA Hospital'                        'Good'
    :           :             :      :         :         :           :            :         :                              :
    :           :             :      :         :         :           :            :         :                              :
  3 Comments
KAE on 21 Apr 2021
There needs to be more examples like this in the documentation.
Pouya Aminaie on 19 Mar 2022
What is your response data in this table?



Steven Lord on 14 Jun 2017
Have you seen this example in the documentation? Alternatively, the documentation page for FileDatastore includes an example that creates a datastore for a collection of MAT-files. That might allow you to turn your collection of table arrays into a tall table.

MSTK on 10 Aug 2017
Update:
I am now using fileDatastore to generate a datastore for 20 tables with about 3E8 rows each, each of which fits in memory. The tables are stored in separate .mat files. I had to define a custom load function to prevent tall() from returning a cell array of structs.
As a next step, I want to define a single merged tall array. To treat the subtables as one table, a function like cell2underlying is needed, because tall() outputs a cell array of tables. However, cell2underlying does not work with tables of this size, as it appears to be based on a preview call, and preview quickly causes the computer to run out of memory.
Are there any alternatives to cell2underlying that could work? Sure, there are workarounds, but my main aim here is to test and use the tall array functionality to sort a very large matrix. In my opinion, cell2underlying or similar functionality should be built into tall().
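For readers following along, here is a minimal sketch of the kind of custom load function described above, combined with the UniformRead option (R2017b or later). It is not the poster's actual code: the temporary folder, the file names part_1.mat and part_2.mat, the variable name T, and the helper readTable are all assumptions made for illustration, and the tiny tables stand in for the real data.

```matlab
% Assumed stand-in data: two small tables saved to temporary MAT-files,
% each under the (hypothetical) variable name T.
tmpDir = tempname;
mkdir(tmpDir);
T = table((1:3)', [0.5; 0.1; 0.9], 'VariableNames', {'Id', 'Value'});
save(fullfile(tmpDir, 'part_1.mat'), 'T');
T = table((4:6)', [0.7; 0.2; 0.4], 'VariableNames', {'Id', 'Value'});
save(fullfile(tmpDir, 'part_2.mat'), 'T');

% Hypothetical read function: return the table itself rather than the
% struct that load produces, so each read yields a plain table.
readTable = @(file) getfield(load(file, 'T'), 'T');

% UniformRead tells the datastore that the reads can be vertically
% concatenated, so tall() yields a tall table instead of a tall cell array.
fds = fileDatastore(tmpDir, ...
    'ReadFcn', readTable, ...
    'FileExtensions', '.mat', ...
    'UniformRead', true);

tt = tall(fds);                    % one tall table spanning both files
sortedTT = sortrows(tt, 'Value');  % evaluation is deferred
result = gather(sortedTT);         % bring the (small) result into memory
```

With real data you would typically avoid gathering the full sorted table and write it out or inspect only the head instead.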

bigdataanalyses on 9 Nov 2021
Hoping this helps someone:
Task at hand: After producing one very large 2D table in MATLAB from smaller table excerpts, I save each excerpt as a separate .mat file. However, reloading them causes problems.
Problem: Re-loading the tables stored as .mat via fileDatastore( ) and then transforming them to tall( ) makes each row in the table its own subtable.
Solution: After some trial-and-error, this works:
ds = fileDatastore(fullfile('input','database_regdata_*.mat'), ... % the * matches the numbering of the output files
    'ReadFcn', @(x)struct2table(load(x)), ...                      % try to convert the structure to a table to avoid an error in tall( )
    "FileExtensions", ".mat", ...
    "UniformRead", true);                                          % one big table with the same variables; otherwise an error occurs
% Turn the datastore into a tall array for out-of-memory analysis.
tt = tall(ds);
ttt = vertcat(tt{:,:});
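One possible reading of why the brace indexing in the last line is needed, sketched in memory without a datastore: if each MAT-file holds a single table variable, load returns a struct with one field, and struct2table wraps that table inside a one-variable table. This is an assumption about the poster's files, and the names T, S, wrapped, and inner are illustrative only.

```matlab
% Assumed setup: a small table standing in for one saved excerpt.
T = table((1:3)', {'a'; 'b'; 'c'}, 'VariableNames', {'Id', 'Label'});

% load() on a MAT-file holding only the variable T returns a struct like this.
S = struct('T', T);

% struct2table turns the field into a table variable, so the original
% table ends up nested inside a table with a single variable named T.
wrapped = struct2table(S);

% Brace indexing over all rows and variables pulls the nested table back out.
inner = wrapped{:, :};
```

If that is the situation, a read function that returns the stored table directly would avoid the extra unwrapping step.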
