MATLAB Answers

Datastore - support for mat-files

MSTK on 14 Jun 2017
Commented: MSTK on 18 Oct 2017
I am trying to use the tall array functionality to sort a very large table and to run other calculations on it. The tables are currently stored in several MAT-files. When I try to create a datastore object to use as input to the tall arrays, it does not seem possible to pass one or more MAT-files as input.
Am I missing something? Do I have to export the data to some other format to use as input for the datastore object? Any input would be appreciated.


Answers (3)

Steven Lord on 14 Jun 2017
Have you seen this example in the documentation? Alternately, the documentation page for FileDatastore includes an example that creates a datastore for a collection of MAT-files. That might allow you to turn your collection of table arrays into a tall table.
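For illustration, here is a minimal sketch of the FileDatastore approach with a custom read function. The temporary folder, the file names part1.mat to part3.mat, and the variable name T are made up for this example and are not taken from the question; the small tables only stand in for the real data.

```matlab
% Sketch: build a datastore over several MAT-files, each holding one table.
% The folder, the file names, and the variable name "T" are hypothetical.
folder = tempname;
mkdir(folder);
for k = 1:3
    T = table((1:4)' + 10*k, rand(4,1), 'VariableNames', {'Id', 'Value'});
    save(fullfile(folder, sprintf('part%d.mat', k)), 'T');
end

% ReadFcn receives one file name and returns what read() should produce.
% load returns a struct, so the table is pulled out of its field "T".
readTable = @(fname) getfield(load(fname, 'T'), 'T');

fds = fileDatastore(fullfile(folder, '*.mat'), 'ReadFcn', readTable);

% Each call to read returns the table stored in one file.
while hasdata(fds)
    part = read(fds);
    fprintf('%d rows, first Id = %d\n', height(part), part.Id(1));
end
```

Because the read function returns the table itself rather than the struct from load, each read yields a table instead of a struct wrapping one.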



Anandan Rangasamy on 17 Oct 2017

If you are using MATLAB R2017b, you can set the UniformRead option to true when creating the fileDatastore. That saves you from using cell2underlying and lets you create a tall table directly. Here is the documentation: FileDatastore

>> fds = fileDatastore('patients.mat', 'ReadFcn', @(x)struct2table(load(x)), 'UniformRead', true);
>> t = tall(fds)
t =
Mx10 tall table
     Gender      LastName     Age    Weight    Smoker    Systolic    Diastolic    Height             Location              SelfAssessedHealthStatus
    ________    __________    ___    ______    ______    ________    _________    ______    ___________________________    ________________________
    'Male'      'Smith'       38      176      true        124          93          71      'County General Hospital'            'Excellent'
    'Male'      'Johnson'     43      163      false       109          77          69      'VA Hospital'                        'Fair'
    'Female'    'Williams'    38      131      false       125          83          64      'St. Mary's Medical Center'          'Good'
    'Female'    'Jones'       40      133      false       117          75          67      'VA Hospital'                        'Fair'
    'Female'    'Brown'       49      119      false       122          80          64      'County General Hospital'            'Good'
    'Female'    'Davis'       46      142      false       121          70          68      'St. Mary's Medical Center'          'Good'
    'Female'    'Miller'      33      142      true        130          88          64      'VA Hospital'                        'Good'
    'Male'      'Wilson'      40      180      false       115          82          68      'VA Hospital'                        'Good'
    :           :             :      :         :         :           :            :         :                              :
    :           :             :      :         :         :           :            :         :                              :
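Since the original goal was sorting, here is a hedged sketch of how the tall table above could be sorted. It assumes R2017b or later (for UniformRead) and the patients.mat sample file used in the snippet above; sorting by Age and taking five rows are arbitrary choices for the example.

```matlab
% Sketch, assuming R2017b or later and the patients.mat sample file.
fds = fileDatastore('patients.mat', ...
    'ReadFcn', @(x) struct2table(load(x)), ...
    'UniformRead', true);
t = tall(fds);

% sortrows on a tall table is deferred; no data is read at this point.
byAge = sortrows(t, 'Age');

% gather triggers the evaluation; only the first five rows come into memory.
youngest = gather(head(byAge, 5));
disp(youngest(:, {'LastName', 'Age'}))
```

Calling gather on the whole sorted table would pull everything into memory, so for large data it is usually better to gather a reduced result or to write the sorted tall table back to disk.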

  1 Comment

MSTK on 18 Oct 2017
Thanks for the input. My organization has not yet made the upgrade to R2017b available, but I will try this once it is.



MSTK on 10 Aug 2017
Update:
I am now using fileDatastore to generate a datastore for 20 tables with about 3E8 rows each, where each table fits in memory. The tables are stored in separate .mat files. I had to define a custom load function so that the output of the tall() function is not a cell array of structs.
As a next step, I want to define a single merged tall array. To treat the subtables as one table, a function like cell2underlying is needed, because the tall() function outputs a cell array of tables. The cell2underlying function does not work with tables of this size, as it appears to be based on a preview call, and preview quickly causes the computer to run out of memory.
Are there any alternatives to cell2underlying that could work? Sure, there are workarounds, but my main aim here is to test and use the tall array functionality to sort a very large matrix. In my opinion, cell2underlying or similar functionality should be included in the tall() function.
M

