readtable changing range (not all contents)

Hello,
Manually Importing an XLS file (right click import data) the window by default highlights a matrix, say C10:X90
If I manually change the default "column vectors" to "table" and then click import, then it imports the matrix I need.
I need to do this for many files. However, using readtable('file1.xls') it imports all contents, beyond that matrix.
The matrix range changes from file to file so I cannot fix it to C10:X90, so file2.xls can be C15:X80, etc
How can I import the highlighted matrix (which changes from file to file) in a loop?

7 Comments

By "highlighted matrix", do you mean to say that the data you're interested in importing is different from file to file, and you want to be able to select that data visually?
I suspect that the import tool is being used. I seem to recall that the import tool will attempt to find a rectangular numeric area.
If that is the task, to find a rectangular numeric area, then readtable() as a first step is probably still your best choice.
I attach an xls file with the data. In Matlab, if you right-clik file, then import data, you will see that the highlighted data is a matrix. I would like to import/read that matrix, and not the whole spreadsheet.
If you use
readtable('Importing.xls')
it imports all data, beyond the matrix.
I have multiple files like this that I would like to import one at a time, stack w previous file and so on. The problem is that the matrix of interest is not always the same range. In the example file it is A7:C15 but that can be A5:C18, etc. Columns A:C are fixed, what changes is the row start-end.
For some, good, reason, when I manually right click import, by default it selects the matrix I need, so I need to do this in a loop. Thanks
I could use
[num,txt,raw] = xlsread('Importing.xls','A7:C15')
But some files have different starting-ending rows. So I don't know how to auto-select the apropriate matrix range.
If I manually select to import data, and in "Import select" I select generate function, the code reads this
% If row start and end points are not specified, define defaults
if nargin <= 3
startRow = 7;
endRow = 15;
end
But how does it specify the start and endRow? Again, I have multiple files, and the start/endRow changes from file to file.
I manage to get the startRow but cannot seem to get the endRow, help
opts = detectImportOptions('Importing.xls');
startRow=opts.DataRange
With that file the highlighted area is A7:F20. There is numeric data in F7 and there is no obvious reason why it should be excluded.
If you were to use
[num,txt,raw] = xlsread('Importing.xls')
then num would correspond to B7:F15
If you want to automate then you need to define more rigorously what is to be imported or not. For example is the rule that you are always extracting from the left side, column A? And is the rule that you always stop at the first empty column (excluding header lines) ?

Sign in to comment.

Answers (1)

Not sure how to detect the end of your dataset before using readtable, but what about reading it in and then deleting the extra rows? For this, I'm assuming Title1 will be populated for every row that has data. Adjust for how your data actually behaves.
opts = detectImportOptions('Importing.xls');
startRow=opts.DataRange
tbl = readtable('Importing.xls',opts)
tbl(ismissing(tbl.Title1),:) = []
Probably worth stating that this also assumes every file has a column header Title1 that contains the Char2, ... values.

1 Comment

If you're already using import options,
opts.MissingRule = 'omitrow' will remove rows with missing. However that will omit rows with ANY missing data in ANY column.
But, if all you want is to read a block, setting opts.DataRange to the beginning cell of that range will read until it reaches the bottom, then stop.
e.g.
opts.DataRange = 'B3'

Sign in to comment.

Asked:

on 11 Dec 2018

Commented:

on 14 Dec 2018

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!