MATLAB Answers

How do I read only specific columns in from an Excel file, not consecutive columns?

48 views (last 30 days)
K
K on 13 Mar 2016
Commented: Are Mjaavatten on 15 Mar 2016
I have a set of Excel files that are ~ 17,000 rows by 6,000 columns. Needless to say, that's kind of tough to work with. I would like to take (for example), columns 1, 2, 5, 77, 124,255, 334, 1000, etc- you get the idea. Seems like the xlsread function only accepts Excel cell notation ('A1:B5') as input. Is there any slick way to read in just some columns from my massive spreadsheets?
As a follow up, I have a loop to write data for each file, but what I really want is the data from each new file vertically concatenated onto the existing file. Can I just use vertcat somehow, or since it's a loop, do I need to use Data=[Data;newData] or something like that?

  0 Comments

Sign in to comment.

Accepted Answer

Are Mjaavatten
Are Mjaavatten on 14 Mar 2016
I enclose a first version of a function I just wrote. This will read a number of columns from a large Excel file. Since it opens the Excel file only once, it is significantly faster than repeated calls to xlsread, reading one column at a time.
No error checks so far. A crash may leave an Excel process running. End it using the Windows tast manager
function data = read_excel_columns(filename,sheet,columns,firstrow,lastrow)
% Read selected columns from large Excel sheet using ActiveXServer
% filename: Seems that you have to use the full path to the Excel file
% sheet : e.g. 'Sheet1'
% columns : array of column mumbers, e.g [17,341,784]
% firstrow, lastrow: The first and last rows to be read
% data: : array of numerical values
%
% Are Mjaavatten, 2016-03-14
nrows = lastrow-firstrow+1;
ncols = length(columns);
data = zeros(nrows,ncols);
first = num2str(firstrow);
last = num2str(lastrow);
hExcel = actxserver('Excel.Application');
hWorkbook = hExcel.Workbooks.Open(filename);
hWorksheet = hWorkbook.Sheets.Item(sheet);
for i = 1:ncols
col = col2str(columns(i));
Range = [col,first,':',col,last];
RangeObj = hWorksheet.Range(Range);
data(:,i) = cell2mat(RangeObj.value);
end
release(hWorksheet)
release(hWorkbook)
release(hExcel)
end
function colname = col2str(n)
% Translate Excel column number to Column characters
s = '';
while n > 0
s = [s,char(mod(n-1,26)+65)];
n = floor((n-1)/26);
end
colname = deblank(fliplr(s));
end

  3 Comments

K
K on 14 Mar 2016
Hi, thanks very much! Only trouble I am having is with the sheet callout. Each spreadsheet has a different sheet name- is there anyway to make it so it just reads the first sheet as a default? Can I just remove the "hWorksheet" line and change hWorksheet to hWorkbook wherever it occurs to make it just read the first sheet as default? I've tried using both the sheet name as a string and the value 1 for sheet, and neither work. Thanks!

Sign in to comment.

More Answers (0)

Sign in to answer this question.