How do I read only specific columns in from an Excel file, not consecutive columns?

I have a set of Excel files that are ~ 17,000 rows by 6,000 columns. Needless to say, that's kind of tough to work with. I would like to take (for example), columns 1, 2, 5, 77, 124,255, 334, 1000, etc- you get the idea. Seems like the xlsread function only accepts Excel cell notation ('A1:B5') as input. Is there any slick way to read in just some columns from my massive spreadsheets?
As a follow up, I have a loop to write data for each file, but what I really want is the data from each new file vertically concatenated onto the existing file. Can I just use vertcat somehow, or since it's a loop, do I need to use Data=[Data;newData] or something like that?

 Accepted Answer

I enclose a first version of a function I just wrote. This will read a number of columns from a large Excel file. Since it opens the Excel file only once, it is significantly faster than repeated calls to xlsread, reading one column at a time.
No error checks so far. A crash may leave an Excel process running. End it using the Windows tast manager
function data = read_excel_columns(filename,sheet,columns,firstrow,lastrow)
% Read selected columns from large Excel sheet using ActiveXServer
% filename: Seems that you have to use the full path to the Excel file
% sheet : e.g. 'Sheet1'
% columns : array of column mumbers, e.g [17,341,784]
% firstrow, lastrow: The first and last rows to be read
% data: : array of numerical values
%
% Are Mjaavatten, 2016-03-14
nrows = lastrow-firstrow+1;
ncols = length(columns);
data = zeros(nrows,ncols);
first = num2str(firstrow);
last = num2str(lastrow);
hExcel = actxserver('Excel.Application');
hWorkbook = hExcel.Workbooks.Open(filename);
hWorksheet = hWorkbook.Sheets.Item(sheet);
for i = 1:ncols
col = col2str(columns(i));
Range = [col,first,':',col,last];
RangeObj = hWorksheet.Range(Range);
data(:,i) = cell2mat(RangeObj.value);
end
release(hWorksheet)
release(hWorkbook)
release(hExcel)
end
function colname = col2str(n)
% Translate Excel column number to Column characters
s = '';
while n > 0
s = [s,char(mod(n-1,26)+65)];
n = floor((n-1)/26);
end
colname = deblank(fliplr(s));
end

3 Comments

Hi, thanks very much! Only trouble I am having is with the sheet callout. Each spreadsheet has a different sheet name- is there anyway to make it so it just reads the first sheet as a default? Can I just remove the "hWorksheet" line and change hWorksheet to hWorkbook wherever it occurs to make it just read the first sheet as default? I've tried using both the sheet name as a string and the value 1 for sheet, and neither work. Thanks!
Try using the number 1 as the sheet argument (argument no. 2)
I just submitted an improved version of this file to the file exchange.

Sign in to comment.

More Answers (0)

Categories

Asked:

K
K
on 13 Mar 2016

Commented:

on 15 Mar 2016

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!