Getting error message "Index exceeds the number of array elements. Index must not exceed 0."

T = readtable("Data.xlsx");
data = readtable('Data.xlsx','TextType','string');
textData = data.Properties.Description;
textData(1:10)
cleanedDocuments = tokenizedDocument(textData);
cleanedDocuments(1:10)
cleanedDocuments = addPartOfSpeechDetails(cleanedDocuments);
cleanedDocuments = removeStopWords(cleanedDocuments);
cleanedDocuments(1:10)
cleanedDocuments = normalizeWords(cleanedDocuments,'Style','lemma');
cleanedDocuments(1:10)
cleanedDocuments = erasePunctuation(cleanedDocuments);
cleanedDocuments(1:10)
cleanedBag = bagOfWords(cleanedDocuments);
cleanedBag = removeInfrequentWords(cleanedBag,2);
[cleanedBag,idx] = removeEmptyDocuments(cleanedBag);
labels(idx) = [];
cleanedBag;

3 Comments

We cannot reproduce the error without your Data.xlsx file. Thus you will wait in vain for an answer.
The file has thousands of data points to analyze.
Thousands of data points in an Excel file is not too many to upload, and that's the fastest way for us to help you.
You could also just upload a few rows of the file, if that gives the same error. (If that does not give the same error, then you've taken a step toward debugging the problem.)
Also, which line gives that error?


Answers (2)

By default, readtable() uses detectImportOptions or one of its variations. For an xlsx file, a SpreadsheetImportOptions object is created. That kind of import options object has no property that controls where to look in the xlsx file for information to store in the table's Description property.
readtable() itself likewise has no option to indicate where to find information to store in the table's Description property.
In other words, the table property 'Description' is initialized to empty, but your code expects it to have at least 10 elements.
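To illustrate the point, here is a minimal sketch that does not need Data.xlsx. The small table, its variable names (Text, Value), and its contents are made up for illustration; only the behavior of the Properties.Description field is the point. Indexing the empty property with 1:10 should raise the same kind of indexing error as in the question.

```matlab
% Hypothetical stand-in for the table that readtable() returns.
data = table(["alpha"; "beta"; "gamma"], [1; 2; 3], ...
    'VariableNames', {'Text', 'Value'});

% Table-level metadata: stays empty unless you assign it yourself.
desc = data.Properties.Description;
descIsEmpty = isempty(desc);
disp(descIsEmpty)

% Asking for 10 elements of an empty array fails.
errorMessage = "";
try
    desc(1:10)
catch err
    errorMessage = string(err.message);
    disp(errorMessage)
end
```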
There is a property with a related name, data.Properties.VariableDescriptions, which can hold a description for each variable. readtable() can set VariableDescriptions under at least some conditions: either the conditions have to be just right for automatic detection of variable descriptions, or the detected variable names have to include at least one name that is not a valid MATLAB identifier. In the latter case, the default is to generate valid MATLAB variable names for the columns and to write the detected variable names into the VariableDescriptions property.
Note that data.Properties.Description is not the same as data.Description, which is what you would use if you had a variable whose name was Description.
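If the spreadsheet really does have a column named Description (an assumption, since the file was never posted), the text would be read from that variable rather than from the table properties. The sketch below uses a made-up table in place of readtable('Data.xlsx','TextType','string'); the column names and contents are hypothetical. It also guards the preview so that it does not fail when there are fewer than 10 rows.

```matlab
% Hypothetical stand-in for readtable('Data.xlsx','TextType','string').
Description = ["first document text"; "second document text"; "a third one"];
Label = ["a"; "b"; "c"];
data = table(Description, Label);

% Check which variables were actually imported before indexing into them.
disp(data.Properties.VariableNames)
disp(data.Properties.VariableDescriptions)  % empty here: nothing was set

% Read the column named Description, not the table-level property.
textData = data.Description;

% Preview at most 10 rows, or fewer if the table is shorter.
n = min(10, numel(textData));
disp(textData(1:n))
```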
[cleanedBag,idx] = removeEmptyDocuments(cleanedBag);
labels(idx) = [];
Is it possible that there are no empty documents?
Or that labels is not the same size as cleanedBag?
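One way to check both questions is sketched below. It requires Text Analytics Toolbox, and the sample strings and the labels vector are invented for illustration, because the question never shows where labels is defined. The idea is that labels needs one entry per document before removeEmptyDocuments is called; if idx comes back empty, the deletion simply removes nothing.

```matlab
% Requires Text Analytics Toolbox. Sample text and labels are hypothetical.
str = ["an example of a short sentence"; ""; "a second short sentence"];
labels = ["x"; "y"; "z"];   % one label per document

documents = tokenizedDocument(str);
bag = bagOfWords(documents);

% labels and the bag must line up before anything is removed.
assert(numel(labels) == bag.NumDocuments, 'labels must match the bag size');

% idx holds the indices of the documents that were removed.
[bag, idx] = removeEmptyDocuments(bag);
labels(idx) = [];

% After the deletion, the two should still line up.
sizesMatch = (numel(labels) == bag.NumDocuments);
disp(sizesMatch)
```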

Release: R2023a

Asked: 8 Sep 2023

Answered: 19 Nov 2023
