How to set table VariableNames from a cell?
Anandasubramanian Pranatharthy
on 13 Oct 2018
Edited: Walter Roberson
on 14 Oct 2018
I have a peculiar, frustrating situation: I am trying to set the headers of a 1x19 table, read from an Excel sheet, to provide as input to a machine learning struct. As per the documentation of the struct, I need to provide a MATLAB table with 19 variables of numeric data with the following headers.
gender seniorcitizen partner dependents tenure phoneservice multiplelines internetservice onlinesecurity onlinebackup deviceprotection techsupport streamingtv streamingmovies contract paperlessbilling paymentmethod monthlycharges totalcharges
10 0 200 200 16 100 200 200 300 300 300 300 300 300 502 200 630 18.95 326.8
However, for some reason, I get the dreaded, unintelligible error message:
Function 'subsindex' is not defined for values of class 'table'.
1. I tried importing the Excel sheet as a variable using the import facility. That does not work.
2. I tried creating the table from a cell array of headers using cell2table. That gives another of the infamous cryptic errors. (Seriously, I wonder who writes those error messages in MATLAB? They seem to have special training in making them as unfriendly as possible.)
I found that the Properties of the table in MATLAB workspace look like below:
description: ''
userdata: []
dimensionnames: {'row' 'variables'}
variablenames: {1×19 cell}
variabledescriptions: {}
variableunits: {}
rownames: {}
I need to set the VariableNames directly as values instead of a 1x19 cell. How do I do it? (And please avoid referring me to the MATLAB table documentation. It does not work.)
7 Comments
dpb
on 14 Oct 2018
What you DON'T show us is how you created
willCustomerChurn
so we can figure out what it is, specifically.
I'm guessing you use one of the ML prepackaged applications and Exported its results to the workspace?
If so, which app? Then we can look at the doc for it and follow along. Without, we're still guessing...
Accepted Answer
dpb
on 14 Oct 2018
Edited: dpb
on 14 Oct 2018
OK, there was enough of a hint that you used Regression Learner App. I'd never even opened it before, but since you used "Churn" in the model name and had the list of variables, I was able to build a model.
Let's see if it can predict something; I just called the exported model TM for brevity:
>> TM.predictFcn(T(1:3,:)) % predict first three values from existing table
ans =
200.0000
200.0000
155.5556
>>
The easiest way to use the model is something like the following, presuming the original table is T:
TT=T(1:3,:); % just make a copy of the first three rows; gives the variables as exist
TT.SeniorCitizen=ones(3,1); % let's change a variable to something different...
>> TM.predictFcn(TT) % and see what we now predict for those...
ans =
100.0000
200.0000
111.1111
>>
So it did make a difference in the prediction for two of the three rows. I'm not sure, without more in-depth digging, why the second is unchanged, but the model seems to work as advertised.
I'm not sure just what you actually tried to do; perhaps building a table that doesn't match the original but only includes the predictors doesn't work? I don't know; I didn't try it.
Again, show the exact code you tried, in sequence, without any attempt to analyze it, so that we can see it and can probably tell where it went wrong, too.
ADDENDUM
To create a new table that just has the required variables in it, use the facility of table addressing...
>> Tpred=T(1:3,TM.RequiredVariables); % just three rows for brevity
Tpred =
3×19 table
gender SeniorCitizen Partner Dependents tenure PhoneService MultipleLines InternetService OnlineSecurity OnlineBackup DeviceProtection TechSupport StreamingTV StreamingMovies Contract PaperlessBilling PaymentMethod MonthlyCharges TotalCharges
______ _____________ _______ __________ ______ ____________ _____________ _______________ ______________ ____________ ________________ ___________ ___________ _______________ ________ ________________ _____________ ______________ ____________
20 0 100 200 1 200 700 400 200 100 200 200 200 200 500 100 600 29.85 29.85
10 0 200 200 34 100 200 400 100 200 100 200 200 200 501 200 610 56.95 1889.5
10 0 200 200 2 100 200 400 100 100 200 200 200 200 500 100 610 53.85 108.15
>>
As this shows, the cell array of names is just fine for addressing variables. I don't know where you ran into the categorical problem, unless you had converted some of the input table variables to categorical before importing into the App.
BTW, this table is also OK for predicting with; the model object clearly matches variables by name in the table. I'd venture that one could have other variables in the table as well, as long as the originals are there and have the same data types as when the model was fitted.
Table addressing is extremely flexible, but you have to get the syntax right for what you are trying to do; the detailed info is at Access-data-in-a-table.
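For anyone unsure which indexing form returns what, here is a minimal sketch of the three common forms. The table, its values, and the name list `wanted` are made up for illustration; they only stand in for T and TM.RequiredVariables from the thread.

```matlab
% Toy table standing in for the original data (names and values are made up).
T = table([20;10;10], [0;0;1], [29.85;56.95;53.85], ...
    'VariableNames', {'gender','SeniorCitizen','MonthlyCharges'});

wanted = {'gender','MonthlyCharges'};   % cell array of variable names

Tsub = T(1:2, wanted);     % parentheses return a table (here 2 rows, 2 variables)
x    = T.MonthlyCharges;   % dot indexing returns the column's contents
A    = T{:, wanted};       % braces return the contents, here a numeric matrix
```

Parentheses are the form to use when the result has to stay a table, as it must for predictFcn.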
ADDENDUM SECOND
BTW, if you want or need to, you can build a new table from scratch with the required variables using table. If the data are in workspace variables with the desired names, those become the default variable names; otherwise the optional 'VariableNames' parameter accepts the cell array of names, or you can rename the variables after creating the table by setting Properties.VariableNames.
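A minimal sketch of those three routes, using only three of the predictor names and made-up values for brevity. In the thread the name list would come from TM.RequiredVariables; the variables `names` and `vals` here are hypothetical.

```matlab
% Hypothetical subset of the predictor names and one row of made-up values.
names = {'gender','SeniorCitizen','tenure'};
vals  = [10 0 16];

% Route 1: pass the names when the table is created.
T1 = array2table(vals, 'VariableNames', names);

% Route 2: create the table with default names, then rename the variables.
T2 = array2table(vals);
T2.Properties.VariableNames = names;

% Route 3: workspace variable names become the table variable names.
gender = 10; SeniorCitizen = 0; tenure = 16;
T3 = table(gender, SeniorCitizen, tenure);
```

All three tables should end up with the same variable names, so any of them can be used, provided the data types also match what the model was trained on.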
5 Comments
dpb
on 14 Oct 2018
Glad to help...once one can get to the root of the problem it's generally not too bad! :)
"I ditched he numeric data option..."
I don't know what that means, but OK (I presume it has something to do with the ML App, most likely).
NB: Using the '.RequiredVariables' array to index the original table was the obvious way here, since you already had the existing table; as noted, if you were building a table from scratch from an external data set for prediction, it can be built directly as well.
Anandasubramanian Pranatharthy
on 14 Oct 2018
Edited: Walter Roberson
on 14 Oct 2018
More Answers (2)
Walter Roberson
on 14 Oct 2018
Function 'subsindex' is not defined for values of class 'table'.
That means you cannot use a table as a subscript.
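As a hedged illustration of how that situation can arise (this is not the OP's code, which we have not seen), the sketch below uses a made-up one-cell table `idx` as a subscript and then shows the usual fix of extracting the value first. The exact wording of the error differs between MATLAB releases, so the sketch only records that an error occurred.

```matlab
v   = [10 20 30 40];
idx = table(2, 'VariableNames', {'pos'});   % hypothetical one-cell table

failed = false;
try
    y = v(idx);        % a table used as a subscript: this raises an error
catch
    failed = true;     % message wording varies by release
end

y = v(idx.pos);        % extract the numeric value first, then index
```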
I need to set the VariableNames directly as values instead of a 1x19 cell. How do i do it?
The VariableNames property of a table may be set to either a string array or a cell array of character vectors; if a string array is used then it will be converted to a cell array of character vectors.
aaa = array2table(randi(10,5,3));
aaa.Properties.VariableNames = {'gender', 'seniorcitizen', 'partner'};
"Attached is the excel itself. The 'Numeric Data' tab is the one am trying to use."
data = readtable('Telco Customer Churn2.xlsx', 'Sheet', 'Numeric Data');
data(1:4, :)
ans =
4×21 table
customerID gender SeniorCitizen Partner Dependents tenure PhoneService MultipleLines InternetService OnlineSecurity OnlineBackup DeviceProtection TechSupport StreamingTV StreamingMovies Contract PaperlessBilling PaymentMethod MonthlyCharges TotalCharges Churn
____________ ______ _____________ _______ __________ ______ ____________ _____________ _______________ ______________ ____________ ________________ ___________ ___________ _______________ ________ ________________ _____________ ______________ ____________ _____
'7590-VHVEG' 20 0 100 200 1 200 700 400 200 100 200 200 200 200 500 100 600 29.85 29.85 200
'5575-GNVDE' 10 0 200 200 34 100 200 400 100 200 100 200 200 200 501 200 610 56.95 1889.5 200
'3668-QPYBK' 10 0 200 200 2 100 200 400 100 100 200 200 200 200 500 100 610 53.85 108.15 100
'7795-CFOCW' 10 0 200 200 45 200 700 400 100 200 100 100 200 200 501 200 620 42.3 1840.75 200
"cannot automatically convert a double variable to categorical values."
Is it possible that you have an existing table object that you are trying to add more data to?
1 Comment
dpb
on 14 Oct 2018
"Is it possible that you have an existing table object that you are trying to add more data to?"
I think it's clear the OP does have some other object that he is trying to set or update, but he seems more than reluctant to share just what. The likely cause is not understanding what it takes to ask a question that can be answered, and not thinking about what readers can know or infer from the posting alone, as opposed to his own knowledge of everything associated with the problem on his end. Pushing for that so far hasn't yielded what we really need to know...
Guillaume
on 14 Oct 2018
Right, this is a case of a badly explained problem and of a user not really understanding the error message he's been given or the code he is using.
As stated by the HowToPredict field of the willCustomerChurn structure, the code requires a table with
- variable names that are identical to those in willCustomerChurn.RequiredVariables. Now if we look at the variable names of the table, they match. So that's not the problem, and asking how to change the variable names (trivial to do, b.t.w.) is barking up the wrong tree.
- "variable formats (e.g. matrix/vector, datatype) must match the original training data". Looking at the error, the code expects a variable to be categorical. Instead it is double. That's the problem that needs fixing.
As it is, we don't have enough information to know which variable should be categorical. The best thing to do would be to use the same code that was used to load the training data into a table to load the current file.
Failing that, we can take a guess. Loading the given excel file, I get a table with 21 variables (not 19) with only 4 variables that are of type double. Among these 'SeniorCitizen' is the most likely candidate to be converted to categorical, so
yourtable.SeniorCitizen = categorical(yourtable.SeniorCitizen);
may fix the problem. However, it's possible that some of the text variables may need converting to categorical as well. Looking at the training table and comparing the type of each variable would be the best way to know.
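A rough sketch of that comparison, assuming both tables are available and list their variables in the same order. `Ttrain` and `Tnew` are made-up stand-ins for the training table and the table being passed to the model.

```matlab
% Made-up stand-ins for the training table and the new table.
Ttrain = table(categorical([0;1;0]), [29.85;56.95;53.85], ...
    'VariableNames', {'SeniorCitizen','MonthlyCharges'});
Tnew = table([0;1;1], [18.95;42.30;70.10], ...
    'VariableNames', {'SeniorCitizen','MonthlyCharges'});

% Class of every variable, as a cell array of character vectors.
trainClass = varfun(@class, Ttrain, 'OutputFormat', 'cell');
newClass   = varfun(@class, Tnew,   'OutputFormat', 'cell');

% Names of the variables whose class differs between the two tables.
mismatch = Tnew.Properties.VariableNames(~strcmp(trainClass, newClass));

% Convert only the variables that were categorical in the training table.
for k = 1:numel(mismatch)
    name = mismatch{k};
    if iscategorical(Ttrain.(name))
        Tnew.(name) = categorical(Tnew.(name));
    end
end
```

This only aligns the classes; whether the categories themselves also have to match those seen in training is something to check against the training data.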
Note: "So i created a categorical of table2 into table3" What does that mean? You cannot convert a table to categorical. You can only convert variables into categorical.
3 Comments
Walter Roberson
on 14 Oct 2018
"Loading the given excel file, I get a table with 21 variables (not 19) with only 4 variables that are of type double."
data = readtable('Telco Customer Churn2.xlsx', 'Sheet', 'Numeric Data');
Everything except the first column comes out as double.
Guillaume
on 14 Oct 2018
"Everything except the first column comes out as double"
I just used a plain readtable and didn't check the excel file to see if there was more than one sheet. That's probably why it's different.