MATLAB Answers

Define a collection of variables

alpedhuez on 16 Jan 2021
Commented: alpedhuez on 17 Jan 2021
Suppose I have a table
date visitor city dummy_for_January dummy_for_February dummy_for_March
----------------------------------------------------------------------------------------------------------
I want to define a collection of such dummy variables so that I do not need to write them one by one each time. Please advise.

  2 Comments

Cris LaPierre on 16 Jan 2021
Assuming you already have a table defined, you can extract the variable names from the table and store them in a variable.
varNames = T.Properties.VariableNames   % T is your table
You can then save that variable to a MAT-file if you want it to be accessible from one MATLAB session to the next.
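As a rough sketch of that suggestion, the snippet below extracts the names, saves them, and loads them back. The table `T`, its contents, and the file name `varNames.mat` are made up for illustration and are not from the original question.

```matlab
% Hypothetical table standing in for the asker's data (names are assumptions).
T = table([10; 20], [1; 0], [0; 1], ...
    'VariableNames', {'visitor', 'dummy_for_January', 'dummy_for_February'});

% Extract the variable names once.
varNames = T.Properties.VariableNames;

% Save them to a MAT-file (the temp folder is used here only for illustration).
matFile = fullfile(tempdir, 'varNames.mat');
save(matFile, 'varNames');

% In a later session, load the names back and use them to index a table.
S = load(matFile, 'varNames');
Tsub = T(:, S.varNames(2:end));   % every variable except the first
```

Loading into the struct `S` rather than straight into the workspace avoids silently overwriting an existing `varNames`.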
Image Analyst on 16 Jan 2021
I have no idea. To create a variable, you're going to have to write it - you'll have to see its name down in your script somewhere, which means you typed it. It won't just magically create itself somehow without being written. Try this:


Accepted Answer

Matt J on 17 Jan 2021
Since you already have the data in table form, you can just do things like,
mdl=fitlm(T(:,["Albania","Afghanistan","sales"]))

  5 Comments

alpedhuez on 17 Jan 2021
I meant:
In the first regression, the independent variables are 'population' and the dummies for all countries.
In the second regression, the independent variables are 'area' and the dummies for all countries.
In the third regression, the independent variables are 'income' and the dummies for all countries.
What is a concise way to write the independent variables?
Matt J on 17 Jan 2021
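First, create a copy of the table with the non-country predictors removed (this assumes the response, sales, is still in T and so stays in the copy),
Tcountries=T;
Tcountries(:,["population" "area" "income"])=[]; %discard non-country predictors
Now, you can do things like,
%fitlm treats the last table variable as the response unless 'ResponseVar' is given
mdl1=fitlm([Tcountries, T(:,"population")],'ResponseVar','sales');
mdl2=fitlm([Tcountries, T(:,"area")],'ResponseVar','sales');
mdl3=fitlm([Tcountries, T(:,"income")],'ResponseVar','sales');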
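An alternative sketch of the same idea keeps the list of dummy names in one variable and loops over the specifications. The data below are synthetic, the country names are borrowed from the answer above, and the variable names `dummies` and `specs` are hypothetical. `fitlm` requires the Statistics and Machine Learning Toolbox.

```matlab
% Synthetic data; all names and values here are illustrative assumptions.
rng(0);
n = 30;
country = randi(3, n, 1);                 % country index: 1, 2 or 3
T = table(rand(n,1), rand(n,1), rand(n,1), ...
    double(country == 1), double(country == 2), rand(n,1), ...
    'VariableNames', {'population','area','income','Albania','Afghanistan','sales'});

% Define the shared set of dummy variables once.
dummies = ["Albania", "Afghanistan"];

% One regression per specification; only the first predictor changes.
specs = ["population", "area", "income"];
mdls = cell(size(specs));
for k = 1:numel(specs)
    mdls{k} = fitlm(T, 'ResponseVar', 'sales', ...
        'PredictorVars', [specs(k), dummies]);
end
```

Naming the response and predictors explicitly means the column order of `T` no longer matters, and adding a country only requires changing `dummies`.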


More Answers (1)

Walter Roberson on 16 Jan 2021
mons = {'January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December'};
vars = [string({'date', 'visitor', 'city'}), "dummy_for_" + string(mons)];
nvars = length(vars);
T = array2table(zeros(0,nvars), 'VariableNames', vars)
T = 0x15 empty table
T.Properties.VariableNames
ans = 1x15 cell array
{'date'} {'visitor'} {'city'} {'dummy_for_January'} {'dummy_for_February'} {'dummy_for_March'} {'dummy_for_April'} {'dummy_for_May'} {'dummy_for_June'} {'dummy_for_July'} {'dummy_for_August'} {'dummy_for_September'} {'dummy_for_October'} {'dummy_for_November'} {'dummy_for_December'}

  3 Comments

alpedhuez on 17 Jan 2021
I'd like to follow up. This was my use case:
I want to run a regression with dummy variables for each country, such as
(sales) = a+b*(population)+c*(dummy variable for Afghanistan)+d*(dummy variable for Albania) + ....
According to the example I can write
X1=[T.population, T.dummy_for_Afganistan, T.dummy_for_Albania,...]
and run
fitlm(X1, T.sales)
But I want to run regressions with various specifications like
X2=[T.temperature, T.dummy_for_Afganistan, T.dummy_for_Albania]
It then becomes tedious to write out every country dummy variable for each X. So I want to know whether there is a simpler way to do this by defining one variable that holds all of the country dummy variables.
Steven Lord on 17 Jan 2021
There are a number of different ways to retrieve data from a table array. Let's make a sample table.
load patients
patients = table(LastName,Gender,Age,Height,Weight,Smoker,Systolic,Diastolic);
head(patients)
ans = 8x8 table
      LastName        Gender      Age    Height    Weight    Smoker    Systolic    Diastolic
    ____________    __________    ___    ______    ______    ______    ________    _________
    {'Smith'   }    {'Male'  }    38       71       176      true        124          93
    {'Johnson' }    {'Male'  }    43       69       163      false       109          77
    {'Williams'}    {'Female'}    38       64       131      false       125          83
    {'Jones'   }    {'Female'}    40       67       133      false       117          75
    {'Brown'   }    {'Female'}    49       64       119      false       122          80
    {'Davis'   }    {'Female'}    46       68       142      false       121          70
    {'Miller'  }    {'Female'}    33       64       142      true        130          88
    {'Wilson'  }    {'Male'  }    40       68       180      false       115          82
I can retrieve variables to a normal double array using curly braces and specifying the variables either by name or by number.
T1 = patients{1:8, ["Age", "Height", "Weight"]}
T1 = 8×3
    38    71   176
    43    69   163
    38    64   131
    40    67   133
    49    64   119
    46    68   142
    33    64   142
    40    68   180
T2 = patients{1:8, 3:5}
T2 = 8×3
    38    71   176
    43    69   163
    38    64   131
    40    67   133
    49    64   119
    46    68   142
    33    64   142
    40    68   180
Or I can find all variables whose names end with the letter t (as a bit of a silly example.) Note here I'm indexing into the patients table with parentheses so the result is a table (with variable names.)
VN = patients.Properties.VariableNames;
T3 = patients(1:8, endsWith(VN, 't'))
T3 = 8x2 table
    Height    Weight
    ______    ______
      71       176
      69       163
      64       131
      67       133
      64       119
      68       142
      64       142
      68       180
You could use startsWith, endsWith, contains, or other string processing functions to select specific variables from your table.
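Applied to the dummy-variable case from this thread, that selection might look like the sketch below. The table contents are made up, and the `dummy_for_` prefix is assumed to mark every dummy column.

```matlab
% Hypothetical table; names follow the thread's "dummy_for_" pattern.
T = table([5; 7; 9; 4], [20; 25; 18; 30], [1; 0; 0; 1], [0; 1; 0; 0], [3; 6; 8; 2], ...
    'VariableNames', {'population', 'temperature', ...
                      'dummy_for_Afghanistan', 'dummy_for_Albania', 'sales'});

VN = T.Properties.VariableNames;
isDummy = startsWith(VN, 'dummy_for_');   % logical mask over the table variables
D = T{:, isDummy};                        % all dummy columns as one numeric matrix

% Reuse D in each specification instead of listing the dummies one by one.
X1 = [T.population,  D];
X2 = [T.temperature, D];
```

`X1` and `X2` can then be passed to `fitlm` together with `T.sales`, in the matrix form used earlier in this thread.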
alpedhuez on 17 Jan 2021
I would simply like to define a new variable that is the set of dummy variables in the above example. Is that possible?
