Thread Subject:
How to create dummy variables from categorical data?

Subject: How to create dummy variables from categorical data?

From: Economist

Date: 21 Feb, 2011 15:39:25

Message: 1 of 7

I have a fairly large dataset with a variable that represents host country, e.g. “Finland”. How do I create dummy variables from this country variable, i.e. one dummy for each country? I plan to use the dummies in survival analysis.

From: someone

Date: 21 Feb, 2011 15:48:04

Message: 2 of 7

"Economist" wrote in message <iju0vd$6fj$1@fred.mathworks.com>...
> I have a fairly large dataset with a variable that represents host country, e.g. “Finland”. How do I create dummy variables from this country variable, i.e. one dummy for each country? I plan to use the dummies in survival analysis.

% perhaps use a structure?

doc struct

From: ImageAnalyst

Date: 21 Feb, 2011 15:54:03

Message: 3 of 7

On Feb 21, 10:39 am, "Economist "
<starfaic...@dunflimblag.mailexpire.com> wrote:
> I have a fairly large dataset with a variable that represents host country, e.g. “Finland”. How do I create dummy variables from this country variable, i.e. one dummy for each country? I plan to use the dummies in survival analysis.

-------------------------------------------------------------------------
Not sure I understand. What data type is this variable? Is it a
structure, a string, a numerical array, a class?

Maybe the variable is a logical array of 220 elements (or however many
countries there are), and an element is true if you have data for that
country. So maybe logicalCountryList(23) = true if Finland is
country #23.

Maybe it's a structure like
countryInfo.CountryName = 'Finland';
Then you can just add other members dynamically. Let's say you
want population and GDP. Just assign them:
countryInfo.Population = 5300000;
countryInfo.GDP = 183e9;

Generally it's possible to just create variables in MATLAB
immediately, when you want to use them - no declaration or
preallocation is necessary. I guess we'd need a more explicit
description of your situation.

From: Economist

Date: 21 Feb, 2011 16:08:04

Message: 4 of 7

The data consist of strings, i.e. the values for the first 10 observations are:

Albania
Albania
Albania
Argentina
Argentina
Argentina
Argentina
Argentina
Argentina
Argentina

In total, the observations come from 88 different countries. Thus, I would like to have 88 - 1 = 87 dummy variables.

From: ImageAnalyst

Date: 21 Feb, 2011 16:13:01

Message: 5 of 7

On Feb 21, 11:08 am, "Economist "
<starfaic...@dunflimblag.mailexpire.com> wrote:
> The data consists of strings, i.e. the values for the 10 first observations are:
>
> Albania
> Albania
> Albania
> Argentina
> Argentina
> Argentina
> Argentina
> Argentina
> Argentina
> Argentina
>
> In total there are 88 different countries, from which there are observations. Thus, I would like to have 88-1=87 dummy variables.

-----------------------------------------------------------
% Preallocate the dummy variable, one element per country
myDummyVariable = zeros(1, 88);
% Start using it, let's say Finland is the 23rd country.
myDummyVariable(23) = 42; % Just assign something to the element belonging to Finland.

From: Richard Startz

Date: 21 Feb, 2011 21:22:31

Message: 6 of 7

On Mon, 21 Feb 2011 16:08:04 +0000 (UTC), "Economist "
<starfaichoo@dunflimblag.mailexpire.com> wrote:

>The data consists of strings, i.e. the values for the 10 first observations are:
>
>Albania
>Albania
>Albania
>Argentina
>Argentina
>Argentina
>Argentina
>Argentina
>Argentina
>Argentina
>
>In total there are 88 different countries, from which there are observations. Thus, I would like to have 88-1=87 dummy variables.

You need to assign a number to each country. This is one approach,
although not terribly elegant.
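% countryName is assumed to be a cell array of strings, one per observation
dummyMatrix = zeros(length(countryName),88);
for i=1:length(countryName)
  switch countryName{i}   % braces pull the string out of the cell
    case 'Albania'
      countryNumber = 1;
    case 'Argentina'
      countryNumber = 2;
    % ... one case for each remaining country
  end
  dummyMatrix(i,countryNumber) = 1;   % mark this observation's country
end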
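
The switch above needs one case per country. Here is a minimal sketch of a way around that, assuming countryName is a cell array of strings. The sample values and the names dummyFull and dummyReduced are made up for illustration. The third output of unique gives, for every observation, the position of its country in the sorted list of distinct names, and that position can be used directly as the column index.

```matlab
% Hypothetical sample data; the real variable would hold 88 distinct countries.
countryName = {'Albania'; 'Albania'; 'Albania'; 'Argentina'; 'Argentina'};

% countries: sorted distinct names
% countryNumber(i): position of countryName{i} within countries
[countries, ~, countryNumber] = unique(countryName);

nObs = numel(countryName);
nCountries = numel(countries);

% One row per observation, one column per country.
dummyFull = zeros(nObs, nCountries);

% Set exactly one entry per row using linear indices.
rows = (1:nObs)';
dummyFull(sub2ind(size(dummyFull), rows, countryNumber(:))) = 1;

% Drop the first column so that country acts as the reference category,
% leaving nCountries - 1 dummies (87 when there are 88 countries).
dummyReduced = dummyFull(:, 2:end);
```

Which column to drop is a modelling choice; the first one is used here only because it is the simplest to write.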

From: Peter Perkins

Date: 22 Feb, 2011 03:05:24

Message: 7 of 7

On 2/21/2011 10:39 AM, Economist wrote:
> I have a fairly large dataset with a variable that represents host
> country, e.g. “Finland”. How do I create dummy variables
> from this country variable, i.e. one dummy for each country? I plan to
> use the dummies in survival analysis.

Have you looked at the DUMMYVAR function in the Statistics Toolbox?
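
A minimal sketch of how that might look, assuming the country names are in a cell array of strings called countryName (a made-up name with made-up sample values) and that the Statistics Toolbox is installed. DUMMYVAR expects positive integer group numbers, so GRP2IDX, also from the Statistics Toolbox, is used first to turn the strings into indices.

```matlab
% Hypothetical sample data standing in for the real country variable.
countryName = {'Albania'; 'Albania'; 'Argentina'; 'Finland'};

% idx(i) is the group number of observation i;
% names lists the countries in the order the group numbers were assigned.
[idx, names] = grp2idx(countryName);

% One column per group, with a 1 marking each observation's country.
D = dummyvar(idx);

% Drop one column if a reference category is wanted (88 - 1 = 87 dummies
% in the original data set).
Dreduced = D(:, 2:end);
```

The column order of D follows names, so names{k} tells you which country column k stands for.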

Hope this helps.
