Thread Subject:
Label-based addressing

Subject: Label-based addressing

From: kj

Date: 8 Apr, 2013 23:05:37

Message: 1 of 5




Let me start with a hypothetical (but representative) use-case.
Suppose I have data stratified over 5 brands of beer, 8 yearly
quarters, 4 age groups, and 50 states. Furthermore, let's say that
for each possible combination of beer brand, yearly quarter, age
group, and state, I have computed a heterogeneous k-vector of
statistics (e.g. total consumption, consumption per capita, etc.).
Therefore, all told, I'm talking about a four-dimensional
(5 x 8 x 4 x 50) array of k-vectors.

I would like to store all this data in an object that would allow
me to address subsets of it using *labels* instead of numeric
indices. By "labels" I mean the names of the "dimensions" (in this
case "brand", "quarter", "age", "state"), the names of the possible
values for each dimension (e.g. for state, I'd have "AK", "AL",
"AR", ..., "WI", "WY"), and the components of the vector of data
associated with each combination of factors (in this case these
would include "total consumption", "consumption per capita", etc.).

For example, supposing that the variable D held such a data structure,
label-based addressing would allow me to use something like

    E = D.extract(state="AK", age="30-45")

to store in E the subset of the data in D corresponding to the
specified values. (The value in E, by the way, would be a similar
data structure, but it would have a different shape, since two of
its dimensions now have the minimum depth of 1.)

One can imagine many elaborations of this theme, but I think the
above gives a flavor of what I have in mind.

It is my understanding that MATLAB does not have this functionality.
(But please correct me if I'm wrong!) Therefore I'd have to
implement it myself.

My naive idea is to define an object (class, that is) that internally
stores one or more n-dimensional "boxes" for the data (as standard
MATLAB arrays), as well as hash tables to map between labels and
numeric indices. Methods like "extract" above would convert its
arguments to numeric addresses, and would use these numeric addresses
to fetch the data from the internal data boxes.
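
The idea above can be sketched in a few lines without writing a full class. This is only an illustration under stated assumptions: the labels, the 2 x 3 x 2 size, and the values are made up, and a single statistic per cell stands in for the k-vector (which could be a trailing dimension). It uses one containers.Map per dimension to turn labels into numeric subscripts.

```matlab
% Hypothetical labels; a small 2 x 3 x 2 box stands in for the 5 x 8 x 4 x 50 one.
dimNames = {'brand', 'age', 'state'};
labels   = {{'Ale', 'Lager'}, {'18-29', '30-45', '46+'}, {'AK', 'AL'}};

% One containers.Map per dimension: label -> numeric index.
lookup = cell(size(labels));
for d = 1:numel(labels)
    lookup{d} = containers.Map(labels{d}, 1:numel(labels{d}));
end

% The data "box": one made-up statistic per combination of labels.
box = reshape(1:12, [2 3 2]);

% Translate name/label pairs into subscripts; unspecified dimensions keep ':'.
query = {'state', 'AK', 'age', '30-45'};
subs  = repmat({':'}, 1, numel(dimNames));
for q = 1:2:numel(query)
    d = find(strcmp(dimNames, query{q}));
    m = lookup{d};
    subs{d} = m(query{q + 1});
end

E = box(subs{:});   % 2 x 1 array: all brands for state AK, age 30-45
```

A class would wrap the same steps in an "extract" method and return a new object carrying the reduced label sets along with the data.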

I am fairly new to MATLAB, so I'd appreciate your comments on the
above, and any words of wisdom you may give me on how best to do
this.

In particular, are there any existing packages I could use as
*models* for this sort of project?

Thanks in advance!

kj

Subject: Label-based addressing

From: dpb

Date: 8 Apr, 2013 23:48:42

Message: 2 of 5


The quickest solution is probably the most expensive...the Database
Toolbox...

<http://www.mathworks.com/products/database/>

You can make named fields in structures but they don't have the
flexibility you're looking for out of the box. In theory certainly one
could build the functionality; how much you want to write code as
opposed to enter and use/analyze data you'll have to judge.
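
As a rough illustration of named fields, the sketch below uses nested structs with dynamic field names. The field names and values are made up. One limit, as far as I can tell, is that field names must be valid MATLAB identifiers, so a label such as "30-45" cannot be used directly.

```matlab
% Nested struct fields addressed by label via dynamic field names.
S = struct();
S.AK.Lager.total = 120;      % made-up values
S.AK.Ale.total   = 80;
S.WY.Lager.total = 45;

state = 'AK';
brand = 'Ale';
x = S.(state).(brand).total; % look up by strings held in variables

% Labels must be valid identifiers; '30-45' is not one.
ok = isvarname('30-45');     % false
```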

I've not searched File Exchange--probably worth a look there as well.

--

Subject: Label-based addressing

From: Loren Shure

Date: 9 Apr, 2013 08:30:09

Message: 3 of 5

If you have access to the Statistics Toolbox, you might check out the
dataset array. You can "index" with labels if you set things up
appropriately. I am not sure your data fit that scenario though.
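
A minimal sketch of one way this might look, assuming the Statistics Toolbox is installed. The data are made up and stored in "long" form (one row per combination of factors) rather than as an N-dimensional array. It selects rows by comparing label columns, which is only one of the ways a dataset array can be addressed.

```matlab
% Requires the Statistics Toolbox; all values are made up.
state = {'AK'; 'AK'; 'AL'; 'AL'};
age   = {'18-29'; '30-45'; '18-29'; '30-45'};
total = [10; 20; 30; 40];

ds = dataset(state, age, total);   % variable names come from the workspace names

% Logical row selection by label.
rows = strcmp(ds.state, 'AK') & strcmp(ds.age, '30-45');
E = ds(rows, :);                   % the single matching row, all variables
t = E.total;
```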

--Loren

http://blogs.mathworks.com/loren

Subject: Label-based addressing

From: Yair Altman

Date: 9 Apr, 2013 13:16:05

Message: 4 of 5



Try using a hashtable data structure (or possibly a hashtable that contains hashtables as its data).

Two options to consider: java.util.Hashtable, containers.Map
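
For illustration, here is a small sketch of the containers.Map option. The composite-key convention (labels joined with '|'), the helper makeKey, and the values are assumptions made for this example, not an established recipe.

```matlab
% A containers.Map keyed by a composite label; values are structs of statistics.
D = containers.Map('KeyType', 'char', 'ValueType', 'any');

makeKey = @(state, age) [state '|' age];   % hypothetical key convention

D(makeKey('AK', '18-29')) = struct('total', 10, 'perCapita', 0.5);
D(makeKey('AK', '30-45')) = struct('total', 20, 'perCapita', 0.8);
D(makeKey('WY', '30-45')) = struct('total', 40, 'perCapita', 1.1);

rec = D(makeKey('AK', '30-45'));
t = rec.total;

% Selecting a whole slice means scanning the keys.
k = keys(D);
akKeys = k(strncmp(k, 'AK|', 3));    % keys for every age group in AK
```

Single lookups are convenient this way, but slicing along one dimension needs a scan over all keys, which is where an array plus per-dimension index maps may be the better fit.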

Yair Altman
http://UndocumentedMatlab.com
 

Subject: Label-based addressing

From: Scott Koch

Date: 9 Apr, 2013 19:04:06

Message: 5 of 5


Hi KJ -

Not sure it does exactly what you want, but you might take a look at the DataSet Object I put on the File Exchange a few months ago:

http://www.mathworks.com/matlabcentral/fileexchange/39336-dataset-object

It allows indexing into the data via label names. For example:

    myds = dataset(rand(5,5,3)); % Create a dataset.
    % Add different formatted labels.
    myds.label{1,1} = str2cell(sprintf('Row_%d\n',[1:5]')); % Add labels to mode 1.
    myds.label{2,1} = str2cell(sprintf('Column %d\n',[1:5]')); % Add labels to mode 2.
    myds.label{3,1} = str2cell(sprintf('Slab%d\n',[1:3]')); % Add labels to mode 3.

    myslab = myds.Slab1; % Get first "slab" of the cube (5x5x1).
    mycol = myds.Slab1.('Column 1'); % Get first column from cube (5x1x1).


Scott
