Representing Expression Data Values in DataMatrix Objects

Overview of DataMatrix Objects

The toolbox includes functions, objects, and methods for creating, storing, and accessing microarray data.

The object constructor function, DataMatrix, lets you create a DataMatrix object to encapsulate data and metadata (row and column names) from a microarray experiment. A DataMatrix object stores experimental data in a matrix, with rows typically corresponding to gene names or probe identifiers, and columns typically corresponding to sample identifiers. A DataMatrix object also stores metadata, including the gene names or probe identifiers (as the row names) and sample identifiers (as the column names).

You can reference microarray expression values in a DataMatrix object the same way you reference data in a MATLAB® array, that is, by using linear or logical indexing. Alternatively, you can reference the data by gene (probe) identifiers and sample identifiers. Indexing by these identifiers lets you access subsets of the data quickly without maintaining separate index arrays.

Many MATLAB operators and arithmetic functions are available to DataMatrix objects by means of methods. These methods let you modify, combine, compare, analyze, plot, and access information from DataMatrix objects. You can also extend this functionality with the general element-wise functions dmarrayfun and dmbsxfun, or by accessing the properties of a DataMatrix object directly.
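As a minimal sketch of this idea, the following example applies an overloaded operator and the two general element-wise functions to a small DataMatrix object. The literal values are the rounded numbers from the example output later in this section, and the variable names (vals, shifted, absDmo, colMeans, centered) are illustrative. The example assumes that dmbsxfun accepts a plain numeric array as one of its inputs, in the same way bsxfun does.

```matlab
import bioma.data.*   % make the DataMatrix constructor available

% Small example matrix (values rounded to three decimals)
vals = [-0.131 1.699 -0.026  0.365
         0.305 0.146 -0.129 -0.444];
dmo  = DataMatrix(vals, {'SS DNA'; 'YAL003W'}, [0 9.5 11.5 13.5]);

% Overloaded operator: add a constant to every element
shifted = dmo + 1;

% dmarrayfun: apply a function handle to each element
absDmo = dmarrayfun(@abs, dmo);

% dmbsxfun: binary operation with singleton expansion,
% here subtracting each column's mean from that column
colMeans = mean(double(dmo), 1);             % plain 1-by-4 numeric row vector
centered = dmbsxfun(@minus, dmo, colMeans);
```

Converting with double before calling mean keeps colMeans an ordinary numeric vector, so the sketch does not depend on what class the DataMatrix mean method returns.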

    Note:   For tables describing the properties and methods of a DataMatrix object, see the DataMatrix object reference page.

Constructing DataMatrix Objects

  1. Load the MAT-file, provided with the Bioinformatics Toolbox™ software, that contains yeast data. This MAT-file includes three variables: yeastvalues, a 614-by-7 matrix of gene expression data; genes, a cell array of 614 GenBank® accession numbers for labeling the rows in yeastvalues; and times, a 1-by-7 vector of time values for labeling the columns in yeastvalues.

    load filteredyeastdata
    
  2. Create variables to contain a subset of the data, specifically the first five rows and first four columns of the yeastvalues matrix, the genes cell array, and the times vector.

    yeastvalues = yeastvalues(1:5,1:4);
    genes = genes(1:5,:);
    times = times(1:4);

  3. Import the microarray object package so that the DataMatrix constructor function is available.

    import bioma.data.*
    
  4. Use the DataMatrix constructor function to create a small DataMatrix object from the gene expression data in the variables you created in step 2.

    dmo = DataMatrix(yeastvalues,genes,times)
    
    dmo = 
    
                      0       9.5     11.5      13.5  
        SS DNA     -0.131    1.699    -0.026     0.365
        YAL003W     0.305    0.146    -0.129    -0.444
        YAL012W     0.157    0.175     0.467    -0.379
        YAL026C     0.246    0.796     0.384     0.981
        YAL034C    -0.235    0.487    -0.184    -0.669
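
If the MAT-file is not at hand, a similar five-by-four object can be built from literal values. The following sketch uses the numbers as displayed above, which are rounded to three decimals and so may differ slightly from the stored data. The variable names vals, genes, and timePoints are illustrative. The sketch then inspects the result with the size, rownames, colnames, and double methods.

```matlab
import bioma.data.*   % make the DataMatrix constructor available

% Expression values copied from the display above (rounded)
vals = [-0.131 1.699 -0.026  0.365
         0.305 0.146 -0.129 -0.444
         0.157 0.175  0.467 -0.379
         0.246 0.796  0.384  0.981
        -0.235 0.487 -0.184 -0.669];
genes      = {'SS DNA'; 'YAL003W'; 'YAL012W'; 'YAL026C'; 'YAL034C'};
timePoints = [0 9.5 11.5 13.5];

dmo = DataMatrix(vals, genes, timePoints);

sz       = size(dmo);       % number of rows and columns
rowLabel = rownames(dmo);   % cell array of row names
colLabel = colnames(dmo);   % cell array of column names
raw      = double(dmo);     % plain numeric matrix without the labels
```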

Getting and Setting Properties of a DataMatrix Object

Use the get and set methods to retrieve and modify the properties of a DataMatrix object.

  1. Use the get method to display the properties of the DataMatrix object, dmo.

    get(dmo)
                Name: ''
            RowNames: {5x1 cell}
            ColNames: {'   0'  ' 9.5'  '11.5'  '13.5'}
               NRows: 5
               NCols: 4
               NDims: 2
        ElementClass: 'double'

  2. Use the set method to specify a name for the DataMatrix object, dmo.

    dmo = set(dmo,'Name','MyDMObject');

  3. Use the get method again to display the properties of the DataMatrix object, dmo.

    get(dmo)
                Name: 'MyDMObject'
            RowNames: {5x1 cell}
            ColNames: {'   0'  ' 9.5'  '11.5'  '13.5'}
               NRows: 5
               NCols: 4
               NDims: 2
        ElementClass: 'double'

    Note:   For a description of all properties of a DataMatrix object, see the DataMatrix object reference page.
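
To read a single property instead of displaying all of them, pass the property name to get. The following is a small sketch that uses property names from the listing above. The object is rebuilt from two rows and two columns of the displayed (rounded) values so that the example stands alone, and the variable names objName, nRows, and elClass are illustrative.

```matlab
import bioma.data.*   % make the DataMatrix constructor available

% Two rows, and the 11.5 and 13.5 columns, from the example data (rounded)
dmo = DataMatrix([-0.026 0.365; -0.129 -0.444], ...
                 {'SS DNA'; 'YAL003W'}, [11.5 13.5]);

% set returns the modified object, so assign the result back
dmo = set(dmo, 'Name', 'MyDMObject');

objName = get(dmo, 'Name');           % name assigned above
nRows   = get(dmo, 'NRows');          % number of rows
elClass = get(dmo, 'ElementClass');   % class of the stored values
```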

Accessing Data in DataMatrix Objects

DataMatrix objects support the following types of indexing to extract, assign, and delete data:

  • Parenthesis ( ) indexing

  • Dot . indexing

Parenthesis ( ) Indexing

Use parenthesis indexing to extract a subset of the data in dmo and assign it to a new DataMatrix object dmo2:

dmo2 = dmo(1:5,2:3)
dmo2 = 
                9.5     11.5  
    SS DNA     1.699    -0.026
    YAL003W    0.146    -0.129
    YAL012W    0.175     0.467
    YAL026C    0.796     0.384
    YAL034C    0.487    -0.184

Use parenthesis indexing to extract a subset of the data using row names and column names, and assign it to a new DataMatrix object dmo3:

dmo3 = dmo({'SS DNA','YAL012W','YAL034C'},'11.5')

dmo3 = 

               11.5  
    SS DNA     -0.026
    YAL012W     0.467
    YAL034C    -0.184

    Note:   If you use a cell array of row names or column names to index into a DataMatrix object, the names in the cell array must be unique, even though the row names or column names within the DataMatrix object itself need not be unique.
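
The overview mentions logical indexing, which can also be used inside parentheses. As a hedged sketch, the following selects the rows whose value at time 11.5 is negative. It converts the column to a plain numeric array with double before comparing, so the mask is an ordinary logical vector. The names mask and negAt11 are illustrative, and the values are the rounded numbers displayed earlier.

```matlab
import bioma.data.*   % make the DataMatrix constructor available

vals = [-0.131 1.699 -0.026  0.365
         0.305 0.146 -0.129 -0.444
         0.157 0.175  0.467 -0.379
         0.246 0.796  0.384  0.981
        -0.235 0.487 -0.184 -0.669];
genes = {'SS DNA'; 'YAL003W'; 'YAL012W'; 'YAL026C'; 'YAL034C'};
dmo   = DataMatrix(vals, genes, [0 9.5 11.5 13.5]);

% Build a logical mask from the 11.5 column, then use it as the row index
mask    = double(dmo(:, '11.5')) < 0;   % 5-by-1 logical vector
negAt11 = dmo(mask, :);                 % DataMatrix with only the matching rows
```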

Use parenthesis indexing to assign new data to a subset of the elements in dmo2:

dmo2({'SS DNA', 'YAL003W'}, 1:2) = [1.700 -0.030; 0.150 -0.130]
dmo2 = 

                9.5     11.5  
    SS DNA       1.7     -0.03
    YAL003W     0.15     -0.13
    YAL012W    0.175     0.467
    YAL026C    0.796     0.384
    YAL034C    0.487    -0.184

Use parenthesis indexing to delete a subset of the data in dmo2:

dmo2({'SS DNA', 'YAL003W'}, :) = []
dmo2 = 

                9.5     11.5  
    YAL012W    0.175     0.467
    YAL026C    0.796     0.384
    YAL034C    0.487    -0.184

Dot . Indexing

    Note:   In the following examples, notice that when you use dot indexing with DataMatrix objects, you specify all rows or all columns with a colon in single quotation marks (':').

Use dot indexing to extract only the data in the 11.5 column of dmo:

timeValues = dmo.(':')('11.5')
timeValues =

   -0.0260
   -0.1290
    0.4670
    0.3840
   -0.1840
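
The result above is displayed as a plain numeric array rather than as a DataMatrix object. The following sketch contrasts the two forms of indexing on the same column. It assumes, consistent with the outputs shown in this section, that parenthesis indexing returns a DataMatrix object and dot indexing returns the underlying values. The names viaDot, viaParen, and sameValues are illustrative.

```matlab
import bioma.data.*   % make the DataMatrix constructor available

vals = [-0.131 1.699 -0.026  0.365
         0.305 0.146 -0.129 -0.444
         0.157 0.175  0.467 -0.379
         0.246 0.796  0.384  0.981
        -0.235 0.487 -0.184 -0.669];
genes = {'SS DNA'; 'YAL003W'; 'YAL012W'; 'YAL026C'; 'YAL034C'};
dmo   = DataMatrix(vals, genes, [0 9.5 11.5 13.5]);

viaDot   = dmo.(':')('11.5');   % numeric column vector
viaParen = dmo(:, '11.5');      % DataMatrix object that keeps its labels

% Both forms address the same elements
sameValues = isequal(viaDot, double(viaParen));
```

Dot indexing is convenient when the values feed directly into numeric code, while parenthesis indexing is the better fit when the row and column names need to travel with the data.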

Use dot indexing to assign new data to a subset of the elements in dmo:

dmo.(1:2)(':') = 7
dmo = 

                  0       9.5     11.5      13.5  
    SS DNA          7        7         7         7
    YAL003W         7        7         7         7
    YAL012W     0.157    0.175     0.467    -0.379
    YAL026C     0.246    0.796     0.384     0.981
    YAL034C    -0.235    0.487    -0.184    -0.669

Use dot indexing to delete an entire row (here, the gene YAL034C) from dmo:

dmo.YAL034C = []
dmo = 

                  0      9.5     11.5     13.5  
    SS DNA         7        7        7         7
    YAL003W        7        7        7         7
    YAL012W    0.157    0.175    0.467    -0.379
    YAL026C    0.246    0.796    0.384     0.981

Use dot indexing to delete two columns from dmo:

dmo.(':')(2:3) = []

dmo = 

                  0     13.5  
    SS DNA         7         7
    YAL003W        7         7
    YAL012W    0.157    -0.379
    YAL026C    0.246     0.981