Relatively complex matrix indexing question

Question

Sam on 9 Jun 2014

0
Link

Direct link to this question

https://www.mathworks.com/matlabcentral/answers/133453-relatively-complex-matrix-indexing-question

Commented: Sam on 11 Jun 2014

Hi,

I have a reasonably complex matrix indexing question that I'm struggling to solve. I have an nxn matrix of float values. The matrix is fairly large (n~30000). The matrix is upper (or lower) triangular with the other values being NaN. Each of the rows represented has a vector associated with it, so the matrix rows has a secondary matrix associated with it that is nxk, where each row has k characteristics. Row 1 has the same characteristics as column 1, row 2 has the same characteristics as column2, etc.

I want to plot a histogram of all of the elements in the nxn matrix whose rows/columns match certain characteristics.

To illustrate, let's say I had the following 4x4 value matrix

Values = [...
NaN  4.0  3.1   2.3
NaN  NaN  1.2   5.2
NaN  NaN  NaN   7.1
NaN  NaN  NaN   NaN]

Each row/column has three characteristics and this characteristics matrix has the values

Charac = [...
    4 2 3
    1 4 5
   -3 5 1
   10 6 2]

I only want to get histogram elements for rows/columns that have the second characteristic greater than 1 and the third characteristic less than 4. In this case only the first, third and fourth rows would meet the criterion and the values that I would take from the matrix Values are 3.1, 2.3 and 7.1.

Is there a compact and computationally efficient way to logically index this?

Thanks in advance!

8 Comments
Show 6 older commentsHide 6 older comments

Sam on 10 Jun 2014

So the columns and rows are selected by the same criteria, so if the valid indices for the rows are 1,3 and 4, the only valid (row,column) combinations are (1,3), (1,4) and (3,4). In other words, column 2 and row 2 get selected out. I will look into using sparse matrices. My understanding was that since the memory had to store the value and the indices of each element that you only save memory if the matrix has much more than 50% zeros. I would have to have a uint16 values to store the indices of the matrix (so row and column would mean an extra 4 bytes), whereas each element is only 1 byte. So I think the fraction of zeros would have to be something greater than 80% to save any memory.

Cedric on 10 Jun 2014

See my answer below.

Sign in to comment.

Sign in to answer this question.

Answer 1

Cedric on 10 Jun 2014

1
Link

Direct link to this answer

https://www.mathworks.com/matlabcentral/answers/133453-relatively-complex-matrix-indexing-question#answer_140248

Edited: Cedric on 10 Jun 2014

Open in MATLAB Online

You are correct about sparse matrices, and I had completely misunderstood your setup! So the lower triangular part is NaN, but the upper part has a density of 1(?) If so, there is no point in using sparse matrices. Reading your last explanation, I think that what you are looking for is along the line of what follows.

EDIT - I provide a second approach (B), which could work if you had only non-NaN elements in a vector. It is just a quick test though and it would have to be tested/profiled.

 values = [NaN  4.0  3.1   2.3; ...
           NaN  NaN  1.2   5.2; ...
           NaN  NaN  NaN   7.1; ...
           NaN  NaN  NaN   NaN] ;
 charac = [4 2 3; ...
           1 4 5; ...
          -3 5 1; ...
          10 6 2] ;
 %%%A. Values is square num. array with NaNs.
 % - Find valid row IDs.     
 isValid = charac(:,2) > 1 & charac(:,3) < 4 ;
 validId = find( isValid ) ;
 % - Build all possible row/col pairs.
 validRC = nchoosek( validId, 2 ) ;
 isValid = validRC(:,2) > validRC(:,1) ;
 validRC = validRC(isValid,:) ;
 % - Extract relevant values.
 validId = sub2ind( size(values), validRC(:,1), validRC(:,2)) ;
 extract = values(validId) ;
 %%%B. Values is lin num array with non-NaNs only.
 % - Build what could be values if you were not storing NaNs. Makes little 
 %  sense to build it this way (by creating the large, square array and 
 %  reducing it), but maybe you can build this vector directly instead 
 %  of the array.
 valuesLin = values(~isnan(values)) ;
 % - Build sub2linInd function.
 cSum = [0, cumsum( 1 : (size(values,2)-1) )] ;
 sub2linInd = @(row, col) row + cSum(col-1).' ;
 % - Extract relevant values.
 validId = sub2linInd( validRC(:,1), validRC(:,2) ) ;
 extractLin = valuesLin(validId) ;

7 Comments
Show 5 older commentsHide 5 older comments

Cedric on 11 Jun 2014

Edited: Cedric on 11 Jun 2014

Yes, you are absolutely right, I thought that it was "with replacement" and it is not (sorry, I should have checked!).

How are you building the first large matrix (?), would it be difficult to build directly a vector of non-NaN values ( double or even uintx) and avoid building the large array?

If you must build an array, could you use single elements instead of double? The single type/class supports NaN.

If you need e.g. uint8 because you are short on RAM, do you really need the 256 possible values; couldn't you, as mentioned above, use e.g. 255 to code for NaN?

All solutions have advantages and bottlenecks: memory for the double/NaN approach, practicality and speed for the nchoosek/non-NaN approach (nchoosek won't work if you have too many rows matching conditions, even with a k as small as 2), etc. I guess that at this stage, we could proceed by elimination: how much RAM do you have available, and how many rows matching conditions could you have?

I'd say that your short term best bet (reasonable memory usage, precision, computation time, and development time) is likely to be Image Analyst's approach, with your modification, applied to a class single Values array.

Sam on 11 Jun 2014

I think I'll use the uint8 because of RAM (I might need to go to 30000x30000). I have 16 GB of RAM. With the check of making sure the columns are greater than the rows, I think I don't need the NaN on the lower triangular part (since I will ignore any values there anyway). That way I can also use all 0 to 255 values.

Thanks. I think you and Image Analyst have solved my question!

Sign in to comment.

Answer 2

Image Analyst on 9 Jun 2014

1
Link

Direct link to this answer

https://www.mathworks.com/matlabcentral/answers/133453-relatively-complex-matrix-indexing-question#answer_140155

Open in MATLAB Online

Try this and see if it's what you want:

Values = [...
NaN  4.0  3.1   2.3
NaN  NaN  1.2   5.2
NaN  NaN  NaN   7.1
NaN  NaN  NaN   NaN]
Charac = [...
    4 2 3
    1 4 5
   -3 5 1
   10 6 2]
% Find rows where column 2 > 1 and column 3 < 4
rowsToExtract = Charac(:, 2) > 1 & Charac(:, 3) < 4
% Extract only those rows and no others.
extractedRows = Values(rowsToExtract,:)
% Get the values from that which are not NaN.
extractedValues = extractedRows(~isnan(extractedRows(:)))

3 Comments
Show 1 older commentHide 1 older comment

Image Analyst on 10 Jun 2014

I don't see why my way would not work. Your suggestion will end up eliminating some columns also, and you didn't say anything about eliminating columns.

Sam on 10 Jun 2014

So the rows and columns are "equivalent" with the same characteristics. That is why in the example given, only 3.1, 2.3 and 7.1 would survive because the columns are subject to the same constraints as the rows.

So I think your solution works but I had just not explained the problem clearly enough. I will test tomorrow to make sure everything works.

Thanks!

Sign in to comment.

Relatively complex matrix indexing question

8 Comments
Show 6 older commentsHide 6 older comments

Accepted Answer

7 Comments
Show 5 older commentsHide 5 older comments

More Answers (1)

3 Comments
Show 1 older commentHide 1 older comment

See Also

Categories

Tags

Community Treasure Hunt

Relatively complex matrix indexing question

8 Comments Show 6 older commentsHide 6 older comments

Accepted Answer

7 Comments Show 5 older commentsHide 5 older comments

More Answers (1)

3 Comments Show 1 older commentHide 1 older comment

See Also

Categories

Tags

Community Treasure Hunt

8 Comments
Show 6 older commentsHide 6 older comments

7 Comments
Show 5 older commentsHide 5 older comments

3 Comments
Show 1 older commentHide 1 older comment