How can I make smaller matrices (size unknown) from a large matrix?
I have a matrix with 51 columns and 46999 rows. The 8th column holds values from 1 to 36. I want to create one matrix for each value in that column (36 in total), each made by stacking the rows that share that value.
So far I know I can use a for loop to set up a matrix 36 times. I then tried a while loop to "stack" the rows that share the same value in the 8th column, but I can't get that while loop right. Any suggestions?
Accepted Answer
James Tursa
on 3 Oct 2017
M = yourMatrix;  % placeholder: replace with your own 46999-by-51 matrix
result = cellfun(@(x)M(M(:,8)==x,:),num2cell(1:36),'uni',false);
8 Comments
result = arrayfun(@(x)M(M(:,8)==x,:),1:36,'uni',false);
John Hunt
on 4 Oct 2017
That worked awesome! Thank you. Could you explain it a bit so I can better understand what's going on, please?
James Tursa
on 4 Oct 2017
Edited: James Tursa
on 4 Oct 2017
They both essentially do the same thing. For each of the numbers in the 1:36 range, a sub-matrix of M is extracted where the 8th column of M matches this number. Results are stored in a cell array ... one for each of the numbers in the 1:36 range.
I.e., in this formula for the anonymous function:
@(x)M(M(:,8)==x,:)
The x values are taken, one by one, from the 1:36 range. So result{1} is the result of M(M(:,8)==1,:) ... i.e. the sub-matrix of M where the 8th column has the value of 1. Same for the other values in the 1:36 range.
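To make this concrete, here is a minimal sketch on a small made-up matrix. The 6-by-3 matrix T, the key column 2, and the range 1:3 are illustrative assumptions standing in for M, column 8, and 1:36.

```matlab
% Toy stand-in for M: column 2 plays the role of column 8 (values 1 to 3).
T = [10 1 100
     20 2 200
     30 1 300
     40 3 400
     50 2 500
     60 1 600];

% Same pattern as in the answer, with column 2 and the range 1:3.
groups = arrayfun(@(x) T(T(:,2)==x,:), 1:3, 'UniformOutput', false);

% groups{1} holds the rows of T whose 2nd column equals 1.
disp(groups{1})

% Number of rows collected in each group.
rowCounts = cellfun(@(g) size(g,1), groups);
disp(rowCounts)
```

Each cell can hold a different number of rows, which is why the results go into a cell array rather than a regular numeric array.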
I had written this elsewhere.
-----------------------------------------
Don't forget to [Accept] his answer if it helped.
I'll explain the ARRAYFUN version, because I think that it is what James had in mind actually.
result = arrayfun( ref_to_function, array, options )
applies the function given by its handle (first argument) to each element of the array. By default, ARRAYFUN assumes that the function returns a scalar, and it collects all these scalars into an output array.
In practice, we often pass a vector of indices as the array, so that the function is applied to selected parts of a data array. It is not the best way to do it (mean(data, 1) does the job directly), but if you wanted to compute the mean per column of some data array
>> data = randi( 10, 4, 5 )
data =
3 8 6 6 9
7 1 7 9 5
1 4 2 2 10
6 5 8 1 9
you could do it as follows:
>> result = arrayfun( @(colId) mean(data(:,colId)), 1:size(data, 2) )
result =
4.2500 4.5000 5.7500 4.5000 8.2500
Here the array on which ARRAYFUN operates (iterates) is
1:size(data, 2)
which is a vector of all column IDs, and the function is
@(colId) mean(data(:,colId))
which is an anonymous function of one parameter (named colId), which internally calls MEAN on the relevant column of the data array.
This is equivalent to writing the following loop, but it is more concise:
nCols = size( data, 2 ) ;
result = zeros( 1, nCols ) ;
for colId = 1 : nCols
    result(colId) = mean(data(:,colId)) ;
end
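As a quick check of the remark that ARRAYFUN is not the best tool for this particular job, the sketch below compares the ARRAYFUN call with the built-in column-wise mean. The data values are copied from the example above; the variable names viaArrayfun and viaMean are only illustrative.

```matlab
% Data array copied from the example in this comment.
data = [3 8 6 6  9
        7 1 7 9  5
        1 4 2 2 10
        6 5 8 1  9];

% Column means via ARRAYFUN over the column indices.
viaArrayfun = arrayfun(@(colId) mean(data(:,colId)), 1:size(data, 2));

% Column means via MEAN along dimension 1 (down the rows).
viaMean = mean(data, 1);

disp(viaArrayfun)
disp(max(abs(viaArrayfun - viaMean)))  % largest difference between the two
```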
Now you want to group rows of your data array M and not compute the mean. You probably understand that
@(valCol8) M(M(:,8)==valCol8,:)
selects all relevant rows when given an integer in 1 to 36. To test, you can name this function:
>> selector = @(valCol8) M(M(:,8)==valCol8,:)
and check passing a value:
>> selector(36)
This should have selected all rows with column 8 equal to 36. So you understand most of the call to ARRAYFUN (and why we pass the array 1:36). The selected rows are not "a scalar output", though, so we must tell ARRAYFUN that it is allowed to return a cell array of all the non-scalar outputs, which is what is done with the options
'UniformOutput', false
where the function understands shorter versions of the parameter name (e.g. 'uni').
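For comparison, the same grouping can be written as an explicit loop that fills a preallocated cell array. This is only a sketch: the small 5-by-8 matrix and nVals = 3 are made-up stand-ins for the real M and the 1:36 range.

```matlab
% Hypothetical small M: 5 rows, 8 columns, group labels 1..3 in column 8.
M = zeros(5, 8);
M(:,1) = (1:5)';           % row identifiers, just for inspection
M(:,8) = [2; 1; 2; 3; 1];  % group labels
nVals  = 3;                % would be 36 in the original question

% Explicit loop: one cell per label value.
resultLoop = cell(1, nVals);
for v = 1:nVals
    resultLoop{v} = M(M(:,8)==v, :);
end

% ARRAYFUN version, with the full name of the 'UniformOutput' option.
resultFun = arrayfun(@(v) M(M(:,8)==v,:), 1:nVals, 'UniformOutput', false);

disp(isequal(resultLoop, resultFun))  % compares the two cell arrays
```

The loop makes it visible that ARRAYFUN with 'UniformOutput', false simply stores whatever the function returns for each input into the matching cell.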
James Tursa
on 4 Oct 2017
Edited: James Tursa
on 4 Oct 2017
@John: And finally, Cedric's use of arrayfun() is cleaner in that it avoids the num2cell(1:36) that I have with my version that uses cellfun().
(Cedric was just being nice in letting me get the points :)
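The difference between the two versions can be sketched as follows, assuming a small made-up M with labels 1..3 in column 8 (the variable names are illustrative). CELLFUN iterates over the cells of a cell array, hence the num2cell conversion, while ARRAYFUN iterates over the elements of the numeric vector directly.

```matlab
% Hypothetical small M with group labels 1..3 in column 8.
M = zeros(4, 8);
M(:,8) = [3; 1; 3; 2];

vals = 1:3;
c = num2cell(vals);  % 1-by-3 cell array {1, 2, 3}, the input CELLFUN expects

% Same anonymous function, applied over cells and over numeric elements.
viaCellfun  = cellfun( @(x) M(M(:,8)==x,:), c,    'UniformOutput', false);
viaArrayfun = arrayfun(@(x) M(M(:,8)==x,:), vals, 'UniformOutput', false);

disp(isequal(viaCellfun, viaArrayfun))  % compares the two cell arrays
```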
I was just being fair, because if we are both roughly in the same time zone, it is getting late (!) and the probability that we add extra calls to num2cell through an inadvertent use of cellfun is approaching one ;)
Jan
on 4 Oct 2017
+1 for both of you.
@Cedric: Now you know the reason why I "boost" sometimes. It is for the cases where the standard voting system is too rough. :-)
Cedric
on 4 Oct 2017
:-)
It has already been proposed, but I really think that allowing votes on comments would help. It would encourage people to add much more value to others' answers by taking the time to write well-developed comments.