Discover MakerZone

MATLAB and Simulink resources for Arduino, LEGO, and Raspberry Pi

Learn more

Discover what MATLAB® can do for your career.

Opportunities for recent engineering grads.

Apply Today

Thread Subject:
Matching data based on columns

Subject: Matching data based on columns

From: R L

Date: 15 Mar, 2009 02:55:02

Message: 1 of 11

Hi all,

I apologize in advance if this is a novice and trivial question but I would appreciate all the help I can get! I must have spent at least three days on trying to solve this problem and poured many hours on googling this solution but still to no avail.

I want to sort data in the following format:

A = [1 1 3 4 98 97 96; ...
       1 1 3 4 98 93 92; ...
       1 1 3 4 92 94 90; ...
       2 3 4 5 79 88 90; ...
       2 4 5 7 89 78 99; ...
       2 6 9 7 89 78 99];

I want to sort the data based on the conditions of the first four columns of the matrix A. Moreover, for the rows that match the criteria, is it possible to get an average? Perhaps it is easier to explain with an example. That is, the resulting matrix should look like the following:

B = [1 1 3 4 96 94.7 92.7; ...
       2 3 4 5 79 88 90; ...
       2 4 5 7 89 78 99; ...
       2 6 9 7 89 78 99];

where
96 = 1/3 * (98 + 98 + 92)
94.7 = 1/3 * (97 + 93 + 94)
92.7 = 1/3 * (96 + 92 + 90)
and whether the values in the last three columns match or not is irrelevant (i.e. as illustrated by the last two rows of matrix A).

Thank you so much!

PS. In effect, this is sort of like Excel's vlookup function (not quite, since I want three elements to match whereas Excel only allows for one and also that I want a calculation based on the matches).

Subject: Matching data based on columns

From: Sergioval

Date: 15 Mar, 2009 05:28:38

Message: 2 of 11

Hi, I am not sure to understand what you mean with "the conditions of
the first four columns".

In the case you want to emulate the function in Excel VLOOKUP
(1,a2:c10,2), you could do something like:
  A(2, find(A(1,:)<=3DA(1,1),1) )

So for matching the first three columns you just have to remove the
constraint in the function find to work out only one result.
  [c,d]=3Dfind(A(:,1:3) - A(ones(size(A,1),1),1:3) =3D=3D 0);

Pay attention to how I am substracting the matrices. As they have to
be the same size I am replicating the first row with function "ones".
If you prefer to search approximately instead of exactly, just
substitute the =3D=3D for a choosen tolerance or set the right logical
condition.

Then get only the useful (not repeated) rows which are:
  c=3Dc(1:find(d~=3Dd(1),1)-1);

Finally, calculate the mean:
  B=3Dzeros(size(A,1)-length(c)+1,size(A,2));
  B(1,:) =3D mean(A(c,:));

And add the rest of the matrix:
  A(c,:)=3D[];
  B(2:1+size(A,1),:)=3DA;


Summaryzing, if I understood your goal, this code should help:

[c,d]=3Dfind(A(:,1:3) - A(ones(size(A,1),1),1:3) =3D=3D 0);
c=3Dc(1:find(d~=3Dd(1),1)-1);
B=3Dzeros(size(A,1)-length(c)+1,size(A,2));
B(1,:) =3D mean(A(c,:));
A(c,:)=3D[];
B(2:1+size(A,1),:)=3DA;


Cheers!
Sergio



On 14 mar, 22:55, "R L" <rcw...@hotmail.com> wrote:
> Hi all,
>
> I apologize in advance if this is a novice and trivial question but I wou=
ld appreciate all the help I can get! I must have spent at least three days=
 on trying to solve this problem and poured many hours on googling this sol=
ution but still to no avail.
>
> I want to sort data in the following format:
>
> A =3D [1 1 3 4 98 97 96; ...
> =A0 =A0 =A0 =A01 1 3 4 98 93 92; ...
> =A0 =A0 =A0 =A01 1 3 4 92 94 90; ...
> =A0 =A0 =A0 =A02 3 4 5 79 88 90; ...
> =A0 =A0 =A0 =A02 4 5 7 89 78 99; ...
> =A0 =A0 =A0 =A02 6 9 7 89 78 99];
>
> I want to sort the data based on the conditions of the first four columns=
 of the matrix A. Moreover, for the rows that match the criteria, is it pos=
sible to get an average? Perhaps it is easier to explain with an example. T=
hat is, the resulting matrix should look like the following:
>
> B =3D [1 1 3 4 96 94.7 92.7; ...
> =A0 =A0 =A0 =A02 3 4 5 79 88 90; ...
> =A0 =A0 =A0 =A02 4 5 7 89 78 99; ...
> =A0 =A0 =A0 =A02 6 9 7 89 78 99];
>
> where
> 96 =3D 1/3 * (98 + 98 + 92)
> 94.7 =3D 1/3 * (97 + 93 + 94)
> 92.7 =3D 1/3 * (96 + 92 + 90) =A0
> and whether the values in the last three columns match or not is irreleva=
nt (i.e. as illustrated by the last two rows of matrix A). =A0
>
> Thank you so much!
>
> PS. In effect, this is sort of like Excel's vlookup function (not quite, =
since I want three elements to match whereas Excel only allows for one and =
also that I want a calculation based on the matches).

Subject: Matching data based on columns

From: Roger Stafford

Date: 15 Mar, 2009 08:43:01

Message: 3 of 11

"R L" <rcwlhk@hotmail.com> wrote in message <gphqm6$eh5$1@fred.mathworks.com>...
> Hi all,
>
> I apologize in advance if this is a novice and trivial question but I would appreciate all the help I can get! I must have spent at least three days on trying to solve this problem and poured many hours on googling this solution but still to no avail.
>
> I want to sort data in the following format:
>
> A = [1 1 3 4 98 97 96; ...
> 1 1 3 4 98 93 92; ...
> 1 1 3 4 92 94 90; ...
> 2 3 4 5 79 88 90; ...
> 2 4 5 7 89 78 99; ...
> 2 6 9 7 89 78 99];
>
> I want to sort the data based on the conditions of the first four columns of the matrix A. Moreover, for the rows that match the criteria, is it possible to get an average? Perhaps it is easier to explain with an example. That is, the resulting matrix should look like the following:
>
> B = [1 1 3 4 96 94.7 92.7; ...
> 2 3 4 5 79 88 90; ...
> 2 4 5 7 89 78 99; ...
> 2 6 9 7 89 78 99];
>
> where
> 96 = 1/3 * (98 + 98 + 92)
> 94.7 = 1/3 * (97 + 93 + 94)
> 92.7 = 1/3 * (96 + 92 + 90)
> and whether the values in the last three columns match or not is irrelevant (i.e. as illustrated by the last two rows of matrix A).
>
> Thank you so much!
>
> PS. In effect, this is sort of like Excel's vlookup function (not quite, since I want three elements to match whereas Excel only allows for one and also that I want a calculation based on the matches).

  The following should do the job for you, R L, if I have understood you correctly. I am not completely satisfied with it because it does a sort twice, once with 'sortrows' and the second time with 'unique', but so far I haven't figured a way around this. We assume the N by p+q array A is given:

 p = 4; % no. cols. to check;
 [N,q] = size(A);
 q = q-p; % no. cols to do averaging on
 T = sortrows(A,1:p); % Sort according to first p columns
 [B,m,n] = unique(T(:,1:p),'rows'); % Make 1st p cols. unique
 r = repmat(n,q,1); % Row nos. for accumarray
 c = reshape(repmat(1:q,N,1),[],1); % Col. nos. for accumarray
 T = reshape(T(:,p+1:p+q),[],1); % To be added by accumarray
 T = accumarray([r,c],T)); % Do the required addition
 d = repmat(diff(find([true;diff(n)~=0;true])),1,q); % Get divisor
 B = [B,T./d]; % Append mean values in last q columns

B is the desired result.

Roger Stafford

Subject: Matching data based on columns

From: R L

Date: 15 Mar, 2009 11:31:01

Message: 4 of 11

"Roger Stafford" <ellieandrogerxyzzy@mindspring.com.invalid> wrote in message <gpif2l$rg2$1@fred.mathworks.com>...
> "R L" <rcwlhk@hotmail.com> wrote in message <gphqm6$eh5$1@fred.mathworks.com>...
> > Hi all,
> >
> > I apologize in advance if this is a novice and trivial question but I would appreciate all the help I can get! I must have spent at least three days on trying to solve this problem and poured many hours on googling this solution but still to no avail.
> >
> > I want to sort data in the following format:
> >
> > A = [1 1 3 4 98 97 96; ...
> > 1 1 3 4 98 93 92; ...
> > 1 1 3 4 92 94 90; ...
> > 2 3 4 5 79 88 90; ...
> > 2 4 5 7 89 78 99; ...
> > 2 6 9 7 89 78 99];
> >
> > I want to sort the data based on the conditions of the first four columns of the matrix A. Moreover, for the rows that match the criteria, is it possible to get an average? Perhaps it is easier to explain with an example. That is, the resulting matrix should look like the following:
> >
> > B = [1 1 3 4 96 94.7 92.7; ...
> > 2 3 4 5 79 88 90; ...
> > 2 4 5 7 89 78 99; ...
> > 2 6 9 7 89 78 99];
> >
> > where
> > 96 = 1/3 * (98 + 98 + 92)
> > 94.7 = 1/3 * (97 + 93 + 94)
> > 92.7 = 1/3 * (96 + 92 + 90)
> > and whether the values in the last three columns match or not is irrelevant (i.e. as illustrated by the last two rows of matrix A).
> >
> > Thank you so much!
> >
> > PS. In effect, this is sort of like Excel's vlookup function (not quite, since I want three elements to match whereas Excel only allows for one and also that I want a calculation based on the matches).
>
> The following should do the job for you, R L, if I have understood you correctly. I am not completely satisfied with it because it does a sort twice, once with 'sortrows' and the second time with 'unique', but so far I haven't figured a way around this. We assume the N by p+q array A is given:
>
> p = 4; % no. cols. to check;
> [N,q] = size(A);
> q = q-p; % no. cols to do averaging on
> T = sortrows(A,1:p); % Sort according to first p columns
> [B,m,n] = unique(T(:,1:p),'rows'); % Make 1st p cols. unique
> r = repmat(n,q,1); % Row nos. for accumarray
> c = reshape(repmat(1:q,N,1),[],1); % Col. nos. for accumarray
> T = reshape(T(:,p+1:p+q),[],1); % To be added by accumarray
> T = accumarray([r,c],T)); % Do the required addition
> d = repmat(diff(find([true;diff(n)~=0;true])),1,q); % Get divisor
> B = [B,T./d]; % Append mean values in last q columns
>
> B is the desired result.
>
> Roger Stafford

Thank you to both of you! This is exactly the result that I need! Very efficient and simple! Great help to my project!

R L

Subject: Matching data based on columns

From: R L

Date: 15 Mar, 2009 16:33:02

Message: 5 of 11

"Roger Stafford" <ellieandrogerxyzzy@mindspring.com.invalid> wrote in message <gpif2l$rg2$1@fred.mathworks.com>...
> "R L" <rcwlhk@hotmail.com> wrote in message <gphqm6$eh5$1@fred.mathworks.com>...
> > Hi all,
> >
> > I apologize in advance if this is a novice and trivial question but I would appreciate all the help I can get! I must have spent at least three days on trying to solve this problem and poured many hours on googling this solution but still to no avail.
> >
> > I want to sort data in the following format:
> >
> > A = [1 1 3 4 98 97 96; ...
> > 1 1 3 4 98 93 92; ...
> > 1 1 3 4 92 94 90; ...
> > 2 3 4 5 79 88 90; ...
> > 2 4 5 7 89 78 99; ...
> > 2 6 9 7 89 78 99];
> >
> > I want to sort the data based on the conditions of the first four columns of the matrix A. Moreover, for the rows that match the criteria, is it possible to get an average? Perhaps it is easier to explain with an example. That is, the resulting matrix should look like the following:
> >
> > B = [1 1 3 4 96 94.7 92.7; ...
> > 2 3 4 5 79 88 90; ...
> > 2 4 5 7 89 78 99; ...
> > 2 6 9 7 89 78 99];
> >
> > where
> > 96 = 1/3 * (98 + 98 + 92)
> > 94.7 = 1/3 * (97 + 93 + 94)
> > 92.7 = 1/3 * (96 + 92 + 90)
> > and whether the values in the last three columns match or not is irrelevant (i.e. as illustrated by the last two rows of matrix A).
> >
> > Thank you so much!
> >
> > PS. In effect, this is sort of like Excel's vlookup function (not quite, since I want three elements to match whereas Excel only allows for one and also that I want a calculation based on the matches).
>
> The following should do the job for you, R L, if I have understood you correctly. I am not completely satisfied with it because it does a sort twice, once with 'sortrows' and the second time with 'unique', but so far I haven't figured a way around this. We assume the N by p+q array A is given:
>
> p = 4; % no. cols. to check;
> [N,q] = size(A);
> q = q-p; % no. cols to do averaging on
> T = sortrows(A,1:p); % Sort according to first p columns
> [B,m,n] = unique(T(:,1:p),'rows'); % Make 1st p cols. unique
> r = repmat(n,q,1); % Row nos. for accumarray
> c = reshape(repmat(1:q,N,1),[],1); % Col. nos. for accumarray
> T = reshape(T(:,p+1:p+q),[],1); % To be added by accumarray
> T = accumarray([r,c],T)); % Do the required addition
> d = repmat(diff(find([true;diff(n)~=0;true])),1,q); % Get divisor
> B = [B,T./d]; % Append mean values in last q columns
>
> B is the desired result.
>
> Roger Stafford


Hi Roger,

Thank you very much for your proposed solution --- it works very well for the purpose of my project! Thank you!

If I may --- and only if it doesn't trouble you too much --- ask for another favor from you?

I realized that I need to do averaging on some of the columns but summation on other columns. Perhaps I will illustrate it with an example.

Let the matrix A be the following:

A = [1 1 3 4 98 97 96 1000 900; ...
       1 1 3 4 98 93 92 200 800; ...
       1 1 3 4 92 94 90 400 700; ...
       2 3 4 5 79 88 90 500 600; ...
       2 4 5 7 89 78 99 600 300; ...
       2 6 9 7 89 78 99 800 100];

My desired result is the matrix B,

B = [1 1 3 4 96 94.7 92.7 1600 2400; ...
       2 3 4 5 79 88 90 500 600; ...
       2 4 5 7 89 78 99 600 300; ...
       2 6 9 7 89 78 99 800 100];

where
96 = 1/3 * (98 + 98 + 92)
94.7 = 1/3 * (97 + 93 + 94)
92.7 = 1/3 * (96 + 92 + 90)

and
1600 = 1000 + 200 + 400
2400 = 900 + 800 + 700

So effectively, the procedure is the same one as you had proposed but in addition to averaging columns 5 - 7 subject to the matching conditions of columns 1 - 4, is it possible to amend the code such that, subject to the matching conditions of columns 1 - 4, columns 8 - 9 can be summed?

I wish I could amend your fabulous code myself but unfortunately my MATLAB programming skills are still rudimentary. I thank you in advance for all your help and patience!

R L

Subject: Matching data based on columns

From: Roger Stafford

Date: 15 Mar, 2009 16:49:01

Message: 6 of 11

"R L" <rcwlhk@hotmail.com> wrote in message <gpjaju$pkq$1@fred.mathworks.com>...
> "Roger Stafford" <ellieandrogerxyzzy@mindspring.com.invalid> wrote in message <gpif2l$rg2$1@fred.mathworks.com>...
> > "R L" <rcwlhk@hotmail.com> wrote in message <gphqm6$eh5$1@fred.mathworks.com>...
> > > Hi all,
> > >
> > > I apologize in advance if this is a novice and trivial question but I would appreciate all the help I can get! I must have spent at least three days on trying to solve this problem and poured many hours on googling this solution but still to no avail.
> > >
> > > I want to sort data in the following format:
> > >
> > > A = [1 1 3 4 98 97 96; ...
> > > 1 1 3 4 98 93 92; ...
> > > 1 1 3 4 92 94 90; ...
> > > 2 3 4 5 79 88 90; ...
> > > 2 4 5 7 89 78 99; ...
> > > 2 6 9 7 89 78 99];
> > >
> > > I want to sort the data based on the conditions of the first four columns of the matrix A. Moreover, for the rows that match the criteria, is it possible to get an average? Perhaps it is easier to explain with an example. That is, the resulting matrix should look like the following:
> > >
> > > B = [1 1 3 4 96 94.7 92.7; ...
> > > 2 3 4 5 79 88 90; ...
> > > 2 4 5 7 89 78 99; ...
> > > 2 6 9 7 89 78 99];
> > >
> > > where
> > > 96 = 1/3 * (98 + 98 + 92)
> > > 94.7 = 1/3 * (97 + 93 + 94)
> > > 92.7 = 1/3 * (96 + 92 + 90)
> > > and whether the values in the last three columns match or not is irrelevant (i.e. as illustrated by the last two rows of matrix A).
> > >
> > > Thank you so much!
> > >
> > > PS. In effect, this is sort of like Excel's vlookup function (not quite, since I want three elements to match whereas Excel only allows for one and also that I want a calculation based on the matches).
> >
> > The following should do the job for you, R L, if I have understood you correctly. I am not completely satisfied with it because it does a sort twice, once with 'sortrows' and the second time with 'unique', but so far I haven't figured a way around this. We assume the N by p+q array A is given:
> >
> > p = 4; % no. cols. to check;
> > [N,q] = size(A);
> > q = q-p; % no. cols to do averaging on
> > T = sortrows(A,1:p); % Sort according to first p columns
> > [B,m,n] = unique(T(:,1:p),'rows'); % Make 1st p cols. unique
> > r = repmat(n,q,1); % Row nos. for accumarray
> > c = reshape(repmat(1:q,N,1),[],1); % Col. nos. for accumarray
> > T = reshape(T(:,p+1:p+q),[],1); % To be added by accumarray
> > T = accumarray([r,c],T)); % Do the required addition
> > d = repmat(diff(find([true;diff(n)~=0;true])),1,q); % Get divisor
> > B = [B,T./d]; % Append mean values in last q columns
> >
> > B is the desired result.
> >
> > Roger Stafford
>
>
> Hi Roger,
>
> Thank you very much for your proposed solution --- it works very well for the purpose of my project! Thank you!
>
> If I may --- and only if it doesn't trouble you too much --- ask for another favor from you?
>
> I realized that I need to do averaging on some of the columns but summation on other columns. Perhaps I will illustrate it with an example.
>
> Let the matrix A be the following:
>
> A = [1 1 3 4 98 97 96 1000 900; ...
> 1 1 3 4 98 93 92 200 800; ...
> 1 1 3 4 92 94 90 400 700; ...
> 2 3 4 5 79 88 90 500 600; ...
> 2 4 5 7 89 78 99 600 300; ...
> 2 6 9 7 89 78 99 800 100];
>
> My desired result is the matrix B,
>
> B = [1 1 3 4 96 94.7 92.7 1600 2400; ...
> 2 3 4 5 79 88 90 500 600; ...
> 2 4 5 7 89 78 99 600 300; ...
> 2 6 9 7 89 78 99 800 100];
>
> where
> 96 = 1/3 * (98 + 98 + 92)
> 94.7 = 1/3 * (97 + 93 + 94)
> 92.7 = 1/3 * (96 + 92 + 90)
>
> and
> 1600 = 1000 + 200 + 400
> 2400 = 900 + 800 + 700
>
> So effectively, the procedure is the same one as you had proposed but in addition to averaging columns 5 - 7 subject to the matching conditions of columns 1 - 4, is it possible to amend the code such that, subject to the matching conditions of columns 1 - 4, columns 8 - 9 can be summed?
>
> I wish I could amend your fabulous code myself but unfortunately my MATLAB programming skills are still rudimentary. I thank you in advance for all your help and patience!
>
> R L

  Yes, that sounds quite possible but it will have to wait until I return from a hike in our local mountains.

Roger Stafford

Subject: Matching data based on columns

From: R L

Date: 15 Mar, 2009 19:41:01

Message: 7 of 11

"Roger Stafford" <ellieandrogerxyzzy@mindspring.com.invalid> wrote in message <gpjbht$l97$1@fred.mathworks.com>...
> "R L" <rcwlhk@hotmail.com> wrote in message <gpjaju$pkq$1@fred.mathworks.com>...
> > "Roger Stafford" <ellieandrogerxyzzy@mindspring.com.invalid> wrote in message <gpif2l$rg2$1@fred.mathworks.com>...
> > > "R L" <rcwlhk@hotmail.com> wrote in message <gphqm6$eh5$1@fred.mathworks.com>...
> > > > Hi all,
> > > >
> > > > I apologize in advance if this is a novice and trivial question but I would appreciate all the help I can get! I must have spent at least three days on trying to solve this problem and poured many hours on googling this solution but still to no avail.
> > > >
> > > > I want to sort data in the following format:
> > > >
> > > > A = [1 1 3 4 98 97 96; ...
> > > > 1 1 3 4 98 93 92; ...
> > > > 1 1 3 4 92 94 90; ...
> > > > 2 3 4 5 79 88 90; ...
> > > > 2 4 5 7 89 78 99; ...
> > > > 2 6 9 7 89 78 99];
> > > >
> > > > I want to sort the data based on the conditions of the first four columns of the matrix A. Moreover, for the rows that match the criteria, is it possible to get an average? Perhaps it is easier to explain with an example. That is, the resulting matrix should look like the following:
> > > >
> > > > B = [1 1 3 4 96 94.7 92.7; ...
> > > > 2 3 4 5 79 88 90; ...
> > > > 2 4 5 7 89 78 99; ...
> > > > 2 6 9 7 89 78 99];
> > > >
> > > > where
> > > > 96 = 1/3 * (98 + 98 + 92)
> > > > 94.7 = 1/3 * (97 + 93 + 94)
> > > > 92.7 = 1/3 * (96 + 92 + 90)
> > > > and whether the values in the last three columns match or not is irrelevant (i.e. as illustrated by the last two rows of matrix A).
> > > >
> > > > Thank you so much!
> > > >
> > > > PS. In effect, this is sort of like Excel's vlookup function (not quite, since I want three elements to match whereas Excel only allows for one and also that I want a calculation based on the matches).
> > >
> > > The following should do the job for you, R L, if I have understood you correctly. I am not completely satisfied with it because it does a sort twice, once with 'sortrows' and the second time with 'unique', but so far I haven't figured a way around this. We assume the N by p+q array A is given:
> > >
> > > p = 4; % no. cols. to check;
> > > [N,q] = size(A);
> > > q = q-p; % no. cols to do averaging on
> > > T = sortrows(A,1:p); % Sort according to first p columns
> > > [B,m,n] = unique(T(:,1:p),'rows'); % Make 1st p cols. unique
> > > r = repmat(n,q,1); % Row nos. for accumarray
> > > c = reshape(repmat(1:q,N,1),[],1); % Col. nos. for accumarray
> > > T = reshape(T(:,p+1:p+q),[],1); % To be added by accumarray
> > > T = accumarray([r,c],T)); % Do the required addition
> > > d = repmat(diff(find([true;diff(n)~=0;true])),1,q); % Get divisor
> > > B = [B,T./d]; % Append mean values in last q columns
> > >
> > > B is the desired result.
> > >
> > > Roger Stafford
> >
> >
> > Hi Roger,
> >
> > Thank you very much for your proposed solution --- it works very well for the purpose of my project! Thank you!
> >
> > If I may --- and only if it doesn't trouble you too much --- ask for another favor from you?
> >
> > I realized that I need to do averaging on some of the columns but summation on other columns. Perhaps I will illustrate it with an example.
> >
> > Let the matrix A be the following:
> >
> > A = [1 1 3 4 98 97 96 1000 900; ...
> > 1 1 3 4 98 93 92 200 800; ...
> > 1 1 3 4 92 94 90 400 700; ...
> > 2 3 4 5 79 88 90 500 600; ...
> > 2 4 5 7 89 78 99 600 300; ...
> > 2 6 9 7 89 78 99 800 100];
> >
> > My desired result is the matrix B,
> >
> > B = [1 1 3 4 96 94.7 92.7 1600 2400; ...
> > 2 3 4 5 79 88 90 500 600; ...
> > 2 4 5 7 89 78 99 600 300; ...
> > 2 6 9 7 89 78 99 800 100];
> >
> > where
> > 96 = 1/3 * (98 + 98 + 92)
> > 94.7 = 1/3 * (97 + 93 + 94)
> > 92.7 = 1/3 * (96 + 92 + 90)
> >
> > and
> > 1600 = 1000 + 200 + 400
> > 2400 = 900 + 800 + 700
> >
> > So effectively, the procedure is the same one as you had proposed but in addition to averaging columns 5 - 7 subject to the matching conditions of columns 1 - 4, is it possible to amend the code such that, subject to the matching conditions of columns 1 - 4, columns 8 - 9 can be summed?
> >
> > I wish I could amend your fabulous code myself but unfortunately my MATLAB programming skills are still rudimentary. I thank you in advance for all your help and patience!
> >
> > R L
>
> Yes, that sounds quite possible but it will have to wait until I return from a hike in our local mountains.
>
> Roger Stafford

Thank you very much Roger! I hope you'll have a great and safe trip. I look forward to hearing back from you!

R L

Subject: Matching data based on columns

From: Darren Rowland

Date: 16 Mar, 2009 01:36:01

Message: 8 of 11


> > > > p = 4; % no. cols. to check;
> > > > [N,q] = size(A);
> > > > q = q-p; % no. cols to do averaging on
> > > > T = sortrows(A,1:p); % Sort according to first p columns
> > > > [B,m,n] = unique(T(:,1:p),'rows'); % Make 1st p cols. unique
> > > > r = repmat(n,q,1); % Row nos. for accumarray
> > > > c = reshape(repmat(1:q,N,1),[],1); % Col. nos. for accumarray
> > > > T = reshape(T(:,p+1:p+q),[],1); % To be added by accumarray
> > > > T = accumarray([r,c],T)); % Do the required addition
> > > > d = repmat(diff(find([true;diff(n)~=0;true])),1,q); % Get divisor
> > > > B = [B,T./d]; % Append mean values in last q columns
> > > >
> > > > B is the desired result.
> > > >
> > > > Roger Stafford

 > So effectively, the procedure is the same one as you had proposed but in addition to averaging columns 5 - 7 subject to the matching conditions of columns 1 - 4, is it possible to amend the code such that, subject to the matching conditions of columns 1 - 4, columns 8 - 9 can be summed?

Roger's code already calculates the sum as required, we just need to stop the average being taken. We can do this by changing d

 d = repmat(diff(find([true;diff(n)~=0;true])),1,q); % Get divisor
d(end-1:end) = 1; % Do not average the last two columns
B = [B,T./d]; % Append mean and summed values in last q columns

Hth
Darren

Subject: Matching data based on columns

From: Roger Stafford

Date: 16 Mar, 2009 04:30:04

Message: 9 of 11

"R L" <rcwlhk@hotmail.com> wrote in message <gpjlkd$j8e$1@fred.mathworks.com>...
> Thank you very much Roger! I hope you'll have a great and safe trip. I look forward to hearing back from you!
> R L

  Yes I returned safe but tired, R L. As an octogenarian I guess I am getting a little old for this hiking stuff.

  Darren's solution to your alteration looks valid. The other way to do it would be to create d just three columns wide and divide by it in the proper three column section of T. The remaining two columns of T could simply be appended as they are. I am referring to the T produced by 'accumarray' with sums in it, not the earlier T.

 ......
 d = repmat(diff(find([true;diff(n)~=0;true])),1,3);
 B = [B,T(;,1:3)./d,T(:,4:5)];

  I'll let you figure out the generalization in terms of p, q, and one more initially defined parameter: the width of the section to be simply added.

Roger Stafford

Subject: Matching data based on columns

From: R L

Date: 18 Mar, 2009 18:33:00

Message: 10 of 11

"Roger Stafford" <ellieandrogerxyzzy@mindspring.com.invalid> wrote in message <gpkkkc$l9e$1@fred.mathworks.com>...
> "R L" <rcwlhk@hotmail.com> wrote in message <gpjlkd$j8e$1@fred.mathworks.com>...
> > Thank you very much Roger! I hope you'll have a great and safe trip. I look forward to hearing back from you!
> > R L
>
> Yes I returned safe but tired, R L. As an octogenarian I guess I am getting a little old for this hiking stuff.
>
> Darren's solution to your alteration looks valid. The other way to do it would be to create d just three columns wide and divide by it in the proper three column section of T. The remaining two columns of T could simply be appended as they are. I am referring to the T produced by 'accumarray' with sums in it, not the earlier T.
>
> ......
> d = repmat(diff(find([true;diff(n)~=0;true])),1,3);
> B = [B,T(;,1:3)./d,T(:,4:5)];
>
> I'll let you figure out the generalization in terms of p, q, and one more initially defined parameter: the width of the section to be simply added.
>
> Roger Stafford

Sorry for the late reply --- been busy in the last few days.

Great solution and thanks for the pointers to all of you! Thanks again!

R L

Subject: Matching data based on columns

From: R L

Date: 18 Mar, 2009 18:34:01

Message: 11 of 11

"Roger Stafford" <ellieandrogerxyzzy@mindspring.com.invalid> wrote in message <gpkkkc$l9e$1@fred.mathworks.com>...
> "R L" <rcwlhk@hotmail.com> wrote in message <gpjlkd$j8e$1@fred.mathworks.com>...
> > Thank you very much Roger! I hope you'll have a great and safe trip. I look forward to hearing back from you!
> > R L
>
> Yes I returned safe but tired, R L. As an octogenarian I guess I am getting a little old for this hiking stuff.
>
> Darren's solution to your alteration looks valid. The other way to do it would be to create d just three columns wide and divide by it in the proper three column section of T. The remaining two columns of T could simply be appended as they are. I am referring to the T produced by 'accumarray' with sums in it, not the earlier T.
>
> ......
> d = repmat(diff(find([true;diff(n)~=0;true])),1,3);
> B = [B,T(;,1:3)./d,T(:,4:5)];
>
> I'll let you figure out the generalization in terms of p, q, and one more initially defined parameter: the width of the section to be simply added.
>
> Roger Stafford

Sorry for the late reply --- been busy in the last few days.

Great solution and thanks for the pointers to all of you! Thanks again!

R L

Tags for this Thread

No tags are associated with this thread.

What are tags?

A tag is like a keyword or category label associated with each thread. Tags make it easier for you to find threads of interest.

Anyone can tag a thread. Tags are public and visible to everyone.

Contact us