i have some objects that, over the course of my experiment,
will be growing larger. in an effort to minimize any
initial bias, i want to sort the objects before my
experiment begins, such that the average size of the objects
is nearly the same across my experimental groups.

for example, if my objects were plants, and i wanted to
monitor the rate at which the plants grow across a set of 4
experimental conditions, i might start with 40 seedlings. i
want to measure the lengths of the seedlings, and then
assign them into 4 groups such that the starting lengths are
approximately equal between the 4 groups.

anyone have some insight as to how this could be
accomplished in matlab?
In article <g1hqbm$4f4$1@fred.mathworks.com>, Bryan <cssmwbs@gmail.com> wrote:
>i have some objects that, over the course of my experiment,
>will be growing larger. in an effort to minimize any
>initial bias, i want to sort the objects before my
>experiment begins, such that the average size of the objects
>is nearly the same across my experimental groups.
You can use something like this:
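% sizedata: column vector of sizes; objectinformation: matching array of objects
clusteridx = kmeans(sizedata,numberofgroups);
clusters = cell(numberofgroups,1);
for K=1:numberofgroups
    clusters{K} = objectinformation(clusteridx == K);   % members of cluster K
end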
Note that the initial points for growing the clusters are chosen
at random, so if the clusters are not well separated, the clustering
may differ from run to run. You may wish to use some of the optional
kmeans() parameters to control this. And of course the absolute cluster
numbers don't mean anything, so what comes out as cluster index 1
in one run might come out as cluster index 4 in another.
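
As a rough sketch of those two points (an illustration only, not a tested recipe): the 'Replicates' option reruns kmeans() from several random starts and keeps the best solution, and relabelling the clusters by ascending mean size makes the numbering repeatable. kmeans() needs the Statistics Toolbox. The example data and the names clusterMeans, sortedMeans, byMean and relabel are assumptions made for illustration. Printing sortedMeans also shows how the mean size differs from cluster to cluster.

```matlab
% Sketch only: kmeans() requires the Statistics Toolbox.
% sizedata and numberofgroups are hypothetical example values.
sizedata = 10 + 2*rand(40,1);
numberofgroups = 4;

rng(1);                                    % fix the random seed so the starting points repeat
clusteridx = kmeans(sizedata, numberofgroups, ...
    'Replicates', 10);                     % 10 random starts, keep the best solution

% Relabel the clusters by ascending mean size so the numbering is stable between runs
sz = [numberofgroups 1];
clusterMeans = accumarray(clusteridx, sizedata, sz) ./ accumarray(clusteridx, 1, sz);
[sortedMeans, byMean] = sort(clusterMeans);
relabel = zeros(numberofgroups,1);
relabel(byMean) = 1:numberofgroups;        % old cluster number -> new cluster number
clusteridx = relabel(clusteridx);
```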
On May 27, 4:19 pm, "Bryan " <cssm...@gmail.com> wrote:
> for example, if my objects were plants, and i wanted to
> monitor the rate at which the plants grow across a set of 4
> experimental conditions, i might start with 40 seedlings. i
> want to measure the lengths of the seedlings, and then
> assign them into 4 groups such that the starting lengths are
> approximately equal between the 4 groups.
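
% x: column vector of measured sizes; its length must be a multiple of 4
y = sort(x);
for i = 1:4
    z(:,i) = y(i:4:end);   % every 4th sorted value goes to group i
end % for i

Hope this helps.

Greg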
Greg Heath <heath@alumni.brown.edu> wrote in message
<32779933-71c2-46aa-aa1b-923b4a42471e@8g2000hse.googlegroups.com>...
> y = sort(x);
> for i = 1:4
> z(:,i) = y(i:4:end);
> end% for i
>
> Hope this helps.
>
> Greg
hi,
thanks for the suggestions... i have not yet tried the
kmeans clustering (clever tactic!). but i did want to point
out that the simple 'sort and bin' method described above
does not work at all. it certainly does not approach the
equivalence of means that i was searching for. this can
even be seen with random data:
a = sort(rand(100,1));
idx = repmat((1:10)',10,1);   % cycle group labels 1..10 down the sorted values
[grpMeans, grpSems] = grpstats(a,idx,{'mean','sem'});
note the lack of equivalence of the means... rather, they
are in ascending order.
basically what i ended up doing was making a whole bunch
(thousands) of randomized index variables (looping from 1:4
through length(x)), and then iteratively running through and
finding the index that gave the min difference between the
largest and smallest mean values. this seems to have worked
rather well, and the biggest difference in mean values in my
final result is less than 1% difference.
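
A minimal sketch of that kind of randomized search, assuming equal-sized groups. This is not the code that was actually run; the example data and the names labels, ntrials, bestIdx, bestRange, bestMeans and pctDiff are hypothetical.

```matlab
% Randomized search for a balanced assignment (illustrative sketch only).
x = 10 + 2*rand(40,1);      % hypothetical data: 40 seedling lengths
ngroups = 4;                % number of experimental groups
ntrials = 5000;             % number of random assignments to try

n = numel(x);
labels = repmat((1:ngroups)', n/ngroups, 1);   % assumes n is a multiple of ngroups

bestRange = Inf;
bestIdx = [];
for t = 1:ntrials
    idx = labels(randperm(n));                 % random assignment of objects to groups
    grpMeans = accumarray(idx, x) ./ accumarray(idx, 1);
    r = max(grpMeans) - min(grpMeans);         % gap between largest and smallest group mean
    if r < bestRange                           % keep the most balanced assignment so far
        bestRange = r;
        bestIdx = idx;
    end
end

bestMeans = accumarray(bestIdx, x) ./ accumarray(bestIdx, 1);
pctDiff = 100 * bestRange / mean(x);           % largest gap as a percentage of the overall mean
```

bestIdx(k) is the group for object k in the original order of x. The cost grows with ntrials, and the result is only the best of the assignments that happened to be sampled.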
On May 28, 2:32 pm, "Bryan " <cssm...@gmail.com> wrote:
> a = sort(rand(100,1));
> idx = repmat([1:10],1,10)';
> [grpMeans grpSems] = grpstats(a,idx,{'mean','sem'});
>
> grpMeans =
>
> 0.4164
> 0.4209
> 0.4342
> 0.4467
> 0.4556
> 0.4642
> 0.4740
> 0.4908
> 0.4965
> 0.5108
>
> note the lack of equivalence of the means... rather, they
> are in ascending order.
A simple modification of one line in my code will cure that.
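
% length(x) must be a multiple of 2*ngroups
y = sort(x);
for i = 1:ngroups
    % z(:,i) = y(i:ngroups:end);
    % pair the i-th smallest with the i-th largest value in each block of 2*ngroups
    z(:,i) = [y(i:2*ngroups:end); y(2*ngroups+1-i:2*ngroups:end)];
end % for i
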
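
The same pairing idea can also be sketched so that it returns a group label for each object instead of a matrix of sorted values. In this sketch, within every block of 2*ngroups sorted values, group i receives ranks i and 2*ngroups+1-i; that rank sum is the same for every group, which is why the means end up close. The example data and the names order, pattern, ranksToGroup and grp are assumptions made for illustration.

```matlab
% Paired ("serpentine") assignment returning one group label per object (sketch).
x = 10 + 2*rand(40,1);                 % hypothetical sizes
ngroups = 4;

n = numel(x);
[~, order] = sort(x);                  % order(k) = index of the k-th smallest object
pattern = [1:ngroups, ngroups:-1:1]';  % group labels by rank: 1 2 3 4 4 3 2 1
ranksToGroup = repmat(pattern, ceil(n/(2*ngroups)), 1);
grp = zeros(n,1);
grp(order) = ranksToGroup(1:n);        % label for each object, in the original order of x

grpMeans = accumarray(grp, x) ./ accumarray(grp, 1);
pctDiff = 100 * (max(grpMeans) - min(grpMeans)) / mean(x);
```

Because the pattern is truncated to n entries, this version also runs when n is not a multiple of 2*ngroups, although the group sizes then differ by up to two.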
> basically what i ended up doing was making a whole bunch
> (thousands) of randomized index variables (looping from 1:4
> through length(x)), and then iteratively running through and
> finding the index that gave the min difference between the
> largest and smallest mean values. this seems to have worked
> rather well, and the biggest difference in mean values in my
> final result is less than 1% difference.

when length(y) = 1000 and ngroups = 10, I get ~0.1%:

clear all, clc

n = 1000      % number of objects; must be a multiple of 2*nbins
nbins = 10    % number of groups
y = sort(rand(n,1));
for i = 1:nbins
    z1(:,i) = y(i:nbins:end);   % plain sort and bin
    z2(:,i) = [y(i:2*nbins:end); y(2*nbins+1-i:2*nbins:end)];   % paired bins
end % for i

% plain sort and bin: group means, then range and std as a percentage of the grand mean
m1 = mean(z1)'
m10 = mean(m1)
rng1 = max(m1)-min(m1)
d1 = 100*rng1/m10
s1 = std(m1)
cv1 = 100*s1/m10

% paired bins: same statistics
m2 = mean(z2)'
m20 = mean(m2)
rng2 = max(m2)-min(m2)
d2 = 100*rng2/m20
s2 = std(m2)
cv2 = 100*s2/m20

resultd = [d1 d2]
resultc = [cv1 cv2]

Hope this helps.

Greg

Greg Heath <heath@alumni.brown.edu> wrote in message
<75b8618b-b8eb-4f05-9068-230bf46766ad@j22g2000hsf.googlegroups.com>...
> A simple modification of one line in my code will cure that.

hi greg,
thanks so much for the modified version of your code. it
certainly works much better than the previous version, and
provides an equivalent result to my 'randomized search'
strategy with considerably less computation time.
regards,
bryan