From: "CT " <cong-thanh.do@hotmail.fr>
Newsgroups: comp.soft-sys.matlab
Subject: Re: Bootstrapping multivariate data
Date: Wed, 18 Aug 2010 15:55:28 +0000 (UTC)
Message-ID: <i4gvpg$3bu$1@fred.mathworks.com>
References: <i4d8f9$n3a$1@fred.mathworks.com> <i4dvkq$8ns$2@fred.mathworks.com> <i4e5mt$c2f$1@fred.mathworks.com> <i4eikn$ndn$1@fred.mathworks.com> <i4eo1o$c8b$1@fred.mathworks.com> <i4fttk$qpk$1@fred.mathworks.com> <i4g04r$fa8$1@fred.mathworks.com> <i4g0sl$a6k$1@fred.mathworks.com>

For instance, I have initial data consisting of N = 500 observations of a trivariate (M = 3) random vector x that follows a multivariate normal distribution. These data can be generated by the code:

    mu = [1 -1 -2];
    Sigma = [2 -1 1; -1 2 -1; 1 -1 2];
    X = mvnrnd(mu, Sigma, 500);

(Note that mvnrnd returns X as a 500-by-3 matrix, one observation per row, rather than 3-by-500.)

I don't know whether I can use 'bootstrp' to generate data of the same nature, i.e. data that follow (asymptotically) the multivariate normal distribution that I used to generate X:

    [bootstat, bootsamp] = bootstrp(10, [], X);

(I don't care about the statistics of the data at the moment; I only want the resampled data.)

However, 'bootstrp' returns a matrix bootsamp of dimension 500x10, so does 'bootstrp' only work for a one-dimensional variable? And I don't know whether 'bootstrp' can return the statistics for a multivariate distribution or not.
(Here the statistics are the mean vector and covariance matrix.)

"Rogelio " <rogelioa@math.uio.no> wrote in message <i4g0sl$a6k$1@fred.mathworks.com>...
> By the way, what is the statistic that you are bootstrapping? It would be nice if you posted the code.
>
> "Rogelio " <rogelioa@math.uio.no> wrote in message <i4g04r$fa8$1@fred.mathworks.com>...
> > If you are saying, or have a feeling, that your data might come from a multivariate distribution, then as far as I know 'bootstrp' will pool your data together, assuming they come from the same pdf, which might be an erroneous assumption.
> >
> > >I have tried to use BOOTSTRP to perform the bootstrapping, but it is not easy, even unfeasible (tell me if I am wrong), since the manual of BOOTSTRP in Matlab is not clear in this case (I think)<
> > Why? Can you tell us what the mistake is, or post the code?
> >
> > >As I see it, this is only a reshuffling of the initial data; we cannot expect anything different from the new data. Am I wrong?<
> > What bootstrapping does, roughly speaking, is resample with replacement. We create pseudo-random samples from your original data. The empirical pdf will converge to the true pdf asymptotically.
> >
> >
> > "CT " <cong-thanh.do@hotmail.fr> wrote in message <i4fttk$qpk$1@fred.mathworks.com>...
> > > I mean that I have N observations of the random vector x, and the vector x has M elements; these are the seed data. So each variable here is a vector (of M elements). Their probability density function (pdf) might be a multivariate distribution, e.g. a Gaussian mixture model (GMM). Since the bootstrap here is non-parametric, the N observations will be used instead of a concrete pdf.
> > >
> > > I have tried to use BOOTSTRP to perform the bootstrapping, but it is not easy, even unfeasible (tell me if I am wrong), since the manual of BOOTSTRP in Matlab is not clear in this case (I think).
> > >
> > > If the generated data is only X(:,ceil(rand(1,N)*N)), I don't see anything new that the bootstrap can bring. As I see it, this is only a reshuffling of the initial data; we cannot expect anything different from the new data. Am I wrong?
> > >
> > > "Rogelio " <rogelioa@math.uio.no> wrote in message <i4eo1o$c8b$1@fred.mathworks.com>...
> > > > Peter Perkins <Peter.Perkins@MathRemoveThisWorks.com> wrote in message <i4eikn$ndn$1@fred.mathworks.com>...
> > > > > On 8/17/2010 10:18 AM, Simon Preston wrote:
> > > > > >> <http://www.mathworks.com/access/helpdesk/help/toolbox/stat
> > > > > >> /bootstrp.html>
> > > > >
> > > > > Sorry, for some reason that link was missing an "s"
> > > > > <http://www.mathworks.com/access/helpdesk/help/toolbox/stats/bootstrp.html>
> > > > >
> > > > > > Isn't this just:
> > > > > >
> > > > > > X(:,ceil(rand(1,N)*N))
> > > > > >
> > > > > > where X is the sample matrix?
> > > > >
> > > > > That's the basis of it, yes. But:
> > > > >
> > > > > 1) It's kind of tedious to write the same loop over and over, regardless
> > > > > of how simple that loop is,
> > > > > 2) There is a good deal of flexibility in the arguments you can pass to
> > > > > BOOTSTRP, so a single matrix isn't the only case it handles for you, and
> > > > > 3) (in recent MATLAB releases) There is support for parallelizing the
> > > > > computations using PARFOR (if your installation supports that)
> > > > >
> > > > > Just as an aside, since 2008b you might find it easier to use RANDI to
> > > > > generate random integers.
> > > >
> > > > Just one thing to point out: you said that M is the dimension of the data. I thought that you meant different groups or different experiments where the data were collected; after all, that's why your data are not of dimension N*M x 1, for instance. If the columns of the matrix represent different groups then, for one reason or another, you cannot pool the series. As far as I know, 'bootstrp' does not distinguish among different groups. If this last statement is incorrect, can someone send me a link to read about it?
> > > > Thanks
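
A minimal sketch of the row-wise resampling discussed in this thread, assuming X is stored as N-by-M with one observation per row. Each bootstrap replicate draws N row indices with replacement and keeps whole rows together, so the dependence between the M components is preserved. The index matrix idx plays the same role as the 500x10 second output of 'bootstrp', which, as far as the documentation describes it, holds row indices into X rather than resampled values. The names idx, bootMean and bootCov are illustrative only, and the seed data are drawn with chol and randn instead of mvnrnd so that no toolbox is needed:

```matlab
% Sketch: nonparametric bootstrap of multivariate data by resampling rows.
% Assumption: X is N-by-M with one observation per row.
N = 500;
mu = [1 -1 -2];
Sigma = [2 -1 1; -1 2 -1; 1 -1 2];

% Seed data without the Statistics Toolbox: with R = chol(Sigma) we have
% R'*R = Sigma, so the rows of randn(N,3)*R have covariance Sigma.
X = repmat(mu, N, 1) + randn(N, 3) * chol(Sigma);

nboot = 10;                      % number of bootstrap replicates
idx = randi(N, N, nboot);        % N-by-nboot row indices, drawn with replacement

bootMean = zeros(nboot, 3);      % one mean vector per replicate
bootCov  = zeros(3, 3, nboot);   % one covariance matrix per replicate
for k = 1:nboot
    Xk = X(idx(:, k), :);        % k-th resampled data set (N-by-3)
    bootMean(k, :) = mean(Xk, 1);
    bootCov(:, :, k) = cov(Xk);
end
```

With 'bootstrp' itself, the analogous step would presumably be X(bootsamp(:, k), :) for the k-th column of bootsamp. The spread of the rows of bootMean and of the slices of bootCov across replicates is what the bootstrap adds beyond a single reshuffling of the data.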