Randomly sample from data, with or without replacement

Syntax

y = datasample(data,k)
y = datasample(data,k,dim)
[y,idx] = datasample(data,k,...)
[y,...] = datasample(s,data,k,...)
[y,...] = datasample(data,k,Name,Value)
[y,...] = datasample(data,k,dim,Name,Value)

Description

y = datasample(data,k) returns k observations sampled uniformly at random, with replacement, from the data in data.

y = datasample(data,k,dim) returns a sample taken along dimension dim of data.

[y,idx] = datasample(data,k,...) returns an index vector indicating which values datasample sampled from data.

[y,...] = datasample(s,data,k,...) uses the random number stream s to generate random numbers.

[y,...] = datasample(data,k,Name,Value) or [y,...] = datasample(data,k,dim,Name,Value) samples with additional options specified by one or more Name,Value pair arguments.

Input Arguments

data

Vector, matrix, N-dimensional array, table, or dataset array representing the data from which to sample. By default, datasample regards the rows of a data matrix, or the first nonsingleton dimension of a data array, as data elements. Change this behavior with the dim argument.

k

Positive integer specifying the number of observations to sample.

dim

Integer specifying the dimension along which to sample. For example, if data is a matrix and dim is 2, y contains a selection of columns in data. If data is a table or dataset array and dim is 2, y contains a selection of variables in data. Use dim to ensure sampling along a specific dimension regardless of whether data is a vector, matrix, or N-dimensional array.

Default: 1
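
As a minimal sketch of how dim changes the shape of the result, the following samples rows and then columns of a small matrix. The matrix and the variable names are assumptions chosen only for illustration.

```matlab
% Illustrative 4-by-6 matrix; names here are hypothetical.
X = reshape(1:24,4,6);

rowSample = datasample(X,3);      % default dim = 1: three rows of X
colSample = datasample(X,3,2);    % dim = 2: three columns of X

size(rowSample)   % 3 rows by 6 columns
size(colSample)   % 4 rows by 3 columns
```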

s

Random number stream, specified as a RandStream object. Create s using RandStream.

Default: The global random number stream
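
A short sketch of passing a dedicated stream follows. The generator type and seed are arbitrary choices made for this illustration, not recommendations.

```matlab
% Create a dedicated stream; generator type and seed are arbitrary.
s = RandStream('mlfg6331_64','Seed',1);

% Draw five values without replacement using s instead of the global stream.
y = datasample(s,1:100,5,'Replace',false);

% Resetting the stream to its initial state reproduces the same draw.
reset(s);
yAgain = datasample(s,1:100,5,'Replace',false);
isequal(y,yAgain)
```

Because the draw uses s, sampling this way leaves the global random number stream untouched.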

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

'Replace'

Logical flag indicating whether to sample with replacement (true) or without replacement (false). If Replace is false, k must not exceed the number of data elements in data.

Default: true
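
The difference between the two settings can be sketched as follows, assuming a simple integer vector as the data:

```matlab
data = 1:10;

% Without replacement, taking all 10 elements yields a permutation of data.
yNoRep = datasample(data,numel(data),'Replace',false);
isequal(sort(yNoRep),data)

% With replacement (the default), k may exceed the number of data elements.
yRep = datasample(data,25);
numel(yRep)
```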

'Weights'

Vector of nonnegative values with one entry for each data element in data. datasample selects each data element with probability proportional to the corresponding entry of Weights.

Default: ones(datasize,1), where datasize is the number of data elements in data
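
As a rough illustration of proportional weighting, the sketch below uses made-up data and weights. Because sampling is proportional to Weights, the weights are not normalized here.

```matlab
data = [10 20 30 40];
w    = [0 1 1 2];     % hypothetical weights; the value 10 has weight 0

y = datasample(data,10000,'Weights',w);

any(y == 10)          % an element with zero weight is never drawn
mean(y == 40)         % close to 2/4 = 0.5, up to sampling noise
```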

Output Arguments

y

  • If data is a vector, y is a vector containing k elements selected from data.

  • If data is a matrix, y is a matrix containing k rows selected from data. Or, if dim = 2, y is a matrix containing k columns selected from data.

  • If data is an N-dimensional array, datasample samples along its first nonsingleton dimension. Or, if you specify dim, datasample samples along dimension dim.

When the sample is taken with replacement (default), y can contain repeated observations from data. Set the Replace name-value pair to false to sample without replacement.

idx

Vector of indices indicating which elements datasample chose from data to create y. For example:

  • If data is a vector, y = data(idx).

  • If data is a matrix, y = data(idx,:).
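
The relationship between y and idx can be checked directly. The sketch below uses an arbitrary matrix and, by analogy with the row case, also assumes that idx indexes columns when dim is 2.

```matlab
data = magic(5);

[yRows,idxRows] = datasample(data,3);       % sample rows
isequal(yRows,data(idxRows,:))

[yCols,idxCols] = datasample(data,3,2);     % sample columns
isequal(yCols,data(:,idxCols))
```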

Examples

Draw five unique values from the integers 1:10.

y = datasample(1:10,5,'Replace',false)

y =
     6     3     7     8     5

Generate a random sequence of the characters ACGT, with replacement, according to specified probabilities.

seq = datasample('ACGT',48,'Weights',[0.15 0.35 0.35 0.15])

Select a random subset of columns from a data matrix.

X = randn(10,1000);
Y = datasample(X,5,2,'Replace',false)

Y =
    0.7007    0.3382    2.1298   -0.1891    0.5026
    0.6520   -0.6693   -0.1961   -0.9915    1.9107
    0.1785    0.6640    2.3247   -1.1735   -1.0020
    1.6760    2.6102   -0.8902   -0.7735    1.8676
   -0.3251   -0.6415   -0.2572   -0.1629   -1.0523
    0.1011    0.9323   -1.3088   -0.4477    0.8036
   -0.5767   -0.5778   -0.8556    0.8672   -0.0727
   -0.0615   -0.9084    0.9020   -0.4185   -1.9520
    0.7256   -1.1228    0.7558    1.2691    2.4997
   -1.2273    0.5754   -0.8755   -0.8224   -1.2066

Resample observations from a dataset array to create a bootstrap replicate dataset.

load hospital
y = datasample(hospital,size(hospital,1));
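
Repeating the same resampling step many times gives a bootstrap estimate of a statistic's variability. The following is only a sketch: it uses synthetic data in place of hospital, and the seed and number of replicates are arbitrary.

```matlab
rng(0);                       % arbitrary seed, for reproducibility only
x = randn(50,1);              % synthetic data standing in for a real sample
nBoot = 500;                  % number of bootstrap replicates (arbitrary)

bootMeans = zeros(nBoot,1);
for b = 1:nBoot
    xb = datasample(x,numel(x));   % resample with replacement
    bootMeans(b) = mean(xb);
end

seBoot = std(bootMeans)       % bootstrap estimate of the standard error of mean(x)
```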

Use the second output to sample "in parallel" from two data vectors.

x1 = randn(100,1);
x2 = randn(100,1);
[y1,idx] = datasample(x1,10);
y2 = x2(idx);

Tips

  • To sample random integers with replacement from a range, use randi.

  • To sample random integers without replacement, use randperm or datasample.

  • To randomly sample from data, with or without replacement, use datasample.

Algorithms

datasample uses randperm, rand, or randi to generate random values. Therefore, datasample changes the state of the MATLAB® global random number generator. Control the random number generator using rng.
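
Because datasample draws from the global generator, seeding it with rng makes a sample repeatable. A minimal sketch, with an arbitrary seed:

```matlab
rng(42);                          % arbitrary seed
y1 = datasample(1:100,5);

rng(42);                          % restore the same generator state
y2 = datasample(1:100,5);

isequal(y1,y2)
```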

For selecting weighted samples without replacement, datasample uses the algorithm of Wong and Easton [1].

Alternatives

You can use randi or randperm to generate indices for random sampling with or without replacement, respectively. However, datasample can be more convenient because it samples directly from your data. datasample also allows weighted sampling.
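
For comparison, the index-based approach and the direct datasample calls might look like this. The data matrix and variable names are assumptions for this sketch.

```matlab
data = randn(100,3);
n = size(data,1);
k = 10;

% With replacement: random row indices from randi.
idxWith = randi(n,k,1);
yWith = data(idxWith,:);

% Without replacement: k unique row indices from randperm.
idxWithout = randperm(n,k);
yWithout = data(idxWithout,:);

% The one-call datasample forms for comparison.
yWith2 = datasample(data,k);
yWithout2 = datasample(data,k,'Replace',false);
```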

References

[1] Wong, C. K., and M. C. Easton. An Efficient Method for Weighted Sampling Without Replacement. SIAM Journal on Computing 9(1), pp. 111–113, 1980.

    Introduced in R2011b
