seqsplitpe

Split merged paired-end sequences into separate files

Syntax

seqsplitpe(fastqFile)
seqsplitpe(___,Name,Value)
[outFiles,N] = seqsplitpe(___)

Description

seqsplitpe(fastqFile) splits merged paired-end sequences from fastqFile into two separate files. Each sequence is split in the middle. The first half of the sequence is saved in the first output file and the other half in the second output file. By default, each output file name consists of the input file name appended with a suffix '_1' or '_2' before the file extension.
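
The splitting rule described above can be illustrated on a single read. This is only a minimal sketch in plain MATLAB, not the function's implementation: the sequence, the quality string, and all variable names are made up, and it assumes the merged read has an even length and that the quality string is split at the same position as the sequence.

```matlab
% Hypothetical merged read: two mates concatenated end to end.
mergedSeq  = 'ACGTACGTTTGGCCAA';
mergedQual = 'IIIIHHHHGGGGFFFF';

% Split point: the middle of the read (assumes an even length).
half = numel(mergedSeq) / 2;

% The first half would go to the first output file, the rest to the second.
read1 = mergedSeq(1:half);
read2 = mergedSeq(half+1:end);

% Assumption: quality scores are split at the same position.
qual1 = mergedQual(1:half);
qual2 = mergedQual(half+1:end);
```

Here read1 is 'ACGTACGT' and read2 is 'TTGGCCAA', each eight bases long.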

seqsplitpe(___,Name,Value) uses additional options specified by one or more Name,Value pair arguments.

[outFiles,N] = seqsplitpe(___) returns the names of the output files in a cell array, outFiles, and a vector, N, containing the number of sequences saved in each output file.

Examples

Split each of the paired-end sequences in half, and store each half in separate output files.

[outFiles, N] = seqsplitpe('SXX123456_merged.fastq');

Check the number of sequences in each output file.

N
N =

    50
    50

Input Arguments

fastqFile: Names of FASTQ files with sequence and quality information, specified as a character vector or cell array of character vectors.

Example: 'SRR005164_1_50.fastq'

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: 'OutputSuffix','PairedEnd_split' specifies a custom suffix for the output file names.

'OutputDir': Relative or absolute path to the output file directory, specified as a character vector. The default is the current directory.

Example: 'OutputDir','F:\results'

'OutputSuffix': Custom suffix to use in the output file names, specified as a character vector. The custom suffix is inserted after the input file name and before the suffix '_1' or '_2'. The default is '' (no custom suffix).

Example: 'OutputSuffix','_MisMatches2'
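
The naming rule can be sketched with plain string operations. This is an illustration of the rule as described on this page, not the function's own code: the variable names are hypothetical, the input file name is the placeholder used in the example on this page, and the names the function actually produces may differ in detail.

```matlab
% Placeholder input name and the custom suffix from the example above.
inputFile    = 'SXX123456_merged.fastq';
outputSuffix = '_MisMatches2';

% Separate the base name from the file extension.
[~, baseName, ext] = fileparts(inputFile);

% Custom suffix first, then '_1' or '_2', then the original extension.
outName1 = [baseName, outputSuffix, '_1', ext];
outName2 = [baseName, outputSuffix, '_2', ext];
```

Under these assumptions, outName1 is 'SXX123456_merged_MisMatches2_1.fastq' and outName2 is 'SXX123456_merged_MisMatches2_2.fastq'.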

'UseParallel': Boolean indicating whether to perform computation in parallel, specified as true or false.

For parallel computing, you must have Parallel Computing Toolbox™. If a parallel pool does not exist, one is created automatically when the auto-creation option is enabled in your parallel preferences. Otherwise, computation runs in serial mode.

Note

There is a cost associated with sharing large input files across workers in a distributed environment. In some cases, running in parallel may not be beneficial in terms of performance.

Example: 'UseParallel',true
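
The three name-value pairs described above can be combined in a single call. The following is a sketch only: the input file name is the placeholder from the example on this page, the '_split' suffix and the use of tempdir as the output directory are illustrative choices, and the call is guarded so that it runs only when the function and the input file are actually available.

```matlab
% Placeholder input file name taken from the example on this page.
inFile = 'SXX123456_merged.fastq';

% Name-value options collected in a cell array so they can be reused.
opts = {'OutputDir',    tempdir, ...    % write the split files to the temp folder
        'OutputSuffix', '_split', ...   % illustrative custom suffix
        'UseParallel',  false};         % run in serial mode

% Call seqsplitpe only if it is on the path and the input file exists.
if ~isempty(which('seqsplitpe')) && exist(inFile, 'file') == 2
    [outFiles, N] = seqsplitpe(inFile, opts{:});
    disp(outFiles);   % names of the two output files
    disp(N);          % number of sequences written to each file
end
```

Passing the options as opts{:} expands the cell array into the comma-separated Name,Value list that the function expects.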

Output Arguments

outFiles: Output file names, returned as a cell array of character vectors. By default, the name of each output file consists of the input file name appended with a suffix '_1' or '_2' before the file extension.

N: Number of sequences saved in each output file, returned as an n-by-1 vector, where n is the number of output files. The order of the entries in N corresponds to the order of the output files.

Introduced in R2016b
