Data Import

Import Next-Generation Sequencing (NGS) data and feature annotations from SAM, BAM, FASTA, FASTQ, GTF, and GFF files

Import NGS data stored in different file formats such as FASTA, FASTQ, SAM, and BAM files. Read feature annotations from GTF and GFF files. Use various objects to access and manage NGS data. For instance, the BioIndexedFile object lets you efficiently access text files with nonuniform-size entries, such as sequences and annotations. Use the object to access individual entries or a subset of entries when the source file is too big to fit in memory. Use the BioMap and BioRead objects to store and manage sequence read data containing information on headers, qualities, and alignments.
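As a rough sketch of the indexed access described above, the following writes a tiny FASTA file with entries of different lengths and retrieves single entries through a BioIndexedFile object. The sequence data, the headers `seqA` to `seqC`, and the temporary file location are made up for illustration. The sketch assumes the Bioinformatics Toolbox is installed and that the default FASTA parser returns a structure with `Header` and `Sequence` fields.

```matlab
% Made-up sequences of different lengths (illustrative data only).
data = struct('Header',   {'seqA', 'seqB', 'seqC'}, ...
              'Sequence', {'ACGTACGTAC', 'GGCC', 'TTTTAAAACCCCGGGG'});

% Write them to a new temporary FASTA file.
fastaFile = [tempname '.fa'];
fastawrite(fastaFile, data);

% Build an index over the file. By default the index file is saved
% in the same folder as the source file.
bif = BioIndexedFile('FASTA', fastaFile);

% Number of entries found while indexing.
numEntries = bif.NumEntries;

% Raw text of the second entry, fetched without reading the whole file.
rawEntry = getEntryByIndex(bif, 2);

% Parsed third entry, looked up by its key (the header in this sketch).
entry = read(bif, 'seqC');
disp(entry.Sequence)
```

The same pattern applies to files that are too large to load at once: only the index is held in memory, and each entry is read from disk on request.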

Functions

fastainfo: Return information about FASTA file
fastaread: Read data from FASTA file
fastawrite: Write to file using FASTA format
fastqinfo: Return information about FASTQ file
fastqread: Read data from FASTQ file
fastqwrite: Write to file using FASTQ format
saminfo: Return information about SAM file
samread: Read data from SAM file
baminfo: Return information about BAM file
bamread: Read data from BAM file
bamindexread: Read BAM index (BAI) file
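
A minimal sketch of how the FASTQ functions listed above fit together: write two reads to a temporary file, query the file, and read the data back. The read names, sequences, and quality strings are invented for illustration, and the sketch assumes the Bioinformatics Toolbox is installed.

```matlab
% Two made-up reads with Phred+33 quality strings (illustrative data only).
reads = struct('Header',   {'read1', 'read2'}, ...
               'Sequence', {'ACGT', 'GGCCA'}, ...
               'Quality',  {'IIII', 'IIIII'});

% Write the reads to a new temporary FASTQ file.
fastqFile = [tempname '.fastq'];
fastqwrite(fastqFile, reads);

% Summary information about the file, including the number of entries.
info = fastqinfo(fastqFile);
fprintf('Entries in file: %d\n', info.NumberOfEntries);

% Read the entries back as a structure array with
% Header, Sequence, and Quality fields.
readsBack = fastqread(fastqFile);
```

The FASTA, SAM, and BAM functions follow the same info/read naming pattern, although the fields they return differ by format.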

Classes

BioRead: Contain sequence and quality data
BioMap: Contain sequence, quality, alignment, and mapping data
BioReadQualityStatistics: Quality statistics from a short-read sequence file
BioIndexedFile: Allow quick and efficient access to large text file with nonuniform-size entries
GFFAnnotation: Contain General Feature Format (GFF) annotations
GTFAnnotation: Contain Gene Transfer Format (GTF) annotations
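
As a hedged illustration of the BioRead class listed above, the sketch below builds an object from an in-memory structure and takes a subset of it. The read data are invented, and the sketch assumes that the BioRead constructor accepts a structure with `Header`, `Sequence`, and `Quality` fields (the shape returned by `fastqread`).

```matlab
% Made-up reads in the structure layout returned by fastqread.
reads = struct('Header',   {'read1', 'read2', 'read3'}, ...
               'Sequence', {'ACGT', 'GGCCA', 'TTAGC'}, ...
               'Quality',  {'IIII', 'IIIII', 'IIIII'});

% Store the reads in a BioRead object.
br = BioRead(reads);

% Number of reads held by the object.
numReads = br.NSeqs;

% Headers of all reads.
headers = getHeader(br);

% A new BioRead object containing only the first and third reads.
brSubset = getSubset(br, [1 3]);
numSubset = brSubset.NSeqs;
```

A BioMap object is used in a similar way but additionally carries alignment and mapping information, so it is typically constructed from a SAM or BAM file rather than from FASTQ data.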

Examples and How To

Work with Next-Generation Sequencing Data

Use BioIndexedFile objects to extract entries from large files using indices or keys, and parse data using custom functions.

Manage Sequence Read Data in Objects

Use BioMap and BioRead objects to access and manage Next-Generation Sequencing (NGS) data from various file formats, such as FASTQ, SAM, and BAM files.

Store and Manage Feature Annotations in Objects

Use GTF and GFF feature annotation objects to retrieve feature information from one or more reference sequences.

Concepts

Data Formats and Databases

Access online databases and repositories using various MATLAB® functions and import data to the workspace for further analyses.
