Documentation

emblread

Read data from EMBL file

Syntax

EMBLData = emblread(File)
EMBLSeq = emblread(File, 'SequenceOnly', SequenceOnlyValue)

Input Arguments

File

Either of the following:

  • Character vector specifying a file name, a path and file name, or a URL pointing to a file. The referenced file is an EMBL-formatted file. If you specify only a file name, that file must be on the MATLAB® search path or in the MATLAB Current Folder.

  • Character vector that contains the text of an EMBL-formatted file
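
The two forms of File can be sketched as follows. This is a minimal illustration, assuming that rat_protein.txt is an EMBL-formatted file that already exists in the current folder (the name is borrowed from the example later on this page), and that fileread is an acceptable way to obtain the file text as a character vector.

```matlab
% Assumption: 'rat_protein.txt' is an existing EMBL-formatted file
% in the MATLAB Current Folder or on the search path.

% Form 1: pass a file name.
dataFromFile = emblread('rat_protein.txt');

% Form 2: pass the text of the file as a character vector.
embltext = fileread('rat_protein.txt');
dataFromText = emblread(embltext);

% Both calls are expected to describe the same entry.
disp(isequal(dataFromFile.Accession, dataFromText.Accession))
```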

Tip

You can use the getembl function with the 'ToFile' property to retrieve data from the European Molecular Biology Laboratory (EMBL) database and create an EMBL-formatted file.

SequenceOnlyValue

Controls whether only the sequence is read, without the metadata. Choices are true or false (default).

Output Arguments

EMBLData

Structure with fields corresponding to EMBL data.

EMBLSeq

Character vector representing the sequence.

Description

EMBLData = emblread(File) reads data from File, an EMBL-formatted file, and creates EMBLData, a MATLAB structure containing fields corresponding to the EMBL two-character line type codes, based on release 107 of the EMBL-Bank flat file format. Each line type code is stored as a separate element in the structure. For a list of the EMBL two-character line type codes, see ftp://ftp.ebi.ac.uk/pub/databases/embl/doc/usrman.txt.

Note

Topology information was not included in EMBL flat files before release 87 of the database. When reading a file created before release 87, emblread returns an empty Identification.Topology field.

Note

The entry name is no longer displayed in the ID line of EMBL flat files in release 87. When reading a file created in release 87, emblread returns the accession number in the Identification.EntryName field.
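
Code that consumes the Identification fields may want to allow for both cases described in the notes above. The following is a minimal sketch, assuming that rat_protein.txt is an existing EMBL-formatted file (a name borrowed from the example later on this page); the messages printed are illustrative only.

```matlab
% Assumption: 'rat_protein.txt' is an existing EMBL-formatted file.
data = emblread('rat_protein.txt');
id = data.Identification;

% Files created before release 87 carry no topology information,
% so the Topology field can be empty.
if isempty(id.Topology)
    disp('No topology information in this file.')
else
    disp(id.Topology)
end

% Depending on the release, EntryName may hold the accession number.
if isequal(id.EntryName, data.Accession)
    disp('EntryName holds the accession number.')
else
    disp(id.EntryName)
end
```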

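EMBLSeq = emblread(File, 'SequenceOnly', SequenceOnlyValue) controls whether only the sequence is read, without the metadata. Choices are true or false (default). When SequenceOnlyValue is true, the function returns EMBLSeq, a character vector representing the sequence.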
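
As a short illustration of the 'SequenceOnly' option, the sketch below reads only the sequence. It assumes that rat_protein.txt is an existing EMBL-formatted file (a name borrowed from the example later on this page).

```matlab
% Assumption: 'rat_protein.txt' is an existing EMBL-formatted file.
% Read only the sequence, skipping the metadata.
seq = emblread('rat_protein.txt', 'SequenceOnly', true);

% The result is a character vector rather than a structure.
disp(class(seq))
fprintf('Sequence length: %d\n', numel(seq));
```

Reading only the sequence can be convenient when the metadata fields are not needed, for example when passing the sequence directly to another function.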

Examples

Download the sequence information for accession number X00558 from the EMBL database and save it to a file.

out = getembl('X00558','ToFile','rat_protein.txt');

Read data from the EMBL file.

seqData = emblread('rat_protein.txt')
seqData = 

  struct with fields:

            Identification: [1×1 struct]
                 Accession: 'X00558'
           SequenceVersion: 'X00558.1'
               DateCreated: '13-JUN-1985  Rel. 06, Created '
               DateUpdated: '18-APR-2005  Rel. 83, Last updated, Version 4 '
               Description: 'Rat liver apolipoprotein A-I mRNA  apoA-I    ...'
                   Keyword: 'apolipoprotein; lipoprotein; signal peptide. ...'
           OrganismSpecies: 'Rattus norvegicus  Norway rat                ...'
    OrganismClassification: [3×75 char]
                 Organelle: ''
                 Reference: {[1×1 struct]}
    DatabaseCrossReference: [4×75 char]
                  Comments: ''
                  Assembly: ''
                   Feature: [22×75 char]
                 BaseCount: [1×1 struct]
                  Sequence: 'agctccgggggaggtcgcccacatccttcgggatgaaagctgcag...'
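
Once the data is in a structure, individual fields can be read with dot indexing. The following is a minimal sketch that assumes rat_protein.txt was created as shown above; it uses fieldnames to list the contents of the nested BaseCount structure rather than assuming its field names.

```matlab
% Assumption: 'rat_protein.txt' was created by the getembl call above.
seqData = emblread('rat_protein.txt');

% Top-level fields hold character data directly.
disp(seqData.Accession)
fprintf('Sequence length: %d\n', numel(seqData.Sequence));

% Multi-line entries such as Feature are stored as character arrays,
% one row per line of the original file.
fprintf('Feature table rows: %d\n', size(seqData.Feature, 1));

% Nested structures can be explored with fieldnames.
disp(fieldnames(seqData.BaseCount))
```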

Introduced before R2006a
