getblast

Retrieve BLAST report from NCBI Web site

Syntax

Data = getblast(RID)

Data = getblast(RID, ...'Descriptions', DescriptionsValue, ...)
Data = getblast(RID, ...'Alignments', AlignmentsValue, ...)
Data = getblast(RID, ...'ToFile', ToFileValue, ...)
Data = getblast(RID, ...'FileFormat', FileFormatValue, ...)
Data = getblast(RID, ...'WaitTime', WaitTimeValue, ...)

Input Arguments

RID                 Request ID for the NCBI BLAST report, such as returned by the blastncbi function.
DescriptionsValue   Integer that specifies the number of descriptions in a report. Choices are any value ≥ 1 and ≤ 500. Default is 100.
AlignmentsValue     Integer that specifies the number of alignments to include in the report. Choices are any value ≥ 1 and ≤ 500. Default is 50.

    Note:   This value must be ≤ the value you specified for the 'Alignments' property when creating RID using the blastncbi function.

ToFileValue         String specifying a file name for saving report data.
FileFormatValue     String specifying the format of the file. Choices are 'text' (default) or 'html'.
WaitTimeValue       Positive value that specifies a time (in minutes) for the MATLAB® software to wait for a report from the NCBI Web site to be available. If the report is still not available after the wait time, getblast returns an error message. Default behavior is to not wait for a report.

    Tip   Use the RTOE returned by the blastncbi function as the WaitTimeValue.

Output Arguments

Data                MATLAB structure or array of structures (if multiple query sequences) containing fields corresponding to BLAST keywords and data from an NCBI BLAST report.

Description

The Basic Local Alignment Search Tool (BLAST) offers a fast and powerful comparative analysis of protein and nucleotide sequences against known sequences in online databases. getblast parses NCBI BLAST reports, including blastn, blastp, psiblast, blastx, tblastn, tblastx, and megablast reports.

Data = getblast(RID) reads RID, the Request ID for the NCBI BLAST report, and returns the report data in Data, a MATLAB structure or array of structures. The Request ID, RID, must be recently generated because NCBI purges reports after 24 hours.

Data = getblast(RID, ...'PropertyName', PropertyValue, ...) calls getblast with optional properties that use property name/property value pairs. You can specify one or more properties in any order. Each PropertyName must be enclosed in single quotation marks and is case insensitive. These property name/property value pairs are as follows:


Data = getblast(RID, ...'Descriptions', DescriptionsValue, ...) specifies the number of descriptions in a report. Choices are any integer ≥ 1 and ≤ 500. Default is 100.

Data = getblast(RID, ...'Alignments', AlignmentsValue, ...) specifies the number of alignments to include in the report. Choices are any integer ≥ 1 and ≤ 500. Default is 50.

    Note:   This value must be ≤ the value you specified for the 'Alignments' property when creating RID using the blastncbi function.
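
As a minimal sketch of limiting the report size, the two properties can be combined in one call. It assumes RID already holds a Request ID recently returned by blastncbi and that the request was created with an 'Alignments' value of at least 10. The variable name report is illustrative.

```matlab
% Assumption: RID was returned by blastncbi within the last 24 hours,
% and the request allowed at least 10 alignments.
% Ask for at most 10 descriptions and 10 alignments in the parsed report.
report = getblast(RID, 'Descriptions', 10, 'Alignments', 10);
```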

Data = getblast(RID, ...'ToFile', ToFileValue, ...) saves the NCBI BLAST report data to a specified file. The default format for the file is 'text', but you can specify 'html' with the 'FileFormat' property.

Data = getblast(RID, ...'FileFormat', FileFormatValue, ...) specifies the format for the report. Choices are 'text' (default) or 'html'.
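
A short sketch of saving the report as HTML instead of text, again assuming RID is a valid, recent Request ID. The file name AAA59174_BLAST.html is a hypothetical choice for illustration.

```matlab
% Assumption: RID is a recent Request ID returned by blastncbi.
% Parse the report into a structure and also save it to disk as HTML.
report = getblast(RID, 'ToFile', 'AAA59174_BLAST.html', 'FileFormat', 'html');
```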

Data = getblast(RID, ...'WaitTime', WaitTimeValue, ...) pauses the MATLAB software and waits a specified time (in minutes) for a report from the NCBI Web site to be available. If the report is still unavailable after the wait time, getblast returns an error message. Choices are any positive value. Default behavior is to not wait for a report.

    Tip   Use the RTOE returned by the blastncbi function as the WaitTimeValue.
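
The tip above can be sketched as follows. The two-output call to blastncbi and the try/catch handling are assumptions made for illustration: the sketch presumes that blastncbi returns RTOE as its second output and that getblast raises a catchable MATLAB error when the report is still unavailable.

```matlab
% Submit the search and capture the estimated time of execution.
% Assumption: RTOE is the second output of blastncbi.
[RID, RTOE] = blastncbi('AAA59174', 'blastp', 'expect', 1e-10);

try
    % Wait up to RTOE minutes for the report before giving up.
    report = getblast(RID, 'WaitTime', RTOE);
catch err
    % Assumption: an unavailable report surfaces as a MATLAB error.
    fprintf('Report not retrieved: %s\n', err.message);
end
```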

For more information about reading and interpreting BLAST reports, see:

http://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastDocs

Data contains the following fields.

Field                      Description
RID                        Request ID for retrieving results for a specific NCBI BLAST search.
Algorithm                  NCBI algorithm used to do a BLAST search.
Query                      Identifier of the query sequence submitted to a BLAST search.
Database                   All databases searched.
Hits.Name                  Name of a database sequence (subject sequence) that matched the query sequence.
Hits.Length                Length of a subject sequence.
Hits.HSPs.Score            Pairwise alignment score for a high-scoring sequence pair between the query sequence and a subject sequence.
Hits.HSPs.Expect           Expectation value for a high-scoring sequence pair between the query sequence and a subject sequence.
Hits.HSPs.Identities       Identities (match, possible, and percent) for a high-scoring sequence pair between the query sequence and a subject sequence.
Hits.HSPs.Positives        Identical or similar residues (match, possible, and percent) for a high-scoring sequence pair between the query sequence and a subject amino acid sequence.

    Note:   This field applies only to translated nucleotide or amino acid query sequences and/or databases.

Hits.HSPs.Gaps             Nonaligned residues (match, possible, and percent) for a high-scoring sequence pair between the query sequence and a subject sequence.
Hits.HSPs.Frame            Reading frame of the translated nucleotide sequence for a high-scoring sequence pair between the query sequence and a subject sequence.

    Note:   This field applies only when performing translated searches, that is, when using tblastx, tblastn, and blastx.

Hits.HSPs.Strand           Sense (Plus = 5' to 3' and Minus = 3' to 5') of the DNA strands for a high-scoring sequence pair between the query sequence and a subject sequence.

    Note:   This field applies only when using a nucleotide query sequence and database.

Hits.HSPs.Alignment        Three-row matrix showing the alignment for a high-scoring sequence pair between the query sequence and a subject sequence.
Hits.HSPs.QueryIndices     Indices of the query sequence residue positions for a high-scoring sequence pair between the query sequence and a subject sequence.
Hits.HSPs.SubjectIndices   Indices of the subject sequence residue positions for a high-scoring sequence pair between the query sequence and a subject sequence.
Statistics                 Summary of statistical details about the performed search, such as lambda values, gap penalties, number of sequences searched, and number of hits.
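
To illustrate how the nested fields might be traversed, the following sketch filters hits by expectation value. It runs on a small mock structure rather than a real report: the hit names and numbers are invented, and the assumption that Score and Expect are numeric scalars is not confirmed by the table above.

```matlab
% Mock report using the documented field names; all values are invented.
hsp1 = struct('Score', 729, 'Expect', 0);
hsp2 = struct('Score', 40, 'Expect', 1e-3);
hits = struct('Name', {'subject A (hypothetical)', 'subject B (hypothetical)'}, ...
              'Length', {393, 120}, ...
              'HSPs', {hsp1, hsp2});
report = struct('Hits', hits);

% Keep only hits whose best (smallest) HSP expectation value passes.
threshold = 1e-10;
keep = false(1, numel(report.Hits));
for k = 1:numel(report.Hits)
    keep(k) = min([report.Hits(k).HSPs.Expect]) <= threshold;
end
strongHits = report.Hits(keep);

% List the names of the hits that survived the filter.
for k = 1:numel(strongHits)
    fprintf('%s (length %d)\n', strongHits(k).Name, strongHits(k).Length);
end
```

With a real report, the same loop would apply to the structure returned by getblast in place of the mock.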

Examples

  1. Create an NCBI BLAST report request using a GenPept accession number.

    RID = blastncbi('AAA59174','blastp','expect',1e-10)
    
    RID = 
    
        '1175088155-31624-126008617054.BLASTQ3'

  2. Pass the Request ID for the report to the getblast function to parse the report, return the report data in a MATLAB structure, and save the report data to a text file.

    reportStruct = getblast(RID,'ToFile','AAA59174_BLAST.rpt')
    
    reportStruct = 
    
               RID: '1175093633-2786-174709873694.BLASTQ3'
         Algorithm: 'BLASTP 2.2.16 [Mar-11-2007]'
             Query: [1x63 char]
          Database: [1x96 char]
              Hits: [1x50 struct]
        Statistics: [1x1034 char]

      Note:   You may need to wait for the report to become available on the NCBI Web site before you can run the preceding command.

References

[1] Altschul, S.F., Gish, W., Miller, W., Myers, E.W. and Lipman, D.J. (1990). Basic local alignment search tool. J. Mol. Biol. 215, 403–410.

[2] Altschul, S.F., Madden, T.L., Schäffer, A.A., Zhang, J., Zhang, Z., Miller, W. and Lipman, D.J. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402.

For more information about reading and interpreting NCBI BLAST reports, see:

http://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastDocs