Bioinformatics Toolbox™
Data = blastreadlocal(BLASTReport, Format)
| BLASTReport | BLAST report specified by any of the following:
If you specify only a file name, that file must be on the MATLAB search path or in the current directory. |
| Format | Integer specifying the alignment format used to create BLASTReport.
Choices are:
|
| Data | MATLAB structure or array of structures (if multiple query sequences) containing fields corresponding to BLAST keywords and data from a local BLAST report. |
The Basic Local Alignment Search Tool (BLAST) offers a fast and powerful comparative analysis of protein and nucleotide sequences against known sequences in online and local databases. BLAST reports can be lengthy, and parsing the data from the various formats can be cumbersome.
Data = blastreadlocal(BLASTReport, Format) reads BLASTReport, a locally created BLAST report file, and returns Data, a MATLAB structure or array of structures (if multiple query sequences) containing fields corresponding to BLAST keywords and data from a local BLAST report. Format is an integer specifying the alignment format used to create BLASTReport.
Note: The function assumes the BLAST report was produced using version 2.2.17 of the blastall executable.
Data contains a subset of the following fields, based on the specified alignment format.
| Field | Description |
|---|---|
| Algorithm | NCBI algorithm used to do a BLAST search. |
| Query | Identifier of the query sequence submitted to a BLAST search. |
| Length | Length of the query sequence. |
| Database | All databases searched. |
| Hits.Name | Name of a database sequence (subject sequence) that matched the query sequence. |
| Hits.Score | Alignment score between the query sequence and the subject sequence. |
| Hits.Expect | Expectation value for the alignment between the query sequence and the subject sequence. |
| Hits.Length | Length of a subject sequence. |
| Hits.HSPs.Score | Pairwise alignment score for a high-scoring sequence pair between the query sequence and a subject sequence. |
| Hits.HSPs.Expect | Expectation value for a high-scoring sequence pair between the query sequence and a subject sequence. |
| Hits.HSPs.Identities | Identities (match, possible, and percent) for a high-scoring sequence pair between the query sequence and a subject sequence. |
| Hits.HSPs.Positives | Identical or similar residues (match, possible, and percent) for a high-scoring sequence pair between the query sequence and a subject amino acid sequence. |
| Hits.HSPs.Gaps | Nonaligned residues (match, possible, and percent) for a high-scoring sequence pair between the query sequence and a subject sequence. |
| Hits.HSPs.Mismatches | Residues that are not similar to each other (match, possible, and percent) for a high-scoring sequence pair between the query sequence and a subject sequence. |
| Hits.HSPs.Frame | Reading frame of the translated nucleotide sequence for a high-scoring sequence pair between the query sequence and a subject sequence. |
| Hits.HSPs.Strand | Sense (Plus = 5' to 3' and Minus = 3' to 5') of the DNA strands for a high-scoring sequence pair between the query sequence and a subject sequence. |
| Hits.HSPs.Alignment | Three-row matrix showing the alignment for a high-scoring sequence pair between the query sequence and a subject sequence. |
| Hits.HSPs.QueryIndices | Indices of the query sequence residue positions for a high-scoring sequence pair between the query sequence and a subject sequence. |
| Hits.HSPs.SubjectIndices | Indices of the subject sequence residue positions for a high-scoring sequence pair between the query sequence and a subject sequence. |
| Hits.HSPs.AlignmentLength | Length of the pairwise alignment for a high-scoring sequence pair between the query sequence and a subject sequence. |
| Alignment | Entire alignment for the query sequence and the subject sequence(s). |
| Statistics | Summary of statistical details about the performed search, such as lambda values, gap penalties, number of sequences searched, and number of hits. |
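The nested layout of these fields (queries, then Hits, then Hits.HSPs) can be walked with ordinary loops. The following is a minimal sketch, not output from a real report: the structure is a hand-built stand-in with made-up names and expectation values, the 1e-5 cutoff is arbitrary, and it assumes the chosen alignment format actually populates Hits.Name and Hits.HSPs.Expect.

```matlab
% Hypothetical stand-in for the output of blastreadlocal. All names and
% values below are made up for illustration only.
hsps1 = struct('Expect', {1e-50, 0.5});   % two HSPs for the first hit
hsps2 = struct('Expect', {0.02});         % one HSP for the second hit
hits  = struct('Name', {'subject1', 'subject2'}, 'HSPs', {hsps1, hsps2});
Data  = struct('Query', {'queryA', 'queryB'}, 'Hits', {hits, []});

threshold = 1e-5;   % arbitrary cutoff chosen for this sketch
kept = {};          % labels of hits that pass the cutoff

for q = 1:numel(Data)
    queryHits = Data(q).Hits;              % empty when the query has no hits
    for h = 1:numel(queryHits)
        % Smallest expectation value among the HSPs of this hit
        bestE = min([queryHits(h).HSPs.Expect]);
        if bestE <= threshold
            kept{end+1} = sprintf('%s -> %s (E = %g)', ...
                Data(q).Query, queryHits(h).Name, bestE); %#ok<SAGROW>
        end
    end
end

disp(kept');
```

With real data, replace the hand-built Data with the value returned by blastreadlocal and confirm with isfield that the fields used here are present for your format.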
Download the ecoli.nt.gz compressed file from
ftp://ftp.ncbi.nih.gov/blast/db/FASTA/
and then extract the ecoli.nt FASTA file to your MATLAB current directory.
Create a local blastable database from the ecoli.nt FASTA file.
blastformat('inputdb', 'ecoli.nt', 'protein', 'false');
Use the getgenbank function to retrieve two sequences from the GenBank® database.
S1 = getgenbank('M28570.1');
S2 = getgenbank('M12565');
Use the fastawrite function to create a FASTA file named query_multi_nt.fa from these two sequences, using only the accession number as the header.
Seqs(1).Header = S1.Accession;
Seqs(1).Sequence = S1.Sequence;
Seqs(2).Header = S2.Accession;
Seqs(2).Sequence = S2.Sequence;
fastawrite('query_multi_nt.fa', Seqs);
Submit the query sequences in the query_multi_nt.fa FASTA file for a BLAST search of the local nucleotide database ecoli.nt. Specify the BLAST program blastn and a tabular alignment format. Save the contents of the BLAST report to a file named myecoli_nt8.txt, and then read the local BLAST report, displaying the results in the MATLAB Command Window.
blastlocal('inputquery', 'query_multi_nt.fa',...
'database', 'ecoli.nt',...
'tofile', 'myecoli_nt8.txt', 'program', 'blastn',...
'format', 8);
blastreadlocal('myecoli_nt8.txt', 8)
Submit the query sequences in the query_multi_nt.fa FASTA file for a BLAST search of the local nucleotide database ecoli.nt. Specify the BLAST program blastn and a query-anchored format. Save the contents of the BLAST report to a file named myecoli_nt1.txt, and then read the local BLAST report, saving the results in results, an array of structures.
blastlocal('inputquery', 'query_multi_nt.fa',...
'database', 'ecoli.nt',...
'tofile', 'myecoli_nt1.txt', 'program', 'blastn',...
'format', 1);
results = blastreadlocal('myecoli_nt1.txt', 1);
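Because the fields returned depend on the alignment format, it can help to check which top-level fields are present in results before indexing into them. The sketch below assumes results comes from the call above. When that variable does not exist, it falls back to a hypothetical stand-in with made-up values so that the snippet runs on its own.

```matlab
% Fall back to a hypothetical stand-in (made-up values) when results
% has not been created by blastreadlocal in this session.
if ~exist('results', 'var')
    results = struct('Query', {'queryA', 'queryB'}, 'Length', {100, 200});
end

% Top-level field names taken from the field table on this page
wanted  = {'Algorithm', 'Query', 'Length', 'Database', ...
           'Hits', 'Alignment', 'Statistics'};
present = wanted(isfield(results, wanted));   % keep only fields that exist

fprintf('%d query record(s); fields present: %s\n', ...
    numel(results), strjoin(present, ', '));
```

Here isfield accepts a cell array of names and returns one logical value per name, which keeps the check to a single call.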
[1] Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. (1990). Basic local alignment search tool. J. Mol. Biol. 215, 403–410.
[2] Altschul, S.F., Madden, T.L., Schäffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402.
For more information about reading and interpreting BLAST reports, see:
http://www.ncbi.nlm.nih.gov/Education/BLASTinfo/information3.html
Bioinformatics Toolbox™ functions: blastformat, blastlocal, blastncbi, blastread, getblast
© 1984-2008 The MathWorks, Inc.