Bioinformatics Toolbox™
Perform search on local BLAST database to create BLAST report
blastlocal('InputQuery', InputQueryValue)
Data = blastlocal('InputQuery', InputQueryValue)
... blastlocal(..., 'Program', ProgramValue, ...)
... blastlocal(..., 'Database', DatabaseValue, ...)
... blastlocal(..., 'BlastPath', BlastPathValue, ...)
... blastlocal(..., 'Expect', ExpectValue, ...)
... blastlocal(..., 'Format', FormatValue, ...)
... blastlocal(..., 'ToFile', ToFileValue, ...)
... blastlocal(..., 'Filter', FilterValue, ...)
... blastlocal(..., 'GapOpen', GapOpenValue, ...)
... blastlocal(..., 'GapExtend', GapExtendValue, ...)
... blastlocal(..., 'BLASTArgs', BLASTArgsValue, ...)
| InputQueryValue | String specifying the file name or path and file name of a FASTA file containing query nucleotide or amino acid sequence(s). (This corresponds to the blastall option -i.) |
| ProgramValue | String specifying a BLAST program. Choices are 'blastp' (default), 'blastn', 'blastx', 'tblastn', and 'tblastx'. (This corresponds to the blastall option -p.) |
| DatabaseValue | String specifying a file name or path and file name of a local BLAST database (formatted using the NCBI formatdb function) to search. Default is a local version of the nr database in the MATLAB® current directory. (This corresponds to the blastall option -d.) |
| BlastPathValue | String specifying the full path to the blastall executable file, including the name and extension of the executable file. Default is the system path. |
| ExpectValue | Value specifying the statistical significance threshold for matches against database sequences. Choices are any real number. Default is 10. (This corresponds to the blastall option -e.) |
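| FormatValue | Integer specifying the alignment format of the BLAST search results. Choices are 0 (default, pairwise), 1 (query-anchored, showing identities), 2 (query-anchored, no identities), 3 (flat query-anchored, showing identities), 4 (flat query-anchored, no identities), 5 (query-anchored, no identities and blunt ends), 6 (flat query-anchored, no identities and blunt ends), 7 (not used), 8 (tabular), and 9 (tabular with comment lines). (This corresponds to the blastall option -m.) |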
| ToFileValue | String specifying a file name or path and file name in which to save the contents of the BLAST report. (This corresponds to the blastall option -o.) |
| FilterValue | Controls the application of a filter (DUST filter for the blastn program or SEG filter for other programs) to the query sequence(s). Choices are true (default) or false. (This corresponds to the blastall option -F.) |
| GapOpenValue | Integer that specifies the penalty for opening a gap in the alignment of sequences. Default is -1. (This corresponds to the blastall option -G.) |
| GapExtendValue | Integer that specifies the penalty for extending a gap in the alignment of sequences. Default is -1. (This corresponds to the blastall option -E.) |
| BLASTArgsValue | NCBI blastall command string, that is, a string containing one or more instances of -x and the option associated with it, used to specify input arguments. For an example, see step 7 in Examples. |
| Data | MATLAB structure or array of structures (if multiple query sequences) containing fields corresponding to BLAST keywords and data from a local BLAST report. |
This function assumes that
The Basic Local Alignment Search Tool (BLAST) offers a fast and powerful comparative analysis of protein and nucleotide sequences against known sequences in online or local databases.
Note: To use the blastlocal function, you must have a local copy of the NCBI blastall executable file (version 2.2.17) available from your system. You can download the blastall executable file by accessing http://www.ncbi.nlm.nih.gov/blast/download.shtml and then clicking the download link under the blast column for your platform. Run the downloaded executable and configure it for your system. For more information, see the readme file on the NCBI ftp site at: ftp://ftp.ncbi.nih.gov/blast/documents/blast.html
For convenience, consider placing the NCBI blastall executable file on your system path.
blastlocal('InputQuery', InputQueryValue) submits query sequence(s) specified by InputQueryValue, a FASTA file containing nucleotide or amino acid sequence(s), for a BLAST search of a local BLAST database, by calling a local version of the NCBI blastall executable file. The BLAST search results are displayed in the MATLAB Command Window. (This corresponds to the blastall option -i.)
Data = blastlocal('InputQuery', InputQueryValue) returns the BLAST search results in Data, a MATLAB structure or array of structures (if multiple query sequences) containing fields corresponding to BLAST keywords and data from a local BLAST report.
Data contains a subset of the following fields, based on the specified alignment format.
| Field | Description |
|---|---|
| Algorithm | NCBI algorithm used to do a BLAST search. |
| Query | Identifier of the query sequence submitted to a BLAST search. |
| Length | Length of the query sequence. |
| Database | All databases searched. |
| Hits.Name | Name of a database sequence (subject sequence) that matched the query sequence. |
| Hits.Score | Alignment score between the query sequence and the subject sequence. |
| Hits.Expect | Expectation value for the alignment between the query sequence and the subject sequence. |
| Hits.Length | Length of a subject sequence. |
| Hits.HSPs.Score | Pairwise alignment score for a high-scoring sequence pair between the query sequence and a subject sequence. |
| Hits.HSPs.Expect | Expectation value for a high-scoring sequence pair between the query sequence and a subject sequence. |
| Hits.HSPs.Identities | Identities (match, possible, and percent) for a high-scoring sequence pair between the query sequence and a subject sequence. |
| Hits.HSPs.Positives | Identical or similar residues (match, possible, and percent) for a high-scoring sequence pair between the query sequence and a subject amino acid sequence. |
| Hits.HSPs.Gaps | Nonaligned residues (match, possible, and percent) for a high-scoring sequence pair between the query sequence and a subject sequence. |
| Hits.HSPs.Mismatches | Residues that are not similar to each other (match, possible, and percent) for a high-scoring sequence pair between the query sequence and a subject sequence. |
| Hits.HSPs.Frame | Reading frame of the translated nucleotide sequence for a high-scoring sequence pair between the query sequence and a subject sequence. |
| Hits.HSPs.Strand | Sense (Plus = 5' to 3' and Minus = 3' to 5') of the DNA strands for a high-scoring sequence pair between the query sequence and a subject sequence. |
| Hits.HSPs.Alignment | Three-row matrix showing the alignment for a high-scoring sequence pair between the query sequence and a subject sequence. |
| Hits.HSPs.QueryIndices | Indices of the query sequence residue positions for a high-scoring sequence pair between the query sequence and a subject sequence. |
| Hits.HSPs.SubjectIndices | Indices of the subject sequence residue positions for a high-scoring sequence pair between the query sequence and a subject sequence. |
| Hits.HSPs.AlignmentLength | Length of the pairwise alignment for a high-scoring sequence pair between the query sequence and a subject sequence. |
| Alignment | Entire alignment for the query sequence and the subject sequence(s). |
| Statistics | Summary of statistical details about the performed search, such as lambda values, gap penalties, number of sequences searched, and number of hits. |
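As a rough illustration of how the nested fields in the table can be navigated, the following sketch builds a small mock structure with the same field names and keeps only the hits at or below an expectation threshold. The hit names and numbers are invented, and the sketch assumes that Expect and Score are stored as numeric scalars; a structure returned by an actual search may contain additional fields, depending on the alignment format.

```matlab
% Mock result shaped like the Data structure described in the table.
% All names and numbers here are invented for illustration only.
hsp1 = struct('Score', 250, 'Expect', 1e-60);
hsp2 = struct('Score', 40,  'Expect', 0.5);
hits = [struct('Name', 'hypothetical_hit_A', 'Score', 250, 'Expect', 1e-60, 'HSPs', hsp1), ...
        struct('Name', 'hypothetical_hit_B', 'Score', 40,  'Expect', 0.5,   'HSPs', hsp2)];
Data = struct('Query', 'M28570', 'Hits', hits);

% Keep only hits whose expectation value is at or below a chosen threshold.
threshold = 1e-5;                       % assumed cutoff, not a toolbox default
expects = [Data.Hits.Expect];           % one Expect value per hit
keep = expects <= threshold;
significantNames = {Data.Hits(keep).Name};

% Best HSP score for the first hit (a hit can have several HSPs).
bestHspScore = max([Data.Hits(1).HSPs.Score]);
```

With a structure returned by blastlocal, the same indexing pattern applies to results.Hits in place of the mock Data.Hits.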
... blastlocal(..., 'PropertyName', PropertyValue, ...) calls blastlocal with optional properties that use property name/property value pairs. You can specify one or more properties in any order. Each PropertyName must be enclosed in single quotation marks and is case insensitive. These property name/property value pairs are as follows.
... blastlocal(..., 'Program', ProgramValue, ...) specifies the BLAST program. Choices are 'blastp' (default), 'blastn', 'blastx', 'tblastn', and 'tblastx'. (This corresponds to the blastall option -p.) For help in selecting an appropriate BLAST program, visit:
http://www.ncbi.nlm.nih.gov/BLAST/producttable.shtml
... blastlocal(..., 'Database', DatabaseValue, ...) specifies the local BLAST database (formatted using the NCBI formatdb function) to search. Default is a local version of the nr database in the MATLAB current directory. (This corresponds to the blastall option -d.)
... blastlocal(..., 'BlastPath', BlastPathValue, ...) specifies the full path to the blastall executable file, including the name and extension of the executable file. Default is the system path.
... blastlocal(..., 'Expect', ExpectValue, ...) specifies a statistical significance threshold for matches against database sequences. Choices are any real number. Default is 10. (This corresponds to the blastall option -e.) You can learn more about the statistics of local sequence comparison at:
http://www.ncbi.nlm.nih.gov/BLAST/tutorial/Altschul-1.html#head2
... blastlocal(..., 'Format', FormatValue, ...) specifies the alignment format of the BLAST search results. Choices are:
0 (default) — Pairwise
1 — Query-anchored, showing identities
2 — Query-anchored, no identities
3 — Flat query-anchored, showing identities
4 — Flat query-anchored, no identities
5 — Query-anchored, no identities and blunt ends
6 — Flat query-anchored, no identities and blunt ends
7 — Not used
8 — Tabular
9 — Tabular with comment lines
(This corresponds to the blastall option -m.)
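Format 8 is convenient for further processing because each alignment occupies one tab-delimited line. The sketch below parses one such line with textscan. It assumes the usual 12-column blastall -m 8 layout (query ID, subject ID, percent identity, alignment length, mismatches, gap openings, query start, query end, subject start, subject end, E-value, bit score), which this page does not spell out, and the line itself is invented.

```matlab
% One mock report line in the assumed 12-column tabular layout.
% The identifiers and numbers are invented for illustration only.
reportText = sprintf('query1\tsubjA\t98.50\t200\t3\t0\t1\t200\t51\t250\t1e-100\t380\n');

% Two string columns followed by ten numeric columns, tab-delimited.
cols = textscan(reportText, '%s %s %f %f %f %f %f %f %f %f %f %f', ...
                'Delimiter', '\t');

subjectId       = cols{2}{1};   % name of the matched database sequence
percentIdentity = cols{3};
eValue          = cols{11};
bitScore        = cols{12};

% To parse a saved report instead, open the file and pass the file
% identifier to textscan with the same format string:
%   fid = fopen('myecoli_nt.txt');
%   cols = textscan(fid, '%s %s %f %f %f %f %f %f %f %f %f %f', 'Delimiter', '\t');
%   fclose(fid);
```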
... blastlocal(..., 'ToFile', ToFileValue, ...) saves the contents of the BLAST report to the specified file. (This corresponds to the blastall option -o.)
... blastlocal(..., 'Filter', FilterValue, ...) specifies whether a filter (DUST filter for the blastn program or SEG filter for other programs) is applied to the query sequence(s). Choices are true (default) or false. (This corresponds to the blastall option -F.)
... blastlocal(..., 'GapOpen', GapOpenValue, ...) specifies the penalty for opening a gap in the alignment of sequences. Default is -1. (This corresponds to the blastall option -G.)
... blastlocal(..., 'GapExtend', GapExtendValue, ...) specifies the penalty for extending a gap in the alignment of sequences. Default is -1. (This corresponds to the blastall option -E.)
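To see how the two gap penalties combine, the sketch below assumes the affine gap model generally described for gapped BLAST, in which a gap of length k costs the opening penalty plus k times the extension penalty. The values 11 and 1 are illustrative choices, not defaults stated on this page.

```matlab
% Affine gap cost under the assumed model: open + extend * length.
gapCost = @(gapOpen, gapExtend, gapLength) gapOpen + gapExtend .* gapLength;

openPenalty   = 11;   % illustrative value for 'GapOpen'
extendPenalty = 1;    % illustrative value for 'GapExtend'

lengths = 1:3;
costs = gapCost(openPenalty, extendPenalty, lengths);   % cost of gaps of length 1, 2, 3

% These penalties would be passed to a search as, for example:
%   blastlocal('InputQuery', 'query_nt.fa', 'GapOpen', openPenalty, 'GapExtend', extendPenalty)
```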
... blastlocal(..., 'BLASTArgs', BLASTArgsValue, ...) specifies options using the input arguments for the NCBI blastall function. BLASTArgsValue is a string containing one or more instances of -x and the option associated with it. For example, to specify the BLOSUM 45 matrix, you would use the following syntax:
blastlocal('InputQuery', 'ecoliquery.txt', 'BLASTArgs', '-M BLOSUM45')
Tip: Use the 'BLASTArgs' property to specify blastall options for which there are no corresponding property name/property value pairs.
Note: For a complete list of valid input arguments for the NCBI blastall function, make sure that the blastall executable file is located on your system path or current directory, then type the following at your system's command prompt:
blastall -
You can also use the syntax and input arguments accepted by the NCBI blastall function, instead of the property name/property value pairs listed previously. To do so, supply a single string containing multiple options using the -x option syntax. For example, you can specify the ecoliquery.txt FASTA file as your query sequences, the blastp program, and the ecoli local database, by using
blastlocal('-i ecoliquery.txt -p blastp -d ecoli')
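The correspondence between property names and blastall options can be made concrete with a small helper that turns name/value pairs into a single option string. This is only a sketch: the flag table is taken from the option letters quoted on this page, the variable names are hypothetical, and 'Filter' is left out because this page does not say how its true/false value is written on the command line.

```matlab
% Property names (lowercase) and the blastall flags this page pairs them with.
flagFor = containers.Map( ...
    {'inputquery', 'program', 'database', 'expect', 'format', 'tofile', 'gapopen', 'gapextend'}, ...
    {'-i',         '-p',      '-d',       '-e',     '-m',     '-o',     '-G',      '-E'});

% Hypothetical search settings written as name/value pairs.
opts = {'InputQuery', 'ecoliquery.txt', 'Program', 'blastp', 'Database', 'ecoli'};

% Convert each pair into '<flag> <value>' and join the pieces with spaces.
parts = cell(1, numel(opts) / 2);
for k = 1:2:numel(opts)
    value = opts{k + 1};
    if isnumeric(value)
        value = num2str(value);         % for example, Expect or Format values
    end
    parts{(k + 1) / 2} = [flagFor(lower(opts{k})) ' ' value];
end
cmd = strjoin(parts, ' ');

% cmd can then be passed as the single string argument:
%   blastlocal(cmd)
```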
1. Download the ecoli.nt.gz and ecoli.aa.gz compressed files from
ftp://ftp.ncbi.nih.gov/blast/db/FASTA/
and then extract the ecoli.nt and ecoli.aa FASTA files to your MATLAB current directory.
2. Use the blastformat function to create local blastable databases from the ecoli.nt and ecoli.aa FASTA files.
blastformat('inputdb', 'ecoli.nt', 'protein', 'false');
blastformat('inputdb', 'ecoli.aa');
3. Use the getgenbank function to retrieve sequence information for the E. coli threonine operon from the GenBank® database.
S = getgenbank('M28570.1');
4. Use the fastawrite function to create a FASTA file named query_nt.fa from this sequence information, using only the accession number as the header.
S.Header = S.Accession;
fastawrite('query_nt.fa', S);
5. Use MATLAB syntax to submit the query sequence in the query_nt.fa FASTA file for a BLAST search of the local amino acid database ecoli.aa. Specify the BLAST program blastx. Return the BLAST search results in results, a MATLAB structure.
results = blastlocal('inputquery', 'query_nt.fa',...
                     'database', 'ecoli.aa',...
                     'program', 'blastx')
6. Use blastall syntax to submit the query sequence in the query_nt.fa FASTA file for a BLAST search of the local nucleotide database ecoli.nt. Specify the BLAST program blastn and an expectation value of 0.0001. Display the BLAST search results in the MATLAB Command Window.
blastlocal('-i query_nt.fa -d ecoli.nt -p blastn -e 0.0001')
7. Submit the query sequence in the query_nt.fa FASTA file for a BLAST search of the local nucleotide database ecoli.nt. Specify the BLAST program blastn and a tabular alignment format. Save the contents of the BLAST report to a file named myecoli_nt.txt.
blastlocal('inputquery', 'query_nt.fa',...
           'database', 'ecoli.nt', 'tofile', 'myecoli_nt.txt',...
           'blastargs', '-p blastn -m 8')
[1] Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. (1990). Basic local alignment search tool. J. Mol. Biol. 215, 403–410.
[2] Altschul, S.F., Madden, T.L., Schäffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402.
For more information on the NCBI blastall function, see:
http://www.ncbi.nlm.nih.gov/blast/docs/blastall.html
Bioinformatics Toolbox™ functions: blastformat, blastncbi, blastread, blastreadlocal, getblast
© 1984-2008 The MathWorks, Inc.