blastformat

Create local BLAST database

Syntax

blastformat('Inputdb', InputdbValue)
blastformat(..., 'FormatPath', FormatPathValue, ...)
blastformat(..., 'Title', TitleValue, ...)
blastformat(..., 'Log', LogValue, ...)
blastformat(..., 'Protein', ProteinValue, ...)
blastformat(..., 'FormatArgs', FormatArgsValue, ...)

Arguments

InputdbValue: Character vector specifying a file name or path and file name of a FASTA file containing a set of sequences to be formatted as a blastable database. If you specify only a file name, that file must be on the MATLAB® search path or in the current folder. (This corresponds to the formatdb option -i.)
FormatPathValue: Character vector specifying the full path to the formatdb executable file, including the name and extension of the executable file. Default is the system path.
TitleValue: Character vector specifying the title for the local database. Default is the input FASTA file name. (This corresponds to the formatdb option -t.)
LogValue: Character vector specifying the file name or path and file name for the log file associated with the local database. Default is formatdb.log. (This corresponds to the formatdb option -l.)
ProteinValue: Specifies whether the sequences formatted as a local BLAST database are protein or not. Choices are true (default) or false. (This corresponds to the formatdb option -p.)
FormatArgsValue: Character vector of NCBI formatdb command-line options, that is, one or more instances of -x and the value associated with it, used to specify input arguments.

Description

Note

To use the blastformat function, you must have a local copy of the NCBI formatdb executable file on your system. You can download the formatdb executable file from the NCBI BLAST executables page. Run the downloaded executable and configure it for your system. For convenience, consider placing the NCBI formatdb executable file on your system path.

blastformat('Inputdb', InputdbValue) calls a local version of the NCBI formatdb executable file with InputdbValue, a file name or path and file name of a FASTA file containing a set of sequences. If you specify only a file name, that file must be on the MATLAB search path or in the current folder. (This corresponds to the formatdb option -i.)

It then formats the sequences as a local, blastable database, by creating multiple files, each with the same name as the InputdbValue FASTA file, but with different extensions. The database files are placed in the same location as the FASTA file.
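One way to check which files were written is to list everything in the folder that starts with the FASTA file's base name. The following is a minimal sketch, not an official workflow. It assumes that formatdb is on the system path and that a nucleotide FASTA file named ecoli.nt (a placeholder name) is in the current folder. The exact set of extensions depends on your formatdb version and on whether the sequences are protein or nucleotide.

```matlab
% Assumption: ecoli.nt is a nucleotide FASTA file in the current folder
% and the formatdb executable is on the system path.
fastaFile = 'ecoli.nt';
blastformat('Inputdb', fastaFile, 'Protein', false);

% List every file in the current folder whose name starts with 'ecoli.'.
% This includes the FASTA file itself as well as the database files
% that formatdb wrote next to it.
listing = dir('ecoli.*');
for k = 1:numel(listing)
    fprintf('%s\n', listing(k).name);
end
```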

Note

If you rename the database files, make sure they all have the same name.

blastformat(..., 'PropertyName', PropertyValue, ...) calls blastformat with optional properties that use property name/property value pairs. You can specify one or more properties in any order. Each PropertyName must be enclosed in single quotation marks and is case insensitive. These property name/property value pairs are as follows.

blastformat(..., 'FormatPath', FormatPathValue, ...) specifies the full path to the formatdb executable file, including the name and extension of the executable file. Default is the system path.
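When formatdb is not on the system path, a call might look like the following sketch. The install location and the FASTA file name are hypothetical placeholders, so replace them with the values for your system. The check with exist is an optional safeguard, not something blastformat requires.

```matlab
% Hypothetical location of the formatdb executable; adjust for your system.
formatdbExe = '/usr/local/blast/bin/formatdb';

% Hypothetical nucleotide FASTA file in the current folder.
fastaFile = 'ecoli.nt';

% exist returns 2 when the given path points to an existing file.
if exist(formatdbExe, 'file') == 2
    blastformat('Inputdb', fastaFile, 'Protein', false, ...
                'FormatPath', formatdbExe);
else
    warning('formatdb executable not found at %s; skipping.', formatdbExe);
end
```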

blastformat(..., 'Title', TitleValue, ...) specifies the title for the local database. Default is the input FASTA file name. (This corresponds to the formatdb option -t.)

Note

The 'Title' property does not change the file name of the database files. This title is used internally only, and appears in the report structure returned by the blastlocal function.

blastformat(..., 'Log', LogValue, ...) specifies the file name or path and file name for the log file associated with the local database. Default is formatdb.log. The log file captures the progress of the database creation and formatting. (This corresponds to the formatdb option -l.)
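To inspect what formatdb reported, you can read the log file back into MATLAB after the call. This sketch assumes a nucleotide FASTA file named ecoli.nt (a placeholder) in the current folder, formatdb on the system path, and that a log file specified by name only is written to the current folder.

```matlab
% Assumption: ecoli.nt is a placeholder FASTA file name and
% ecoli_nt.log is a log file name chosen for this illustration.
logFile = 'ecoli_nt.log';
blastformat('Inputdb', 'ecoli.nt', 'Protein', false, 'Log', logFile);

% Read the whole log into a character vector and display it.
logText = fileread(logFile);
disp(logText);
```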

blastformat(..., 'Protein', ProteinValue, ...) specifies whether the sequences formatted as a local BLAST database are protein or not. Choices are true (default) or false. (This corresponds to the formatdb option -p.)

blastformat(..., 'FormatArgs', FormatArgsValue, ...) passes additional options directly to the NCBI formatdb executable. FormatArgsValue is a character vector containing one or more instances of -x and the value associated with it. For example, to specify that the input is a database in ASN.1 format, instead of a FASTA file, you would use the following syntax:

blastformat('Inputdb', 'ecoli.asn', 'FormatArgs', '-a T')

Tip

Use the 'FormatArgs' property to specify formatdb options for which there are no corresponding property name/property value pairs.
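As an illustration, the sketch below combines the documented 'Title' and 'Protein' properties with a formatdb option that has no property of its own. The -o T option is taken from formatdb's usage listing, not from this page, where it is described as parsing sequence identifiers and creating indexes. Confirm it against the option list printed by your formatdb version before relying on it. The file name and title are placeholders.

```matlab
% Assumptions: ecoli.nt is a placeholder nucleotide FASTA file in the
% current folder, formatdb is on the system path, and your formatdb
% version supports the -o option (check its usage listing).
blastformat('Inputdb', 'ecoli.nt', ...
            'Protein', false, ...
            'Title', 'myecoli', ...
            'FormatArgs', '-o T');
```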

Note

For a complete list of valid input arguments for the NCBI formatdb executable, make sure that the formatdb executable file is on your system path or in the current folder, then type the following at your system's command prompt.

formatdb -
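If you prefer to stay inside MATLAB, the same command can be issued through the system function. This is a small sketch that assumes formatdb is on the system path. The returned status code may be nonzero even when the usage text is printed, so the sketch displays the captured output without checking it.

```matlab
% Run the command through the operating system shell and capture
% what it prints. If formatdb is not found, usageText holds the
% shell's error message instead.
[status, usageText] = system('formatdb -');
disp(usageText);
```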

Using formatdb Syntax

You can also use the syntax and input arguments accepted by the NCBI formatdb executable instead of the property name/property value pairs listed previously. To do so, supply a single character vector containing one or more options in the -x option syntax. For example, you can specify the ecoli.nt FASTA file, a title of myecoli, and that the sequences are not protein by using:

blastformat('-i ecoli.nt -t myecoli -p F')
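The option string can also be assembled programmatically, which helps when the file name or title comes from a variable. The following is a minimal sketch that uses only the -i, -t, and -p options documented on this page. The variable names are illustrative, and the sketch does not handle file names or titles that contain spaces. The call to blastformat is skipped when the function is not on the MATLAB path.

```matlab
% Illustrative inputs (same values as the example above).
fastaFile = 'ecoli.nt';
dbTitle   = 'myecoli';
isProtein = false;

% formatdb expects T or F for -p. Indexing 'FT' with isProtein + 1
% selects 'F' for false and 'T' for true.
flags = 'FT';
proteinFlag = flags(isProtein + 1);

% Assemble the formatdb-style option string.
optionString = sprintf('-i %s -t %s -p %s', fastaFile, dbTitle, proteinFlag);
disp(optionString);

% Call blastformat only if it is available. The call still requires
% the formatdb executable on the system path.
if ~isempty(which('blastformat'))
    blastformat(optionString);
end
```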

Examples

Example 1. Using blastformat with Property Name/Value Pairs

The following example assumes you have a FASTA nucleotide file, such as the E. coli file NC_004431.fna. For FASTA files from NCBI, visit ftp://ftp.ncbi.nlm.nih.gov/genomes/genbank/bacteria/.

Create a local blastable database from the NC_004431.fna FASTA file and give it a title using the 'title' property.

blastformat('inputdb', 'NC_004431.fna', 'protein', false,...
            'title', 'myecoli_nt');

Example 2. Using blastformat with formatdb Syntax and Input Arguments

Create a local blastable database from the NC_004431.faa protein FASTA file, and set the database title and the log file name using formatdb syntax and input arguments.

blastformat('inputdb', 'NC_004431.faa',...
            'formatargs', '-t myecoli_aa -l ecoli_aa.log');

Introduced in R2007b
