goannotread

Read annotations from Gene Ontology annotated file

Syntax

Annotation = goannotread(File)
Annotation = goannotread(File, ...'Fields', FieldsValue, ...)
Annotation = goannotread(File, ...'Aspect', AspectValue, ...)

Input Arguments

File

String specifying the name of a file in Gene Ontology (GO) Annotation File (GAF) format.

FieldsValue

String or cell array of strings specifying one or more fields to read from the Gene Ontology annotated file. Default is to read all fields. Valid fields are listed below.

AspectValue

Character array containing one or more characters, each specifying an aspect to read from the Gene Ontology annotated file. Valid aspects are:

  • P — Biological process

  • F — Molecular function

  • C — Cellular component

Default is 'CFP', which specifies to read all aspects.

Output Arguments

Annotation

MATLAB® array of structures containing annotations from a Gene Ontology annotated file.

Description

    Note:   The goannotread function supports GAF 1.0 and 2.0 file formats.

Annotation = goannotread(File) converts the contents of File, a Gene Ontology annotated file, into Annotation, an array of structures. Files should have the structure specified in:

http://www.geneontology.org/GO.annotation.shtml#file

A list with some annotated files can be found at:

http://www.geneontology.org/GO.current.annotations.shtml

Annotation = goannotread(File, ...'PropertyName', PropertyValue, ...) calls goannotread with optional properties that use property name/property value pairs. You can specify one or more properties in any order. Each PropertyName must be enclosed in single quotation marks and is case insensitive. These property name/property value pairs are as follows:

Annotation = goannotread(File, ...'Fields', FieldsValue, ...) specifies the fields to read from the Gene Ontology annotated file. FieldsValue is a string or cell array of strings specifying one or more fields. Default is to read all fields. Valid fields are:

  • Database

  • DB_Object_ID

  • DB_Object_Symbol

  • Qualifier

  • GOid

  • DBReference

  • Evidence

  • WithFrom

  • Aspect

  • DB_Object_Name

  • Synonym

  • DB_Object_Type

  • Taxon

  • Date

  • Assigned_by

For more information on these fields, see:

http://www.geneontology.org/GO.format.annotation.shtml

Annotation = goannotread(File, ...'Aspect', AspectValue, ...) specifies the aspects to read from the Gene Ontology annotated file. AspectValue is a character array containing one or more characters, each specifying an aspect. Valid aspects are:

  • P — Biological process

  • F — Molecular function

  • C — Cellular component

Default is 'CFP', which specifies to read all aspects.
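
As a minimal sketch of these two properties, the calls below pass a single field name as a string and select two aspects at once. They assume that gene_association.sgd (the file used in the examples on this page) is already uncompressed in the current folder; the variable names GOids and CPAnnot are illustrative only.

```matlab
% Assumes gene_association.sgd is in the current folder.

% Read only the GO identifiers; a single field can be given as a string.
GOids = goannotread('gene_association.sgd', 'Fields', 'GOid');

% Read only cellular component (C) and biological process (P) annotations.
CPAnnot = goannotread('gene_association.sgd', 'Aspect', 'CP');
```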

Examples

Reading All Annotations from a Gene Ontology Annotated File

  1. Open a Web browser to

    http://www.geneontology.org/GO.current.annotations.shtml
    
  2. Download gene_association.sgd.gz, the file containing GO annotations for the gene products of Saccharomyces cerevisiae, to your MATLAB Current Folder.

  3. Uncompress the file using the gunzip function.

    gunzip('gene_association.sgd.gz')

  4. Read the file into the MATLAB software.

    SGDGenes = goannotread('gene_association.sgd');

  5. Convert the structure array of GO annotations to a cell array, and display the gene symbols (the third field, DB_Object_Symbol) of the first five annotations.

    S = struct2cell(SGDGenes);
    genes = S(3,1:5)'
    
    genes = 
    
        '15S_RRNA'
        '15S_RRNA'
        '15S_RRNA'
        '15S_RRNA'
        '21S_RRNA'
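
The same list can also be obtained by field name rather than by row position, which avoids relying on the order of the fields. This is a minimal sketch, assuming gene_association.sgd is in the current folder; genesByName is an illustrative variable name.

```matlab
% Same read as in step 4, repeated so the sketch runs on its own.
SGDGenes = goannotread('gene_association.sgd');

% Collect DB_Object_Symbol from the first five annotations
% into a column cell array and display it.
genesByName = {SGDGenes(1:5).DB_Object_Symbol}'
```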

Reading a Subset of Annotations from a Gene Ontology Annotated File

  1. Open a Web browser to

    http://www.geneontology.org/GO.current.annotations.shtml
    
  2. Download gene_association.goa_human.gz, the file containing GO annotations for the gene products of Homo sapiens, to your MATLAB Current Folder.

  3. Uncompress the file using the gunzip function.

    gunzip('gene_association.goa_human.gz')

  4. Read the file into the MATLAB software, but limit the annotations to the molecular function aspect (F) and to the fields for the gene symbol and the associated GO ID, that is, DB_Object_Symbol and GOid.

    HumanStruct = goannotread('gene_association.goa_human', ...
                 'Aspect','F','Fields',{'DB_Object_Symbol','GOid'});

  5. Create a list of the Homo sapiens genes and a list of the associated GO terms.

    Humangenes = {HumanStruct.DB_Object_Symbol};
    HumanGO = [HumanStruct.GOid];
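
The two lists can then be queried together. The sketch below looks up the distinct gene symbols annotated with one GO ID. It assumes that GOid values are stored as numbers, as the bracket concatenation in step 5 suggests, and that gene_association.goa_human is in the current folder. The ID 5515 (GO:0005515, protein binding) and the names targetID, isTarget, and targetGenes are chosen only for illustration.

```matlab
% Same read as in step 4, repeated so the sketch runs on its own.
HumanStruct = goannotread('gene_association.goa_human', ...
             'Aspect','F','Fields',{'DB_Object_Symbol','GOid'});

% GO:0005515 (protein binding), picked only as an example ID.
targetID = 5515;

% Logical mask of the annotations that carry this ID
% (assumes each GOid is a numeric scalar).
isTarget = ([HumanStruct.GOid] == targetID);

% Distinct gene symbols among the matching annotations.
targetGenes = unique({HumanStruct(isTarget).DB_Object_Symbol});
```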