Analyze Sentence Structure Using Grammatical Dependency Parsing

This example shows how to extract information from a sentence using grammatical dependency parsing.

Grammatical dependency parsing is the process of identifying the grammatical structure of a sentence by highlighting the dependencies between the words of the sentence. For example, you can indicate which adjectives modify which nouns.

You can use grammatical dependency analysis to extract information from a sentence. For example, for the sentence "If you see a blue light, then press the red button.", you can parse the grammatical details of the sentence and programmatically determine that the button is red and should be pressed.

This plot shows the grammatical structure of a sentence.

[Figure: DependencyChart.png]

Add Grammatical Dependency Details

Create a string scalar containing the sentence to analyze.

str = "If the temperature reaches 100 degrees, then disable the heating element.";

Tokenize the text and add grammatical dependency details to the document. The addDependencyDetails function requires the Text Analytics Toolbox™ Model for UDify Data support package. If the support package is not installed, then the function provides a download link.

document = tokenizedDocument(str);
document = addDependencyDetails(document);

View the token details using the tokenDetails function. The addDependencyDetails function adds the variables Head and Dependency to the table.

tdetails = tokenDetails(document)
tdetails=13×8 table
        Token        DocumentNumber    SentenceNumber    LineNumber       Type        Language    Head    Dependency
    _____________    ______________    ______________    __________    ___________    ________    ____    __________

    "If"                   1                 1               1         letters           en         4      mark     
    "the"                  1                 1               1         letters           en         3      det      
    "temperature"          1                 1               1         letters           en         4      nsubj    
    "reaches"              1                 1               1         letters           en         9      advcl    
    "100"                  1                 1               1         digits            en         6      nummod   
    "degrees"              1                 1               1         letters           en         4      obj      
    ","                    1                 1               1         punctuation       en         4      punct    
    "then"                 1                 1               1         letters           en         9      advmod   
    "disable"              1                 1               1         letters           en         0      root     
    "the"                  1                 1               1         letters           en        12      det      
    "heating"              1                 1               1         letters           en        12      compound 
    "element"              1                 1               1         letters           en         9      obj      
    "."                    1                 1               1         punctuation       en         9      punct    

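As a minimal sketch of how to read the Head variable, this code looks up the head token of each row. It assumes a single-sentence document, so that each Head value can be used directly as a row index. To run without the support package, the code rebuilds the Token, Head, and Dependency variables by hand from the table above. The names T and HeadToken and the "<root>" label are illustrative and are not part of the toolbox.

```matlab
% Rebuild three variables of the token details table by hand
% (values copied from the output above; Dependency is stored as a
% string array here, which is an assumption of this sketch).
Token = ["If";"the";"temperature";"reaches";"100";"degrees";",";"then"; ...
    "disable";"the";"heating";"element";"."];
Head = [4;3;4;9;6;4;4;9;0;12;12;9;9];
Dependency = ["mark";"det";"nsubj";"advcl";"nummod";"obj";"punct"; ...
    "advmod";"root";"det";"compound";"obj";"punct"];
T = table(Token,Head,Dependency);

% A Head value of 0 marks the root, which has no head token.
% Label it with the placeholder "<root>" (chosen for this sketch).
HeadToken = repmat("<root>",height(T),1);
hasHead = T.Head > 0;
HeadToken(hasHead) = T.Token(T.Head(hasHead));
T.HeadToken = HeadToken;

disp(T)
```

In the resulting table, the row for "heating" lists "element" as its head token, and the row for "disable" shows the placeholder because it is the root.
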
Visualize Grammatical Dependencies

Visualize the grammatical dependencies in a sentence chart.

figure
sentenceChart(document)

Extract Information from Grammatical Dependency Tree

You can use the tree structure to extract information from the sentence.

The form of the sentence is "If <condition>, then <action>.".

Find the root of the sentence. In this case, the root is the verb "disable" in the action.

idxRoot = find(tdetails.Dependency == "root");
tokenRoot = tdetails.Token(idxRoot)
tokenRoot = 
"disable"

To parse the condition of the sentence, find the adverbial clause of the sentence. In this case, the root of the adverbial clause is the verb "reaches".

idxRoot = find(tdetails.Dependency == "root");
idxAdvcl = (tdetails.Head == idxRoot) & (tdetails.Dependency == "advcl");

tokenAdvcl = tdetails.Token(idxAdvcl)
tokenAdvcl = 
"reaches"

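Searching for a token by its text can match more than one row if the word is repeated in the sentence. One possible alternative, sketched here, is to convert the logical mask to a numeric row index and reuse that index for follow-up queries. The sketch rebuilds the relevant table variables by hand so that it runs without the support package, and it assumes that the sentence has exactly one root and one adverbial clause. The names T and subjectOfCondition are illustrative.

```matlab
% Hand-built copy of the Token, Head, and Dependency variables
% (values copied from the token details table in this example).
Token = ["If";"the";"temperature";"reaches";"100";"degrees";",";"then"; ...
    "disable";"the";"heating";"element";"."];
Head = [4;3;4;9;6;4;4;9;0;12;12;9;9];
Dependency = ["mark";"det";"nsubj";"advcl";"nummod";"obj";"punct"; ...
    "advmod";"root";"det";"compound";"obj";"punct"];
T = table(Token,Head,Dependency);

% Locate the root, then the adverbial clause attached to it.
idxRoot = find(T.Dependency == "root");
idxAdvcl = find(T.Head == idxRoot & T.Dependency == "advcl");

% Reuse the numeric row index instead of matching the text "reaches".
idxNsubj = find(T.Head == idxAdvcl & T.Dependency == "nsubj");
subjectOfCondition = T.Token(idxNsubj)
```
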
To parse the subclause of the form "<subject> reaches <object>", find the subject and object of the word "reaches".

Find the nominal subject of the word "reaches". In this case, the nominal subject is the word "temperature".

idxToken = find(tdetails.Token == "reaches");

idxNsubj = (tdetails.Head == idxToken) & (tdetails.Dependency == "nsubj");
tokenNsubj = tdetails.Token(idxNsubj)
tokenNsubj = 
"temperature"

Find the object of the verb "reaches". In this case, the object is the word "degrees".

idxToken = find(tdetails.Token == "reaches");

idxConditionObject = (tdetails.Head == idxToken) & (tdetails.Dependency == "obj");
tokenConditionObject = tdetails.Token(idxConditionObject)
tokenConditionObject = 
"degrees"

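The subject and object lookups follow the same pattern: select the rows with a given head and a given relation. As a sketch, you can capture that pattern in an anonymous function. The helper dependentsOf is hypothetical and is not a toolbox function. The code rebuilds the relevant table variables by hand so that it runs without the support package.

```matlab
% Hand-built copy of the Token, Head, and Dependency variables
% (values copied from the token details table in this example).
Token = ["If";"the";"temperature";"reaches";"100";"degrees";",";"then"; ...
    "disable";"the";"heating";"element";"."];
Head = [4;3;4;9;6;4;4;9;0;12;12;9;9];
Dependency = ["mark";"det";"nsubj";"advcl";"nummod";"obj";"punct"; ...
    "advmod";"root";"det";"compound";"obj";"punct"];
T = table(Token,Head,Dependency);

% Hypothetical helper: row indices of the tokens whose head is row
% idxHead and whose dependency relation is rel.
dependentsOf = @(T,idxHead,rel) find(T.Head == idxHead & T.Dependency == rel);

idxVerb = find(T.Token == "reaches");
subject = T.Token(dependentsOf(T,idxVerb,"nsubj"))
conditionObject = T.Token(dependentsOf(T,idxVerb,"obj"))
```

Because the helper returns row indices rather than tokens, you can pass its output back in as the head of the next lookup.
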
To parse the subclause of the form "<number> degrees", find the numeric modifier of the word "degrees". In this case, the numeric modifier is "100".

idxToken = find(tdetails.Token == "degrees");

idxNummod = (tdetails.Head == idxToken) & (tdetails.Dependency == "nummod");
tokenNummod = tdetails.Token(idxNummod)
tokenNummod = 
"100"

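The numeric modifier is returned as text. To use it in a comparison, you can convert it to a number, as in this sketch. The variable currentTemperature is a hypothetical sensor reading, and interpreting "reaches" as greater than or equal to is an assumption of the sketch, not something the parser provides. The code rebuilds the relevant table variables by hand so that it runs without the support package.

```matlab
% Hand-built copy of the Token, Head, and Dependency variables
% (values copied from the token details table in this example).
Token = ["If";"the";"temperature";"reaches";"100";"degrees";",";"then"; ...
    "disable";"the";"heating";"element";"."];
Head = [4;3;4;9;6;4;4;9;0;12;12;9;9];
Dependency = ["mark";"det";"nsubj";"advcl";"nummod";"obj";"punct"; ...
    "advmod";"root";"det";"compound";"obj";"punct"];
T = table(Token,Head,Dependency);

% Find the numeric modifier of "degrees" and convert it to a double.
idxDegrees = find(T.Token == "degrees");
tokenNummod = T.Token(T.Head == idxDegrees & T.Dependency == "nummod");
threshold = str2double(tokenNummod);

% str2double returns NaN for text that is not a number, so check
% that exactly one valid number was found.
if ~isscalar(threshold) || isnan(threshold)
    error("Could not read a single numeric threshold.");
end

% Hypothetical sensor reading, used only to illustrate the comparison.
currentTemperature = 102;
conditionMet = currentTemperature >= threshold
```
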
To parse the action of the sentence, find the object of the verb "disable". In this case, the object is the word "element".

idxToken = find(tdetails.Token == "disable");

idxActionObject = (tdetails.Head == idxToken) & (tdetails.Dependency == "obj");
tokenActionObject = tdetails.Token(idxActionObject)
tokenActionObject = 
"element"

To parse the subclause of the form "<type> element", find the tokens with a compound relation. In this case, the modifier is the word "heating".

idxToken = find(tdetails.Token == "element");

idxCompound = (tdetails.Head == idxToken) & (tdetails.Dependency == "compound");
tokenCompound = tdetails.Token(idxCompound)
tokenCompound = 
"heating"

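The steps in this section can be combined into a single structure that describes the rule in the sentence. This is a sketch under stated assumptions: the sentence has the form "If <condition>, then <action>." with exactly one match for each relation, and the struct rule, its field names, and the helper tokensOf are illustrative choices, not toolbox features. The code rebuilds the relevant table variables by hand so that it runs without the support package.

```matlab
% Hand-built copy of the Token, Head, and Dependency variables
% (values copied from the token details table in this example).
Token = ["If";"the";"temperature";"reaches";"100";"degrees";",";"then"; ...
    "disable";"the";"heating";"element";"."];
Head = [4;3;4;9;6;4;4;9;0;12;12;9;9];
Dependency = ["mark";"det";"nsubj";"advcl";"nummod";"obj";"punct"; ...
    "advmod";"root";"det";"compound";"obj";"punct"];
T = table(Token,Head,Dependency);

% Hypothetical helper: tokens attached to row idxHead by relation rel.
tokensOf = @(idxHead,rel) T.Token(T.Head == idxHead & T.Dependency == rel);

% Walk the tree from the root using row indices.
idxRoot = find(T.Dependency == "root");
idxAdvcl = find(T.Head == idxRoot & T.Dependency == "advcl");
idxCondObj = find(T.Head == idxAdvcl & T.Dependency == "obj");
idxActObj = find(T.Head == idxRoot & T.Dependency == "obj");

% Collect the condition and the action in one struct.
rule.ConditionSubject = tokensOf(idxAdvcl,"nsubj");
rule.ConditionVerb = T.Token(idxAdvcl);
rule.ConditionValue = str2double(tokensOf(idxCondObj,"nummod"));
rule.ConditionUnit = T.Token(idxCondObj);
rule.Action = T.Token(idxRoot);
rule.ActionTarget = join([tokensOf(idxActObj,"compound"); T.Token(idxActObj)]," ");

disp(rule)
```

For this sentence, the condition is that "temperature" "reaches" 100 "degrees", and the action is to "disable" the "heating element".
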