Modeling and Prediction

Develop predictive models using topic models and word embeddings.

To find clusters and extract features from high-dimensional text datasets, you can use machine learning techniques and models such as latent semantic analysis (LSA), latent Dirichlet allocation (LDA), and word embeddings. You can combine features created with Text Analytics Toolbox™ with features from other data sources. With these combined features, you can build machine learning models that take advantage of textual, numeric, and other types of data.
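
As a rough, toolbox-independent illustration of combining text features with features from another data source, the following Python sketch builds word-count (bag-of-words) features by hand and appends one numeric feature per document. It does not use the Text Analytics Toolbox API. The helper names `bag_of_words` and `combine_features`, the sample documents, and the sensor readings are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def bag_of_words(documents):
    """Return (vocabulary, counts); counts[i][j] is how often vocabulary[j] occurs in documents[i]."""
    # Naive tokenization (assumption): lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    counts = []
    for tokens in tokenized:
        frequency = Counter(tokens)
        counts.append([frequency[word] for word in vocabulary])
    return vocabulary, counts


def combine_features(text_features, numeric_features):
    """Concatenate each document's text features with its numeric features."""
    if len(text_features) != len(numeric_features):
        raise ValueError("Both feature sets must have one row per document.")
    return [list(t) + list(n) for t, n in zip(text_features, numeric_features)]


# Hypothetical data: three short maintenance notes and one sensor reading per note.
documents = ["pump pressure low", "pump leaking oil", "pressure sensor fault"]
sensor_readings = [[1.2], [0.4], [3.1]]

vocabulary, counts = bag_of_words(documents)
features = combine_features(counts, sensor_readings)

print(vocabulary)
print(features[0])
```

Each row of `features` holds the word counts for one document followed by its numeric reading, so a single classifier or regression model can be trained on both kinds of data at once.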

Functions

fitlda - Fit latent Dirichlet allocation (LDA) model
fitlsa - Fit LSA model
resume - Resume fitting LDA model
logp - Document log-probabilities and goodness of fit of LDA model
predict - Predict top LDA topics of documents
transform - Transform documents into lower-dimensional space
fastTextWordEmbedding - Pretrained fastText word embedding
readWordEmbedding - Read word embedding from file
trainWordEmbedding - Train word embedding
writeWordEmbedding - Write word embedding file
word2vec - Map word to embedding vector
vec2word - Map embedding vector to word
ismember - Test whether word is member of word embedding
wordcloud - Create word cloud chart from text, bag-of-words model, bag-of-n-grams model, or LDA model
textscatter - 2-D scatter plot of text
textscatter3 - 3-D scatter plot of text

Objects

bagOfWords - Bag-of-words model
bagOfNgrams - Bag-of-n-grams model
ldaModel - Latent Dirichlet allocation (LDA) model
lsaModel - Latent semantic analysis (LSA) model
wordEmbedding - Map words to vectors and back

Topics

Create Simple Text Model for Classification

This example shows how to train a simple text classifier on word frequency counts using a bag-of-words model.

Analyze Text Data Using Topic Models

This example shows how to use the latent Dirichlet allocation (LDA) topic model to analyze text data.

Choose Number of Topics for LDA Model

This example shows how to decide on a suitable number of topics for a latent Dirichlet allocation (LDA) model.

Compare LDA Solvers

This example shows how to compare latent Dirichlet allocation (LDA) solvers by comparing the goodness of fit and the time taken to fit the model.

Analyze Text Data Using Multiword Phrases

This example shows how to analyze text using n-gram frequency counts.

Visualize Word Embeddings Using Text Scatter Plots

This example shows how to visualize word embeddings using 2-D and 3-D t-SNE and text scatter plots.

Classify Text Data Using Deep Learning

This example shows how to classify text descriptions of weather reports using a deep learning long short-term memory (LSTM) network.
