wordcloud

Create word cloud chart from text data

Text Analytics Toolbox™ extends the functionality of the wordcloud (MATLAB®) function. It adds support for creating word clouds directly from string arrays, and creating word clouds from bag-of-words models and LDA topics. For the wordcloud (Text Analytics Toolbox) reference page, see wordcloud.

Syntax

wc = wordcloud(C)
wc = wordcloud(words,sizeData)
wc = wordcloud(parent,___)
wc = wordcloud(___,Name,Value)

Description

wc = wordcloud(C) creates a word cloud chart from the elements of categorical array C.

wc = wordcloud(words,sizeData) creates a word cloud chart from the elements of words, with word sizes specified by sizeData.

wc = wordcloud(parent,___) creates the word cloud in the figure, panel, or tab specified by parent.

wc = wordcloud(___,Name,Value) specifies additional WordCloudChart properties using one or more name-value pair arguments.
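
As a minimal sketch of the name-value syntax, the following call sets two chart properties when the chart is created. The words and sizes are made-up sample data, not taken from the examples on this page.

```matlab
% Hypothetical sample data for illustration only.
words = ["rose" "time" "beauty" "heart" "summer"];
sizeData = [3 2 1 5 4];

figure
wc = wordcloud(words,sizeData, ...
    'HighlightColor','red', ...   % word highlight color
    'Shape','rectangle');         % rectangular chart shape
```

Because the function returns the chart object, the same properties can also be changed afterward, for example with wc.Shape = 'oval'.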

Examples

Create a word cloud from plain text by reading it into a string array, preprocessing it, and passing it to the wordcloud function.

Read the text from Shakespeare's Sonnets with the fileread function.

sonnets = fileread('sonnets.txt');
sonnets(1:35)
ans = 
    'THE SONNETS
     
     by William Shakespeare'

Convert the text to a string using the string function. Then, split it on newline characters using the splitlines function.

sonnets = string(sonnets);
sonnets = splitlines(sonnets);
sonnets(10:14)
ans = 5×1 string array
    "  From fairest creatures we desire increase,"
    "  That thereby beauty's rose might never die,"
    "  But as the riper should by time decease,"
    "  His tender heir might bear his memory:"
    "  But thou, contracted to thine own bright eyes,"

Replace some punctuation marks with space characters.

p = ["." "?" "!" "," ";" ":"];
sonnets = replace(sonnets,p," ");
sonnets(10:14)
ans = 5×1 string array
    "  From fairest creatures we desire increase "
    "  That thereby beauty's rose might never die "
    "  But as the riper should by time decease "
    "  His tender heir might bear his memory "
    "  But thou  contracted to thine own bright eyes "

Split sonnets into a string array whose elements contain individual words. To do this, join all the string elements into a 1-by-1 string and then split on the space characters.

sonnets = join(sonnets);
sonnets = split(sonnets);
sonnets(7:12)
ans = 6×1 string array
    "From"
    "fairest"
    "creatures"
    "we"
    "desire"
    "increase"

Remove words with fewer than five characters.

sonnets(strlength(sonnets)<5) = [];

Convert sonnets to a categorical array and then plot using wordcloud.

sonnets = categorical(sonnets);
figure
wordcloud(sonnets);
title("Sonnets Word Cloud")
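
The preprocessing steps above can be checked on a small in-memory sample before running them on the whole file. This sketch assumes two lines copied from the sonnets output shown earlier; the variable names txt and tokens are illustrative.

```matlab
% Two sample lines, used here in place of the full sonnets.txt file.
txt = ["  From fairest creatures we desire increase,"; ...
       "  That thereby beauty's rose might never die,"];

% Replace punctuation marks with spaces.
p = ["." "?" "!" "," ";" ":"];
txt = replace(txt,p," ");

% Join the lines into one string, then split on whitespace.
tokens = split(join(txt));

% Drop words with fewer than five characters. This also drops the
% empty strings that split produces between consecutive spaces.
tokens(strlength(tokens) < 5) = [];

disp(tokens)
```

Note that the length filter does double duty: it removes short words and the zero-length tokens left behind by runs of spaces.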

Create a word cloud from plain text by reading it into a string array, preprocessing it, counting how often each word occurs, and passing the unique words and their counts to the wordcloud function.

Read the text from Shakespeare's Sonnets with the fileread function.

sonnets = fileread('sonnets.txt');

Convert the text to a string using the string function. Then, split it on newline characters using the splitlines function.

sonnets = string(sonnets);
sonnets = splitlines(sonnets);

Replace some punctuation marks with space characters.

p = ["." "?" "!" "," ";" ":"];
sonnets = replace(sonnets,p," ");

Split sonnets into a string array whose elements contain individual words. Join all the string elements into a 1-by-1 string and then split on the space characters.

sonnets = join(sonnets);
sonnets = split(sonnets);

Remove words with fewer than 5 characters.

sonnets(strlength(sonnets)<5) = [];

Find the unique words in sonnets and count their frequency.

[words,~,idx] = unique(sonnets);
numOccurrences = histcounts(idx,numel(words));

Create a word cloud using the frequency counts as size data.

wordcloud(words,numOccurrences);
title("Sonnets Word Cloud")
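
To see what the counting step produces, here is a small sketch on a made-up token list. It uses accumarray instead of histcounts to tally the indices returned by unique; this is an alternative shown for illustration, not the method used in the example above.

```matlab
% Hypothetical token list standing in for the preprocessed sonnets.
tokens = ["rose";"time";"rose";"beauty";"rose";"time"];

% unique returns the sorted distinct words and, in idx, the position
% of each token within that sorted list.
[words,~,idx] = unique(tokens);

% accumarray adds 1 to the counter selected by each entry of idx.
numOccurrences = accumarray(idx,1);

disp(table(words,numOccurrences))
```

The resulting words and numOccurrences vectors have the same length and can be passed to wordcloud(words,numOccurrences) in the same way.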

Create a word cloud from plain text by reading it into a string array, preprocessing it, and passing it to the wordcloud function. Then change the word colors by setting a property of the returned chart object.

Read the text from Shakespeare's Sonnets with the fileread function.

sonnets = fileread('sonnets.txt');

Convert the text to a string using the string function. Then, split it on newline characters using the splitlines function.

sonnets = string(sonnets);
sonnets = splitlines(sonnets);

Replace some punctuation marks with space characters.

p = ["." "?" "!" "," ";" ":"];
sonnets = replace(sonnets,p," ");

Split sonnets into a string array whose elements contain individual words. Join all the string elements into a 1-by-1 string and then split on the space characters.

sonnets = join(sonnets);
sonnets = split(sonnets);

Remove words with fewer than 5 characters.

sonnets(strlength(sonnets)<5) = [];

Convert sonnets to a categorical array and then plot using wordcloud.

sonnets = categorical(sonnets);
figure
wc = wordcloud(sonnets);
title("Sonnets Word Cloud")

Set the word colors to random values. To do this, set the Color property to a random matrix of RGB triplets with one row for each word.

numWords = numel(wc.WordData)
numWords = 2819
colors = rand(numWords,3);
wc.Color = colors;
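
The same Color property accepts any matrix with one RGB row per word, not only random values. As a sketch with made-up sample data, the following builds a gray color matrix and marks the word with the largest size value in red.

```matlab
% Hypothetical sample data for illustration only.
words = ["rose" "time" "beauty" "heart" "summer"];
sizeData = [3 2 1 5 4];

figure
wc = wordcloud(words,sizeData);

% One gray RGB row per word.
numWords = numel(wc.WordData);
colors = repmat([0.5 0.5 0.5],numWords,1);

% Overwrite the rows that belong to the largest size value with red.
isTop = wc.SizeData == max(wc.SizeData);
colors(isTop,:) = repmat([1 0 0],nnz(isTop),1);

wc.Color = colors;
```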

Input Arguments

C: Input data, specified as a categorical array. The function plots each unique element of C with a size corresponding to histcounts(C).

Data Types: categorical
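
As a small sketch of the counts that determine the word sizes, the following applies histcounts to a made-up categorical array and lists the matching category names with the categories function.

```matlab
% Hypothetical categorical data for illustration only.
C = categorical(["rose";"time";"rose";"beauty";"rose";"time"]);

cats = categories(C);     % distinct values, in category order
counts = histcounts(C);   % number of elements in each category

for k = 1:numel(cats)
    fprintf('%s: %d\n',cats{k},counts(k));
end
```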

words: Input words, specified as a string vector or cell array of character vectors.

Data Types: string | cell

sizeData: Word size data, specified as a numeric vector.

parent: Parent container, specified as a figure, panel, or tab.
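
A minimal sketch of the parent syntax, assuming made-up sample data and a panel that covers the left half of a figure:

```matlab
% Hypothetical sample data for illustration only.
words = ["rose" "time" "beauty" "heart" "summer"];
sizeData = [3 2 1 5 4];

% Create a figure with a panel in its left half (normalized units).
f = figure;
p = uipanel(f,'Position',[0 0 0.5 1]);

% Draw the word cloud inside the panel instead of the whole figure.
wc = wordcloud(p,words,sizeData);
```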

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: 'HighlightColor','red' sets the highlight color to red.

The WordCloudChart properties listed here are only a subset. For a complete list, see WordCloudChart Properties.

MaxDisplayWords: Maximum number of words to display, specified as a nonnegative integer.

Color: Word color, specified as an RGB triplet, a character vector containing a color name, or an N-by-3 matrix where N is the length of WordData. If Color is a matrix, then each row is the RGB triplet for the corresponding word in WordData.

An RGB triplet is a three-element row vector whose elements specify the intensities of the red, green, and blue components of the color. The intensities must be in the range [0,1]; for example, [0.4 0.6 0.7]. Alternatively, you can specify some common colors by name. This table lists the long and short color name options and the equivalent RGB triplet values.

Option              Description   Equivalent RGB Triplet
'red' or 'r'        Red           [1 0 0]
'green' or 'g'      Green         [0 1 0]
'blue' or 'b'       Blue          [0 0 1]
'yellow' or 'y'     Yellow        [1 1 0]
'magenta' or 'm'    Magenta       [1 0 1]
'cyan' or 'c'       Cyan          [0 1 1]
'white' or 'w'      White         [1 1 1]
'black' or 'k'      Black         [0 0 0]

Example: 'blue'

Example: [0 0 1]

HighlightColor: Word highlight color, specified as an RGB triplet or a character vector containing a color name.

An RGB triplet is a three-element row vector whose elements specify the intensities of the red, green, and blue components of the color. The intensities must be in the range [0,1]; for example, [0.4 0.6 0.7]. Alternatively, you can specify some common colors by name. This table lists the long and short color name options and the equivalent RGB triplet values.

Option              Description   Equivalent RGB Triplet
'red' or 'r'        Red           [1 0 0]
'green' or 'g'      Green         [0 1 0]
'blue' or 'b'       Blue          [0 0 1]
'yellow' or 'y'     Yellow        [1 1 0]
'magenta' or 'm'    Magenta       [1 0 1]
'cyan' or 'c'       Cyan          [0 1 1]
'white' or 'w'      White         [1 1 1]
'black' or 'k'      Black         [0 0 0]

Example: 'blue'

Example: [0 0 1]

Shape: Shape of word cloud chart, specified as 'oval' or 'rectangle'.

Example: 'rectangle'

LayoutNum: Word placement layout, specified as a nonnegative integer. If you call wordcloud repeatedly with the same inputs, the word placement layout is the same each time. To get a different layout, use a different value of LayoutNum.
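
As a sketch of how LayoutNum can be used, the following draws the same made-up data twice with different layout numbers so the two placements can be compared side by side in separate figures.

```matlab
% Hypothetical sample data for illustration only.
words = ["rose" "time" "beauty" "heart" "summer"];
sizeData = [3 2 1 5 4];

% Same inputs, two different layout numbers.
figure
wc1 = wordcloud(words,sizeData,'LayoutNum',1);

figure
wc2 = wordcloud(words,sizeData,'LayoutNum',2);
```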

Output Arguments

wc: WordCloudChart object. You can modify the properties of a WordCloudChart object after it is created.

Introduced in R2017b
