The function WORDS2NUM converts a string (with a number given in English words) into a numeric value, e.g. 'one thousand and twenty-four' -> 1024. Optional arguments control many string formatting and dialect options. The options are explained in this document, together with examples.
The string format is based on http://www.blackwasp.co.uk/NumberToWords.aspx
For many integer and decimal values WORDS2NUM can be called without any options. WORDS2NUM will match integers, decimal digits following the word 'point', and multipliers ('million', 'billion', etc.) in sequence:
words2num('zero')
words2num('infinity')
words2num('negative one thousand and twenty-four')
words2num('one point two three')
words2num('nine point eight million'), format longg
words2num('five billion, six million, seven thousand and eight'), format shortg
ans = 0
ans = Inf
ans = -1024
ans = 1.23
ans = 9800000
ans = 5006007008
The class of the numeric output can be selected using the class option. All relevant internal numeric operations are performed in this class, but the string detection (based on REGEXP) does not change. This means information may be lost during conversion from string to numeric value:
words2num('one centillion', 'class','double') % default
words2num('one centillion', 'class','uint8')
words2num('infinity', 'class','uint8')
ans = 1e+303
ans = 255
ans = 255
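The value 255 in the two uint8 results above matches how MATLAB's integer classes saturate at their limits rather than overflowing. The short sketch below shows that saturation using only built-in conversions. It does not call WORDS2NUM, and whether WORDS2NUM relies on exactly this mechanism internally is an assumption.

```matlab
% Integer classes saturate at intmin/intmax instead of wrapping around.
big = 1e303;        % numeric value of 'one centillion' (short scale)
a = uint8(big);     % too large for uint8, saturates to 255
b = uint8(Inf);     % Inf also saturates to 255
c = int8(-Inf);     % negative overflow saturates to -128
disp([a, b])        % displays the two saturated uint8 values
disp(c)
```

Choosing a wider class (for example 'uint64' or the default 'double') avoids the saturation for values that fit within that class.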
Because the string detection is based on REGEXP it is possible to detect any number strings inside of longer strings. WORDS2NUM returns a vector of the converted numbers, and a cell array of the input string parts that were split by the detected number strings:
[num,spl] = words2num('HelloOneThousandAndTwenty-FourWorld!')
[num,spl] = words2num('before one hundred middle two hundred after')
num = 1024
spl = 'Hello' 'World!'
num = 100 200
spl = 'before ' ' middle ' ' after'
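The relationship between the matched numbers and the split parts mirrors what REGEXP itself returns with its 'match' and 'split' options. The sketch below uses a deliberately tiny, hypothetical pattern (not the pattern that WORDS2NUM builds) to show why the split output always has one more element than the number of matches.

```matlab
% Hypothetical, much-simplified pattern: NOT the one used by WORDS2NUM.
str = 'before one hundred middle two hundred after';
pat = '(one|two) hundred';
% 'match' returns the matched substrings, 'split' the text around them.
[mat, spl] = regexp(str, pat, 'match', 'split');
% mat is {'one hundred', 'two hundred'}
% spl is {'before ', ' middle ', ' after'}
fprintf('%d matches, %d split parts\n', numel(mat), numel(spl));
```

When a match sits at the very start or end of the string, the corresponding split part is an empty string, so the parts and matches can always be interleaved to rebuild the input.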
The number strings can be matched depending on the character case:
words2num('One Thousand and TWENTY-four', 'case','ignore') % default
words2num('One Thousand and TWENTY-four', 'case','title')
words2num('One Thousand and TWENTY-four', 'case','upper')
words2num('One Thousand and TWENTY-four', 'case','lower')
ans = 1024
ans = 1000
ans = 20
ans = 4
By default the words 'positive' and 'negative' are detected automatically when present. The sign word can instead be required, or ignored entirely:
words2num('positive one, two, negative three') % default (option omitted)
words2num('positive one, two, negative three','sign',true) % require
words2num('positive one, two, negative three','sign',false) % ignore
ans = 1 2 -3
ans = 1 -3
ans = 1 2 3
Other string formatting features can likewise be required or excluded. Each feature is optional by default, and can be required or excluded by setting the corresponding option to true or false:
words2num('nine million, eight thousand', 'comma',true) % require
words2num('nine million, eight thousand', 'comma',false) % exclude
words2num('one thousand and twenty-four', 'hyphen',false) % exclude
words2num('one thousand and twenty-four', 'space',false) % exclude
words2num('one thousand and twenty-four', 'and',false) % exclude
ans = 9008000
ans = 9000000 8000
ans = 1020 4
ans = 1 24
ans = 1000 24
The characters that are treated as whitespace between words may also be specified, as one or more characters:
words2num('one_thousand_and_twenty_four', 'white','_')
words2num('one+thousand and twenty-four', 'white',' +')
ans = 1024
ans = 1024
Because REGEXP is used, a number string can be matched only when a requested prefix and/or suffix is also present. The prefix and suffix are interpreted as regular expressions rather than as literal text, so they may contain anchors or lookarounds:
[num,spl] = words2num('two cats three hats')
[num,spl] = words2num('two cats three hats','prefix','^') % only match start of string
[num,spl] = words2num('two cats three hats','suffix','?= h') % lookaround: ' h'
num = 2 3
spl = '' ' cats ' ' hats'
num = 2
spl = '' ' cats three hats'
num = 3
spl = 'two cats ' ' hats'
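The effect of an anchor or a lookahead can be reproduced with plain REGEXP. In the sketch below the variable numpat is a hypothetical stand-in for the number pattern, and the lookahead is written with the explicit parentheses that REGEXP itself requires. How WORDS2NUM combines the prefix and suffix options with its own pattern is not shown here.

```matlab
str = 'two cats three hats';
numpat = '(two|three)';                          % hypothetical stand-in pattern
every = regexp(str, numpat, 'match');            % {'two', 'three'}
first = regexp(str, ['^', numpat], 'match');     % anchored at start: {'two'}
hats  = regexp(str, [numpat, '(?= h)'], 'match');% followed by ' h': {'three'}
% The lookahead checks the following text without consuming it,
% so ' h' is not part of the match itself.
disp(hats)
```

Because a lookahead consumes no characters, the text it tests stays in the split parts, which is why ' hats' is returned intact in the example above.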
Several common and not-so-common number scales are supported:
- short and long scales are explained in many places on the internet. Most contemporary English dialects use the short scale, which is the WORDS2NUM default.
- peletier scale is used in many non-English-speaking European countries.
- rowlett scale was designed to avoid the ambiguity of the short and long scales.
- knuth scale (aka -yllion) uses a logarithmic naming system so that very few names cover a very wide range of values.
words2num('one billion', 'scale','short')
words2num('one thousand million', 'scale','long')
words2num('one milliard', 'scale','peletier')
words2num('one gillion', 'scale','rowlett')
words2num('ten myllion', 'scale','knuth')
ans = 1e+009
ans = 1e+009
ans = 1e+009
ans = 1e+009
ans = 1e+009
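All five strings above name the same value. The arithmetic behind each one can be checked directly, as sketched below. The powers of ten are the commonly stated definitions of each scale (for instance, a Knuth myllion is 10^8), the variable names are purely illustrative, and nothing here calls WORDS2NUM.

```matlab
% Power of ten behind each example string (illustrative variable names).
short_billion     = 1e9;         % short scale: billion = 10^9
long_thou_million = 1e3 * 1e6;   % long scale: thousand x million (10^6)
peletier_milliard = 1e9;         % milliard = 10^9
rowlett_gillion   = 1e9;         % gillion = 10^9
knuth_ten_myllion = 10 * 1e8;    % myllion = 10^8, so ten myllion = 10^9
vals = [short_billion, long_thou_million, peletier_milliard, ...
        rowlett_gillion, knuth_ten_myllion];
disp(all(vals == 1e9))           % true when every scale agrees
```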
An option, still somewhat experimental, allows compound multipliers to be parsed:
words2num('one million', 'mult','simple') % default
words2num('one thousand thousand', 'mult','compound')
words2num('two point three trillion trillion trillion', 'mult','compound')
ans = 1000000
ans = 1000000
ans = 2.3e+036
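The results above are consistent with consecutive multipliers being multiplied together. The sketch below checks that arithmetic with ordinary MATLAB operations. It assumes the short scale (trillion = 10^12) and is not a description of how WORDS2NUM parses the string.

```matlab
% Compound multipliers read as a product of the individual multipliers.
thousand = 1e3;
trillion = 1e12;                 % short scale, assumed here
a = 1 * thousand * thousand;     % 'one thousand thousand'
b = 2.3 * trillion^3;            % 'two point three trillion trillion trillion'
fprintf('%g\n', a);
fprintf('%g\n', b);              % roughly 2.3e36, subject to rounding
```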
The function NUM2WORDS converts a numeric scalar into a string with the number value given in English words.