Code covered by the BSD License  

Highlights from
wordhit

Be the first to rate this file! 4 Downloads (last 30 days) File Size: 11.26 KB File ID: #31748

wordhit

by Ben Petschel

 

10 Jun 2011 (Updated 22 Jul 2011)

determines probabilities and expected times to produce words in a random character stream

| Watch this File

File Information
Description

Consider the following problems:

- is HH or TH more likely to appear first in a series of coin flips?
- how long does it take on average for a monkey to type out the phrase "to be or not to be"?

WORDHIT solves the general problem for any list of reasonably-sized words. For example,

wordhit('HH','TH') % returns [1/4,3/4]
[P,T]=wordhit('HH','TH') % T = [0.5,2.5]
T./P % conditional hitting times [2,10/3]
sum(T) % total hitting time [3]

% optional symbolic probability values (requires Symbolic Toolbox)
[~,t]=wordhit(repmat('H',1,5),'',sym('p')) % (1+p+p^2+p^3+p^4)/p^5

The algorithm works by determining the transition matrix of the Markov chain where each state represents a given combination of matches to the first few elements of each word, with the absorbing states being where a full word is matched. This allows the hitting probabilities and hitting times to be expressed as limits of matrix powers.

The values can be challenging to compute even for some relatively small examples, so several methods have been implemented.

For some examples the results are accurate to values close to REALMIN, e.g.

log2(wordhit(repmat('H',1,1000),'TH')) % get [-1000,0]

For other examples, a modified Kramer's method is more accurate due to poor conditioning of the transition matrix:

[~,t]=wordhit(repmat('H',1,500)) % accuracy warning
[~,t]=wordhit(repmat('H',1,500),'','kramer') % get 2^501 as expected
[~,t]=wordhit('to be or not to be','','kramer') % 5.815e25
% adjust for unequal key probabilites:
[~,t]=wordhit('to be or not to be','',[32*ones(1,26),190]/2800,'kramer') % 1.225e31

See the help file for more examples and extensive documentation of the options, theory and methods.

Let me know if you find any bugs or interesting examples.

MATLAB release MATLAB 7.10 (2010a)
Other requirements Symbolic Toolbox optional
Tags for This File  
Everyone's Tags
Tags I've Applied
Add New Tags Please login to tag files.
Please login to add a comment or rating.
Updates
22 Jul 2011

(previously htseqprob.m) - added calculation of hitting times; support for alphabets of size > 2; symbolic inputs; additional methods

Tag Activity for this File
Tag Applied By Date/Time
probability Ben Petschel 10 Jun 2011 11:16:41
coin toss Ben Petschel 10 Jun 2011 11:16:41
random string Ben Petschel 22 Jul 2011 14:06:25
markov chain Ben Petschel 22 Jul 2011 14:06:25
transition matrix Ben Petschel 22 Jul 2011 14:06:25

Contact us at files@mathworks.com