How to make this long if-else chain compact?

DEVANAND on 8 Nov 2014
Commented: DEVANAND on 10 Nov 2014
I am trying to build a 2nd-order Markov chain text generator. As a first step, I implemented a zeroth-order Markov chain, but it relies on a long if-else chain. How can I make this compact? I don't see how to manage it once I move on to the 1st- and 2nd-order Markov chains. The code is given below:
%Zeroth order markov chain generator from a training data of a text file
close all;
clear ;
clc; % close all figure windows, clear variables, clear screen
%read text file and split it into characters in a cell
text = fileread('pp');
[~, text] = strsplit(text,'.','DelimiterType', ...
'RegularExpression','CollapseDelimiters',false);%split the string
%one big cell is returned, unwrap it and convert to lowercase
text = lower(text);
%delete white spaces (and keep a single space where continuous
%double spaces come)
ws = cellfun(@(x)any(x),isstrprop(text,'wspace'));
spIndex = strfind(ws, ones(1,2));
text(spIndex)=[];
%remove punctuation and any non-alpha characters
punc = cellfun(@(x)any(x),isstrprop(text,'punct'));
pIndex = strfind(punc, ones(1,1));
text(pIndex)=[];
%replace char(10) with space
text=strrep(text, char(10), ' ');
% for easy processing I am converting the alphabets to
% numbers (cell to number array) like a-z as 1-26 and space as 27.
seq=cell2mat(text);
seq=double(seq)-(double('a')-1);%alphabets a-z has ids 1-26
seq(seq == -64)=27;%space symbol has id 27
%find distribution of the sequence
xRange = 1:27; %# Range of integers to compute a probability for
N = hist(seq,xRange); %# Bin the data
dist=N./numel(seq);
cdist=cumsum(dist);
%generate sequence according to distribution
fileID = fopen('chain','w');
for k=1:numel(seq)
    p=rand;
    if ((p >= 0) && (p <= cdist(1)))
        fprintf(fileID,'%s\n','1');
    elseif ((p > cdist(1)) && (p <=cdist(2)))
        fprintf(fileID,'%s\n','2');
    elseif ((p > cdist(2)) && (p <=cdist(3)))
        fprintf(fileID,'%s\n','3');
    elseif ((p > cdist(3)) && (p <=cdist(4)))
        fprintf(fileID,'%s\n','4');
    elseif ((p > cdist(4)) && (p <=cdist(5)))
        fprintf(fileID,'%s\n','5');
    elseif ((p > cdist(5)) && (p <=cdist(6)))
        fprintf(fileID,'%s\n','6');
    elseif ((p > cdist(6)) && (p <=cdist(7)))
        fprintf(fileID,'%s\n','7');
    elseif ((p > cdist(7)) && (p <=cdist(8)))
        fprintf(fileID,'%s\n','8');
    elseif ((p > cdist(8)) && (p <=cdist(9)))
        fprintf(fileID,'%s\n','9');
    elseif ((p > cdist(9)) && (p <=cdist(10)))
        fprintf(fileID,'%s\n','10');
    elseif ((p > cdist(10)) && (p <=cdist(11)))
        fprintf(fileID,'%s\n','11');
    elseif ((p > cdist(11)) && (p <=cdist(12)))
        fprintf(fileID,'%s\n','12');
    elseif ((p > cdist(12)) && (p <=cdist(13)))
        fprintf(fileID,'%s\n','13');
    elseif ((p > cdist(13)) && (p <=cdist(14)))
        fprintf(fileID,'%s\n','14');
    elseif ((p > cdist(14)) && (p <=cdist(15)))
        fprintf(fileID,'%s\n','15');
    elseif ((p > cdist(15)) && (p <=cdist(16)))
        fprintf(fileID,'%s\n','16');
    elseif ((p > cdist(16)) && (p <=cdist(17)))
        fprintf(fileID,'%s\n','17');
    elseif ((p > cdist(17)) && (p <=cdist(18)))
        fprintf(fileID,'%s\n','18');
    elseif ((p > cdist(18)) && (p <=cdist(19)))
        fprintf(fileID,'%s\n','19');
    elseif ((p > cdist(19)) && (p <=cdist(20)))
        fprintf(fileID,'%s\n','20');
    elseif ((p > cdist(20)) && (p <=cdist(21)))
        fprintf(fileID,'%s\n','21');
    elseif ((p > cdist(21)) && (p <=cdist(22)))
        fprintf(fileID,'%s\n','22');
    elseif ((p > cdist(22)) && (p <=cdist(23)))
        fprintf(fileID,'%s\n','23');
    elseif ((p > cdist(23)) && (p <=cdist(24)))
        fprintf(fileID,'%s\n','24');
    elseif ((p > cdist(24)) && (p <=cdist(25)))
        fprintf(fileID,'%s\n','25');
    elseif ((p > cdist(25)) && (p <=cdist(26)))
        fprintf(fileID,'%s\n','26');
    elseif ((p > cdist(26)) && (p <=cdist(27)))
        fprintf(fileID,'%s\n','27');
    end
end
fclose(fileID);
fileID = fopen('chain','r');
gen_text = fscanf(fileID,'%f');%zeroth order markov generated text
fclose(fileID);
gen_text=gen_text+96;
gen_text(gen_text==(27+96))=32;
gen_text=gen_text';
gen_text=char(gen_text);
fileID = fopen('pp_zero_mc','w');
fprintf(fileID, '%s',gen_text);
fclose(fileID);
Otherwise I will have to look for a different method of generating the sequence. Please help.

Answers (1)

dpb on 8 Nov 2014
Several alternatives; probably the best is a table lookup:
state=ceil(interp1([0 cdist],0:nStates,p)); % nStates = numel(cdist); cdist values must be distinct
You can also get there with the optional second output of histc; see
doc histc % optional second output
Either of these can be vectorized and extended to higher dimensions as well.
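For illustration, here is a minimal sketch of the histc route, vectorized so that no loop or if-else chain is needed. The three-symbol `dist` and the names `edges`, `nSamples` and `freq` are made up for the example and are not from the original code; with the real data, `dist` would be the 27-element vector computed in the question. Newer MATLAB releases recommend `histcounts` or `discretize` over `histc`, so treat this as one possible approach rather than the definitive one.

```matlab
% Sketch: draw symbol ids from a discrete distribution without an if-else chain.
dist  = [0.2 0.3 0.5];          % illustrative probabilities (assumed toy data)
cdist = cumsum(dist);
cdist(end) = 1;                 % guard against round-off in the last edge
edges = [0 cdist];              % bin k covers [edges(k), edges(k+1))

nSamples = 10000;
p = rand(1, nSamples);          % uniform draws in the open interval (0,1)
[~, state] = histc(p, edges);   % state(k) is the bin index of p(k), 1..numel(dist)

% Empirical frequencies of the generated ids, for comparison with dist
freq = accumarray(state(:), 1, [numel(dist) 1])' / nSamples;
```

With the question's encoding (1-26 for a-z, 27 for space), the ids could then be mapped straight back to characters, for example `txt = char(state + 96); txt(state == 27) = ' ';`, which avoids the round trip through the intermediate 'chain' file.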
  3 Comments
dpb on 10 Nov 2014
As I noted, either can be extended to higher dimensions. Under
help interp1
one finds in the "See also" section
See also interp1q, interpft, ... interp2, interp3, interpn, ...
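As a rough sketch of how the same lookup idea might carry over to a first-order chain: instead of the interp2/interpn route mentioned above, this example swaps in a plainer technique, namely a transition-count matrix with one cumulative distribution per row, indexed by the previous symbol. The toy `seq`, `nStates`, and the names `counts`, `P`, `cP` and `gen` are assumptions for illustration only.

```matlab
% Sketch: first-order chain over symbol ids 1..nStates (assumed toy data).
seq = [1 2 1 3 1 2 2 3 1 2];            % illustrative training sequence
nStates = 3;

% counts(i,j) = number of times symbol j follows symbol i
counts = accumarray([seq(1:end-1)' seq(2:end)'], 1, [nStates nStates]);

% Rows with no observations fall back to a uniform distribution
counts(sum(counts, 2) == 0, :) = 1;

% Row-normalise to transition probabilities, then accumulate along each row
P  = bsxfun(@rdivide, counts, sum(counts, 2));
cP = cumsum(P, 2);
cP(:, end) = 1;                          % guard against round-off

nGen = 20;
gen = zeros(1, nGen);
gen(1) = seq(1);                         % start from the first training symbol
for k = 2:nGen
    % first column whose cumulative probability reaches the uniform draw
    gen(k) = find(rand <= cP(gen(k-1), :), 1, 'first');
end
```

A second-order chain could follow the same pattern by combining the two previous symbols into a single row index, e.g. `(a-1)*nStates + b`, so that `counts` becomes an `nStates^2`-by-`nStates` matrix.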
DEVANAND on 10 Nov 2014
Let me see. Thanks for the reply.
