Creating vectors showing the non-zero values

Jwana on 25 Oct 2012
Hi all,
I have code that generates three vectors (GO_Terms, is_a_relations and part_of_relations) from a text file. The text file consists of paragraphs, each starting with the marker [Term]. From each [Term] paragraph I need to take the values of GO_Terms, is_a_relations and part_of_relations. Some [Term] paragraphs contain no is_a_relations and some contain no part_of_relations, but all of them have a GO_Term value.
My question: how can I put a zero value in the vector when a [Term] paragraph has no such value (for example, when the paragraph has no is_a_relations)? At the moment my code stores only the values that are present. That does not work for me, because I need the vectors to have equal length so that I can put them in an array and process them.
My code is:
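% read the file line by line into a cell array of strings
s = {};
fid = fopen('gos.txt');
tline = fgetl(fid);
while ischar(tline)
    s{end+1,1} = tline;   % append the line as a new row
    tline = fgetl(fid);
end
fclose(fid);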
% find start and end positions of every [Term] marker in s
terms = [find(~cellfun('isempty', regexp(s, '\[Term\]'))); numel(s)+1];
% for every [Term] section, run the previously implemented regexps
% and save the results into a map - a cell array with 3 columns
GO_Terms=[];
is_a_relations=[];
part_of_relations=[];
%map = cell(0,3);
for term=1:numel(terms)-1
    % extract the lines of a single [Term] section
    s_term = s(terms(term):terms(term+1)-1);
    % match regexps
    % generate the GO_Terms vector from the text file
    tok = regexp(s_term, '^id: (GO:\w*)', 'tokens');
    idx = ~cellfun('isempty', tok);
    GO_Terms = [GO_Terms, cellfun(@(x)x{1}, {tok{idx}})];
    % generate the is_a relations vector from the text file
    tok = regexp(s_term, '^is_a: (GO:\w*)', 'tokens');
    idx = ~cellfun('isempty', tok);
    is_a_relations = [is_a_relations, cellfun(@(x)x{1}, {tok{idx}})];
    % generate the part_of relations vector from the text file
    tok = regexp(s_term, '^relationship: part_of (GO:\w*)', 'tokens');
    idx = ~cellfun('isempty', tok);
    part_of_relations = [part_of_relations, cellfun(@(x)x{1}, {tok{idx}})];
    %part_of_relations(cellfun(@isempty, part_of_relations)) = [0];
    % map. note the end+1 - here we create a new map row. Only once!
    % map{end+1,1} = GO_Terms;
    %map{end, 2} = is_a_relations;
    %map{end, 3} = part_of_relations;
end
GO_Terms=GO_Terms'
is_a_relations=is_a_relations'
part_of_relations=part_of_relations'
The code produces the following output:
GO_Terms =
'GO:0008150'
'GO:0016740'
'GO:0016787'
'GO:0006810'
'GO:0006412'
'GO:0004672'
'GO:0016779'
'GO:0004386'
'GO:0003774'
'GO:0016298'
'GO:0016192'
'GO:0005215'
'GO:0030533'
is_a_relations =
'GO:0008150'
'GO:0016740'
'GO:0016787'
'GO:0008150'
'GO:0016740'
'GO:0016740'
'GO:0016787'
'GO:0016787'
'GO:0016787'
'GO:0006810'
'GO:0006412'
'GO:0004672'
part_of_relations =
'GO:0008150'
'GO:0008150'
'GO:0006810'
'GO:0016192'
'GO:0006810'
'GO:0005215'
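
One possible way to get equal-length vectors is to preallocate one row per [Term] section, filled with a placeholder, and overwrite a cell only when the corresponding regexp matches. The block below is a minimal sketch under assumptions, not a tested solution for the real file: the sample lines in s are hypothetical stand-ins for the contents of gos.txt, the placeholder is the string '0' (so the cell arrays stay all-text), and only the first match per section is kept.

```matlab
% Hypothetical sample lines standing in for the contents of gos.txt.
s = { '[Term]'
      'id: GO:0008150'
      ''
      '[Term]'
      'id: GO:0016740'
      'is_a: GO:0008150'
      ''
      '[Term]'
      'id: GO:0006810'
      'is_a: GO:0008150'
      'relationship: part_of GO:0008150' };

% Start index of every [Term] section, plus a sentinel one past the end.
terms  = [find(~cellfun('isempty', regexp(s, '^\[Term\]'))); numel(s)+1];
nTerms = numel(terms) - 1;

% One pattern per output column: id, is_a, part_of.
patterns = { '^id: (GO:\w*)', ...
             '^is_a: (GO:\w*)', ...
             '^relationship: part_of (GO:\w*)' };

% One row per [Term], every cell pre-filled with the placeholder '0'.
map = repmat({'0'}, nTerms, numel(patterns));

for t = 1:nTerms
    % lines belonging to this [Term] section
    s_term = s(terms(t):terms(t+1)-1);
    for c = 1:numel(patterns)
        % 'once' returns at most one match per line
        tok = regexp(s_term, patterns{c}, 'tokens', 'once');
        tok = tok(~cellfun('isempty', tok));   % keep only matching lines
        if ~isempty(tok)
            map{t,c} = tok{1}{1};              % first match; else '0' stays
        end
    end
end

% All three vectors now have nTerms rows.
GO_Terms          = map(:,1);
is_a_relations    = map(:,2);
part_of_relations = map(:,3);
```

With this layout, row k of each vector refers to the same [Term] section. If a section can contain several is_a or part_of lines, the sketch keeps only the first one; storing the whole tok cell array in map{t,c} instead would keep all of them, at the cost of nested cells.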
