Thread Subject:
optimize code

Subject: optimize code

From: fckool

Date: 3 May, 2010 17:03:59

Message: 1 of 12

Hi. I have this code and I want to optimize it. Any suggestions?

Thank you in advance.


clear all;
File=input('File: ','s');
fid=fopen(File,'rt'); % text mode - strips the CR up front
A=fread(fid);%,'ubit1');
fclose(fid); % release the file handle once the data is read
Comp=length(A);
Simb=[];
Ocorr=[];
for i=1:Comp
    %length(Ocorr)+1
    if isempty(Simb)
        Simb(1)=A(i,1);
        Ocorr(1)=1;
    else
        j=length(Simb);
        while j~=0
            if A(i,1)==Simb(j)
                break;
            end
            j=j-1;
        end
        if j==0
            %length(Simb)+1;
            Simb(length(Simb)+1)=A(i,1);
            Ocorr(length(Ocorr)+1)=1;
        else
            Simb(j)=A(i,1);
            Ocorr(j)=Ocorr(j)+1;
        end
    end
end
Prob=Ocorr/sum(Ocorr);
P=sum(Prob); % for testing purposes: must equal 1
Simb=Simb';
Simb=char(Simb);
Simb=cellstr(Simb);
Simb=Simb';

Subject: optimize code

From: Roger Stafford

Date: 3 May, 2010 19:01:05

Message: 2 of 12

fckool <xtrangekid@sapo.pt> wrote in message <1976149983.64477.1272906270072.JavaMail.root@gallium.mathforum.org>...
> Hi. I have this code and I want to optimize it. Any suggestions?
> .......

  I've done this in haste and haven't tried it out, but here is my idea. If it isn't right, something similar should work.

 [u,m,n] = unique(A,'first');
 c = histc(A,u);
 [t,p] = sort(m);
 Simb = u(p);
 Ocorr = c(p);

This just calculates 'Simb' and 'Ocorr'. The rest looks straightforward.

Roger Stafford

Subject: optimize code

From: fckool

Date: 3 May, 2010 19:36:37

Message: 3 of 12

Wow, it seems to work.

Can you give a short explanation?


Thank you

Subject: optimize code

From: Roger Stafford

Date: 3 May, 2010 20:14:06

Message: 4 of 12

fckool <xtrangekid@sapo.pt> wrote in message <845054490.65201.1272915427662.JavaMail.root@gallium.mathforum.org>...
> Wow, it seems to work.
> Can you give a short explanation?
> Thank you

  If I understand your code correctly, what it does is to list in 'Simb' the first appearances of unique elements in A in the order of those appearances. In 'Ocorr' is placed a count of the total number of appearances for each corresponding value in 'Simb'. However, as you apparently surmised, the algorithm is not very efficient.

  The calls "[u,m,n] = unique(A,'first')" and "c = histc(A,u)" do this in a more efficient manner, with the unique elements of A being placed in u and corresponding counts in c. However they are in sorted order of u elements instead of order of occurrence in A. Fortunately, the values in m give the indices for the first occurrences in A of the elements in u if the 'first' option is used.

  Therefore we must do a sort on m to find the necessary permutation, p, that needs to be performed on u and c elements to reorder them in the order of their original occurrences in A instead of their numerical order in u.

Roger Stafford
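
A small worked example may make the explanation above easier to follow. This is only a sketch: the input string 'abracadabra' is made up for illustration, and the second half uses the 'stable' option of unique together with accumarray, which is an alternative not mentioned in the thread and assumes MATLAB R2012a or later. Note also that histc is flagged as not recommended in recent releases.

```matlab
% Made-up input: a column vector of character codes, as fread would return.
A = double('abracadabra')';

% Approach from the post above.
[u, m] = unique(A, 'first');  % u: sorted unique values, m: index of each first occurrence
c = histc(A, u);              % c(k): number of times u(k) appears in A
[~, p] = sort(m);             % p: permutation that restores order of first appearance
Simb  = u(p);
Ocorr = c(p);

% Alternative (assumption: R2012a or later): 'stable' already keeps
% the order of first appearance, so no extra sort is needed.
[s, ~, n] = unique(A, 'stable');
cnt = accumarray(n(:), 1);    % count how often each index into s occurs

disp(char(Simb'));            % symbols in order of first appearance
disp(Ocorr');                 % matching counts
```

For this input both variants should give the symbols 'abrcd' with counts 5, 2, 2, 1, 1.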

Subject: optimize code

From: fckool

Date: 13 May, 2010 01:38:56

Message: 5 of 12

Hello,

OK, I think I understood that.

I have another question:

How can I detect ASCII code 10 (linefeed) and remove it from 'Simb', along with its corresponding value in 'Ocorr'?

Thank you

Subject: optimize code

From: Roger Stafford

Date: 13 May, 2010 02:24:07

Message: 6 of 12

fckool <xtrangekid@sapo.pt> wrote in message <1896069628.127144.1273714766212.JavaMail.root@gallium.mathforum.org>...
> Hello,
>
> OK, I think I understood that.
>
> I have another question:
>
> How can I detect ASCII code 10 (linefeed) and remove it from 'Simb', along with its corresponding value in 'Ocorr'?
>
> Thank you

  With the linefeed occurring in 'Simb' as the numeric value 10 before you have changed it to type 'char', just do a 'find' on it at that point (before the change to 'char'.)

 q = find(Simb==10); % Find index of linefeed value, if one is present
 Simb(q) = []; % Remove the linefeed entry
 Ocorr(q) = []; % Remove its corresponding count

Roger Stafford
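
As a quick illustration of the three lines above, here is a sketch with made-up values for Simb and Ocorr (they are not taken from any real file). It also shows that the probabilities computed afterwards no longer include the linefeed.

```matlab
% Made-up example data: 'a' (97), linefeed (10), 'b' (98) and their counts.
Simb  = [97; 10; 98];
Ocorr = [3; 2; 1];

q = find(Simb == 10);   % index of the linefeed entry (empty if none is present)
Simb(q)  = [];          % remove the linefeed symbol
Ocorr(q) = [];          % remove its count

% Probabilities are now computed over the remaining symbols only.
Prob = Ocorr / sum(Ocorr);
disp([Simb Ocorr Prob]);
```

If no linefeed is present, q is empty and the two deletions leave Simb and Ocorr unchanged.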

Subject: optimize code

From: fckool

Date: 13 May, 2010 12:30:26

Message: 7 of 12

But I wanted to find all the linefeeds, not only one. That's because I have 3 .txt files with the same text but different numbers of linefeeds, and I need to get the same result for all 3 files.

I tried changing the vector Simb to the vector A, but it doesn't work.

I get this error:

"??? Index of element to remove exceeds matrix dimensions."

Can you help me?

Subject: optimize code

From: Roger Stafford

Date: 13 May, 2010 15:46:19

Message: 8 of 12

fckool <xtrangekid@sapo.pt> wrote in message <1141511283.129365.1273753856984.JavaMail.root@gallium.mathforum.org>...
> But I wanted to find all the linefeeds, not only one. That's because I have 3 .txt files with the same text but different numbers of linefeeds, and I need to get the same result for all 3 files.
>
> I tried changing the vector Simb to the vector A, but it doesn't work.
>
> I get this error:
>
> "??? Index of element to remove exceeds matrix dimensions."
>
> Can you help me?

  You have me puzzled. Given the way you processed the data to obtain Simb, it can have only one of each different character. Therefore you can remove one linefeed at most from it. If you want all linefeeds taken out of A, you should deal with A directly, not by way of Simb.

  You state that, "I tried changing the vector Simb to the vector A", but the information contained in A is not present in Simb or Ocorr, only the list of different characters present and their counts, so there would be no way of changing Simb to A. What is it you are actually trying to do?

Roger Stafford

Subject: optimize code

From: fckool

Date: 13 May, 2010 16:33:18

Message: 9 of 12

OK, my English is bad and I have not explained this well.

I have several .txt files; the only difference between them is the number of linefeeds.

I have to calculate the entropy and proprietary information of those files, and they must all give the same result, so I have to remove all the linefeeds.

Is that clearer?

Subject: optimize code

From: Roger Stafford

Date: 13 May, 2010 17:27:04

Message: 10 of 12

fckool <xtrangekid@sapo.pt> wrote in message <169058824.130614.1273768428891.JavaMail.root@gallium.mathforum.org>...
> OK, my English is bad and I have not explained this well.
> I have several .txt files; the only difference between them is the number of linefeeds.
> I have to calculate the entropy and proprietary information of those files, and they must all give the same result, so I have to remove all the linefeeds.
> Is that clearer?

  Yes that is clearer, but the only thing I can think of is to remove linefeeds directly from the A vector before computing Simb and Ocorr.

  I am curious about how computing Simb and Ocorr could have any bearing on the entropy or proprietary information contained in your files.

Roger Stafford
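
Removing the linefeeds directly from A, as suggested above, could look like the following sketch. The two strings are made-up stand-ins for file contents that differ only in where the linefeeds fall.

```matlab
% Made-up contents of two files with the same text but different linefeeds.
text1 = sprintf('ab\ncd\n');
text2 = sprintf('a\nb\n\ncd');

A1 = double(text1)';    % column vectors of character codes, like fread output
A2 = double(text2)';

A1(A1 == 10) = [];      % logical indexing removes every linefeed at once
A2(A2 == 10) = [];

same = isequal(A1, A2); % both vectors now hold the codes for 'abcd'
disp(same);
```

If carriage returns (code 13) may also be present, something like A(ismember(A, [10 13])) = [] would drop both kinds of character in one step.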

Subject: optimize code

From: fckool

Date: 13 May, 2010 18:10:57

Message: 11 of 12

I have to calculate the entropy and the information of each symbol; I do that later with two formulas.

But for that I need the same number of symbols in each of the .txt files.
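
The two formulas are not given in the thread. Assuming they are the usual Shannon definitions (self-information I = -log2(p) per symbol, and entropy H as the sum of p times I), the calculation might be sketched as follows, with made-up counts in Ocorr.

```matlab
% Made-up counts for three symbols (assumption: linefeeds already removed).
Ocorr = [4 2 2];

Prob = Ocorr / sum(Ocorr);   % relative frequency of each symbol
Info = -log2(Prob);          % assumed formula: self-information in bits
H    = sum(Prob .* Info);    % assumed formula: Shannon entropy in bits per symbol

disp(Info);
disp(H);
```

Since every entry of Ocorr is a positive count, no probability is zero and log2 never receives a zero argument.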

Subject: optimize code

From: fckool

Date: 13 May, 2010 19:29:43

Message: 12 of 12

Forget it. Your code is correct, because when you remove the corresponding value from the Ocorr vector you automatically get the correct value in the Prob vector.

I have another question (sorry).

I have a for loop where I manually enter 2 values: the symbols and their probabilities. I want a condition that doesn't let me enter a value greater than 1, or values whose sum exceeds 1. When that happens, I want the user to be asked for the value again.

Is that possible?
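
One possible way to do this is a while loop inside the for loop that keeps asking until the value fits within what is left of the total. The sketch below is an assumption about what is wanted, not code from the thread. So that it runs without a keyboard, the user's entries are simulated by the hypothetical vector 'attempts'; in real use the marked line would call input instead.

```matlab
attempts = [0.7 0.6 0.3];  % simulated user entries (hypothetical test data)
idx = 0;                   % position of the next simulated entry

nSymbols = 2;
Prob = zeros(1, nSymbols);

for k = 1:nSymbols
    remaining = 1 - sum(Prob(1:k-1));   % what is still available before the sum exceeds 1
    while true
        idx = idx + 1;
        p = attempts(idx);              % interactive version: p = input('Probability: ');
        % Accept only a scalar between 0 and the remaining amount
        % (small tolerance to absorb floating-point rounding).
        if isscalar(p) && p >= 0 && p <= remaining + 1e-12
            break;
        end
        fprintf('Value must be between 0 and %g. Please enter it again.\n', remaining);
    end
    Prob(k) = p;
end

disp(Prob);
```

With the simulated entries, 0.7 is accepted for the first symbol, 0.6 is rejected for the second because the sum would exceed 1, and 0.3 is then accepted.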
