c4.5

Asked by FIR on 17 Apr 2012
Latest activity Commented on by jijar13 smith on 8 May 2014

Hi, can anyone tell me how to implement the C4.5 algorithm for the Fisher iris data? Is there any code for it?

3 Answers

Answer by Ilya on 17 Apr 2012

Statistics Toolbox provides a decision tree implementation based on the book Classification and Regression Trees by Breiman et al. (CART). If you have MATLAB R2011a or later, run 'doc ClassificationTree' and 'doc RegressionTree'. If you have an older version, run 'doc classregtree'.

The CART algorithm is different from C4.5.
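
As a rough illustration of the CART route on the Fisher iris data, here is a minimal sketch. It assumes Statistics Toolbox and a release with fitctree (R2014a or later); on R2011a through R2013b, ClassificationTree.fit(meas, species) plays the same role. This is CART, not C4.5.

```matlab
% Sketch only: CART (not C4.5) on the Fisher iris data.
% Assumes Statistics Toolbox; fitctree needs R2014a or later.
load fisheriris                                  % meas (150x4), species (150x1 cell)
tree      = fitctree(meas, species);             % grow a CART classification tree
predicted = predict(tree, meas);                 % cell array of predicted class names
resubErr  = mean(~strcmp(predicted, species));   % resubstitution error rate
fprintf('Resubstitution error: %.3f\n', resubErr);
```

The resubstitution error is optimistic because the tree is scored on its own training data; a held-out set or cross-validation gives a fairer estimate.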

3 Comments

FIR on 18 Apr 2012

But Ilya, how do I run C4.5 then? Can I use t = classregtree(X,y) for C4.5?

Ilya on 18 Apr 2012

I don't understand your question. Try asking it again using proper grammar.

FIR on 19 Apr 2012

You said C4.5 is different from CART. Can I use classregtree for C4.5 classification?

My second question:

I get this error:
Error using classperf (line 244)
When class labels of the CP object is a cell array of strings and
the classifier output is a numeric array, it must contain valid
indices of the class labels or NaNs for inconclusive results.

I have posted this question many times but did not get a reply. Please answer.

Answer by Ilya on 19 Apr 2012

I don't know if you can use classregtree for "c4.5 classification". If you are looking for a decision tree implementation, you can use classregtree. If you are looking specifically for the C4.5 algorithm, obviously you cannot use classregtree. If you don't know enough to choose one algorithm over the other, perhaps you should use whatever is readily available, that is, classregtree.

I am not going to answer your 2nd question in this thread. I'll note however that the error message gives you a clue. classperf expects class indices, and you are giving it what exactly?
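
To make the hint concrete, here is a minimal sketch of turning class names into class indices using base MATLAB only. The label values are made up for illustration, and the assumption that the index order should follow the sorted unique class names is mine; check it against the class labels stored in your classperf object before relying on it.

```matlab
% Hypothetical labels, for illustration only
trueLabels = {'setosa'; 'versicolor'; 'virginica'; 'setosa'};
predicted  = {'setosa'; 'virginica'; 'virginica'; 'unknown'};

% Sorted unique class names (assumed to match the label order you need)
classNames = unique(trueLabels);

% Map each predicted name to its position in classNames
[found, idx] = ismember(predicted, classNames);
idx = double(idx);
idx(~found) = NaN;      % names not in classNames become NaN (inconclusive)

disp(idx')
```

A numeric vector built this way contains only valid class indices or NaN, which is the form the error message asks for.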

1 Comment

FIR on 20 Apr 2012

Ilya, I tried everything but could not find a solution; that's why I posted this question.

Answer by Muhammad Aasem on 25 May 2012

I found the following source code for the C4.5 algorithm. I hope it works for you:

function test_targets = C4_5(train_patterns, train_targets, test_patterns, inc_node)
% Classify using Quinlan's C4.5 algorithm
% Inputs:
%   train_patterns      - Train patterns (one row per feature, one column per sample)
%   train_targets       - Train targets (numeric row vector, one entry per sample)
%   test_patterns       - Test patterns (same layout as train_patterns)
%   inc_node            - Percentage of incorrectly assigned samples at a node
%
% Outputs
%   test_targets        - Predicted targets
%NOTE: In this implementation it is assumed that a pattern vector with at most 10 unique values (the parameter Nu)
%is discrete, and will be treated as such. Other vectors will be treated as continuous
[Ni, M]		= size(train_patterns);
inc_node    = inc_node*M/100;
Nu          = 10;
%Find which of the input patterns are discrete, and discretize the corresponding
%dimension on the test patterns
discrete_dim = zeros(1,Ni);
for i = 1:Ni,
    Ub = unique(train_patterns(i,:));
    Nb = length(Ub);
    if (Nb <= Nu),
        %This is a discrete pattern
        discrete_dim(i)	= Nb;
        dist            = abs(ones(Nb ,1)*test_patterns(i,:) - Ub'*ones(1, size(test_patterns,2)));
        [m, in]         = min(dist);
        test_patterns(i,:)  = Ub(in);
    end
end
%Build the tree recursively
disp('Building tree')
tree            = make_tree(train_patterns, train_targets, inc_node, discrete_dim, max(discrete_dim), 0);
%Classify test samples
disp('Classify test samples using the tree')
test_targets    = use_tree(test_patterns, 1:size(test_patterns,2), tree, discrete_dim, unique(train_targets));
%END
function targets = use_tree(patterns, indices, tree, discrete_dim, Uc)
%Classify recursively using a tree
targets = zeros(1, size(patterns,2));
if (tree.dim == 0)
    %Reached the end of the tree
    targets(indices) = tree.child;
    return
end
%This is not the last level of the tree, so:
%First, find the dimension we are to work on
dim = tree.dim;
dims= 1:size(patterns,1);
%And classify according to it
if (discrete_dim(dim) == 0),
    %Continuous pattern
    in				= indices(find(patterns(dim, indices) <= tree.split_loc));
    targets		= targets + use_tree(patterns(dims, :), in, tree.child(1), discrete_dim(dims), Uc);
    in				= indices(find(patterns(dim, indices) >  tree.split_loc));
    targets		= targets + use_tree(patterns(dims, :), in, tree.child(2), discrete_dim(dims), Uc);
else
    %Discrete pattern
    Uf				= unique(patterns(dim,:));
    for i = 1:length(Uf),
        if any(Uf(i) == tree.Nf) %Has this sort of data appeared before? If not, do nothing
            in   	= indices(find(patterns(dim, indices) == Uf(i)));
            targets	= targets + use_tree(patterns(dims, :), in, tree.child(find(Uf(i)==tree.Nf)), discrete_dim(dims), Uc);
        end
    end
end
%END use_tree
function tree = make_tree(patterns, targets, inc_node, discrete_dim, maxNbin, base)
%Build a tree recursively
[Ni, L]    					= size(patterns);
Uc         					= unique(targets);
tree.dim					= 0;
%tree.child(1:maxNbin)	= zeros(1,maxNbin);
tree.split_loc				= inf;
if isempty(patterns),
    return
end
%When to stop: too few examples at this node, a single example, or a single class
if ((inc_node > L) | (L == 1) | (length(Uc) == 1)),
    H					= hist(targets, length(Uc));
    [m, largest] 	= max(H);
    tree.Nf         = [];
    tree.split_loc  = [];
    tree.child	 	= Uc(largest);
    return
end
%Compute the node's I
for i = 1:length(Uc),
    Pnode(i) = length(find(targets == Uc(i))) / L;
end
Inode = -sum(Pnode.*log(Pnode)/log(2));
%For each dimension, compute the gain ratio impurity
%This is done separately for discrete and continuous patterns
delta_Ib    = zeros(1, Ni);
split_loc	= ones(1, Ni)*inf;
for i = 1:Ni,
    data	= patterns(i,:);
    Ud      = unique(data);
    Nbins	= length(Ud);
    if (discrete_dim(i)),
        %This is a discrete pattern
        P	= zeros(length(Uc), Nbins);
        for j = 1:length(Uc),
            for k = 1:Nbins,
                indices 	= find((targets == Uc(j)) & (patterns(i,:) == Ud(k)));
                P(j,k) 	= length(indices);
            end
        end
        Pk          = sum(P);
        P           = P/L;
        Pk          = Pk/sum(Pk);
        info        = sum(-P.*log(eps+P)/log(2));
        delta_Ib(i) = (Inode-sum(Pk.*info))/-sum(Pk.*log(eps+Pk)/log(2));
    else
        %This is a continuous pattern
        P	= zeros(length(Uc), 2);
        %Sort the patterns
        [sorted_data, indices] = sort(data);
        sorted_targets = targets(indices);
        %Calculate the information for each possible split
        I	= zeros(1, L-1);
        for j = 1:L-1,
            %for k =1:length(Uc),
            %    P(k,1) = sum(sorted_targets(1:j)        == Uc(k));
            %    P(k,2) = sum(sorted_targets(j+1:end)    == Uc(k));
            %end
            P(:, 1) = hist(sorted_targets(1:j) , Uc);
            P(:, 2) = hist(sorted_targets(j+1:end) , Uc);
            Ps		= sum(P)/L;
            P		= P/L;
            Pk      = sum(P);
            P1      = repmat(Pk, length(Uc), 1);
            P1      = P1 + eps*(P1==0);
            info	= sum(-P.*log(eps+P./P1)/log(2));
            I(j)	= Inode - sum(info.*Ps);
        end
        [delta_Ib(i), s] = max(I);
        split_loc(i) = sorted_data(s);
    end
end
%Find the dimension maximizing delta_Ib
[m, dim]    = max(delta_Ib);
dims        = 1:Ni;
tree.dim    = dim;
%Split along the 'dim' dimension
Nf		= unique(patterns(dim,:));
Nbins	= length(Nf);
tree.Nf = Nf;
tree.split_loc      = split_loc(dim);
%If only one value remains for this pattern, one cannot split it.
if (Nbins == 1)
    H				= hist(targets, length(Uc));
    [m, largest] 	= max(H);
    tree.Nf         = [];
    tree.split_loc  = [];
    tree.child	 	= Uc(largest);
    return
end
if (discrete_dim(dim)),
    %Discrete pattern
    for i = 1:Nbins,
        indices         = find(patterns(dim, :) == Nf(i));
        tree.child(i)	= make_tree(patterns(dims, indices), targets(indices), inc_node, discrete_dim(dims), maxNbin, base);
    end
else
    %Continuous pattern
    indices1		   	= find(patterns(dim,:) <= split_loc(dim));
    indices2	   		= find(patterns(dim,:) > split_loc(dim));
    if ~(isempty(indices1) | isempty(indices2))
        tree.child(1)	= make_tree(patterns(dims, indices1), targets(indices1), inc_node, discrete_dim(dims), maxNbin, base+1);
        tree.child(2)	= make_tree(patterns(dims, indices2), targets(indices2), inc_node, discrete_dim(dims), maxNbin, base+1);
    else
        H				= hist(targets, length(Uc));
        [m, largest] 	= max(H);
        tree.child	 	= Uc(largest);
        tree.dim                = 0;
    end
end
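
The quantity make_tree maximizes for discrete patterns is a gain ratio. As a small worked example, the sketch below computes the textbook gain ratio (information gain divided by split information) for one discrete attribute. The toy data and variable names are hypothetical, and this follows the textbook definition rather than restating the posted code line for line, so the numbers it produces need not match what make_tree computes internally.

```matlab
% Toy data (hypothetical): one discrete attribute and two classes
attr    = [1 1 1 2 2 2 3 3];
targets = [0 0 1 1 1 1 0 1];

L  = numel(targets);
Uc = unique(targets);     % class labels
Ud = unique(attr);        % attribute values

% Entropy of the node before splitting, in bits
Pnode = arrayfun(@(c) sum(targets == c), Uc) / L;
Inode = -sum(Pnode .* log2(Pnode));

% Weighted entropy after the split, and the split information
condH     = 0;
splitInfo = 0;
for k = 1:numel(Ud)
    inBranch = (attr == Ud(k));
    w  = sum(inBranch) / L;                       % fraction of samples in this branch
    Pb = arrayfun(@(c) sum(targets(inBranch) == c), Uc) / sum(inBranch);
    Pb = Pb(Pb > 0);                              % skip empty classes to avoid log2(0)
    condH     = condH - w * sum(Pb .* log2(Pb));
    splitInfo = splitInfo - w * log2(w);
end

gain      = Inode - condH;        % information gain
gainRatio = gain / splitInfo;     % gain ratio used to rank attributes
fprintf('gain = %.4f, gain ratio = %.4f\n', gain, gainRatio);
```

Dividing by the split information penalizes attributes with many distinct values, which plain information gain tends to favor.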

1 Comment

jijar13 smith on 8 May 2014

Can you explain this program in general, please?
