Rank: 4136 based on 24 downloads (last 30 days) and 1 file submitted
Julio Zaragoza

Company/University: The University of Adelaide

Files Posted by Julio Zaragoza
Updated: 31 Dec 2013
File: Discretization methods: Class-Attribute Contingency Coefficient (CACC - MATLAB)
Description: Correct Implementation of the CACC Discretization Method. http://cs.adelaide.edu.au/~jzaragoza
Author: Julio Zaragoza
Tags: statistics, mathematics
Downloads (last 30 days): 24
Comments: 6
Rating: 4.7 (3 ratings)

Comments and Ratings by Julio Zaragoza

15 May 2013 | Discretization algorithms: Class-Attribute Contingency Coefficient | Author: Guangdi Li
File description: To discretize continuous data, CACC is a promising discretization scheme proposed in 2008.

I closed my Karin Zachinelly account. My CACC implementation files are in Julio Zaragoza's account now.

Please, if you find any bugs in my implementation, let me know.

Comments and Ratings on Julio Zaragoza's Files
26 Jun 2014 | Discretization methods: Class-Attribute Contingency Coefficient (CACC - MATLAB) | Author: Julio Zaragoza | Comment by: Adrian__

Well done for spotting that Guangdi Li's implementation of the CACC discretization algorithm was wrong. As Rahul pointed out, discretizing large datasets will take a lot of time with your code. This computational burden is a direct consequence of the way the method was coded. I implemented the method in MATLAB and, on some test examples, achieved a speed-up of about 45 times. The speed can be improved further. At the moment I don't have time to write documentation for it, but if you agree, I can send you my files so you can update your code.

31 Dec 2013 | Discretization methods: Class-Attribute Contingency Coefficient (CACC - MATLAB) | Author: Julio Zaragoza | Comment by: Rahul

That is what I thought. Also, your version seems to be O(M^2), where M is the number of distinct values, because of the nested loops used when adding the inner boundaries. I'm not sure how the paper achieves O(m log m).
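
For readers following the thread, here is a minimal Python sketch of the CACC criterion that a greedy search would re-evaluate for each candidate boundary. It assumes the formula as commonly stated from the 2008 paper, cacc = sqrt(y / (y + M)) with y = M * (sum over cells of q_ir^2 / (M_i+ * M_+r) - 1) / log(n), where M is the total number of samples (not the number of distinct values mentioned above) and n is the number of intervals. The function name `cacc_value`, the list-of-lists quanta matrix, and the natural logarithm are assumptions made for illustration; this is not the MATLAB code from the submission.

```python
import math


def cacc_value(quanta):
    """Return the CACC value for a quanta matrix (hypothetical helper).

    quanta[i][r] is the number of samples of class i falling in interval r.
    """
    n = len(quanta[0])  # number of intervals = number of cutting points - 1
    if n < 2:
        raise ValueError("need at least two intervals, since log(1) = 0")

    row_totals = [sum(row) for row in quanta]        # samples per class
    col_totals = [sum(col) for col in zip(*quanta)]  # samples per interval
    total = sum(row_totals)                          # M in the formula

    acc = 0.0
    for i, row in enumerate(quanta):
        for r, q in enumerate(row):
            if q:  # empty cells add nothing and may sit in empty rows/columns
                acc += q * q / (row_totals[i] * col_totals[r])

    # Assumption: natural logarithm; clamp tiny negative rounding errors to 0.
    y = max(0.0, total * (acc - 1.0) / math.log(n))
    return math.sqrt(y / (y + total))


if __name__ == "__main__":
    # Two classes, two intervals, classes perfectly separated by the cut.
    print(cacc_value([[4, 0], [0, 4]]))
```

If a greedy search tries every remaining candidate boundary each time it adds one cut, a function like this would be called on the order of M^2 times for M distinct values, which is consistent with the quadratic behaviour described in the comment above.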

31 Dec 2013 | Discretization methods: Class-Attribute Contingency Coefficient (CACC - MATLAB) | Author: Julio Zaragoza | Comment by: Julio Zaragoza

You would need to develop a C/C++ version of the code; otherwise it will take a long time.

30 Dec 2013 | Discretization methods: Class-Attribute Contingency Coefficient (CACC - MATLAB) | Author: Julio Zaragoza | Comment by: Rahul

This works great on smaller datasets, but have you tried it on larger ones? I'm trying to discretize gene expression data, which has 1.5 million samples and 20000 unique classes.

30 Dec 2013 | Discretization methods: Class-Attribute Contingency Coefficient (CACC - MATLAB) | Author: Julio Zaragoza | Comment by: Julio Zaragoza

Yes, it is -1 (n = number of cutting points - 1).
Thanks a lot for your comment, Rahul.
