Rank: 3457 based on 36 downloads (last 30 days) and 3 files submitted
Colin Clarke

Company/University: Cranfield University
Lat/Long: 52.25830078125, -7.11899995803833

Personal Profile:

Education:
BSc Applied Biology
MSc Bioinformatics
Currently a PhD student at Cranfield.

Subject: Bio-nanotechnology and Flow Cytometry.

Professional Interests:
Bioinformatics, Nanotechnology, machine learning

 


Files Posted by Colin
Updated | File | Tags | Downloads (last 30 days) | Comments | Rating
10 Nov 2006 | medoultierfilt: code to remove outliers from a multivariate dataset using the median (Author: Colin Clarke) | statistics, probability, median outlier remova... | 16 | 3 | 4.0 (2 ratings)
02 Oct 2006 | dbSNP tool: GUI for the retrieval of single nucleotide polymorphism data from dbSNP (Author: Colin Clarke) | biotech, pharmaceutical, dbsnp, snp, bioinformatics, gui | 18 | 0 |
20 Sep 2006 | gethgvbase: Extraction of single nucleotide polymorphism data from HGVBASE (Author: Colin Clarke) | biotech, pharmaceutical, bioinformatics, hgvbase, snp, polymorphism | 2 | 0 |
Comments and Ratings on Colin's Files
Updated | File | Comment by
18 Feb 2009 | medoultierfilt: code to remove outliers from a multivariate dataset using the median (Author: Colin Clarke) | Hemanshu

What do I do if I don't want to delete the outliers but want to change the value...
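
One possible approach to the question above, as a minimal sketch only: flag the outliers and overwrite them with the column median instead of deleting rows. The rule used here (distance from the column median, scaled by the median absolute deviation) and the names cut, is_out and x_fixed are assumptions for illustration; medoutlierfilt's own outlier rule may differ.

```matlab
% Sketch: replace flagged values with the column median rather than
% removing them. The MAD-based rule and the cut value are assumptions,
% not taken from medoutlierfilt.
x = [1 10; 2 11; 3 12; 4 13; 100 14];   % rows = observations (assumed)
cut = 3;                                 % hypothetical threshold

med  = median(x, 1);                     % median of each column
dev  = abs(bsxfun(@minus, x, med));      % absolute deviation from the median
madv = median(dev, 1);                   % median absolute deviation per column

is_out = bsxfun(@gt, dev, cut * madv);   % logical mask of flagged entries

med_full = repmat(med, size(x, 1), 1);   % column medians expanded to size of x
x_fixed = x;
x_fixed(is_out) = med_full(is_out);      % overwrite only the flagged entries
```

The data keep their original size, which matters when the rows must stay aligned with other measurements.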

19 May 2008 | medoultierfilt: code to remove outliers from a multivariate dataset using the median (Author: Colin Clarke) | Y, Ido

In general the code produces the expected results. Except for the above comments, I would suggest changing the plot lines to boxplot(x,'notch','on', 'whisker',outlier_cut). This way the user's choice of the outlier cut is visualized.

01 Apr 2007 | medoultierfilt: code to remove outliers from a multivariate dataset using the median (Author: Colin Clarke) | D'Errico, John

Fairly good help, missing at least one item of importance. The one I noted is that while the variable outlier_cut has a default value, this default is not indicated in the help. What good is an undocumented default? I'd also like to be told if the outlier_cut variable must be a scalar, and what legal range of values it can take on.

Next, suppose that someone wishes to supply a value for plot_state, but is willing to allow outlier_cut to take on its default value. They would like to call your code as

x_filt = medoutlierfilt(x,[],plot_state)

This would be consistent with the operation of most functions in MATLAB. The default checks in this code are purely in the form of

if nargin < 3
    plot_state = 1;
end

Better is to use a check like

if (nargin < 3) || isempty(plot_state)
    plot_state = 1;
end
if (nargin < 2) || isempty(outlier_cut)
    outlier_cut = 1.5;
end

I did like that an example was provided, as well as an H1 line, although the H1 line was wrapped, cutting off part of it from the sight of lookfor. I also liked that the author included his name, plus an attribution to a prior code. There are a reasonable number of internal comments to make this code readable, even by my standards.

One comment about the example provided - it fails to run. The example in the help has two output arguments, but medoutlierfilt only returns one. Oops. I'll bet that an earlier version of the code had two outputs.

The arguments could also benefit from error checking. What legal values can these variables take on? What sizes may they be? What if someone accidentally passes in a vector?
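
As a rough sketch of the kind of checks meant here, the snippet below validates example values standing in for the function's arguments. The specific rules are assumptions for illustration (x a non-empty numeric matrix rather than a vector, outlier_cut a positive finite scalar, plot_state either 0 or 1), as are the error identifiers; the author would need to decide what medoutlierfilt should actually accept.

```matlab
% Example inputs; inside the function these would be the arguments.
x = rand(20, 3);
outlier_cut = 1.5;
plot_state = 1;

% x: assumed to be a non-empty 2-D numeric array, not a vector or scalar.
if ~isnumeric(x) || ndims(x) ~= 2 || isempty(x) || isvector(x)
    error('medoutlierfilt:invalidX', ...
          'x must be a non-empty numeric matrix (not a vector).');
end

% outlier_cut: assumed to be a positive, finite, real scalar.
if ~isnumeric(outlier_cut) || ~isscalar(outlier_cut) || ...
        ~isreal(outlier_cut) || ~isfinite(outlier_cut) || outlier_cut <= 0
    error('medoutlierfilt:invalidCut', ...
          'outlier_cut must be a positive, finite, real scalar.');
end

% plot_state: assumed to be a scalar flag, 0 (plots off) or 1 (plots on).
if ~isscalar(plot_state) || ~any(plot_state == [0 1])
    error('medoutlierfilt:invalidPlotState', ...
          'plot_state must be 0 or 1.');
end

inputs_ok = true;   % reached only if every check passed
```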

One of the things we all should do is look at the mlint output on our codes. Mlint flags quite a few lines in this code. (Mlint is a very helpful tool, and since mlint flags can now be seen in the editor, there is no reason to not use it.)

Mlint points out that a few variables are built without benefit of preallocation. Some of these variables are clearly of a known size, so preallocate them.
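
To illustrate the point with a generic sketch (the variable names below are made up and do not come from medoutlierfilt): the first loop grows its array on every iteration, which is the pattern mlint flags, while the second allocates the full array once before the loop.

```matlab
n = 1000;

% Flagged pattern: the array is resized on every pass through the loop.
grown = [];
for k = 1:n
    grown(k) = k^2;
end

% Preallocated version: the final size is known, so allocate it up front.
prealloc = zeros(1, n);
for k = 1:n
    prealloc(k) = k^2;
end
```

Both loops produce the same values; only the second avoids the repeated reallocation.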

Finally, I'll note that while the user can turn off the plots when they are unwanted, it might be useful to allow the user to also turn off the stats display that is written out at the end. This could be done easily enough by allowing the plot_state variable to take on one of 4 values: [0,1,2,3]. dec2bin will decode this input, allowing any combination of plots and display to be generated.
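
A minimal sketch of that decoding follows. Which bit controls which output is an assumption made here: the low bit is taken to mean plots, so the existing values 0 and 1 keep their current meaning, and the high bit is taken to mean the stats display. The names show_plots and show_stats are hypothetical.

```matlab
plot_state = 2;                          % example value from [0,1,2,3]

% dec2bin(plot_state, 2) gives a 2-character string such as '10';
% comparing with '1' turns it into a logical vector.
bits = dec2bin(plot_state, 2) == '1';

show_stats = bits(1);                    % high bit (assumed meaning)
show_plots = bits(2);                    % low bit (assumed meaning)

if show_plots
    disp('plots would be drawn here');
end
if show_stats
    disp('stats would be displayed here');
end
```

With plot_state = 2 only the stats display is enabled; 3 enables both and 0 enables neither.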
