Feature Selection Based on Interaction Information

Version 1.0 (23.8 KB)

Self-contained package for feature selection based on mutual information/interaction information.

4.66667
3 Ratings

17 Downloads

This is a self-contained package for running feature selection filters: given a (usually large) number of noisy and partly redundant variables and a target, choose a small but indicative subset as input to a classification or regression technique.
For background information, see e.g. Gavin Brown, 'A New Perspective for Information Theoretic Feature Selection', Artificial Intelligence and Statistics, 2009.

The MATLAB function select_features.m includes several previously published methods as special cases, such as FOU, MRMR, MIFS-U, JMI, and CMIM. It allows for higher-order interaction terms, forward and backward search, priors, several redundancy weighting options, and pessimistic estimates.
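To make the filter idea concrete, here is a minimal sketch of a greedy forward search with an MRMR-style score (relevance minus mean redundancy). It is not the code in select_features.m and ignores the package's options; the toy data and the names joint, mi, rel, and selected are assumptions made for illustration.

```matlab
% Toy data: rows are samples, columns are discrete features with values 1..2.
X = [1 1 2; 1 2 1; 2 1 2; 2 2 1; 1 1 1; 2 2 2];
y = [1; 1; 2; 2; 1; 2];          % discrete target (here a class label)
n_select = 2;                    % number of features to keep

% Joint probability table of two discrete column vectors (hypothetical helper).
joint = @(a, b) accumarray([a b], 1) / numel(a);
% Mutual information in bits; max(., realmin) guards against log2(0).
mi = @(p) sum(sum(p .* log2(max(p, realmin) ./ max(sum(p, 2) * sum(p, 1), realmin))));

n_feat = size(X, 2);
rel = zeros(1, n_feat);
for i = 1:n_feat
    rel(i) = mi(joint(X(:, i), y));          % relevance I(X_i; Y)
end

selected = [];
for step = 1:n_select
    score = -inf(1, n_feat);                 % already selected features stay at -inf
    for i = setdiff(1:n_feat, selected)
        red = 0;
        for j = selected
            red = red + mi(joint(X(:, i), X(:, j)));   % redundancy I(X_i; X_j)
        end
        if ~isempty(selected)
            red = red / numel(selected);     % mean redundancy over selected set
        end
        score(i) = rel(i) - red;
    end
    [~, best] = max(score);                  % ties go to the lowest index
    selected(end + 1) = best;
end
disp(selected)
```

In this toy data the first column equals the target, so it has the highest relevance and is picked first. Each later candidate is then penalized by how much information it shares with the features already chosen.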

Auxiliary functions for discretization, construction and marginalization of probability tables, conditional entropy, mutual information and interaction information are included and are usable by themselves. See demo_feature_select.m for examples.
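As a rough illustration of what marginalization and mutual information on a probability table involve, the sketch below computes I(X;Y) from a small 2-D joint table using only core MATLAB. It does not call the package's own functions, and the table values and variable names are made up for the example.

```matlab
% Joint probability table P(X,Y): rows index X, columns index Y.
p_xy = [0.4 0.1; 0.1 0.4];

% Marginalize by summing out the other dimension.
p_x = sum(p_xy, 2);              % P(X), column vector
p_y = sum(p_xy, 1);              % P(Y), row vector

% Product of marginals: the joint table X and Y would have if independent.
p_ind = p_x * p_y;

% Mutual information in bits, skipping zero cells to avoid log2(0).
nz = p_xy > 0;
mi = sum(p_xy(nz) .* log2(p_xy(nz) ./ p_ind(nz)));

% Conditional entropy H(Y|X) = H(Y) - I(X;Y).
h_y = -sum(p_y(p_y > 0) .* log2(p_y(p_y > 0)));
h_y_given_x = h_y - mi;
fprintf('I(X;Y) = %.4f bits, H(Y|X) = %.4f bits\n', mi, h_y_given_x);
```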

Comments and Ratings (12)

Attempt to execute SCRIPT varargin as a function:
C:\Program Files\MATLAB\R2015b\toolbox\matlab\lang\varargin.m

Shamnad

If you are getting
Error in marginal (line 72)
mrg = repmat(mrg,missing_dim);
then transpose the missing_dim variable.
This should help:
mrg = repmat(mrg,missing_dim');
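A minimal sketch of why the transpose helps, assuming (as the error text reported on this page suggests) that missing_dim reaches repmat as a column vector. The values below are made up and are not taken from marginal.m. Using (:).' forces a row vector whatever the original orientation, which is slightly more defensive than a plain transpose.

```matlab
mrg = [0.25; 0.75];            % example 2-by-1 marginal (illustrative values)
missing_dim = [1; 3];          % replication factors stored as a column vector

% repmat(mrg, missing_dim) can fail here with
% "Replication factors must be a row vector of integers or integer scalars."
rep = missing_dim(:).';        % always a 1-by-n row vector
mrg_rep = repmat(mrg, rep);    % replicate 1x along rows, 3x along columns

disp(size(mrg_rep))
```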

shahla

Hi,
I would appreciate it if you could let me know how to fix this error:

>> demo_feature_select
using settings: degree 2, pessimistic 0, dir_fwd 1, dir_bwd 0, prior 0.000000, red_wt 1.000000, cond_red_wt 1.000000
Error using repmat
Replication factors must be a row vector of integers or integer scalars.

Error in marginal (line 72)
mrg = repmat(mrg,missing_dim);

Error in mutual_info (line 108)
mrg_xz = marginal(prob_smooth, ~dim2); % P(X,Z)

Error in select_features (line 240)
rel(i) = mutual_info(p,'prior',prior_wt);

Error in demo_feature_select (line 52)
[steps,sel_flag,rel,red,cond_red] = select_features(data_quant(:,1:2),data_quant(:,3),2);

Thanks.

Rok Martincic

Hello!

I'm also having some trouble with the quantize function. When I run the demo provided with the toolbox, I get an error:
??? Error using ==> quantize
MEX level2 S-function "quantize" must be called with at least 4 right hand arguments
If I open the quantize.m function, I can see that the possible input arguments are t, x, u, flag and q...but there is no explanation of these arguments.
Can somebody explain them to me please? And also, which of them are necessary for the function to work...because there are 5, and the error says there should be at least 4.

Hey a great script!! But sometimes I get this error:

??? Error using ==> marginal at 37
length of dimension vector incompatible with size of probability table

Error in ==> select_features>update_comb at 491
p_xy = marginal(p_zxy, ~ofn(1, deg+1), 1); % all variables except target

Error in ==> select_features>update_degree at 457
use_full = update_comb(var_update, var_sel, var_sgn, [], deg);

Error in ==> select_features at 382
needs_full = update_degree(i, sel, sgn, flag_sel, d);

Error in ==> predict_club2 at 96
test=select_features(traindata, trainlabels,feature_nr);

june

function [varargout] = select_features(features, target, max_iter, varargin)
What does target stand for? A feature, or the class?

asma

The algorithms are very slow. Is there a way to improve the speed?

I have a feature space of 72 columns and I need to reduce it to 15 or 20.

I see you're right. I was actually hoping to build something that didn't depend on the statistics toolbox, so I guess this isn't it. Thanks for keeping the code itself quite clear though; it's a good place for me to start to build what I need.

Stefan Schroedl

The quantile function is defined in the Statistics Toolbox; maybe you don't have it installed?
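For readers without the Statistics Toolbox, one possible workaround is equal-frequency binning built on sort alone. This is a sketch under stated assumptions, not the package's quantize implementation: the variable names are hypothetical, tied values are split arbitrarily, and the bin edges may differ from those that quantile would produce.

```matlab
x = [0.9; 0.1; 0.5; 0.3; 0.7; 0.2];     % example data, column vector
n_levels = 2;                            % desired number of bins (assumed parameter)

[~, order] = sort(x);                    % indices of x in ascending order
ranks = zeros(size(x));
ranks(order) = (1:numel(x))';            % rank 1 = smallest value

% Map ranks to bin indices 1..n_levels with (nearly) equal counts per bin.
xq = ceil(ranks * n_levels / numel(x));

disp([x xq])
```

Here the three smallest values land in bin 1 and the three largest in bin 2.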

Hopester Hope

Perhaps I'm being dense, but I'm getting errors whenever I use these functions. For example, the script:

X=rand(10,1);XQ=quantize(X,'levels',2);[X XQ]

throws the error:

??? Undefined function or method 'quantile' for input arguments of type 'double'.

The script comes directly from your help page, so I guess it should work... Any ideas?

MATLAB Release
MATLAB 7.8 (R2009a)
