updated 18 days ago

DM Utils (data mining utils) by Przemyslaw

Tools for working with distance matrices and improving data mining capabilities (data mining, shared memory, parallel computing)

P=pair_dist_par(X,fun,varargin)

P=pair_dist_seq(X,fun,varargin)

[Phdl]=pair_dist_spmd(X,fun,varargin)

updated 2 months ago

A Simple Fuzzy Classifier based on Inconsistency Analysis of Labeled Data by Janos Abonyi

Obtaining accurate but also interpretable fuzzy rule-based classifiers from labelled observation data (coil challenge, data mining, direct mail)

freqan.m

updated 8 months ago

Entropy triangle by Francisco José Valverde-Albacete

A set of visual entropy-based tools to assess the performance of multiclass classifiers. (machine learning, data mining, classifier assessment)

examplePrincipe

[H_Pxy, H_Px,H_Py, EMI_Pxy, I_Pxy, I_Px, I_Py, M_Pxy]=ent...

[cen,mcc]=cen_mcc(Am)

updated 11 months ago

White Reality Check by Arnout Tilgenkamp

Halbert White's reality check for data snooping, using the stationary bootstrap of Politis & Romano (white, reality check, datasnooping)

PolitisRomanoBootstrap( alternative , n , blockparam, dis...

WhiteRealityCheck( alternative , flag, benchmark , n , di...

updated 12 months ago

Parallel Coordinate Plots GUI toolbox by Barnaby

Visualize and manipulate parallel coordinate plots with this GUI. UNIVERSITY OF BRISTOL (parallel coordinate p..., parallel coordinate p..., parallel coordinate)

AN=scaletominmax(A)

Filter(varargin)

updated 1 year ago

Boosted Binary Regression Trees by Kota Hara

Boosted Binary Regression Trees is a powerful regression method which can handle vector targets. (statistics, data mining, regression)

brtTest( input, brtModel, varargin )

brtTrain( X, T, leafNum, treeNum, nu )

regressionTreeTest( input, Tree )

updated 1 year ago

Association Rules by Narine Manukyan

This function discovers association rules using the Apriori algorithm. (apriori, association analysis, association rules)

demoAssociationAnalysis( )

findRules(transactions, minSup, minConf, nRules, sortFlag...

updated 1 year ago

Gaussian Mixture Model by Ravi Shankar

This code estimates the parameters of a Gaussian mixture model using the Expectation-Maximization algorithm. (data mining, data import, data export)

GMmodel(x,no_gaus_distr)

kmclust(x,no_clust)

updated 1 year ago

Maximum(minimum) Weight Spanning Tree ( Directed ) by Guangdi Li

Implements the Chu-Liu/Edmonds algorithm for learning a directed maximum spanning tree. (spanning tree, data mining, learning algorithm)

DirectedMaximumSpanningTree( OriginalCostMatric,Root )

DirectedMinimalSpanningTree( OriginalCostMatric,Root )

MaximalDirectedMSF( CostMatric )

updated 1 year ago

K2 algorithm for learning DAG structure in Bayesian network by Guangdi Li

An implementation of Cooper's K2 algorithm, proposed in 1992, that is quick and convenient to use. (dag, bayesian network, data mining)

ConstructLGObj( OriginalSample )

GClosedFun( LGObj, X, PAX )

k2( LGObj,Order,u )

updated 1 year ago

Preprocessing dataset in IDS by Maher Salem

This code converts nominal features to numeric ones and then normalizes the whole dataset using min-max scaling (preprocessing, normalization, kdd)

KDDPreprocessor.m
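
The two steps in the description above (nominal-to-numeric conversion, then min-max normalization) can be sketched as follows. This is a minimal Python illustration, not the code in KDDPreprocessor.m. The function names, the integer coding by order of first appearance, and the choice to map constant columns to 0.0 are all assumptions made for the example.

```python
def encode_nominal(column):
    """Map each distinct nominal value to an integer code (order of first appearance).

    Assumption: the real preprocessor may use a different coding scheme.
    """
    codes = {}
    return [codes.setdefault(value, len(codes)) for value in column]


def min_max_scale(column):
    """Rescale a numeric column linearly so its values lie in [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Constant column: no spread to rescale (assumed convention: all zeros).
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def preprocess(rows):
    """Encode string-valued columns as integers, then min-max scale every column."""
    columns = list(zip(*rows))
    scaled_columns = []
    for col in columns:
        if any(isinstance(v, str) for v in col):
            col = encode_nominal(col)
        scaled_columns.append(min_max_scale(col))
    # Transpose back to row-major layout.
    return [list(row) for row in zip(*scaled_columns)]


# Hypothetical records: (protocol, duration, rate)
rows = [("tcp", 10, 0.5), ("udp", 20, 1.5), ("icmp", 30, 2.5)]
scaled = preprocess(rows)
```

Integer coding imposes an arbitrary order on nominal values, so a one-hot encoding may be preferable when the downstream method is sensitive to that order.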

updated 1 year ago

Improved Distance Evaluation (IDE) by Behrad Bagheri

This method can be used for feature selection in classification problems. (feature selection, signal processing, classification)

[scores indices selected duration]=IDE2(data,classes,thre...

updated 1 year ago

k-means++ by Laurent S

Cluster multivariate data using the k-means++ algorithm. (data mining, clustering, kmeans)

kmeans(X,k)
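
What distinguishes k-means++ from plain k-means is its seeding step: each new initial center is drawn with probability proportional to its squared distance from the nearest center already chosen. The sketch below shows only that seeding step in Python. It is not the submission's `kmeans(X,k)` code, and the function names and the `seed` parameter are assumptions for illustration. It assumes the data contains at least `k` distinct points.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeanspp_seeds(points, k, seed=0):
    """Pick k initial centers using k-means++ style D^2 weighting."""
    rng = random.Random(seed)
    # First center: chosen uniformly at random.
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Squared distance from each point to its nearest chosen center.
        d2 = [min(squared_distance(p, c) for c in centers) for p in points]
        # Sample the next center with probability proportional to d2.
        # Points already chosen have weight 0 and are not selected again.
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers


# Hypothetical data: three well-separated pairs of points.
points = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0), (20, 1)]
seeds = kmeanspp_seeds(points, 3, seed=1)
```

The returned seeds would then be passed to the usual assign-and-update iterations of k-means.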

updated almost 2 years ago

Mean shift for finding modes by Soumitry J Ray

The submission finds modes in data. Data is generated from a mixture of Gaussians with added outliers (computer vision, data mining, find modes)

findModes(points, n_seeds, dTol)

generateData(nDims,nPts,nClusters)

plotData(data, labels, data_mean)

updated 2 years ago

Simple Econometrics and Computational Finance Laboratory Toolbox by Leo Chen

Simple Econometrics and Computational Finance Laboratory Toolbox for MATLAB 7.x (econometrics, computational finance, computational intelli...)

SECF__assess_R2( y_test, y_calculation )

SECF__assess_RMS( xi )

SECF__assess_mAP(data_matrix)

updated 2 years ago

Decision Trees and Predictive Models with cross-validation and ROC analysis plot by Andrea Padoan

This code implements a classification tree and plots the ROC curves for each target class (data mining, statistics, decision tree)

BestTree (t, X, y, nsamples)

CalculateFeaturesImportance.m

CalculateOutcomeGroups (y)

updated 2 years ago

RUSBoost by Barnan Das

RUSBoost is a boosting-based sampling algorithm that handles class imbalance in class-labeled data. (class imbalance, machine learning, data mining)

CSVtoARFF (data, relation, type)

ClassifierPredict(data,model)

ClassifierTrain(data,type)

updated 2 years ago

SMOTEBoost by Barnan Das

An implementation of the SMOTEBoost algorithm for handling the class imbalance problem in data. (class imbalance, machine learning, data mining)

CSVtoARFF (data, relation, type)

ClassifierPredict(data,model)

ClassifierTrain(data,type)

updated 2 years ago

k-means intra cluster measure by ps

MATLAB code for cluster validation in k-means using the intra-cluster measure (data mining)

intra.m

updated almost 3 years ago

Similarity classifier by Pasi Luukka

A classifier based on a similarity measure. (biotech, image processing, data exploration)

[Classification_accuracy,p1,m1,classes]=simclass(datalear...

[Mean_accuracy, Variance,p1,m1]=simclass2(data,v,c, measu...

calcfitness(data, ideals, y, w)

updated almost 3 years ago

Parallel Distributed Processing of Weka Algorithms in Matlab by Jaspar Cahill

Run Weka algorithms in parallel across distributed computers to exploit available hardware. (weka, parallel, parallel processing)

datasetToWeka(dat)

getNonMatlabPaths()

runParallelWeka

updated almost 3 years ago

CLUSTERING THROUGH OPTIMAL BAYESIAN CLASSIFICATION by Lionel

The package contains functions for performing soft clustering. (statistics, clustering, clusters)

Dis=contrastEM(x,centroids,py_x)

[npy_x,npy,idxClus]=distinctW(w,py_x,py)

[py_x,centroids,clustind,complexity]=EM(x,beta,varargin)

updated almost 3 years ago

The Adjusted Mutual Information by Xuan Vinh Nguyen

The Adjusted Mutual Information for clustering comparison (clustering, data mining, machine learning)

[AMI_]=AMI(true_mem,mem)

updated almost 3 years ago

The minCEntropy algorithm for alternative clustering by Xuan Vinh Nguyen

The minCEntropy algorithm for alternative clustering (clustering, alternative clusterin..., data mining)

Cont=Contingency(Mem1,Mem2)

MI(label, result)

[max_mem,max_obj,all_mem,S]=minCEntropy(a,K,sigma_factor,...

updated almost 3 years ago

The Spherical K-means algorithm by Xuan Vinh Nguyen

Clustering on the hypersphere with the Spherical K-means algorithm (kmeans, spherical kmeans, clustering)

[best_x,best_f,best_membership,empty,loop]=SPKmeans(a,K,n...

b=normalize_norm(a)

b=normalize_norm_mean(a)

updated almost 3 years ago

Fisher's exact test with n x m contingency table by Guangdi Li

Do you have a problem with Fisher's exact test when the contingency table is larger than 2×2? (statistics test, mathematics, category)

FisherExactTest( XVector,YVector )

FisherExactTest_R( X,Y )

ControlCentor.m

updated 3 years ago

Software implementations of DC by qi zhang

A stable and effective algorithm with a speed advantage over HAC (data competition, clustering algorithm, data mining)

[new_center,cl]=update_centers(SD,II,E,K,num,center)

evalf(trueclass,cl,x,sfct)

similarity_euclid(data)

updated 3 years ago

Software implementations of PBKM by qi zhang

Its speed and performance advantages appear when the number of clusters is large (clustering, large number of clust..., clustering algorithm)

IDXA=cut(n,IDX);

IDXA=divKMS(x,K);

T=cengci(x,K);

updated 3 years ago

MultiClass LDA by Darko Juric

Performs multiclass linear discriminant analysis. (classifier, data mining, multiclasslda)

LDA

LDA_Demo.m

updated 3 years ago

Simulation of Ant Based Clustering Algorithm Based on Cemetery Organization (Lumer&Faeita Method) by Heidar Rastiveis

This code shows the process of the Lumer and Faieta algorithm in data clustering. (data mining, data clustering, swarm intelligence)

AntBasedClustering(varargin)

updated 3 years ago

Pegasos - Primal Estimated sub-Gradient SOlver for SVM by nitin thokare

This is an implementation of the "Pegasos-Primal Estimated sub-Gradient SOlver for SVM" paper. (pegasos, primal estimated subg..., svm)

pegasos(X,Y,lamda,k,maxIter,Tolerance)

updated 3 years ago

Frequent Itemset Searching in Data Mining by nitin thokare

This code searches for frequent itemsets in a given database (data mining, frequent itemset sear..., machine learning)

AFreq1=GetFreq1(A,t)

AFreq2=GetFreq2(A,t,AFreq1)

AFreqk=GetFreqk(A,t,AFreq1,AFreqk0)
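
The three functions listed above suggest a level-wise search: frequent single items first, then pairs, then larger sets built from the previous level. A minimal Python sketch of that Apriori-style idea follows. It is not the submission's code. The function name, the use of an absolute count threshold `min_count`, and the sample baskets are assumptions for illustration, and the meaning of the submission's own `t` argument is not stated in the listing.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_count):
    """Return {itemset: support count} for all itemsets seen in >= min_count transactions."""
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        # Number of transactions that contain every item of the itemset.
        return sum(1 for t in transactions if itemset <= t)

    items = {item for t in transactions for item in t}
    # Level 1: frequent single items.
    level = {frozenset([i]) for i in items if support(frozenset([i])) >= min_count}
    result = {}
    k = 1
    while level:
        for itemset in level:
            result[itemset] = support(itemset)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset must itself be frequent.
        level = {
            c for c in candidates
            if all(frozenset(sub) in result for sub in combinations(c, k - 1))
            and support(c) >= min_count
        }
    return result


# Hypothetical market-basket data.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]
found = frequent_itemsets(baskets, min_count=3)
```

With these baskets and a threshold of 3, all four single items are frequent and `{bread, milk}` is the only frequent pair.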

updated 3 years ago

ARMADA Data Mining Tool version 1.4 by James Malone

An association rule data mining tool for experimentation and analysis. (statistics, probability, data mining)

alterConfBox()

alterRateMenu()

builderAddToRule

updated 3 years ago

Discretization algorithms: Class-Attribute Contingency Coefficient by Guangdi Li

CACC is a promising scheme, proposed in 2008, for discretizing continuous data (discretization, classification, data mining)

CACC_Discretization( OriginalData, C )

ControlCenter.m

updated almost 4 years ago

LDA: Linear Discriminant Analysis by Will Dwinnell

Performs linear discriminant analysis. (statistics, lda, linear discriminant)

LDA(Input,Target,Priors)

updated almost 4 years ago

Mahalanobis Distance by Kardi Teknomo

Returns the Mahalanobis distance between two data matrices A and B (row = object, column = feature) (mahalanobis distance, statistics, data mining)

d=MahalanobisDistance(A, B)

updated almost 4 years ago

Covariance matrix by Kardi Teknomo

Returns the covariance matrix for a given data matrix X (row = object, column = feature) (statistics, distance, data mining)

C=Covariance(X)
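
For the row = object, column = feature layout described above, a covariance matrix can be computed by centering each column and averaging the products of the centered values. The Python sketch below illustrates this. It is not the submission's `Covariance(X)` code, and dividing by n - 1 (the sample covariance) is an assumption, since the listing does not say which normalization the submission uses.

```python
def covariance_matrix(X):
    """Sample covariance of a data matrix given as a list of rows (row = object)."""
    n = len(X)       # number of objects
    d = len(X[0])    # number of features
    means = [sum(row[j] for row in X) / n for j in range(d)]
    # Subtract the column mean from every entry.
    centered = [[row[j] - means[j] for j in range(d)] for row in X]
    # Entry (i, j) is the average product of centered features i and j.
    # Assumption: normalize by n - 1 rather than n.
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


# Hypothetical data: the second feature is exactly twice the first.
X = [[1, 2], [3, 6], [5, 10]]
C = covariance_matrix(X)
```

Here the second feature has four times the variance of the first, and the off-diagonal entries equal twice the first feature's variance.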

updated almost 4 years ago

CAIM Discretization Algorithm by Guangdi Li

CAIM (class-attribute interdependence maximization) is designed to discretize continuous data. (discretization, classification, data mining)

CAIM_Discretization( OriginalData, C )

CAIM_Evaluation( OriginalData, C, Feature, DiscretInterva...

DiscretWithInterval( OriginalData,C,Column,DiscretInterva...

updated 4 years ago

Rosenblatt's Perceptron by Ibraheem Al-Dhamari

A very simple neural network example. (neural network, perceptron, classification)

MyPerecptronExample

[w,b,pass]=PerecptronTrn(x,y);

e=PerecptronTst(x,y,w,b);
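
The train and test signatures above follow the usual perceptron pattern: learn a weight vector `w` and bias `b`, then count errors on labelled data. A minimal Python sketch of Rosenblatt's update rule is given below. It is not the submission's code. The function names, the `max_passes` cap, and the use of labels in {-1, +1} are assumptions for illustration, and the third return value is only guessed to correspond to `pass` in the listed signature.

```python
def perceptron_train(X, y, max_passes=100):
    """Train a perceptron; labels must be -1 or +1. Returns (w, b, passes_used)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for passes in range(1, max_passes + 1):
        errors = 0
        for xi, target in zip(X, y):
            activation = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if target * activation <= 0:
                # Misclassified (or on the boundary): move the boundary toward xi.
                w = [wj + target * xj for wj, xj in zip(w, xi)]
                b += target
                errors += 1
        if errors == 0:
            # A full pass with no mistakes: the data is separated.
            return w, b, passes
    return w, b, max_passes


def perceptron_test(X, y, w, b):
    """Count the samples whose predicted sign disagrees with the label."""
    mistakes = 0
    for xi, target in zip(X, y):
        activation = sum(wj * xj for wj, xj in zip(w, xi)) + b
        predicted = 1 if activation > 0 else -1
        mistakes += predicted != target
    return mistakes


# Logical AND, which is linearly separable.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [-1, -1, -1, 1]
w, b, passes = perceptron_train(X, y)
errors = perceptron_test(X, y, w, b)
```

The update rule is guaranteed to stop making mistakes only when the classes are linearly separable. Otherwise training ends at the `max_passes` cap.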

updated 4 years ago

Frequency Counter by Ibraheem Al-Dhamari

Returns the array elements and the frequency of each element in the array. (frequency histogram a..., repeated elements, repeat)

F=frq(A)
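
The behaviour described above (each distinct element paired with how often it occurs) can be sketched in a few lines of Python. This is an illustration, not the submission's `frq(A)` code. The function name and the choice to return the pairs sorted by element are assumptions.

```python
from collections import Counter


def frequency_table(values):
    """Return (element, count) pairs for each distinct element, sorted by element."""
    counts = Counter(values)
    return [(value, counts[value]) for value in sorted(counts)]


# Hypothetical input array.
table = frequency_table([3, 1, 3, 2, 1, 3])
# table == [(1, 2), (2, 1), (3, 3)]
```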

updated 4 years ago

DeltaRule by Will Dwinnell

Trains a single artificial neuron using the delta rule. (neural, neural net, neural network)

DeltaRule(X,Y,LearningRate,MinimumWeightChange,MaximumPas...

Logistic(Input)

updated 5 years ago

Maximum Weight Spanning tree (Undirected) by Guangdi Li

Implements the Chu-Liu-Edmonds algorithm for learning an undirected maximum weight spanning tree. (spanning tree, data mining, graph theory)

UndirectedMaximumSpanningTree (CostMatrix)

ControlCentor.m

updated 7 years ago

GRABIT by Jiro Doke

Extract (pick out) data points from image files. (data exploration, data, extract)

grabit.m
