Its simply a feature extraction code using the Wavelet Packet Transform (WPT).
The code represents a generalization of the Multisignal 1D wavelet decomposition. I have written a simple routine to extract the full tree out of the Matlab "mdwtdec" function available within the wavelet toolbox.
The code is efficient and very simple, and is mainly written for feature extraction from the WPT tree using an overlapping windowing approach.
To use the code: the main function is getmswpfeat, read inside the code for an example.
** Kindly cite either of the following papers if you use this code **
References:
[1] R. N. Khushaba, A. AlJumaily, and A. AlAni, “Novel Feature Extraction Method based on Fuzzy Entropy and Wavelet Packet Transform for Myoelectric Control”, 7th International Symposium on Communications and Information Technologies ISCIT2007, Sydney, Australia, pp. 352 – 357.
[2] R. N. Khushaba, S. Kodagoa, S. Lal, and G. Dissanayake, “Driver Drowsiness Classification Using Fuzzy Wavelet Packet Based Feature Extraction Algorithm”, IEEE Transaction on Biomedical Engineering, vol. 58, no. 1, pp. 121131, 2011.
Notes:
You can use the code on one or multiple signals, these are given as columns of variable x.
 You can use the matlab wavelet toolbox or the wmtsa wavelet toolbox http://www.atmos.washington.edu/~wmtsa/
Tutorial available within submission.
1.3  http Link fixed 

1.3  Link to wmtsa wavelet toolbox updated 

1.3  Tutorial: typos fixed. 

1.2  Code organized and added support for the wmtsa wavelet toolbox (in case you don't have the matlab wavelet toolbox). 

1.1  A newer version added for feature extraction from multiple signals at the same time (if you have enough memory ;) ). 
Inspired: Combining Left and Right Palmprint Images for More Accurate Personal Identiﬁcation
neetu singh (view profile)
Dear mam , I am working on a project " emotion recognition through EEG signals "can i use this code for feature extraction from EEG. and how do i use it. I m beginner in MATLAb so i want guidance from u.
Didik Purwanto (view profile)
ahmed silik (view profile)
very good
savita sharma (view profile)
Rance Tino (view profile)
Hi Dr Rami, for the function getmswpfeat.m, I keep getting this error:
"Maximum variable size allowed by the program is exceeded.
Error in getmswpfeat (line 78)
Features = zeros(numwin,nfCh*size(x,2));"
I'm using the same input you have:
x = rand(300,1);
feat = getmswpfeat(x,256,32,256,'matlab')
Any help would be awesome. Thank you!
Rance Tino (view profile)
MD MAHABUB HOSSAIN (view profile)
Hi Sir,
I run the code but i got zero features. Can you please suggest me how to solve it?
kind regards
Hossain
yas yasmini (view profile)
Hi and thanks for useful code
could you please describe me how can I discover true size of "winsize" and "wininc" for P300 detection in EEG signals?
thanks in advance
Saiful Musa (view profile)
assalamualaikum.. sir, the code is running well for me.. but i just want to ask how to get 1 value every feature extraction? in trial i get 25 row every feature.. thanks u
Sanjay (view profile)
Sir, can you tell me how to extract features(like energy) from an mri image using the same code?
abinash kumar (view profile)
sir, i have eeg signal of set A having (4097*100). the signal is filtered by lpf with cut off frequency of 64HZ. after that i use wavelet decomposition on it and reconstruct the signal. after reconstructing, fft is used to plot the spectrum. now, i want to extract the feature i.e mean absolute deviation of rectified signal and frequency present. but i am not able to extract the feature. please help me
Rami Khushaba (view profile)
Winsize and wininc are parameters that you need to set yourself depending on your own application.
*** Please do spend some time reading the document included in this submission as the answers to all of your questions are there.
Tung Nguyen (view profile)
Thank you
I used Olimex EEGSMT with only 2 nodes(chanel) to get signal. The device samples the analog signals and builds a 17byte packet which is transmitted at 256Hz,
using 1 start bit, 8 data bits, 1 stop bit, no parity, 57600 bits per second.
Minimial transmission speed is 256Hz * sizeof(modeeg_packet) * 10 = 43520 bps.
X is raw signal, winsize = 256, right?
And what is wininc?
Thank for your help ^^
Rami Khushaba (view profile)
The input is a one dimensional signal (or multidimensional one, i.e., many signals), simply any type of signals.
As an example, think of this as your EMG signal from the muscles on your forearm or even EEG signals from many sensors placed on the brain.
If you spend some time looking inside the code, there is an example, with x being the input signal from which you are trying to extract features.
Tung Nguyen (view profile)
Can you tell me what is input of "getmswpfeat" ?
getmswpfeatV00(x,winsize,wininc,SF,'toolbox')
What is x?
Thank you
Vasumathi Ganesh (view profile)
Dear mam,
Which .m file can i run among these Three files u have given.
Rami Khushaba (view profile)
Guys, perform this test to make sure yourself that this is the WPT
% define some random signal
X = rand(32,1);
% perform the WPT decomposition using the matlab wpdec command
T = wpdec(X,3,'db4')
% Now decompose the same example using the proposed mswpd here in this post.
D = mswpd('col',X,'db4',3);
% Now look at the resulting difference between coefficients from both techniques at any node
sum(abs(wpcoef(T,[1 1])  D{2,2}))
Hope this clarifies things.
Asyur Zaldi (view profile)
Asyur Zaldi (view profile)
Dear Dr Rami, thank you for the answer
Yes I like to get the result a vector instead of matrix.
fea=getmswpfeat(s,isize,iinc,12,'matlab'), where isize = iinc = signal size and same with iinc. The function return me one single value. Am I miss something?
thanks in advance
Rami Khushaba (view profile)
@Asyur: simply, the answer is yes, you can use the code on one or multiple signals. If you are interested in getting one vector output you can easily set the window size and window increment both equal to the input vector length, and in this way you get only one vector at the output. Alternatively, if your window size is smaller than the input vector length, you will end up with multiple vectors at the output. There is nothing to prevent you from averaging, in my own opinion, but you might as well just use the above mentioned window size and window increment to get one vector without the need for further averaging.
Rami Khushaba (view profile)
@utkatsh: the wavelet packet transform is a generalisation of the wavelet transform, where instead of applying the wavelet decomposition on the approximation coefficients only (as in WT), you also apply the wavelet decomposition on the details coefficients as well in the WPT. Thus, the baseline technique is the same, and depending on how you apply it then you get the WT and WPT.
utkarsh (view profile)
Dear Sir,
What exactly have you used in this code? WPT or DWT for decomposition. I am confused. description says WPT but in the code you are using 'mdwtdec'.
dec = mdwtdec(dirDec,T,1,h);
Asyur Zaldi (view profile)
Dear Dr Rami, I just come across to this session and seems like I got what I'm looking for.
Since this is a multisignal, can i use it for a one single vector(column)? and I try it it return some column depend on the parameter sent, and the result return in matrix form and I average to become a single vector. the step i mention is it ok? base on the code you posted in here. thanks in advance
mira (view profile)
Dear Dr,
Is the feature in this coding is the energy?
hossein (view profile)
Dear Dr
can we use these features in clustering?
is it possible, for example, how can we do with kmeans?
Ahmed (view profile)
Dear Dr.Rami
Thanks for you support; Currently I am working on the same point of your paper"Driver Drowsiness Classification"; I hope you can guide me to database for EEG data for this issue.Thanks inadvance
Rami Khushaba (view profile)
Code updated, a small tutorial and paper added and support for the wmtsa wavelet toolbox also added (in case you don't have the matlab wavelet toolbox).
Please rate this submission if you learned something here or you found it useful.
behailu regasa (view profile)
Dear Dr.,
Thank you for your immediate response. I expect your answer with the supporting document. would you share to me the second reference or how could i get it freely?
Rami Khushaba (view profile)
Dear Behailu Regasa
Thanks for your inquiry, I am planing an update to this code within few days, I will also put a supporting document to help understand the process.
Thanks
Rami
behailu regasa (view profile)
Dear DR. Rami,
i could not understand how could i extract features from multiple signal. That is from hundred 1D signal like EEG and use them to train neural network. would you help me. I thank you in advance.
Rami Khushaba (view profile)
Dear Deepak
Thank you for your inquiry, you are actually right about SF. However, please note the comment in the code, that is you can use the sampling frequency to determine the actual number of decomposition levels J, I remember having a comment in the code about that in the same line where I defined J=7 (comment says you can use J=(log(SF/2)/log(2))1;). I fixed the decomposition levels in the code sample you mentioned to make it easy for new users to understand it, but you can change that by using the commented equation.
Rami
Deepak (view profile)
Dear Dr. Rami, thank you very much for providing such an useful code. I could not understand the use of sampling frequency (SF)in the getmswpfeat. It asks SF as an argument but does not use it anywhere in the code. Please correct me if I am wrong.
Madi (view profile)
Thank you for the comment.. I just concern It should be same answer for 'feat' at Approximate coefficient.. From the example you given, YES the answer are same.. when I go through the code (getmswtfeat.m) It is look like different in how to determine the last answer.. for code (getmswpfeatV00.m).. I can understand the flow.. but for the 'getmswtfeat.m', I'm confused for term 'percentENER(notZER,:) = ...', '[cfs,longs] = wdec2cl(dec,'all')',.. and I hope the final answer for 'tab_ENER' should be same as WP code... sorry to many question...
Rami Khushaba (view profile)
Here is a quick proof
Signal = rand(256,1);
J=2;
D = mswpd('col',Signal,'db4',J);
EnergyWPT = sum(D{3,1}.*D{3,1});
dec = mdwtdec('col',Signal,J,'db4');
EnergyWT = sum(dec.ca .* dec.ca);
disp(sprintf('Energy of WT = %f, \t Energy of WPT = %f',EnergyWT,EnergyWPT))
Rami Khushaba (view profile)
Madi, Please check what you did again.
Both of my codes getmswtfeat and getmswpfeatV00 are based on the same function "mdwtdec" so they cant get you different result if you use the exact same settings in both files, which is not the case here because I didn't post this to make them an exact replica of each other, though this is a good point for an update here from my side.
To prove the above to you:
1) make sure you use the same decomposition level J (its 7 in getmswtfeat and 10 in getmswpfeatV00 by default).
2) make sure you use the same wavelet family (its 'db4' in one code and 'Sym8' in the other).
3) Remove the different normalization utilized within both codes (you need to modify both codes in some sections).
Based on the above, it appears to me that its not just changing the log1p that would give you the same result.
I am planning an update to both codes and will unify both codes in terms of every single parameter.
Thanks for your concern and Good Luck :).
Rami
Madi (view profile)
Thank You Dr Rami,
I tried the code (getmswtfeat.m (Your WT code)) and try to remove and maintain 'log1p'.. the result for 'energy' approximate coefficient are different when using the code (getmswpfeatV00.m)..
Rami Khushaba (view profile)
Correction, for the second point I meant to say its for normalization (I believe I took out the sum up to 1 part).
Rami Khushaba (view profile)
Hi Madi
For the first point, Yes it should be the same, but I never tried doing that (just check the code in the feature extraction step). Plus, youalready said that you compared with dwt and you got the same answer when commenting the few lines you mentioned above. So, yes it should be fine.
For the second point, the lines you mentioned are for normalization, use it if you want the total energy at each level to sum up to 1. Of course, when you comment these lines you get similar result to dwt and if you use these lines its like you are normalizing the output of WPT features but not the dwt one, so if you do the same on the other features you will end up with similar results.
Kind Regards
Rami
Madi (view profile)
Thank you for reply..I understand now..
If I run this code (getmswpfeatV00.m) and I only took the energy at approximation part (Low pass filter part).. It is the result will be same if I'm using this code (getmswtfeat.m (Your WT code))?
One more thing, actually the code below is for what? Because you said can comment it.. If I comment this code the result is different.
I test for J=1 and comment the code below and I compare with using this code ([CA,CD] = dwt(xx,'sym8');) and took the energy for CA and CD.. and the result is same.
%% You can comment this section
JJ = J;
for i=1:J1
T = (2^i):(2^i)+(2^i)1;
feat(:,T) = feat(:,T)./log(JJ);
JJ=JJ1;
end
Rami Khushaba (view profile)
HI Madi
1) The difference is in the tree structure, WT decomposes only the approximation part and (Low pass filter part) and keeps decomposing that. WPT on the other hand, decomposes both approximation and details parts (LPF and HPF). Depending on your application, either one of these could work well. Look at my thesis for a description (http://www.ramikhushaba.com/phdthesis.html)
2) The types of features is up to you to decide, you can get energy of coefficients, mean or variance or any other features. In my code above, I set it up as the log scaled energy, but you can change this of course.
Hope the above helps.
Rami
Madi (view profile)
Dear Dr Rami,
I'm new in Wavelet Feature extraction, I have two question: :)
1) just to ask what is the different Wavelet Packet Decomposition and Wavelet Transform Decomposition?
2) From you code, feat = WPT features (compete tree without node selection).. actually what is WPT features?.. It is energy based? or median or mean?
Thank You
Rami Khushaba (view profile)
Hi Ahmed
It looks like you didn't decide on the window size and increment yet, right?
Ok, here is the simplest scenario. Simplest way is to treat each speech signal independently (though this code was specifically written to help you do that on multiple signals directly). Each wave your recorded is a 1dimensional vector say x1, x2, x3, ....etc
feat1 = getmswpfeatV00(x1,winsize,wininc,SF)
feat2 = getmswpfeatV00(x2,winsize,wininc,SF)
feat3 = getmswpfeatV00(x3,winsize,wininc,SF)
SF is your sampling frequency, winsize and wininc you have to put based on your own application (cant help you with that, just try some numbers and see which one works best for you). After you get these vectors;
FeaturesW1 = [feat1;feat2;feat3];
Of course, if the vectors (x1,x2,x3) represent the same word then you should add another last column to FeaturesW1 made of all 1's.
Now if you have another x1,x2,x3 of another words, do the same and get FeaturesW2 and put the last columns as all 2's.
Now combine
Features =[Features1;Features2];
Done, you have your feature matrix with class label, do your classification now.
Regards
ahmed alhamdani (view profile)
Dear Dr. Rami
If I have n file(for example five files) for the same speech word (.wave) how can I extract feature vector for each file and put it the out put as matrix .
kind regardes
Rami Khushaba (view profile)
Yes, either
1 Reshape each image into a vector and apply this code
2 Or simply use the 2d version of the wavelet command.
Subha (view profile)
Sir is it possible to extend it to image processing also??? if so how to do that.. thanks in advance..
Rami Khushaba (view profile)
Dear Ahmed
Yes, what you said seems ok.
1 Do you need Hamming window if you used wavelet? try both cases and see the difference in accuracy.
2Try different window sizes to see which one gives you the best accuracy.
3On the other hand, LIBSVM library does actually give you a probability output so have a look at it (Google LIBSVM).
4As for the combination of SVM/HMM, I tried before combining SVM with DBN but SVM actually performed very good in our problem and there was no need for DBN after SVM, but in your application it might work better than SVM alone, just try and see.
The answer to most of your questions is that its actually problem dependent, i.e., unless you try you won't know.
Kind Regards
Rami
ahmed alhamdani (view profile)
Dear Dr. Rami Khushaba;
Thank you for answer.
Summry:
The specific for what I need help is about how I can use the output data from SVM as input to HMM. My project is about ( ISOLATED WORD speech RECOGNITION Using Hybrid SVM /HMM).
Preprocessing
The data base consists 25 speakers ,male ,female, young and child each speaker the same utter the same 20 words .
The file is saved as (*.wav);
The preprocessing includes sampling, segmentation, framing and windowing.
Sampling frequency=11025Hz
16 – Bit A/D Converter.
Framing the speech signal is blocked into frames of N samples N=256(which is equivalent to23msec) .
Overlaps between frames =128;
Using Hamming window for windowing each frame.
My question is this procedure right or not???
Processing
Feature extraction
The Discrete wavelet transform(DTW) daubechies filter and pitchdetection used In order to evaluate feature vectors . Used this feature vectors as input to Hybrid SVM/HMM for training and testing system .
I will do all this preprocessing and processing steps and I have a table of feature vector for each word.
My question is this setep procedure right or not???
Hybrid model of SVM/HMM
The hybrid model includes two parts: training and classification. Firstly , parameters of model can be obtained by training. Secondly, it can be calculated the probability estimate by Viterbi.
But the output of SVM is numerical value. For combined with HMM easily, it should transform the value into probability .
My question how can I do that when I used this feature vector as input to Hybrid SVM/HMM for training and testing system to recognize each word (so I needed algorithm and program to do that using matlab).
With Best Regards
Ahmed alhamdanil
ahmeddsa_hamdani@yahoo.com
Qasem (view profile)
Rami Khushaba (view profile)
Dear Ahmed
This function is required to decompose multiple signals together in one go using the discrete wavelet transform. Yes, you can program that yourself using any free library like WAVELAB or UVIWAVE.
Regards
Rami
ahmed amin (view profile)
Hi Dr.Rami
Thank you for your interest, what the benefit of this function?Is it possible to be programmed.
Regards
Ahmed
Rami Khushaba (view profile)
Hi Ahmed
Thanks for your inquiry, you need the wavelet toolbox to run this code, and obviously, from the error message you got, Matlab can't find this function on your PC, i.e., it means that you don't have that toolbox (or have old version of that toolbox without this function).
Regards
Rami
ahmed amin (view profile)
Hi Dr.Rami
i tray to us your code but i have this problem:
??? Undefined function or variable 'mdwtdec'.
i don't know why this problem do?
can you explane to me please.
Nabila (view profile)
Thank you very much for the update and kind reply, Dr.Rami. This is really helpful for me and everyone.
I hope I also can make great contributions in the future like you did :)
Best Regards,
Nabila
Rami Khushaba (view profile)
@Nabila, the code is now updated, please download the new version.
Regards
Rami
Rami Khushaba (view profile)
It actually depends on how you are going to process the data later on. For simplicity, I would just say put an outer for loop (for j=1:Nsignals)to repeat the process across each dimension of x (the input signal), and then use curwin =(st:en,j);
Or, simply wait for the update, I will do that myself today, then one or two days for mathworks to approve it.
Regards
Rami
Nabila (view profile)
Thank you very much for your kind explanation, Dr.Rami. This helps me a lot.
As for now I'm working with multidimensional signal and I changed the code a bit to this:
for i = 1:numwin
temp=[];
curwin = x(st:en,:).*repmat(datawin,1,Nsignals);
temp = detrend(curwin);
feat{1,i} = temp;
st = st + wininc;
en = en + wininc;
end
Am I working on the right track?
Thank you
Regards
Nabila
Rami Khushaba (view profile)
Thank you for your inquiry, the multiplication line was written for the case when you use the code on multidimensional signals rather than one dimensional, it wouldn't affect anything in the current situation, and I will update the code soon for the case when you provide multiple dimensions in the input signal.
As for the detrend command, we usually remove the mean or any linear trend in the signal before spectral analysis so it wouldn't affect the resultant power spectrum. Think about it in terms of FFT, add a constant value of say 100 to your signal and perform FFT and observe how the DC component will be affected, so to avoid such a situation just remove the mean or any trend there before processing.
Hope this help.
Regards
Rami
Nabila (view profile)
First of all thank you very much for this code.
If I may ask something, why do we need to multiply the signal with ones and then use detrend in this line? :
curwin = x(st:en,:).*repmat(datawin,1,Nsignals);
feat(1:winsize,i) = detrend(curwin);
Thank you very much
Rami Khushaba (view profile)
Best basis selection using fuzzy mutual information is also achievable with this code (ranke features by I_Cx./H_x):
http://www.mathworks.com/matlabcentral/fileexchange/31888fuzzyentropyandmutualinformation
Please cite the following paper if you use this code
[2] R. N. Khushaba, S. Kodagoa, S. Lal, and G. Dissanayake, “Driver Drowsiness Classification Using Fuzzy Wavelet Packet Based Feature Extraction Algorithm”, IEEE Transaction on Biomedical Engineering, vol. 58, no. 1, pp. 121131, 2011.
Rami Khushaba (view profile)
Suppose you have a sensor giving you raw data, and the sensor samples at 256 Hz, i.e., you get 256 samples or sensor readings per second. Say now you have collected few seconds of data from the sensor, i.e., for example 2048 samples (2048 Samples = 256 samples/sec * 8 sec). Now we want to extract features from the corresponding 2048 samples, but say for example you only want to look at windows of size 256 samples and incremented by 32 samples, so lets run the code:
Signal = rand(2048,1) % one dimensional sensor data for example, 8 seconds with 256 samples/sec
winsize =256;
wininc =32;
SF = 256; %sampling frequency
feat = getmswpfeatV00(Signal,winsize,wininc,SF);
feat is a matrix of 57 x 255, which means you have extracted 57 samples from the total 8 sec worth of data and that each sample had 255 features. Now your question will be why 57 and why 255:
no of samples = (data length – winsize)/winin + 1 = (2048256)/32+1 = 57
no of features : just look at the figure above, level 0 gives 2^0 features, level 1 gives 2^1 =2 features and so on.
Pay attention that you still have to do feature selection after having the 255 features, as these might be redundant.
Hari (view profile)
guys please explain me how to execute this code what are the input arguments for this code
Rami Khushaba (view profile)
Hello
All what you need to do is to remove the mean and the log operator to get the coefficients, but pay attention to the dimensionality of the produced coefficients matrix. Let me know if you can't do it and I will rewrite it for you.
Rami
Rohtash Dhiman (view profile)
I need only to extract coeff values to use them for further programm for 8th level wavelet packet decomposition
Ali Ali (view profile)
Thanks you very much Rami for the code
Rami Khushaba (view profile)
Note: you still have to do feature selection to select the most promising feature set for your own problem, as the extracted feature set here represent the complete set with redundant bases.