File Exchange

image thumbnail

Feature Extraction Using Multisignal Wavelet Packet Decomposition

version (521 KB) by Rami Khushaba
Its simply a feature extraction code using the Wavelet Packet Transform (WPT).


Updated 08 Feb 2017

View Version History

View License

The code represents a generalization of the Multisignal 1-D wavelet decomposition. I have written a simple routine to extract the full tree out of the Matlab "mdwtdec" function available within the wavelet toolbox.
The code is efficient and very simple, and is mainly written for feature extraction from the WPT tree using an overlapping windowing approach.
To use the code: the main function is getmswpfeat, read inside the code for an example.
** Kindly cite either of the following papers if you use this code **


[1] R. N. Khushaba, A. Al-Jumaily, and A. Al-Ani, “Novel Feature Extraction Method based on Fuzzy Entropy and Wavelet Packet Transform for Myoelectric Control”, 7th International Symposium on Communications and Information Technologies ISCIT2007, Sydney, Australia, pp. 352 – 357.

[2] R. N. Khushaba, S. Kodagoa, S. Lal, and G. Dissanayake, “Driver Drowsiness Classification Using Fuzzy Wavelet Packet Based Feature Extraction Algorithm”, IEEE Transaction on Biomedical Engineering, vol. 58, no. 1, pp. 121-131, 2011.

-You can use the code on one or multiple signals, these are given as columns of variable x.
- You can use the matlab wavelet toolbox or the wmtsa wavelet toolbox

-Tutorial available within submission.

Cite As

Rami Khushaba (2021). Feature Extraction Using Multisignal Wavelet Packet Decomposition (, MATLAB Central File Exchange. Retrieved .

Comments and Ratings (74)

Julian Grodzicky

Very well constructed page, and the provided links and papers answer a lot of the questions in the comments. Thank you for your diligence and effort.

Isaac Llorente

@Rami Khushaba,
Thank you for sharing this code. It is an amazing and useful contribution.!

ugo ogbulafor

@Rami Khushaba
thank you for your code and explanation.
please how to we get access to your PhD thesis ? the link you referred to " (" requires a password


Dear sir.. can i have the wavelet packet decomposition same as the structure of wavedec instead of taking the whole tree??? please reply.

spoorthy venkatesh

spoorthy venkatesh

Hello sir, thank you for the code. I just have few doubts. I m trying extract features from an EEG signal consisting of 16 channels. I am not able to give input of all 16 channels. I am new to matlab, any suggestion will be helpful. Thank you in advance.


How is it possible to design a filterbank for wavelet to introduce a mother wavelet in wavemngr function in matlab?
can you guide me?

Huda Diab Abdulgalil

Dr. Rami thank you for this useful code. After using this code on my dataset I got a good accuracy but still wondering if I'm going on the right direction. Im new in signal analysis I have raw 100 EEG single channel, first I remove the noise and then I used this code to extract the features as follow:
Features1 = getmswpfeat(x,51,32,5,'matlab');
FeaturesN = getmswpfeat(x,51,32,5,'matlab'); X is the EEG signal, window size of 51 at 32 increments, and 5 decomposition levels.
I did this for all EEG record in the dataset and then put the collected features randomly in two matrixes for training and testing the model. each EEG is (1, 4079), when used this code of WPT I got 63 features and 172 samples for each record. But I read the code instruction and not sure what these features. My question " is it correct to use the output of this code directly" or I should work with this output to extract specific features such as (Log energy entropy, Norm entropy). Please help. Thanks in advance 

neetu singh

Dear mam , I am working on a project " emotion recognition through EEG signals "can i use this code for feature extraction from EEG. and how do i use it. I m beginner in MATLAb so i want guidance from u.

Didik Purwanto

ahmed silik

very good

savita sharma

Rance Tino

Hi Dr Rami, for the function getmswpfeat.m, I keep getting this error:

"Maximum variable size allowed by the program is exceeded.

Error in getmswpfeat (line 78)
Features = zeros(numwin,nfCh*size(x,2));"

I'm using the same input you have:
x = rand(300,1);
feat = getmswpfeat(x,256,32,256,'matlab')

Any help would be awesome. Thank you!

Rance Tino


Hi Sir,
I run the code but i got zero features. Can you please suggest me how to solve it?
kind regards

yas yasmini

Hi and thanks for useful code
could you please describe me how can I discover true size of "winsize" and "wininc" for P300 detection in EEG signals?
thanks in advance

Saiful Musa

assalamualaikum.. sir, the code is running well for me.. but i just want to ask how to get 1 value every feature extraction? in trial i get 25 row every feature.. thanks u


Sir, can you tell me how to extract features(like energy) from an mri image using the same code?

abinash kumar

sir, i have eeg signal of set A having (4097*100). the signal is filtered by lpf with cut off frequency of 64HZ. after that i use wavelet decomposition on it and reconstruct the signal. after reconstructing, fft is used to plot the spectrum. now, i want to extract the feature i.e mean absolute deviation of rectified signal and frequency present. but i am not able to extract the feature. please help me

Rami Khushaba

Winsize and wininc are parameters that you need to set yourself depending on your own application.

*** Please do spend some time reading the document included in this submission as the answers to all of your questions are there.

Tung Nguyen

Thank you
I used Olimex EEG-SMT with only 2 nodes(chanel) to get signal. The device samples the analog signals and builds a 17-byte packet which is transmitted at 256Hz,
using 1 start bit, 8 data bits, 1 stop bit, no parity, 57600 bits per second.
Minimial transmission speed is 256Hz * sizeof(modeeg_packet) * 10 = 43520 bps.

X is raw signal, winsize = 256, right?
And what is wininc?

Thank for your help ^^

Rami Khushaba

The input is a one dimensional signal (or multidimensional one, i.e., many signals), simply any type of signals.

As an example, think of this as your EMG signal from the muscles on your forearm or even EEG signals from many sensors placed on the brain.

If you spend some time looking inside the code, there is an example, with x being the input signal from which you are trying to extract features.

Tung Nguyen

Can you tell me what is input of "getmswpfeat" ?
What is x?

Thank you

Vasumathi Ganesh

Dear mam,
Which .m file can i run among these Three files u have given.

Rami Khushaba

Guys, perform this test to make sure yourself that this is the WPT

% define some random signal
X = rand(32,1);

% perform the WPT decomposition using the matlab wpdec command
T = wpdec(X,3,'db4')

% Now decompose the same example using the proposed mswpd here in this post.
D = mswpd('col',X,'db4',3);

% Now look at the resulting difference between coefficients from both techniques at any node

sum(abs(wpcoef(T,[1 1]) - D{2,2}))

Hope this clarifies things.

Asyur Zaldi

Asyur Zaldi

Dear Dr Rami, thank you for the answer
Yes I like to get the result a vector instead of matrix.
fea=getmswpfeat(s,isize,iinc,12,'matlab'), where isize = iinc = signal size and same with iinc. The function return me one single value. Am I miss something?
thanks in advance

Rami Khushaba

@Asyur: simply, the answer is yes, you can use the code on one or multiple signals. If you are interested in getting one vector output you can easily set the window size and window increment both equal to the input vector length, and in this way you get only one vector at the output. Alternatively, if your window size is smaller than the input vector length, you will end up with multiple vectors at the output. There is nothing to prevent you from averaging, in my own opinion, but you might as well just use the above mentioned window size and window increment to get one vector without the need for further averaging.

Rami Khushaba

@utkatsh: the wavelet packet transform is a generalisation of the wavelet transform, where instead of applying the wavelet decomposition on the approximation coefficients only (as in WT), you also apply the wavelet decomposition on the details coefficients as well in the WPT. Thus, the baseline technique is the same, and depending on how you apply it then you get the WT and WPT.

Utkarsh Singh

Dear Sir,

What exactly have you used in this code? WPT or DWT for decomposition. I am confused. description says WPT but in the code you are using 'mdwtdec'.

dec = mdwtdec(dirDec,T,1,h);

Asyur Zaldi

Dear Dr Rami, I just come across to this session and seems like I got what I'm looking for.
Since this is a multisignal, can i use it for a one single vector(column)? and I try it it return some column depend on the parameter sent, and the result return in matrix form and I average to become a single vector. the step i mention is it ok? base on the code you posted in here. thanks in advance


Dear Dr,
Is the feature in this coding is the energy?


Dear Dr

can we use these features in clustering?
is it possible, for example, how can we do with kmeans?


Dear Dr.Rami
Thanks for you support; Currently I am working on the same point of your paper"Driver Drowsiness Classification"; I hope you can guide me to database for EEG data for this issue.Thanks inadvance

Rami Khushaba

Code updated, a small tutorial and paper added and support for the wmtsa wavelet toolbox also added (in case you don't have the matlab wavelet toolbox).

Please rate this submission if you learned something here or you found it useful.

behailu regasa

Dear Dr.,
Thank you for your immediate response. I expect your answer with the supporting document. would you share to me the second reference or how could i get it freely?

Rami Khushaba

Dear Behailu Regasa

Thanks for your inquiry, I am planing an update to this code within few days, I will also put a supporting document to help understand the process.


behailu regasa

Dear DR. Rami,
i could not understand how could i extract features from multiple signal. That is from hundred 1-D signal like EEG and use them to train neural network. would you help me. I thank you in advance.

Rami Khushaba

Dear Deepak

Thank you for your inquiry, you are actually right about SF. However, please note the comment in the code, that is you can use the sampling frequency to determine the actual number of decomposition levels J, I remember having a comment in the code about that in the same line where I defined J=7 (comment says you can use J=(log(SF/2)/log(2))-1;). I fixed the decomposition levels in the code sample you mentioned to make it easy for new users to understand it, but you can change that by using the commented equation.



Dear Dr. Rami, thank you very much for providing such an useful code. I could not understand the use of sampling frequency (SF)in the getmswpfeat. It asks SF as an argument but does not use it anywhere in the code. Please correct me if I am wrong.


Thank you for the comment.. I just concern It should be same answer for 'feat' at Approximate coefficient.. From the example you given, YES the answer are same.. when I go through the code (getmswtfeat.m) It is look like different in how to determine the last answer.. for code (getmswpfeatV00.m).. I can understand the flow.. but for the 'getmswtfeat.m', I'm confused for term 'percentENER(notZER,:) = ...', '[cfs,longs] = wdec2cl(dec,'all')',.. and I hope the final answer for 'tab_ENER' should be same as WP code... sorry to many question...

Rami Khushaba

Here is a quick proof

Signal = rand(256,1);
D = mswpd('col',Signal,'db4',J);
EnergyWPT = sum(D{3,1}.*D{3,1});
dec = mdwtdec('col',Signal,J,'db4');
EnergyWT = sum( .*;
disp(sprintf('Energy of WT = %f, \t Energy of WPT = %f',EnergyWT,EnergyWPT))

Rami Khushaba

Madi, Please check what you did again.

Both of my codes getmswtfeat and getmswpfeatV00 are based on the same function "mdwtdec" so they cant get you different result if you use the exact same settings in both files, which is not the case here because I didn't post this to make them an exact replica of each other, though this is a good point for an update here from my side.

To prove the above to you:

1) make sure you use the same decomposition level J (its 7 in getmswtfeat and 10 in getmswpfeatV00 by default).

2) make sure you use the same wavelet family (its 'db4' in one code and 'Sym8' in the other).

3) Remove the different normalization utilized within both codes (you need to modify both codes in some sections).

Based on the above, it appears to me that its not just changing the log1p that would give you the same result.

I am planning an update to both codes and will unify both codes in terms of every single parameter.

Thanks for your concern and Good Luck :).



Thank You Dr Rami,

I tried the code (getmswtfeat.m (Your WT code)) and try to remove and maintain 'log1p'.. the result for 'energy' approximate coefficient are different when using the code (getmswpfeatV00.m)..

Rami Khushaba

Correction, for the second point I meant to say its for normalization (I believe I took out the sum up to 1 part).

Rami Khushaba

Hi Madi

For the first point, Yes it should be the same, but I never tried doing that (just check the code in the feature extraction step). Plus, youalready said that you compared with dwt and you got the same answer when commenting the few lines you mentioned above. So, yes it should be fine.

For the second point, the lines you mentioned are for normalization, use it if you want the total energy at each level to sum up to 1. Of course, when you comment these lines you get similar result to dwt and if you use these lines its like you are normalizing the output of WPT features but not the dwt one, so if you do the same on the other features you will end up with similar results.

Kind Regards


Thank you for reply..I understand now..

If I run this code (getmswpfeatV00.m) and I only took the energy at approximation part (Low pass filter part).. It is the result will be same if I'm using this code (getmswtfeat.m (Your WT code))?

One more thing, actually the code below is for what? Because you said can comment it.. If I comment this code the result is different.

I test for J=1 and comment the code below and I compare with using this code ([CA,CD] = dwt(xx,'sym8');) and took the energy for CA and CD.. and the result is same.

%% You can comment this section
JJ = J;
for i=1:J-1
T = (2^i):(2^i)+(2^i)-1;
feat(:,T) = feat(:,T)./log(JJ);

Rami Khushaba

HI Madi

1) The difference is in the tree structure, WT decomposes only the approximation part and (Low pass filter part) and keeps decomposing that. WPT on the other hand, decomposes both approximation and details parts (LPF and HPF). Depending on your application, either one of these could work well. Look at my thesis for a description (

2) The types of features is up to you to decide, you can get energy of coefficients, mean or variance or any other features. In my code above, I set it up as the log scaled energy, but you can change this of course.

Hope the above helps.



Dear Dr Rami,

I'm new in Wavelet Feature extraction, I have two question: :)

1) just to ask what is the different Wavelet Packet Decomposition and Wavelet Transform Decomposition?

2) From you code, feat = WPT features (compete tree without node selection).. actually what is WPT features?.. It is energy based? or median or mean?

Thank You

Rami Khushaba

Hi Ahmed

It looks like you didn't decide on the window size and increment yet, right?

Ok, here is the simplest scenario. Simplest way is to treat each speech signal independently (though this code was specifically written to help you do that on multiple signals directly). Each wave your recorded is a 1-dimensional vector say x1, x2, x3, ....etc

feat1 = getmswpfeatV00(x1,winsize,wininc,SF)

feat2 = getmswpfeatV00(x2,winsize,wininc,SF)

feat3 = getmswpfeatV00(x3,winsize,wininc,SF)

SF is your sampling frequency, winsize and wininc you have to put based on your own application (cant help you with that, just try some numbers and see which one works best for you). After you get these vectors;

FeaturesW1 = [feat1;feat2;feat3];
Of course, if the vectors (x1,x2,x3) represent the same word then you should add another last column to FeaturesW1 made of all 1's.

Now if you have another x1,x2,x3 of another words, do the same and get FeaturesW2 and put the last columns as all 2's.

Now combine
Features =[Features1;Features2];

Done, you have your feature matrix with class label, do your classification now.


ahmed alhamdani

Dear Dr. Rami
If I have n file(for example five files) for the same speech word (.wave) how can I extract feature vector for each file and put it the out put as matrix .
kind regardes

Rami Khushaba

Yes, either
1- Reshape each image into a vector and apply this code
2- Or simply use the 2d version of the wavelet command.


Sir is it possible to extend it to image processing also??? if so how to do that.. thanks in advance..

Rami Khushaba

Dear Ahmed

Yes, what you said seems ok.
1- Do you need Hamming window if you used wavelet? try both cases and see the difference in accuracy.
2-Try different window sizes to see which one gives you the best accuracy.

3-On the other hand, LIBSVM library does actually give you a probability output so have a look at it (Google LIBSVM).

4-As for the combination of SVM/HMM, I tried before combining SVM with DBN but SVM actually performed very good in our problem and there was no need for DBN after SVM, but in your application it might work better than SVM alone, just try and see.

The answer to most of your questions is that its actually problem dependent, i.e., unless you try you won't know.

Kind Regards

ahmed alhamdani

Dear Dr. Rami Khushaba;

Thank you for answer.


The specific for what I need help is about how I can use the output data from SVM as input to HMM. My project is about ( ISOLATED WORD speech RECOGNITION Using Hybrid SVM /HMM).


The data base consists 25 speakers ,male ,female, young and child each speaker the same utter the same 20 words .

The file is saved as (*.wav);

The preprocessing includes sampling, segmentation, framing and windowing.

Sampling frequency=11025Hz

16 – Bit A/D Converter.

Framing the speech signal is blocked into frames of N samples N=256(which is equivalent to23msec) .

Overlaps between frames =128;

Using Hamming window for windowing each frame.

My question is this procedure right or not???


Feature extraction

The Discrete wavelet transform(DTW) daubechies filter and pitchdetection used In order to evaluate feature vectors . Used this feature vectors as input to Hybrid SVM/HMM for training and testing system .

I will do all this pre-processing and processing steps and I have a table of feature vector for each word.

My question is this setep procedure right or not???

Hybrid model of SVM/HMM

The hybrid model includes two parts: training and classification. Firstly , parameters of model can be obtained by training. Secondly, it can be calculated the probability estimate by Viterbi.

But the output of SVM is numerical value. For combined with HMM easily, it should transform the value into probability .

My question how can I do that when I used this feature vector as input to Hybrid SVM/HMM for training and testing system to recognize each word (so I needed algorithm and program to do that using matlab).

With Best Regards

Ahmed alhamdanil


Rami Khushaba

Dear Ahmed

This function is required to decompose multiple signals together in one go using the discrete wavelet transform. Yes, you can program that yourself using any free library like WAVELAB or UVIWAVE.


ahmed amin

Hi Dr.Rami
Thank you for your interest, what the benefit of this function?Is it possible to be programmed.

Rami Khushaba

Hi Ahmed

Thanks for your inquiry, you need the wavelet toolbox to run this code, and obviously, from the error message you got, Matlab can't find this function on your PC, i.e., it means that you don't have that toolbox (or have old version of that toolbox without this function).


ahmed amin

Hi Dr.Rami
i tray to us your code but i have this problem:
??? Undefined function or variable 'mdwtdec'.
i don't know why this problem do?
can you explane to me please.


Thank you very much for the update and kind reply, Dr.Rami. This is really helpful for me and everyone.
I hope I also can make great contributions in the future like you did :)
Best Regards,

Rami Khushaba

@Nabila, the code is now updated, please download the new version.


Rami Khushaba

It actually depends on how you are going to process the data later on. For simplicity, I would just say put an outer for loop (for j=1:Nsignals)to repeat the process across each dimension of x (the input signal), and then use curwin =(st:en,j);

Or, simply wait for the update, I will do that myself today, then one or two days for mathworks to approve it.



Thank you very much for your kind explanation, Dr.Rami. This helps me a lot.
As for now I'm working with multidimensional signal and I changed the code a bit to this:

for i = 1:numwin
curwin = x(st:en,:).*repmat(datawin,1,Nsignals);
temp = detrend(curwin);
feat{1,i} = temp;
st = st + wininc;
en = en + wininc;

Am I working on the right track?
Thank you


Rami Khushaba

Thank you for your inquiry, the multiplication line was written for the case when you use the code on multidimensional signals rather than one dimensional, it wouldn't affect anything in the current situation, and I will update the code soon for the case when you provide multiple dimensions in the input signal.

As for the detrend command, we usually remove the mean or any linear trend in the signal before spectral analysis so it wouldn't affect the resultant power spectrum. Think about it in terms of FFT, add a constant value of say 100 to your signal and perform FFT and observe how the DC component will be affected, so to avoid such a situation just remove the mean or any trend there before processing.

Hope this help.



First of all thank you very much for this code.
If I may ask something, why do we need to multiply the signal with ones and then use detrend in this line? :

curwin = x(st:en,:).*repmat(datawin,1,Nsignals);
feat(1:winsize,i) = detrend(curwin);

Thank you very much

Rami Khushaba

Best basis selection using fuzzy mutual information is also achievable with this code (ranke features by I_Cx./H_x):

Please cite the following paper if you use this code
[2] R. N. Khushaba, S. Kodagoa, S. Lal, and G. Dissanayake, “Driver Drowsiness Classification Using Fuzzy Wavelet Packet Based Feature Extraction Algorithm”, IEEE Transaction on Biomedical Engineering, vol. 58, no. 1, pp. 121-131, 2011.

Rami Khushaba

Suppose you have a sensor giving you raw data, and the sensor samples at 256 Hz, i.e., you get 256 samples or sensor readings per second. Say now you have collected few seconds of data from the sensor, i.e., for example 2048 samples (2048 Samples = 256 samples/sec * 8 sec). Now we want to extract features from the corresponding 2048 samples, but say for example you only want to look at windows of size 256 samples and incremented by 32 samples, so lets run the code:

Signal = rand(2048,1) % one dimensional sensor data for example, 8 seconds with 256 samples/sec
winsize =256;
wininc =32;
SF = 256; %sampling frequency
feat = getmswpfeatV00(Signal,winsize,wininc,SF);

feat is a matrix of 57 x 255, which means you have extracted 57 samples from the total 8 sec worth of data and that each sample had 255 features. Now your question will be why 57 and why 255:

no of samples = (data length – winsize)/winin + 1 = (2048-256)/32+1 = 57
no of features : just look at the figure above, level 0 gives 2^0 features, level 1 gives 2^1 =2 features and so on.

Pay attention that you still have to do feature selection after having the 255 features, as these might be redundant.


guys please explain me how to execute this code what are the input arguments for this code

Rami Khushaba


All what you need to do is to remove the mean and the log operator to get the coefficients, but pay attention to the dimensionality of the produced coefficients matrix. Let me know if you can't do it and I will rewrite it for you.


Rohtash Dhiman

I need only to extract coeff values to use them for further programm for 8th level wavelet packet decomposition

Ali Ali

Thanks you very much Rami for the code

Rami Khushaba

Note: you still have to do feature selection to select the most promising feature set for your own problem, as the extracted feature set here represent the complete set with redundant bases.

MATLAB Release Compatibility
Created with R2010b
Compatible with any release
Platform Compatibility
Windows macOS Linux

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!