This program recognizes ASL input (query) images by comparing them with the database images and outputs the equivalent ASCII representation.
The details are given inside the zip file in separate explanatory algorithm diagrams, and every .m file is documented in detail.
Open the 'Readme.txt' in order to learn how to execute the program.
I need to detect and track a hand in a video.
Guohe thanks for your review.
Decrementing the threshold value is needed to increase the validity, not to increase the number of keypoints.
Thanks very much for the code. I am just learning recognition. In your annotation, "% Decrement the threshold value in order to find more valid keypoints." should maybe be "% Decrement the threshold value in order to find less valid keypoints."
Hello sir, your code is awesome. But how can I change the picture? I want my database to be my original hand picture, and the input image to be the binary image made from my original hand picture. What do I have to do? I already tried that, but the result is still your hand. Do I have to convert my picture to something else? And sorry for disturbing you.
Hello Sir, are the images that you have provided in the database processed?
Also, I am trying to recognise dynamic hand gestures. Could you please help me with how I should start? Thank you.
It is working very well, thank you !!!
I am using Lowe's SIFT implementation.
Note the following site:
There is a comparison with Lowe's SIFT implementation.
It is strange. Points of interest extracted with the classical VL_FEAT are so different both in number and location relative to your SIFT function.
If you don't have a pure background, you will end up with more keypoints, many of them false. MK-RoD is prone to false SIFT keypoint matches, so the algorithm will simply not work if the image has a background.
Hi Caglar, you have done great work. I have a question: why is it recommended to use a pure black background? I mean, what happens with the algorithm and the keypoints if you don't have that background?
Excellent work Mr. Arslan.
I have a question regarding your source code. Does it work with any images I have in the database? Can it be used for face, feature or any object detection?
Brilliant work done
This code does not work with other input images. When we use another input image, we get an error message like the one described by Bipul, and after this, even if we use the right image from the input database, it shows an error.
Sorry for my English.
Sir Caglar Arslan,
You've used siftwin32.exe; what is its purpose?
Also, this code is very time consuming,
so I am unable to turn it into real-time code.
Is there no other, easier code for student learning with the same purpose?
Can you upload simple MATLAB code for ASL?
You have to comment and uncomment the relevant lines in order to switch between the images. CTRL-R is one of the shortcut keys.
Dear sir, when I run the execHGR file, only "input='Images/Inputs/sample/b_sample_green.jpg';" generates output (b). The c, h, i, o, ... outputs are not generated. How can I generate the other outputs?
How can I do hand gesture recognition? Please give me useful code for this topic.
It looks like your current working directory is not set to the program's location. You have to change your folder.
Point Pattern Matching Algorithm for Hand Gesture / American Sign Language (ASL) Recognition
Can this file be executed in MATLAB 13a?
This program does not run with execHGR.m,
but shows the following error:
Error using imread (line 350)
File "Images/Inputs/sample/y_sample_green.jpg" does not exist.
Error in sift (line 25)
image = imread(imageFile);
Error in match (line 56)
[im2, des2, loc2] = sift(image2);
Error in formResults (line 18)
[match1,match2,cx1,cy1,cx2,cy2,num] = match(dataBase(Selecteds(i),:),
Error in hgr (line 64)
Error in execHGR (line 35)
Please help me and explain how to execute this program step by step.
Just execute execHGR.m
Hit F5 to run it!
and please read the freaking manual!
The code is working for years.
JUST FOLLOW THE INSTRUCTIONS GIVEN IN README.TXT.
THIS CODE IS NOT WORKING
How can I see the keypoints in my MATLAB figure window?
Hi, I am working on the same project. We are trying to do it in real time. The results of SIFT are not satisfactory. Please suggest some way to improve it.
I recently found how the NFA is computed in the ASIFT code:
float nfa = orsa((w1+w2)/2, (h1+h2)/2, match_coor, index, t_value_orsa, verb_value_orsa, n_flag_value_orsa, mode_value_orsa, stop_value_orsa);
Any feedback whether that computation is correct?
It seems that the link is not working.
Hello Caglar, I uploaded my edits here: http://www.mathworks.com/matlabcentral/fileexchange/36154 sorry for the VERY late reply
If I create something useful by any chance I will directly put it here.
Best wishes and good luck
For now I am going on with SIFT and also trying ASIFT. Please let me know if you implement ASIFT; it would be greatly helpful.
Thanks & Regards.
Sorry I couldn't write anything because of work.
@Bhavik > I do not have an ASIFT version currently. I started working on it a couple months ago but couldn't progress because of my professional duties. I hope I would find some spare time for this.
@kaushalya > It is not an NNS algorithm, but you can consider it as a Nearest Pattern Search. You can find the explanation inside the jpg files of the project and also inside the code comments.
Does the MKRoD algorithm have any relationship to NNS, or is it your own invention? I need an explanation of this algorithm.
Hi Caglar, I tried using demo_ASIFT.exe
It would be very helpful if you provided your implementation of this for Hand Gesture Recognition, because in this .exe file all the parameters are calculated and the matching is done internally. There is no place where I can extract the features of the image and store them.
Also, do we have to use the MK-RoD algorithm, or is ASIFT matching enough?
I am trying to break up the C++ code and use it for storing the features of the database images.
If you provide your implementation, I can take some ideas from it and make it work faster.
Thanks & Regards.
Thanks a lot Caglar for useful guidance.
We are trying to implement it on realtime so time is a great concern for us.
Yes we are pre-calculating all the SIFT parameters for all database images and storing it in a .mat file and fetching the same data at runtime.
Thanks for the ASIFT link. I'll try it and let you know the progress.
Thanks a lot
If execution time is not your concern, you can increase the number of database images. For example, for each sign you can have 3 or 5 dataset images (i.e. 3 "a" signs, 3 "b" signs and so on). This will have a huge impact on the execution time, but the accuracy will improve.
You have to modify the code in order to handle multiple dataset images per character.
You may calculate the mean of the validityRatio for every character set and base your decision on that.
I do not guarantee that this would work, but logically it should help you eliminate wrong detections.
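The averaging idea above could be sketched roughly as follows. The scores are made-up numbers, and the sketch assumes that a higher validityRatio means a better match, which should be checked against the comments in hgr.m. The names letters, meanRatio and recognized are hypothetical and do not come from the project.

```matlab
% Hypothetical validityRatio scores: one row per character,
% one column per dataset image of that character (3 images each).
letters = 'abc';
validityRatio = [0.40 0.55 0.35;    % three "a" images
                 0.70 0.65 0.80;    % three "b" images
                 0.20 0.90 0.10];   % three "c" images (one outlier)

% Average over the images of each character (along the columns).
meanRatio = mean(validityRatio, 2);

% ASSUMPTION: a larger mean ratio indicates a better match.
[~, best] = max(meanRatio);
recognized = letters(best);

fprintf('Recognized character: %s\n', recognized);
```

In this toy data the single high score of 0.90 for "c" would win if only the best image counted, while the per-character mean favours "b".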
Also pre-calculate the SIFT keypoints of the dataset images instead of calculating them at every recognition request.
In addition, pre-calculate the SIFT keypoints of the input image once in order to reduce the execution time.
Also, Bhavik, note that false matches can occur at the SIFT matching level (that is the main cause of wrong detections). Instead of SIFT, try to embed a SIFT variant, ASIFT. I tested ASIFT with hand gesture images about a month ago and observed that it provides better matching accuracy and fewer false matches. I highly recommend that you replace SIFT with ASIFT. Try it on http://www.cmap.polytechnique.fr/~yu/research/ASIFT/demo.html
They also provide the source code on the website. The sources are in C++, but the application executable has a MATLAB based wrapper (a .m file that calls an executable, like I did in my algorithm with siftWin32.exe).
Download it from http://www.ipol.im/pub/algo/my_affine_sift/demo_ASIFT_Win.zip
Bhavik I strongly advise you to embed the ASIFT. Please let me know about the progress.
Thanks Caglar for the guidance.
I am facing another problem, it would be great help if you guide me on this.
I am trying to compare all 26 letters of the alphabet and also the 10 digits 0-9.
Sometimes the letter is correctly recognized, but many times a wrong or similar sign is decoded, giving a wrong output. Please guide me as to what changes are needed. I tried changing the MK-RoD threshold and the distRatio of SIFT, but I still get erroneous results.
Please guide me on what parameters to change or whether any additional check is required.
Thanks & Regards
No, it is not necessary. The point pattern for each image pair is independent of the others, so you do not need to parametrize the image sizes.
But note that having a pure background is not the only requirement. The details of the hand are also important. A higher resolution would return better results.
Just another doubt.
Is it necessary to have all the images in the database at the same size?
For example, in your database all the images are of size 400x300,
and the inputs are the same size.
In our database, all the images have different sizes, each containing only its region of interest on a black background.
Will this cause any problem in matching?
Thanks and Regards,
Himadri, thank you very much.
You can do whatever you want to this project under the BSD license.
At the top of the page you can find the BSD License
Hi, I just went through your project and I think you have done a great job.
I need to know whether you would let me use the code of this project and modify it as well.
@Bhavik I strongly recommend you to investigate David Lowe's SIFT paper. You will find your answers there.
@sachin No problem. siftwin32 is Lowe's C implementation of SIFT. That executable generates the SIFT keypoints for the input images. By the way, sorry, all of the documentation is embedded as programming comments inside the project. There are also a bunch of explanatory images about the algorithm. Good luck.
@caglar Sorry to disturb you sir,
can you please explain siftwin32?
And if you have any documents related to your project, please send them to email@example.com.
I was going through your code. In the match function you call an .exe file to get des, locs and img.
I cannot work out what data des and locs store.
Also, I think you have used the following code for matching, but I cannot work out how the matching is done:
for i = 1 : size(des1,1)
   dotprods = des1(i,:) * des2t;        % Computes vector of dot products
   [vals,indx] = sort(acos(dotprods));  % Take inverse cosine and sort results
   % Check if nearest neighbor has angle less than distRatio times 2nd.
   if (vals(1) < distRatio * vals(2))
      myMatch(i) = indx(1);
   else
      myMatch(i) = 0;
   end
end
for i = 1: size(des1,1)
   if (myMatch(i) > 0)
      % ...
   end
end
num = sum(myMatch > 0);
Please let us know how the matching is done using the data in des, locs and img. It will be very helpful for us in developing our own method.
Thanks a lot.
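As a rough illustration of the nearest-neighbour ratio test in the loop quoted above, here is a tiny worked example. The descriptors are made-up unit vectors of length 3 (real SIFT descriptors have 128 elements), and the distRatio value is only an assumed setting.

```matlab
% Hypothetical unit-length descriptors, one per row.
des1 = [1 0 0;
        0 1 0];
des2 = [1.0 0.0 0;
        0.8 0.6 0;
        0.6 0.8 0];
distRatio = 0.6;                 % assumed value for this illustration

des2t = des2';                   % transpose once, outside the loop
myMatch = zeros(1, size(des1, 1));
for i = 1:size(des1, 1)
    dotprods = des1(i, :) * des2t;          % cosine of the angle to every row of des2
    [vals, indx] = sort(acos(dotprods));    % angles, smallest first
    % Accept only if the best angle is clearly smaller than the second best.
    if vals(1) < distRatio * vals(2)
        myMatch(i) = indx(1);
    end
end
num = sum(myMatch > 0);

disp(myMatch);   % 1 0: row 1 matches des2 row 1, row 2 is rejected as ambiguous
disp(num);       % 1
```

Row 2 of des1 is rejected because its two closest candidates lie at similar angles (about 0.64 and 0.93 rad), so the match is considered ambiguous.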
Are you talking about what to do after step 6?
Note that the tutorial site is all about the implementation of the SIFT algorithm.
As I said earlier, SIFT is the main keypoint generator for my algorithm. Without a properly working SIFT, you can't match any images.
Please study the documentation of my code (the program comments and the explanatory algorithm images), which I believe is pretty well documented, if you would like to implement a variant of my algorithm.
we are following the same blog:
for the tutorial and algorithm flow. We are writing the code in MATLAB using its Image Processing Toolbox functions.
We have reached step 5, assigning orientations for the pixels around the keypoints, for all the database images. We are stuck on the next step: what do we do after that, and how do we compare the database images with the input images? Please help us understand the last step.
Or can you please mail any documents you have to firstname.lastname@example.org
Thanks for your compliments.
It is so nice to hear from you again.
It is also so great to hear about your progress.
I am really curious about what kind of modifications you made to the algorithm.
Also note that I was planning to publish this algorithm as a conference paper. I would appreciate it if you could contribute and share your findings, so that you can be a co-author for this algorithm.
If you are interested send me a private message.
I am waiting for your reply.
Hey Caglar, you are welcome. It's nice to see you helping out others here.
Last summer, when I commented on this thread, I had actually made some changes to some of your algorithms and managed to increase the pattern matching efficiency to more than 95%.
If you like, I can post them here. I also changed the method of input for the images, so that you can have different input methods.
Hello Kok Hong again,
I would suggest you to investigate the match function.
%[match1,match2,cx1,cy1,cx2,cy2,num] = match(image1, image2, distRatio)
The returning value "num" keeps the matched number of keypoints.
"num" would resolve this issue.
If it doesn't please inform me. So that I can correct it in the next revision.
i have another question.
I used your code for a real-time sign language algorithm. I made the background of every image black. If there is nothing in the captured image (only the black background), there is an error like the one below:
??? Attempted to access vals(1); index out of bounds because numel(vals)=0.
Error in ==> match at 85
if (vals(1) < distRatio * vals(2)) %?????????????
Error in ==> formResults at 18
[match1,match2,cx1,cy1,cx2,cy2,num] = match(dataBase(Selecteds(i),:), input,
Error in ==> hgr at 77
Do you know how to solve it? Thanks in advance. I appreciate your help.
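One possible way to avoid the index-out-of-bounds error above is to skip the ratio test whenever either image yields too few descriptors, since the test needs at least two candidates in the second image. This is only a sketch with made-up data and is not taken from the project's match.m; it reuses the variable names from the error trace for readability.

```matlab
% Hypothetical descriptors: the captured frame is blank, so SIFT
% returns no keypoints for it (an empty 0-by-3 matrix here).
des1 = [1 0 0;
        0 1 0];
des2 = zeros(0, 3);
distRatio = 0.6;                 % assumed value

myMatch = zeros(1, size(des1, 1));

% The ratio test reads vals(1) and vals(2), so it needs >= 2 rows in des2.
if size(des1, 1) >= 1 && size(des2, 1) >= 2
    des2t = des2';
    for i = 1:size(des1, 1)
        dotprods = des1(i, :) * des2t;
        [vals, indx] = sort(acos(dotprods));
        if vals(1) < distRatio * vals(2)
            myMatch(i) = indx(1);
        end
    end
end

num = sum(myMatch > 0);          % stays 0 for a blank frame
if num == 0
    disp('No keypoints matched; skipping this frame.');
end
```

The caller can then treat num == 0 as "nothing recognized" instead of crashing.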
Nice to hear from you again.
It is also unclear to me about the .pgm related implementation.
This .exe is downloaded from http://www.cs.ubc.ca/~lowe/keypoints/ .
For your SIFT related questions, I would definitely recommend the following site:
I hope everything is going well for you.
Hi Caglar, this is me again
I have a question about the siftwin32 .exe file. In your program you call this file to get the SIFT keypoints, if I am not wrong. I just want to know what this .exe file actually does. How does it calculate the keypoints, what is its algorithm, and what data is stored in the tmp.pgm file?
Hello Kok Hong,
As far as I can remember, the pgm file is a working file generated for SIFT. Since the SIFT function is called outside MATLAB, you can't make changes to the core SIFT function unless you replace it with a MATLAB implementation.
Hi Caglar, I am a beginner. Can I ask how to open the pgm image file that is used in this code?
hello divya, thanks for your compliments.
What I would suggest is to find the palm of the hand. Since you can track the fingertips, you can create a database of possible hand orientations (palm, fingertips). It would also be great to detect each fingertip separately (i.e. index finger, middle finger, thumb, ...). If you can accomplish this, it becomes easier to match the orientations of the hand with the dataset.
Sorry, I have no experience with feature extraction using PCA for sign language. But I would suggest that you read some papers on PCA based feature extraction for face recognition.
Good luck with your thesis.
Hello Caglar Arslan. Thanks for the project. You have done a great job. I am also doing my thesis on sign language recognition. I have segmented the hand and finished fingertip detection and tracking. I want to do feature extraction using PCA (principal component analysis). Please suggest which features I should use to match signs.
Heyy Caglar Arslan
Thanks a lot again for the useful information.
I am sorry, but as of now there isn't any web link to the project. I will surely create one when the project is ready, and I will let you know about the progress and probably ask for help when needed.
Hello there Bhavik again,
Consider SIFT as a black box in this program. There is not much you can do about modifying SIFT's code.
For further information about SIFT, read Lowe's paper on SIFT, which is named "Distinctive image features from scale-invariant keypoints". You can download it from psu.edu.
The algorithm in this program (point pattern matching) is a brand new, simple algorithm in the computer vision context. I tried to explain it as much as possible in the documentation. I think that is enough to understand it.
Please send me a web link to your project, if you have one of course. I am curious about your progress.
Once again, good luck.
Thanks a lot, Caglar Arslan, for your kind suggestions and information.
Your code is really helpful for the project, but I am not very familiar with the SIFT algorithm. Would you please suggest some good web links that would help in understanding the algorithm?
Although the comments and documentation are very helpful, we need to understand the whole algorithm and also apply it in a real-time application. We are planning to implement text to speech in addition to gesture recognition, and that needs really fast processing, so we need to find out whether we can develop a shortcut technique to process the images.
Thanks a lot once again.
Hello Bhavik, good luck in your project.
If you want to include my code, please read the licence.
Anyway, the program does not read webcam input. If you want to use the webcam, you have to use the Image Acquisition Toolbox.
See http://www.mathworks.com/help/toolbox/imaq/exampleindex.html for correct usage of the toolbox.
Making this project real-time is a little bit hard.
Here is a list of what is needed in order to have a real-time system.
1) You have to process every single frame of the webcam input. Ideally the goal is to achieve 40 milliseconds of processing time per frame (25 frames per second) for a true real-time system.
2) The algorithm performs better with pure black backgrounds. This means that you have to preprocess every single frame of the webcam video stream to remove the background. My suggestion is to implement a hand detection and tracking algorithm, extract the ROI (Region of Interest) from every frame, remove the background of the ROI image, and then process that frame.
3) You have to modify the code to reduce the amount of unnecessary calculation. For this program the biggest unnecessary calculation, which is a big obstacle for a real-time version, is the recalculation of the SIFT keypoints of every dataset image at every iteration. You really have to get rid of this if you want a real-time system. My suggestion is to calculate the SIFT keypoints of the dataset images once in another program and save the results. The second unnecessary calculation is the input image's SIFT keypoint calculation. You have to calculate it once at the beginning of the algorithm and reuse it for the rest.
4) MATLAB is not a suitable platform for real-time application development. MATLAB is very suitable for rapid algorithm development and testing. But note that creating a real-time application in MATLAB is not impossible. My last suggestion is to incorporate GPUMat (a GPU toolbox for MATLAB) into your project with a CUDA enabled graphics card. Parallel processing can boost the performance.
5) Try to implement the project/algorithm in C++ with OpenCV if you want real-time performance.
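Point 3 of the list above (compute the dataset keypoints once and store them) might look roughly like the following. This is a sketch under stated assumptions: computeDescriptors is a hypothetical stand-in for the project's sift() call, which needs siftWin32.exe and is therefore replaced by random data here, and the cache file name is made up.

```matlab
% ASSUMPTION: stand-in for the project's sift(imageFile) call. It returns
% random 5-by-128 data so that the sketch runs without siftWin32.exe.
computeDescriptors = @(imageFile) rand(5, 128);

cacheFile  = fullfile(tempdir, 'hgrKeypointCache.mat');   % hypothetical name
imageFiles = {'Images/Database/2.jpg', 'Images/Database/3.jpg'};

if exist(cacheFile, 'file') == 2
    % Fast path: reuse the descriptors saved by an earlier run.
    S = load(cacheFile, 'cache');
    cache = S.cache;
else
    % Slow path: compute every dataset descriptor once and store it.
    cache = struct('file', {}, 'des', {});
    for k = 1:numel(imageFiles)
        cache(k).file = imageFiles{k};
        cache(k).des  = computeDescriptors(imageFiles{k});
    end
    save(cacheFile, 'cache');
end

fprintf('%d dataset images available from the cache.\n', numel(cache));
```

The recognition loop would then read cache(k).des instead of recomputing the keypoints for every dataset image at every iteration.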
For your question about execHGR.m,
you have to change the input section.
The input parameter holds the location of the image to be processed. Just change it to the image that you want to process.
For your question about recognizing different gestures:
You have to expand the set of dataset images. Open hgr.m and find the following:
% 'Selecteds' indicate the selected Database Images.
% If you add 1.jpg to the Database folder, you have to change the first
% number as 1. For our case, you have to make the first 0 of 'Selected'as 1.
% Also note that, this array stores the candidate database images.
% At the end of the algorithm only 1 selected (matched) image is left
% inside this array
Selecteds=[0 2 3 0 0 0 0 8 9 0 0 12 0 0 15 0 0 0 0 0 0 0 0 0 25 0];
I think the comments are self-explanatory.
By the way, there isn't any detailed project report or conference paper for this project, but I hope there may be in the future. Don't worry, the comments and explanatory diagrams of the algorithms should be enough for you.
In fact I am planning to release a more academic version of this program, if I can find some time beyond my day job :).
Good luck Bhavik, keep me informed about your progress.
I am so glad you liked the code, good luck again.
This seems to be a great project. It's really nice.
I am pleased to inform you that I am currently working on the same kind of gesture recognition project.
I downloaded the project files but I am not sure how it works. Does it use webcam input?
I want to use webcam input and run the program in real time to recognize gestures.
I am running the execHGR.m file, but every time it shows the same output and the webcam is not switched on. How do I make the complete project work so that it recognizes the different gestures in the database?
And can you please mail me the detailed project report of the algorithm used?
OK, thanks, I will keep you informed about my project. Thanks again :)
thanks for your compliments asif.
Sure, you can use the code in your project. But don't forget to review the BSD licence.
I am not sure whether the algorithm would work with face images. One thing I would recommend is using face images with a pure black background.
Please inform me about the results, asif. I am really curious.
Good luck and best wishes
Really nice work, Caglar. I want to use these files in my project with modifications; kindly allow me to do so.
One more thing: how can I enhance it to add facial features?
Derya> thank you for your comment, I will try to include it into a future revision of the code
deepti> For your first question: I am not sure about it, but I think that the .bmp and .png files inside the database folder are probably produced by MATLAB's own image processing mechanism. Consider the .bmp/.png as temporary images. My code has nothing to do with .bmp/.png images.
For your second question: yes, you can add multiple images of the same character for processing, but you have to change the code slightly for that purpose. Having a separate folder for each character sounds good to me. Good luck.
Heba> It is so great to hear such an improvement. I really would like to add that optimization for the next revision of the code. If you would like to contribute for the revision, please contact me Heba. Best wishes.
ahmad awad> thank you for your rating.
First of all, I would like to thank you so much for the posted code.
Secondly, I would like to let you know that I made some edits to process the database keypoints and store them in a .mat file, and I made the input image be processed only once instead of in each loop iteration. As a result, the processing time decreased from an average of 17 seconds to an average of 1.36 seconds.
I want to know why .png and .bmp images are added to the database folder when they are not used.
Is there any way to add more images of the same character, so that there are more images to compare against for recognition, like in the training of neural networks? Can something of that sort be done here?
Hi, this is great work overall - thank you. I noticed a small bug when reading the selected results from the HGRDatabase.mat file. All the filenames that have single digit numbers, i.e. Images/Database/1.jpg to Images/Database/9.jpg, get read as 22 characters whereas they only have 21 characters. This makes the file name: imageFile = 'Images/Database/1.jpg ' (notice the blank before the closing quote). This being the case, none of these files can be read using imread(). The solution is simple: whenever a name is read from the dataBase file, do something like this:
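The original snippet is not preserved in this thread. One way to do it, assuming the names are stored as rows of a padded character matrix, is to trim each row with strtrim before passing it to imread(). The dataBase matrix below is a two-row stand-in built for illustration only.

```matlab
% char() pads shorter rows with trailing blanks so that all rows
% of the character matrix have the same length.
dataBase = char('Images/Database/1.jpg', 'Images/Database/10.jpg');

padded    = dataBase(1, :);      % 22 characters, the last one is a blank
imageFile = strtrim(padded);     % 21 characters, safe to pass to imread()

fprintf('%d -> %d characters\n', length(padded), length(imageFile));
```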
Hello Nada again,
It is great to talk about possible improvements with you. You are not disturbing me; it is a pleasure to discuss this with you.
Well, you have a good question and a good problem right now.
I think your problem is how to cluster the objects in a single image. The first thing that comes to my mind is a partitional clustering algorithm. The best known example is K-means.
Assume that you have only face and hand objects in a single image. Then you would split the image into two separate images and use my algorithm to recognize the content of each.
I think that after calculating the keypoints for the single image which contains the hand and the face, you can cluster them by setting K of K-means to 2. Then you have to find an accurate way to calculate the boundary line after clustering.
After finding the boundary line, you have to save one side of the boundary as image-1 and the other side as image-2, and then it becomes easier to recognize the content.
I would also recommend that you look into image segmentation algorithms.
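As a rough sketch of the K = 2 clustering step described above, the following runs a plain K-means loop on made-up keypoint locations. It is written by hand so that it does not depend on the kmeans function from the Statistics Toolbox; the locs values and the simple initialisation are assumptions for illustration, and finding the boundary line is left out.

```matlab
% Hypothetical keypoint (row, column) locations: three near the face,
% three near the hand.
locs = [ 20  30;  25  35;  22  28;
        200 210; 205 215; 198 220];
K = 2;
centers = locs([1 end], :);          % simple deterministic initialisation

for iter = 1:20
    % Squared distance of every keypoint to every cluster centre.
    d = zeros(size(locs, 1), K);
    for k = 1:K
        diffs = bsxfun(@minus, locs, centers(k, :));
        d(:, k) = sum(diffs .^ 2, 2);
    end
    [~, labels] = min(d, [], 2);     % assign each keypoint to its nearest centre

    % Move each centre to the mean of its assigned keypoints.
    newCenters = centers;
    for k = 1:K
        if any(labels == k)
            newCenters(k, :) = mean(locs(labels == k, :), 1);
        end
    end
    if isequal(newCenters, centers)
        break;                       % converged
    end
    centers = newCenters;
end

disp(labels');   % 1 1 1 2 2 2
```

Each label group could then be cropped into its own image before running the recognition on it.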
Hope to hear good progress news from you.
Hi Caglar ,
First, I am really sorry for disturbing you.
Thanks a lot for your suggestion. I am searching for an algorithm to speed up matching and to improve accuracy. But I want to ask you a question about feature extraction: how do I extract one object from many objects? For example, I have a hand sign and a face sign for a word in one image. How can I extract the hand sign from the image, or the face sign, or both, according to the word? If you have any idea, or if you know any algorithm that we can use, please tell me.
Thanks a lot :)
I've got a suggestion about your ASL decoder project.
You have to find a way to deal with the speed issues. One of the main drawbacks of my project is that the program loads the database images one by one and calculates the matching SIFT keypoints at each iteration for each database character. This obviously slows down the operation.
If you can pre-record the SIFT keypoint sets of each database image for different distRatio parameters (let's say at least 3 levels of recursion) and load those stored keypoints on demand, you can definitely speed up the program and make better progress on your project.
I hope this could make sense.
Good luck with your project.
I wish to hear about your progress in the future.
Hi Caglar ,
Thanks a lot for your reply. That's really a wonderful project. First, I tried an image with a white background and a white-skinned hand, and it works with the JPEG format. I really like this project.
Second, I need to add face signs along with hand signs to the project. How can I do that?
I also want to capture video of the signs, cut the video into frames, and convert each sign to a word, not a character.
You're welcome, I hope you liked it.
All right, let's begin with the second question.
You have to note that following image formats are supported by MATLAB.
BMP (Microsoft Windows Bitmap)
GIF (Graphics Interchange Files)
HDF (Hierarchical Data Format)
JPEG (Joint Photographic Experts Group)
PNG (Portable Network Graphics)
TIFF (Tagged Image File Format)
XWD (X Window Dump)
What I would recommend is to use JPEG for simplicity. Other formats should work, but I was not able to run experiments with non-JPEG images.
Now, the first question.
You can add other signs by expanding the Images\Database folder. The number of the sign must match the position of the letter in the alphabet (i.e. B = 2.jpg, C = 3.jpg, etc.).
You also have to change the Selecteds array inside hgr.m
Selecteds=[0 2 3 0 0 0 0 8 9 0 0 12 0 0 15 0 0 0 0 0 0 0 0 0 25 0];
For example if you add 1.jpg (A char.) to the database, you have to change the array as;
Selecteds=[1 2 3 0 0 0 0 8 9 0 0 12 0 0 15 0 0 0 0 0 0 0 0 0 25 0];
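If editing the array by hand becomes error-prone, the same vector can be built from a list of the letters that have a database image. This is a small sketch; the available variable is a hypothetical helper and is not part of hgr.m.

```matlab
% Letters that currently have an image in the Database folder
% (b = 2.jpg, c = 3.jpg, h = 8.jpg, ...). Hypothetical helper variable.
available = 'bchiloy';

Selecteds = zeros(1, 26);
idx = double(available) - double('a') + 1;   % 'a' -> 1, 'b' -> 2, ...
Selecteds(idx) = idx;

disp(Selecteds);
```

With available = 'bchiloy' this reproduces the default array; adding 'a' to the list sets the first entry to 1.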
Note that the background of every database image and input image needs to be purely black.
Also, I was not able to run experiments with the full alphabet.
You might not get full accuracy with additional characters.
Please let me know about the results Nada.
hi Caglar Arslan,
Thank you for this project :) But I need to add other signs. How can I do that? And can I load any image format?
Thanks a lot
thank you Milind
Title, summary and tags sections are updated