Good evening Anna.
Allow me to add my humble opinion.
To train an SVM, you need a set of positive and negative examples. Usually you need hundreds or thousands of each, with many more negatives than positives (roughly 10 to 20 times as many). For images, this means you need many photos of human figures as positives, and relevant images without such figures as negatives (we usually used all image regions without humans as negatives). This usually implies having a sufficiently large database with ground-truth markings/annotations. Building one on your own is a lot of work, but luckily, for many problems under serious research, sets of examples are already available. Next, you convert each image (negative and positive) into a feature vector. This results in a huge group of vectors, each with an a priori known label: "positive" or "negative". You provide these to the SVMtrain function (depending on the toolbox you're using), specifying the SVM parameters, such as the kernel (linear, RBF, etc.), and voila, you have a trained SVM. Now, with SVMclassify, using the trained classifier and a given feature vector, you will know whether it is considered positive or negative. You've got yourself a detector! Good luck!
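In case it helps, here is a minimal sketch of the steps above (labelled feature vectors, training, then classifying a new vector). It is written in plain Python rather than with the toolbox functions mentioned above, and it makes several assumptions: the toy 2-D vectors merely stand in for real image feature vectors, the trainer is a simple linear SVM fitted by subgradient descent on the hinge loss (not what any particular toolbox does internally), and all function names (`make_toy_features`, `train_linear_svm`, `classify`) are hypothetical.

```python
import random

def make_toy_features(n_pos, n_neg, seed=0):
    """Toy stand-in for image feature vectors (assumption: 2-D Gaussian blobs)."""
    rng = random.Random(seed)
    vectors, labels = [], []
    for _ in range(n_pos):   # "positive" examples, label +1
        vectors.append([rng.gauss(2.0, 0.5), rng.gauss(2.0, 0.5)])
        labels.append(1)
    for _ in range(n_neg):   # "negative" examples, label -1
        vectors.append([rng.gauss(-2.0, 0.5), rng.gauss(-2.0, 0.5)])
        labels.append(-1)
    return vectors, labels

def decision_value(w, b, x):
    """Signed score w.x + b of a feature vector."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(vectors, labels, epochs=100, eta=0.01, lam=0.01, seed=0):
    """Linear SVM via stochastic subgradient descent on the regularized hinge loss."""
    rng = random.Random(seed)
    w = [0.0] * len(vectors[0])
    b = 0.0
    order = list(range(len(vectors)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = vectors[i], labels[i]
            violated = y * decision_value(w, b, x) < 1.0
            # Regularization step: shrink the weights slightly.
            w = [wj * (1.0 - eta * lam) for wj in w]
            if violated:
                # Hinge-loss step: move the boundary away from this example.
                w = [wj + eta * y * xj for wj, xj in zip(w, x)]
                b += eta * y
    return w, b

def classify(w, b, x):
    """Return +1 ("positive") or -1 ("negative") for a feature vector."""
    return 1 if decision_value(w, b, x) >= 0.0 else -1

# About 10x more negatives than positives, as suggested above.
vectors, labels = make_toy_features(n_pos=20, n_neg=200)
w, b = train_linear_svm(vectors, labels)

correct = sum(classify(w, b, x) == y for x, y in zip(vectors, labels))
train_accuracy = correct / len(labels)
print("training accuracy:", train_accuracy)
print("label for [2.0, 2.0]:", classify(w, b, [2.0, 2.0]))
```

For real images you would replace `make_toy_features` with your own feature extraction, and most likely use your toolbox's training and classification functions instead of the hand-written trainer, which is only there to keep the sketch self-contained.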