If all you have is a picture of a piece of clothing, and a picture of a person, then you cannot reliably fit the clothing on the image of the person.
In the best case, the person happens to be holding their body in the same pose as the clothing is shown in. In that situation, you could use image registration or cross-correlation to locate corresponding points on the clothing and on the person, then scale and rotate the clothing image as needed and composite it on top of the person.
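To make the lucky case concrete, here is a minimal pure-Python sketch of normalized cross-correlation: slide a small template over an image and report where it matches best. This is a toy on integer grids; a real pipeline would use an optimized routine such as OpenCV's `matchTemplate`, but the scoring idea is the same.

```python
import math

def ncc(a, b):
    # Normalized cross-correlation of two equal-size patches
    # (lists of rows of pixel values). Returns a score in [-1, 1].
    av = [v for row in a for v in row]
    bv = [v for row in b for v in row]
    ma = sum(av) / len(av)
    mb = sum(bv) / len(bv)
    num = sum((x - ma) * (y - mb) for x, y in zip(av, bv))
    da = math.sqrt(sum((x - ma) ** 2 for x in av))
    db = math.sqrt(sum((y - mb) ** 2 for y in bv))
    # Flat (zero-variance) patches cannot match anything meaningfully.
    return num / (da * db) if da and db else 0.0

def locate(image, template):
    # Slide the template over every position and keep the best score.
    th, tw = len(template), len(template[0])
    best, best_pos = -2.0, (0, 0)
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(patch, template)
            if score > best:
                best, best_pos = score, (r, c)
    return best_pos
```

This only finds a translation; in practice you would also search over scale and rotation, or match several keypoints and solve for the transform.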
However, most of the time you will not get that lucky. For example, in the sample jacket the arms are straight and held at a particular angle away from the body, whereas the arms of the person in the image are at a different angle relative to the body (and different from each other), and they are not exactly straight. You would therefore need to deform the picture of the clothing, angling parts of it relative to the rest -- in effect, image registration with distortions.
The image-registration-with-distortion problem is at least well studied, and you can find algorithms for it in the literature.
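One common building block for such distorting warps is piecewise-affine mapping: split the image into triangles, then move each point according to where its triangle's corners went. A minimal sketch using barycentric coordinates (toy coordinates, not a full image warper; libraries like scikit-image provide a complete `PiecewiseAffineTransform`):

```python
def barycentric(p, tri):
    # Barycentric coordinates of point p inside triangle tri = [(x, y), ...].
    (x1, y1), (x2, y2), (x3, y3) = tri
    px, py = p
    d = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    l1 = ((y2 - y3) * (px - x3) + (x3 - x2) * (py - y3)) / d
    l2 = ((y3 - y1) * (px - x3) + (x1 - x3) * (py - y3)) / d
    return l1, l2, 1.0 - l1 - l2

def warp_point(p, src_tri, dst_tri):
    # Map p from src_tri to the corresponding location in dst_tri:
    # express p in barycentric coordinates, then rebuild it from dst corners.
    l1, l2, l3 = barycentric(p, src_tri)
    return (l1 * dst_tri[0][0] + l2 * dst_tri[1][0] + l3 * dst_tri[2][0],
            l1 * dst_tri[0][1] + l2 * dst_tri[1][1] + l3 * dst_tri[2][1])
```

Applied per triangle across a mesh laid over the garment, this lets each region (an upper arm, a forearm) be rotated and scaled independently while staying connected at shared edges.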
But... when all you have is the image of the clothing, you know nothing about the stiffness of the fabric or which parts stretch or curve when worn. And that means you cannot reliably place it over a human in a natural pose -- not unless you have engineering data that models that particular garment as a mesh of nodes, each cell of which is modeled as a flat triangle (you can use larger triangles for the back than for the arms if you need to).
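The kind of mesh model this requires can be sketched in miniature: treat the garment as nodes joined by edges with known rest lengths (the triangle sides), and iteratively nudge nodes so edges keep those lengths as the pose changes. This is a bare-bones position-based relaxation, not a real cloth simulator -- production systems add bending stiffness, per-region stretch limits, and collision with the body:

```python
import math

def enforce_rest_lengths(nodes, edges, rest, iterations=20):
    # nodes: list of (x, y) positions, mutated in place.
    # edges: list of (i, j) index pairs; rest: matching rest lengths.
    # Repeatedly move each edge's endpoints toward its rest length.
    for _ in range(iterations):
        for (i, j), r in zip(edges, rest):
            xi, yi = nodes[i]
            xj, yj = nodes[j]
            dx, dy = xj - xi, yj - yi
            d = math.hypot(dx, dy)
            if d == 0:
                continue  # degenerate edge; nothing to correct
            corr = 0.5 * (d - r) / d
            nodes[i] = (xi + corr * dx, yi + corr * dy)
            nodes[j] = (xj - corr * dx, yj - corr * dy)
    return nodes
```

The point is that the rest lengths and stiffness are exactly the data a single photograph does not give you: without them, any such solver is guessing at how the fabric behaves.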