# Stochastic Gradient Descent (SGD) for Image Processing

Ibraheem Al-Dhamari on 24 Jan 2017
Commented: Dan on 23 Jun 2017
Dear all,
I am trying to apply SGD to solve a classical image processing problem, as in this link. I am not sure what I should change. Here is the gradient descent code:
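```matlab
niter = 500;   % number of iterations
x = u;         % initial value for x; u is the input noisy image
% epsilon, lambda and tau are assumed to be set before this loop
for i = 1:niter
    % The post does not show how gdx and GradJ are computed. The two lines
    % below and the GradJ line further down are assumptions, using grad/div
    % helpers assumed to come from the same toolbox as imageplot/clamp.
    Gr = grad(x);
    gdx = Gr.^2;
    % smoothed total variation of the image
    sgdx = gdx(:,:,1) + gdx(:,:,2);
    NormEps = sqrt(epsilon^2 + sgdx);
    J = sum(NormEps(:));   % this is a scalar value
    GradJ = -div(Gr ./ repmat(NormEps, [1 1 2]));   % assumed, see above
    % functional to minimize; lambda is the weight of J
    nm = sum((x(:) - u(:)).^2);   % nm is already the squared norm
    f = 1/2 * nm + lambda * J;
    % the gradient of the functional f is:
    % x - u + lambda * GradJ
    x = x - tau * (x - u + lambda * GradJ);
end
clf;
imageplot(clamp(x));   % this is the resulting denoised image
```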
I understand that in SGD we take only a random part of the image at each iteration and then compute the minimum. But if I apply this to the input noisy image, I will (badly) denoise a small part of the image at each iteration, right? An explanation based on the code above would be excellent!
Best regards,
Ibraheem
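Written out, the update step in the loop above corresponds to the following functional and gradient. This is a reconstruction from the code, not something stated in the post: it assumes `gdx` holds the squared components of the image gradient and that div denotes the negative adjoint of the gradient operator.

$$
f(x) = \frac{1}{2}\,\|x - u\|^2 + \lambda\, J(x),
\qquad
J(x) = \sum_{p} \sqrt{\varepsilon^2 + \|(\nabla x)_p\|^2}
$$

$$
\nabla f(x) = x - u + \lambda\, \nabla J(x),
\qquad
\nabla J(x) = -\operatorname{div}\!\left( \frac{\nabla x}{\sqrt{\varepsilon^2 + \|\nabla x\|^2}} \right)
$$

Plain gradient descent then iterates $x \leftarrow x - \tau\, \nabla f(x)$, which matches the last line of the loop.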
Dan on 23 Jun 2017
Hi Ibraheem, I have updated an existing algorithm to apply SGD: https://www.mathworks.com/matlabcentral/fileexchange/62921-ecc-registration-100x-faster
Enjoy, Dan

Xilin Li on 2 Feb 2017
Updating a random part of the image at each iteration is not SGD. In SGD, the parameter you want to optimize, say x, is the same x at every iteration; what changes is that the gradient used to update x is noisy, because an expectation is replaced with a sample average. I checked your image denoising problem. It is a standard convex optimization problem, and there are many efficient solvers for it. You left a comment on my psgd post, and I showed how to use psgd to converge to better solutions faster.
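To make the "same x, noisy gradient" point concrete, here is a minimal sketch on a toy least-squares problem. The problem, the variable names (`A`, `b`, `x_true`, `batch`) and the step-size schedule are all assumptions chosen for illustration; none of them come from the thread or from the psgd code.

```matlab
% Toy problem (hypothetical): recover x_true from noisy linear measurements
% by minimizing the average of 0.5*(A(i,:)*x - b(i))^2 over all N examples.
N = 2000; d = 5;
A = randn(N, d);
x_true = (1:d)';
b = A*x_true + 0.01*randn(N, 1);

x = zeros(d, 1);          % the single parameter vector, updated every iteration
batch = 20;               % number of examples in each sample average
for k = 1:3000
    idx = randi(N, batch, 1);          % draw random examples
    r = A(idx, :)*x - b(idx);          % residuals on those examples only
    g = A(idx, :)'*r / batch;          % noisy estimate of A'*(A*x - b)/N
    tau = 0.1/(1 + k/200);             % decaying step size (an assumption)
    x = x - tau*g;                     % all entries of x move at every step
end
fprintf('distance to x_true: %g\n', norm(x - x_true));
```

The randomness is in which examples enter the gradient estimate, not in which entries of `x` are allowed to change.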
Ibraheem Al-Dhamari on 3 Feb 2017
[update]
I checked the code and still do not see where the stochastic part is. I saw these lines:
dX = sqrt(eps)*randn(size(x));
I think the first line adds random noise to x (the filtered image), but you still need to compute the gradient of the whole image, not part of it.
From Wikipedia:
"In stochastic (or "on-line") gradient descent, the true gradient of Q(w) is approximated by a gradient at a single example".
My understanding was that we need to compute the gradient of part of the image instead of the whole image, right?
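One possible way to connect the quoted definition to the denoising functional: the functional is a sum of per-pixel terms, so an "example" can be read as one such term, and a random subset of terms gives a noisy but unbiased estimate of the full gradient. The sketch below works under that assumption only. It is not the method used in the psgd code, and the test image, the periodic finite differences (a stand-in for the toolbox `grad`/`div`), the helper names and all parameter values are made up for illustration.

```matlab
% Sketch only: toy image, periodic differences and parameters are assumptions.
n = 32;
x0 = zeros(n); x0(9:24, 9:24) = 1;       % clean test image (a square)
u  = x0 + 0.1*randn(n);                  % noisy input
lambda = 0.1; epsilon = 0.1;

% forward differences and their adjoints (periodic boundaries)
dx  = @(x) circshift(x, [-1 0]) - x;
dy  = @(x) circshift(x, [0 -1]) - x;
dxT = @(v) circshift(v, [1 0]) - v;
dyT = @(v) circshift(v, [0 1]) - v;
nrm = @(x) sqrt(epsilon^2 + dx(x).^2 + dy(x).^2);

% full objective, and the gradient of only the pixel terms selected by mask M
fobj  = @(x) 0.5*sum((x(:) - u(:)).^2) + lambda*sum(sum(nrm(x)));
gterm = @(x, M) M.*(x - u) + ...
        lambda*(dxT(M.*dx(x)./nrm(x)) + dyT(M.*dy(x)./nrm(x)));

p   = 0.5;                               % probability of keeping a term
tau = 0.5/(1 + 8*lambda/epsilon);        % conservative step size
x = u;
for k = 1:300
    M = double(rand(n) < p);             % random subset of pixel terms
    g = gterm(x, M)/p;                   % dividing by p keeps the estimate unbiased
    x = x - tau*g;                       % x is one shared variable across iterations
end
fprintf('f(u) = %.3f, f(x) = %.3f\n', fobj(u), fobj(x));
```

Averaged over many masks, `g` equals the full gradient `gterm(x, ones(n))`, so the iteration targets the whole functional rather than denoising isolated patches. For a problem this small the full gradient is cheap, so the sketch is mainly of conceptual interest.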