In your result, you compute the residual error as:
d = (B(:,i) - (s*R*A(:,i) + T));
err = err + norm(d);
Wouldn't it be more appropriate to compute the sum of squared errors, because this is what you actually minimize? (so just use norm(d).^2 instead)
A nice addition would be to add the symmetric scale computation as mentioned later in the paper as a third option.