rescale function does not work correctly
Show older comments
In my computer, the code
load('B');
min(rescale(B(:)),[],'all')
get answer 1.1102e-16, not 0, does anyone have the same problem?
Answers (1)
Jack
on 29 Mar 2023
Hi,
Yes, it is a common issue in numerical computing due to the limitations of floating-point arithmetic. In floating-point arithmetic, numbers are represented with a finite number of bits, which can lead to rounding errors and inaccuracies in calculations.
In your case, the value of 1.1102e-16 is very close to zero, but it is not exactly zero. When you use the rescale function, it scales the elements of B so that the minimum value becomes 0 and the maximum value becomes 1. However, due to rounding errors in the computation, the minimum value of B may not be exactly zero, but rather a very small number close to zero.
When you apply the min function to the rescaled B, it returns the smallest value in the matrix, which is very close to zero but not exactly zero. The 'all' option specifies that the minimum value should be computed over all elements of the matrix.
To avoid this issue, you can use a tolerance value when comparing values to zero. For example, you could modify your code to check if the minimum value is within a certain tolerance of zero, like this:
load('B');
tolerance = 1e-10; % set a tolerance value
min_val = min(rescale(B(:)));
if abs(min_val) < tolerance
disp('Minimum value is approximately zero');
else
disp('Minimum value is not approximately zero');
end
This code checks if the absolute value of the minimum value is less than the tolerance value. If it is, it displays a message indicating that the minimum value is approximately zero. Otherwise, it indicates that the minimum value is not approximately zero.
9 Comments
Jack
on 29 Mar 2023
I apologize for the confusion. You are correct that according to the rescaling algorithm mentioned in the documentation you provided, the minimum value of the scaled array should be zero.
However, if the minimum value of the rescaled array is not exactly zero, it may be due to floating-point precision errors. MATLAB uses double-precision floating-point numbers to represent data, which have a limited number of bits to represent the number. Therefore, some numbers cannot be represented exactly and may lead to small rounding errors.
In practice, these small differences are usually not important, and it is common to use a tolerance value when comparing floating-point numbers. For example, you could consider any number between 0 and 1e-6 to be zero, and check if the minimum value of the rescaled array is within this tolerance range:
load('B');
scaled_B = (B-min(B(:)))/(max(B(:))-min(B(:)));
tolerance = 1e-6;
if abs(min(scaled_B)) < tolerance
disp('Minimum value is close enough to zero');
else
disp('Minimum value is not close enough to zero');
end
This will output "Minimum value is close enough to zero" if the minimum value of scaled_B is within 1e-6 of zero, or "Minimum value is not close enough to zero" otherwise.
Steven Lord
on 29 Mar 2023
Please try this little experiment. Find something to write with and something to write on (ideally compatible things; pencil and paper not pencil and whiteboard.)
Step 1: Using long division (like you learned in school) divide 1 by 3. Call the result x. You are allowed to write as many decimal places of the result as you want, but only those you explicitly write can be used in step 2. No using 0.3 repeating to get "an infinite" number of places.
Step 2: Multiply x by 3. Call the result y.
In exact arithmetic we know (1/3)*3 is exactly 1. But the x value you defined in step 1 is not one third. It is slightly smaller than one third because you rounded off one third to fit it into x. If you've written one more decimal place in step 1 you'd have an x that's closer to one third than the x you actually used in step 2. Therefore y will not be 1. The value stored in y will be slightly smaller than 1.
Similarly, in your problem if we performed the calculations in exact arithmetic you might actually get 0 as an answer. But in floating point arithmetic that's not always possible. In this case 1e-16 is reasonably close to 0.
In theory there is no difference between theory and practice. In practice there is. In theory the answer should be 0, in practice the answer should be very close to 0.
It's worth noting that the operations are not exactly as simple as the conceptual equivalent:
scaled_B = (B-min(B(:)))/(max(B(:))-min(B(:)));
... so what otherwise would seem like a simple subtraction
minimum_of_scaled_B = min(B(:)) - min(B(:));
... involves more than subtraction.
If you want to know what actually goes on internally, open rescale.m and see.
Image Analyst
on 30 Mar 2023
@Steven Lord since the documentation says that if you use rescale with no other input arguments, as @bd l did, "R = rescale(X) scales the entries of X to the interval [0,1]. " then it should do that. In other words, it should check the min and if the min is not exactly zero, and there are no other arguments, it should set it to exactly zero to match the documentation.
For other ranges, where you pass in args #2 and #3, the numbers may not exactly match what was passed in, but again you could check the min and max of the result and simply set them equal to the passed in values.
Can you expand on this comment? I'm not getting the point you are trying to make (even after looking in rescale.m)
I don't see how truncation of digits applies here. The input A is double precision. At least one value of A must satisfy
A == min(A(:))
and that value of A is called inmin,
At least one value of A must satisfy
A == max(A(:))
and that value of A is called inmax.
For any element of A that is exactly equal to inmin or inmax (there must be at least one of each) how can
(inmin - inmin)/(inmax - inmin) be anything other than zero
and how can
(inmax - inmin)/(inmax - inmin) be anything other than 1 (the numerator and denominator must have identical floating point representations, right?)
Here's an example of the latter
rng(102)
format long e
r = rescale(rand(1,10));
min(r)
max(r)
It actually seems to be harder to find a case where min(r) ~= 0 as in the OP's case.
My comment is really as simple as it appears. OP's reading of the documentation is a seemingly obvious formula which would result in the output minimum always being exactly zero, since it's a number minus itself divided by another number. In reality, the actual calculations have more room for rounding errors to occur.
EDIT:
Oh I see what's going on here. In the current version, all of that code is built-in, so there's nothing to see. I'm running R2019b, where it's still naked m-code.
I'm not going to paste it here, but I don't know that it really matters what specific numerical operation causes the errors. I have a feeling that it's more important to understand why rescale() is not using the aforementioned simple formula. The comments in the code suggest that it does some prescaling to avoid overflow/underflow, though I admit I haven't really thought through every line. I'd rather leave that to someone who can explain it more confidently than I can.
Since I haven't analyzed the whole thing, I'm not making any claims that the code as given could not be made to return exact zero for the output minimum. I'm just saying that the seemingly obvious formula is misrepresenting what's actually under the hood.
bd l
on 30 Mar 2023
I think that explaining relevant non-obvious internal operations is precisely what an "algorithms" section in the documentation is supposed to clarify, so I think I agree about that.
I fhought about trying to rewrite the code to share, but I'm not sure how much I can really "rewrite" it in such a way that it's not just obfuscated copypasta -- especially since I'd rather just keep the comments verbatim.
EDIT: I'm just going to paste my little excerpt that I was poking at. It's still largely the original code, though I rearranged and renamed some things for my own viewing comfort. I threw in a few comments and removed some complications that aren't relevant to this specific usage. If someone objects that it's still too infringement-like, then I invite them to offer a better clarification.
load('B');
% we don't actually need the entire array
% we only need the extrema to observe the effect
B = [min(B(:)) max(B(:))]
% assuming default input limits
inlo = double(min(B(:)));
inhi = double(max(B(:)));
% assuming default output limits
outlo = 0;
outhi = 1;
% clamp input to input range
B = min(max(B,inlo),inhi);
% Regularize constant values to return lowerbound of output range
% i'm guessing this is to avoid dividing by zero
constReg = (inlo == inhi);
% Determine where to center the problem based on the input range
sigma = min(max(0,inlo),inhi); % zero if inrange crosses zero; otherwise input limit closest to zero
inlo = inlo - sigma; % shift the input limits so that zero is on or between them
inhi = inhi - sigma;
% Scale to prevent overflow/underflow
% i'll leave this to you to unpack
e1 = nextpow2(max(abs(inlo), abs(inhi)));
e2 = nextpow2(max(abs(outlo),abs(outhi)));
r1 = 2.^(e1-1);
r2 = 2.^(e2-1);
r3 = 2.^(fix((e1+e2)/2)-1);
z = ( (inhi./r1).*(outlo./r3) ...
- (inlo./r1).*(outhi./r3) ...
+ (outlo./r3).*(constReg./r1) ) ...
./ ((inhi./r1) - (inlo./r1) + (constReg./r1));
slope = ((outhi./r2) - (outlo./r2)) ...
./ ((inhi./r3) - (inlo./r3) + (constReg./r3));
% assuming input is float
R = r2.*((slope./r3).*(B - sigma) + (r3./r2).*z);
% clamp output to output range
R = min(max(R,outlo),outhi);
fprintf('%.20f\n',R)
Categories
Find more on Logical in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!