version 1.0.0 (2.76 KB) by Nick Higham
MATLAB Code for Parameters of Floating-Point Arithmetics


Updated 23 May 2020

From GitHub

`float_params` is a MATLAB function for obtaining the parameters of several
floating-point arithmetics. The parameters are built into the code and are
not computed at run time.

The parameters are

- the unit roundoff,
- the smallest positive (subnormal) floating-point number,
- the smallest positive normalized floating-point number,
- the largest floating-point number,
- the number of binary digits in the significand (including the
implicit leading bit)

and the arithmetics supported are

- bfloat16,
- IEEE half precision (fp16),
- IEEE single precision (fp32),
- IEEE double precision (fp64),
- IEEE quadruple precision (fp128).
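
As an illustration of how the listed parameters relate to one another, here is a minimal Python sketch. It is not the `float_params` code and does not mirror its interface. The name `float_params_sketch`, the `FORMATS` table, and the dictionary keys are hypothetical, chosen for this example. It assumes the usual binary floating-point model in which each format is described by its precision `p` (significand bits, including the implicit leading bit) and its maximum exponent `emax`, with `emin = 1 - emax`. Exact rational arithmetic is used so that the fp128 values do not overflow or underflow.

```python
from fractions import Fraction

# Hypothetical table: (p, emax) for each arithmetic listed above.
# p counts the significand bits including the implicit leading bit.
FORMATS = {
    "bfloat16": (8, 127),
    "fp16": (11, 15),
    "fp32": (24, 127),
    "fp64": (53, 1023),
    "fp128": (113, 16383),
}


def float_params_sketch(name):
    """Return the five parameters for the named format as exact values."""
    p, emax = FORMATS[name]
    emin = 1 - emax                      # minimum exponent of normalized numbers
    two = Fraction(2)
    u = two ** (-p)                      # unit roundoff (round to nearest)
    xmin = two ** emin                   # smallest positive normalized number
    xmins = two ** (emin - p + 1)        # smallest positive subnormal number
    xmax = two ** emax * (2 - two ** (1 - p))  # largest finite number
    return {"u": u, "xmins": xmins, "xmin": xmin, "xmax": xmax, "p": p}


if __name__ == "__main__":
    for name in FORMATS:
        params = float_params_sketch(name)
        # Print only the exponent-sized quantities that fit in a double.
        print(name, "p =", params["p"], "u = 2^-%d" % params["p"])
```

Calling `float_params_sketch("fp16")["xmax"]` yields the exact rational value 65504, the largest finite half-precision number. Because the results are `Fraction` objects, convert them with `float(...)` only for formats whose range fits in double precision.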

The code was developed in MATLAB R2018b and works with releases at least as
far back as R2016b.

Cite As

Nick Higham (2021). float_params, GitHub.

Comments and Ratings (1)

Marco Cococcioni

I have extended this function slightly to compute the same values for BFloat8, a data type of interest to the machine learning community.
It is available on the MATLAB File Exchange as the float_params2 entry.

MATLAB Release Compatibility
Created with R2018b
Compatible with any release
Platform Compatibility
Windows macOS Linux

Inspired: float_params2
