CustomFloat

Numeric object with a custom floating-point data type

Description

Use a CustomFloat object to define a floating-point numeric data type with specified word length and mantissa length. Floating-point data types defined by a CustomFloat object adhere to the IEEE 754-2008 standard. For more information on floating-point data types, see Floating-Point Numbers.

Creation

Description

x = CustomFloat(v) returns a CustomFloat object with value v. The output object has the same word length, mantissa length, and exponent length as input v.

x = CustomFloat(v, type) returns a CustomFloat object with value v and floating-point type specified by type.

x = CustomFloat(v, WordLength, MantissaLength) returns a CustomFloat object with the specified word length and mantissa length.

x = CustomFloat(v, WordLength, MantissaLength, 'typecast') returns a CustomFloat object with the bit pattern of v and the specified mantissa length. The word length must match the word length of the input v.

x = CustomFloat(cf) returns a CustomFloat object with value and data type properties of CustomFloat object cf.
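
As a minimal sketch of the last syntax, the following copies an existing CustomFloat object. It assumes Fixed-Point Designer is installed; the variable names cf, x, and sameType are illustrative.

```matlab
% Create a half-precision CustomFloat object, then copy it.
cf = CustomFloat(pi, 'half');
x = CustomFloat(cf);

% Per the syntax description above, x should carry the same
% data type properties as cf.
sameType = (x.WordLength == cf.WordLength) && ...
           (x.MantissaLength == cf.MantissaLength)
```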

Input Arguments

v

Value of the CustomFloat object, specified as a scalar, vector, matrix, or multidimensional array.

Data Types: half | single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | fi
Complex Number Support: Yes

type

Floating-point data type of the CustomFloat object, specified as 'double', 'single', or 'half'.

The properties of these types are summarized in the following table.

Type      Word Length   Mantissa Length
double    64            52
single    32            23
half      16            10

Data Types: char

cf

Custom floating-point type, specified as a CustomFloat object.

Properties

ExponentBias

Scalar integer representing the offset value for the exponent.

You cannot change this property directly. To change it, change the WordLength and MantissaLength properties, which determine the ExponentLength property. The ExponentBias for a floating-point data type is computed with the following equation:

ExponentBias = 2^{e-1} - 1    (1)

where e represents the ExponentLength.

Data Types: double
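
As a quick check of the bias equation, the following sketch evaluates it for the exponent lengths of the double, single, and half types using only base MATLAB functions. The variable names e and bias are illustrative.

```matlab
% Exponent lengths for double, single, and half precision.
e = [11 8 5];

% ExponentBias = 2^(e-1) - 1, evaluated element-wise.
bias = 2.^(e - 1) - 1
% bias = 1023   127    15
```

These values agree with the ExponentBias shown for each type in the Examples section.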

ExponentLength

Number of bits representing the exponent. You cannot edit this property directly. To change the exponent length, change the MantissaLength and WordLength properties.

The ExponentLength, MantissaLength, and WordLength properties are related through the following equation:

WordLength = 1 + MantissaLength + ExponentLength    (2)

Data Types: double
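
The relationship above can be rearranged to find the exponent length that results from a given word length and mantissa length. The following sketch uses only base MATLAB functions; the variable names wl, ml, and el are illustrative.

```matlab
wl = [64 32 16 16];   % word lengths: double, single, half, and a custom type
ml = [52 23 10 4];    % mantissa lengths for the same types

% Solve WordLength = 1 + MantissaLength + ExponentLength for ExponentLength.
% The leading 1 accounts for the sign bit.
el = wl - 1 - ml
% el = 11     8     5    11
```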

MantissaLength

Number of bits representing the mantissa, specified as a scalar integer.

The ExponentLength, MantissaLength, and WordLength properties are related through the following equation:

WordLength = 1 + MantissaLength + ExponentLength    (3)

Example: custfloat.MantissaLength = 14;

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | fi
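
As a hedged sketch of how these properties interact, the following changes MantissaLength on an existing object and reads back ExponentLength. It assumes Fixed-Point Designer is installed and that the property can be assigned as in the example line above; the starting values are illustrative.

```matlab
% Start from a 16-bit type with a 10-bit mantissa (ExponentLength is 5).
custfloat = CustomFloat(pi, 16, 10);

% Shrink the mantissa. WordLength is unchanged, so by
% WordLength = 1 + MantissaLength + ExponentLength the exponent
% length is expected to become 16 - 1 - 7 = 8.
custfloat.MantissaLength = 7;
el = custfloat.ExponentLength
```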

WordLength

Total number of bits in the data type, specified as a scalar integer.

The ExponentLength, MantissaLength, and WordLength properties are related through the following equation:

WordLength = 1 + MantissaLength + ExponentLength    (4)

Example: custfloat.WordLength = 28;

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | fi

Object Functions

abs          Absolute value and complex magnitude
ceil         Round toward positive infinity
complex      Create complex array
conj         Complex conjugate
cosh         Hyperbolic cosine
exp          Exponential
fix          Round toward zero
floor        Round toward negative infinity
fma          Multiply and add using fused multiply add approach
hypot        Square root of sum of squares (hypotenuse)
ldivide      Left array division
log          Natural logarithm
log2         Base 2 logarithm and floating-point number dissection
log10        Common logarithm (base 10)
minus        Subtraction
mod          Remainder after division (modulo operation)
mtimes       Matrix multiplication
ndims        Number of array dimensions
plus         Addition or append strings
pow10        Base 10 power and scale half-precision numbers
pow2         Base 2 power and scale floating-point numbers
power        Element-wise power
rdivide      Right array division
real         Real part of complex number
rem          Remainder after division
round        Round to nearest decimal or integer
rsqrt        Reciprocal square root
sqrt         Square root
tanh         Hyperbolic tangent
times        Multiplication
uminus       Unary minus
uplus        Unary plus
bin          Unsigned binary representation of stored integer of fi object
double       Double-precision arrays
fi           Construct fixed-point numeric object
int8         8-bit signed integer arrays
int16        16-bit signed integer arrays
int32        32-bit signed integer arrays
int64        64-bit signed integer arrays
isnan        Determine which array elements are NaN
isreal       Determine whether array uses complex storage
single       Single-precision arrays
uint8        8-bit unsigned integer arrays
uint16       16-bit unsigned integer arrays
uint32       32-bit unsigned integer arrays
uint64       64-bit unsigned integer arrays
eq           Determine equality
ge           Determine greater than or equal to
gt           Determine greater than
le           Determine less than or equal to
lt           Determine less than
ne           Determine inequality
cat          Concatenate arrays
ctranspose   Complex conjugate transpose
horzcat      Horizontal concatenation for heterogeneous arrays
isfinite     Determine which array elements are finite
isinf        Determine which array elements are infinite
norm         Vector and matrix norms
numel        Number of array elements
reshape      Reshape array
size         Array size
transpose    Transpose vector or matrix
vertcat      Vertical concatenation for heterogeneous arrays
disp         Display value of variable

Examples

This example shows how to create a CustomFloat object.

v = pi;
x = CustomFloat(v)
x = 
    3.1416


           Data Type: Floating-point: Double-precision
          WordLength:  64
      MantissaLength:  52
      ExponentLength:  11
        ExponentBias: 1023

Because the input to the CustomFloat constructor is a double, the CustomFloat object x also has a double-precision data type. If the value passed to the CustomFloat function is a single, the resulting CustomFloat object has a single-precision floating-point data type.

v = single(pi);
x = CustomFloat(v)
x = 
    3.1416


           Data Type: Floating-point: Single-precision
          WordLength:  32
      MantissaLength:  23
      ExponentLength:   8
        ExponentBias: 127

To create a CustomFloat object with a specified floating-point data type, specify the data type as the second argument in the CustomFloat function.

v = pi;
x = CustomFloat(v,'half')
x = 
    3.1406


           Data Type: Floating-point: Half-precision
          WordLength:  16
      MantissaLength:  10
      ExponentLength:   5
        ExponentBias:  15

Specify a word length and a mantissa length in the CustomFloat function.

v = pi;
wl = 16;
ml = 4;
x = CustomFloat(v,wl,ml)
x = 
    3.1250


           Data Type: Floating-point: Custom-precision
          WordLength:  16
      MantissaLength:   4
      ExponentLength:  11
        ExponentBias: 1023

Compare the difference between the double-precision value and the value of the CustomFloat object as you change the mantissa length.

err = zeros(1,12);
for ml = 1:12
    x = CustomFloat(v,wl,ml);
    err(ml) = v-double(x);    
end

plot(err);
title('Error: v - double(x)');
ylabel('Error');
xlabel('Mantissa Length');
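
The trend in this plot can be approximated without a CustomFloat object. The following sketch rounds v to ml mantissa bits using only base MATLAB functions. It assumes round-to-nearest and a positive value in the normal range, and it ignores exponent-range limits, so it approximates the result rather than reproducing how CustomFloat is implemented. quantizeMantissa is a hypothetical helper name.

```matlab
v = pi;

% Hypothetical helper: round a positive value to ml mantissa bits.
% floor(log2(v)) is the unbiased exponent, so v/2^exponent lies in [1, 2).
quantizeMantissa = @(v, ml) ...
    round(v ./ 2.^floor(log2(v)) .* 2.^ml) ./ 2.^ml .* 2.^floor(log2(v));

q4  = quantizeMantissa(v, 4)    % 3.1250, matches the 4-bit mantissa example
q10 = quantizeMantissa(v, 10)   % 3.1406, matches the half-precision example

% Approximate error for each mantissa length used in the loop above.
ml = 1:12;
errApprox = v - quantizeMantissa(v, ml);
```

Under these assumptions, the magnitude of the error for pi is at most 2^(-ml), so each added mantissa bit halves the worst-case error.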

Using the 'typecast' input argument, the CustomFloat function creates a CustomFloat object with the bit pattern of the input value, and the specified word length and mantissa length.

Define a single-precision value. Single-precision floating-point data types have a 32-bit word length and 23-bit mantissa length. View the binary representation of the single-precision value.

v = single(pi);
bit_pattern = bin(CustomFloat(v))
bit_pattern = 
'01000000010010010000111111011011'

Define a CustomFloat object that has the same bit pattern as the input value, but has a different mantissa length.

x = CustomFloat(v, 32, 20, 'typecast')
x = 
   50.1239


           Data Type: Floating-point: Custom-precision
          WordLength:  32
      MantissaLength:  20
      ExponentLength:  11
        ExponentBias: 1023

View the binary representation of the CustomFloat object, and compare it to the bit pattern of the single-precision input value.

bit_pattern2 = bin(x)
bit_pattern2 = 
'01000000010010010000111111011011'
same = strcmp(bit_pattern, bit_pattern2)
same = logical
   1
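
To see why the same bits yield different values, the bit pattern can be decoded by hand. The following sketch uses only base MATLAB functions and applies the ExponentBias equation from the Properties section. It assumes a positive, normal value (sign bit 0, no subnormals, Inf, or NaN); decode is a hypothetical helper name.

```matlab
bits = '01000000010010010000111111011011';

% Hypothetical helper: decode a positive, normal bit pattern given the
% exponent length el and mantissa length ml.
% Bit 1 is the sign, the next el bits are the biased exponent, and the
% remaining ml bits are the mantissa (fraction).
decode = @(bits, el, ml) ...
    (1 + bin2dec(bits(2+el:1+el+ml)) / 2^ml) * ...
    2^(bin2dec(bits(2:1+el)) - (2^(el-1) - 1));

asSingle = decode(bits, 8, 23)    % approximately 3.1416
asCustom = decode(bits, 11, 20)   % approximately 50.1239
```

With an 11-bit exponent, the exponent field absorbs three bits that single precision treats as mantissa, which changes both the scale factor and the fraction.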

Limitations

The following functions, which support custom floating-point inputs, do not support complex custom floating-point inputs.

Introduced in R2020a