Errors are measured in “units of the last place”. This is a measure for the relative error. For a number z with the representation d.d...d·2^e (we assume IEEE floating-point numbers with base 2) the ULP is represented by
|d.d...d - (z / 2^e)| / 2^(p - 1)
where p is the number of bits in the mantissa of the
floating-point number representation. Ideally the error for all
functions is always less than 0.5ulps in round-to-nearest mode. Using
rounding bits this is also
possible and normally implemented for the basic operations. Except
for certain functions such as sqrt
, fma
and rint
whose results are fully specified by reference to corresponding IEEE
754 floating-point operations, and conversions between strings and
floating point, the GNU C Library does not aim for correctly rounded results
for functions in the math library, and does not aim for correctness in
whether “inexact” exceptions are raised. Instead, the goals for
accuracy of functions without fully specified results are as follows;
some functions have bugs meaning they do not meet these goals in all
cases. In the future, the GNU C Library may provide some other correctly
rounding functions under the names such as crsin
proposed for
an extension to ISO C.
errno
may also be set (see Error Reporting by Mathematical Functions). (The “inexact”
exception may be raised, or not raised, even if this is inconsistent
with the infinite-precision value.)
long double
format, as used on PowerPC GNU/Linux,
the accuracy goal is weaker for input values not exactly representable
in 106 bits of precision; it is as if the input value is some value
within 0.5ulp of the value actually passed, where “ulp” is
interpreted in terms of a fixed-precision 106-bit mantissa, but not
necessarily the exact value actually passed with discontiguous
mantissa bits.
long double
format, functions whose results are
fully specified by reference to corresponding IEEE 754 floating-point
operations have the same accuracy goals as other functions, but with
the error bound being the same as that for division (3ulp).
Furthermore, “inexact” and “underflow” exceptions may be raised
for all functions for any inputs, even where such exceptions are
inconsistent with the returned value, since the underlying
floating-point arithmetic has that property.
Therefore many of the functions in the math library have errors. The
math testsuite only flags results larger than 9ulp (or 16 for IBM
long double
format) as errors; although most of the implementations
show errors smaller than the limit.
A more comprehensive analysis of the GNU C Library math functions precision could
be found in ’Accuracy of Mathematical Functions in Single, Double, Double
Extended, and Quadruple Precision’; Brian Gladman, Vincenzo Innocente,
John Mather, and Paul Zimmermann at
<https://inria.hal.science/hal-03141101>. It does not cover complex
functions, nor jn
/yn
, and it is only for x86_64, and for
rounding to nearest, and does not cover any architecture variations (in
particular IBM long double
is out of scope).
For complex functions, some analysis of the GNU C Library math functions can be
found in ’Accuracy of Complex Mathematical Operations and Functions in Single
and Double Precision’; Paul Caprioli, Vincenzo Innocente, Paul Zimmermann
at https://inria.hal.science/hal-04714173. It only covers float
and double
, only for x86_64, and for rounding to nearest, and does not
cover any architecture variations.