Long-float variant giving more precision to numerical computations.
Subject: Long-float variant giving more precision to numerical computations.
From: Douglas Crosher
Date: Mon, 11 Feb 2008 02:55:15 +1100
R Fateman wrote:
> Adding extra precision algorithms (good for 80 bits) when the hardware
> only does 64 bits might slow down the programs for those less-precise
> systems. (For example, 80-bit sin/cos/log/exp/bessel/...). So maybe
> the libraries should be kept separate.
The CL implementation should provide accurate implementations of many
of these functions, such as sin,cos,log,exp etc. For the others it may
be possible for compile-time macros to emit the required number of terms.
Quad could be another good option, more precision than 80-bits, although
it has less exponent range and is much slower.
CLISP long-floats can have a very high precision and a large exponent range,
and the precision can be set by the user, so this is an easy path if people
need better precision now. CLISP users may even probably prefer this over
a quad implementation.
Many of the Fortran libraries yield a higher precision when using a higher
precision representation for the floating point numbers, but some modifications
have been needed. The LAPACK library has been tested, SVD etc, and it is
working well. However the modifications to the translated CL files now
need to be maintained.
Here are some examples of the precision:
1. find_root
f(x) := sin(x) - x/2;
r: find_root (f(x), x, 0.1, %pi);
fpprintprec: 16;
fpprec: 1000;
f(bfloat(r));
CMUCL 64 bit double-float: 6.154491740245677b-18
SCL 80 bit long-float: - 6.141025968424459b-20
CLISP 1024 bit long-float: - 4.193340718207969b-309
2. LAPACK Eigen values.
load(lapack);
fpprintprec: 16;
fpprec: 1000;
rows: 100;
cols: 100;
m[i, j] := i * cols + j;
M: genmatrix (m, rows, cols)$
[L, v, u] : dgeev (M, true, true)$
D : apply (diag_matrix, L)$
E: abs(bfloat(M) . bfloat(v) - bfloat(v) . bfloat(D))$
block([sum2],
sum2: 0,
for i: 1 thru rows do
for j: 1 thru cols do
sum2: sum2 + E[i,j]^2,
sqrt(sum2));
SCL 64 bit double-float: 9.457340356041502b-10
SCL 80 bit long-float: 5.319785096627637b-13
CLISP 1024 bit long-float: 8.220676310896712b-302
Regards
Douglas Crosher