Long-float variant giving more precision to numerical computations.

Subject: Long-float variant giving more precision to numerical computations.
From: Douglas Crosher
Date: Mon, 11 Feb 2008 02:55:15 +1100

R Fateman wrote:
> Adding extra precision algorithms (good for 80 bits) when the hardware 
> only does 64 bits might slow down the programs for those less-precise 
> systems.  (For example, 80-bit sin/cos/log/exp/bessel/...).  So maybe 
> the libraries should be kept separate.

The CL implementation should provide accurate implementations of many
of these functions, such as sin,cos,log,exp etc.  For the others it may
be possible for compile-time macros to emit the required number of terms.

Quad could be another good option, more precision than 80-bits, although
it has less exponent range and is much slower.

CLISP long-floats can have a very high precision and a large exponent range,
and the precision can be set by the user, so this is an easy path if people
need better precision now.  CLISP users may even probably prefer this over
a quad implementation.

Many of the Fortran libraries yield a higher precision when using a higher
precision representation for the floating point numbers, but some modifications
have been needed.  The LAPACK library has been tested, SVD etc, and it is
working well.  However the modifications to the translated CL files now
need to be maintained.

Here are some examples of the precision:

1. find_root

f(x) := sin(x) - x/2;
r: find_root (f(x), x, 0.1, %pi);
fpprintprec: 16;
fpprec: 1000;
f(bfloat(r));

CMUCL   64 bit double-float:    6.154491740245677b-18
SCL     80 bit   long-float:  - 6.141025968424459b-20
CLISP 1024 bit   long-float:  - 4.193340718207969b-309

2. LAPACK Eigen values.

load(lapack);
fpprintprec: 16;
fpprec: 1000;
rows: 100;
cols: 100;
m[i, j] := i * cols + j;
M: genmatrix (m, rows, cols)$
[L, v, u] : dgeev (M, true, true)$
D : apply (diag_matrix, L)$
E: abs(bfloat(M) . bfloat(v) - bfloat(v) . bfloat(D))$

block([sum2],
       sum2: 0,
       for i: 1 thru rows do
          for j: 1 thru cols do
             sum2: sum2 + E[i,j]^2,
       sqrt(sum2));

SCL     64 bit double-float:  9.457340356041502b-10
SCL     80 bit   long-float:  5.319785096627637b-13
CLISP 1024 bit   long-float:  8.220676310896712b-302

Regards
Douglas Crosher

Prev by Date: Long-float variant giving more precision to numerical computations.
Next by Date: autoloading of linearalgebra
Previous by thread: Long-float variant giving more precision to numerical computations.
Next by thread: Long-float variant giving more precision to numerical computations.
Index(es):
- Date
- Thread