Dear Ray,
I slept over your arguments, but I still think they are not right. The problem I see is that floating point numbers can have very large exponents, e.g. more than 1000 binary digits for IEEE double precision. So worst case you need to do a range reduction with a precision of about 1000 bits for a floating point format with 53 bits mantissa. The effort for doing the range reduction, essentially a modulus operation, is quadratic (well there are asymptotically faster methods, but I think with SIMD and 1000 bits, the quadratic method is still best), so we have worst case ~400 times the effort, if we assume that numbers are precise. The question is, if this effort is worthwhile. I think no, because I think it is very unlikely, that the result of some computation has a higher precision than the mantissa. The use case of literal numbers is a bit artificial and not really what floating point numbers are made for. One argument for this is that it would be easy to detect if results are exact and add a corresponding flag to floating point numbers. But this is typically not done, so one can assume that processing of exact numbers is not the intention.
So I think, if you need high precision, you should use a high precision floating point implementation.
Best regards,
Michael
From: Raymond Toy [mailto:toy.raymond at gmail.com]
Sent: Thursday, May 10, 2012 6:04 PM
To: Soegtrop, Michael
Cc: Barton Willis; maxima at math.utexas.edu
Subject: Re: cos(1.0d97) with GCL, Clozure, and Julia
On Thu, May 10, 2012 at 8:26 AM, Soegtrop, Michael <michael.soegtrop at intel.com<mailto:michael.soegtrop at intel.com>> wrote:
Dear Barton,
this isn't a problem of the fsin/fcos implementation. For calculating a sine function you need a certain absolute precision. E.g. if you want to have 2 decimals output precision, your number must have an absolute precision of 2 decimals. Floating point numbers have constant relative precision. So the larger the number gets, the worse the absolute precision becomes. If the absolute precision is worse than pi, the result is simply meaningless. Imagine you have the number 1000000000000 with error +-1000. The relative error is quite small (1ppb), but the absolute error is more than 100*pi. The result is meaningless.
I disagree with this analysis. If I want (cos (scale-float 1d0 120)) (from one of Kahan's examples), cos should return the correct value. Sure you can argue that 2^120 (as a float) is some kind of approximation, but, as I've constructed, I want exactly 2^120, which happens to be exactly representable in a double-float. I want cos to return the value as accurately as possible given the input.
Any supposed "error" in a floating-point number is up to the user to decide, The cos function should accept it as an exact number and produce a value based on that.
Ray