speed testing: Allegro vs GCL



A better test for Maxima speed, if you are
concerned with the speed of polynomial and
rational function manipulation, may be this:

(showtime:all, rat( (a+b+c+d+1)^35), 1)$

note that instead of doing ratsimp, we are
doing rat.

 What ratsimp does is take the result of
the tightly encoded polynomial result of rat and
conses up an algebraic tree, in macsyma's
"general form".
This does a sorting / ordering kind of
operation which is rather expensive. 
In particular, I just tried using MAXIMA source on
Allegro CL on a 933 Mhz machine. rat alone takes
about 0.94 seconds.

(the commercial macsyma takes 1.2 sec + another 7.5 in 
garbage collection).

Doing the RATSIMP takes 4.65 seconds.

The Allegro profiler (is there one for GCL?)

gives this information, which shows that the largest
single user of computer time is the
8% of the time in the RATSIMP taken within the
function MEMQ.

See below for more info:


RATSIMP Profiler

Sample represents 3.9 seconds of processor time (out of a total of 3.9)

Times below 1.0% will be suppressed.

  %     %     self  total            self   total  Function
 Time  Cum.   secs   secs    calls ms/call ms/call   name
  8.1   8.1    0.3    0.3                          MEMQ
  5.2  13.2    0.2    0.3                          "kf_bignum_multiply"
  3.6  16.8    0.1    2.5                          ... SIMPLIFYA
  3.6  20.4    0.1    0.2                          ... ALIKE1
  3.2  23.6    0.1    2.4                          ... SIMPTIMES
  3.1  26.7    0.1    0.2                          EQTEST
  3.0  29.6    0.1    0.7                          ... PCTIMES1
  2.8  32.5    0.1    0.1                          EXCL::EQUAL-NOT-EQ
  2.7  35.1    0.1    0.2                          PZEROP
  2.6  37.7    0.1    0.1                          EXCL::*_2OP
  2.5  40.2    0.1    0.1                          EXCL::=_2OP
  2.4  42.7    0.1    0.3                          ... SIMPEXPT
  2.2  44.8    0.1    0.1                          "kf_prunebig"
  2.1  47.0    0.1    1.3                          ... PTIMES
  2.1  49.1    0.1    0.2                          MNUMP
  2.0  51.1    0.1    0.1                          EXCL::GET_2OP
  2.0  53.0    0.1    0.5                          ... TMS
  1.9  54.9    0.1    0.1                          "qcons"
  1.8  56.7    0.1    0.7                          ... PLUSIN
  1.8  58.5    0.1    0.1                          "kf_restify2"
  1.8  60.3    0.1    0.4                          ... GREAT
  1.7  62.0    0.1    0.3                          ... TIMESIN
  1.7  63.7    0.1    0.1                          ZEROP1
  1.6  65.3    0.1    0.1                          "qcar"
  1.6  66.9    0.1    0.1                          ONEP1
  1.4  68.3    0.1    0.2                          ... PPLUS1
  1.3  69.6    0.1    2.5                          ... SIMPLUS
  1.3  70.9    0.1    0.9                          ... PCTIMES
  1.2  72.2    0.0    0.3                          ... ORDFN
  1.2  73.4    0.0    0.0                          "qcdr"
  1.1  74.5    0.0    0.1                          ALIKE
  1.1  75.6    0.0    0.5                          CTIMES
  1.0  76.6    0.0    0.1                          "big_add_int"

..............
If we do the RAT()  problem a few times and look at the profile:


RAT Profiler:

Sample represents 2.0 seconds of processor time (out of a total of 2.0)

Times below 1.0% will be suppressed.

  %     %     self  total            self   total  Function
 Time  Cum.   secs   secs    calls ms/call ms/call   name
 16.9  16.9    0.3    0.5                          "kf_bignum_multiply"
  9.4  26.2    0.2    1.1                          ... PCTIMES1
  8.3  34.6    0.2    0.2                          PZEROP
  7.1  41.6    0.1    0.1                          EXCL::*_2OP
  6.0  47.7    0.1    2.0                          ... PTIMES
  5.7  53.3    0.1    0.1                          "kf_prunebig"
  5.1  58.5    0.1    0.1                          EXCL::=_2OP
  3.5  62.0    0.1    1.3                          ... PCTIMES
  3.3  65.3    0.1    0.8                          CTIMES
  3.2  68.6    0.1    0.1                          "big_add_int"
  3.1  71.7    0.1    0.4                          ... PPLUS1
  3.0  74.7    0.1    0.1                          "kf_new_other"
  2.1  76.8    0.0    0.0                          EXCL::+_2OP
  2.1  78.9    0.0    0.3                          PCPLUS
  2.0  81.0    0.0    0.1                          "kf_newbignum"
  1.8  82.8    0.0    0.0                          "fixnum_in_big"
  1.8  84.6    0.0    1.4                          PTIMES1
  1.6  86.3    0.0    0.0                          "kf_restify2"
  1.5  87.8    0.0    0.5                          "integer_multiply"
  1.5  89.3    0.0    0.1                          PCOEFADD
  1.5  90.8    0.0    0.5                          PCETIMES1
  1.4  92.3    0.0    0.4                          PPLUS
  1.4  93.7    0.0    0.0                          "qcons"
  1.2  94.9    0.0    0.2                          CPLUS
  1.0  96.0    0.0    0.0                          "replace_temp_bignum"
[1] CL-MACSYMA(22): 

..............
So a good place to try to improve this code is
either in making kf_bignum_multiply  faster, [or to
eliminate calls to it].  Another place is pzerop.
This is defined as
(defun pzerop (x) (and (numberp x) (zerop x)))
which suggests it could 
 (a) be done in-line  and/or
 (b) done faster e.g.  (defun pzerop(x)(equalp x 0))

It is plausible that PCTIMES and PTIMES are
taking a lot of time:  PCTIMES multiplies
a polynomial by a coefficient (which could
be a polynomial in other variables), and
PTIMES.  CTIMES and CPLUS are in there, but
not as big contributors.