Initial- and boundary-value problems in Maxima.



On Mon, Sep 19, 2011 at 6:24 AM, Michel Talon <talon at lpthe.jussieu.fr> wrote:

>
> (%i3) :lisp(sb-profile:report)
> measuring PROFILE overhead..done
>

Here's what I get using cmucl.  The results are roughly the same as with
sbcl; I'd guess the remaining differences come from the compilers.  The top
few items are pretty similar.

    Consed |   Calls |  Secs | Sec/Call |  Bytes/C. | Name:
-----------------------------------------------------------------------
19,343,200 |      74 | 0.600 |  0.00811 |   261,395 | COLNEW::LSYSLV
 8,759,448 |   2,782 | 0.394 |  0.00014 |     3,149 | COLNEW::DGESL
13,525,352 |   1,584 | 0.367 |  0.00023 |     8,539 | COLNEW::VWBLOK
 3,223,656 |     396 | 0.199 |  0.00050 |     8,141 | COLNEW::DGEFA
 2,767,096 |  52,818 | 0.154 |  0.00000 |        52 | COLNEW::DAXPY
 7,418,416 |       1 | 0.090 |  0.09000 | 7,418,416 | COLNEW::CONTRL
 1,480,968 |   5,872 | 0.068 |  0.00001 |       252 | COLNEW::APPROX
 1,302,056 |   1,594 | 0.047 |  0.00003 |       817 | COLNEW::GBLOCK
   743,128 |      49 | 0.040 |  0.00081 |    15,166 | COLNEW::SBBLOK
   222,168 |       4 | 0.030 |  0.00750 |    55,542 | COLNEW::ERRCHK
   227,720 |   1,291 | 0.017 |  0.00001 |       176 | COLNEW::RKBAS
    67,536 |   2,772 | 0.014 |  0.00001 |        24 | COLNEW::IDAMAX
 1,331,240 |      10 | 0.010 |  0.00100 |   133,124 | COLNEW::NEWMSH
    13,984 |     396 | 0.009 |  0.00002 |        35 | COLNEW::FACTRB
    59,880 |   2,772 | 0.004 |  0.00000 |        22 | COLNEW::DSCAL
    39,968 |       1 | 0.000 |  0.00000 |    39,968 | COLNEW:COLNEW
   243,592 |      64 | 0.000 |  0.00000 |     3,806 | COLNEW::GDERIV
    25,312 |   1,198 | 0.000 |  0.00000 |        21 | COLNEW::SUBBAK
    39,776 |   1,198 | 0.000 |  0.00000 |        33 | COLNEW::SUBFOR
       128 |       4 | 0.000 |  0.00000 |        32 | COLNEW::VMONDE
     4,288 |       1 | 0.000 |  0.00000 |     4,288 | COLNEW::CONSTS
     8,848 |     380 | 0.000 |  0.00000 |        23 | COLNEW::SHIFTB
    33,728 |     162 | 0.000 |  0.00000 |       208 | COLNEW::HORDER
     1,344 |       4 | 0.000 |  0.00000 |       336 | COLNEW::SKALE
   211,704 |      16 | 0.000 |  0.00000 |    13,232 | COLNEW::FCBLOK
     4,600 |      49 | 0.000 |  0.00000 |        94 | COLNEW::DMZSOL
-------------------------------------------------------------------
61,099,136 |  75,492 | 2.045 |          |           | Total


> (%i1) load(colnew);
> ....
> (%i2) :lisp(sb-profile:profile fsub dfsub "COLNEW")
>
> (%i2) load(prob2);
> (%i3) :lisp(sb-profile:report)
> measuring PROFILE overhead..done
>
>  seconds  |     gc     |    consed   |  calls |  sec/call  |  name
> ----------------------------------------------------------
>     1.355 |      0.000 |  12,880,288 |  4,632 |   0.000292 | FSUB
>     0.704 |      0.000 |  11,850,192 |  1,584 |   0.000445 | DFSUB
>     0.580 |      0.066 |  28,463,896 |     74 |   0.007841 | COLNEW::LSYSLV
>     0.567 |      0.013 |  32,434,888 |  2,782 |   0.000204 | COLNEW::DGESL
>     0.225 |      0.005 |  11,057,520 |    396 |   0.000567 | COLNEW::DGEFA
>     0.147 |      0.039 |  11,592,584 |     49 |   0.002998 | COLNEW::SBBLOK
>     0.126 |      0.000 |           0 | 52,818 |   0.000002 | COLNEW::DAXPY
>

Corresponding results for cmucl:

     Consed |   Calls |  Secs | Sec/Call |  Bytes/C. | Name:
-----------------------------------------------------------------------
 8,995,512 |   2,782 | 0.464 |  0.00017 |     3,233 | COLNEW::DGESL
15,297,232 |   4,632 | 0.391 |  0.00008 |     3,303 | FSUB
12,451,000 |   1,584 | 0.257 |  0.00016 |     7,860 | DFSUB
 3,271,472 |     396 | 0.179 |  0.00045 |     8,261 | COLNEW::DGEFA
 3,527,200 |      74 | 0.150 |  0.00203 |    47,665 | COLNEW::LSYSLV
 2,074,056 |   5,872 | 0.128 |  0.00002 |       353 | COLNEW::APPROX
 7,418,448 |       1 | 0.100 |  0.10000 | 7,418,448 | COLNEW::CONTRL
 1,310,232 |   1,594 | 0.067 |  0.00004 |       822 | COLNEW::GBLOCK
 1,076,160 |   1,584 | 0.047 |  0.00003 |       679 | COLNEW::VWBLOK
...
-------------------------------------------------------------------
60,178,656 |  81,952 | 1.903 |          |           | Total

I think this basically just confirms that cmucl compiles things differently
from sbcl.  But it's clear that fsub and dfsub take a large share of the
time: roughly 34% of the total in the cmucl run.  dgesl and lsyslv take
another 32%.
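
For anyone who wants to check those percentages, here is a small Python
sketch that recomputes them from the cmucl report above.  The timings are
copied from that table; the helper name share() is just something I made up
for illustration.

```python
# Seconds per function, copied from the cmucl report above.
secs = {
    "DGESL": 0.464,
    "FSUB": 0.391,
    "DFSUB": 0.257,
    "DGEFA": 0.179,
    "LSYSLV": 0.150,
}
total = 1.903  # "Total" row of the same cmucl report


def share(*names):
    """Fraction of the total profiled time spent in the named functions."""
    return sum(secs[n] for n in names) / total


user_share = share("FSUB", "DFSUB")        # user-supplied callbacks
solver_share = share("DGESL", "LSYSLV")    # linear-system solvers

print(f"fsub + dfsub:   {user_share:.0%}")
print(f"dgesl + lsyslv: {solver_share:.0%}")
```

The same calculation on the sbcl numbers isn't possible from the quoted
report alone, since it shows only the top few rows and no total.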

Making lsyslv faster would be a nice improvement.

I also wanted to mention  that colnew makes use of a lot of common blocks.
In the current conversion, the common blocks are represented as one giant
array instead of a structure of smaller arrays/scalars.   To access a
particular part of a common block, an array slice is taken.  I don't know
how much additional time this takes, but it could be reduced if the parts of
the common blocks were kept as separate arrays.  The naming used in colnew
prevents f2cl from doing this.  (In some parts of colnew we have a common
block with an array named kdum, but in other parts it's named k, and that
difference confuses f2cl.)
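
To make the two layouts concrete, here is a rough sketch in Python (not the
Lisp that f2cl actually generates).  The member names, sizes, and offsets
are hypothetical, chosen only for illustration, and the per-access cost in
the real Lisp code may well differ from what Python does here.

```python
from dataclasses import dataclass, field

# Layout 1: the whole common block is one flat array.  Each member lives at
# a fixed offset, and every access takes a slice of the big array.
# (Hypothetical members: name -> (start, length).)
OFFSETS = {"k": (0, 1), "ncomp": (1, 1), "m": (2, 4)}


def get_member(block, name):
    """Return the slice of the flat block that holds the named member."""
    start, length = OFFSETS[name]
    return block[start:start + length]  # a new slice on every access


flat_block = [4, 2, 1, 1, 2, 2]


# Layout 2: a structure with one field per member.  Access is a plain field
# lookup, with no slicing step.
@dataclass
class Block:
    k: int = 0
    ncomp: int = 0
    m: list = field(default_factory=lambda: [0] * 4)


struct_block = Block(k=4, ncomp=2, m=[1, 1, 2, 2])

# Both layouts hold the same data; only the access path differs.
print(get_member(flat_block, "m"), struct_block.m)
```

The second layout only works if every routine agrees on the member names
and shapes, which is exactly where the kdum/k mismatch gets in the way.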

Ray