Initial- and boundary-value problems in Maxima.
- From: Raymond Toy
- Date: Mon, 19 Sep 2011 20:32:16 -0700
On Mon, Sep 19, 2011 at 6:24 AM, Michel Talon <talon at lpthe.jussieu.fr> wrote:
>
> (%i3) :lisp(sb-profile:report)
> measuring PROFILE overhead..done
>
Here's what I get using cmucl. The results are roughly the same as yours
with sbcl. I guess the remaining differences come from the compilers. The
top few items are pretty similar.
Consed | Calls | Secs | Sec/Call | Bytes/C. | Name:
-----------------------------------------------------------------------
19,343,200 | 74 | 0.600 | 0.00811 | 261,395 | COLNEW::LSYSLV
8,759,448 | 2,782 | 0.394 | 0.00014 | 3,149 | COLNEW::DGESL
13,525,352 | 1,584 | 0.367 | 0.00023 | 8,539 | COLNEW::VWBLOK
3,223,656 | 396 | 0.199 | 0.00050 | 8,141 | COLNEW::DGEFA
2,767,096 | 52,818 | 0.154 | 0.00000 | 52 | COLNEW::DAXPY
7,418,416 | 1 | 0.090 | 0.09000 | 7,418,416 | COLNEW::CONTRL
1,480,968 | 5,872 | 0.068 | 0.00001 | 252 | COLNEW::APPROX
1,302,056 | 1,594 | 0.047 | 0.00003 | 817 | COLNEW::GBLOCK
743,128 | 49 | 0.040 | 0.00081 | 15,166 | COLNEW::SBBLOK
222,168 | 4 | 0.030 | 0.00750 | 55,542 | COLNEW::ERRCHK
227,720 | 1,291 | 0.017 | 0.00001 | 176 | COLNEW::RKBAS
67,536 | 2,772 | 0.014 | 0.00001 | 24 | COLNEW::IDAMAX
1,331,240 | 10 | 0.010 | 0.00100 | 133,124 | COLNEW::NEWMSH
13,984 | 396 | 0.009 | 0.00002 | 35 | COLNEW::FACTRB
59,880 | 2,772 | 0.004 | 0.00000 | 22 | COLNEW::DSCAL
39,968 | 1 | 0.000 | 0.00000 | 39,968 | COLNEW:COLNEW
243,592 | 64 | 0.000 | 0.00000 | 3,806 | COLNEW::GDERIV
25,312 | 1,198 | 0.000 | 0.00000 | 21 | COLNEW::SUBBAK
39,776 | 1,198 | 0.000 | 0.00000 | 33 | COLNEW::SUBFOR
128 | 4 | 0.000 | 0.00000 | 32 | COLNEW::VMONDE
4,288 | 1 | 0.000 | 0.00000 | 4,288 | COLNEW::CONSTS
8,848 | 380 | 0.000 | 0.00000 | 23 | COLNEW::SHIFTB
33,728 | 162 | 0.000 | 0.00000 | 208 | COLNEW::HORDER
1,344 | 4 | 0.000 | 0.00000 | 336 | COLNEW::SKALE
211,704 | 16 | 0.000 | 0.00000 | 13,232 | COLNEW::FCBLOK
4,600 | 49 | 0.000 | 0.00000 | 94 | COLNEW::DMZSOL
-------------------------------------------------------------------
61,099,136 | 75,492 | 2.045 | | | Total
> (%i1) load(colnew);
> ....
> (%i2) :lisp(sb-profile:profile fsub dfsub "COLNEW")
>
> (%i2) load(prob2);
> (%i3) :lisp(sb-profile:report)
> measuring PROFILE overhead..done
>
> seconds | gc | consed | calls | sec/call | name
> ----------------------------------------------------------
> 1.355 | 0.000 | 12,880,288 | 4,632 | 0.000292 | FSUB
> 0.704 | 0.000 | 11,850,192 | 1,584 | 0.000445 | DFSUB
> 0.580 | 0.066 | 28,463,896 | 74 | 0.007841 | COLNEW::LSYSLV
> 0.567 | 0.013 | 32,434,888 | 2,782 | 0.000204 | COLNEW::DGESL
> 0.225 | 0.005 | 11,057,520 | 396 | 0.000567 | COLNEW::DGEFA
> 0.147 | 0.039 | 11,592,584 | 49 | 0.002998 | COLNEW::SBBLOK
> 0.126 | 0.000 | 0 | 52,818 | 0.000002 | COLNEW::DAXPY
>
Corresponding results for cmucl:
Consed | Calls | Secs | Sec/Call | Bytes/C. | Name:
-----------------------------------------------------------------------
8,995,512 | 2,782 | 0.464 | 0.00017 | 3,233 | COLNEW::DGESL
15,297,232 | 4,632 | 0.391 | 0.00008 | 3,303 | FSUB
12,451,000 | 1,584 | 0.257 | 0.00016 | 7,860 | DFSUB
3,271,472 | 396 | 0.179 | 0.00045 | 8,261 | COLNEW::DGEFA
3,527,200 | 74 | 0.150 | 0.00203 | 47,665 | COLNEW::LSYSLV
2,074,056 | 5,872 | 0.128 | 0.00002 | 353 | COLNEW::APPROX
7,418,448 | 1 | 0.100 | 0.10000 | 7,418,448 | COLNEW::CONTRL
1,310,232 | 1,594 | 0.067 | 0.00004 | 822 | COLNEW::GBLOCK
1,076,160 | 1,584 | 0.047 | 0.00003 | 679 | COLNEW::VWBLOK
...
-------------------------------------------------------------------
60,178,656 | 81,952 | 1.903 | | | Total
I think this basically just confirms that cmucl compiles things differently
from sbcl. But it's clear that fsub and dfsub take a large share of the time:
roughly 34% of the total in the cmucl run. dgesl and lsyslv take another 32%.
Making lsyslv faster would be a nice improvement.
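For anyone who wants to check the arithmetic, here is a small Python sketch
that recomputes those shares. The numbers are copied from the cmucl report
above; the helper name share is only for illustration.

```python
# Seconds per routine, copied from the cmucl profile above
# (the run where FSUB and DFSUB are profiled as well).
secs = {
    "DGESL": 0.464,
    "FSUB": 0.391,
    "DFSUB": 0.257,
    "DGEFA": 0.179,
    "LSYSLV": 0.150,
}
total = 1.903  # "Total" row of the same report


def share(names):
    """Percentage of the total profiled time spent in the named routines."""
    return 100.0 * sum(secs[n] for n in names) / total


user_share = share(["FSUB", "DFSUB"])      # user-supplied callbacks
solver_share = share(["DGESL", "LSYSLV"])  # linear-system solving

print(f"fsub + dfsub:   {user_share:.1f}%")
print(f"dgesl + lsyslv: {solver_share:.1f}%")
```

The same calculation on the sbcl numbers would need the sbcl total, which is
not shown in the quoted report.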
I also wanted to mention that colnew makes use of a lot of common blocks.
In the current conversion, each common block is represented as one giant
array instead of a structure of smaller arrays and scalars. To access a
particular part of a common block, an array slice is taken. I don't know
how much additional time this takes, but it could be reduced if the parts of
the common blocks were kept as separate arrays. The naming used in colnew
prevents f2cl from doing this: in some parts of colnew a common block has an
array named kdum, but in other parts the same array is named k, and that
difference confuses f2cl.
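To make the two layouts concrete, here is a rough Python analogy. It is not
the Lisp that f2cl generates; the offsets, the field name coef, and the
helper member are made up for illustration and are not colnew's real layout.

```python
from array import array
from dataclasses import dataclass, field

# Hypothetical sizes and offsets, NOT colnew's actual common-block layout.
K_OFFSET, K_LEN = 0, 4
COEF_OFFSET, COEF_LEN = 4, 3

# Layout A: the whole common block is one flat array. A member is reached
# by taking a slice (a view, not a copy) at a fixed offset on every access.
flat_block = array("d", [0.0] * (K_LEN + COEF_LEN))


def member(block, offset, length):
    """Return a writable view on one member of the flat block."""
    return memoryview(block)[offset:offset + length]


# Layout B: the block is a structure whose members are separate arrays,
# so each member is available directly with no slicing step.
@dataclass
class Block:
    k: array = field(default_factory=lambda: array("d", [0.0] * K_LEN))
    coef: array = field(default_factory=lambda: array("d", [0.0] * COEF_LEN))


coef_view = member(flat_block, COEF_OFFSET, COEF_LEN)
coef_view[1] = 2.5          # writes through to flat_block[5]

struct_block = Block()
struct_block.coef[1] = 2.5  # direct access to the member array
```

In layout A every routine has to build the view before it can touch a member,
which is the extra step described above. Layout B only works if all routines
agree on what the members of the block are, and that is where the kdum versus
k naming gets in the way.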
Ray