Initial- and boundary-value problems in Maxima.
- Subject: Initial- and boundary-value problems in Maxima.
- From: Michel Talon
- Date: Mon, 19 Sep 2011 15:24:49 +0200
Raymond Toy wrote:
> On Fri, Sep 16, 2011 at 9:01 AM, Raymond Toy <toy.raymond at gmail.com>
> wrote:
>
>>
>> Before we do anything else, let's agree on what exactly is being tested.
>> The profile that I reported was using the current maxima sources using
>> cmucl. I just did load(colnew), then enabled profiling for the MAXIMA
>> and
>> COLNEW packages and then load("share/colnew/prob4.mac"). What steps did
>> you use?
>>
>> Also, perhaps prob4 is more complex that we really need. I think
>> prob1.mac is a simpler test.; let's use that for comparison.
>>
>> prob1 doesn't take very long to run. prob2 takes longer, so let's use
> that.
>
> Ray
OK. Here is the first step, i am profiling prob2 with the latest maxima
compiled with sbcl, since i have it at hand.
I will run maxima compiled with cmucl later on.
First the deterministic profiler.
Here is the result:
niobe% rlwrap bin/maxima
Maxima 5.25.1 http://maxima.sourceforge.net
using Lisp SBCL 1.0.43
(%i1) load(colnew);
...
(%i2) :lisp(sb-profile:profile "COLNEW")
(%i2) load(prob2);
......
(%i3) :lisp(sb-profile:report)
measuring PROFILE overhead..done
seconds | gc | consed | calls | sec/call | name
----------------------------------------------------------
1.732 | 0.012 | 41,630,560 | 74 | 0.023403 | COLNEW::LSYSLV
0.668 | 0.031 | 12,610,568 | 1,584 | 0.000422 | COLNEW::VWBLOK
0.590 | 0.015 | 32,386,360 | 2,782 | 0.000212 | COLNEW::DGESL
0.224 | 0.029 | 11,143,912 | 396 | 0.000566 | COLNEW::DGEFA
0.171 | 0.041 | 11,650,088 | 49 | 0.003490 | COLNEW::SBBLOK
0.158 | 0.000 | 0 | 52,818 | 0.000003 | COLNEW::DAXPY
0.090 | 0.000 | 644,576 | 5,893 | 0.000015 | COLNEW::APPROX
0.033 | 0.000 | 2,068,232 | 16 | 0.002046 | COLNEW::FCBLOK
0.029 | 0.000 | 0 | 2,772 | 0.000010 | COLNEW::DSCAL
0.026 | 0.000 | 1,799,408 | 10 | 0.002552 | COLNEW::NEWMSH
0.018 | 0.000 | 89,360 | 1,312 | 0.000014 | COLNEW::RKBAS
0.016 | 0.000 | 806,416 | 4 | 0.003921 | COLNEW::ERRCHK
0.012 | 0.000 | 0 | 1,198 | 0.000010 | COLNEW::SUBFOR
0.011 | 0.000 | 0 | 2,772 | 0.000004 | COLNEW::IDAMAX
0.010 | 0.000 | 150,824 | 64 | 0.000153 | COLNEW::GDERIV
0.008 | 0.000 | 646,864 | 396 | 0.000020 | COLNEW::FACTRB
0.004 | 0.000 | 0 | 49 | 0.000079 | COLNEW::DMZSOL
0.003 | 0.000 | 0 | 1,198 | 0.000002 | COLNEW::SUBBAK
0.002 | 0.000 | 28,648 | 162 | 0.000009 | COLNEW::HORDER
0.001 | 0.000 | 0 | 380 | 0.000002 | COLNEW::SHIFTB
0.000 | 0.000 | 6,792,408 | 1 | 0.000000 | COLNEW::CONTRL
0.000 | 0.000 | 966,848 | 1,594 | 0.000000 | COLNEW::GBLOCK
0.000 | 0.000 | 0 | 4 | 0.000000 | COLNEW::VMONDE
0.000 | 0.000 | 0 | 1 | 0.000000 | COLNEW::CONSTS
0.000 | 0.000 | 4,096 | 4 | 0.000000 | COLNEW::SKALE
0.000 | 0.000 | 0 | 21 | 0.000000 | COLNEW:APPSLN
0.000 | 0.000 | 28,376 | 1 | 0.000000 | COLNEW:COLNEW
----------------------------------------------------------
3.803 | 0.128 | 123,447,544 | 75,555 | | Total
estimated total profiling overhead: 0.43 seconds
Since fsub, dfsub are called inside LSYSL while all the other are pure lisp,
we see that half of the time is in colnew, almost half in fsub dfsub, as in
the prob3 case.
One also sees that the linpack routines to solve linear systems, dgesl dgefa
daxpy use 1s, that is half the time spent in colnew proper.
Then i have changed the colnew-if.lisp in 5.25.1 to the file where fsub
dfsub etc are real functions and not flets to be able to profile them.
(%i1) load(colnew);
....
(%i2) :lisp(sb-profile:profile fsub dfsub "COLNEW")
(%i2) load(prob2);
(%i3) :lisp(sb-profile:report)
measuring PROFILE overhead..done
seconds | gc | consed | calls | sec/call | name
----------------------------------------------------------
1.355 | 0.000 | 12,880,288 | 4,632 | 0.000292 | FSUB
0.704 | 0.000 | 11,850,192 | 1,584 | 0.000445 | DFSUB
0.580 | 0.066 | 28,463,896 | 74 | 0.007841 | COLNEW::LSYSLV
0.567 | 0.013 | 32,434,888 | 2,782 | 0.000204 | COLNEW::DGESL
0.225 | 0.005 | 11,057,520 | 396 | 0.000567 | COLNEW::DGEFA
0.147 | 0.039 | 11,592,584 | 49 | 0.002998 | COLNEW::SBBLOK
0.126 | 0.000 | 0 | 52,818 | 0.000002 | COLNEW::DAXPY
0.069 | 0.000 | 950,472 | 5,893 | 0.000012 | COLNEW::APPROX
0.030 | 0.000 | 1,817,520 | 10 | 0.002951 | COLNEW::NEWMSH
0.021 | 0.000 | 0 | 1,198 | 0.000017 | COLNEW::SUBBAK
0.019 | 0.000 | 154,736 | 64 | 0.000294 | COLNEW::GDERIV
0.019 | 0.000 | 2,133,696 | 16 | 0.001169 | COLNEW::FCBLOK
0.017 | 0.000 | 137,872 | 1,312 | 0.000013 | COLNEW::RKBAS
0.014 | 0.000 | 805,872 | 4 | 0.003417 | COLNEW::ERRCHK
0.008 | 0.000 | 568,752 | 396 | 0.000020 | COLNEW::FACTRB
0.006 | 0.000 | 0 | 2,772 | 0.000002 | COLNEW::DSCAL
0.004 | 0.000 | 0 | 2,772 | 0.000001 | COLNEW::IDAMAX
0.003 | 0.000 | 0 | 49 | 0.000058 | COLNEW::DMZSOL
0.003 | 0.000 | 0 | 21 | 0.000134 | COLNEW:APPSLN
0.002 | 0.000 | 0 | 1,198 | 0.000001 | COLNEW::SUBFOR
0.002 | 0.000 | 32,720 | 162 | 0.000009 | COLNEW::HORDER
0.000 | 0.000 | 0 | 380 | 0.000000 | COLNEW::SHIFTB
0.000 | 0.000 | 6,799,280 | 1 | 0.000000 | COLNEW::CONTRL
0.000 | 0.000 | 846,272 | 1,594 | 0.000000 | COLNEW::GBLOCK
0.000 | 0.000 | 0 | 4 | 0.000000 | COLNEW::VMONDE
0.000 | 0.000 | 914,696 | 1,584 | 0.000000 | COLNEW::VWBLOK
0.000 | 0.000 | 0 | 1 | 0.000000 | COLNEW::CONSTS
0.000 | 0.000 | 0 | 4 | 0.000000 | COLNEW::SKALE
0.000 | 0.000 | 28,376 | 1 | 0.000000 | COLNEW:COLNEW
----------------------------------------------------------
3.919 | 0.123 | 123,469,632 | 81,771 | | Total
estimated total profiling overhead: 0.47 seconds
Which confirmes the relative time spent in fsub dfsub
The colnew-if.lisp i have used here is the one beginning by:
(in-package :maxima)
(defvar *f*)
(defvar *df*)
(defvar *g*)
(defvar *dg*)
(defvar *nz*)
(defvar *ncomp*)
(defun fsub (x z f-array)
(declare (type (cl:array double-float (*)) z f-array)
(double-float x))
(let ((res (apply *f* x (coerce (subseq z 0 *nz*) 'list))))
(loop for k from 0
for ff in (cdr res)
do
(setf (aref f-array k) ($float ff))))
(values nil nil f-array))
....
(defun $colnew_expert ( ....
(*f* (coerce-float-fun f))
(*df* (coerce-float-fun df))
(*g* (coerce-float-fun g))
(*dg* (coerce-float-fun dg))
(*nz* (reduce #'+ (cdr m))))
...
I have also run the sbcl statistical profiler, this way:
writefile("stat_profile");
load(colnew);
:lisp(cl:require :sb-sprof "/usr/local/lib/sbcl/sb-sprof/sb-sprof.fasl")
:lisp(sb-sprof:profile-call-counts "COLNEW")
:lisp(sb-sprof:with-profiling (:max-samples 10000 :report :flat :loop nil
)#load(prob2)$)
I get things where sbcl and maxima internals get the predominant number of
samples:
Number of samples: 1120
Sample interval: 0.01 seconds
Total sampling time: 11.2 seconds
Number of cycles: 0
Sampled threads:
#<SB-THREAD:THREAD "initial thread" RUNNING {5B3F2EA9}>
Self Total Cumul
Nr Count % Count % Count % Calls Function
------------------------------------------------------------------------
1 54 4.8 54 4.8 54 4.8 - "foreign function
sigprocmask"
2 42 3.8 42 3.8 96 8.6 - SB-EXT:WEAK-POINTER-
VALUE
3 41 3.7 41 3.7 137 12.2 - "foreign function
stat"
4 37 3.3 37 3.3 174 15.5 - "foreign function
_lstat"
5 36 3.2 36 3.2 210 18.8 - SB-C::COMPACT-INFO-
LOOKUP
6 27 2.4 76 6.8 237 21.2 - REMOVE-IF
7 24 2.1 32 2.9 261 23.3 - SB-C::FIND-CONSTRAINT
8 20 1.8 96 8.6 281 25.1 - SIMPLIFYA
9 20 1.8 20 1.8 301 26.9 - SB-VM::ALLOC-SIGNED-
BIGNUM-IN-EAX
10 15 1.3 1112 99.3 316 28.2 - MEVAL1
11 15 1.3 67 6.0 331 29.6 - (FLET SB-C::LOOKUP)
12 13 1.2 13 1.2 344 30.7 - SB-IMPL::GET3
13 12 1.1 12 1.1 356 31.8 - KEYWORDP
14 11 1.0 17 1.5 367 32.8 - SB-VM::GENERIC-+
15 11 1.0 12 1.1 378 33.8 - (LABELS SB-
IMPL::EQUAL-AUX)
16 10 0.9 16 1.4 388 34.6 - ALIKE1
17 10 0.9 10 0.9 398 35.5 - SB-KERNEL:%MEMBER-EQ
18 10 0.9 10 0.9 408 36.4 - (FLET #:CLEANUP-
FUN-[MAKE-CORE-COMPONENT]37)
19 9 0.8 15 1.3 417 37.2 - SB-KERNEL:VALUES-
SPECIFIER-TYPE
20 9 0.8 11 1.0 426 38.0 5893 COLNEW::APPROX
21 8 0.7 13 1.2 434 38.7 - LENGTH
22 8 0.7 8 0.7 442 39.5 - SB-C::VOLATILE-INFO-
LOOKUP
23 8 0.7 8 0.7 450 40.2 - (LABELS SB-
IMPL::SXHASH-RECURSE)
24 7 0.6 37 3.3 457 40.8 - SB-KERNEL:CSUBTYPEP
25 7 0.6 13 1.2 464 41.4 - GETL
26 7 0.6 9 0.8 471 42.1 - SB-KERNEL::%%TYPEP
27 7 0.6 7 0.6 478 42.7 - SB-KERNEL:SEQUENCEP
28 6 0.5 107 9.6 484 43.2 - MEVALARGS
29 6 0.5 40 3.6 490 43.7 - SB-INT:ABOUT-TO-
MODIFY-SYMBOL-VALUE
30 6 0.5 29 2.6 496 44.3 - SB-C::CONSTRAIN-REF-
TYPE
31 6 0.5 22 2.0 502 44.8 - SB-KERNEL:SPECIFIER-
TYPE
32 6 0.5 16 1.4 508 45.4 - SB-KERNEL:%COERCE-
CALLABLE-TO-FUN
33 6 0.5 15 1.3 514 45.9 - TIMESIN
34 6 0.5 13 1.2 520 46.4 - DELETE
35 6 0.5 10 0.9 526 47.0 - SB-C::CLASS-INFO-OR-
LOSE
36 6 0.5 6 0.5 532 47.5 - SB-C::GET-INFO-VALUE
37 6 0.5 6 0.5 538 48.0 - EQL
38 5 0.4 91 8.1 543 48.5 - (LAMBDA (L))
39 5 0.4 58 5.2 548 48.9 - MSET
40 5 0.4 40 3.6 553 49.4 2782 COLNEW::DGESL
41 5 0.4 28 2.5 558 49.8 - COMMON-LISP:MAKUNBOUND
42 5 0.4 21 1.9 563 50.3 - SB-KERNEL:TYPE=
43 5 0.4 10 0.9 568 50.7 - PLS
44 5 0.4 5 0.4 573 51.2 - SB-IMPL::GET2
45 5 0.4 5 0.4 578 51.6 - SB-KERNEL:HAIRY-DATA-
VECTOR-REF
46 4 0.4 395 35.3 582 52.0 74 COLNEW::LSYSLV
47 4 0.4 63 5.6 586 52.3 - MBIND
48 4 0.4 46 4.1 590 52.7 - MUNBIND
......
Here the colnew function which gets most samples is approx, and it gets 9
samples only. One can see where using:
:lisp(disassemble 'colnew::approx)
but 9 samples on a long function gives nothing interesting.
--
Michel Talon