Subject: Initial- and boundary-value problems in Maxima.
From: Richard Fateman
Date: Thu, 15 Sep 2011 07:46:53 -0700
On 9/15/2011 1:50 AM, Michel Talon wrote:
> ...
> When we looked at these numbers we tried to compile the formulas. It made no
> difference. This is not surprising. When you write x+y in maxima, the + is
> not an ordinary plus, it is an MPLUS, and this simple formula goes through
> several steps before being recognized as the addition of two numbers.
> Compiling the formula doesn't change that.
If you know that x and y are always double-float numbers, then a
declaration is needed to take advantage of this fact.
add2(x,y):=block([],mode_declare([x,y],float),x+y);
for i:1 thru 10000 do add2(3.0,4.0);   /* takes 0.33 sec */
compile(add2);
for i:1 thru 10000 do add2(3.0,4.0);   /* takes 0.11 sec */
for i:1 thru 10000 do nothing(x,y);    /* takes 0.06 sec (empty loop) */
So, after subtracting the empty-loop time from both measurements, the
speedup is about (0.33-0.06)/(0.11-0.06) = 5.4X.
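A measurement like this can be reproduced with a sketch along the following lines. It assumes elapsed_run_time() as the timer; the names n, t0, t_interp, t_comp and t_empty are made up for the example, and nothing is simply left undefined so that the third loop measures little more than the loop itself. The absolute times will of course depend on the machine and the Lisp.

```maxima
/* Sketch only: n and the t_* variables are names chosen for this example. */
add2(x, y) := block([], mode_declare([x, y], float), x + y)$
n : 10000$

/* 1. interpreted add2 */
t0 : elapsed_run_time()$
for i : 1 thru n do add2(3.0, 4.0)$
t_interp : elapsed_run_time() - t0$

/* 2. compiled add2 */
compile(add2)$
t0 : elapsed_run_time()$
for i : 1 thru n do add2(3.0, 4.0)$
t_comp : elapsed_run_time() - t0$

/* 3. loop overhead; nothing is deliberately undefined (an assumption
      about what the "empty loop" measures) */
t0 : elapsed_run_time()$
for i : 1 thru n do nothing(x, y)$
t_empty : elapsed_run_time() - t0$

/* Overhead-corrected speedup. Guarded, because with a coarse timer the
   compiled loop may not be measurably slower than the empty one. */
if t_comp > t_empty
  then print("speedup:", (t_interp - t_empty) / (t_comp - t_empty))
  else print("times too small to compare; increase n")$
```

If the times come out too small to compare, raising n is the obvious remedy.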
>
>
> I have here the profiling we did using sbcl. One of the routines which is
> most used and takes time is COLNEW::DAXPY
DAXPY is a fundamental part of LINPACK, I think. Rewriting it could
make other things faster too.
Unfortunately, if I understand the SBCL profiling, the time spent in
DAXPY (28 samples of its own, 49 in total) constitutes less than 1% of
the time for the whole task. The other colnew functions, approx and
vwblok, are also insignificant in their own right (less than 1% self
time each).
The profile reveals that SIMPLIFYA and its subroutines take 20.5% of the
time. If you know that all the variables are floats, then the calls to
SIMPLIFYA should be largely eliminated by mode_declare and compiling.
> This routine is surprisingly simple, it just does linear combinations of 2
> columns of a matrix. It is 48 lines of fortran with comments! One of the
> comments is
> c constant times a vector plus a vector.
> c uses unrolled loops for increments equal to one.
> Perhaps this loop unrolling is beneficial in fortran but bad in lisp?
Probably not "bad" in Lisp, but maybe ineffective.
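For readers who have not looked at the routine, here is a rough sketch in Maxima of what "constant times a vector plus a vector" with an unrolled loop amounts to. It is an illustration only, not the Lisp code that colnew actually runs: the names daxpy_plain and daxpy_unrolled are invented, Maxima lists stand in for Fortran arrays, the increment is fixed at one, and the unroll factor of 4 is an assumption (the quoted comment does not give it).

```maxima
/* Straightforward loop: returns a*x + y for lists x, y of length n. */
daxpy_plain(n, a, x, y) := block([r : copylist(y)],
  for i : 1 thru n do (r[i] : a*x[i] + r[i]),
  r)$

/* Same result, with the main loop unrolled four elements at a time
   (the factor 4 is an assumption). */
daxpy_unrolled(n, a, x, y) := block([r : copylist(y), m : mod(n, 4)],
  /* clean-up loop: the first mod(n,4) elements, one at a time */
  for i : 1 thru m do (r[i] : a*x[i] + r[i]),
  /* main loop: four updates per iteration, so fewer loop tests */
  for i : m + 1 step 4 thru n do (
    r[i]     : a*x[i]     + r[i],
    r[i + 1] : a*x[i + 1] + r[i + 1],
    r[i + 2] : a*x[i + 2] + r[i + 2],
    r[i + 3] : a*x[i + 3] + r[i + 3]),
  r)$

xs : [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]$
ys : [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]$
daxpy_unrolled(6, 2.0, xs, ys);
/* [12.0, 24.0, 36.0, 48.0, 60.0, 72.0], same as daxpy_plain(6, 2.0, xs, ys) */
```

The unrolled version saves loop tests and index increments, which matters when each update is a couple of machine instructions. If each a*x[i] + r[i] instead goes through generic arithmetic, that saving is small compared with the cost of the arithmetic itself.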
> Anyways this is a place where access to matrix elements plays a significant
> role, as you say.
> The statistical profiling shows that most samples live in maxima code,
> things like SIMPLIFYA, MEVAL1, etc. This should correspond to the above
> discussion of evaluations of the differential equation at mesh points.
> Other frequent calls are to sbcl functions, I suppose. In fact here is the
> beginning:
>
>              Self        Total        Cumul
>  Nr  Count     %  Count     %  Count     %    Calls  Function
> ------------------------------------------------------------------------
>   1    490   5.8   1138  13.5    490   5.8        -  REMOVE-IF
>   2    467   5.6    467   5.6    957  11.4        -  "foreign function sigprocmask"
>   3    438   5.2    438   5.2   1395  16.6        -  SB-EXT:WEAK-POINTER-VALUE
>   4    260   3.1    260   3.1   1655  19.7        -  SB-C::COMPACT-INFO-LOOKUP
>   5    235   2.8    235   2.8   1890  22.5        -  SB-VM::ALLOC-SIGNED-BIGNUM-IN-EAX
>   6    211   2.5   1725  20.5   2101  25.0        -  SIMPLIFYA
>   7    210   2.5    210   2.5   2311  27.5        -  SB-IMPL::GET3
>   8    182   2.2   8345  99.3   2493  29.7        -  MEVAL1
>   9    137   1.6    137   1.6   2630  31.3        -  (LAMBDA (SB-IMPL::VALUE))
>  10    134   1.6    139   1.7   2764  32.9        -  (LABELS SB-IMPL::EQUAL-AUX)
>  11    134   1.6    134   1.6   2898  34.5        -  SB-KERNEL:%MEMBER-EQ
>  12    128   1.5    193   2.3   3026  36.0        -  ALIKE1
>  13    124   1.5    131   1.6   3150  37.5        -  LENGTH
>  14    116   1.4    247   2.9   3266  38.9        -  SB-KERNEL:VALUES-SPECIFIER-TYPE
>  15    100   1.2    615   7.3   3366  40.1        -  (FLET SB-C::LOOKUP)
>  16     97   1.2     97   1.2   3463  41.2        -  SB-C::VOLATILE-INFO-LOOKUP
>  17     93   1.1    335   4.0   3556  42.3        -  SB-KERNEL:SPECIFIER-TYPE
>  18     91   1.1   1318  15.7   3647  43.4        -  SUBST1
>  19     91   1.1    233   2.8   3738  44.5        -  EQUAL
>  20     90   1.1   2498  29.7   3828  45.6        -  MAKE-ARRAY
>  21     88   1.0     88   1.0   3916  46.6        -  SB-IMPL::GET2
>  22     84   1.0    109   1.3   4000  47.6        -  SB-KERNEL:CSUBTYPEP
>  23     81   1.0    163   1.9   4081  48.6        -  GETL
>  24     80   1.0    165   2.0   4161  49.5        -  ALIKE
>  25     68   0.8    100   1.2   4229  50.3    76618  COLNEW::APPROX
>  26     65   0.8     65   0.8   4294  51.1        -  KEYWORDP
>  27     65   0.8     65   0.8   4359  51.9        -  (LABELS SB-IMPL::SXHASH-RECURSE)
>  28     63   0.7    390   4.6   4422  52.6        -  PLS
>  29     63   0.7    254   3.0   4485  53.4        -  TIMESIN
>  30     59   0.7    121   1.4   4544  54.1        -  EQTEST
>  31     55   0.7   3194  38.0   4599  54.7    23178  COLNEW::VWBLOK
>
> The first explicit colnew functions are approx and vwblok.
> Approx numerically evaluates the differential equation, which is compute
> intensive, and vwblok solves a big linear system, which is also compute
> intensive.
>
> By contrast DAXPY is very simple, but called very often:
>  63     28   0.3     49   0.6   5793  68.9  1164244  COLNEW::DAXPY
>
>
>