property list hack, profiling, compiler optimization, looking at code




This is not part of the Common Lisp standard, but here is an example 
(edited somewhat for brevity)
to show how profiling can work.  The function f was previously defined.

f(n):=block([],mode_declare([i,n],integer),
   for i thru n do (sin(i),cos(i),log(i),tan(i),atan2(1,i)))$

I compiled it in Allegro CL 8.2 (express edition  [free])

(%i51) f(10^5);
Evaluation took 11.1720 seconds (11.3120 elapsed) including GC time 
0.016 s, using 10500037 cons-cells and 120 other bytes.
(%o51)                               done
(%i52) to_lisp();
Type (to-maxima) to restart, ($quit) to quit Maxima.

maxima> (symbol-plist '%sin)
(tex-rbp 130 texsym ("\\sin ") texword "\\sin " tex tex-prefix 
commutes-with-conjugate t
          sp2
          ((%sum)  ...... etc.   ;; the operators property was way down 
on the prop list


maxima> (load "c:/lisp/ops.lisp")
; Loading c:\lisp\ops.lisp   ;; this puts the operators simplifier at 
the top of the prop list.
t
maxima> (continue)
(%i53) f(10^5);
Evaluation took 11.0000 seconds (11.3130 elapsed) using 10500031 
cons-cells and 152 other bytes.
(%o53)                               done
(%i54) 0.172/11.172;
Evaluation took 0.0000 seconds (0.0000 elapsed) using 9 cons-cells and 
144 other bytes.
(%o54)                        .01539563193698532

;;; the speedup for Allegro CL is only about 1.5%.  Still, essentially 
no cost to do this.

(%i55) :lisp (symbol-plist '%sin)
(operators simp-%sin tex-rbp 130 texsym (\sin ) texword \sin  tex tex-prefix
      ... just checking to see where the operators property is ..


(%i55) :lisp (prof:start-profiler)   ;;; THIS IS THE ALLEGRO PROFILER.  
SBCL HAS ONE,  GCL maybe not..
sampling
(%i55) f(10^5);
Evaluation took 10.9840 seconds (10.9840 elapsed) including GC time 
0.016 s, using 10500031 cons-cells and 216 other bytes.
(%o55)                               done
(%i56) :lisp (prof:show-flat-profile)
Sampling stopped after 3735 samples taken for the time profiler.

Sample represents 11.0 seconds of processor time (out of a total of 11.0)

Times below 1.0% will be suppressed.

   %     %     self  total            self   total  Function
  Time  Cum.   secs   secs    calls ms/call ms/call   name
   7.8   7.8    0.9    0.9                          "qcons"
   6.7  14.5    0.7   10.7                          ... simplifya
   5.7  20.2    0.6    0.6                          excl::get_2op
   4.3  24.4    0.5    0.5                          "bind_one_spec"
   3.6  28.0    0.4    0.4                          "qcar"
   3.2  31.2    0.4    0.5                          ... zerop1
   2.4  33.7    0.3    0.4                          eqtest
   2.2  35.9    0.2    0.4                          ... 
excl::gethash_2op_1ret
   2.2  38.1    0.2    1.5                          ... tms
   2.1  40.2    0.2    0.5                          risplit
   2.0  42.2    0.2    0.2                          mop
   2.0  44.1    0.2    0.2                          $bfloatp
   1.7  45.9    0.2    2.1                          ... simptimes
   1.7  47.6    0.2    1.0                          ... trisplit
   1.7  49.3    0.2    1.3                          ... flonum-eval
   1.6  50.9    0.2    0.2                          "qcdr"
   1.5  52.4    0.2    0.5                          ... complex-number-p
   1.5  53.9    0.2    0.2                          "restify2"
   1.4  55.3    0.2    0.2                          ... alike1
   1.4  56.7    0.2    0.2                          "unbind_count"
   1.4  58.1    0.1    0.6                          ... timesin
   1.3  59.4    0.1    0.1                          floatp
   1.3  60.7    0.1    0.1 "unbind_one_spec"
   1.2  61.9    0.1    1.1                          ... big-float-eval
   1.1  63.0    0.1    4.4                          ... simpatan2
   1.1  64.1    0.1    1.7                          ... simp-%tan
   1.0  65.1    0.1    2.7                          ... simp-%atan
   1.0  66.1    0.1    0.2                          ... distribute-over
(%i56) (%o50)                               done

Looks like there are a bunch of places to make this program run faster.
zerop1 is near the top.  risplit  (real-imaginary-split ???) is there too.
I assume that programs like "qcons" in quotes are part of the lisp system.
In my experience this profiler works for foreign function library programs
too. That is, loading MPFR, written in C, and running the profiler reports
times for calls of C programs to other C programs within MPFR. (subject
to availability of external names, etc.)


If someone wants to speed up the test suite  start with a profile.
for sbcl, see http://www.sbcl.org/manual/Statistical-Profiler.html
This may be a poor place to start (like counting cup-holders :) but
it could be done.

Some speedups are useful only relative to particular underlying lisps.
For example, I found major speedups available in simplification using
hashtables in Allegro CL, but for GCL, not much.  Also there are
issues of compiler optimization vs. debugging.  I assume there is a
modest factor improvement available in Allegro and SBCL by setting
the compiler optimization level.  In selected circumstances with
declarations for numerics the speed-up can be quite substantial
(10X or even 100X), but I assume it would be much less if one re-compiles
"all of maxima" and runs mostly symbolic code.

It is possible to look at the "maxima" version of $f; it is here:

(symbol-plist '$f)
(excl::.args. nil function-mode $any translated t once-translated
  "I was once translated" mprops
  (nil mexpr
   ((lambda) ((mlist) $n)
    ((mprog) ((mlist)) (($modedeclare) ((mlist) $i $n) $integer)
     ((mdo) $i 1 nil nil $n nil
      ((mprogn) ((%sin) $i) ((%cos) $i) ((%log) $i) ((%tan) $i) 
(($atan2) 1 $i)))))))


It is possible to look at the "lisp" version of $f by tracing 
translate-function

        (defmtrfun ($f $any mdefine nil nil) ($n) (declare (special $n) 
(fixnum $n))
                   ((lambda nil
                      nil
                      nil
                      (do (($i 1 (+ 1 $i)))
                          ((> $i $n) '$done)
                        (declare (special $i))
                        (simplify (list '(%sin) $i))
                        (simplify (list '(%cos) $i))
                        (simplify (list '(%log) $i))
                        (simplify (list '(%tan) $i))
                        (simplify (list '($atan2) 1 $i)))))))


It is possible to look at the assembler code for $f  ...

maxima> (disassemble '$f)
;; disassembly of #<Function $f>
;; formals: $n
;; constant vector:
0: $n
1: $i
2: $done
3: (%sin)
4: simplify
5: (%cos)
6: (%log)
7: (%tan)
8: ($atan2)

;; code start: #x21ac1af4:
    0: 55         pushl    ebp
    1: 8b ec      movl    ebp,esp
    3: 83 ec 78   subl    esp,$120
    6: 89 75 fc   movl    [ebp-4],esi
    9: 89 5d e4   movl    [ebp-28],ebx
   12: 39 63 1a   cmpl    [ebx+26],esp
   15: 76 03      jbe    20
   17: ff 57 43   call    *[edi+67]    ; sys::trap-stack-ovfl
   20: 8b 9b e6 00 movl    ebx,[ebx+230]
       00 00
.... etc etc.