Raymond Toy <toy.raymond at gmail.com> writes:
> I tried this out on an old 750 MHz sparc with cmucl, the oldest and
> slowest machine I have access to. From the maxima prompt until "?
> run_testsuite" returns, it takes about 7 sec. A tad slow, but not
> terribly slow. About 3 sec for the next invocation.
>
> For completeness, here are the results from the commit message:
>
> ; Evaluation took:
> ; 5.56f0 seconds of real time
> ; 5.2f0 seconds of user run time
> ; 0.27f0 seconds of system run time
> ; 4,173,048,959 CPU cycles
> ; 0 page faults and
> ; 21,354,312 bytes consed.
Thank you very much for trying this out! To be honest, that's quite a
lot slower than I was expecting. I spent some time trying to see whether
we could be faster on CMUCL and it looks like there's something weird
buried in the regex engine. More below.
> Works ok with cmucl on a Mac. (But the 2013-03a snapshot has a bug in
> file-position that causes the text to be wrong.)
Oh dear. Is that something we can / should work around? What propertion
of cmucl users will be running the 2013-03a snapshot?
About the regex stuff:
TLDR:
=====
(1) I've just pushed a change that makes "??" significantly faster on
CMUCL and should improve it on other lisps too. No change to "?",
I'm afraid.
(2) Ray, is there any way you can easily work out what the sparc is
spending all it's time on? (with the new code) Running the first
code snippet below with profile:profile-all would probably tell me
a lot. If you can, could you use (topic-match "foo" t) and
(topic-match "foo" nil) and let me know the difference?
Gory details:
=============
Firstly, I used the deterministic profilers on SBCL and CMUCL, running
(progn
(setf *documents* nil)
(pushnew (info-pathname nil) *deferred-documents*)
(topic-match "Number Theory" t))
ten times each. The top of the results:
SBCL:
CL-INFO> (sb-profile:report)
measuring PROFILE overhead..done
seconds | gc | consed | calls | sec/call | name
----------------------------------------------------------
0.233 | 0.000 | 30,715,992 | 5 | 0.046609 | SCAN-INFO-FILE
0.150 | 0.016 | 14,676,784 | 5 | 0.030064 | ALL-DOC-REGEX-MATCHES
0.093 | 0.000 | 16,384,464 | 5 | 0.018546 | READ-INFO-INDEX
0.053 | 0.012 | 9,333,656 | 13,840 | 0.000004 | STRIP-SECTION-TITLE
0.019 | 0.000 | 0 | 12,130 | 0.000002 | INFO-INDEX-STRIPPED-NODE
CMUCL:
CL-INFO> (profile:report-time)
Consed | Calls | Secs | Sec/Call | Bytes/C. | Name:
-----------------------------------------------------------------------
14,060,392 | 10 | 0.380 | 0.03800 | 1,406,039 | DOCUMENTATION-MATCHING-TOPICS
11,936,072 | 5 | 0.210 | 0.04200 | 2,387,214 | SCAN-INFO-FILE
10,054,392 | 5 | 0.180 | 0.03600 | 2,010,878 | READ-INFO-INDEX
6,806,208 | 13,840 | 0.082 | 0.00001 | 492 | STRIP-SECTION-TITLE
2,069,040 | 5 | 0.060 | 0.01200 | 413,808 | INFO-READ-TAG-TABLE
The CMUCL results are a bit surprising, since
DOCUMENTATION-MATCHING-TOPICS is just a search through stuff we've
already parsed (and should be quick!). Narrowing down further, if I do
the following:
(defvar *regex*
(coerce
(maxima-nregex::regex-compile
"Number Theory" :case-sensitive nil) 'function))
(defvar *titles*
(mapcar #'doc-topic-name (info-doc-index (cdar *documents*))))
(time (dotimes (i 100) (mapc *regex* *titles*)))
then on SBCL, I get:
Evaluation took:
0.027 seconds of real time
0.028000 seconds of total run time (0.028000 user, 0.000000 system)
103.70% CPU
63,855,669 processor cycles
3,878,904 bytes consed
and on CMUCL:
; Evaluation took:
; 2.5f0 seconds of real time
; 2.444f0 seconds of user run time
; 0.052f0 seconds of system run time
; 5,734,503,475 CPU cycles
; [Run times include 0.09f0 seconds GC run time]
; 0 page faults and
; 111,166,664 bytes consed.
!!! Looks like I've found something CMUCL is doing really badly
I'm not going to work on MAXIMA-NREGEX today. Indeed, maybe the right
thing to do is to replace it with CL-PPCRE or similar? Anyway, the good
news is that "??" doesn't actually need a regular expression search,
since it just works by case-folded substrings. So we can do that
instead. I've pushed a version based on SEARCH instead, where the
"case-folding" works by discarding everything except spaces and
alphanumerics then upcasing the lot. Maybe this isn't quite what we
want, but I like the fact that "?? eulers" returns %e ("Euler's number").
Timings afterwards are much faster for ?? than for ?. On CMUCL, the
difference is pronounced. The following test is basically loading up the
document 50 times and calling "? Number Theory" or "?? Number Theory"
each time.
CL-INFO> (time
(dotimes (n 50)
(progn
(setf *documents* nil)
(deferred-register-document (info-pathname nil))
(topic-match "Number Theory" t))))
; Evaluation took:
; 8.02f0 seconds of real time
; 7.6f0 seconds of user run time
; 0.408f0 seconds of system run time
; 18,400,213,418 CPU cycles
; [Run times include 0.68f0 seconds GC run time]
; 0 page faults and
; 498,037,184 bytes consed.
;
NIL
CL-INFO> (time
(dotimes (n 50)
(progn
(setf *documents* nil)
(deferred-register-document (info-pathname nil))
(topic-match "Number Theory" nil))))
; Evaluation took:
; 5.11f0 seconds of real time
; 4.764f0 seconds of user run time
; 0.336f0 seconds of system run time
; 11,722,715,612 CPU cycles
; [Run times include 0.52f0 seconds GC run time]
; 0 page faults and
; 374,522,656 bytes consed.
;
NIL
The corresponding times before this change are 8.05 seconds and 12.9
seconds, respectively. Woo! (In case anyone is wondering, on SBCL with
the new code the timings are 5.4 seconds, 3.9 seconds)
If anyone's still reading here, I'm astonished...
Rupert
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 315 bytes
Desc: not available
URL: <http://www.math.utexas.edu/pipermail/maxima/attachments/20130403/d8da553e/attachment.pgp>