From: Stavros Macrakis
Sent: Monday, December 27, 2010 7:07 PM
To: Richard Fateman
Cc: Richard Hennessy ; Maxima List
Subject: Re: [Maxima] Multi-threaded Maxima
Hennessy speculated:
...Maxima could be as much as 3 ? 5 times faster if it used all
[cores]
Fateman replied:
Not if it is waiting for memory. ...a number of basic algorithms
that could be made faster,... if memory bandwidth were not an issue.
I suppose it all depends on the memory usage patterns (in
particular, the cache footprint) of the particular application.
I ran some simple tests on my 2-core (E8400) machine:
randpol(deg,coef):=sum((random(coef*2)-coef)*x^i,i,0,deg)$
test(d, c) := block([p1, p2], p1 : randpol(d, c), p2 : randpol(d,
c), factor(rat(p1*p2)))$
for i thru 10 do test(100,10^80)$
in two Maxima processes simultaneously (started by hand) vs. in one
Maxima process sequentially.
With two processes, the elapsed times were 54.8+53.5 = 108.3 (with
Windows showing 100 % CPU usage) and with one process, 93.4 (50% CPU
usage). I tried other parameters -- different sized polynomials presumably
require different amounts of memory for the factor calculation --
and saw similar results.
I did the same thing sequentially and I got 45.54+40.11=85.65 seconds. CPU
was at 13%
In two hand-started processes simultaneously I got max(44.22, 48.21) = 48.21
seconds. CPU was at 25%.
In three hand-started processes simultaneously I got max(54.70, 52.81,54.64)
= 54.70 seconds. CPU was at 35%.
I also tried a simple rational matrix inversion:
for i thru 30 do genmatrix(lambda([i,j],random(100)-50),50,50)^^-1$
and got 20.8+21.3 = 42.1 vs. 39.4.
For i = 50 in two processes I got.
for i thru 50 do genmatrix(lambda([i,j],random(100)-50),50,50)^^-1$
I got max(31.76, 32.49) with CPU at 25%
For i = 100 in one process I got
for i thru 100 do genmatrix(lambda([i,j],random(100)-50),50,50)^^-1$
60.83 seconds.
I cannot interpret these results since I don't know what was going on in
these tests. I guess I have a fast machine.
So the advantage in these cases are minimal, as Fateman predicts.
Not for me.
Rich
There may well be symbolic Maxima applications with a small-enough
memory footprint that memory access isn't the bottleneck, but there's no
guarantee of that.
-s