[sage-devel] Fwd: Computer algebra system? / SAGE and Pipes



> ---------- Forwarded message ----------
> From: Richard Fateman <fateman at cs.berkeley.edu>
> Subject: Re: [Maxima] Computer algebra system? / SAGE and Pipes
>
> The comment that SAGE's mechanisms are good (or adequate) for linking
> various systems is either naïve, or assumes that users of SAGE use SAGE only
> for simple things, or the systems SAGE is linking together are naïve.  ( I
> think the last is false since it includes Maxima).

His comments weren't naive: Jaap has been an active user of nontrivial
functionality of SAGE for a long time.  Users of SAGE do not use SAGE only
for simple things -- it is used for everything from teaching to very
sophisticated research.  The systems SAGE links together are not naive
either, as they currently include Maxima, Maple, Mathematica, GAP, Magma,
PARI/GP, Singular, MuPAD, MATLAB, and many other special-purpose systems
such as gfan, polymake, PALP (for lattice polytopes), genus2reduction for
curves, Lcalc for computing with L-functions, etc.

> Pipes as popularized by UNIX are suitable for exchanging relatively
> low-capacity serial messages in an environment where the options on each
> side of the communication are fully characterized.

I agree that messages must be small (basically, at most
about 3000 characters), at least in the direction SAGE --> other system.
SAGE uses an expect-based system and pseudo-tty's
so it is not necessary that the messages be fully characterized.  For example,
one can wait until either any one of a list of regexp's appears in the output
stream or a certain amount of time elapses.
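
To make the "list of regexp's or a timeout" idea concrete, here is a minimal
sketch in plain Python.  It is not the pexpect code SAGE uses; `expect_any`
and `read_chunk` are hypothetical names, and `read_chunk` stands in for
whatever reads the next piece of output from the pseudo-tty.

```python
import re
import time


def expect_any(read_chunk, patterns, timeout, clock=time.monotonic):
    """Read until one of `patterns` matches or `timeout` seconds pass.

    `read_chunk` is a hypothetical callable that returns the next piece of
    output as a string ('' when nothing new is available yet).
    Returns (index_of_matching_pattern, text_so_far), or
    (None, text_so_far) if the timeout elapses first.
    """
    compiled = [re.compile(p) for p in patterns]
    deadline = clock() + timeout
    buffer = ""
    while True:
        # Check every pattern against everything received so far.
        for index, regex in enumerate(compiled):
            if regex.search(buffer):
                return index, buffer
        if clock() >= deadline:
            return None, buffer
        buffer += read_chunk()
```

The caller only has to know which patterns might show up, not the exact shape
of the whole reply.  A real implementation would block on the read instead of
polling, which this sketch does not attempt.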

> I have not studied SAGE,

I very much wish you would -- I've read with great interest numerous papers
you've written on other pieces of mathematical software (e.g., Mathematica).
You've commented a few times on SAGE on this list, but perhaps because
you haven't spent much time actually using SAGE, your comments haven't
been as helpful to us as they might otherwise have been.

> but here are some issues that I think would come up.

I think your main point below is that it would be optimal if an entire
mathematical system were written in one language, and even more optimal if
that language were Lisp.  The design of SAGE takes a very different
perspective.  Our goal is to create a comprehensive, organized, and unified
range of functionality quickly (!), structured in such a way that we can make
certain parts of that functionality better over time.  The goal is to do this
by unifying diverse open source math software projects in both a social and
technical way, so that instead of competing with each other (e.g., Singular
versus the now GPL'd CoCoA), they can work together to provide a truly viable
alternative to Magma, Maple, Mathematica, and MATLAB.

SAGE is the only way David Joyner and I could find to have a chance
at attaining the above goals.  Thus, in the interest of speed and
cooperation, both in a social and technical sense, SAGE strongly encourages
a wide range of approaches to mathematical software to work together -- some
people write best in C/C++, some in Lisp, some in Python, etc., and some
problems are best solved with certain languages and libraries.  In SAGE most
approaches are strongly welcome, and indeed SAGE is designed from the ground
up to make using a wide range of techniques together in a single project both
possible and hopefully easy.

One example that illustrates the short- and long-term situation with SAGE well
is the interaction between SAGE and Singular.  Singular provides the fastest
available general-purpose open source Groebner basis engine, and incredibly
fast multivariate polynomial arithmetic (the fastest of any system, closed or
open source, I think).  Two years ago David Joyner and I started including
Singular with SAGE so that SAGE could factor multivariate polynomials,
compute a huge range of Groebner bases, and take advantage of the
comprehensive library of algebraic geometry code offered by Singular.
Initially the way SAGE and Singular worked together was via a pseudo-tty
interface (using the Python package pexpect), and -- when the amount of data
to be sent to Singular is large -- a disk file.  We were able to get a
useful Singular interface up and running in a day or two (literally!),
and it allowed us to move forward.  By far, most of the time involved in
incorporating Singular into SAGE was spent on getting Singular to actually
compile on all the architectures supported by SAGE -- this was *really* hard,
but numerous people helped out and we did it.  However, the pexpect interface
is almost useless for arithmetic, unless all the arithmetic takes place in
Singular, because of the latency of each round trip through the interface.

For the last 3 months Martin Albrecht (a German SAGE developer) has
been creating a direct C++ library interface between SAGE and Singular,
i.e., making it so that SAGE can directly and very robustly take advantage
of Singular's highly optimized multivariate polynomial arithmetic code.
Just getting this to work has been extremely challenging -- the first version
came out in SAGE-2.5 (it's not used by default, but the code is there and
anybody can try it now).  Moreover, Martin has to write tons of
tricky-to-debug compiled code every time a new type of base ring is
supported, etc.  It's hard work; often things work on one platform but not
another, and that has to be resolved, etc.  But the results are simply
*spectacular*.  With pseudo-tty's, I made it so SAGE could use all that
functionality of Singular in just a few hours -- with a library interface
it takes months of hard work.  Going from the pexpect interface to a C++
library interface is a natural migration, and it is well under way.

There's a similar story to tell about the SAGE <--> PARI/GP interaction; in
the beginning there was just a pexpect interface, then Justin Walker, Karim
Belabas (lead PARI developer), Gonzalo Tornaria, and I wrote a much faster
C-library interface.  The current work of some of the SAGE developers, mainly
Bill Hart and David Harvey, on highly optimized arithmetic in univariate
polynomial rings and on integer factorization will hopefully be incorporated
back into PARI when it stabilizes.

It's entirely possible that at some point in the future the SAGE <--> Maxima
interface will change to use something besides pipes.  If somebody
figures out a different and better way, I'd very much love to
hear about it.

> 1. Maxima sometimes answers a question with another question, not an answer.
> (e.g. Is s positive, negative, or zero). This is usually unexpected.

In SAGE-2.5 those annoying questions are literally "expected" via the
pexpect interface, which listens for any of a list of regexp's to appear in
the output stream.  E.g.,
sage: maxima('integrate(x^n,x)')
...
TypeError: Computation failed since Maxima requested additional
constraints (use assume):
Is  n+1  zero or nonzero?
sage:
----------------

At this point, the user can either use the assume command to set a constraint,
or type "!maxima" to temporarily switch to the maxima command line.
  sage: !maxima
  Maxima 5.12.0 http://maxima.sourceforge.net
  ...
  (%i1)

------

It's possible of course that we've missed some sort of interactive question.  If
anybody knows of one that confuses SAGE, let me know.
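
As a rough illustration of how a question in the output can be turned into
the exception shown above, here is a small sketch.  The two patterns are my
own guesses for illustration, not the actual list in SAGE, and
`check_for_question` is a hypothetical helper.

```python
import re

# Hypothetical patterns; the real list used by SAGE is not shown here.
QUESTION_PATTERNS = [
    re.compile(r"Is\s+.+\s+zero or nonzero\?"),
    re.compile(r"Is\s+.+\s+positive, negative,? or zero\?"),
]


def check_for_question(output):
    """Raise TypeError if `output` contains an interactive question.

    Otherwise return `output` unchanged.
    """
    for regex in QUESTION_PATTERNS:
        match = regex.search(output)
        if match:
            raise TypeError(
                "Computation failed since Maxima requested additional "
                "constraints (use assume):\n" + match.group(0)
            )
    return output
```

Any question that matches none of the patterns would slip through, which is
exactly the kind of case worth reporting.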

> 2. Maxima sometimes runs out of memory, or goes into a loop.  Interrupting
> it from the keyboard may or may not be possible (depends on the Lisp,
> probably).

This is certainly an issue.  SAGE is better at this than you might think,
though, at least as of SAGE-2.5.  It will even automatically kill and restart
a bad maxima process if necessary, after it tries sending control-C's for
a certain specified amount of time.  Much fine tuning has gone into
this as a result of over 2 years of heavy usage of the pexpect interface to
maxima by many people.  SAGE's interaction with Maxima can do anything
you can do from the keyboard -- the question is just what strategy to
use in doing it.
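
The "interrupt for a while, then kill and restart" strategy can be sketched
as follows.  This is only an outline under my own assumptions: the three
callables are hypothetical stand-ins for the real session operations, and
the default timings are made up.

```python
import time


def recover(send_interrupt, wait_for_prompt, restart,
            grace=5.0, interval=0.5, clock=time.monotonic):
    """Interrupt a stuck session; restart it if no prompt ever comes back.

    All three callables are hypothetical stand-ins:
      send_interrupt()          -- deliver a control-C to the child
      wait_for_prompt(seconds)  -- True if a prompt appeared within `seconds`
      restart()                 -- kill the child and start a fresh one
    Returns "interrupted" or "restarted".
    """
    deadline = clock() + grace
    while clock() < deadline:
        send_interrupt()
        if wait_for_prompt(interval):
            return "interrupted"
    # The child ignored every control-C within the grace period.
    restart()
    return "restarted"
```

Passing the clock in as a parameter is just a convenience so the logic can be
exercised without waiting in real time.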

> 3. Maxima sometimes produces objects that cannot be serialized without
> substantial (exponential!) growth in size, unless it is done in some
> sophisticated fashion. Exchanging such objects by simply pipes of Lisp
> read/write of characters is not sophisticated enough.

One can work with maxima objects directly without having to print
them out.  For example, in the following session there is a large object,
but at no point has the huge object been sent back or forth via
the pseudo-tty interface:

sage: a = maxima('(x+1)')
sage: b = a^1000
sage: c = b.expand()
sage: c.name()
'sage3'

The c here is simply an instance of a Python class that holds a reference to
a maxima session and the name of a variable in that session.

The interfaces with all other math software work much the same way. One
can directly work with potentially huge objects that live entirely in the other
system.
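
Here is a toy version of that handle idea, to show that only short commands
cross the interface.  `FakeSession` and `RemoteElement` are hypothetical
names, and the session just records what would be sent instead of talking to
a real Maxima.

```python
import itertools


class FakeSession:
    """Stand-in for a running Maxima process; it only records commands."""

    def __init__(self):
        self.sent = []
        self._counter = itertools.count()

    def assign(self, expression):
        # Bind the result to a fresh variable inside the other system.
        name = "sage%d" % next(self._counter)
        self.sent.append("%s : %s$" % (name, expression))
        return RemoteElement(self, name)


class RemoteElement:
    """Handle to a value that lives entirely in the other system."""

    def __init__(self, session, name):
        self._session = session
        self._name = name

    def name(self):
        return self._name

    def __pow__(self, exponent):
        return self._session.assign("(%s)^(%d)" % (self._name, exponent))

    def expand(self):
        return self._session.assign("expand(%s)" % self._name)


session = FakeSession()
a = session.assign("(x+1)")
c = (a ** 1000).expand()
print(c.name())      # sage2
print(session.sent)  # three short assignment commands
```

The expanded polynomial would have 1001 terms, but every command sent is a
few dozen characters at most; the big object is only printed if the user asks
for it.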

When Edelman talked about Star-P (http://www.interactivesupercomputing.com/)
at the MSRI parallel computation workshop (I'm sorry you had to miss it, by
the way), he described how their MATLAB <--> Star-P interface works. It
turns out it is similar to what SAGE does above, in that it manipulates remote
objects (often large matrices) without ever moving them back to the client
computer.

> 4. Serializable objects (say, from Maxima) in the command system of SAGE
> must be understood in a context that is suitable for interacting with
   ^^^^^
> objects from another hosted system (say, MAGMA).

Do they?  Who says so?

>  Does SAGE understand
> Maxima's Poisson Series?

Probably not today.   But you could write something like

  sage: f = maxima('something that creates a Poisson series')

and work with it. And at some point SAGE will very likely
understand what maxima does with Poisson series.

> Writing one to a file and then reading it in via a
> command sent in a pipe could be extraordinarily expensive.

Fortunately, SAGE doesn't have to do that.

> Saving everything
> that one can conceive of -- say a restartable core image, via a pipe would
> be, I think, a challenge.

SAGE doesn't do that.

> Mapping all error messages in Maxima to a single
> error may be uncomfortable.

SAGE doesn't do that.  For example,
sage: maxima('1/0')
...
TypeError: Error executing code in Maxima
CODE:
        sage0 : 1/0$
Maxima ERROR:

Division by 0

sage: maxima('load("foo")')
TypeError: Error executing code in Maxima
CODE:
        sage5 : load("foo")$
Maxima ERROR:

Could not find `foo' using paths in file_search_maxima,file_search_lisp.

-----

And the errors are Python exceptions, so one could catch
them, parse the error messages, and have that influence
the logic of a large program.
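
For instance, a program could branch on the text after "Maxima ERROR:".  The
sketch below assumes the message layout shown in the sessions above;
`fake_maxima`, `maxima_error_text`, and `safe_reciprocal` are hypothetical
names, and `fake_maxima` merely imitates a failing call.

```python
import re


def fake_maxima(code):
    """Hypothetical stand-in that fails the way the sessions above do."""
    raise TypeError(
        "Error executing code in Maxima\n"
        "CODE:\n\tsage0 : %s$\n"
        "Maxima ERROR:\n\nDivision by 0\n" % code
    )


def maxima_error_text(exc):
    """Return the text after 'Maxima ERROR:' in an exception message."""
    match = re.search(r"Maxima ERROR:\s*(.*)", str(exc), re.DOTALL)
    return match.group(1).strip() if match else ""


def safe_reciprocal(expression, evaluate=fake_maxima):
    """Return the result, or the string 'undefined' on division by zero."""
    try:
        return evaluate("1/(%s)" % expression)
    except TypeError as exc:
        if "Division by 0" in maxima_error_text(exc):
            return "undefined"
        raise  # any other Maxima error is passed on unchanged
```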

> Introducing yet another language (Python, PERL, ....) to the mix makes the
> semantics potentially more difficult.

Quite true.

> Lisp, for example, has an entirely
> smooth interface between small and large integers. Does Python?

It's not perfect, but it's not bad.  Python 3.0 will completely eliminate
that distinction.  In SAGE we mostly use one integer type which is a highly
optimized wrapping of the GMP C library, and there is no distinction
between small and large integers.
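
A quick check of the plain Python side, assuming a Python 3 interpreter
(where the unification mentioned above is in place).  This says nothing
about SAGE's own GMP-based integer type.

```python
small = 2 ** 10
large = 2 ** 200

# One built-in type covers both sizes; there is no overflow to handle.
assert type(small) is type(large) is int
assert (large + 1) - large == 1
assert (2 ** 63 - 1) + 1 == 2 ** 63  # crosses the 64-bit signed boundary
```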

Anyway, your comment is simply a criticism of Python as a language
for doing mathematics.   There are many other criticisms one can make.
Python is definitely not perfect.  But as mainstream programming
languages go, it is reasonably suitable for mathematics.

> Writing totally in Lisp  (though perhaps not generic CL in everyone's
> implementation) has an interesting proof-of-concept: the Lisp machines.  In
> these designs the entire or nearly entire system from operating system to
> user-interface (including compilers for Fortran, paging systems etc. were
> all written in Lisp, as well as Macsyma...) This residential system provided
> a far more uniform environment than I think is possible in the SAGE design.
> There were Lisp machines built by Symbolics, LMI (Texas Instruments), Xerox.
> Some of these are still running, apparently.

This approach is very nice in theory, but unfortunately the social and
technical goals of SAGE can't be accomplished via this approach.

> Some of the current Maxima project coding effort is, I suspect, similarly
> hobbled by pipes.  Perhaps someone else might comment on this! My impression
> is that the plotting has the defect, that (say) zooming in and computing
> more points in a plot may not be possible in the usually Maxima plot because
> the data produced by the Maxima plotting program is a fixed package that is
> handed (via a pipe or something similar) to a stand-alone plotting program
> that does not know that it can call back for more information,  as it
> should.  (Such features are available in the commercial Macsyma, and have
> been for many years. I think they were recently introduced to Mathematica).

I don't know any details about the above, except that blaming pipes might be
unfair; maybe you Maxima developers just haven't had time to implement
interactive zooming, and chose instead to improve other parts of Maxima
(thanks for all your fine work!).  It's possible to design ways of enabling
two-way communication using pipes for the application suggested above.
pexpect, which SAGE uses, was designed mostly for scripting ssh logins, etc.,
i.e., interactive sessions that require two-way communication.
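
To show that "calling back for more points" is possible over ordinary pipes,
here is a minimal sketch using only the Python standard library.  Everything
in it is an assumption for illustration: the child is a tiny Python script
that samples x^2, not Maxima or any real plotting program, and the
one-line-per-request protocol is made up.

```python
import subprocess
import sys

# Hypothetical child: reads "lo hi n" requests, answers with n samples of x^2.
CHILD = r'''
import sys
for line in sys.stdin:
    lo, hi, n = line.split()
    lo, hi, n = float(lo), float(hi), int(n)
    step = (hi - lo) / (n - 1)
    print(" ".join(repr((lo + i * step) ** 2) for i in range(n)), flush=True)
'''


class Sampler:
    """Keeps one child process alive and asks it for points on demand."""

    def __init__(self):
        self.proc = subprocess.Popen(
            [sys.executable, "-c", CHILD],
            stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
        )

    def sample(self, lo, hi, n):
        # One request line out, one reply line back (n must be at least 2).
        self.proc.stdin.write("%s %s %s\n" % (lo, hi, n))
        self.proc.stdin.flush()
        return [float(v) for v in self.proc.stdout.readline().split()]

    def close(self):
        self.proc.stdin.close()
        self.proc.wait()
```

A viewer that "zooms in" would simply call `sample` again with a narrower
interval; the pipe stays open between requests, so the data is not a fixed
package handed over once.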

> It is much easier to write a system in which everything works well as long
> as all the participant programs cooperate nicely with the human user. It is
> much harder to write a system that works when the participant programs are
> stretched to (or beyond) the breaking point and the human is trying to solve
> problems that were unanticipated by the designers.  I don't know where in
> the spectrum of systems SAGE fits.

SAGE was frickin' incredibly hard to write.    That's just the way it is.  We
have goals and we need to accomplish them no matter how difficult it is.

For the most part, SAGE goes for "hard to write" but we'll be done this
year, rather than "easy to write" but it will take 30 years.  And actually
solving the really hard problems that have come up in creating SAGE
has been exciting for the people involved.

> I have been impressed and somewhat
> surprised by the apparent success of SAGE, but it could also be that many of
  ^^^^^^^^^^

Indeed.  In December 2004 you wrote "... the activity is
essentially doomed. [...] Elephants are interesting and useful.
Feathers are interesting  and useful.  Elephants with feathers are a
curiosity. Is SAGE an elephant with feathers?"

By the way, thanks for your Dec 2004 post; I definitely learned something
from it.

> the participating programs are more cooperative than Maxima, and were
> designed ab initio to play nicely as batch programs.

Maxima is slightly harder than most other participating programs, primarily
because of Maxima's interactive question asking.  At some point we'll
have to do something about that.  The design choice for Mathematica is that
they simply return all possible answers, e.g., for integrate(x^n, x), they
would return the integral in two cases.  SAGE will likely eventually do the
same (via carefully thought-through interaction with Maxima).

-- 
William Stein
Associate Professor of Mathematics
University of Washington
http://www.williamstein.org