Subject: Discussion about arrays and matrices in Maxima
From: Žiga Lenarčič
Date: Tue, 19 May 2009 09:47:21 +0200
On 19. May, 2009, at 6:29 AM, Robert Dodier wrote:
> On 5/18/09, ?iga Lenar?i? <ziga.lenarcic at gmail.com> wrote:
>
>> So from my experience - it's hard to improve something in Maxima,
>
> Keep in mind that you are always free to maintain and distribute
> your own Maxima packages (or even your own entire fork).
> That might be a good way to publicize and refine your proposals.
>
> I don't mean to be difficult --- I'm generally sympathetic to your
> proposals. Bear in mind that the wheel of Maxima grinds slowly,
> but exceedlingly fine.
Perhaps the wheel of Maxima is broken a bit :) If an improvement is
suggested (with lisp code) which is faster and/or more featureful and/
or easier to use without breaking backwards compatibility, I think it
should be implemented in some form or another. For instance I have
ideas for some of these 'harmless' improvements:
*) 'transpose' working on lists of lists (I find myself many times I
want to transform a list like [ [a1,a2,a3,..], [b1,b2,b3,...] ] into
[[a1,b1],[a2,b2],...] - I don't want to mess with converting to
matrix and back, just to use 'transpose')
*) extending 'random' to quickly produce a list of numbers (random
numbers are often used in larger quantities).. random(1.0) would work
as it works now, while random(1.0, 100) would produce a list of 100
random numbers much quicker than makelist( random(1.0), i, 1, 100); -
also less code.
If I write lisp code for these two cases, will that get into Maxima?
I don't see any potential harms in these two improvements...
I also know that at some point Andrej V. posted a 'nintegrate'
function - to wrap up quadpack routines. I think users would find it
simpler to a) use something with 'integrate' in its name, b) to not
have to call different quadpack routines depending on the integration
bounds (finite, infinite..). It even supported 'romberg' as an
integration method. After some discussion (mainly about the name of
the function I think), nothing changed in Maxima, and now 3 years
later one still _has_ to call quadpack routines (which is absurd if a
better solution was presented). I fear this happens too often...
>
>> *) Matrices should internally represented as n-dimensional lisp
>> arrays.
>
> I agree that a matrix should use a Lisp array for storage, but
> I'm inclined to use a 1-d array, since it makes some restructuring
> operations (e.g. transpose, row/column extraction) simpler.
>
> I can't tell if the translated Fortran code wants 1-d or 2-d arrays.
> Fortran would be happy with 1-d, but there seems to be some
> code to present 1-d as 2-d or vice versa, I'm not sure what's
> going on.
As far as transposition goes - in LAPACK some routines take as an
argument a transposition flag - so internally values are refered as a
[i][j] in a normal case or a[j][i] in transposed case - no actual
transposition is done. Maxima could have the same system
(transposition as a flag) but not necessarily. If a function layer is
written for writing and reading matrix elements internally with
(defun mmatrix-ref-element ..), then Maxima functions would see a
matrix as a transposed one or a nontransposed one without doing any
actual transposition on array elements. But perhaps it's simpler to
do actual transposition and then allow Maxima functions to work on
matrices with 'aref' directly?
I don't see how 1-d arrays of 1-d arrays help row/column extraction -
one of them can be done via just returning the 1-d array, but the
other one must be done via 'manually' collecting the elements (so you
profit only on one of this cases). I have a feeling 1-d arrays of 1-d
arrays doesn't translate so well to 3-d matrices..
Since matrix is rectangular (i.e. all rows must be of the same
length) I think using n-dimensional lisp arrays is better and
simpler. I imagine that using a 2d-array instead of and array of
arrays is also faster (because lisp compiler can optimize for
performance much more) for most operations. Also (aref (aref arry j)
i) is probably slower and also more complicated to write than (aref
arry (,i ,j)). For more than 2d matrices/arrays this is even more
obvious. Since Maxima is a lisp program, why not leverage all the
features lisp offers. Another complication is if we use general
arrays and 'double-float arrays. If matrices are n-dimensional lisp
arrays, one can easily check the type of array to decide how to treat
it, while in case of 1-d arrays of 1-d arrays, one has to look at the
type of innermost arrays!
1-d arrays of 1-d arrays (growable) could be used to implement Maxima
list internally with faster access in cases of long lists - but I
like the fact Maxima lists are represented with Lisp lists internally
- with some knowledge of lisp, one can easily write a function which
outputs a Maxima list via (cons '(mlist simp) mylist). I see this
ease of adding your own functions to Maxima as it's strongest
argument for expert use of Maxima compared to Mathematica. Virtually
no modification of some lisp routine is necessary to make it work in
Maxima, while extending MATLAB or Mathematica in a compiled language
(C) always requires a layer which translates your 'simple' results (a
C array perhaps) to something MATLAB/Mathematica can understand - and
this is very tiresome to do. While programs written in C have this
inherent problem (extending requires some layer for conversion of
datastructures), Maxima as a Lisp program seem to reap the benefits
of Lisp in this case.
I see using n-dimensional arrays also as reaping the benefits of
using Lisp - no need to complicate things.
What else do 1-d arrays of 1-d arrays offer, besides easier
transposition and row/column extraction? Perhaps using n-dimensional
arrays would pay off with better performance and cleaner, simpler code?
>> *) Entering matrices would probably be implemented as it is now -
>> with matrix function.
>
> No opinion on this point.
To explain what I was trying to say :)
It would be nice to have a special syntax for entering matrices, like
we have for lists with [a,b,c,d] and sets {1,2,3,4}. But since we
have used up different parenthesis, the only way to enter a matrix is
to write
matrix( [a,b], [c,d] );
I think this is OK.
>
>> *) The only place a word 'array' should be mentioned in Maxima
>> documentation is where implementation of matrices is explained,
>> ideally (if current array facilities are some day removed). I think
>> matrix (in Maxima language) is a suitable word for an array(its
>> meaning in computer science).
>
> I'm not so sure about this. A matrix is an algebraic object, right?
> while an array is just a storage container. I think confounding
> fundamentally different kinds of objects for the sake of convenience
> is a poor choice. (Incidentally I'm also opposed to treating lists
> as synonymous with vectors --- they have different algebraic
> properties.)
Some mathematical abstractions don't translate that well into a
program. The best solution is to have some kind of balance between
'mathematical correctnes' and usability. For instance let's take a
look at how sets are implemented in Maxima.
The usual sets of interest in mathematics are infinite, which
translates very badly into a computer program. If we cover only
finite sets, then we are essencially working with lists (from a
computational point of view) that don't have elements repeated. For
instance we have decided to allow only finite sets because they
translate well into a program, but have also decided to treat sets
differently from lists to emphasize algebraic properties. The
consequences are:
*) We have a set of functions which perform operations only on sets,
though a user would perhaps like to use them on lists also (without
conversion).
*) Functions that are implemented to work on lists have to be
expanded to work on sets also. So in effect Maxima sets have less
functionality available than Maxima lists.
For a user, who is learning high-school math, such a basic
implementation of sets is perhaps useful, though the same user would
be confused, why there are no sets of natural/real numbers he could
call 'length()' on and get a 'mathematical' answer...
For a user, who knows his math and is using Maxima to compute
something, the distinction of list and sets is useless, since he
looks at things from a computational point of view (at least thats my
experience). He wants to operate on data structures, not to learn
what a 'set' is. Therefore having functions that work only on sets
and not on lists also is only an unnecessary complication for him.
I see a similar situation with vectors - what does it bring to the
table for a user who knows his math? Vector is a mathematical
abstraction that doesn't translate well into a program. Since maxima
supports only finite-dimensional vectors written in the same basis -
I don't see what is gained by not seeing vectors as a list of
numbers. For computation I'd like to have vectors accepted as a
Maxima list or a Maxima matrix (preferably both are accepted) -
because I already know how to manipulate these two data forms. A
package (vect?) could be used when one wants to perform abstract
operation on vectors - though these best work on paper I'm afraid.
> A hash table is called an undeclared array in Maxima; I think that's
> an unfortunate choice of words. I'd rather just call them hash tables
> (and implement them with Lisp hash tables).
I don't know about hashtables in Maxima - but in general if a Maxima
structure translates well into a Lisp type, then the Lisp type should
be used I think. In effect this means shorter/more understandable
code and reaping the performance of optimized Lisp implementations of
these types.
Lisp complex numbers pop to mind #C(). Perhaps they are not general
enough for usage in Maxima?
>
> Another tangential point --- Maxima distinguishes arrays which are
> values from arrays which are properties. I'm inclined to preserve
> that distinction.
>
> Thanks for bringing up the topic. Perhaps in the near future
> (i.e. less than 1 year) we'll really get rolling on it.
>
> best
>
> Robert Dodier
I hope something will come out of this :) Lisp arrays are there to
use them - it would be foolish not to use them for matrices.
Regards,
Ziga Lenarcic