memoizing in mathematica vs macsyma..



The secret way that Mathematica keeps track of whether to re-simplify
an expression  (that is, running patterns over it)  is roughly as
follows:

Every top level expression say E, has a collection of pages it depends on,
p1,p2,... pn
and a timestamp T for E.

If you want to compute with E, then look at the timestamps of p1, ... pn.
If they are all older than T, then you can go with E as is.

What are these pages?  Well if E depends x, y, z, flag1, flag2, ... then
these reside on some set of pages.  when you change x, on page p1 you 
update the
timestamp on p1.   Sometimes you end up re-simplifying
E even if nothing it REALLY depends on changes; just something on the same
page as something that matters has changed.

This heuristic doesn't always work.  Sometimes things change but Mathematica
doesn;t notice.

Don't believe me?  Look at this from the Mathematica book about the
Update[] command...

" Update manipulates internal optimization features of Mathematica. It 
should not need to be called except under special circumstances that 
rarely occur in practice.
\[FilledSmallSquare] One special circumstance is that changes in the 
value of one symbol can affect the value of another symbol by changing 
the outcome of Condition tests. In such cases, you may need to use 
Update on the symbol you think may be affected.
\[FilledSmallSquare] Using Update will never give you incorrect results, 
although it will slow down the operation of the system.
\[FilledSmallSquare] See The Mathematica Book: Section 2.5.12."

Anyway, it is about memoizing the state of an expression: simplified or not.

Macsyma stores the state as the SIMP marker on the CAR of an expression.
   But sometimes the SIMP flag
is a lie: you have to resimplify when flags are changed or values change.

I agree that a conservative memoizing implementation could be simple and 
worth it;
  it could
go both ways, storing rat(e)  on the car of e and vice versa.   One question
is how this can be handled recursively.  Do we want only top-level command
expressions to be rat/memoized?  In which case if u:rat(x+y)^2 is memoized,
what happens to 3*u ?

RJF


Stavros Macrakis wrote:
> Memo'izing the disrep'ed form could be useful, especially since much
> code in Maxima blindly disrep's whenever it runs into an CRE (canonical
> rational expression).
> 
> Let's look at what it would take to make it work.
> 
> As you say, any relevant globals should just be stored along with the
> memo'ized form.  But figuring out which globals are relevant is not so
> trivial....
> 
> In the disrep case, this includes all simplifier flags that might affect
> the result, e.g. $numer, $bfloat, $domain, $expop/expop (if $ratfac is
> true), $dotconstrules (and many other dot flags), $listarith,....  The
> Features database can use timestamps.  Note that ratdisrep does NOT
> re-simplify kernel objects, so things like $trigexpand do not need to be
> saved -- as for $trigreduce, it is a function, not a global variable.
> Also, I have not been able to find any cases where the Assume database
> is relevant.  On the other hand, if there are any rewrite rules in the
> pattern matcher database that affect +, *, and ^, then all bets are
> off -- ANY AND ALL simplification flags may be relevant.  Perhaps that
> should just disable memo'izing entirely.
> 
> In the ratrep case, the relevant globals include (a subset of) $ratvars,
> $keepfloat, $ratepsilon, $ratfac, $algebraic, *ratweights, the Tellrat
> database (timestamp).
> 
> I am sure there are others I haven't thought of.
> 
> Another complication is mutable elements.  As far as I can tell, the
> only mutable elements in Maxima expressions are Mlists and Marrays.
> (Top level Mlists and Marrays are not a problem, since ratrep'ing them
> just maps ratrep over their components.)  I have gone through the
> exercise of thinking about handling mutable elements with timestamps
> etc., and it is messy.  Trust me.  I had written a long email explaining
> the complications, but deleted it because I doubt anyone wants to see
> the gory details.  A lot of complication for rare or contrived cases
> like f([rat(x)]) and m . rat(matrix([a,b])).  So I would suggest simply
> giving up whenever there is an embedded MList or MArray, and not
> memo'izing at all.
> 
> This all seems messier than I would have thought at first.
> 
> Another approach would be to use a global assq list as a cache, and null
> it out whenever any "dangerous" event happens.  A dangerous event is any
> MSET to relevant variables or MList elements and any change to the
> Tellsimp, Tellrat, Assume, or Feature databases.  This assumes, of
> course, that we can effectively determine the list of relevant
> variables.  This would get rid of the complications of timestamps, assq
> lists of relevant flags, etc.  In fact, it would only be a dozen lines
> of code total, and the danger of breaking something is small.  As for
> the benefit, it benefits all operations which repeatedly disrep the same
> objects within one command.  Probably not a huge win, but worth a dozen
> lines of code.
> 
> I still think that the full memo'izing scheme could improve performance
> in some cases, but is it worth the added complexity in the code?
> 
> Your thoughts?
> 
>       -s
>