Subject: Maxima as a compiler code generator component ?
From: Richard Fateman
Date: Tue, 06 Dec 2005 09:27:24 -0800
If you are looking for algebraic transformations, then the
time to look at the code is at a higher level (e.g.
abstract syntax tree for C code) not as 3-address code
when it is probably too late. An even higher level is
possible, e.g. when the computation is first being designed.
As far as matching patterns to assembler or pseudo-assembler,
there is a substantial literature on this, relating table-driven
architecture descriptions to code generation. This could presumably
be done by maxima, or by lisp or by other languages.
Order of operations in C code is not, I think, preserved by
optimizers. e.g. (a+b)+c or (a+c)+b is ok, even though the
numbers can come out differently.
RJF
Kim Lux wrote:
>I am a maxima newbie. I used it about 3 years ago to solve a multi
>variable algebraic problem and haven't looked at it since. Haven't
>needed to either, for that matter.
>
>This is a somewhat long winded email. At first you might wonder what it
>has to do with Maxima, but it gets mathematical in a bit.
>
>I am working on developing a system for generating and optimizing
>assembly language for the back end of the open source C compiler, sdcc.
>( http://sdcc.sourceforge.net/ ) At present the compiler uses a hand
>coded back end which we would like to replace with something more
>universal, easier to port, better optimization, etc.
>
>Presently sdcc is comprised of 2 components: The front end and various
>target specific back ends. The front end of the compiler takes C source
>code as input and converts it into what is known as iCode. iCode is a
>normalized representation of the C code in the form of 'iResult = iLeft
>op iRight' statements, where i variables are temporary variables
>assigned by the front end and op is various things like addition,
>subtraction, multiplication, etc. It is the function of the compiler
>back end to transform the iCode into suitably representative assembly
>language for the target processor. In the process of transforming the
>iCode into assembly language, various functions such as optimization,
>register allocation and pattern matching/replacement are performed.
>
>One method of generating assembler code from iCode is to express it in
>what has been termed Register Transfer Language. It is essentially
>expressing the iCode in an low level algebraic form. I like to call it
>nano code, but that isn't an industry term.
>
>For example, lets say we have an iCode of iResult = itemp1 + a.
>
>I could rewrite this in an algebraic format as follows:
>
>t1 = fetch(a)
>t2 = fetch(itemp1)
>t3 = t1+t2
>store(t3,iResult)
>
>We could write function Fetch as follows:
>
>t1 = a;
>t2 = SP + t1;
>t3 = M[t2]; Where M is the memory access function. t1,t2,t3 are local
>to fetch and should not be confused with t1,2,3 above.
>
>How does this turn into assembly language ? By combining expressions
>and matching them to their assembly language counterparts.
>
>For example, we can define the following rtl:
>
>t1 = t2 + t3 as being equivalent to the assembly language statement ADD
>r2,r3,r1
>and
>t3 = M(t2) as being equivalent to the assembly language statement LDW
>r3, r2
>or better yet:
>t3 = M(SP,t1) as being equivalent to the assembly language statement LDW
>SP,r1
>
>Likewise optimization of the rtl is just an algebraic manipulation as
>well. Register allocation isn't a trivial algebraic operation, but we
>can do that outside of the algebraic kernel.
>
>So...
>
>We need an algebraic engine (kernel) that given the algebraic
>representation of the iCode for a program can simplify and match it to
>various algebraic expressions representative of the assembly language
>statements. We will need to control the simplification, ie give it
>rules by which to do the simplification.
>
>A typical program's iCode will expand to 10,000 or more rtl statements.
>A typical processor's instruction set will have about 150 algebraic
>statements to represent it. RTL as boolean algebraic statements as
>well.
>
>We envision the process working something like this:
>
>Feed Maxima an rtl statement.
>Let Maxima reduce or attempt to match it.
>If no match is found, feed in another rtl statement.
>Until we determine that no match is available at which point we have
>faulty rtl or we find a match, at which point we emit the equivalent
>assembly language.
>
>At any one time, Maxima would be working with 30-50 rtl statements,
>meaning that if it didn't find matches in them, it would be an error.
>But it would run through 10,000 or more statements in the course of
>generating a program. It would be attempting to match its resulting
>code to the 150 algebraic statements representative of the assembly
>language every time we fed it a new statement.
>
>The preservation of the order of the statements is essential. Or is
>it ? If we represent the iCode as a true algebraic formula, then
>algebraic manipulation should result in the same result when converted
>to assembly language ? There is a big debate about this in certain
>compiler circles right now. Whether instruction reordering is a
>legitimate form of optimization.
>
>Is Maxima or parts of it suitable to perform this task ?
>
>I am open to hearing any thoughts or opinions on this topic.
>
>FWIW, these exists a proprietary system that works on this basis named
>VPO.
>
>An example of how VPO can be used is here:
>http://compiler.snu.ac.kr/papers/cases01-6.pdf
>
>Thanks.
>
>
>