derivatives of matrix functions



Hello,

To follow up on the maximum likelihood problem posed
by Ross Clement -- I have been working on expressing 
the linear regression problem in matrix form, so that 
summations are implicit in multiplications. This makes
the solution much more comprehensible, and much easier
to extend to any number of variables.

Let w be a parameter vector. Let X be a matrix with
one row per case and one column per parameter. Let y
be the observed outcomes. It's not necessary to 
actually specify the numbers of cases and parameters,
so I'll leave them out of this exercise entirely.

It is well know that the least squares solution, to 
minimize y - X.w, is \hat w = (X'X)^(-1) X.y. This is 
also the maximum likelihood solution if Gaussian noise 
is assumed. (The prime, ', represents the transpose.)
I am trying to reproduce this solution in Maxima.

Here's how far I've gotten. 

(C1) declare(X,nonscalar)$
(C2) declare(y,nonscalar)$
(C3) declare(w,nonscalar)$
(C4) norm2(x) := transpose(x).x$
(C5) gradef('norm2(x), 2 .transpose(x))$
(C6) gradef('transpose(x), 1)$
(C7) display2d:FALSE$
(C8) E:y-X.w$
(C9) EE:'norm2(E)/2$
(C10) diff(EE,w);
(D10) -X*'TRANSPOSE(y-X . w)

Oops, that's not right. (See bug report #853720.)
I'll just plug in the right answer & keep going.

(C11) dEE_dw:-'TRANSPOSE(y-X . w) . X;
(D11) -'TRANSPOSE(y-X . w) . X
(C12) expand(dEE_dw);
(D12) -'TRANSPOSE(y-X . w) . X

Hmm, the quote mark seems to prevent expand from
expanding. I'll just pretend it's not there.
(How would I work around the quote mark in general?)

(C13) dEE_dw:-TRANSPOSE(y-X . w) . X$
(C14) expand(dEE_dw);
(D14) 'TRANSPOSE(w) . 'TRANSPOSE(X) . X-'TRANSPOSE(y) . X

OK, that's more like it. Let's keep going.

(C15) solve(d14=0, w);
(D15) ['TRANSPOSE(w) . 'TRANSPOSE(X) . X = 'TRANSPOSE(y) . X]

Well, this is indeed equivalent to the standard solution,
but I'll try to get something of the form "w = ...".

(C16) dEE_dw:transpose(expand(dEE_dw));
(D16) 'TRANSPOSE(X) . X . w-'TRANSPOSE(X) . y

After taking the transpose of the whole thing, 
it has w instead of transpose(w).

(C17) ('TRANSPOSE(X) . X)^^(-1).dEE_dw;
(D17) ('TRANSPOSE(X) . X)^^(-1) . ('TRANSPOSE(X) . X . w-'TRANSPOSE(X)
. y)
(C18) expand(%);
(D18) w-('TRANSPOSE(X) . X)^^(-1) . 'TRANSPOSE(X) . y

I've multiplied through by the inverse of the stuff in
front of w. Almost there.

(C19) solve(d18=0, w);
(D19) [w = ('TRANSPOSE(X) . X)^^(-1) . 'TRANSPOSE(X) . y]

Et voila! It is a bit disheartening that I had to prod it along
at several points... Oh well. 

Aside from #853720, there appears to be one other bug
manifested here. d(X.w)/dw = A, indeed diff(X.w,w) returns X.
However, d(w'.X)/dw is X', yet diff(transpose(w).X,w))
returns X, not transpose(X).
(More exactly, diff returns d/dw (transpose(w)) . X, 
and d/dw (transpose(w)) is the identity.)

Any comments about how this problem could be solved
with less prodding will be appreciated. In particular
I'm curious about how to get exand() to treat 'transpose()
as if it were transpose().

Regards,
Robert Dodier

__________________________________
Do you Yahoo!?
Free Pop-Up Blocker - Get it now
http://companion.yahoo.com/