Re: [Maxima] Newbie question: linear regression



Hi Ross

We need declare(sum,linear) and expand, so that a and b are pulled out of the sums before algsys is called:
(C1) q2(x,y,n,a,b):=sum((y[i]-(b+a*x[i]))^2,i,0,n-1)$
(C2) declare(sum,linear)$
(C4) display2d:false$
(C5) q2(x,y,n,a,b);
(D5) 'SUM((y[i]-a*x[i]-b)^2,i,0,n-1)
(C6) expand(%);
(D6) b^2*n+'SUM(y[i]^2,i,0,n-1)-2*a*'SUM(x[i]*y[i],i,0,n-1)
           -2*b*'SUM(y[i],i,0,n-1)+a^2*'SUM(x[i]^2,i,0,n-1)
           +2*a*b*'SUM(x[i],i,0,n-1)
(C7) [diff(%,a),diff(%,b)];
(D7) [-2*'SUM(x[i]*y[i],i,0,n-1)+2*a*'SUM(x[i]^2,i,0,n-1)
                                 +2*b*'SUM(x[i],i,0,n-1),
       2*b*n-2*'SUM(y[i],i,0,n-1)+2*a*'SUM(x[i],i,0,n-1)]
(C8) algsys(d7,[a,b]);
(D8) [[a = (('SUM(x[i]*y[i],i,0,n-1))*n
          -('SUM(x[i],i,0,n-1))*'SUM(y[i],i,0,n-1))
          /(('SUM(x[i]^2,i,0,n-1))*n-('SUM(x[i],i,0,n-1))^2),
        b = -(('SUM(x[i],i,0,n-1))*'SUM(x[i]*y[i],i,0,n-1)
          -('SUM(x[i]^2,i,0,n-1))*'SUM(y[i],i,0,n-1))
          /(('SUM(x[i]^2,i,0,n-1))*n-('SUM(x[i],i,0,n-1))^2)]]

This works.
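
As a quick sanity check, the closed-form expressions in (D8) can be evaluated on a small data set. The following is only a sketch: the lists xs and ys and the names sx, sy, sxy, sxx, a_hat and b_hat are made up for illustration and are not part of the session above. The data lie exactly on y = 2*x + 1, so the fit should return a = 2 and b = 1.

```maxima
/* Hypothetical sample data, chosen to lie exactly on y = 2*x + 1 */
xs: [0, 1, 2, 3]$
ys: [1, 3, 5, 7]$
n: length(xs)$

/* The four sums that appear in (D8); list products are elementwise */
sx:  apply("+", xs)$
sy:  apply("+", ys)$
sxy: apply("+", xs*ys)$
sxx: apply("+", xs*xs)$

/* Slope and intercept, transcribed from (D8) */
a_hat: (sxy*n - sx*sy)/(sxx*n - sx^2);        /* 2 for this data */
b_hat: -(sx*sxy - sxx*sy)/(sxx*n - sx^2);     /* 1 for this data */
```

The denominator sxx*n - sx^2 is zero when all x values are equal, so the formulas only apply when the data contain at least two distinct x values.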
furuya gosei

> Hi. I've just joined this list. I've just started using maxima, and to
> be honest am finding it heavy going. My immediate need for maxima is
> to create MLE expressions for the parameters of a number of probability
> distributions. I'm not a mathematician (unfortunately I did a cellular
> and molecular biology/computer science double major for my BSc, and hence
> did not study as much maths as typical computer scientists).
> 
> As practice for solving MLE problems, I'm trying to derive the 
> expressions for a and b (as in y=ax+b) for linear regression, minimising
> the squared residual errors.
> 
> First, I define an equation describing the squared difference for the 
> residuals of a line (described by a and b for ax+b) and a set of data points
> (in arrays x[] and y[], with n data points). This looks like:
> 
> q2( x, y, n, a, b ) := 
>        SUM( (y[i] - (b+a*x[i])) * (y[i]-(b+a*x[i])), i, 0, n-1 );
> 
> I can differentiate this fine and get the results that I expect to find
> as per web pages on the method:
> 
> (C3) diff( q2( x, y, n, a, b ), a );
> 			     n - 1
> 			     ====
> 			     \
> (D3) 			 - 2  >	   x  (y  - a x  - b)
> 			     /	    i   i      i
> 			     ====
> 			     i = 0
> 
> and:
> 
> (C4) diff( q2( x, y, n, a, b ), b );
> 			       n - 1
> 			       ====
> 			       \
> (D4) 			   - 2  >    (y  - a x  - b)
> 			       /       i      i
> 			       ====
> 			       i = 0
> 
> But, when I try and solve these as a set of linear equations, I get 
> nothing!
> 
> (C5) algsys( [ diff( q2( x, y, n, a, b ), b ) = 0, 
>                diff( q2( x, y, n, a, b ), a ) = 0 ], [a, b ] );
> (D5) 				      []
> 
> So, just when I thought I'd reached the finishing line, I'm stumped. Possibly
> incorrectly, I'm thinking that if I can figure out what I'm doing wrong here,
> then MLE estimations for parameters of many distributions will be a cinch
> (and I'm also excited about least-squares minimising fits of equations to
> data). Except of course that I get [] as a result, and am stumped as to
> why.
> 
> Is there something simple that I'm doing wrong? Either in terms of the 
> underlying math, or in my use of maxima?
> 
> Thanks in anticipation,
> 
> Ross-c
> 
> _______________________________________________
> Maxima mailing list
> Maxima@www.math.utexas.edu
> http://www.math.utexas.edu/mailman/listinfo/maxima
> 
> 



 