Subject: LBFGS for use in large maximum likelihood problem
From: Robert Dodier
Date: Thu, 21 Aug 2008 09:13:25 -0600
On 8/19/08, dlakelan <dlakelan at street-artists.org> wrote:
> It seems to me that it takes longer to do each step of the LBFGS when I use
> the "sum" noun than when I construct the huge expression, although the sum
> noun is of course much faster to construct.
I'm not seeing that behavior. I've attached a script which solves an easy
maximum likelihood problem (computing mean and standard deviation)
via LBFGS. The log likelihood function is either a symbolic sum or a literal
sum (i.e. an expression which has operator = "+"). For the symbolic sum,
the execution time grows a little bit faster than linear. For the literal sum,
the execution time grows somewhat faster than linear, and except for
small numbers of data, the symbolic sum is faster than the literal sum.
If you're seeing something else, I don't know what is going on.
You'll have to post additional details.
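For reference, here is the objective the script hands to lbfgs, written out by hand from the definition of p in the script (a sketch, assuming sigma > 0). The hatted symbols are my notation for the minimizers, not names used in the script.

```latex
\begin{align*}
p(x \mid \mu, \sigma) &= \frac{1}{\sigma\sqrt{2\pi}}
  \exp\!\left(-\tfrac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}\right) \\
\mathrm{nll}(\mu, \sigma) &= -\log \prod_{i=1}^{n} p(x_i \mid \mu, \sigma)
  = n \log \sigma + \frac{n}{2}\log(2\pi)
  + \frac{1}{2\sigma^{2}} \sum_{i=1}^{n} (x_i - \mu)^{2} \\
\hat\mu &= \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
\hat\sigma^{2} = \frac{1}{n}\sum_{i=1}^{n} (x_i - \hat\mu)^{2}
\end{align*}
```

With the data x[i] = i used in the script, this gives mu = (n + 1)/2 and sigma^2 = (n^2 - 1)/12, which is what lbfgs should converge to for each n.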
Btw I'm working with a cvs version close to 5.16.2. I tried the script
with 2 or 3 Lisp varieties with pretty much the same results.
FWIW
Robert Dodier
PS. Put the following in a file foo.mac and run it by batch(foo).
load(foo) won't work due to :lisp in it.
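/* Gaussian density with mean mu and standard deviation sigma. */
p (x, mu, sigma) := exp (-(1/2)*((x - mu)/sigma)^2) / (sigma * sqrt (2*%pi));
/* Likelihood as a product noun; 'x[i] is quoted so it stays symbolic for now. */
l : 'product (p ('x[i], mu, sigma), i, 1, n);
/* Negative log likelihood; logexpand=super lets log turn the product into a sum. */
nll : - log (l), logexpand=super;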
x : makelist (i, i, 1, 1280)$
nll_literal_10 : nll, sum, n=10$
nll_literal_20 : nll, sum, n=20$
nll_literal_40 : nll, sum, n=40$
nll_literal_80 : nll, sum, n=80$
nll_literal_160 : nll, sum, n=160$
nll_literal_320 : nll, sum, n=320$
nll_literal_640 : nll, sum, n=640$
nll_literal_1280 : nll, sum, n=1280$
load (lbfgs);
showtime : true;
:lisp (defmspec $gettime (x) `((mlist) ,@ (mapcar #'car (mapcar #'(lambda (a) (get a 'time)) (cdr x)))))
kill (labels);
lbfgs (nll, '[mu, sigma], [1, 1], 1e-5, [-1, 0]), n=10;
lbfgs (nll, '[mu, sigma], [1, 1], 1e-5, [-1, 0]), n=20;
lbfgs (nll, '[mu, sigma], [1, 1], 1e-5, [-1, 0]), n=40;
lbfgs (nll, '[mu, sigma], [1, 1], 1e-5, [-1, 0]), n=80;
lbfgs (nll, '[mu, sigma], [1, 1], 1e-5, [-1, 0]), n=160;
lbfgs (nll, '[mu, sigma], [1, 1], 1e-5, [-1, 0]), n=320;
lbfgs (nll, '[mu, sigma], [1, 1], 1e-5, [-1, 0]), n=640;
lbfgs (nll, '[mu, sigma], [1, 1], 1e-5, [-1, 0]), n=1280;
lbfgs (nll_literal_10, '[mu, sigma], [1, 1], 1e-5, [-1, 0]);
lbfgs (nll_literal_20, '[mu, sigma], [1, 1], 1e-5, [-1, 0]);
lbfgs (nll_literal_40, '[mu, sigma], [1, 1], 1e-5, [-1, 0]);
lbfgs (nll_literal_80, '[mu, sigma], [1, 1], 1e-5, [-1, 0]);
lbfgs (nll_literal_160, '[mu, sigma], [1, 1], 1e-5, [-1, 0]);
lbfgs (nll_literal_320, '[mu, sigma], [1, 1], 1e-5, [-1, 0]);
lbfgs (nll_literal_640, '[mu, sigma], [1, 1], 1e-5, [-1, 0]);
lbfgs (nll_literal_1280, '[mu, sigma], [1, 1], 1e-5, [-1, 0]);
times_symbolic : gettime (%o1, %o2, %o3, %o4, %o5, %o6, %o7, %o8);
times_literal : gettime (%o9, %o10, %o11, %o12, %o13, %o14, %o15, %o16);
plot2d
([[discrete, [10, 20, 40, 80, 160, 320, 640, 1280], times_symbolic],
[discrete, [10, 20, 40, 80, 160, 320, 640, 1280], times_literal]]);
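A possible sanity check, not part of the script above: for this Gaussian model the maximum likelihood estimates have a closed form (the sample mean, and the root mean squared deviation with divisor n), so the lbfgs results can be compared against it. This is a minimal sketch, assuming x is still bound to the list defined earlier. The names n_check, xs, mu_hat and sigma_hat are made up for illustration.

```maxima
/* Closed-form estimates for the first n_check data, to compare with lbfgs. */
n_check : 10$
xs : makelist (x[i], i, 1, n_check)$
/* Sample mean. */
mu_hat : apply ("+", xs) / n_check;
/* Root mean squared deviation, divisor n_check (not n_check - 1). */
sigma_hat : sqrt (apply ("+", (xs - mu_hat)^2) / n_check);
float ([mu_hat, sigma_hat]);
```

For n_check = 10 the values should come out near 5.5 and 2.87, which the n=10 lbfgs run ought to match to within the requested tolerance.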