if-then-else inferencing



I've been building a prototype implementation of if-then-else inferencing that might eventually find its way into Maxima.

The "ordered list of disjoint atomic numeric intervals" Boolean lattice has been built and functions correctly.
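To make this concrete, here is a minimal Python sketch (purely illustrative; not the prototype's actual code) of such a lattice: each element is a sorted list of disjoint closed intervals, with join (union) and meet (intersection) as the lattice operations.

```python
# Illustrative sketch of the "ordered list of disjoint intervals" lattice.
# Each element is a sorted list of disjoint (lo, hi) closed intervals;
# minf/inf are represented by Python's -inf/inf.
from math import inf

def join(a, b):
    """Lattice join: union of two disjoint-interval lists."""
    out = []
    for lo, hi in sorted(a + b):
        if out and lo <= out[-1][1]:          # overlaps or touches last
            out[-1] = (out[-1][0], max(out[-1][1], hi))
        else:
            out.append((lo, hi))
    return out

def meet(a, b):
    """Lattice meet: intersection of two disjoint-interval lists."""
    out = []
    for alo, ahi in a:
        for blo, bhi in b:
            lo, hi = max(alo, blo), min(ahi, bhi)
            if lo <= hi:
                out.append((lo, hi))
    return sorted(out)
```

(Open/half-open endpoints, as in (minf,0], are glossed over here; a real implementation would track endpoint openness as well.)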

The harder part is building an inference system on top of this lattice to make appropriate inferences.

Consider a nested expression f(x,g(y,x)), where x,y range over the reals, and f,g: reals^2 -> reals.

We would like to compute 3 different elements from this lattice: X, Y, and R.

We have the schema R=f(X,g(Y,X)), in which f,g are extended in the obvious way to operate on sets.

This means "given the sets X, Y of numeric values, compute the range set R,
where R = {r | x in X, y in Y, r = f(x,g(y,x))}".
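As an illustration of extending a function to sets, here is a hedged Python sketch of times() on single closed intervals, using standard interval arithmetic (min/max over endpoint products). The names mul_ep and times are hypothetical, not the prototype's.

```python
# Sketch: extending times to intervals by taking min/max of the
# products of the endpoints (standard interval arithmetic).
from math import inf

def mul_ep(a, b):
    # Guard the 0 * inf case, which is nan in IEEE arithmetic but
    # should be 0 for interval endpoints.
    return 0 if 0 in (a, b) else a * b

def times(x, y):
    """Naive interval product; any dependency between x and y is ignored."""
    corners = [mul_ep(a, b) for a in x for b in y]
    return (min(corners), max(corners))
```

Note that times((minf,inf), (minf,inf)) yields (minf,inf), matching the result quoted later for times(X,X) without the if-then-else case split.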

But this schema needs to be interpreted _symmetrically_, i.e., any a priori information
about the "output" set R should be fed back to constrain the "input" sets X and Y.

Thus, given an initial X_0, Y_0, R_0, we want to compute the _largest_ elements X, Y, R,
such that R=f(X,g(Y,X)).

One way to do this is to _iterate_ from X_0, Y_0, R_0, using information from f and g
to further constrain X_i, Y_i, R_i.  This iteration is analogous to computing an
approximation of the values on a mesh for a partial differential equation from the
initial condition values given at its borders.

At least in this simple case of fixed known functions f,g, such an iterative process will
quickly converge to lattice elements X_n, Y_n, R_n, where X_(n+1)=X_n, Y_(n+1)=Y_n, and
R_(n+1)=R_n.
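The iteration can be sketched as follows in Python, on single intervals, for the simple schema R = X + Y (addition is chosen here only because it has an easy exact inverse; the names are hypothetical, not the prototype's). Each pass applies the forward inference and the two backward inferences, and stops when nothing changes.

```python
# Sketch of the fixed-point iteration for the constraint r = x + y
# over single closed intervals, with None standing for the empty set.
from math import inf

def meet(a, b):
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None   # None = bottom (empty set)

def add(a, b):  return (a[0] + b[0], a[1] + b[1])
def sub(a, b):  return (a[0] - b[1], a[1] - b[0])

def narrow(X, Y, R):
    """Iterate forward/backward inferences until a fixed point is reached.

    Assumes no inf - inf combinations arise in sub()."""
    while True:
        R2 = meet(R, add(X, Y))             # forward:  r = x + y
        if R2 is None: return None
        X2 = meet(X, sub(R2, Y))            # backward: x = r - y
        if X2 is None: return None
        Y2 = meet(Y, sub(R2, X2))           # backward: y = r - x
        if Y2 is None: return None
        if (X2, Y2, R2) == (X, Y, R):
            return X, Y, R                  # converged
        X, Y, R = X2, Y2, R2
```

For example, starting from X_0 = (minf,inf), Y_0 = [0,10], R_0 = [0,5], the iteration narrows X to [-10,5] and then stabilizes.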

Note that since there are two occurrences of the variable x in the expression f(x,g(y,x)),
we need to _intersect_ the X_(i+1) produced by the backwards inference from f with the
X_(i+1) produced by the backwards inference from g.
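This intersection can be sketched pointwise on environments: two environments, e.g. the constraints on x flowing back from f and the constraints on x flowing back from g, combine by intersecting the set bound to each variable (single intervals are used here for brevity; the names are hypothetical).

```python
# Sketch of combining two variable-binding environments by pointwise
# intersection; each environment maps a variable name to an interval.
from math import inf

def meet_interval(a, b):
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None   # None = empty set

def meet_env(e1, e2):
    """Pointwise meet of two environments over the same variables."""
    return {v: meet_interval(e1[v], e2[v]) for v in e1}
```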

Thus, the backwards inference from f(x,g(y,x)) produces an element of a new lattice
whose elements are the variable-binding _environments_ binding the variables x and y.

Putting all of this together, we can finally infer information from expressions like
(if predicate(x,y) then truearm(x,y) else falsearm(x,y)).

ENV(x,y), R = (if predicate(x,y) then truearm(x,y) else falsearm(x,y)), where

ENV(x,y)=variable x bound to set X, variable y bound to set Y.

Given an initial (too large) approximation X and an initial (too large) approximation Y,
and an initial (too large) approximation R, we iterate until we produce lattice elements
X,Y,R such that R = (if predicate(X,Y) then truearm(X,Y) else falsearm(X,Y)).

Note that if predicate(X,Y) consists of the single element {nil}, then for x in X, and
y in Y, predicate(x,y) will always be false, and R=falsearm(X,Y).  Similarly if
predicate(X,Y)={t}, then for x in X, y in Y, predicate(x,y) will always be true, and
R=truearm(X,Y).  We may also have the case that predicate(X,Y)={}, meaning that
predicate(x,y) will "blow up"/"crash" for all values x in X, y in Y.
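The case analysis on the predicate's value set can be sketched as follows (a hedged illustration; infer_if is a hypothetical name, and Lisp's t/nil are modeled as the strings 't' and 'nil'):

```python
# Sketch of how the predicate's value set drives the if-inference.
# pred_set is the set of truth values the predicate can take over X, Y:
# {'t'}, {'nil'}, {'t', 'nil'}, or the empty set (predicate always crashes).
def infer_if(pred_set, true_range, false_range, union):
    if pred_set == set():                   # predicate blows up on all inputs
        return None                         # bottom: no result values at all
    if pred_set == {'nil'}:                 # predicate always false
        return false_range
    if pred_set == {'t'}:                   # predicate always true
        return true_range
    return union(true_range, false_range)   # both arms reachable
```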

Using this type of inference, we get the following results:

For X=(minf,inf), times(X,X)=(minf,inf).

However, for X=(minf,inf), (if plusp(X) then times(X,X) else times(X,X)) = [0,inf);
i.e., this if-then-else expression always produces non-negative results.

We get the paradoxical result that simply checking the sign of x enables the system
to infer that x*x is always non-negative.

This is because the "true arm" is evaluated assuming that plusp(x) is true, so X=(0,inf)
and X*X=(0,inf), while the "false arm" is evaluated assuming that plusp(x) is false, so
X=(minf,0], and X*X=[0,inf).  Unioning the results from the true arm and the false arm
produces the conclusion that the entire if-expression produces results in [0,inf).
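The whole example can be worked through in a short Python sketch: split X on the sign test, evaluate x*x under each refined binding, and union the results (closed tuples are used throughout, glossing over the open endpoint at 0 in the true arm; the names are hypothetical).

```python
# Worked sketch of the plusp example: case-split X on the sign test,
# evaluate the arm under each refined binding, union the results.
from math import inf

def mul_ep(a, b):
    return 0 if 0 in (a, b) else a * b      # avoid 0 * inf = nan

def times(x, y):
    """Naive interval product, ignoring the x*x dependency."""
    corners = [mul_ep(a, b) for a in x for b in y]
    return (min(corners), max(corners))

def if_plusp_square(X):
    """Range of (if plusp(x) then x*x else x*x) for x in interval X."""
    lo, hi = X
    ranges = []
    if hi > 0:                              # true arm:  x in X meet (0, inf)
        ranges.append(times((max(lo, 0), hi), (max(lo, 0), hi)))
    if lo <= 0:                             # false arm: x in X meet (minf, 0]
        ranges.append(times((lo, min(hi, 0)), (lo, min(hi, 0))))
    return (min(r[0] for r in ranges), max(r[1] for r in ranges))
```

Without the case split, times(X,X) for X = (minf,inf) gives (minf,inf); with it, each arm yields a non-negative interval, and the union is [0,inf).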

Thus, the following bizarre definition works both for implementation and inference:

declare(x,real);

square(x):=(if (x>0) then x*x else x*x);

The Lisp compiler will/should notice that checking the sign of x won't change the result, so
it _optimizes_ the sign check out of the code to produce the equivalent of

square(x):=x*x;

Nevertheless, during the type inference phase, the if expression forces the compiler
to do case analysis on the two cases, positive x's and non-positive x's, and to conclude
that the result of square(x) is always non-negative.