2D Discrete Convolution



I suspect that you could encode your problem as a polynomial
multiplication. It might be worth examining the cost of doing this
computation using exact rationals rather than bfloats.
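
As a rough sketch of what such an encoding might look like (in Python
rather than Maxima, with Fraction standing in for exact rationals; the
function names pack, poly_mul and conv2d_via_poly are made up for this
illustration): each array is flattened into the coefficient list of a
univariate polynomial, with a row stride wide enough that columns never
carry into the next row, and the product's coefficients are then read
back as the full 2D convolution.

```python
from fractions import Fraction

def pack(m, width):
    """Flatten m so that m[i][j] is the coefficient of x^(i*width + j)."""
    coeffs = [Fraction(0)] * (len(m) * width)
    for i, row in enumerate(m):
        for j, v in enumerate(row):
            coeffs[i * width + j] = Fraction(v)
    return coeffs

def poly_mul(p, q):
    """Schoolbook product of two coefficient lists, in exact arithmetic."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def conv2d_via_poly(a, b):
    """Full 2D convolution of rectangular arrays a and b via one polynomial product."""
    # The stride equals the output width, so column indices j + l never
    # spill over into the next row of the packed product.
    width = len(a[0]) + len(b[0]) - 1
    rows = len(a) + len(b) - 1
    prod = poly_mul(pack(a, width), pack(b, width))
    return [prod[r * width:(r + 1) * width] for r in range(rows)]

def conv2d_direct(a, b):
    """Reference implementation straight from the definition, for comparison."""
    rows = len(a) + len(b) - 1
    width = len(a[0]) + len(b[0]) - 1
    out = [[Fraction(0)] * width for _ in range(rows)]
    for i, row_a in enumerate(a):
        for j, x in enumerate(row_a):
            for k, row_b in enumerate(b):
                for l, y in enumerate(row_b):
                    out[i + k][j + l] += Fraction(x) * Fraction(y)
    return out
```

The schoolbook product here costs as much as the direct sum; the point
of the encoding is only that any faster exact polynomial multiplication
could be substituted for poly_mul without changing the rest.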

It is also possible that you could do this by a bfloat version of an FFT.
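
A minimal sketch of the FFT route, under some loud assumptions: it is
written in Python, ordinary double-precision complex numbers stand in
for bfloats, and the names fft and conv2d_fft are hypothetical. The
arrays are flattened with the same row-stride trick as a polynomial
encoding, zero-padded to a power of two, transformed, multiplied
pointwise, and transformed back.

```python
import cmath

def fft(x, invert=False):
    """Recursive radix-2 FFT; len(x) must be a power of two. No 1/n scaling."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], invert)
    odd = fft(x[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def conv2d_fft(a, b):
    """Approximate full 2D convolution of rectangular arrays a and b."""
    width = len(a[0]) + len(b[0]) - 1   # row stride, equal to the output width
    rows = len(a) + len(b) - 1

    def pack(m):
        flat = [0.0] * (len(m) * width)
        for i, row in enumerate(m):
            for j, v in enumerate(row):
                flat[i * width + j] = float(v)
        return flat

    p, q = pack(a), pack(b)
    n = 1
    while n < len(p) + len(q) - 1:      # pad so the cyclic product does not wrap
        n *= 2
    fp = fft(p + [0.0] * (n - len(p)))
    fq = fft(q + [0.0] * (n - len(q)))
    prod = fft([u * v for u, v in zip(fp, fq)], invert=True)
    flat = [z.real / n for z in prod]   # apply the 1/n scaling of the inverse
    return [flat[r * width:(r + 1) * width] for r in range(rows)]
```

The results carry floating-point roundoff, so they are only approximate;
a bfloat version would presumably trade speed for a smaller error by
raising the working precision.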

RJF