?iga Lenar?i? wrote:
> On Aug 23, 2009, at 10:47 PM, Robert Dodier wrote:
>
>
>> On 8/23/09, ?iga Lenar?i? <ziga.lenarcic at gmail.com> wrote:
>>
>>
>>> myfft uses bordeaux-fft. I wrote a thin layer to convert maxima list
>>> to a complex lisp array and back. The difference grows for bigger
>>> sample sizer (for which the reason must be inefficient list
>>> conversion of O(N^2) in current fft).
>>>
>>> Any interest?
>>>
>> Well, I suspect you're right about the list/array conversion
>> being the bottleneck. If so, then just replacing that with
>> faster code is going to get us most of the speed improvement
>> with a tiny fraction of the effort. I can merge in a patch
>> for the list/array conversion if you write it.
>>
>> Robert Dodier
>>
>
> Actually if you look at the source code - it's a real mess with a lot
> of badly coded parts. The author also makes some very bad assumptions
>
(That author would be me, who used the code from Oppenheim and Schafer.)
> about fft code which he has translated from a fortran version printed
> in a book (not a good thing in my book) - like using two arrays, one
> for real parts and one for imaginary is fast. It's never a good idea
> to access two big arrays in a tight loop instead of just one -
> processor has trouble caching two different memory locations which
> effectively can mean a lot of unused processor cycles when waiting to
> retrieve data. This is a very important factor when writing numerical
> code...
>
Processors should be able to cache different memory locations without
problems. If not, then how can a multiuser/multiprocess system ever get
anything but bad performance since everything is always in different
memory locations?
In addition, the algorithm accesses the elements in bit-reversed order,
which is not nice to caches. Are you going to write an FFT with
in-order access for each stage? I think they exist, but I don't think
they're commonly used.
Also, you assume that arithmetic with complex numbers is fast. This is
not true for most Lisps. Most lisps will cons up complex numbers and
that will kill performance. (Cmucl and sbcl have unboxed complex
floats. I don't know of any others.) But many lisps can operate on
floats without consing. For these lisps, the algorithm is good.
Ray