A considerably speed-up in Legendre transform runtime is possible by using a slightly modified version of FFTPACK, the freely available collection of Fortran routines developed at NCAR which compute 1-dimensional Fourier (and related) transforms. The original, unmodified version of the library is available here.
The modified version of FFTPACK, the version that SpharmonicKit uses, is available in the following formats:
We make the same statement concerning our modified FFTPACK as NCAR does about the original: the modified FFTPACK is provided free of charge, but there is no guarantee.Directions to make and install the modified FFTPACK library are included in the distribution of FFTPACK we provide.
The specific FFTPACK routines used by SpharmonicKit are:
To allow easy use of the FFTPACK-based routines (which are written in Fortran), small C-"interface" functions were written that call the appropriate FFTPACK-library routines. This should make substituting your own optimized FFT and DCT routines relatively straightfoward - just modify the definitions of the "interface" functions. If you use your own FFT and DCT routines, make sure that all scaling of coefficients and samples is done properly.
IMPORTANT: In order to use the FFTPACK-based (or your own) routines, define the symbol FFTPACK in the Makefile before compiling, and make sure the symbol FFTFLAGS in the Makefile is defined and is pointing to where the FFTPACK (or your own) library lives.
Exactly how was the scaling modified in the original, unmodified fftpack routines cosqb, cosqf ? Let X be an input array of length N. Let Y be the output of kFCT (or kFCTX - they're identical), Z be the output of the original, unmodified fftpack routine cosqb. Let SCALE = 1 / (2 * N) .Then (using C and not Fortran indexing)
Let X be an input array of length N. Let Y be the output of ExpIFCT. Now let me multiply the first coefficient of X by 2 ( so X[0] *= 2 ) and plug this modified X into the original, unmodified fftpack-routine cosqf. Let Z be the resulting output of cosqf. Let SCALE = 1/2. Then (using C and not Fortran indexing)
All this scaling is now done within the source files cosqf1.f and cosqb1.f . It's a little hokey, but it works.