Name: FFTR2C.ASM Type: Assembler Macro Version: 1.2 Last Change: 4-Feb-87 Description: Radix 2, In-place, Decimation-in-time Complex FFT Macro This macro performs a complete Fast Fourier Transform (FFT) on complex data. The basic algorithm is the Decimation-in-time (DIT), Radix 2 FFT algorithm using 24 bit fixed-point arithmetic. The algorithm uses a sine-cosine lookup table for the FFT coefficients (twiddle factors). FFTR2C can be called to perform any FFT from 16-32768 points. Simply call it with the arguments of number of FFT points, location of the data array and location of the sine-cosine table. All register initialization is performed by this macro. However, the macro assumes that registers which should not be altered by the FFT have already been saved by the main program. This allows the user to fit the FFT macro into his application and thus control the context switching overhead. No data scaling is performed and no overflow detection is done. Modifications to this routine could allow it to be used with the scaling modes and thus allow dynamic scaling for each FFT pass. All data and coefficients are complex, with the real part in X Data memory and the imaginary part in Y Data memory. For an N point FFT, the data buffer requires N X Data and N Y Data memory locations. The algorithm is performed "in-place", meaning that only one data buffer is required for both input and output data. The input data is assumed to be in normal (time-sequential) order and the output is in bit-reversed order. By using the reverse-carry address modifier and a separate output data buffer, the output data may be easily unscrambled. Other methods also exist to unscramble the output data without a separate output data buffer. The FFTR2C macro uses "twiddle factors" (-cosine and -sine tables) stored in data memory. For maximum speed, the FFT macro performs a lookup table operation to get new sine and cosine values for each group of butterflies. A SINCOS macro is available to generate these tables. For an N point FFT, N/2 X Data and N/2 Y Data locations are required. Sine and cosine values could be calculated in real-time to save data memory at the expense of execution time. The FFTR2C macro is faster than the FFTR2A and FFTR2B library macros. The speed increase is obtained by performing the first two passes in one combined operation without storing the results of the first pass. The first two passes do not require any multiplies since the coefficients are -1, 0 or +1. The last pass is also split out from the triple nested DO loop and given a separate DO loop (identical to the FFTR2B macro). The reason this is faster is that the inner loop always starts with a loop count of 1 on the last pass. Note that the separate last pass DO loop uses different addressing modes to increment through the butterflies, thus avoiding outer loops. Additional details are included in the source file; however, more algorithm description would be required for complete understanding by typical users. The FFTR2C macro can directly replace the FFTR2A macro using the calling procedure shown in the FFTR2AT test program.