Abstract: |
In this talk, we propose an implementation of parallel three-dimensional real fast Fourier transforms (FFTs) with two-dimensional decomposition on manycore clusters.
The proposed parallel three-dimensional FFT algorithm exploits the conjugate symmetry of the discrete Fourier transform (DFT) of real-valued data and is built on the multicolumn FFT algorithm.
We show that a two-dimensional decomposition improves performance by reducing communication time when the number of MPI processes is large.
We also present a computation-communication overlap method that introduces a communication thread with OpenMP.
Performance results of three-dimensional real FFTs on manycore clusters are reported. |
|