| |
| |
| (* Copyright (c) 2011 Stefan Krah. All rights reserved. *) |
| |
| |
| The Matrix Fourier Transform: |
| ============================= |
| |
| In libmpdec, the Matrix Fourier Transform [1] is called four-step transform |
| after a variant that appears in [2]. The algorithm requires that the input |
| array can be viewed as an R*C matrix. |
| |
| All operations are done modulo p. For readability, the proofs drop all |
| instances of (mod p). |
| |
| |
| Algorithm four-step (forward transform): |
| ---------------------------------------- |
| |
| a := input array |
| d := len(a) = R * C |
| p := prime |
| w := primitive root of unity of the prime field |
| r := w**((p-1)/d) |
| A := output array |
| |
| 1) Apply a length R FNT to each column. |
| |
| 2) Multiply each matrix element (addressed by j*C+m) by r**(j*m). |
| |
| 3) Apply a length C FNT to each row. |
| |
| 4) Transpose the matrix. |
| |
| |
| Proof (forward transform): |
| -------------------------- |
| |
| The algorithm can be derived starting from the regular definition of |
| the finite-field transform of length d: |
| |
| d-1 |
| ,---- |
| \ |
| A[k] = | a[l] * r**(k * l) |
| / |
| `---- |
| l = 0 |
| |
| |
| The sum can be rearranged into the sum of the sums of columns: |
| |
| C-1 R-1 |
| ,---- ,---- |
| \ \ |
| = | | a[i * C + j] * r**(k * (i * C + j)) |
| / / |
| `---- `---- |
| j = 0 i = 0 |
| |
| |
| Extracting a constant from the inner sum: |
| |
| C-1 R-1 |
| ,---- ,---- |
| \ \ |
| = | r**k*j * | a[i * C + j] * r**(k * i * C) |
| / / |
| `---- `---- |
| j = 0 i = 0 |
| |
| |
| Without any loss of generality, let k = n * R + m, |
| where n < C and m < R: |
| |
| C-1 R-1 |
| ,---- ,---- |
| \ \ |
| A[n*R+m] = | r**(R*n*j) * r**(m*j) * | a[i*C+j] * r**(R*C*n*i) * r**(C*m*i) |
| / / |
| `---- `---- |
| j = 0 i = 0 |
| |
| |
| Since r = w ** ((p-1) / (R*C)): |
| |
| a) r**(R*C*n*i) = w**((p-1)*n*i) = 1 |
| |
| b) r**(C*m*i) = w**((p-1) / R) ** (m*i) = r_R ** (m*i) |
| |
| c) r**(R*n*j) = w**((p-1) / C) ** (n*j) = r_C ** (n*j) |
| |
| r_R := root of the subfield of length R. |
| r_C := root of the subfield of length C. |
| |
| |
| C-1 R-1 |
| ,---- ,---- |
| \ \ |
| A[n*R+m] = | r_C**(n*j) * [ r**(m*j) * | a[i*C+j] * r_R**(m*i) ] |
| / ^ / |
| `---- | `---- 1) transform the columns |
| j = 0 | i = 0 |
| ^ | |
| | `-- 2) multiply |
| | |
| `-- 3) transform the rows |
| |
| |
| Note that the entire RHS is a function of n and m and that the results |
| for each pair (n, m) are stored in Fortran order. |
| |
| Let the term in square brackets be f(m, j). Step 1) and 2) precalculate |
| the term for all (m, j). After that, the original matrix is now a lookup |
| table with the mth element in the jth column at location m * C + j. |
| |
| Let the complete RHS be g(m, n). Step 3) does an in-place transform of |
| length n on all rows. After that, the original matrix is now a lookup |
| table with the mth element in the nth column at location m * C + n. |
| |
| But each (m, n) pair should be written to location n * R + m. Therefore, |
| step 4) transposes the result of step 3). |
| |
| |
| |
| Algorithm four-step (inverse transform): |
| ---------------------------------------- |
| |
| A := input array |
| d := len(A) = R * C |
| p := prime |
| d' := d**(p-2) # inverse of d |
| w := primitive root of unity of the prime field |
| r := w**((p-1)/d) # root of the subfield |
| r' := w**((p-1) - (p-1)/d) # inverse of r |
| a := output array |
| |
| 0) View the matrix as a C*R matrix. |
| |
| 1) Transpose the matrix, producing an R*C matrix. |
| |
| 2) Apply a length C FNT to each row. |
| |
| 3) Multiply each matrix element (addressed by i*C+n) by r**(i*n). |
| |
| 4) Apply a length R FNT to each column. |
| |
| |
| Proof (inverse transform): |
| -------------------------- |
| |
| The algorithm can be derived starting from the regular definition of |
| the finite-field inverse transform of length d: |
| |
| d-1 |
| ,---- |
| \ |
| a[k] = d' * | A[l] * r' ** (k * l) |
| / |
| `---- |
| l = 0 |
| |
| |
| The sum can be rearranged into the sum of the sums of columns. Note |
| that at this stage we still have a C*R matrix, so C denotes the number |
| of rows: |
| |
| R-1 C-1 |
| ,---- ,---- |
| \ \ |
| = d' * | | a[j * R + i] * r' ** (k * (j * R + i)) |
| / / |
| `---- `---- |
| i = 0 j = 0 |
| |
| |
| Extracting a constant from the inner sum: |
| |
| R-1 C-1 |
| ,---- ,---- |
| \ \ |
| = d' * | r' ** (k*i) * | a[j * R + i] * r' ** (k * j * R) |
| / / |
| `---- `---- |
| i = 0 j = 0 |
| |
| |
| Without any loss of generality, let k = m * C + n, |
| where m < R and n < C: |
| |
| R-1 C-1 |
| ,---- ,---- |
| \ \ |
| A[m*C+n] = d' * | r' ** (C*m*i) * r' ** (n*i) * | a[j*R+i] * r' ** (R*C*m*j) * r' ** (R*n*j) |
| / / |
| `---- `---- |
| i = 0 j = 0 |
| |
| |
| Since r' = w**((p-1) - (p-1)/d) and d = R*C: |
| |
| a) r' ** (R*C*m*j) = w**((p-1)*R*C*m*j - (p-1)*m*j) = 1 |
| |
| b) r' ** (C*m*i) = w**((p-1)*C - (p-1)/R) ** (m*i) = r_R' ** (m*i) |
| |
| c) r' ** (R*n*j) = r_C' ** (n*j) |
| |
| d) d' = d**(p-2) = (R*C) ** (p-2) = R**(p-2) * C**(p-2) = R' * C' |
| |
| r_R' := inverse of the root of the subfield of length R. |
| r_C' := inverse of the root of the subfield of length C. |
| R' := inverse of R |
| C' := inverse of C |
| |
| |
| R-1 C-1 |
| ,---- ,---- 2) transform the rows of a^T |
| \ \ |
| A[m*C+n] = R' * | r_R' ** (m*i) * [ r' ** (n*i) * C' * | a[j*R+i] * r_C' ** (n*j) ] |
| / ^ / ^ |
| `---- | `---- | |
| i = 0 | j = 0 | |
| ^ | `-- 1) Transpose input matrix |
| | `-- 3) multiply to address elements by |
| | i * C + j |
| `-- 3) transform the columns |
| |
| |
| |
| Note that the entire RHS is a function of m and n and that the results |
| for each pair (m, n) are stored in C order. |
| |
| Let the term in square brackets be f(n, i). Without step 1), the sum |
| would perform a length C transform on the columns of the input matrix. |
| This is a) inefficient and b) the results are needed in C order, so |
| step 1) exchanges rows and columns. |
| |
| Step 2) and 3) precalculate f(n, i) for all (n, i). After that, the |
| original matrix is now a lookup table with the ith element in the nth |
| column at location i * C + n. |
| |
| Let the complete RHS be g(m, n). Step 4) does an in-place transform of |
| length m on all columns. After that, the original matrix is now a lookup |
| table with the mth element in the nth column at location m * C + n, |
| which means that all A[k] = A[m * C + n] are in the correct order. |
| |
| |
| -- |
| |
| [1] Joerg Arndt: "Matters Computational" |
| http://www.jjj.de/fxt/ |
| [2] David H. Bailey: FFTs in External or Hierarchical Memory |
| http://crd.lbl.gov/~dhbailey/dhbpapers/ |
| |
| |
| |