A-16.1 Background and Motivation

Given two polynomials of degree $n - 1$ ,

$A (X) = \sum_{i = 0}^{n - 1} a_{i} X^{i}$ , $B (X) = \sum_{i = 0}^{n - 1} b_{i} X^{i}$ ,

their product $C (X) = A (X) \cdot B (X)$ is computed as follows:

$C (X) = \sum_{i = 0}^{2 n - 2} c_{i} X^{i}$ , where $c_{i} = \sum_{k = 0}^{i} a_{k} b_{i - k}$ , with the convention that $a_{k} = b_{k} = 0$ for $k \geq n$

This operation of computing $\vec{c} = (c_{0}, c_{1}, \dots, c_{2 n - 2})$ is also called the convolution of $\vec{a}$ and $\vec{b}$ , denoted as $\vec{c} = \vec{a} \otimes \vec{b}$ . Its time complexity, measured as the total number of multiplications between two numbers, is $O (n^{2})$ .
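The convolution formula can be sketched in a few lines of Python. This is a minimal illustration, not an optimized routine; the function name `convolve` and the lowest-degree-first coefficient order are assumptions made for this example.

```python
def convolve(a, b):
    """Schoolbook product of two coefficient lists (lowest degree first)."""
    c = [0] * (len(a) + len(b) - 1)   # 2n - 1 coefficients when len(a) == len(b) == n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj       # one multiplication per (i, j) pair: n^2 in total
    return c

# (1 + 2X)(3 + 4X) = 3 + 10X + 8X^2
print(convolve([1, 2], [3, 4]))  # [3, 10, 8]
```

The two nested loops make the $O (n^{2})$ multiplication count directly visible.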

Another way of multiplying two polynomials is based on point-value representation. The point-value representation of a polynomial $A (X)$ of degree at most $n - 1$ is a set of $n$ coordinates $\{(x_{0}, y_{0}), (x_{1}, y_{1}), \dots, (x_{n - 1}, y_{n - 1})\}$ , where $y_{i} = A (x_{i})$ and the $x_{i}$ are pairwise distinct (the $y_{i}$ need not be distinct). Given a point-value representation of a polynomial of degree at most $n - 1$ , we can use polynomial interpolation (§A-15) to recover the polynomial. Let’s denote the point-value representations of two such polynomials $A (X)$ and $B (X)$ , taken at the same $x_{i}$ , as follows:

$A (X)$ : $((x_{0}, y_{0}^{⟨ a ⟩}), (x_{1}, y_{1}^{⟨ a ⟩}), \dots (x_{n - 1}, y_{n - 1}^{⟨ a ⟩}))$

$B (X)$ : $((x_{0}, y_{0}^{⟨ b ⟩}), (x_{1}, y_{1}^{⟨ b ⟩}), \dots (x_{n - 1}, y_{n - 1}^{⟨ b ⟩}))$

Then, the point-value representation of the polynomial $C (X) = A (X) \cdot B (X)$ can be computed as a Hadamard product (Definition A-10.1 in §A-10.1) of the $y$ values of the point-value representation of $A (X)$ and $B (X)$ as follows:

$C (X)$ : $((x_{0}, y_{0}^{⟨ c ⟩}), (x_{1}, y_{1}^{⟨ c ⟩}), \dots (x_{n - 1}, y_{n - 1}^{⟨ c ⟩}))$ , where $y_{i}^{⟨ c ⟩} = y_{i}^{⟨ a ⟩} \cdot y_{i}^{⟨ b ⟩}$
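As a small sanity check of this identity, the following sketch evaluates two example polynomials at the same points and compares the element-wise product of the $y$ values with direct evaluations of the expanded product. The helper `evaluate`, the example coefficients, and the evaluation points $0$ and $1$ are arbitrary choices for illustration.

```python
def evaluate(coeffs, x):
    """Evaluate a polynomial (lowest degree first) at x using Horner's rule."""
    acc = 0
    for coeff in reversed(coeffs):
        acc = acc * x + coeff
    return acc

a = [1, 2]       # A(X) = 1 + 2X
b = [3, 4]       # B(X) = 3 + 4X
c = [3, 10, 8]   # C(X) = A(X) * B(X), expanded by hand
xs = [0, 1]      # n = 2 distinct evaluation points

ya = [evaluate(a, x) for x in xs]
yb = [evaluate(b, x) for x in xs]
yc = [u * v for u, v in zip(ya, yb)]   # Hadamard product of the y values

# The products agree with C evaluated at the same points.
assert yc == [evaluate(c, x) for x in xs]
print(yc)  # [3, 21]
```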

However, we cannot derive polynomial $C (X)$ from these $n$ coordinates alone, because the degree of $C (X)$ can be as large as $2 n - 2$ , and a polynomial of that degree is determined only by $2 n - 1$ coordinates. But if we regard all polynomials (including $A (X), B (X)$ and $C (X)$ ) to be in the polynomial ring $ℝ [X] ∕ (X^{n} + 1)$ (or $ℤ_{p} [X] ∕ (X^{n} + 1)$ ), then we can use $X^{n} \equiv -1$ to reduce the $(2 n - 2)$ -degree polynomial $C (X)$ to a congruent polynomial of degree at most $n - 1$ in the ring. Then, the $n$ coordinates are sufficient to derive the reduced $C (X)$ , provided that every $x_{i}$ is a root of $X^{n} + 1$ , so that the reduction does not change the value of the polynomial at $x_{i}$ .
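The reduction modulo $X^{n} + 1$ can be illustrated as follows. Because $X^{n} \equiv -1$ in the ring, each term $X^{q n + j}$ folds onto $(-1)^{q} X^{j}$ . This is a minimal sketch; the function name `reduce_mod_xn_plus_1` is a hypothetical helper introduced only for this example.

```python
def reduce_mod_xn_plus_1(c, n):
    """Reduce a coefficient list (lowest degree first) modulo X^n + 1."""
    r = [0] * n
    for i, ci in enumerate(c):
        q, j = divmod(i, n)                  # X^i = X^(q*n + j)
        r[j] += ci if q % 2 == 0 else -ci    # X^(q*n) is congruent to (-1)^q
    return r

# 3 + 10X + 8X^2 with n = 2: X^2 is congruent to -1, so the result is (3 - 8) + 10X
print(reduce_mod_xn_plus_1([3, 10, 8], 2))  # [-5, 10]
```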

Even so, the time complexity of this new method is still $O (n^{2})$ . The Hadamard product between two polynomials’ point-value representations takes $O (n)$ time, but evaluating a polynomial at $n$ distinct $x$ values takes $O (n^{2})$ time (because each polynomial has $n$ terms, and we have to compute each term for $n$ distinct $x$ values). The polynomial interpolation for deriving $C (X)$ also takes $O (n^{2})$ time.
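The quadratic cost of this approach can be made concrete with a direct implementation. The sketch below assumes the ring $ℝ [X] ∕ (X^{n} + 1)$ with integer coefficients and uses the $n$ complex roots of $X^{n} + 1$ as evaluation points. For the interpolation step it uses the closed form $c_{k} = \frac{1}{n} \sum_{j} y_{j} x_{j}^{-k}$ , which is valid for these particular points only and stands in for general polynomial interpolation. The function names are illustrative.

```python
import cmath

def evaluate(coeffs, x):
    """Horner's rule: about n multiplications per evaluation point."""
    acc = 0
    for coeff in reversed(coeffs):
        acc = acc * x + coeff
    return acc

def multiply_in_ring_naive(a, b):
    """Multiply a and b modulo X^n + 1 via point values, without any fast transform."""
    n = len(a)
    # Assumed evaluation points: the n complex roots of X^n + 1.
    xs = [cmath.exp(1j * cmath.pi * (2 * j + 1) / n) for j in range(n)]
    ya = [evaluate(a, x) for x in xs]      # n evaluations of n terms each
    yb = [evaluate(b, x) for x in xs]      # n evaluations of n terms each
    yc = [u * v for u, v in zip(ya, yb)]   # Hadamard product: n multiplications
    # Interpolation specialised to these points: n sums of n terms each.
    cs = [sum(y * x ** (-k) for x, y in zip(xs, yc)) / n for k in range(n)]
    return [round(ck.real) for ck in cs]   # integer inputs give integer coefficients

print(multiply_in_ring_naive([1, 2], [3, 4]))  # [-5, 10]
```

Both the evaluation and the interpolation steps loop over $n$ points with $n$ terms each, which is where the $O (n^{2})$ cost comes from.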

To solve this efficiency problem, this section introduces an efficient technique for polynomial evaluation, which can evaluate a polynomial at $n$ distinct roots of unity in $O (n \log n)$ time. This technique comes in two variants: the Fast Fourier Transform (FFT) and the Number-Theoretic Transform (NTT). The two are technically almost the same; the only difference is that the FFT assumes a polynomial ring over complex numbers (§A-7), whereas the NTT assumes a polynomial ring over a finite field (e.g., integers modulo a prime) (§A-9). Polynomial multiplication based on FFT (or NTT) comprises 3 steps: (1) forward FFT (or NTT); (2) point-value multiplication; and (3) inverse FFT (or NTT).
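The three steps can be sketched as follows for the complex (FFT) case. This is only an illustrative sketch under stated assumptions, and the algorithm as developed in the rest of this section may be organized differently. It assumes $n$ is a power of two, uses a textbook recursive radix-2 Cooley-Tukey transform, and evaluates at the roots of $X^{n} + 1$ by first scaling the $k$ -th coefficient by $\psi^{k}$ , where $\psi = e^{i \pi / n}$ . The names `fft` and `multiply_negacyclic` are hypothetical.

```python
import cmath

def fft(values, invert=False):
    """Recursive radix-2 transform: out[k] = sum_j values[j] * w^(j*k),
    with w = exp(+2*pi*i/n), or exp(-2*pi*i/n) when invert is True (no 1/n scaling)."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], invert)
    odd = fft(values[1::2], invert)
    sign = -1 if invert else 1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def multiply_negacyclic(a, b):
    """Multiply integer polynomials a and b modulo X^n + 1 (n a power of two)."""
    n = len(a)
    psi = cmath.exp(1j * cmath.pi / n)   # psi^n = -1
    # Step 1: forward transform; the psi^k scaling moves the evaluation
    # points from the roots of X^n - 1 to the roots of X^n + 1.
    ya = fft([a[k] * psi ** k for k in range(n)])
    yb = fft([b[k] * psi ** k for k in range(n)])
    # Step 2: point-value (Hadamard) multiplication.
    yc = [u * v for u, v in zip(ya, yb)]
    # Step 3: inverse transform, then undo the scaling and the factor n.
    tc = fft(yc, invert=True)
    return [round((tc[k] / n / psi ** k).real) for k in range(n)]

print(multiply_negacyclic([1, 2], [3, 4]))  # [-5, 10]
```

Floating-point rounding limits this sketch to moderately sized integer coefficients; the NTT variant avoids rounding by working modulo a prime.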
