D-3.13 Modulus Bootstrapping

- Reference: Bootstrapping for Approximate Homomorphic Encryption [18]

During CKKS’s ciphertext-to-ciphertext multiplication, each ciphertext is associated with a particular multiplicative level and it decreases by 1 upon each ciphertext-to-ciphertext multiplication (by its internal modulus rescaling operation). Reaching multiplicative level 0 is equivalent to reaching the end of a ciphertext’s modulus chain and no more ciphertext-to-ciphertext multiplication can be performed. To continue with further ciphertext-to-ciphertext multiplication, CKKS provides a special operation called bootstrapping, which is a process of resetting the ciphertext’s end-of-chain modulus q0 to the initial maximum modulus qL (which is either q0 ΔL in the vanilla rescaling scheme, or m=0Lwm in the case of using CRT, as explained in §D-3.5.4).

Suppose we have a ciphertext (A,B) with multiplicative depth 0. If we decrypt a ciphertext whose multiplicative level is 0 (i.e., the ciphertext’s modulus is q0), then decrypting it without reduction modulo q0 would output:

𝖱𝖫𝖶𝖤1(𝖼𝗍 = (A,B))

= B + A S = ΔM + E + q0 K # since B + A S mod q0 = ΔM + E

, where q0 K accounts for wrap-around modulo q0 values– each coefficient of polynomial q0K is some multiple of q0. CKKS’s bootstrapping procedure is equivalent to safely transforming a ciphertext’s modulus from q0 to qL (where qL q0).

D-3.13.1 High-level Idea

As the first step of bootstrapping, we forcibly change the modulus of the ciphertext (A,B) from q0 to qL. Then, its decryption with reduction modulo qL would output:

𝖱𝖫𝖶𝖤1(𝖼𝗍 = (A,B)) mod qL

= B + A S mod qL

= ΔM + E + q0K mod qL

Here, we assume that qL is large enough such that ΔM + E + q0K qL. This is true given S has small coefficients which are either {1,0,1}, and thus the coefficients of B + A S would not grow much.

In the ΔM + E + q0K mod qL term, notice that because of the q0K term which is not modulo-reduced by q0 anymore, the ciphertext’s decrypted plaintext polynomial’s each i-th term would get a corrupted coefficient Δmi + ei + q0 ki mod qL instead of Δmi + ei mod qL. So, we now need to eliminate the garbage term q0 ki mod qL in each coefficient and distill the pure plaintext coefficient Δmi + ei.

PIC

Figure 16: Sine function f(x) = q0 2π sin (2𝜋𝑥 q0 ) such that f(Δmi + ei + q0ki) Δmi + ei (provided Δmi + ei q0) (Source)

To do so, we will take an approximated approach by using a sine function described in Figure 16, which has a period of q0 with the amplitude q0 2π. This sine function has the following two useful properties:

1.
When f(x) is evaluated at x values near the multiple of q0, the result approximates to that of a line function y = x. This is because the derivative (slope) of sinx is y = cosx, and if x is a multiple of 2π, the slope is: y = cos2π = 1.
2.
The evaluation of f(x) eliminates the multiples of q0 from the input (i.e., modulo reduction q0)

Combining these two properties, given input x = Δmi + ei + q0ki,

f(Δmi + ei + q0ki) = q0 2π sin (2π (Δmi + ei + q0ki)) q0 ) = q0 2π sin (2π (Δmi + ei)) q0 ) Δmi + ei

, provided Δmi + ei is very close to 0 relative to q0 (i.e., Δmi + ei q0). This is true, because by construction of the CKKS scheme, the plaintext modulus (even with scaling it up by Δ), is significantly smaller than the ciphertext modulus. Therefore, to remove q0K from ΔM + E + q0K, we can update each coefficient of the polynomial ΔM + E + q0K by evaluating it with the f(x) sine function. However, we cannot directly update the coefficients of the polynomial, because the CKKS scheme (the RLWE scheme in general) only supports the input vector’s slot-wise (+,⋅) operations. Therefore, to update the polynomial coefficients, we need to express the update logic in terms of slot-wise input vector arithmetic (+,⋅). Considering all these, CKKS’s overall bootstrapping procedure is described in Table 6.

1 ModRaise: Given ciphertext (A,B) mod q0, we forcibly modify its modulus from q0 to qL.
Then, it ends up encrypting ΔM + E + q0k instead of ΔM + E.
2 CoeffToSlot: Based on step 1’s ciphertext (A,B) mod qL, we generate a new ciphertext
that encrypts an input vector whose its each i-th slot stores Δmi + ei + q0ki.
This is equivalent to moving the coefficients of polynomial ΔM + E + q0K to
the input vector slots.
3 EvalExp: We convert the sine function into an approximated polynomial by using
the Taylor series, as well as other optimizations such as
Euler’s formula (e𝑖𝜃 = cos(𝜃) + i sin(𝜃)). Then, we generate a CKKS plaintext that encodes
this approximated sine function, and then use this to homomorphically evaluate
step 2’s encrypted vector elements (to homomorphically remove every q0ki.
4 SlotToCoeff: Based on the resulting ciphertext from step 3, we generate a new ciphertext
whose encrypted polynomial’s each i-th coefficient is (approximately) Δmi + ei.
This is equivalent to moving the q0ki-eliminated values stored in the input vector slots in
step 3 back to the positions of the polynomial coefficients. The final ciphertext is
our goal ciphertext that (approximately) encrypts ΔM + E under modulus qL.
Table 6: High-level Description of CKKS’s Bootstrapping Procedure
D-3.13.2 Mathematical Expansion of the High-level Idea

We will mathematically walk through how the bootstrapping procedure (Table 6) correctly updates the modulus of the input ciphertext from q0 to qL.

For ease of understanding, we will first explain how we would do modulus bootstrapping for a ciphertext with multiplicative level 0 (i.e., its modulus is q0) in case we have access to the secret key S(X). Using this key, we can decrypt the ciphertext as follows:

𝖱𝖫𝖶𝖤1(𝖼𝗍 = (A,B)) # where 𝖼𝗍 = (A,B) = 𝖱𝖫𝖶𝖤S,σ(ΔM)

= B + A S = ΔM + E mod q0

= ΔM + E + q0K # where q0K accounts for any potential wrap-around modulo q0 values

Our initial goal is to bootstrap the modulus of the ciphertext from q0 to qL by using only the following three tools:

After explaining the above, we will then explain how to achieve the same bootstrapping without having access to the secret key S.

ModRaise: This step forcibly changes the ciphertext’s modulus from q0 to qL and then decrypts the ciphertext as follows:

𝖱𝖫𝖶𝖤1(𝖼𝗍 = (A,B)) = B + A S = ΔM + E + q0K mod qL

Notice that the ciphertext’s decrypted plaintext polynomial’s each i-th coefficient gets corrupted to mi + ei + q0 ki mod qL. So, we now need to eliminate the garbage term q0ki mod qL in each coefficient and distill the pure plaintext coefficient Δmi + ei.

CoeffToSlot: This step generates a new plaintext polynomial whose each i-th input vector slot stores the corrupted coefficient (mi + ei + q0ki). The trick of doing this is to apply CKKS’s batch-encoding mapping σ1 (which represents the transformation m = W~ InR v n as explained in §D-3.1) to the input vector slots that encode the polynomial ΔM + E + q0K mod qL. Let vc be the input vector that corresponds to polynomial ΔM + E + q0K. Then, vc and ΔM + E + q0K satisfy the following relation over the encoding mapping σ1:

σ1(vc) = Mc = i=0n1(Δmi + ei + q0ki) Xi # i.e., polynomial ΔM + E + q0K

This implies that if we homomorphically apply the σ1 transformation to the elements of the input vector vc, then the resulting input vector vs will store vc’s encoded polynomial coefficient values as follows:

σ1 vc = vs = (Δm0 + e0 + q0k0, Δm1 + e1 + q0k1, , Δmn1 + en1 + q0kn1)

# where represents a linear transformation operation comprising (+,⋅)

However, remember that at the end of the ModRaise step, we get the decrypted (but corrupted by q0k) polynomial Mc = 𝖱𝖫𝖶𝖤1(𝖼𝗍 = (A,B)) = ΔM + E + q0K and we are not allowed to decode it into vc. Therefore, we will instead encode the matrix W~ InR in the encoding transformation σ1 (m = W~ InR v n ) into its equivalent polynomials (treating a matrix as a combination of vectors) and then perform batched slot-wise (+,⋅) operation between Mc and the polynomial version of W~ InR. We express this polynomial-based computation as follows:

Ms = σσ11 Mc mod qL # σσ11 is the polynomial-encoded version of the σ1 transformation

Then, the resulting polynomial Ms’s corresponding input vector slots (i.e., the decoded version of Ms) will store vs = (Δm0 + e0 + q0k0, Δm1 + e1 + q0k1, , Δmn1 + en1 + q0kn1). In other words, the above computation effectively moves the coefficients of Mc to the input vector slots of a new plaintext polynomial.

However, remember that in CKKS, an input vector can store only up to n 2 slots, whereas we need to store a total of n coefficients of Mc in the input vector slots. Therefore, we technically need to create 2 pieces of Ms as Ms1 and Ms2, where the input vector of Ms1 stores (Δm0 + e0 + q0k0, Δm1 + e1 + q0k1, , Δmn 2 1 + en 2 1 + q0kn 2 1), and the input vector of Ms2 stores (Δmn 2 + en 2 + q0kn 2 , , Δmn1 + en1 + q0kn1).

EvalExp: Our next step is to update vs’s each element mi + ei + q0ki to mi + ei by evaluating it with the sine function f(x). Since the output of the CoeffToSlot step is polynomial Ms (technically Ms1 and Ms2), we need to apply the evaluation transformation in an encoded form. First, we approximate f(x) as a linear combination comprising only (+,⋅) operations by using the Taylor series and Euler’s formula (will be explained later). Then, we encode (i.e., σ) the approximated formula into a polynomial form, and we denote it as σf. Finally, we apply the σf transformation to Ms as follows:

σf1 Ms mod qL # Applying the sine function’s linear transformation to vs’s each slot storing Δmi + ei + q0ki

= Mt = σ(vt) = σ((Δmi + ei)i=0n1) mod qL

After the linear transformation by the sine function, notice that each q0ki term gets eliminated from vs’s slots (i.e. modulo reduction by q) and the resulting vector vt stores only the Δmi + ei terms.

SlotToCoeff: Now that we have a polynomial Mt whose corresponding input vector vt’s slots store garbage-removed coefficients of (i.e., Δmi + ei) our initial plaintext polynomial, our next step is to put these coefficients stored in vt back to the polynomial. This is an exact reverse operation of CoeffToSlot as follows:

σσ1 Mt = Mb # σσ1 is a polynomial-encoded form of the batch-decoding formula v = W~m (§D-3.1)

The result is polynomial Mb whose coefficients are garbage-eliminated (i.e., q0ki-free) versions of Mc. Finally, we re-encrypt Mb as 𝖱𝖫𝖶𝖤S,σ(Mb) as the final modulus-bootstrapped ciphertext.

Bootstrapping Without a Secret Key: So far, we have assumed that we have access to the secret key S. With decryption and re-encryption enabled, the above bootstrapping steps described are mathematically equivalent to computing the following:

1.
INPUT: 𝖼𝗍 = (A,B) mod q0 # where 𝖼𝗍 = (A,B) = 𝖱𝖫𝖶𝖤S,σ(ΔM)
2.
ModRaise: 𝖼𝗍 = (A,B) mod qL
3.
Decryption: 𝖱𝖫𝖶𝖤1(𝖼𝗍 = (A,B))) mod qL
4.
CoeffToSlot: σσ11 𝖱𝖫𝖶𝖤1(𝖼𝗍 = (A,B))) mod qL
5.
EvalExp: σf1 (σσ11 𝖱𝖫𝖶𝖤1(𝖼𝗍 = (A,B)))) mod qL
6.
SlotToCoeff: σσ1 (σf1 (σσ11 𝖱𝖫𝖶𝖤1(𝖼𝗍 = (A,B))))) mod qL
7.
Re-encryption: 𝖱𝖫𝖶𝖤S,σ(σσ1 (σf1 (σσ11 𝖱𝖫𝖶𝖤1(𝖼𝗍 = (A,B)))))) mod qL

However, the ultimate goal of CKKS bootstrapping is to reset the modulus of a ciphertext from q0 to qL without having access to S.

Meanwhile, one important insight is that CKKS’s ModRaise procedure on the ciphertext (A,B) mod q0 from q0 qL effectively transforms the ciphertext into a new one which is an encryption of ΔM + q0K. Before ModRaise, ciphertext (A,B) mod q0’s decryption relation is as follows:

A A + B = ΔM + E + Kq0 mod q0 = ΔM + E

After ModRaise to (A,B) mod qL, its decryption relation is as follows:

A S + B = ΔM + E + Kq0 mod qL = ΔM + E + Kq0 # because ΔM + E + Kq0 qL

Therefore, the mod-raised ciphertext (A,B) mod qL = 𝖱𝖫𝖶𝖤S,σ(ΔM + Kq0) with noise E. Thus, CKKS’s homomorphic bootstrapping strategy is to run the subsequent CoeffToSlot, EvalExp, and SlotToCoeff steps homomorphically based on the ciphertext (A,B) mod qL. Running these 3 steps consumes a few multiplicative levels due to the ciphertext-to-ciphertext multiplication operations when homomorphically multiplying the coefficient-to-slot and slot-to-coefficient transformation matrices and homomorphically computing powers of X (i.e., Xk) during sine approximation. Therefore, upon completion of these 3 steps, the ciphertext modulus reduces from qL ql (where l is some integer such that l < L).

Note that the result of homomorphic bootstrapping is equal to the explicit bootstrapping based on decryption & re-encryption (if we ignore the small differences in the final ciphertext modulus and the noise). In the following subsections, we will explain the algebraic details of CoeffToSlot, EvalExp and SlotToCoeff steps.

D-3.13.3 Details: CoeffToSlot

Homomorphically moving the coefficients of Mc (i.e., Δmi + ei + q0ki for 0 i n 1) to a new ciphertext’s input vector slots is mathematically equivalent to homomorphically computing σσ11 (𝖱𝖫𝖶𝖤S,σ(𝖼𝗍 = (A,B)), which is equivalent to applying the encoding formula to the input vector slot values of 𝖱𝖫𝖶𝖤S,σ(𝖼𝗍 = (A,B)).

As explained in Summary D-3.9 (in §D-3.9), the encoding formula for converting an input vector into a list of polynomial coefficients is m = W~ InR v n , where W~ is a basis of the n-dimensional vector space crafted as follows:

W~ = [ 1 1 1 1 1 1 (ωJ(n 2 1)) (ωJ(n 2 2)) (ωJ(0)) (ωJ(n 2 1)) (ωJ(n 2 2)) (ωJ(0)) (ωJ(n 2 1))2 (ωJ(n 2 2))2 (ωJ(0))2 (ωJ(n 2 1))2 (ωJ(n 2 2))2 (ωJ(0))2 (ωJ(n 2 1))n1 (ωJ(n 2 2))n1 (ωJ(0))n1 (ωJ(n 2 1))n1 (ωJ(n 2 2))n1 (ωJ(0))n1 ]

# where the rotation helper function J(h) = 5h mod 2n

Therefore, given the input ciphertext 𝖼𝗍𝖼 = 𝖱𝖫𝖶𝖤S,σ(𝖼𝗍 = (A,B)) mod qL whose plaintext polynomial Mc contains corrupted coefficients, computing σσ11   𝖼𝗍𝖼 is equivalent to computing W~ InR 𝖼𝗍𝖼 n . However, one problem here is that each CKKS ciphertext encodes only n 2 input vector slots, whereas our goal is to move n (corrupted) coefficients of the plaintext polynomial Mc encrypted in 𝖼𝗍𝖼. Therefore, we will instead generate 2 ciphertexts, 𝖼𝗍𝗌1 and 𝖼𝗍𝗌2, such that each 𝖼𝗍𝗌1’s input vector slots store (Δmi + ei + q0ki)0i<n 2 and 𝖼𝗍𝗌2’s input vector slots store (Δmi + ei + q0ki)n 2 i<n.

We split the n ×n matrix W~ InR into four n 2 ×n 2 matrices as follows:

Then, we can compute 𝖼𝗍𝗌1 and 𝖼𝗍𝗌2 as follows:

𝖼𝗍𝗌1 = [W~InR]11 𝖼𝗍𝖼 + [W~InR]12 In 2 R 𝖼𝗍𝖼¯ n

𝖼𝗍𝗌2 = [W~InR]21 𝖼𝗍𝖼 + [W~InR]22 In 2 R 𝖼𝗍𝖼¯ n

Each homomorphic matrix-vector multiplication (e.g., [W~InR]21 𝖼𝗍𝖼) can be done in an efficient manner that reduces the number of homomorphic rotations (§D-2.10). 𝖼𝗍𝖼¯ can be computed by applying homomorphic conjugation to ctc (§D-3.11).

D-3.13.4 Details: EvalExp

We will use the sine function f(x) = q 2π sin (2𝜋𝑥 q0 ) to approximately eliminate q0ki from Δmi + ei + q0ki by computing f(Δmi + ei + q0ki) Δmi + ei. This approximation works if Δmi + ei is very close to x = 0 relative to q0 (i.e., Δmi + ei q0). Still, the elimination of q0ki is approximate (i.e., Δmi + ei), because f(x) is y x nearby x = 0, not exactly y = x.

One issue is that we need to evaluate f(x) homomorphically based on cts1 and cts2 as inputs (i.e., f(𝖼𝗍𝗌1) and f(𝖼𝗍𝗌2)), but FHE supports only (+,⋅) operations, whereas the sine graph cannot be formulated by only (+,⋅). Therefore, we will approximate the sine function f(x) by using the Taylor series (§A-14):

f(x) = f(a) + f(a) 1! (x a) + f(a) 2! (x a)2 + f(a) 3! (x a)3 + = d=0f(d)(a) d! (x a)d

If we approximate f(x) around x = 0, then the approximated polynomial is as follows:

f(x) = q0 2π sin (2π q0 0) + q0 2π 2π q0 cos (2π q0 0) 1! x + q0 2π (2π q0 ) 2 sin (2π q0 0) 2! x2 +

= q0 2π j=0 ( (1)j (2j + 1)! (2𝜋𝑥 q0 ) 2j+1)

q0 2π j=0h ( (1)j (2j + 1)! (2𝜋𝑥 q0 ) 2j+1) = f^(x)

, where f^(x) is a (2h + 1)-degree polynomial.

Remember that in the RLWE cryptosystem, B + 𝐴𝑆 mod q0 = ΔM + E, or B + 𝐴𝑆 = ΔM + E + q0K with some polynomial K representing the wrapping around values of modulo q0. Since the secret key S is an (n 1)-degree polynomial whose coefficients are small (i.e., si {1,0,1}), the coefficients of K will have some reasonably small upper bound, which decreases with the sparsity of S (i.e., the frequency of 0 coefficients in S). Therefore, the degree of our approximated f^(x) only needs to be high enough to accurately evaluate y values between q0 𝑘𝑚𝑎𝑥 x q0 𝑘𝑚𝑎𝑥. The required minimum degree of our approximated Taylor polynomial f^(x) increases with q0𝑘𝑚𝑎𝑥 (i.e., the upper bound of x). Our one issue is that the computation overhead for homomorphic evaluation of a polynomial generally increases exponentially with the degree of the polynomial, which will slow down bootstrapping. To reduce this computation cost, we will leverage Euler’s formula (§A-11) and its square arithmetic:

{ ei𝜃 = cos𝜃 + i sin𝜃 (ei𝜃)2 = ei2𝜃

By substituting 𝜃 = 2𝜋𝑥 q0 , we will use Euler’s formula. We will also approximate e𝑖𝜃 with the Taylor series, but instead of directly approximating e𝑖𝜃, we will first approximate e𝑖𝜃 2r for some large 2r. After that, we will iteratively square e𝑖𝜃 2r a total r times. Then, we get an approximation of (e𝑖𝜃 2r )2r = e𝑖𝜃. The reason why we start with the approximation of e𝑖𝜃 2r instead of e𝑖𝜃 is that its approximation requires a small degree of polynomial, as 𝜃 2r (i.e., the input to the complex exponential function) is small provided 2r is sufficiently large. Specifically, we learned that x ( = Δmi + ei + q0ki) is upper-bounded by q0𝑘𝑚𝑎𝑥, thus 𝜃 = 2𝜋𝑥 q0 is upper-bounded by 2π𝑘𝑚𝑎𝑥 2r . As the targeted range of x for approximation in f(x) is small, we need a small degree of Taylor series polynomial.

Using the Taylor series with degree d0 around x = 0, we can approximate e2𝜋𝑖𝑥 2rq0 as:

fe(x) = e2𝜋𝑖𝑥 2rq0 d=0d0 1 d! (2𝜋𝑖𝑥 2rq0 ) d = f^e(x)

Then, we iteratively square f^e total r times to get:

(f^e(x))2r (fe(x))2r = ei2𝜋𝑥 q0 = e𝑖𝜃

Then, based on Euler’s formula ei𝜃 = cos𝜃 + i sin𝜃, we can derive the following relations:

ei𝜃¯ = cos𝜃 + i sin𝜃¯

ei𝜃 = cos𝜃 i sin𝜃

ei𝜃 ei𝜃 = (cos𝜃 + i sin𝜃) (cos𝜃 i sin𝜃) = 2isin𝜃

sin𝜃 = i 2 (ei𝜃 ei𝜃)

q0 2π sin𝜃 = q0 2π i 2 (ei𝜃 ei𝜃)

Substituting 𝜃 = 2𝜋𝑥 q0 , we finally get:

q0 2π sin (2𝜋𝑥 q0 ) = q0 2π i 2 (ei2𝜋𝑥 q0 ei2𝜋𝑥 q0 )

Using the final relation above, the EvalExp step homomorphically evaluates the approximation of q0 2π sin (2𝜋𝑥 q0 ) where x = Δmi + ei + q0ki as follows:

1.
Homomorphically approximately compute f^(x) = ei2𝜋𝑥 q0 .
2.
Homomorphically approximately compute f^(x)¯ = ei2𝜋𝑥 q0 by applying homomorphic conjugation. (§D-3.11) to f^(x)
3.
Homomorphically compute f^(x) f^(x)¯ = ei2𝜋𝑥 q0 ei2𝜋𝑥 q0 , and then multiply the result by i 2 encoded as CKKS plaintext.

The result of EvalExp is two ciphertexts whose input vector slots store the bootstrapped coefficients of Mc, which are modulo-reduced q0 from Δmi + ei + q0ki to Δmi + ei + e𝑏𝑖. Note that e𝑏𝑖 is a bootstrapping error introduced by the following three factors: (1) the intrinsic homomorphic (+,⋅) computation noises of the CoeffToSlot, EvalExp, and SlotToCoeff steps; (2) the EvalExp step’s Taylor polynomial approximation error of the exponential function e𝑖𝜃; (3) the EvalExp step’s sine graph error, since the graph is not exactly y = x around x = 0, but only y x.

Note that since the output of the CoeffToSlot step was split into 2 ciphertexts (cts1 and cts2), the output of the EvalExp step is also in 2 ciphertexts: (ctb1 and ctb2). The input vector slots of ctb1 store (Δmi + ei + e𝑏𝑖)i=0n 2 1, whereas the input vector slots of ctb2 store (Δmi + ei + e𝑏𝑖)i=n 2 n1.

D-3.13.5 Details: SlotToCoeff

This step is an exact inverse of the CoeffToSlot step, which is moving the bootstrapped (i.e. modulo-reduced q0) coefficients of Mv stored in the input vector slots back to the final plaintext polynomial Mf. Remember that the decoding formula from a polynomial to an input vector (§D-3.1) is v = W~m, where:

W~ = [ 1 (ωJ(0)) (ωJ(0))2 (ωJ(0))n1 1 (ωJ(1)) (ωJ(1))2 (ωJ(1))n1 1 (ωJ(2)) (ωJ(2))2 (ωJ(2))n1 1 (ωJ(n 2 1)) (ωJ(n 2 1))2 (ωJ(n 2 1))n1 1 (ωJ(n 2 1)) (ωJ(n 2 1))2 (ωJ(n 2 1))n1 1 (ωJ(1)) (ωJ(1))2 (ωJ(1))n1 1 (ωJ(0)) (ωJ(0))2 (ωJ(0))n1 ]

We denote [W~]11, [W~]12, [W~]21, and [W~]22 as n 2 ×n 2 matrices corresponding to the upper-left, upper-right, lower-left, and lower-right sections of W~. Then, homomorphically applying the decoding formula results in the final bootstrapped ciphertext ctfinal modulo qL whose plaintext polynomial is garbage-eliminated from ΔM + E + q0K mod qL to ΔM + E + Eb mod ql (where Eb is the bootstrapping error polynomial). Note that the ciphertext modulus changed from qL ql because we consumed some multiplicative levels for computing ciphertext-to-ciphertext multiplications during polynomial evaluation (i.e., Xk). We can derive ctfinal by homomorphically applying the decoding transformation W~T to the results of the EvalExp step (ctb1 and ctb2) as follows:

𝖼𝗍𝖿𝗂𝗇𝖺𝗅 = 𝖱𝖫𝖶𝖤S,σ(ΔM + E + Eb) = [W~]11 𝖼𝗍𝖻1 + [W~]12 𝖼𝗍𝖻2

Note that we do not use [W~]21 and [W~]22, because we only need to derive the n 2 input vector slots whose decoding would result in the n coefficients (Δmi + ei + e𝑏𝑖)i=0n1 of the final (n 1)-degree polynomial. Once we generate a new ciphertext ctfinal whose n 2 input vector slots store (Δmi + ei + e𝑏𝑖)i=0n1, then its latter n 2 conjugate slots get automatically filled with the conjugates of the first n 2 slot values.

D-3.13.6 Reducing the Bootstrapping Overhead by Sparsely Packing Ciphertext

In many cases, the application of CKKS may use only a small number of input vector slots (e.g., n 2 ) out of n 2 slots. Suppose that such n is some number that divides n. Then, we can do a series of homomorphic rotations and multiplications to make the input vector slots store n n repetitions of the n 2 -slot values. Specifically, we can do this in total n n rounds of rotations and additions: initially, we zero-mask between the n 2 -th slot and the n 2 1-th slots and save as ct, and then in each i-th round we compute 𝖼𝗍 = 𝖼𝗍 + 𝖱𝗈𝗍𝖺𝗍𝖾(𝖼𝗍,n2i).

Then, we apply the optimization of sparsely packing ciphertext in Summary D-3.12 (§D-3.12): if an n 2-dimensional input vector is structured as n n consecutive repetitions of the first n 2 slot values, then its encoded polynomial M(X) [X](Xn + 1) has the structure such that all its coefficients whose degree term is not a multiple of n n are zero as follows:

M(X) = c0 + c n nX n n + c2n n X2n n + + cn n nXn n n.

Remember that in the CoeffToSlot step (§D-3.13.3), we use the formula m = W~ InR v n to move the q0k-contaminated polynomial’s coefficients to the input vector slots. But by the principle of sparsely packed ciphertext, we know that all the slots of m which are not a multiple of n n slots would store a zero coefficient. This means that we will get the same computation result even if we only compute the m = W~ InR v n formula with the rows of W~ whose row index is a multiple of n n. Mathematically, we can update the encoding formula to ms = E InR v n where the n × n n matrix E is an elimination of all those columns from W~ whose column index is not a multiple of n n:

E = [ 1 1 1 1 1 1 (ξJ(0nn n)) (ξJ(1nn )) (ξJ(n nn)) (ξJ(n nn)) (ξJ(1nn )) (ξJ(0nn n)) (ξJ(0nn n))2 (ξJ(1nn n))2 (ξJ(n nn))2 (ξJ(n nn))2 (ξJ(1nn n))2 (ξJ(0nn n))2 (ξJ(0nn n))n1 (ξJ(1nn n))n1 (ξJ(n nn))n1 (ξJ(n nn))n1 (ξJ(1nn n))n1 (ξJ(0nn n))n1 ]

Remember that in the original CoeffToSlot step (§D-3.13.3), we had to split cts into cts1 and cts2 because in CKKS each input vector can store a maximum of n 2 slots but we need to move a total of n coefficient values to the input vector slots for bootstrapping. On the other hand, the computation result of the above updated encoding formula (using a sparsely packed ciphertext) is ms, having only n n coefficient slots instead of n coefficient slots, and each slot index i in ms corresponds to the encoded polynomial’s coefficient with degree term i n n (we do not compute any other coefficient terms, because we know that they are 0 anyway, so no need to bootstrap them). And notice that n n n 2 , because n divides n. Therefore, without computing two ciphertexts 𝖼𝗍𝗌1 = [W~InR]11 𝖼𝗍𝖼 + [W~InR]12 In 2 R 𝖼𝗍𝖼¯ and 𝖼𝗍 𝗌2 = [W~InR]21 𝖼𝗍𝖼 + [W~InR]22 In 2 R 𝖼𝗍𝖼¯ separately, we can directly compute 𝖼𝗍𝖼 = E InR 𝖼𝗍𝖼 n , because all coefficients for bootstrapping fit in n 2 slots. Therefore, the number of homomorphic computations and memory requirement for the CoeffToSlot step can be reduced by half. And the same is true for the EvalExp step (§D-3.13.4).

Similarly, as for the SlotToCoeff step (§D-3.13.5), we update the decoding formula v = W~m to v = ET mc. This again reduces the number of homomorphic computations and memory requirements for the SlotToCoeff step by half. Notice that ET is a matrix where those columns whose column index is not a multiple of n n are zero. This zero-enforcement to the columns of ET still outputs the same computation result, because mc is a vector such that those slots whose slot index is not a multiple of n n are zero, which makes the computation result with their corresponding columns of ET (i.e., the columns whose index is not a multiple of n n) zero, anyway.

D-3.13.7 Summary

We summarize the CKKS bootstrapping procedure as follows.

Summary D-3.13.7 CKKS Bootstrapping

1.
INPUT: 𝖼𝗍 = (A,B) mod q0 # where 𝖼𝗍 = (A,B) = 𝖱𝖫𝖶𝖤S,σ(ΔM)

, which satisfies the decryption relation: A S + B = ΔM + E + Kq0

2.
ModRaise: View the polynomials A and B as plaintext polynomials whose each coefficient is in qL (i.e., (A,B) mod qL). This change of viewpoint automatically changes the ciphertext as 𝖱𝖫𝖶𝖤S,σ(ΔM + Kq0). The ModRaise step does not require any actual computation.

3.
CoeffToSlot:

Move the coefficients of the encrypted plaintext ΔM + E + q0K to the input vector slots by homomorphically multiplying n1 W~ InR to it follows:

𝖱𝖫𝖶𝖤S,σ(Z1) = n1 W~ InR 𝖱𝖫𝖶𝖤S,σ(ΔM + E + q0K) mod qL

4.
EvalExp:

Remove the wrap-around garbage value q0K in ΔM + E + q0K by homomorphically evaluating the polynomial σf which approximates a sine function with period q0 as follows:

𝖱𝖫𝖶𝖤S,σ(Z2) = σf 𝖱𝖫𝖶𝖤S,σ(Z1) mod ql

This step is equivalent to homomorphically performing modulo reduction by q0 to the input value. This step reduces the ciphertext modulus from qL ql as it consumes multiplicative levels when homomorphically evaluating the polynomial approximation of the sine function.

5.
SlotToCoeff:

Move the modulo-q0-reduced plaintext value ΔM + E stored in the input vector slots back to the plaintext coefficient positions by homomorphically multiplying the encoding matrix W~ as follows:

𝖱𝖫𝖶𝖤S,σ(ΔM + E) = W~𝖱𝖫𝖶𝖤S,σ(Z2) mod ql

Limitation: The noise slowly grows over each bootstrapping due to the bootstrapping error and will eventually overflow the message and the ciphertext modulus.

Comparison between BFV and CKKS Bootstrapping: In the case of CKKS’s bootstrapping, it does not reduce the magnitude of the old noise E and keeps it the same as before, because the sine approximation function converts ΔM + E + Kq0 into ΔM + E. However, as the ciphertext modulus gets increased from q0 qL, the noise-to-ciphertext-modulus ratio decreases, since E qL E q0. On the other hand, the bootstrapping procedure introduces a new bootstrapping noise, which can be viewed as a fixed amount. However, this fixed amount of new noise accumulates over each bootstrapping. Therefore, after a very large number of bootstrappings, the noise will eventually overflow the message and even the ciphertext modulus.

In the case of BFV’s bootstrapping, it reduces the noise, but does not change the ciphertext modulus. However, there is no need to reset the ciphertext modulus, because BFV does not have a leveled ciphertext modulus chain, and BFV’s ciphertext-to-ciphertext multiplication does not consume ciphertext modulus. Furthermore, since BFV’s bootstrapping directly removes the noise, the noise is guaranteed to be kept under a certain threshold even after an infinite number of bootstrappings.

Another important difference is that CKKS’s bootstrapping does not require homomorphic decryption, primarily because it maintains the plaintext’s scaling factor to be the same across the entire bootstrapping procedure. On the other hand, BFV’s bootstrapping needs to change the plaintext’s scaling factor to run the digit extraction algorithm. Therefore, homomorphic decryption is required to change the plaintext scaling factor (p𝜀) while preserving the same ciphertext modulus (q).

D-3.13.8 Reducing the Bootstrapping Noise

As explained in §D-3.13.4, the bootstrapping procedure generates three types of noises:

The Type-1 noise is inevitable by the design of FHE. The Type-2 noise can be either avoided or unavoidable depending on the tradeoff setup between the bootstrapping accuracy and efficiency. Unlike these two types of noises, the Type-3 noise can be effectively reduced by newer bootstrapping techniques.

PIC

Figure 17: Arc-sine graph for smaller approximation error (Source)

𝐚𝐫𝐜𝐬𝐢𝐧(𝐬𝐢𝐧(𝐱)) Approximation (EUROCRYPT 2021 [19]): Using the arcsin(sin(x)) function instead of the sinx function can reduce the Type-3 noise, because its line is not curved but straight, as shown in Figure 17 (comprising a series of y = x and y = x segments). This technique also uses the Remez algorithm that evenly distributes the approximation error over a specified region. However, one downside of this technique is that it consumes 3 multiplicative levels.

Meta-BTS (CCS 2022 [20]): This is thus far the most computationally efficient and accurate bootstrapping technique, whose procedure is as follows:

1.
Perform the regular bootstrapping based on the sine graph to the input ciphertext.
2.
Rescale step 1’s bootstrapped ciphertext to modulus q0.
3.
Subtract step 2’s ciphertext from the initial un-bootstrapped ciphertext (where both ciphertexts are modulo q0), whose result is a modulo q0 ciphertext storing the bootstrapping error.
4.
Bootstrap the output ciphertext of step 3 (storing the bootstrapping error) to modulus ql.
5.
Subtract step 4’s ciphertext from step 1’s ciphertext (where both ciphertexts are modulo ql), which gives a new modulo ql ciphertext with a reduced bootstrapping error.

Limitation in Noise Handling: The above bootstrapping techniques can reduce the Type-3 noise, because the bootstrapping error is smaller than the plaintext message and a smaller input x value to the approximating sine function outputs a value closer to y = x. Running this algorithm multiple times, the Type-3 noise becomes exponentially smaller, because the size of the target plaintext (i.e., the extracted bootstrapping error as the output of step 3 above) is much smaller than before. Meanwhile, Type-1 and Type-2 noises do not decrease over multiple bootstrapping rounds, relatively keeping their same level, because each round generates new Type-1 and Type-2 noises.