Fixed the sum range variable subscipts.
[binaryfield.git] / main.tex
1 \documentclass[10pt]{article}
2
3 % correct bad hyphenation here
4 \hyphenation{op-tical net-works semi-conduc-tor}
5
6 \input{style}
7
8 \begin{document}
9
10 % paper title
11 % can use linebreaks \\ within to get better formatting as desired
12 \title{Converting to Binary in Prime-Order Galois Fields}
13
14
15 % author names and affiliations
16 % use a multiple column layout for up to two different
17 % affiliations
18
19 \author{David Llewellyn-Jones}
20 %\IEEEauthorblockA{School of Computing and Mathematical Sciences\\
21 %Liverpool John Moores University\\
22 %Liverpool, UK\\
23 %\{D.Llewellyn-Jones, Q.Shi, M.Merabti\}@ljmu.ac.uk}
24
25 % make the title area
26 \maketitle
27
28
29 %\begin{abstract}
30 %The abstract goes here. DO NOT USE SPECIAL CHARACTERS, SYMBOLS, OR MATH IN YOUR TITLE OR ABSTRACT.
31 %\end{abstract}
32
33 \section{Assumptions}
34
35 We consider operations over the finite Galois Field $\GF(p)$ of prime order $p$. The field has addition $+$ and multiplication $.$ which are commutative, associative and distributive. It also has additive and multiplicative identities $0$ and $1$ respectively.
36
37 Every element $x \in \GF(p)$ has an additive inverse ${-x} = p - x$ and every non-zero element $x \in \GF(p)^* = \GF(p) \setminus \{ 0 \}$ has a multiplicative inverse $x^{-1}$. By Lagrange's Theorem we have that $x^p = 1$, except where $x = 0$, in which case $x^p = 0$.
38
39 We consider only the $n$-bit numbers in $\GF(p)$ where $p > 2^{n}$. For example if $n = 8$ we're considering 8-bit numbers with a maximum value of 255 and where $p$ is a prime number likely to be considerably larger than this.
40
41 \section{Method}
42
43 We start by defining a function that allows us to determine whether a number is larger than or equal to the constant value $m$.
44
45 \begin{lemma}
46 \label{lemma:comparison}
47 Let
48 \begin{equation}
49 \label{equation:comparison}
50 d_m(x) = \left( \vphantom\prod \smash{\prod_{i = 0}^{m}} (x - i) \right)^p.
51 \end{equation}
52 We claim that
53 $$
54 d_m(x) =
55 \begin{cases}
56 0 & x \le m \\
57 1 & \text{otherwise}
58 \end{cases}
59 $$
60 \end{lemma}
61 \begin{proof}
62 We start by showing that $\prod_{i = 0}^{m} (x - i) = 0$ iff $x \le m$.
63
64 Suppose $x \le m$. Then for some $0 \le i \le m$ we must have $x - i = 0$. Hence $\prod_{i = 0}^{m} (x - i) = 0$ as required.
65
66 For the counterfactual, suppose that $\prod_{i = 0}^{m} (x - i) = 0$. We aim to show that in this case $x \le m$. We do this by induction on $m$.
67
68 For the base case, set $m = 0$. Hence we have
69 \begin{align*}
70 \prod_{i = 0}^{m} (x - i) & = 0, \\
71 \prod_{i = 0}^{0} (x - i) & = 0, \\
72 (x - 0) & = 0, \\
73 x & = 0.
74 \end{align*}
75
76 For the inductive step we assume it's true for $x = a$ and want to show it's therefore true for $x = a + 1$. We have (by our inductive hypothesis) that if $\prod_{i = 0}^{a} (x - i) = 0$ then $x \le a$.
77
78 Now suppose $\prod_{i = 0}^{a + 1} (x - i) = 0$. By rearranging terms (commutativity) we can write this as
79 $$
80 (x - (a + 1)) \times \prod_{i = 0}^{a} (x - i).
81 $$
82 Suppose $\prod_{i = 0}^{a} (x - i) = 0$, then the result immediately follows by the inductive hypothesis since therefore $x \le a < a + 1$. If we assume this isn't the case, then by the existence of inverses we can post-multiply by the inverse of the product to get
83 $$
84 (x - (a + 1)) = 0.
85 $$
86 Again, by rearranging we see that in this case $x = (a + 1)$ and hence that $x \le a + 1$ as required.
87
88 From the above discussion we know that
89 $$
90 \prod_{i = 0}^{m} (x - i) = 0 \qquad \iff \qquad x \le m.
91 $$
92 For any value $v \in \GF(p)$ we know that
93 $$
94 v^p =
95 \begin{cases}
96 0 & v = 0 \\
97 1 & \text{otherwise}
98 \end{cases}
99 $$
100 We conclude that
101 $$
102 d_m(x) = \left( \vphantom\prod \smash{\prod_{i = 0}^{m}} (x - i) \right)^p =
103 \begin{cases}
104 0 & x \le m \\
105 1 & \text{otherwise}
106 \end{cases}
107 $$
108 \end{proof}
109
110 For notational simplicity, we define the following function.
111 \begin{definition}
112 Let $d_m'(x) = d_{2^m}(x).$
113 \end{definition}
114
115 We're now in a position to define the function that will output the bits of our $n$-bit number. We define separate functions $b_i : \GF(p) \to \{0, 1 \}$ for each bit $1 \le i \le n$ starting at $n$ and working downwards.
116 \begin{definition}
117 Define $b_n : \GF(p) \to \{0, 1\}$ to be the function
118 $$
119 b_n (x) = d_{n - 1}'(x).
120 $$
121 Starting from $m = n - 1$ and decreasing to $m = 1$ we define functions $b_m : \GF(p) \to \{0, 1 \}$ inductively as follows.
122 \begin{equation}
123 \label{equation:getbit}
124 b_m (x) = d_{m - 1}' \left( x - \vphantom{\sum} \smash{\sum_{i = m}^{n - 1}} 2^i b_i (x) \right).
125 \end{equation}
126 \end{definition}
127 We claim that the functions $b_m (x)$ for $1 \le m \le n$ will generate the binary bits of any number $x < 2^n$ where $b_n$ is the MSB and $b_1$ is the LSB.
128 \begin{theorem}
129 For any number $x \in \GF(p)$ where $x < 2^n < p$ the function $b_m (x)$ will output the $m$-th binary digit of $x$ for $1 \le m \le n$.
130 \end{theorem}
131 \begin{proof}
132 We start by noting that for any number $x \le 2^i$, the $i$-th bit is 0 if and only if $x \le 2^{i - 1}$. We also note that if $b_i \in \{0, 1 \}$ is this $i$-th bit, then $x - 2^{i - 1} b_i \le 2^{i - 1}$.
133
134 We now prove the result by induction on $m$ starting at $m = n$ and working downwards. So for the base case take $m = n$. Then $b_n (x) = d_{n - 1}'(x)$. However, we also know that $x \le 2^n$ and that the $n$-th bit is zero iff $x \le 2^{n - 1}$. But this therefore follows directly from Lemma \ref{lemma:comparison} setting $m = 2^{n - 1}$.
135
136 For the inductive step, assume the result holds for $m = a + 1$. Our inductive hypothesis is that $b_i$ will output the binary bits for $a + 1 \le i \le n$ and we want to show that $b_a$ will output the $a$-th bit. Set
137 $$
138 x' = x - \sum_{i = a + 1}^{n - 1} 2^i b_i (x).
139 $$
140 From the results noted at the start of the proof, and the inductive hypothesis, we can deduce that
141 $$
142 x' \le 2^{a}
143 $$
144 (the proof of this is also by induction on $a$, but is straightforward and omitted for clarity). It follows that the $a$-th bit of $x'$ is 0 if and only if $x' \le 2^{a - 1}$, and so from Lemma \ref{lemma:comparison} we see that $d_{2^{a - 1}} (x')$ will output the $a$-th bit of $x'$ which is also the $a$-th but of $x$. Going back to our original definition of $b_a$ we see that
145 $$
146 b_a (x) = d_{a - 1}' (x') = d_{2^{a - 1}} (x')
147 $$
148 and so have the result we need.
149 \end{proof}
150
151 \section{Notes}
152 Although this gives a function we can use in $\GF(p)$ to determine the binary digits of a number, we haven't quite proved everything we need. In order to ensure the function is usable in the context of homomorphic encryption, we can only use multplication and addition. This introduces restrictions on the algebra we can perform on encrypted data. For example, although exponentiation is a special case of multiplication (performing the same multiplication multiple times), it turns out we can only do this where the exponent is a {\it constant\/} value. The reason is that, if the exponent were a variable depending on some other value in the formula, we would need to know how many multiplications to perform, which wouldn't be possible for an encrypted exponent. Similarly for ranged sums, the ranges must be constat values, otherwise we wouldn't know how many additions to perform.
153
154 In the functions defined above, we use exponentiation and ranged sums in several places. For example, in Equation \ref{equation:comparison} we exponentiate by the power $p$. It's clear that in this case $p$ is a constant and so can also be represented as a fixed number of multiplications.
155
156 Less clear may be the ranged sum of Equation \ref{equation:getbit}. In this case we're saved by the fact each $b_m (x)$ is defined as a separate function. Hence both $m$ and $n$ are constant for each $b_m$ and so we can alternatively represent the sum as a fixed number of individual additions.
157
158 The catch is that each $b_m (x)$ involves a large number of operations, each of which will introduce noise into the homomorphically encrypted data. Calculation of the inverse (exponentiation by $p$) in Equation \ref{equation:comparison} is particularly problemmatic, since the noise will grow with $p$ (and hence we can't just choose a larger $p$ to compensate). The purpose of the exponentiation is simply to map 0 to 0 and any other number to 1; although fundamental to the process, there may be a way to achieve it without introducing so much noise. A possible avenue to consider might be the use of the Extended Euclidean Algorithm \cite{Cormen01} for calculating the inverse instead.
159
160
161 \bibliographystyle{IEEEtran}
162 \bibliography{main}
163
164 % that's all folks
165 \end{document}
166
167