Representation of complex numbers in redundant numeration systems | Counting maximum optimal representations
We have seen in Theorem [t-naf-optimal] that with , the -NAF representation of a given number is always optimal. However, it might not be strictly optimal – that is, there might exist other representations of the same number with equal Hamming weight. For certain applications, it is important to know the number of such optimal representations. In [num-binary-signed-reprs], it is shown that a binary NAF representation has at most equivalent reduced optimal representations (including itself), where denotes the Fibonacci sequence defined by the recurrence . The proof makes use of a transducer that converts any -representation into the equivalent NAF representation. In this chapter, we are going to prove a similar statement about -NAF extended Penney representations of Gaussian integers by constructing an analogous transducer.
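The Fibonacci bound above concerns the classical binary NAF. As a point of reference, the binary NAF of an integer can be computed with a short routine; the following sketch (function name and digit conventions are ours, not the source's) produces digits least-significant first:

```python
def binary_naf(n):
    """Compute the binary non-adjacent form (NAF) of a positive integer.

    Returns digits least-significant first, each in {-1, 0, 1};
    no two consecutive digits are non-zero.
    """
    digits = []
    while n != 0:
        if n % 2 == 1:
            d = 2 - (n % 4)   # choose -1 or 1 so the next digit is 0
            n -= d
        else:
            d = 0
        digits.append(d)
        n //= 2
    return digits

# 7 = 8 - 1, so its NAF is (-1, 0, 0, 1) least-significant first
print(binary_naf(7))   # → [-1, 0, 0, 1]
```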
We are going to work with the extended Penney system, so we shall use the symbol .
Transducer
Definition 3.1. A transducer is a function , where is its set of states, is a set of digits and is the set of finite sequences of digits.
Given a transducer, a sequence and an initial state , we can use the transducer to transform the sequence as follows:
Intuitively, we can think of the transducer as a machine that “consumes” the input from the right one digit at a time and, based on the digit and its internal state, outputs some digits and changes its internal state. In each step, we obtain a sequence which forms part of the output. By concatenating these sequences, we get the result of the transformation: . We can define an extended version of the transducer which performs all the steps at once: .
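The consume-from-the-right loop can be sketched generically in Python. The driver below is faithful to the description above; the carry-normalizing transducer used to exercise it is a toy assumption for illustration only, not the transducer constructed in this chapter:

```python
def run_transducer(step, digits, q0):
    """Run a transducer over a digit sequence, consuming from the right.

    `step` maps (state, digit) -> (new_state, output_digits); the output
    chunks are concatenated right to left, as described in the text.
    """
    state, output = q0, []
    for d in reversed(digits):
        state, out = step(state, d)
        output = list(out) + output
    return state, output

# Toy example (an assumption, not the chapter's transducer): normalize
# a redundant binary representation with digits {0, 1, 2} by carrying.
def carry_step(carry, d):
    t = d + carry
    return t // 2, [t % 2]

# 2*4 + 1*2 + 2*1 = 12; the final state is a leftover carry
state, out = run_transducer(carry_step, [2, 1, 2], 0)
print(state, out)
```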
The state of our transducer is going to represent the difference between the numbers represented by the digits that have been read so far and the digits that have been written so far. The set of possible states, shown in Figure [f-Q-points], shall be
We define the transducer itself as follows:
Note that in the second case, the transducer consumes three digits at once, so it does not strictly satisfy Definition 3.1, but it can still be used to unambiguously define an extended transducer . We just need to verify that the definition is correct in the sense of always producing a valid new state and valid output. The fact that we can always find in the second case follows immediately from Lemma [t-eps-odd-last-digit]. It remains to show that always ; this can be done by manually checking all finitely many inputs, a task made easier by the symmetry present.
Now we need to prove that the transducer actually turns any representation into the equivalent -NAF representation, starting with .
Lemma 3.2. Let be a representation in the extended Penney system. Assume that the transducer described above reads all digits of the representation. Then its output and the final state will satisfy
Proof. We shall prove the statement by induction. Given the empty representation, the transducer produces an empty representation and stays in the state
, which clearly satisfies the equation. Now assume that the statement is true for representations of length
. As the induction step, we will show that if the transducer consumes
digits and then performs one more step, the statement will still hold. Naturally, we are going to distinguish two cases based on the definition of
:
- If , the transducer is going to write and set its state to . We then have
- If , the transducer is going to read two more digits and , find such that , then output three digits and set its state to . We then have
Note. It is possible that the transducer will fail to read the whole representation, being left with one or two digits that fall into the second case, which requires three digits. In this case, we can simply prepend up to two zeros to finish the transformation.
Theorem 3.3. Let be a representation in the extended Penney system. Then it is possible to prepend finitely many zeros to the representation so that the transducer described above reads all digits, ends up with a final state and outputs the equivalent -NAF representation.
Proof. By
Lemma 3.2 and the note below it, we can prepend
to
zeros so that the transducer reads all digits, ends up in a state
and outputs a representation
satisfying
Notice that when the input to the transducer consists entirely of zeros, it is identical to the algorithm described in
Theorem [t-eps-naaf-uniq], finding the
-NAF representation of
. Therefore, after consuming the original input, we can input a few more zeros into it to get it into the zero state, causing it to output the
-NAF representation of
as
. We can substitute this into the above equation:
Therefore,
is an equivalent representation. Since the transducer always outputs non-zero digits with two zeros before them, this representation is also
-NAF.
Transducer as an oriented graph
We can naturally represent the transducer as an oriented graph , where vertices represent the possible states and edges represent transitions. An edge corresponding to reading a digit and outputting a zero will be labelled , one that reads three digits and outputs three digits will be labelled . Since the graph has many edges, a picture of it would be unreadable, but for illustration, Figure [f-G-edges-example] shows three nodes and two edges from this graph.
Definition 3.4. Let be a graph with vertices and edges . An oriented walk in of length is a sequence , where and .
Naturally, a transformation using the extended transducer corresponds to an oriented walk in the graph of the original transducer, where are the visited states and are the performed transitions. Notice that since we are reading and writing the representations from right to left, this walk is written in “reverse” order relative to its input and output.
We are now ready to use the graph for analyzing the optimality of representations. For this, we need to introduce the notion of the weight of an edge, which shall indicate how many non-zero digits the transducer removes when performing the corresponding transition.
Definition 3.5. Let be an edge of our graph with label . Then we define its weight as .
Definition 3.6. Let be an oriented walk in . Then its weight is defined as .
Note that some edges in have a negative weight, so we cannot immediately argue that no edge increases the Hamming weight and that the -NAF representation is therefore optimal. However, it is the case that if we make a complete walk starting and ending in the state , then the sum of all edges on the walk is non-negative, as can be easily proven:
Lemma 3.7. Let be a walk in our graph starting and ending in . Then .
Proof. We can use the Bellman-Ford algorithm [bellman-ford] to find the minimum-weight walk from to itself. The algorithm indicates that the minimum-weight walk has weight .
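Since the chapter's transducer graph is not reproduced here, the following sketch runs Bellman-Ford on a small hypothetical graph with a negative edge and checks the weight of the cheapest non-trivial closed walk through the distinguished vertex; the graph and vertex numbering are assumptions for illustration:

```python
def bellman_ford(n, edges, src):
    """Standard Bellman-Ford: shortest walk weights from `src` in a
    graph that may contain negative edges (but no negative cycles)."""
    INF = float("inf")
    dist = [INF] * n
    dist[src] = 0
    for _ in range(n - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist

# Toy graph (an assumption, not the chapter's transducer graph):
# vertex 0 plays the role of the zero state.
edges = [(0, 1, 1), (1, 2, -1), (2, 0, 0), (1, 0, 0)]
dist = bellman_ford(3, edges, 0)

# Minimum weight of a non-trivial closed walk through vertex 0:
# cheapest walk to the tail of each edge entering 0, plus that edge.
loop = min(dist[u] + w for u, v, w in edges if v == 0)
print(loop)   # non-negative: no closed walk through 0 has negative weight
```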
Theorem 3.8. Every -NAF representation in the extended Penney system is optimal.
Proof. Let
be another representation of the same number and
the walk taken in
when transforming
into its
-NAF representation using
. From
Theorem [t-eps-naaf-uniq], we know that the output is equal up to leading zeros to
. Using this and
Lemma 3.7, we have
Therefore, an arbitrary equivalent representation is at least as long as the
-NAF representation, which was to be proven.
Lemma 3.7 and the proof of Theorem 3.8 motivate the following definition and trivial lemma:
Definition 3.9. A walk in is optimal if .
Lemma 3.10. An extended Penney representation is optimal if and only if the walk in produced by using the transducer on is optimal.
Simplifying the graph
Although the graph has many edges, we can exploit the symmetries present in the problem in order to simplify it.
Lemma 3.11. Let
be an edge in the graph
of the transducer
described above and
. Then
and
- if the label of is , then the label of is , and its label is ,
- if the label of is , then the label of is , and its label is .
Proof. Notice that
.
- If is labelled , it means that . Then also and .
- If is labelled , it means that Then also and
The symmetry demonstrated in Lemma 3.11 allows us to introduce an equivalence relation on the set of states where if or for some . We can then group the states into equivalence classes:
Lemma 3.12. Let be a walk in our graph starting in and ending in . Let . Then there exists a walk of the same length and weight as which starts in and ends in some .
Proof. By definition of
, we can transform
into
by multiplication with a number from
and possibly complex conjugation. By
Lemma 3.11, there is an edge from
to
, where
. We can repeat this argument for all edges on
. It is also easy to check that the equivalent edges have the same weights.
Notice also that if a representation is optimal, then converting it to -NAF using the transducer does not increase its Hamming weight, therefore the weight of the corresponding walk in is zero. Such a walk shall be called an optimal walk.
Lemma 3.13. Let be an edge from to that is contained in some optimal walk . Then has the minimum weight out of all edges from to .
Proof. Assume for the sake of contradiction that there exists an edge
from
to
with
. Then we could replace
with
in
and obtain a walk from
to
with a negative weight. However, according to
Lemma 3.7, this is impossible.
These lemmas allow us to construct a much simpler graph that can still be used to analyze the optimality of representations. Its vertices will be the equivalence classes of . By Lemma 3.11, we can group edges in that differ only in symmetry into one edge in . By Lemma 3.12, the weights of edges and walks are still going to be unambiguously defined, so we can label the edges with the common weight of all corresponding original edges. Lemma 3.13 also allows us to discard edges that cannot be used in an optimal walk. The result is shown in Figure [f-Gamma].
If we use the Bellman-Ford algorithm on this new graph to calculate the minimum-weight walk between each pair of vertices, we will notice that some edges are not the minimum-weight walk between their start and end. Such edges cannot lie on any optimal walk, because if they did, we could replace them with the shorter walk, similarly to the proof of Lemma 3.13. Additionally, the vertices and have the property that the minimum-weight walk from to either of them and then back to has a positive weight. Therefore, no optimal walk can go through these vertices. If we remove the problematic vertices and edges, we get the graph depicted in Figure [f-Gammatilde], which has only vertices and edges.
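The edge-pruning step just described can be sketched generically: compute all-pairs minimum walk weights (Floyd-Warshall is used below in place of repeated Bellman-Ford runs) and discard every edge that is strictly heavier than the cheapest walk between its endpoints. The toy graph is an assumption, not the chapter's graph:

```python
def min_walk_weights(n, edges):
    """All-pairs minimum walk weights via Floyd-Warshall
    (assuming no negative cycles)."""
    INF = float("inf")
    dist = [[INF] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0
    for u, v, w in edges:
        dist[u][v] = min(dist[u][v], w)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

# Toy graph (an assumption, not the chapter's): the edge (0, 2) of
# weight 2 is beaten by the walk 0 -> 1 -> 2 of weight 1, so no
# minimum-weight walk can use it and it may be discarded.
edges = [(0, 1, 0), (1, 2, 1), (0, 2, 2)]
dist = min_walk_weights(3, edges)
redundant = [(u, v, w) for u, v, w in edges if w > dist[u][v]]
print(redundant)   # → [(0, 2, 2)]
```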
However, by collapsing the vertices of into equivalence classes, we have lost information about the specific outputs of the edges, since they are not identical for edges from a given equivalence class to a given equivalence class. We are going to need this information, so we shall introduce yet another graph , a subgraph of consisting only of the edges that are represented in . This graph, shown in Figure [f-Gtilde], is still quite small, with vertices and edges.
Lemma 3.14. All walks in , as well as , are optimal.
Proof. Consider first . The only edges that have a non-zero weight are the ones between and the other two vertices, with those from having and the one to having . Any walk that starts and ends in uses an equal number of edges to and from , so the sum of weights is . Since the edges in have the same weights as the corresponding edges in , this argument also applies to .
Theorem 3.15. Let . Then the number of optimal reduced representations of is equal to the number of walks in whose output is the -NAF representation of .
Proof. A direct consequence of
Lemma 3.10,
Lemma 3.14 and the fact that
is a subgraph of
containing all optimal walks.
Converting to a matrix problem
In Theorem 3.15, we have proven that the graph is a good tool for counting optimal extended Penney representations. We still need a way to count all the possible walks in . The standard graph-theoretic way to count walks is using adjacency matrices. However, we do not actually want to count all walks, just the ones with a specific output. We can achieve this by defining a separate adjacency matrix for each subgraph consisting only of edges with the same output.
First, we need to put the vertices in a specific order, represented by a tuple of the vertex labels:
We also assign each possible output of an edge to a single digit in the natural way:
Then, for each , we define the adjacency matrix like so:
The matrices and look like this:
The remaining three matrices can be expressed in terms of using the symmetries described in Lemma 3.11. To be specific, we define matrices as follows:
These matrices have the following properties, which can be verified by direct computation:
Due to these properties, the matrices form a group with elements:
The properties in the following lemma establish relationships between the adjacency matrices:
Lemma 3.16.
Proof. The relationships follow from
Lemma 3.11 and the way the graph and matrices are constructed.
Lemma 3.17. Let
and
. Then there exist
and
such that
and
.
Proof. If we express
, then by
Lemma 3.16,
.
- If , we choose an such that and . Then
- If , we choose an such that and . Then, using all the formulas in Lemma 3.16,
Lemma 3.18. Let be a sequence of digits. Then the number of walks in starting at and ending at such that the output label of each is is .
Proof. We shall prove the statement by induction. The case follows directly from the definition of the adjacency matrices. Assume that the statement is true for and let be some vertices. Then for each vertex , there are walks from to with edges and edges from to . Each walk from to with consists of one such walk followed by one such edge, where can be arbitrary. Therefore, the total number of such walks is
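The counting argument above can be sketched with toy per-output adjacency matrices; the matrices below are assumptions for illustration, not the chapter's actual matrices. The number of walks with a prescribed output sequence is read off an entry of the corresponding matrix product:

```python
import numpy as np

# Toy per-output adjacency matrices (an assumption, not the chapter's):
# entry (u, v) counts edges u -> v whose output label is d.
A = {
    0: np.array([[1, 1], [0, 1]]),
    1: np.array([[0, 1], [1, 0]]),
}

def count_walks(output):
    """Per (start, end) pair, the number of walks whose i-th edge has
    output label output[i]: the product of the per-digit matrices."""
    M = np.eye(2, dtype=int)
    for d in output:
        M = M @ A[d]
    return M

M = count_walks([0, 1, 0])
# M[u, v] counts walks from u to v producing the output 0, 1, 0
print(M)
```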
Theorem 3.19. Let be the -NAF extended Penney representation of . Then the number of optimal reduced extended Penney representations of is equal to , where is the first row unit vector.
At last, we have converted our problem to a matrix problem, making it easier to reason about. Our ultimate question is: Given a number , which -NAF representation with Hamming weight has the most equivalent representations? And how many such -NAF representations exist? We can formulate the first question using matrices as follows:
Definition 3.20. Let be row vectors. We define the relation as
Definition 3.21. Let be row vectors. We define the partial ordering as
We say that is majorized by . If also , we denote .
Definition 3.22. Let be row vectors. We define the relation as
We say that is strictly majorized by . If also , we denote .
Lemma 3.23. is an equivalence relation on
and
is a partial ordering on
.
Proof. We will show that is transitive; all other properties trivially follow from the definitions. Let be vectors such that . Then there exist matrices such that . Then also , with since is closed under multiplication.
Lemma 3.24. Let
be sequences of digits such that
and
. Then for any
we have
Proof. Let
. Since
for all
, the first component of all the vectors is positive:
. Let
be a matrix such that
. Then
Now it remains to show that
. By repeated application of
Lemma 3.17, there exist
such that
, with each
. Also,
because all matrices in
have
as the first column. Therefore, by definition of
,
Lemma 3.25. Let be digits such that . Let . Denote
Then either or .
Proof. Notice that , so for any vector , and also . Therefore, we only need to check vectors that do not contain or start with . In other words, . There are only finitely many such vectors for , so we can verify the lemma manually.
Definition 3.26. We define the recurrent sequence of integers
and the recurrent sequence of vectors
Note. Obviously, is a strictly increasing sequence.
Lemma 3.27. For each ,
Proof. Straightforward proof by induction.
Lemma 3.28. Let
. Then
- ,
- ,
- ,
Proof. - .
- By expressing each component in terms of the sequence (using Lemma 3.27), we obtain , with the inequality being strict in the first component.
- By expressing each component in terms of the sequence , we can verify and . Therefore, we just need to show that and . We shall show this by induction. For , the inequalities can be verified manually. Now assume that either of the inequalities is true for , and . By taking these three inequalities, multiplying the first two by and adding them together with the third one, we get the inequality for , which completes the induction step.
- If , then (due to the fact that ). If , then we can again express each component in terms of the sequence and verify for each individually.
Lemma 3.29. For each , .
Proof. From
Lemma 3.24 and
Lemma 3.25, it follows that we only need to consider products of the form
, since swapping out
for anything else with the same weight would not increase the result. We consider
because
already contains
non-zero digits. By
Lemma 3.16, we can rewrite
as a product of matrices from the set
. Denote
Let
be the maximum index such that
. That is,
. Clearly,
contains exactly
matrices. If
, then
, therefore
(by
Lemma 3.27). This shows a lower bound
. It remains to show that if
, then this bound is not exceeded, that is,
. Due to how we chose
,
cannot be of the form
, because then we could have chosen a higher
. We shall consider several different cases, one of which has to happen:
-
Then (by Lemma 3.16 and Lemma 3.28). Therefore, the maximum cannot be reached by Lemma 3.24.
-
Then (by Lemma 3.16 and Lemma 3.28). Therefore, the maximum cannot be reached by Lemma 3.24.
-
It can be verified that is component-wise less-or-equal to , therefore , so this is reduced to the previous case.
-
This implies that and . For , we can manually check that . For , we can express in terms of using Lemma 3.27, then compute that . It remains to prove that this is less than .
-
We assumed that , so B contains at least one matrix. That is, there exists a and such that (making use of Lemma 3.16 to separate the matrices and matrices). From Lemma 3.28, and therefore, by Lemma 3.24, it cannot start a product reaching the maximum.
Theorem 3.30. Let . Then each -NAF representation in the extended Penney system with exactly non-zero digits has at most equivalent representations, and there are exactly such -NAF representations which have exactly equivalent representations and do not end in .
Proof. For
, the second statement follows from
Lemma 3.25: The only vectors corresponding to
-NAF representations with the maximum number of equivalent representations are
with
, giving a total of
vectors, which are all distinct. The maximum number of equivalent representations can then be calculated manually, such as by using a non-deterministic variant of the modular algorithm. For
,
Lemma 3.29 immediately gives the first statement. Its proof also shows that the vector of any representation with exactly
equivalent representations is of the form
. Since we are only counting representations that do not end in
, it follows that
consists only of
matrices, therefore
. By definition, there are
such vectors, which are distinct due to
Lemma 3.27.