Representation of complex numbers in redundant numeration systems | Counting maximum optimal representations

We have seen in Theorem [t-naf-optimal] that with β = 2, 𝒟 = {0, ±1}, the 2-NAF representation of a given number is always optimal. However, it might not be strictly optimal – that is, there might exist other representations of the same number with equal Hamming weight. For certain applications, it is important to know the number of such optimal representations. In [num-binary-signed-reprs], it is shown that a binary NAF representation (x_n, …, x_0) has at most F_{W(x_n, …, x_0)} equivalent reduced optimal representations (including itself), where (F_n)_{n=0}^∞ denotes the Fibonacci sequence defined by the recurrence F_0 = 0, F_1 = 1, F_{n+2} = F_n + F_{n+1}. The proof makes use of a transducer that converts any (2, {0, ±1})-representation into the equivalent NAF representation. In this chapter, we are going to prove a similar statement about 3-NAF extended Penney representations of Gaussian integers by constructing an analogous transducer.
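Before constructing the complex analogue, it may help to see the binary situation concretely. The following sketch (our own illustration, not code from [num-binary-signed-reprs]) brute-forces all {0, ±1} digit strings of a fixed length, keeps the minimal-weight representations of a number, and computes the 2-NAF by the standard right-to-left rule:

```python
from itertools import product

def value(digits):
    """Value of a {0, ±1} digit string in base 2, most significant digit first."""
    v = 0
    for d in digits:
        v = 2 * v + d
    return v

def weight(digits):
    """Hamming weight: the number of non-zero digits."""
    return sum(1 for d in digits if d != 0)

def optimal_representations(n, length=8):
    """All digit strings of the given length representing n with minimal weight."""
    reps = [d for d in product((0, 1, -1), repeat=length) if value(d) == n]
    w = min(weight(d) for d in reps)
    return [d for d in reps if weight(d) == w]

def naf(n):
    """2-NAF of n >= 1, computed right to left: an odd number gets the digit
    (1 or -1) that makes the remainder divisible by 4."""
    digits = []
    while n != 0:
        if n % 2 == 0:
            digits.append(0)
        else:
            digits.append(1 if n % 4 == 1 else -1)
        n = (n - digits[-1]) // 2
    return tuple(reversed(digits))

print(naf(7), len(optimal_representations(7)))   # 7 = 8 - 1 is the only optimal form
print(naf(3), len(optimal_representations(3)))   # 3 = 4 - 1 = 2 + 1 has two
```

For 7 the NAF (1, 0, 0, −1) is the unique optimal representation, while 3 already has two optimal ones, so the count genuinely varies with the number.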

We are going to work with the extended Penney system, so we shall use the symbol β_P ≔ i − 1.

Transducer

Definition 3.1. A transducer is a function δ: Q × 𝒟 → Q × 𝒟*, where Q is its set of states, 𝒟 is a set of digits and 𝒟* is the set of finite sequences of digits.

Given a transducer, a sequence (x_n, …, x_0) ∈ 𝒟* and an initial state q_0 ∈ Q, we can use the transducer to transform the sequence as follows:

(q_1, η_0) ≔ δ(q_0, x_0), (q_2, η_1) ≔ δ(q_1, x_1), …, (q_{n+1}, η_n) ≔ δ(q_n, x_n).

Intuitively, we can think of the transducer as a machine that “consumes” the input from the right one digit at a time and, based on the digit and its internal state, outputs some digits and changes its internal state. In each step, we obtain a sequence η_k which forms a part of the output. By concatenating them, we get the result of the transformation: η ≔ (η_n, …, η_0). We can define an extended version of the transducer δ*: Q × 𝒟* → Q × 𝒟* which performs all the steps at once: δ*(q_0, (x_n, …, x_0)) ≔ (q_{n+1}, η).

The state of our transducer is going to represent the difference between the numbers represented by the digits that have been read so far and the digits that have been written so far. The set of possible states, shown in Figure [f-Q-points], shall be

Q ≔ {0, ±1, ±i, ±1±i, ±2, ±2i, ±1±2i, ±2±i}.

We define the transducer itself as follows:

δ(q, x) ≔ (q′, 0) if β_P | (x + q), where q′ ≔ (x + q)/β_P,
δ(q, (z, y, x)) ≔ (q′, (0, 0, r)) if β_P ∤ (x + q), where (zyx)_{β_P} + q = q′ β_P³ + r, r ∈ {±1, ±i}.

Note that in the second case, the transducer consumes three digits at once, so it does not strictly satisfy Definition 3.1, but it can still be used to unambiguously define an extended transducer δ*. We just need to verify that the definition is correct in terms of always producing a valid new state and valid output. The fact that we can always find q′ and r in the second case follows immediately from Lemma [t-eps-odd-last-digit]. It remains to show that always q′ ∈ Q, which can be done by manually checking all finitely many inputs (which is made easier by the symmetry present).
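This finite check is easy to mechanize. The following sketch (our own, assuming the digit set 𝒟 = {0, ±1, ±i} of the extended Penney system) enumerates every possible transition of δ and verifies both that the output digit r exists uniquely and that the new state lies in Q:

```python
from itertools import product

BETA = -1 + 1j                        # the base β_P = i − 1
DIGITS = (0, 1, -1, 1j, -1j)          # assumed digit set 𝒟
OUTPUTS = (1, -1, 1j, -1j)            # possible non-zero output digits r
# the 21 states of Q: exactly the Gaussian integers of norm at most 5
Q = {complex(a, b) for a in range(-2, 3) for b in range(-2, 3) if a*a + b*b <= 5}

def is_gaussian(z, eps=1e-9):
    """True if z is (numerically) a Gaussian integer."""
    return abs(z.real - round(z.real)) < eps and abs(z.imag - round(z.imag)) < eps

def rounded(z):
    return complex(round(z.real), round(z.imag))

transitions = 0
for q, x in product(Q, DIGITS):
    if is_gaussian((x + q) / BETA):               # first case: β_P | (x + q)
        assert rounded((x + q) / BETA) in Q
        transitions += 1
    else:                                         # second case: consume (z, y, x)
        for z, y in product(DIGITS, repeat=2):
            v = z * BETA**2 + y * BETA + x + q    # (zyx)_{β_P} + q
            rs = [r for r in OUTPUTS if is_gaussian((v - r) / BETA**3)]
            assert len(rs) == 1                   # r exists and is unique
            assert rounded((v - rs[0]) / BETA**3) in Q
            transitions += 1
print(len(Q), "states,", transitions, "transitions: all valid")
```

The divisions are exact in floating point here (all intermediate components are small multiples of 1/8), so the integrality test is reliable.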

Now we need to prove that the transducer δ actually turns any representation into the equivalent 3-NAF representation, starting with q_0 = 0.

Lemma 3.2. Let (x_n, …, x_0) be a representation in the extended Penney system. Assume that the transducer δ* described above reads all digits of the representation. Then its output (y_n, …, y_0) and the final state q_{n+1} will satisfy (x_n … x_0)_{β_P} = q_{n+1} β_P^{n+1} + (y_n … y_0)_{β_P}.
Proof. We shall prove the statement by induction. Given the empty representation, the transducer produces an empty representation and stays in the state q_0 = 0, which clearly satisfies the equation. Now assume that the statement is true for representations of length n. As the induction step, we will show that if the transducer consumes n digits and then performs one more step, the statement will still hold. Naturally, we are going to distinguish two cases based on the definition of δ:
Note. It is possible that the transducer will fail to read the whole representation, being left with one or two digits that fall into the second case, which requires three digits. In this case, we can simply prepend up to two zeros to finish the transformation.
Theorem 3.3. Let (x_n, …, x_0) be a representation in the extended Penney system. Then it is possible to prepend finitely many zeros to the representation so that the transducer δ* described above reads all digits, ends up in the final state q_{l+1} = 0 and outputs the equivalent 3-NAF representation.
Proof. By Lemma 3.2 and the note below it, we can prepend 0 to 2 zeros so that the transducer reads all digits, ends up in a state q_{m+1} and outputs a representation (y_m, …, y_0) satisfying (x_m … x_0)_{β_P} = q_{m+1} β_P^{m+1} + (y_m … y_0)_{β_P}. Notice that when the input to the transducer consists entirely of zeros, it is identical to the algorithm described in Theorem [t-eps-naaf-uniq], finding the 3-NAF representation of the current state q. Therefore, after consuming the original input, we can input a few more zeros into it to get it into the zero state, causing it to output the 3-NAF representation of q_{m+1} as (y_l, …, y_{m+1}). We can substitute this into the above equation: (x_m … x_0)_{β_P} = (y_l … y_{m+1})_{β_P} β_P^{m+1} + (y_m … y_0)_{β_P} = (y_l … y_0)_{β_P}. Therefore, (y_l, …, y_0) is an equivalent representation. Since the transducer always outputs non-zero digits with two zeros before them, this representation is also 3-NAF.
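The theorem translates directly into an algorithm. The sketch below (an illustration under the same assumptions as before, not the author's implementation) runs δ* from the right, feeds extra zeros until the state returns to 0, and checks on an example that the output is an equivalent 3-NAF representation:

```python
BETA = -1 + 1j                        # β_P = i − 1
OUTPUTS = (1, -1, 1j, -1j)

def is_gaussian(z, eps=1e-9):
    return abs(z.real - round(z.real)) < eps and abs(z.imag - round(z.imag)) < eps

def val(digits):
    """Value of a representation in base β_P, most significant digit first."""
    v = 0
    for d in digits:
        v = v * BETA + d
    return v

def to_3naf(digits):
    """Feed a (β_P, {0, ±1, ±i})-representation to δ*, prepending zeros
    until the state returns to 0; returns the 3-NAF output."""
    q, out, xs = 0, [], list(digits)
    for _ in range(4 * len(digits) + 64):     # crude termination guard
        if not xs and q == 0:
            break
        if not xs:
            xs = [0]                          # prepend a zero
        x = xs.pop()
        if is_gaussian((x + q) / BETA):       # first case: output a single 0
            q = (x + q) / BETA
            out.append(0)
        else:                                 # second case: consume three digits
            while len(xs) < 2:
                xs.insert(0, 0)
            y, z = xs.pop(), xs.pop()
            v = z * BETA**2 + y * BETA + x + q
            r = next(r for r in OUTPUTS if is_gaussian((v - r) / BETA**3))
            q = (v - r) / BETA**3
            out.extend([r, 0, 0])             # least significant digit first
    assert q == 0, "transducer did not reach the zero state"
    return list(reversed(out))

rep = [1, 1j, -1, 1]                          # an arbitrary representation
naf3 = to_3naf(rep)
assert val(naf3) == val(rep)                  # same Gaussian integer
assert all(sum(1 for d in naf3[k:k+3] if d) <= 1 for k in range(len(naf3)))
print(rep, "->", naf3)
```

The two asserts at the end check exactly the two claims of the theorem: the value is preserved and every window of three digits contains at most one non-zero digit.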

Transducer as an oriented graph

We can naturally represent the transducer δ as an oriented graph G, where vertices represent the possible states Q and edges represent transitions. An edge corresponding to reading a digit x and outputting a zero will be labelled x|0, one that reads three digits (z,y,x) and outputs three digits (0,0,r) will be labelled zyx|00r. Since the graph has many edges, a picture of it would be unreadable, but for illustration, Figure [f-G-edges-example] shows three nodes and two edges from this graph.

Definition 3.4. Let G be a graph with vertices Q and edges E. An oriented walk in G of length l is a sequence (q_0, e_1, q_1, e_2, …, e_l, q_l), where q_0, …, q_l ∈ Q and ∀k ∈ {1, …, l}: e_k = (q_{k−1}, q_k) ∈ E.

Naturally, a transformation using the extended transducer δ* corresponds to an oriented walk in the graph G of the original transducer, where q0,,ql are the visited states and e1,,el are the performed transformations. Notice that since we are reading and writing the representations from right to left, this walk will be written in the “reverse” order in contrast with its input and output.

We are now ready to use the graph for analyzing the optimality of representations. For this, we need to introduce the notion of the weight of an edge, which shall indicate how many non-zero digits the transducer removes when performing the corresponding transition.

Definition 3.5. Let e be an edge of our graph G with label x_m … x_0 | y_m … y_0. Then we define its weight as W(e) ≔ W(x_m, …, x_0) − W(y_m, …, y_0).
Definition 3.6. Let P = (q_0, e_1, …, e_l, q_l) be an oriented walk in G. Then its weight is defined as W(P) ≔ Σ_{k=1}^{l} W(e_k).

Note that some edges in G have a negative weight, so we cannot simply say that no edge increases the Hamming weight and therefore the 3-NAF representation is optimal. However, it is the case that if we make a complete walk starting and ending in the state 0, then the sum of the weights of all edges on the walk is non-negative, as can be easily proven:

Lemma 3.7. Let P be a walk in our graph G starting and ending in 0. Then W(P) ≥ 0.
Proof. We can use the Bellman–Ford algorithm [bellman-ford] to find the minimum-weight walk from 0 to itself. The algorithm shows that this minimum weight is 0.
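This computation can be reproduced end to end. The sketch below (our own reconstruction, with the digit set {0, ±1, ±i} assumed) rebuilds the full edge set of G from the definition of δ, runs a plain edge-list Bellman–Ford from the state 0, and checks both the absence of negative cycles and the claimed minimum:

```python
from itertools import product

BETA = -1 + 1j
DIGITS = (0, 1, -1, 1j, -1j)
OUTPUTS = (1, -1, 1j, -1j)
Q = [complex(a, b) for a in range(-2, 3) for b in range(-2, 3) if a*a + b*b <= 5]

def is_gaussian(z, eps=1e-9):
    return abs(z.real - round(z.real)) < eps and abs(z.imag - round(z.imag)) < eps

def rounded(z):
    return complex(round(z.real), round(z.imag))

# build the edge list of G with weights W(e) = W(input) − W(output)
edges = []
for q, x in product(Q, DIGITS):
    if is_gaussian((x + q) / BETA):               # label x|0
        edges.append((q, rounded((x + q) / BETA), 1 if x else 0))
    else:                                         # label zyx|00r
        for z, y in product(DIGITS, repeat=2):
            v = z * BETA**2 + y * BETA + x + q
            r = next(r for r in OUTPUTS if is_gaussian((v - r) / BETA**3))
            w = sum(1 for d in (z, y, x) if d) - 1
            edges.append((q, rounded((v - r) / BETA**3), w))

# Bellman–Ford from state 0 (negative edge weights are present)
INF = float("inf")
dist = {q: INF for q in Q}
dist[0] = 0
for _ in range(len(Q) - 1):
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            dist[v] = dist[u] + w
# no relaxation is possible any more: no negative cycle reachable from 0
assert all(dist[u] + w >= dist[v] for u, v, w in edges)
# the cheapest non-empty walk from 0 back to 0 has weight exactly 0
assert min(dist[u] + w for u, v, w in edges if v == 0) == 0
print("minimum-weight closed walk through 0 has weight 0")
```

The final assertion is exactly the statement of the lemma: no closed walk through 0 is negative, and weight 0 is attained (for instance by the self-loop 0|0 at the state 0).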
Theorem 3.8. Every 3-NAF representation (xn,,x0) in the extended Penney system is optimal.
Proof. Let (y_m, …, y_0) be another representation of the same number and (q_0, e_1, …, e_l, q_l) the walk taken in G when transforming (y_m, …, y_0) into its 3-NAF representation using δ*. From Theorem [t-eps-naaf-uniq], we know that the output is equal up to leading zeros to (x_n, …, x_0). Using this and Lemma 3.7, we have W(x_n, …, x_0) − W(y_m, …, y_0) = −Σ_{k=1}^{l} W(e_k) ≤ 0. Therefore, an arbitrary equivalent representation has at least as large a Hamming weight as the 3-NAF representation, which was to be proven.

Lemma 3.7 and the proof of Theorem 3.8 motivate the following definition and trivial lemma:

Definition 3.9. A walk P in G is optimal if W(P)=0.
Lemma 3.10. An extended Penney representation (x_n, …, x_0) is optimal if and only if the walk in G produced by using the transducer δ* on (x_n, …, x_0) is optimal.

Simplifying the graph

Although the graph G has many edges, we can exploit the symmetries present in the problem in order to simplify it.

Lemma 3.11. Let e = (q, q′) ∈ E be an edge in the graph G of the transducer δ described above and d ∈ {±1, ±i}. Then de ≔ (dq, dq′) ∈ E, where the label of de is obtained from the label of e by multiplying every digit by d, and W(de) = W(e). Similarly, conjugating every digit of the label of e yields an edge ē ∈ E leading from q̄ to a state equivalent to q̄′ (in the sense defined below), again with W(ē) = W(e).
Proof. Notice that β̄_P = iβ_P, β̄_P² = −β_P² and β̄_P³ = −iβ_P³.

The symmetry demonstrated in Lemma 3.11 allows us to introduce an equivalence relation ∼ on the set of states Q, where q_1 ∼ q_2 if q_1 = d q_2 or q_1 = d q̄_2 for some d ∈ {±1, ±i}. We can then group the states into equivalence classes:

[0] = {0}, [1] = {±1, ±i}, [i+1] = {±1±i}, [2] = {±2, ±2i}, [i+2] = {±2±i, ±1±2i}
Lemma 3.12. Let P be a walk in our graph G starting in q_0 and ending in q_l. Let q_0′ ∈ Q, q_0′ ∼ q_0. Then there exists a walk P′ of the same length and weight as P which starts in q_0′ and ends in some q_l′ ∈ Q, q_l′ ∼ q_l.
Proof. By definition of ∼, we can transform q_0 into q_0′ by multiplication with a number from {±1, ±i} and possibly complex conjugation. By Lemma 3.11, there is then an edge from q_0′ to some q_1′, where q_1′ ∼ q_1. We can repeat this argument for all edges on P. It is also easy to check that the equivalent edges have the same weights.

Notice also that if a representation is optimal, then converting it to 3-NAF using the transducer cannot decrease its Hamming weight (an equivalent representation cannot weigh less than an optimal one), and by Lemma 3.7 it cannot increase it either; therefore the weight of the corresponding walk in G is zero – that is, the walk is optimal.

Lemma 3.13. Let e be an edge from q to q′ that is contained in some optimal walk P. Then e has the minimum weight out of all edges from q to q′.
Proof. Assume for the sake of contradiction that there exists an edge e′ from q to q′ with W(e′) < W(e). Then we could replace e with e′ in P and obtain a walk from 0 to 0 with a negative weight. However, according to Lemma 3.7, this is impossible.

These lemmas allow us to construct a much simpler graph Γ that can still be used to analyze the optimality of representations. Its vertices will be the equivalence classes of Q. By Lemma 3.11, we can group edges in G that differ only in symmetry into one edge in Γ. By Lemma 3.12, the weights of edges and walks are still going to be unambiguously defined, so we can label the edges with the common weight of all corresponding original edges. Lemma 3.13 also allows us to discard edges that cannot be used in an optimal walk. The result is shown in Figure [f-Gamma].

If we use the Bellman–Ford algorithm on this new graph to calculate the minimum-weight walk between each pair of vertices, we will notice that some edges are not the minimum-weight walk between their start and end. Such edges cannot lie on any optimal walk, because if they did, we could replace them with the minimum-weight walk, similarly to the proof of Lemma 3.13. Additionally, the vertices [2] and [i+2] have the property that the minimum-weight walk from 0 to either of them and then back to 0 has a positive weight. Therefore, no optimal walk can go through these vertices. If we remove the problematic vertices and edges, we get the graph Γ̃ depicted in Figure [f-Gammatilde], which has only 3 vertices and 7 edges.

However, by collapsing the vertices of G into equivalence classes, we have lost information about the specific outputs of the edges, since these are not identical for all edges between a given pair of equivalence classes. We are going to need this information, so we shall introduce yet another graph G̃, a subgraph of G consisting only of the edges that are represented in Γ̃. This graph, shown in Figure [f-Gtilde], is still quite small, with 9 vertices and 61 edges.

Lemma 3.14. All walks in Γ̃ starting and ending in [0], as well as all walks in G̃ starting and ending in 0, are optimal.
Proof. Consider first Γ̃. The only edges that have a non-zero weight are the ones between [0] and the other two vertices, with those leaving [0] having weight +1 and the one entering [0] having weight −1. Any walk that starts and ends in [0] uses an equal number of edges leaving and entering [0], so the sum of the weights is 0. Since the edges in G̃ have the same weights as the corresponding edges in Γ̃, this argument also applies to G̃.
Theorem 3.15. Let x ∈ ℤ[i]. Then the number of optimal reduced representations of x is equal to the number of walks in G̃ whose output is the 3-NAF representation of x.
Proof. A direct consequence of Lemma 3.10, Lemma 3.14 and the fact that G̃ is a subgraph of G containing all optimal walks.

Converting to a matrix problem

In Theorem 3.15, we have proven that the graph G̃ is a good tool for counting optimal extended Penney representations. We still need a way to count all the possible walks in G̃. The standard graph-theoretic way to count walks is using adjacency matrices. However, we do not actually want to count all walks, just the ones with a specific output. We can achieve this by defining a separate adjacency matrix for each subgraph consisting only of edges with the same output.

First, we need to put the vertices in a specific order, represented by a tuple of the vertex labels:

V ≔ (0, 1, i, −1, −i, 1+i, −1+i, −1−i, 1−i)

We also assign each possible output of an edge to a single digit in the natural way:

O_0 ≔ (0), O_d ≔ (0, 0, d), d ∈ {±1, ±i}

Then, for each d ∈ {0, ±1, ±i}, we define the adjacency matrix A_d ∈ ℕ_0^{9×9} like so:

(A_d)_{i,j} ≔ the number of edges in G̃ from V_i to V_j whose output is O_d.

The matrices A_0 and A_1 look like this:

A_0 =
( 1 0 0 0 0 0 0 0 0 )
( 0 0 0 0 0 0 1 0 0 )
( 0 0 0 0 0 0 0 1 0 )
( 0 0 0 0 0 0 0 0 1 )
( 0 0 0 0 0 1 0 0 0 )
( 0 0 0 0 0 0 0 0 0 )
( 0 0 0 0 0 0 0 0 0 )
( 0 0 0 0 0 0 0 0 0 )
( 0 0 0 0 0 0 0 0 0 )

A_1 =
( 1 1 0 0 0 0 0 0 0 )
( 0 0 0 0 0 0 0 0 0 )
( 2 0 1 1 0 0 1 0 0 )
( 2 0 0 1 1 0 0 1 0 )
( 0 0 0 0 0 0 0 0 0 )
( 0 0 0 0 0 0 0 0 0 )
( 1 0 0 1 0 0 0 0 0 )
( 0 0 0 0 0 0 0 0 0 )
( 0 0 0 0 0 0 0 0 0 )

The remaining three matrices A_i, A_{−1}, A_{−i} can be expressed in terms of A_1 using the symmetries described in Lemma 3.11. To be specific, we define matrices R, C ∈ ℕ_0^{9×9} as follows:

R ≔
( 1 0 0 0 0 0 0 0 0 )
( 0 0 1 0 0 0 0 0 0 )
( 0 0 0 1 0 0 0 0 0 )
( 0 0 0 0 1 0 0 0 0 )
( 0 1 0 0 0 0 0 0 0 )
( 0 0 0 0 0 0 1 0 0 )
( 0 0 0 0 0 0 0 1 0 )
( 0 0 0 0 0 0 0 0 1 )
( 0 0 0 0 0 1 0 0 0 )

C ≔
( 1 0 0 0 0 0 0 0 0 )
( 0 1 0 0 0 0 0 0 0 )
( 0 0 0 0 1 0 0 0 0 )
( 0 0 0 1 0 0 0 0 0 )
( 0 0 1 0 0 0 0 0 0 )
( 0 0 0 0 0 0 0 0 1 )
( 0 0 0 0 0 0 0 1 0 )
( 0 0 0 0 0 0 1 0 0 )
( 0 0 0 0 0 1 0 0 0 )

These matrices have the following properties, which can be verified by direct computation:

C = Cᵀ, Rᵀ = R⁻¹, C = C⁻¹, R⁴ = I, RC = CRᵀ.

Due to these properties, the matrices form a group with 8 elements:

𝒫 ≔ {I, R, R², R³, C, CR, CR², CR³}

The properties in the following lemma establish relationships between the adjacency matrices:

Lemma 3.16. ∀d ∈ {0, ±1, ±i}: A_{id} = R⁻¹ A_d R; ∀d ∈ {±1, ±i}: A_{d̄} = RC A_d C; A_0 = CR A_0 C.
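The properties of R and C and the relations of Lemma 3.16 that involve only the matrices printed above can be checked mechanically. A minimal sketch in pure Python (our own; `mul` is naive 9×9 matrix multiplication):

```python
def mul(a, b):
    """Naive 9×9 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(9)) for j in range(9)]
            for i in range(9)]

def T(a):
    return [[a[j][i] for j in range(9)] for i in range(9)]

I = [[int(i == j) for j in range(9)] for i in range(9)]
Z = [0] * 9

A0 = [[1,0,0,0,0,0,0,0,0],
      [0,0,0,0,0,0,1,0,0],
      [0,0,0,0,0,0,0,1,0],
      [0,0,0,0,0,0,0,0,1],
      [0,0,0,0,0,1,0,0,0],
      Z[:], Z[:], Z[:], Z[:]]
A1 = [[1,1,0,0,0,0,0,0,0],
      Z[:],
      [2,0,1,1,0,0,1,0,0],
      [2,0,0,1,1,0,0,1,0],
      Z[:], Z[:],
      [1,0,0,1,0,0,0,0,0],
      Z[:], Z[:]]
R = [[1,0,0,0,0,0,0,0,0],
     [0,0,1,0,0,0,0,0,0],
     [0,0,0,1,0,0,0,0,0],
     [0,0,0,0,1,0,0,0,0],
     [0,1,0,0,0,0,0,0,0],
     [0,0,0,0,0,0,1,0,0],
     [0,0,0,0,0,0,0,1,0],
     [0,0,0,0,0,0,0,0,1],
     [0,0,0,0,0,1,0,0,0]]
C = [[1,0,0,0,0,0,0,0,0],
     [0,1,0,0,0,0,0,0,0],
     [0,0,0,0,1,0,0,0,0],
     [0,0,0,1,0,0,0,0,0],
     [0,0,1,0,0,0,0,0,0],
     [0,0,0,0,0,0,0,0,1],
     [0,0,0,0,0,0,0,1,0],
     [0,0,0,0,0,0,1,0,0],
     [0,0,0,0,0,1,0,0,0]]

assert C == T(C) and mul(C, C) == I            # C = Cᵀ = C⁻¹
assert mul(R, T(R)) == I                       # Rᵀ = R⁻¹
assert mul(mul(R, R), mul(R, R)) == I          # R⁴ = I
assert mul(R, C) == mul(C, T(R))               # RC = CRᵀ
assert mul(mul(C, R), mul(A0, C)) == A0        # A₀ = CRA₀C
assert mul(mul(R, C), mul(A1, C)) == A1        # conjugation relation for d = 1
assert mul(R, A0) == mul(A0, R)                # A₀ = R⁻¹A₀R (the d = 0 case)
print("all symmetry relations hold")
```

All seven assertions pass for the matrices as printed, which is a useful cross-check on both the matrices and the lemma.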
Lemma 3.17. Let d ∈ {±1, ±i} and P ∈ 𝒫. Then there exist h ∈ {±1, ±i} and S, T ∈ 𝒫 such that P A_d = A_h S and P A_0 = A_0 T.
Lemma 3.18. Let (d_1, …, d_l) be a sequence of digits. Then the number of walks (q_0, e_1, …, e_l, q_l) in G̃ starting at V_i and ending at V_j such that the output label of each e_k is O_{d_k} is (A_{d_1} ⋯ A_{d_l})_{i,j}.
Proof. We shall prove the statement by induction. The case l = 1 follows directly from the definition of the adjacency matrices. Assume that the statement is true for l and let V_i, V_j be some vertices. Then for each vertex V_k, there are (A_{d_1} ⋯ A_{d_l})_{i,k} walks from V_i to V_k with l edges and (A_{d_{l+1}})_{k,j} edges from V_k to V_j. Each walk from V_i to V_j with l + 1 edges consists of one said walk and one said edge, where V_k can be arbitrary. Therefore, the total number of such walks is Σ_{k=1}^{9} (A_{d_1} ⋯ A_{d_l})_{i,k} (A_{d_{l+1}})_{k,j} = (A_{d_1} ⋯ A_{d_{l+1}})_{i,j}.
Theorem 3.19. Let (O_{d_n}, …, O_{d_1}) be the 3-NAF extended Penney representation of x ∈ ℤ[i]. Then the number of optimal reduced extended Penney representations of x is equal to (A_{d_1} ⋯ A_{d_n})_{1,1} = e A_{d_1} ⋯ A_{d_n} eᵀ, where e ≔ (1, 0, …, 0) is the first row of the identity matrix.
Proof. Follows directly from Theorem 3.15 and Lemma 3.18.

At last, we have converted our problem to a matrix problem, making it easier to reason about. Our ultimate question is: Given a number N, which 3-NAF representation with Hamming weight N has the most equivalent representations? And how many such 3-NAF representations exist? We can formulate the first question using matrices as follows:

M(N) ≔ max{ e A_{d_1} ⋯ A_{d_n} eᵀ | n ∈ ℕ, d_k ∈ {0, ±1, ±i}, Σ_{k=1}^{n} |d_k| = N } = ?
Definition 3.20. Let u, v ∈ ℝ⁹ be row vectors. We define the relation ≤ as u ≤ v ⟺ ∀i ∈ {1, …, 9}: u_i ≤ v_i.
Definition 3.21. Let u, v ∈ ℝ⁹ be row vectors. We define the partial ordering ≼ as u ≼ v ⟺ ∃P ∈ 𝒫: uP ≤ v. We say that u is majorized by v. If also v ≼ u, we denote u ≈ v.
Definition 3.22. Let u, v ∈ ℝ⁹ be row vectors. We define the relation ≺ as u ≺ v ⟺ (u ≼ v ∧ u_1 < v_1). We say that u is strictly majorized by v.
Lemma 3.23. ≈ is an equivalence relation on ℝ⁹ and ≼ is a partial ordering on ℝ⁹/≈.
Lemma 3.24. Let d_1, …, d_l, f_1, …, f_m ∈ {0, ±1, ±i} be sequences of digits such that Σ_{k=1}^{l} |d_k| = Σ_{k=1}^{m} |f_k| and e A_{d_1} ⋯ A_{d_l} ≺ e A_{f_1} ⋯ A_{f_m}. Then for any d_{l+1}, …, d_n ∈ {0, ±1, ±i} we have e A_{d_1} ⋯ A_{d_n} eᵀ < M(N), where N ≔ Σ_{k=1}^{n} |d_k|.
Lemma 3.25. Let d_1, …, d_n be digits such that N ≔ Σ_{k=1}^{n} |d_k| ∈ {2, 3, 4}. Let u ≔ e A_{d_1} ⋯ A_{d_n}. Denote
B_2 ≔ A_1 A_{−1}, v_2 ≔ e B_2 = (3, 1, 1, 1, 0, 1, 0, 0, 0),
B_3 ≔ A_1 A_{−1} A_{−i}, v_3 ≔ e B_3 = (8, 1, 3, 1, 3, 1, 1, 0, 0),
B_4 ≔ A_1 A_{−1} A_{−i} A_{−i}, v_4 ≔ e B_4 = (17, 1, 5, 3, 8, 1, 3, 0, 0).
Then either u ≺ v_N, or u ≈ v_N, n = N and there exist P, S ∈ 𝒫 such that Sᵀ B_N P = A_{d_1} ⋯ A_{d_n}.
Proof. Notice that A_0² = eᵀe, so w A_0² = (w eᵀ) e for any vector w, and also e A_0 = e. Therefore, we only need to check vectors u whose products neither contain A_0² nor start with A_0; in other words, u = e A_{f_1} A_0^{l_1} A_{f_2} ⋯ A_{f_N} A_0^{l_N}, f_k ∈ {±1, ±i}, l_k ∈ {0, 1}. There are only finitely many such vectors for N ∈ {2, 3, 4}, so we can verify the lemma manually.
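The vectors v₂, v₃, v₄ can be spot-checked numerically. The sketch below (our own; A₋₁ and A₋ᵢ are derived from A₁ via Lemma 3.16, using R⁻¹ = R³) recomputes them from the printed matrices:

```python
def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(9)) for j in range(9)]
            for i in range(9)]

def vmul(v, m):
    return [sum(v[k] * m[k][j] for k in range(9)) for j in range(9)]

Z = [0] * 9
A1 = [[1,1,0,0,0,0,0,0,0], Z[:],
      [2,0,1,1,0,0,1,0,0],
      [2,0,0,1,1,0,0,1,0],
      Z[:], Z[:],
      [1,0,0,1,0,0,0,0,0],
      Z[:], Z[:]]
R = [[1,0,0,0,0,0,0,0,0],
     [0,0,1,0,0,0,0,0,0],
     [0,0,0,1,0,0,0,0,0],
     [0,0,0,0,1,0,0,0,0],
     [0,1,0,0,0,0,0,0,0],
     [0,0,0,0,0,0,1,0,0],
     [0,0,0,0,0,0,0,1,0],
     [0,0,0,0,0,0,0,0,1],
     [0,0,0,0,0,1,0,0,0]]
R2, R3 = mul(R, R), mul(mul(R, R), R)
Am1 = mul(mul(R2, A1), R2)         # A₋₁ = R⁻²A₁R², with R⁻² = R²
Ami = mul(mul(R, A1), R3)          # A₋ᵢ = R⁻³A₁R³, with R⁻³ = R

e = [1, 0, 0, 0, 0, 0, 0, 0, 0]
v2 = vmul(vmul(e, A1), Am1)        # e·B₂ = e·A₁A₋₁
v3 = vmul(v2, Ami)                 # e·B₃ = e·A₁A₋₁A₋ᵢ
v4 = vmul(v3, Ami)                 # e·B₄ = e·A₁A₋₁A₋ᵢA₋ᵢ
assert v2 == [3, 1, 1, 1, 0, 1, 0, 0, 0]
assert v3 == [8, 1, 3, 1, 3, 1, 1, 0, 0]
assert v4 == [17, 1, 5, 3, 8, 1, 3, 0, 0]
print(v2, v3, v4)
```

All three vectors agree with the values stated in Lemma 3.25.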
Definition 3.26. We define the recurrent sequence of integers r_{−1} ≔ 3, r_0 ≔ 8, r_1 ≔ 17, r_{N+2} ≔ r_{N+1} + 2r_N + 2r_{N−1}, and the recurrent sequence of vectors t_0 ≔ e A_1 A_{−1} A_{−i} A_{−i} R³ = (17, 5, 3, 8, 1, 3, 0, 0, 1), t_{N+1} ≔ t_N A_1 R².
Note. Obviously, rN is a strictly increasing sequence.
Lemma 3.27. For each N ∈ ℕ₊, t_N = (r_{N+1}, r_{N−2} + r_{N−1}, r_{N−1}, r_N, r_{N−2}, r_{N−1}, 0, 0, r_{N−2}).
Proof. Straightforward proof by induction.
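The induction can be confirmed numerically for small N. A sketch (our own), reusing the derived matrices A₋₁ and A₋ᵢ:

```python
def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(9)) for j in range(9)]
            for i in range(9)]

def vmul(v, m):
    return [sum(v[k] * m[k][j] for k in range(9)) for j in range(9)]

Z = [0] * 9
A1 = [[1,1,0,0,0,0,0,0,0], Z[:],
      [2,0,1,1,0,0,1,0,0],
      [2,0,0,1,1,0,0,1,0],
      Z[:], Z[:],
      [1,0,0,1,0,0,0,0,0],
      Z[:], Z[:]]
R = [[1,0,0,0,0,0,0,0,0],
     [0,0,1,0,0,0,0,0,0],
     [0,0,0,1,0,0,0,0,0],
     [0,0,0,0,1,0,0,0,0],
     [0,1,0,0,0,0,0,0,0],
     [0,0,0,0,0,0,1,0,0],
     [0,0,0,0,0,0,0,1,0],
     [0,0,0,0,0,0,0,0,1],
     [0,0,0,0,0,1,0,0,0]]
R2, R3 = mul(R, R), mul(mul(R, R), R)
Am1 = mul(mul(R2, A1), R2)                     # A₋₁
Ami = mul(mul(R, A1), R3)                      # A₋ᵢ
e = [1, 0, 0, 0, 0, 0, 0, 0, 0]

# the sequence r_N from Definition 3.26
r = {-1: 3, 0: 8, 1: 17}
for k in range(2, 12):
    r[k] = r[k - 1] + 2 * r[k - 2] + 2 * r[k - 3]

t = vmul(vmul(vmul(vmul(vmul(e, A1), Am1), Ami), Ami), R3)   # t₀
assert t == [17, 5, 3, 8, 1, 3, 0, 0, 1]
for N in range(1, 9):
    t = vmul(vmul(t, A1), R2)                  # t_{N+1} = t_N A₁ R²
    assert t == [r[N+1], r[N-2] + r[N-1], r[N-1], r[N],
                 r[N-2], r[N-1], 0, 0, r[N-2]]
print("Lemma 3.27 verified for N = 1..8")
```

Note that the first component of t_N equals r_{N+1}, which is exactly the value t_N eᵀ appearing in Lemma 3.29 below.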
Lemma 3.28. Let N, l ∈ ℕ₊ and d ∈ {±1, ±i}. Then
  1. t_N A_1 ≈ t_{N+1},
  2. t_N A_{−1} ≺ t_{N+1} and t_N A_i ≺ t_{N+1},
  3. t_N A_{−i} A_d ≺ t_{N+2},
  4. t_N A_0^l A_d ≺ t_{N+1}.
Proof.
  1. t_N A_1 ≈ t_N A_1 R² = t_{N+1}.
  2. By expressing each component in terms of the sequence r_N (using Lemma 3.27), both inequalities can be verified directly, with the inequality being strict in the first component.
  3. By expressing each component in terms of the sequence r_N, we can verify t_N A_{−i} A_1 C R² ≤ t_N A_{−i} A_{−i} R³ and t_N A_{−i} A_i R ≤ t_N A_{−i} A_{−1} C. Therefore, we just need to show that t_N A_{−i} A_{−i} R³ ≺ t_{N+2} and t_N A_{−i} A_{−1} C ≺ t_{N+2}. We shall show this by induction. For N ∈ {1, 2, 3}, the inequalities can be verified manually. Now assume that the inequalities are true for N, N+1 and N+2. By taking these three inequalities, multiplying the first two by 2 and adding them together with the third one, we get the inequality for N+3, which completes the induction step.
  4. If l ≥ 2, then t_N A_0^l A_d = r_{N+1} e A_d ≺ t_{N+1} (due to the fact that A_0² = eᵀe and e A_0 = e). If l = 1, then we can again express each component in terms of the sequence r_N and verify t_N A_0 A_d ≺ t_{N+1} for each d individually.
Lemma 3.29. For each N ∈ ℕ₊, M(N+4) = t_N eᵀ = r_{N+1}.
Proof. From Lemma 3.24 and Lemma 3.25, it follows that we only need to consider products of the form t_0 A_{d_1} A_{d_2} ⋯ A_{d_n} eᵀ ≕ t_0 Π eᵀ, since swapping out t_0 for anything else with the same weight would not increase the result. We consider Σ_{k=1}^{n} |d_k| = N because t_0 already contains 4 non-zero digits. By Lemma 3.16, we can rewrite Π as a product of matrices from the set {A_0, A_1, R}. Denote ℬ ≔ {B_1 ⋯ B_m | m ∈ ℕ, B_k ∈ {A_0, A_1, R}} and ℬ_0 ≔ {B_1 ⋯ B_m | m ∈ ℕ, B_k ∈ {A_0, R}}. Let l be the maximum index such that Π = (A_1 R²)^l B, B ∈ ℬ. That is, t_0 Π = t_l B. Clearly, B contains exactly N − l of the matrices A_1. If l = N, then B ∈ ℬ_0, therefore t_0 Π eᵀ = t_N B eᵀ = t_N eᵀ = r_{N+1} (by Lemma 3.27). This shows the lower bound M(N+4) ≥ r_{N+1}. It remains to show that if l < N, then this bound is not exceeded, that is, t_0 Π eᵀ < r_{N+1}. Due to how we chose l, B cannot be of the form A_1 B̃, B̃ ∈ ℬ, because then we could have chosen a higher l. We shall consider several different cases, one of which has to happen:
B = R^j A_1 B̃, j ∈ {2, 3}, B̃ ∈ ℬ
Then t_l R^j A_1 ≈ t_l A_{(−i)^j} ≺ t_{l+1} (by Lemma 3.16 and Lemma 3.28). Therefore, the maximum cannot be reached by Lemma 3.24.
B = R A_1 R^j A_1 B̃, j ∈ {0, 1, 2, 3}, B̃ ∈ ℬ
Then t_l R A_1 R^j A_1 = t_l A_{−i} R^{j+1} A_1 ≈ t_l A_{−i} A_{(−i)^{j+1}} ≺ t_{l+2} (by Lemma 3.16 and Lemma 3.28). Therefore, the maximum cannot be reached by Lemma 3.24.
B = R A_1 R^j A_0^k A_1 B̃, k ∈ ℕ₊, j ∈ {0, 1, 2, 3}, B̃ ∈ ℬ
It can be verified that A_0 A_1 is component-wise less-or-equal to R A_1 (and, since A_0² = eᵀe, that A_0^k A_1 for k ≥ 2 is component-wise less-or-equal to A_1), therefore t_l R A_1 R^j A_0^k A_1 ≤ t_l R A_1 R^{j′} A_1 component-wise for a suitable j′, so this is reduced to the previous case.
B = R A_1 B̃, B̃ ∈ ℬ_0
This implies that l = N − 1 and t_0 Π eᵀ = t_l R A_1 B̃ eᵀ = t_l R A_1 eᵀ. For N ≤ 3, we can manually check that t_0 Π eᵀ = t_{N−1} R A_1 eᵀ < r_{N+1}. For N ≥ 4, we can express t_{N−1} in terms of r_N using Lemma 3.27, then compute that t_0 Π eᵀ = t_{N−1} R A_1 eᵀ = r_N + 5r_{N−2} + 2r_{N−3}. It remains to prove that this is less than r_{N+1}: r_{N+1} = r_N + 2r_{N−1} + 2r_{N−2} = r_N + 4r_{N−2} + 4r_{N−3} + 4r_{N−4} = r_N + 5r_{N−2} + 3r_{N−3} + 2r_{N−4} − 2r_{N−5} > r_N + 5r_{N−2} + 2r_{N−3}.
B = R^j A_0 B̃, j ∈ {0, 1, 2, 3}, B̃ ∈ ℬ
We assumed that l < N, so B contains at least one A_1 matrix. That is, there exist k ∈ ℕ₊ and m ∈ {0, 1, 2, 3} such that B = A_0^k R^m A_1 B̃, B̃ ∈ ℬ (making use of Lemma 3.16 to separate the A_0 matrices and R matrices). From Lemma 3.28, t_l A_0^k R^m A_1 ≺ t_{l+1} and therefore, by Lemma 3.24, it cannot start a product reaching the maximum.
Theorem 3.30. Let N ∈ ℕ, N ≥ 2. Then each 3-NAF representation in the extended Penney system with exactly N non-zero digits has at most r_{N−3} equivalent representations, and there are exactly 8 such 3-NAF representations which have exactly r_{N−3} equivalent representations and do not end in 0.
Proof. For N ≤ 4, the second statement follows from Lemma 3.25: the only vectors corresponding to 3-NAF representations with the maximum number of equivalent representations are v_N P with P ∈ 𝒫, giving a total of 8 vectors, which are all distinct. The maximum number of equivalent representations can then be calculated manually, such as by using a non-deterministic variant of the modular algorithm. For N > 4, Lemma 3.29 immediately gives the first statement. Its proof also shows that the vector of any representation with exactly r_{N−3} equivalent representations is of the form e A_{d_1} ⋯ A_{d_n} = t_{N−4} B, B ∈ ℬ_0. Since we are only counting representations that do not end in 0, it follows that B consists only of R matrices, therefore e A_{d_1} ⋯ A_{d_n} ≈ t_{N−4}. By definition of ≈, there are 8 such vectors, which are distinct due to Lemma 3.27.