Representation of complex numbers in redundant numeration systems | Counting maximum optimal representations

We have seen in Theorem [t-naf-optimal] that with β = 2, 𝒟 = {0, ±1}, the 2-NAF representation of a given number is always optimal. However, it might not be strictly optimal – that is, there might exist other representations of the same number with equal Hamming weight. For certain applications, it is important to know the number of such optimal representations. In [num-binary-signed-reprs], it is shown that a binary NAF representation (x_n, …, x_0) has at most F_{W(x_n, …, x_0)} equivalent reduced optimal representations (including itself), where (F_n)_{n=0}^∞ denotes the Fibonacci sequence defined by the recurrence F_0 = 0, F_1 = 1, F_{n+2} = F_n + F_{n+1}. The proof makes use of a transducer that converts any (2, {0, ±1})-representation into the equivalent NAF representation. In this chapter, we are going to prove a similar statement about 3-NAF extended Penney representations of Gaussian integers by constructing an analogous transducer.
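Before constructing the complex analogue, it may help to see the binary situation concretely. The following sketch (our own illustration, not code from [num-binary-signed-reprs]) brute-forces all {0, ±1} digit strings of a fixed length, keeps the minimal-weight representations of a number, and computes the 2-NAF by the standard right-to-left rule:

```python
from itertools import product

def value(digits):
    """Value of a {0, ±1} digit string in base 2, most significant digit first."""
    v = 0
    for d in digits:
        v = 2 * v + d
    return v

def weight(digits):
    """Hamming weight: the number of non-zero digits."""
    return sum(1 for d in digits if d != 0)

def optimal_representations(n, length=8):
    """All digit strings of the given length representing n with minimal weight."""
    reps = [d for d in product((0, 1, -1), repeat=length) if value(d) == n]
    w = min(weight(d) for d in reps)
    return [d for d in reps if weight(d) == w]

def naf(n):
    """2-NAF of n >= 1, computed right to left: an odd number gets the digit
    (1 or -1) that makes the remainder divisible by 4."""
    digits = []
    while n != 0:
        if n % 2 == 0:
            digits.append(0)
        else:
            digits.append(1 if n % 4 == 1 else -1)
        n = (n - digits[-1]) // 2
    return tuple(reversed(digits))

print(naf(7), len(optimal_representations(7)))   # 7 = 8 - 1 is the only optimal form
print(naf(3), len(optimal_representations(3)))   # 3 = 4 - 1 = 2 + 1 has two
```

For 7 the NAF (1, 0, 0, −1) is the unique optimal representation, while 3 already has two optimal ones, so the count genuinely varies with the number.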

We are going to work with the extended Penney system, so we shall use the symbol β_P ≔ i − 1.

Transducer

Definition 3.1. A transducer is a function δ: Q × 𝒟 → Q × 𝒟*, where Q is its set of states, 𝒟 is a set of digits and 𝒟* is the set of finite sequences of digits.

Given a transducer, a sequence (x_n, …, x_0) ∈ 𝒟* and an initial state q_0 ∈ Q, we can use the transducer to transform the sequence as follows:

(q_1, η_0) ≔ δ(q_0, x_0), (q_2, η_1) ≔ δ(q_1, x_1), …, (q_{n+1}, η_n) ≔ δ(q_n, x_n).

Intuitively, we can think of the transducer as a machine that “consumes” the input from the right one digit at a time and, based on the digit and its internal state, outputs some digits and changes its internal state. In each step, we obtain a sequence η_k which forms a part of the output. By concatenating them, we get the result of the transformation: η ≔ (η_n, …, η_0). We can define an extended version of the transducer δ*: Q × 𝒟* → Q × 𝒟* which performs all the steps at once: δ*(q_0, (x_n, …, x_0)) ≔ (q_{n+1}, η).

The state of our transducer is going to represent the difference between the numbers represented by the digits that have been read so far and the digits that have been written so far. The set of possible states, shown in Figure [f-Q-points], shall be

Q ≔ {0, ±1, ±i, ±1±i, ±2, ±2i, ±1±2i, ±2±i}.

We define the transducer itself as follows:

δ(q, x) ≔ (q′, 0) if β_P | (x + q), where q′ ≔ (x + q)/β_P,
δ(q, (z, y, x)) ≔ (q′, (0, 0, r)) if β_P ∤ (x + q), where (zyx)_{β_P} + q = q′ β_P³ + r, r ∈ {±1, ±i}.

Note that in the second case, the transducer consumes three digits at once, so it does not strictly satisfy Definition 3.1, but it can still be used to unambiguously define an extended transducer δ*. We just need to verify that the definition is correct in terms of always producing a valid new state and valid output. The fact that we can always find q′ and r in the second case follows immediately from Lemma [t-eps-odd-last-digit]. It remains to show that always q′ ∈ Q, which can be done by manually checking all finitely many inputs (which is made easier by the symmetry present).
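This finite check is easy to mechanize. The following sketch (our own, assuming the digit set 𝒟 = {0, ±1, ±i} of the extended Penney system) enumerates every possible transition of δ and verifies both that the output digit r exists uniquely and that the new state lies in Q:

```python
from itertools import product

BETA = -1 + 1j                        # the base β_P = i − 1
DIGITS = (0, 1, -1, 1j, -1j)          # assumed digit set 𝒟
OUTPUTS = (1, -1, 1j, -1j)            # possible non-zero output digits r
# the 21 states of Q: exactly the Gaussian integers of norm at most 5
Q = {complex(a, b) for a in range(-2, 3) for b in range(-2, 3) if a*a + b*b <= 5}

def is_gaussian(z, eps=1e-9):
    """True if z is (numerically) a Gaussian integer."""
    return abs(z.real - round(z.real)) < eps and abs(z.imag - round(z.imag)) < eps

def rounded(z):
    return complex(round(z.real), round(z.imag))

transitions = 0
for q, x in product(Q, DIGITS):
    if is_gaussian((x + q) / BETA):               # first case: β_P | (x + q)
        assert rounded((x + q) / BETA) in Q
        transitions += 1
    else:                                         # second case: consume (z, y, x)
        for z, y in product(DIGITS, repeat=2):
            v = z * BETA**2 + y * BETA + x + q    # (zyx)_{β_P} + q
            rs = [r for r in OUTPUTS if is_gaussian((v - r) / BETA**3)]
            assert len(rs) == 1                   # r exists and is unique
            assert rounded((v - rs[0]) / BETA**3) in Q
            transitions += 1
print(len(Q), "states,", transitions, "transitions: all valid")
```

The divisions are exact in floating point here (all intermediate components are small multiples of 1/8), so the integrality test is reliable.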

Now we need to prove that the transducer δ actually turns any representation into the equivalent 3-NAF representation, starting with q_0 = 0.

Lemma 3.2. Let (x_n, …, x_0) be a representation in the extended Penney system. Assume that the transducer δ* described above reads all digits of the representation. Then its output (y_n, …, y_0) and the final state q_{n+1} will satisfy (x_n … x_0)_{β_P} = q_{n+1} β_P^{n+1} + (y_n … y_0)_{β_P}.
Proof. We shall prove the statement by induction. Given the empty representation, the transducer produces an empty representation and stays in the state q_0 = 0, which clearly satisfies the equation. Now assume that the statement is true for representations of length n. As the induction step, we will show that if the transducer consumes n digits and then performs one more step, the statement will still hold. Naturally, we are going to distinguish two cases based on the definition of δ:
Note. It is possible that the transducer will fail to read the whole representation, being left with one or two digits that fall into the second case, which requires three digits. In this case, we can simply prepend up to two zeros to finish the transformation.
Theorem 3.3. Let (x_n, …, x_0) be a representation in the extended Penney system. Then it is possible to prepend finitely many zeros to the representation so that the transducer δ* described above reads all digits, ends up in the final state q_{l+1} = 0 and outputs the equivalent 3-NAF representation.
Proof. By Lemma 3.2 and the note below it, we can prepend 0 to 2 zeros so that the transducer reads all digits, ends up in a state q_{m+1} and outputs a representation (y_m, …, y_0) satisfying (x_m … x_0)_{β_P} = q_{m+1} β_P^{m+1} + (y_m … y_0)_{β_P}. Notice that when the input to the transducer consists entirely of zeros, it is identical to the algorithm described in Theorem [t-eps-naaf-uniq], finding the 3-NAF representation of the current state q. Therefore, after consuming the original input, we can input a few more zeros into it to get it into the zero state, causing it to output the 3-NAF representation of q_{m+1} as (y_l, …, y_{m+1}). We can substitute this into the above equation: (x_m … x_0)_{β_P} = (y_l … y_{m+1})_{β_P} β_P^{m+1} + (y_m … y_0)_{β_P} = (y_l … y_0)_{β_P}. Therefore, (y_l, …, y_0) is an equivalent representation. Since the transducer always outputs non-zero digits with two zeros before them, this representation is also 3-NAF.
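The theorem translates directly into an algorithm. The sketch below (an illustration under the same assumptions as before, not the author's implementation) runs δ* from the right, feeds extra zeros until the state returns to 0, and checks on an example that the output is an equivalent 3-NAF representation:

```python
BETA = -1 + 1j                        # β_P = i − 1
OUTPUTS = (1, -1, 1j, -1j)

def is_gaussian(z, eps=1e-9):
    return abs(z.real - round(z.real)) < eps and abs(z.imag - round(z.imag)) < eps

def val(digits):
    """Value of a representation in base β_P, most significant digit first."""
    v = 0
    for d in digits:
        v = v * BETA + d
    return v

def to_3naf(digits):
    """Feed a (β_P, {0, ±1, ±i})-representation to δ*, prepending zeros
    until the state returns to 0; returns the 3-NAF output."""
    q, out, xs = 0, [], list(digits)
    for _ in range(4 * len(digits) + 64):     # crude termination guard
        if not xs and q == 0:
            break
        if not xs:
            xs = [0]                          # prepend a zero
        x = xs.pop()
        if is_gaussian((x + q) / BETA):       # first case: output a single 0
            q = (x + q) / BETA
            out.append(0)
        else:                                 # second case: consume three digits
            while len(xs) < 2:
                xs.insert(0, 0)
            y, z = xs.pop(), xs.pop()
            v = z * BETA**2 + y * BETA + x + q
            r = next(r for r in OUTPUTS if is_gaussian((v - r) / BETA**3))
            q = (v - r) / BETA**3
            out.extend([r, 0, 0])             # least significant digit first
    assert q == 0, "transducer did not reach the zero state"
    return list(reversed(out))

rep = [1, 1j, -1, 1]                          # an arbitrary representation
naf3 = to_3naf(rep)
assert val(naf3) == val(rep)                  # same Gaussian integer
assert all(sum(1 for d in naf3[k:k+3] if d) <= 1 for k in range(len(naf3)))
print(rep, "->", naf3)
```

The two asserts at the end check exactly the two claims of the theorem: the value is preserved and every window of three digits contains at most one non-zero digit.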

Transducer as an oriented graph

We can naturally represent the transducer δ as an oriented graph G, where vertices represent the possible states Q and edges represent transitions. An edge corresponding to reading a digit x and outputting a zero will be labelled x|0, one that reads three digits (z,y,x) and outputs three digits (0,0,r) will be labelled zyx|00r. Since the graph has many edges, a picture of it would be unreadable, but for illustration, Figure [f-G-edges-example] shows three nodes and two edges from this graph.

Definition 3.4. Let G be a graph with vertices Q and edges E. An oriented walk in G of length l is a sequence (q_0, e_1, q_1, e_2, …, e_l, q_l), where q_0, …, q_l ∈ Q and ∀k ∈ {1, …, l}: e_k = (q_{k−1}, q_k) ∈ E.

Naturally, a transformation using the extended transducer δ* corresponds to an oriented walk in the graph G of the original transducer, where q0,,ql are the visited states and e1,,el are the performed transformations. Notice that since we are reading and writing the representations from right to left, this walk will be written in the “reverse” order in contrast with its input and output.

We are now ready to use the graph for analyzing the optimality of representations. For this, we need to introduce the notion of the weight of an edge, which shall indicate how many non-zero digits the transducer removes when performing the corresponding transition.

Definition 3.5. Let e be an edge of our graph G with label x_m … x_0 | y_m … y_0. Then we define its weight as W(e) ≔ W(x_m, …, x_0) − W(y_m, …, y_0).
Definition 3.6. Let P = (q_0, e_1, …, e_l, q_l) be an oriented walk in G. Then its weight is defined as W(P) ≔ Σ_{k=1}^{l} W(e_k).

Note that some edges in G have a negative weight, so we cannot simply say that no edge increases the Hamming weight and therefore the 3-NAF representation is optimal. However, it is the case that if we make a complete walk starting and ending in the state 0, then the sum of the weights of all edges on the walk is non-negative, as can be easily proven:

Lemma 3.7. Let P be a walk in our graph G starting and ending in 0. Then W(P) ≥ 0.
Proof. We can use the Bellman–Ford algorithm [bellman-ford] to find the minimum-weight walk from 0 to itself. The algorithm shows that this minimum weight is 0.
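This computation can be reproduced end to end. The sketch below (our own reconstruction, with the digit set {0, ±1, ±i} assumed) rebuilds the full edge set of G from the definition of δ, runs a plain edge-list Bellman–Ford from the state 0, and checks both the absence of negative cycles and the claimed minimum:

```python
from itertools import product

BETA = -1 + 1j
DIGITS = (0, 1, -1, 1j, -1j)
OUTPUTS = (1, -1, 1j, -1j)
Q = [complex(a, b) for a in range(-2, 3) for b in range(-2, 3) if a*a + b*b <= 5]

def is_gaussian(z, eps=1e-9):
    return abs(z.real - round(z.real)) < eps and abs(z.imag - round(z.imag)) < eps

def rounded(z):
    return complex(round(z.real), round(z.imag))

# build the edge list of G with weights W(e) = W(input) − W(output)
edges = []
for q, x in product(Q, DIGITS):
    if is_gaussian((x + q) / BETA):               # label x|0
        edges.append((q, rounded((x + q) / BETA), 1 if x else 0))
    else:                                         # label zyx|00r
        for z, y in product(DIGITS, repeat=2):
            v = z * BETA**2 + y * BETA + x + q
            r = next(r for r in OUTPUTS if is_gaussian((v - r) / BETA**3))
            w = sum(1 for d in (z, y, x) if d) - 1
            edges.append((q, rounded((v - r) / BETA**3), w))

# Bellman–Ford from state 0 (negative edge weights are present)
INF = float("inf")
dist = {q: INF for q in Q}
dist[0] = 0
for _ in range(len(Q) - 1):
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            dist[v] = dist[u] + w
# no relaxation is possible any more: no negative cycle reachable from 0
assert all(dist[u] + w >= dist[v] for u, v, w in edges)
# the cheapest non-empty walk from 0 back to 0 has weight exactly 0
assert min(dist[u] + w for u, v, w in edges if v == 0) == 0
print("minimum-weight closed walk through 0 has weight 0")
```

The final assertion is exactly the statement of the lemma: no closed walk through 0 is negative, and weight 0 is attained (for instance by the self-loop 0|0 at the state 0).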
Theorem 3.8. Every 3-NAF representation (xn,,x0) in the extended Penney system is optimal.
Proof. Let (y_m, …, y_0) be another representation of the same number and (q_0, e_1, …, e_l, q_l) the walk taken in G when transforming (y_m, …, y_0) into its 3-NAF representation using δ*. From Theorem [t-eps-naaf-uniq], we know that the output is equal up to leading zeros to (x_n, …, x_0). Using this and Lemma 3.7, we have W(x_n, …, x_0) − W(y_m, …, y_0) = −Σ_{k=1}^{l} W(e_k) ≤ 0. Therefore, an arbitrary equivalent representation has at least as large a Hamming weight as the 3-NAF representation, which was to be proven.

Lemma 3.7 and the proof of Theorem 3.8 motivate the following definition and trivial lemma:

Definition 3.9. A walk P in G is optimal if W(P)=0.
Lemma 3.10. An extended Penney representation (x_n, …, x_0) is optimal if and only if the walk in G produced by using the transducer δ* on (x_n, …, x_0) is optimal.

Simplifying the graph

Although the graph G has many edges, we can exploit the symmetries present in the problem in order to simplify it.

Lemma 3.11. Let e = (q, q′) ∈ E be an edge in the graph G of the transducer δ described above and d ∈ {±1, ±i}. Then de ≔ (dq, dq′) ∈ E, where the label of de is obtained from the label of e by multiplying every digit by d, and W(de) = W(e). Similarly, conjugating every digit of the label of e yields an edge ē ∈ E leading from q̄ to a state equivalent to q̄′ (in the sense defined below), again with W(ē) = W(e).
Proof. Notice that β̄_P = iβ_P, β̄_P² = −β_P² and β̄_P³ = −iβ_P³.

The symmetry demonstrated in Lemma 3.11 allows us to introduce an equivalence relation ∼ on the set of states Q, where q_1 ∼ q_2 if q_1 = d q_2 or q_1 = d q̄_2 for some d ∈ {±1, ±i}. We can then group the states into equivalence classes:

[0] = {0}, [1] = {±1, ±i}, [i+1] = {±1±i}, [2] = {±2, ±2i}, [i+2] = {±2±i, ±1±2i}
Lemma 3.12. Let P be a walk in our graph G starting in q_0 and ending in q_l. Let q_0′ ∈ Q, q_0′ ∼ q_0. Then there exists a walk P′ of the same length and weight as P which starts in q_0′ and ends in some q_l′ ∈ Q, q_l′ ∼ q_l.
Proof. By definition of ∼, we can transform q_0 into q_0′ by multiplication with a number from {±1, ±i} and possibly complex conjugation. By Lemma 3.11, there is then an edge from q_0′ to some q_1′, where q_1′ ∼ q_1. We can repeat this argument for all edges on P. It is also easy to check that the equivalent edges have the same weights.

Notice also that if a representation is optimal, then converting it to 3-NAF using the transducer cannot decrease its Hamming weight (an equivalent representation cannot weigh less than an optimal one), and by Lemma 3.7 it cannot increase it either; therefore the weight of the corresponding walk in G is zero – that is, the walk is optimal.

Lemma 3.13. Let e be an edge from q to q′ that is contained in some optimal walk P. Then e has the minimum weight out of all edges from q to q′.
Proof. Assume for the sake of contradiction that there exists an edge e′ from q to q′ with W(e′) < W(e). Then we could replace e with e′ in P and obtain a walk from 0 to 0 with a negative weight. However, according to Lemma 3.7, this is impossible.

These lemmas allow us to construct a much simpler graph Γ that can still be used to analyze the optimality of representations. Its vertices will be the equivalence classes of Q. By Lemma 3.11, we can group edges in G that differ only in symmetry into one edge in Γ. By Lemma 3.12, the weights of edges and walks are still going to be unambiguously defined, so we can label the edges with the common weight of all corresponding original edges. Lemma 3.13 also allows us to discard edges that cannot be used in an optimal walk. The result is shown in Figure [f-Gamma].

If we use the Bellman–Ford algorithm on this new graph to calculate the minimum-weight walk between each pair of vertices, we will notice that some edges are not the minimum-weight walk between their start and end. Such edges cannot lie on any optimal walk, because if they did, we could replace them with the minimum-weight walk, similarly to the proof of Lemma 3.13. Additionally, the vertices [2] and [i+2] have the property that the minimum-weight walk from 0 to either of them and then back to 0 has a positive weight. Therefore, no optimal walk can go through these vertices. If we remove the problematic vertices and edges, we get the graph Γ̃ depicted in Figure [f-Gammatilde], which has only 3 vertices and 7 edges.

However, by collapsing the vertices of G into equivalence classes, we have lost information about the specific outputs of the edges, since these are not identical for all edges between a given pair of equivalence classes. We are going to need this information, so we shall introduce yet another graph G̃, a subgraph of G consisting only of the edges that are represented in Γ̃. This graph, shown in Figure [f-Gtilde], is still quite small, with 9 vertices and 61 edges.

Lemma 3.14. All walks in Γ̃ starting and ending in [0], as well as all walks in G̃ starting and ending in 0, are optimal.
Proof. Consider first Γ̃. The only edges that have a non-zero weight are the ones between [0] and the other two vertices, with those leaving [0] having weight +1 and the one entering [0] having weight −1. Any walk that starts and ends in [0] uses an equal number of edges leaving and entering [0], so the sum of the weights is 0. Since the edges in G̃ have the same weights as the corresponding edges in Γ̃, this argument also applies to G̃.
Theorem 3.15. Let x ∈ ℤ[i]. Then the number of optimal reduced representations of x is equal to the number of walks in G̃ whose output is the 3-NAF representation of x.
Proof. A direct consequence of Lemma 3.10, Lemma 3.14 and the fact that G̃ is a subgraph of G containing all optimal walks.

Converting to a matrix problem

In Theorem 3.15, we have proven that the graph G̃ is a good tool for counting optimal extended Penney representations. We still need a way to count all the possible walks in G̃. The standard graph-theoretic way to count walks is using adjacency matrices. However, we do not actually want to count all walks, just the ones with a specific output. We can achieve this by defining a separate adjacency matrix for each subgraph consisting only of edges with the same output.

First, we need to put the vertices in a specific order, represented by a tuple of the vertex labels:

V ≔ (0, 1, i, −1, −i, 1+i, −1+i, −1−i, 1−i)

We also assign each possible output of an edge to a single digit in the natural way:

O_0 ≔ (0), O_d ≔ (0, 0, d), d ∈ {±1, ±i}

Then, for each d ∈ {0, ±1, ±i}, we define the adjacency matrix A_d ∈ ℕ_0^{9×9} like so:

(A_d)_{i,j} ≔ the number of edges in G̃ from V_i to V_j whose output is O_d.

The matrices A_0 and A_1 look like this:

A_0 =
( 1 0 0 0 0 0 0 0 0 )
( 0 0 0 0 0 0 1 0 0 )
( 0 0 0 0 0 0 0 1 0 )
( 0 0 0 0 0 0 0 0 1 )
( 0 0 0 0 0 1 0 0 0 )
( 0 0 0 0 0 0 0 0 0 )
( 0 0 0 0 0 0 0 0 0 )
( 0 0 0 0 0 0 0 0 0 )
( 0 0 0 0 0 0 0 0 0 )

A_1 =
( 1 1 0 0 0 0 0 0 0 )
( 0 0 0 0 0 0 0 0 0 )
( 2 0 1 1 0 0 1 0 0 )
( 2 0 0 1 1 0 0 1 0 )
( 0 0 0 0 0 0 0 0 0 )
( 0 0 0 0 0 0 0 0 0 )
( 1 0 0 1 0 0 0 0 0 )
( 0 0 0 0 0 0 0 0 0 )
( 0 0 0 0 0 0 0 0 0 )

The remaining three matrices A_i, A_{−1}, A_{−i} can be expressed in terms of A_1 using the symmetries described in Lemma 3.11. To be specific, we define matrices R, C ∈ ℕ_0^{9×9} as follows:

R ≔
( 1 0 0 0 0 0 0 0 0 )
( 0 0 1 0 0 0 0 0 0 )
( 0 0 0 1 0 0 0 0 0 )
( 0 0 0 0 1 0 0 0 0 )
( 0 1 0 0 0 0 0 0 0 )
( 0 0 0 0 0 0 1 0 0 )
( 0 0 0 0 0 0 0 1 0 )
( 0 0 0 0 0 0 0 0 1 )
( 0 0 0 0 0 1 0 0 0 )

C ≔
( 1 0 0 0 0 0 0 0 0 )
( 0 1 0 0 0 0 0 0 0 )
( 0 0 0 0 1 0 0 0 0 )
( 0 0 0 1 0 0 0 0 0 )
( 0 0 1 0 0 0 0 0 0 )
( 0 0 0 0 0 0 0 0 1 )
( 0 0 0 0 0 0 0 1 0 )
( 0 0 0 0 0 0 1 0 0 )
( 0 0 0 0 0 1 0 0 0 )

These matrices have the following properties, which can be verified by direct computation:

C = Cᵀ, Rᵀ = R⁻¹, C = C⁻¹, R⁴ = I, RC = CRᵀ.

Due to these properties, the matrices form a group with 8 elements:

𝒫 ≔ {I, R, R², R³, C, CR, CR², CR³}

The properties in the following lemma establish relationships between the adjacency matrices:

Lemma 3.16. ∀d ∈ {0, ±1, ±i}: A_{id} = R⁻¹ A_d R; ∀d ∈ {±1, ±i}: A_{d̄} = RC A_d C; A_0 = CR A_0 C.
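The properties of R and C and the relations of Lemma 3.16 that involve only the matrices printed above can be checked mechanically. A minimal sketch in pure Python (our own; `mul` is naive 9×9 matrix multiplication):

```python
def mul(a, b):
    """Naive 9×9 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(9)) for j in range(9)]
            for i in range(9)]

def T(a):
    return [[a[j][i] for j in range(9)] for i in range(9)]

I = [[int(i == j) for j in range(9)] for i in range(9)]
Z = [0] * 9

A0 = [[1,0,0,0,0,0,0,0,0],
      [0,0,0,0,0,0,1,0,0],
      [0,0,0,0,0,0,0,1,0],
      [0,0,0,0,0,0,0,0,1],
      [0,0,0,0,0,1,0,0,0],
      Z[:], Z[:], Z[:], Z[:]]
A1 = [[1,1,0,0,0,0,0,0,0],
      Z[:],
      [2,0,1,1,0,0,1,0,0],
      [2,0,0,1,1,0,0,1,0],
      Z[:], Z[:],
      [1,0,0,1,0,0,0,0,0],
      Z[:], Z[:]]
R = [[1,0,0,0,0,0,0,0,0],
     [0,0,1,0,0,0,0,0,0],
     [0,0,0,1,0,0,0,0,0],
     [0,0,0,0,1,0,0,0,0],
     [0,1,0,0,0,0,0,0,0],
     [0,0,0,0,0,0,1,0,0],
     [0,0,0,0,0,0,0,1,0],
     [0,0,0,0,0,0,0,0,1],
     [0,0,0,0,0,1,0,0,0]]
C = [[1,0,0,0,0,0,0,0,0],
     [0,1,0,0,0,0,0,0,0],
     [0,0,0,0,1,0,0,0,0],
     [0,0,0,1,0,0,0,0,0],
     [0,0,1,0,0,0,0,0,0],
     [0,0,0,0,0,0,0,0,1],
     [0,0,0,0,0,0,0,1,0],
     [0,0,0,0,0,0,1,0,0],
     [0,0,0,0,0,1,0,0,0]]

assert C == T(C) and mul(C, C) == I            # C = Cᵀ = C⁻¹
assert mul(R, T(R)) == I                       # Rᵀ = R⁻¹
assert mul(mul(R, R), mul(R, R)) == I          # R⁴ = I
assert mul(R, C) == mul(C, T(R))               # RC = CRᵀ
assert mul(mul(C, R), mul(A0, C)) == A0        # A₀ = CRA₀C
assert mul(mul(R, C), mul(A1, C)) == A1        # conjugation relation for d = 1
assert mul(R, A0) == mul(A0, R)                # A₀ = R⁻¹A₀R (the d = 0 case)
print("all symmetry relations hold")
```

All seven assertions pass for the matrices as printed, which is a useful cross-check on both the matrices and the lemma.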
Lemma 3.17. Let d ∈ {±1, ±i} and P ∈ 𝒫. Then there exist h ∈ {±1, ±i} and S, T ∈ 𝒫 such that P A_d = A_h S and P A_0 = A_0 T.
Lemma 3.18. Let (d_1, …, d_l) be a sequence of digits. Then the number of walks (q_0, e_1, …, e_l, q_l) in G̃ starting at V_i and ending at V_j such that the output label of each e_k is O_{d_k} is (A_{d_1} ⋯ A_{d_l})_{i,j}.
Proof. We shall prove the statement by induction. The case l = 1 follows directly from the definition of the adjacency matrices. Assume that the statement is true for l and let V_i, V_j be some vertices. Then for each vertex V_k, there are (A_{d_1} ⋯ A_{d_l})_{i,k} walks from V_i to V_k with l edges and (A_{d_{l+1}})_{k,j} edges from V_k to V_j. Each walk from V_i to V_j with l + 1 edges consists of one said walk and one said edge, where V_k can be arbitrary. Therefore, the total number of such walks is Σ_{k=1}^{9} (A_{d_1} ⋯ A_{d_l})_{i,k} (A_{d_{l+1}})_{k,j} = (A_{d_1} ⋯ A_{d_{l+1}})_{i,j}.
Theorem 3.19. Let (O_{d_n}, …, O_{d_1}) be the 3-NAF extended Penney representation of x ∈ ℤ[i]. Then the number of optimal reduced extended Penney representations of x is equal to (A_{d_1} ⋯ A_{d_n})_{1,1} = e A_{d_1} ⋯ A_{d_n} eᵀ, where e ≔ (1, 0, …, 0) is the first row of the identity matrix.
Proof. Follows directly from Theorem 3.15 and Lemma 3.18.

At last, we have converted our problem to a matrix problem, making it easier to reason about. Our ultimate question is: Given a number N, which 3-NAF representation with Hamming weight N has the most equivalent representations? And how many such 3-NAF representations exist? We can formulate the first question using matrices as follows:

M(N) ≔ max{ e A_{d_1} ⋯ A_{d_n} eᵀ | n ∈ ℕ, d_k ∈ {0, ±1, ±i}, Σ_{k=1}^{n} |d_k| = N } = ?
Definition 3.20. Let u, v ∈ ℝ⁹ be row vectors. We define the relation ≤ as u ≤ v ⟺ ∀i ∈ {1, …, 9}: u_i ≤ v_i.
Definition 3.21. Let u, v ∈ ℝ⁹ be row vectors. We define the partial ordering ≼ as u ≼ v ⟺ ∃P ∈ 𝒫: uP ≤ v. We say that u is majorized by v. If also v ≼ u, we denote u ≈ v.
Definition 3.22. Let u, v ∈ ℝ⁹ be row vectors. We define the relation ≺ as u ≺ v ⟺ (u ≼ v ∧ u_1 < v_1). We say that u is strictly majorized by v.
Lemma 3.23. ≈ is an equivalence relation on ℝ⁹ and ≼ is a partial ordering on ℝ⁹/≈.
Lemma 3.24. Let d_1, …, d_l, f_1, …, f_m ∈ {0, ±1, ±i} be sequences of digits such that Σ_{k=1}^{l} |d_k| = Σ_{k=1}^{m} |f_k| and e A_{d_1} ⋯ A_{d_l} ≺ e A_{f_1} ⋯ A_{f_m}. Then for any d_{l+1}, …, d_n ∈ {0, ±1, ±i} we have e A_{d_1} ⋯ A_{d_n} eᵀ < M(N), where N ≔ Σ_{k=1}^{n} |d_k|.
Lemma 3.25. Let d_1, …, d_n be digits such that N ≔ Σ_{k=1}^{n} |d_k| ∈ {2, 3, 4}. Let u ≔ e A_{d_1} ⋯ A_{d_n}. Denote
B_2 ≔ A_1 A_{−1}, v_2 ≔ e B_2 = (3, 1, 1, 1, 0, 1, 0, 0, 0),
B_3 ≔ A_1 A_{−1} A_{−i}, v_3 ≔ e B_3 = (8, 1, 3, 1, 3, 1, 1, 0, 0),
B_4 ≔ A_1 A_{−1} A_{−i} A_{−i}, v_4 ≔ e B_4 = (17, 1, 5, 3, 8, 1, 3, 0, 0).
Then either u ≺ v_N, or u ≈ v_N, n = N and there exist P, S ∈ 𝒫 such that Sᵀ B_N P = A_{d_1} ⋯ A_{d_n}.
Proof. Notice that A_0² = eᵀe, so w A_0² = (w eᵀ) e for any vector w, and also e A_0 = e. Therefore, we only need to check vectors u whose products neither contain A_0² nor start with A_0; in other words, u = e A_{f_1} A_0^{l_1} A_{f_2} ⋯ A_{f_N} A_0^{l_N}, f_k ∈ {±1, ±i}, l_k ∈ {0, 1}. There are only finitely many such vectors for N ∈ {2, 3, 4}, so we can verify the lemma manually.
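The vectors v₂, v₃, v₄ can be spot-checked numerically. The sketch below (our own; A₋₁ and A₋ᵢ are derived from A₁ via Lemma 3.16, using R⁻¹ = R³) recomputes them from the printed matrices:

```python
def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(9)) for j in range(9)]
            for i in range(9)]

def vmul(v, m):
    return [sum(v[k] * m[k][j] for k in range(9)) for j in range(9)]

Z = [0] * 9
A1 = [[1,1,0,0,0,0,0,0,0], Z[:],
      [2,0,1,1,0,0,1,0,0],
      [2,0,0,1,1,0,0,1,0],
      Z[:], Z[:],
      [1,0,0,1,0,0,0,0,0],
      Z[:], Z[:]]
R = [[1,0,0,0,0,0,0,0,0],
     [0,0,1,0,0,0,0,0,0],
     [0,0,0,1,0,0,0,0,0],
     [0,0,0,0,1,0,0,0,0],
     [0,1,0,0,0,0,0,0,0],
     [0,0,0,0,0,0,1,0,0],
     [0,0,0,0,0,0,0,1,0],
     [0,0,0,0,0,0,0,0,1],
     [0,0,0,0,0,1,0,0,0]]
R2, R3 = mul(R, R), mul(mul(R, R), R)
Am1 = mul(mul(R2, A1), R2)         # A₋₁ = R⁻²A₁R², with R⁻² = R²
Ami = mul(mul(R, A1), R3)          # A₋ᵢ = R⁻³A₁R³, with R⁻³ = R

e = [1, 0, 0, 0, 0, 0, 0, 0, 0]
v2 = vmul(vmul(e, A1), Am1)        # e·B₂ = e·A₁A₋₁
v3 = vmul(v2, Ami)                 # e·B₃ = e·A₁A₋₁A₋ᵢ
v4 = vmul(v3, Ami)                 # e·B₄ = e·A₁A₋₁A₋ᵢA₋ᵢ
assert v2 == [3, 1, 1, 1, 0, 1, 0, 0, 0]
assert v3 == [8, 1, 3, 1, 3, 1, 1, 0, 0]
assert v4 == [17, 1, 5, 3, 8, 1, 3, 0, 0]
print(v2, v3, v4)
```

All three vectors agree with the values stated in Lemma 3.25.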
Definition 3.26. We define the recurrent sequence of integers r_{−1} ≔ 3, r_0 ≔ 8, r_1 ≔ 17, r_{N+2} ≔ r_{N+1} + 2r_N + 2r_{N−1}, and the recurrent sequence of vectors t_0 ≔ e A_1 A_{−1} A_{−i} A_{−i} R³ = (17, 5, 3, 8, 1, 3, 0, 0, 1), t_{N+1} ≔ t_N A_1 R².
Note. Obviously, rN is a strictly increasing sequence.
Lemma 3.27. For each N ∈ ℕ₊, t_N = (r_{N+1}, r_{N−2} + r_{N−1}, r_{N−1}, r_N, r_{N−2}, r_{N−1}, 0, 0, r_{N−2}).
Proof. Straightforward proof by induction.
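The induction can be confirmed numerically for small N. A sketch (our own), reusing the derived matrices A₋₁ and A₋ᵢ:

```python
def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(9)) for j in range(9)]
            for i in range(9)]

def vmul(v, m):
    return [sum(v[k] * m[k][j] for k in range(9)) for j in range(9)]

Z = [0] * 9
A1 = [[1,1,0,0,0,0,0,0,0], Z[:],
      [2,0,1,1,0,0,1,0,0],
      [2,0,0,1,1,0,0,1,0],
      Z[:], Z[:],
      [1,0,0,1,0,0,0,0,0],
      Z[:], Z[:]]
R = [[1,0,0,0,0,0,0,0,0],
     [0,0,1,0,0,0,0,0,0],
     [0,0,0,1,0,0,0,0,0],
     [0,0,0,0,1,0,0,0,0],
     [0,1,0,0,0,0,0,0,0],
     [0,0,0,0,0,0,1,0,0],
     [0,0,0,0,0,0,0,1,0],
     [0,0,0,0,0,0,0,0,1],
     [0,0,0,0,0,1,0,0,0]]
R2, R3 = mul(R, R), mul(mul(R, R), R)
Am1 = mul(mul(R2, A1), R2)                     # A₋₁
Ami = mul(mul(R, A1), R3)                      # A₋ᵢ
e = [1, 0, 0, 0, 0, 0, 0, 0, 0]

# the sequence r_N from Definition 3.26
r = {-1: 3, 0: 8, 1: 17}
for k in range(2, 12):
    r[k] = r[k - 1] + 2 * r[k - 2] + 2 * r[k - 3]

t = vmul(vmul(vmul(vmul(vmul(e, A1), Am1), Ami), Ami), R3)   # t₀
assert t == [17, 5, 3, 8, 1, 3, 0, 0, 1]
for N in range(1, 9):
    t = vmul(vmul(t, A1), R2)                  # t_{N+1} = t_N A₁ R²
    assert t == [r[N+1], r[N-2] + r[N-1], r[N-1], r[N],
                 r[N-2], r[N-1], 0, 0, r[N-2]]
print("Lemma 3.27 verified for N = 1..8")
```

Note that the first component of t_N equals r_{N+1}, which is exactly the value t_N eᵀ appearing in Lemma 3.29 below.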
Lemma 3.28. Let N, l ∈ ℕ₊ and d ∈ {±1, ±i}. Then
  1. t_N A_1 ≈ t_{N+1},
  2. t_N A_{−1} ≺ t_{N+1} and t_N A_i ≺ t_{N+1},
  3. t_N A_{−i} A_d ≺ t_{N+2},
  4. t_N A_0^l A_d ≺ t_{N+1}.
Proof.
  1. t_N A_1 ≈ t_N A_1 R² = t_{N+1}.
  2. By expressing each component in terms of the sequence r_N (using Lemma 3.27), both inequalities can be verified directly, with the inequality being strict in the first component.
  3. By expressing each component in terms of the sequence r_N, we can verify t_N A_{−i} A_1 C R² ≤ t_N A_{−i} A_{−i} R³ and t_N A_{−i} A_i R ≤ t_N A_{−i} A_{−1} C. Therefore, we just need to show that t_N A_{−i} A_{−i} R³ ≺ t_{N+2} and t_N A_{−i} A_{−1} C ≺ t_{N+2}. We shall show this by induction. For N ∈ {1, 2, 3}, the inequalities can be verified manually. Now assume that the inequalities are true for N, N+1 and N+2. By taking these three inequalities, multiplying the first two by 2 and adding them together with the third one, we get the inequality for N+3, which completes the induction step.
  4. If l ≥ 2, then t_N A_0^l A_d = r_{N+1} e A_d ≺ t_{N+1} (due to the fact that A_0² = eᵀe and e A_0 = e). If l = 1, then we can again express each component in terms of the sequence r_N and verify t_N A_0 A_d ≺ t_{N+1} for each d individually.
Lemma 3.29. For each N ∈ ℕ₊, M(N+4) = t_N eᵀ = r_{N+1}.
Proof. From Lemma 3.24 and Lemma 3.25, it follows that we only need to consider products of the form t_0 A_{d_1} A_{d_2} ⋯ A_{d_n} eᵀ ≕ t_0 Π eᵀ, since swapping out t_0 for anything else with the same weight would not increase the result. We consider Σ_{k=1}^{n} |d_k| = N because t_0 already contains 4 non-zero digits. By Lemma 3.16, we can rewrite Π as a product of matrices from the set {A_0, A_1, R}. Denote ℬ ≔ {B_1 ⋯ B_m | m ∈ ℕ, B_k ∈ {A_0, A_1, R}} and ℬ_0 ≔ {B_1 ⋯ B_m | m ∈ ℕ, B_k ∈ {A_0, R}}. Let l be the maximum index such that Π = (A_1 R²)^l B, B ∈ ℬ. That is, t_0 Π = t_l B. Clearly, B contains exactly N − l of the matrices A_1. If l = N, then B ∈ ℬ_0, therefore t_0 Π eᵀ = t_N B eᵀ = t_N eᵀ = r_{N+1} (by Lemma 3.27). This shows the lower bound M(N+4) ≥ r_{N+1}. It remains to show that if l < N, then this bound is not exceeded, that is, t_0 Π eᵀ < r_{N+1}. Due to how we chose l, B cannot be of the form A_1 B̃, B̃ ∈ ℬ, because then we could have chosen a higher l. We shall consider several different cases, one of which has to happen:
B = R^j A_1 B̃, j ∈ {2, 3}, B̃ ∈ ℬ
Then t_l R^j A_1 ≈ t_l A_{(−i)^j} ≺ t_{l+1} (by Lemma 3.16 and Lemma 3.28). Therefore, the maximum cannot be reached by Lemma 3.24.
B = R A_1 R^j A_1 B̃, j ∈ {0, 1, 2, 3}, B̃ ∈ ℬ
Then t_l R A_1 R^j A_1 = t_l A_{−i} R^{j+1} A_1 ≈ t_l A_{−i} A_{(−i)^{j+1}} ≺ t_{l+2} (by Lemma 3.16 and Lemma 3.28). Therefore, the maximum cannot be reached by Lemma 3.24.
B = R A_1 R^j A_0^k A_1 B̃, k ∈ ℕ₊, j ∈ {0, 1, 2, 3}, B̃ ∈ ℬ
It can be verified that A_0 A_1 is component-wise less-or-equal to R A_1 (and, since A_0² = eᵀe, that A_0^k A_1 for k ≥ 2 is component-wise less-or-equal to A_1), therefore t_l R A_1 R^j A_0^k A_1 ≤ t_l R A_1 R^{j′} A_1 component-wise for a suitable j′, so this is reduced to the previous case.
B = R A_1 B̃, B̃ ∈ ℬ_0
This implies that l = N − 1 and t_0 Π eᵀ = t_l R A_1 B̃ eᵀ = t_l R A_1 eᵀ. For N ≤ 3, we can manually check that t_0 Π eᵀ = t_{N−1} R A_1 eᵀ < r_{N+1}. For N ≥ 4, we can express t_{N−1} in terms of r_N using Lemma 3.27, then compute that t_0 Π eᵀ = t_{N−1} R A_1 eᵀ = r_N + 5r_{N−2} + 2r_{N−3}. It remains to prove that this is less than r_{N+1}: r_{N+1} = r_N + 2r_{N−1} + 2r_{N−2} = r_N + 4r_{N−2} + 4r_{N−3} + 4r_{N−4} = r_N + 5r_{N−2} + 3r_{N−3} + 2r_{N−4} − 2r_{N−5} > r_N + 5r_{N−2} + 2r_{N−3}.
B = R^j A_0 B̃, j ∈ {0, 1, 2, 3}, B̃ ∈ ℬ
We assumed that l < N, so B contains at least one A_1 matrix. That is, there exist k ∈ ℕ₊ and m ∈ {0, 1, 2, 3} such that B = A_0^k R^m A_1 B̃, B̃ ∈ ℬ (making use of Lemma 3.16 to separate the A_0 matrices and R matrices). From Lemma 3.28, t_l A_0^k R^m A_1 ≺ t_{l+1} and therefore, by Lemma 3.24, it cannot start a product reaching the maximum.
Theorem 3.30. Let N ∈ ℕ, N ≥ 2. Then each 3-NAF representation in the extended Penney system with exactly N non-zero digits has at most r_{N−3} equivalent representations, and there are exactly 8 such 3-NAF representations which have exactly r_{N−3} equivalent representations and do not end in 0.
Proof. For N ≤ 4, the second statement follows from Lemma 3.25: the only vectors corresponding to 3-NAF representations with the maximum number of equivalent representations are v_N P with P ∈ 𝒫, giving a total of 8 vectors, which are all distinct. The maximum number of equivalent representations can then be calculated manually, such as by using a non-deterministic variant of the modular algorithm. For N > 4, Lemma 3.29 immediately gives the first statement. Its proof also shows that the vector of any representation with exactly r_{N−3} equivalent representations is of the form e A_{d_1} ⋯ A_{d_n} = t_{N−4} B, B ∈ ℬ_0. Since we are only counting representations that do not end in 0, it follows that B consists only of R matrices, therefore e A_{d_1} ⋯ A_{d_n} ≈ t_{N−4}. By definition of ≈, there are 8 such vectors, which are distinct due to Lemma 3.27.