32leaves.net

Word generation using an n-dimensional discrete hypercube

This algorithm is something a friend of mine and I have developed quite some time ago (probably a year or so). It might also be already known under some name and I’m not aware of it. None the less it’s an interesting solution to a quite interesting problem.
Suppose that you wanted to generate the words w \in \Sigma^n of the alphabet \Sigma = \{\alpha_0, \alpha_1, \dots, \alpha_m\}. And that you wanted to do that in a parallel manner, such that you had a set of threads where each thread has a unique number out of 0\dots \vert\Sigma\vert^n assigned.
In that case it would be favorable to have a mapping in the form \underbrace{\{k: 0 \leq k < \vert\Sigma\vert^n\}}_S\to\Sigma^n, so that each thread could easily (reading with less computational effort) generate the word it’s supposed to work on. Notice that implicit definition of S:=\{k: 0 \leq k < \vert\Sigma\vert^n\}.
To solve the problem described previously, we start with a function that maps elements of S to points in an n-dimensional discrete hypercube. That mapping V: S\to\{ k: 0\leq k < \vert\Sigma\vert \}^n is defined as follows: V(s) = \begin{bmatrix} \left\lfloor\frac{s}{l^0}\right\rfloor \mod l \\ \left\lfloor\frac{s}{l^1}\right\rfloor \mod l \\ \vdots \\ \left\lfloor\frac{s}{l^{n-1}}\right\rfloor \mod l \end{bmatrix} where l:=\vert\Sigma\vert. This mapping obviously maps a natural number s < \vert\Sigma\vert^n to a point in a discrete n-dimensional hypercube with an edge length of \vert\Sigma\vert. Example: Consider \vert\Sigma\vert = 3 and n = 3. Then we can visualize V as follows:
Now that we have a mapping from a natural number to a point in the hypercube, we need a mapping from a point in the hypercube to the word itself. Such a mapping W: \{ k: 0\leq k < \vert\Sigma\vert \}^n \to \Sigma^n is easy to find. Consider W(p) := \alpha_{p_0}\alpha_{p_1}\dots\alpha_{p_n} where the p_i are the components of the vector p and \alpha_i \in \Sigma.
So our final mapping F: S\to\Sigma^n is F:=W\circ V and exactly what we desired. One might notice that F must be surjective so that each word is computed if there are just enough threads. To prove that, we’ll prove that V as well as W is surjective (since the functional composition of two surjective functions is surjective as well).
Theorem: V is surjective, so that \forall v \in \{ k: 0\leq k < \vert\Sigma\vert \}^n : \exists x \in S : V(x) = v.
Proof: Let l:=\vert\Sigma\vert, u_i\in\mathbb{N}, y\in S and v:=\begin{bmatrix} v_1 & \dots & v_n\end{bmatrix}^T. Assume that y = v_1\cdot l^0 + \dots + v_n\cdot l^{n-1}, then

V(y) = 	\begin{bmatrix}  		v_1+\underbrace{\dots+v_{n}\cdot l^{n-1}}_{u_1\cdot l} \mod l \\ 		\vdots \\ 		\underbrace{\left\lfloor\frac{v_1\cdot l^0 + \dots}{l^{n-1}}\right\rfloor}_{u_n\cdot l} + v_n \mod l 	\end{bmatrix} (1)
= \begin{bmatrix}  		v_1 \mod l & \dots & v_n \mod l 	\end{bmatrix}^T (2)
= \begin{bmatrix} v_1 & \dots & v_n\end{bmatrix}^T = v (3)

The transition from (1) to (2) is allowed as u_i\cdot l\mod l = 0. Similar goes for the equivalence of (2) and (3): v_i \mod l = v_i as v_i \leq l.

Theorem: W is surjective, so that \forall w \in \Sigma^n : \exists v \in \{ k: 0\leq k < \vert\Sigma\vert \}^n : W(v) = w.
Proof: Let w:=\alpha_{v_1}\dots\alpha_{v_n} then the vector v=\begin{bmatrix}v_1 & \dots & v_n\end{bmatrix} (which obviously fullfils W(v)=w) must exist, since v_i < \vert\Sigma\vert and v\in \{ k: 0\leq k < \vert\Sigma\vert \}^n.
So the mapping F is surjective and does exactly what we want. I consider that solution to be a particular elegant one and wanted of post it for about a year now. Finally I found some spare time to do so. If you use that solution, find a mistake or find it somewhere else (maybe even previously described) please drop a comment.

Leave a Reply

Fork me on GitHub