Encoding a Phrase in a Number

In order to use the RSA cryptosystem to encrypt messages, it is necessary to encode them as a sequence of numbers of size less than $ n=pq$ . We now describe a simple way to do this. Note that in any actual deployed implementation it is crucial that you add extra random characters (``salt'') at the beginning of each block of the message, so that the same plain text encodes differently each time. This helps thwart chosen plain text attacks.

Suppose $ s$ is a sequence of capital letters and spaces, and that $ s$ does not begin with a space. We encode $ s$ as a number in base $ 27$ as follows: a single space corresponds to 0 , the letter $ A$ to $ 1$ , $ B$ to $ 2$ , $ \ldots$ , $ Z$ to $ 26$ . Thus ``RUN NIKITA'' is a number written in base $ 27$ :

RUN NIKITA$\displaystyle \quad\leftrightarrow \quad$ $\displaystyle 27^9\cdot 18 + 27^8\cdot 21 + 27^7\cdot 14 + 27^6\cdot 0 + 27^5\cdot 14$    
  $\displaystyle + 27^4\cdot 9 + 27^3\cdot 11 + 27^2\cdot 9 + 27\cdot 20 + 1$    
  $\displaystyle = 143338425831991$    (in decimal)$\displaystyle .$    

To recover the letters from the decimal number, repeatedly divide by $ 27$ and read off the letter corresponding to each remainder:

\begin{displaymath}
\begin{array}{rcrcrr}
143338425831991 &=& 5308830586370\cdot...
...''}\\
18 &=& 0 \cdot 27 &+& 18 & \quad\text{\lq\lq R''}
\end{array}\end{displaymath}

If $ 27^k\leq n$ , then any sequence of $ k$ letters can be encoded as above using a positive integer $ \leq n$ . Thus if we can encrypt integers of size at most $ n$ , then we must break our message up into blocks of size at most  $ \log_{27}(n)$ .

SAGE Example 3.2   We use SAGE to implement conversion between a string and a number. The input string s on a computer is stored in a format called ASCII, so each ``letter'' corresponds to an integer between 0 and $ 255$ , inclusive. This number is obtained from the letter using the ord command.
sage: def encode(s):
...    s = str(s)      # make input a string
...    return sum(ord(s[i])*256^i for i in range(len(s)))
sage: def decode(n): 
...    n = Integer(n)  # make input an integer
...    v = []
...    while n != 0:
...        v.append(chr(n % 256))
...        n //= 256    # this replaces n by floor(n/256). 
...    return ''.join(v)
sage: m = encode('Run Nikita!'); m
40354769014714649421968722
sage: decode(m)
'Run Nikita!'

William 2007-06-01