QR codes have a limited ability to store binary data. In practice, binary data have to be encoded in characters according to one of the modes already defined in the standard for QR codes. The easiest mode to use is called Alphanumeric mode (see Section 7.3.4 and Table 2 of [ISO18004]). Unfortunately, Alphanumeric mode uses 45 different characters, which implies that neither Base32 nor Base64 is a very effective encoding.
A 45-character subset of US-ASCII is used; the 45 characters usable in a QR code in Alphanumeric mode (see Section 7.3.4 and Table 2 of [ISO18004]). Base45 encodes 2 bytes in 3 characters, compared to Base64, which encodes 3 bytes in 4 characters.
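The 2-to-3 ratio follows from counting: three base 45 digits must be able to represent every two-byte value. The short Python check below illustrates this; the variable names are illustrative only and are not part of this specification.

```python
# Three base 45 digits cover every 16-bit value; two digits do not.
assert 45 ** 2 < 256 ** 2 <= 45 ** 3   # 2025 < 65536 <= 91125

# Two base 45 digits are enough for a single byte.
assert 256 <= 45 ** 2

# Characters produced per input byte.
base45_ratio = 3 / 2
base64_ratio = 4 / 3
print(base45_ratio, base64_ratio)
```

Base45 therefore produces more characters per byte than Base64; its benefit lies in using only the characters available in Alphanumeric mode.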
For encoding, two bytes [a, b] MUST be interpreted as a number n in base 256, i.e. as an unsigned integer over 16 bits so that the number n = (a * 256) + b.
This number n is converted to base 45 [c, d, e] so that n = c + (d * 45) + (e * 45 * 45). Note the order of c, d, and e, which are chosen so that the left-most [c] is the least significant.
The values c, d, and e are then looked up in Table 1 to produce a three-character string. The process is reversed when decoding.
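The two-byte step can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the helper name encode_pair and the ALPHABET constant (the Table 1 characters in value order) are assumptions made for this example.

```python
# Table 1 alphabet: the index of each character is its base 45 value.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"


def encode_pair(a: int, b: int) -> str:
    """Encode two bytes as three Base45 characters (hypothetical helper)."""
    if not (0 <= a <= 255 and 0 <= b <= 255):
        raise ValueError("both inputs must be byte values in 0..255")
    n = a * 256 + b                  # 16-bit unsigned integer
    e, rest = divmod(n, 45 * 45)     # most significant base 45 digit
    d, c = divmod(rest, 45)          # middle and least significant digits
    # c is the least significant digit and is written first.
    return ALPHABET[c] + ALPHABET[d] + ALPHABET[e]


print(encode_pair(65, 66))  # bytes of "AB" -> BB8
```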
For encoding a single byte [a], it MUST be interpreted as a base 256 number, i.e. as an unsigned integer over 8 bits. That integer MUST be converted to base 45 [c d] so that a = c + (45 * d). The values c and d are then looked up in Table 1 to produce a two-character string.
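The single-byte case can be sketched in the same way. The helper name encode_single and the ALPHABET constant are assumptions made for this illustration.

```python
# Table 1 alphabet: the index of each character is its base 45 value.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"


def encode_single(a: int) -> str:
    """Encode one byte as two Base45 characters (hypothetical helper)."""
    if not 0 <= a <= 255:
        raise ValueError("input must be a byte value in 0..255")
    d, c = divmod(a, 45)   # a = c + (45 * d)
    # c is the least significant digit and is written first.
    return ALPHABET[c] + ALPHABET[d]


print(encode_single(33))  # byte of "!" -> X0
```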
A byte string [a b c d ... x y z] with arbitrary content and arbitrary length MUST be encoded as follows: From left to right pairs of bytes MUST be encoded as described above. If the number of bytes is even, then the encoded form is a string with a length that is evenly divisible by 3. If the number of bytes is odd, then the last (rightmost) byte MUST be encoded on two characters as described above.
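Putting the two cases together, a complete encoder might look like the following Python sketch. It is one possible reading of the rules above; the function name base45_encode and the ALPHABET constant are assumptions of this example.

```python
# Table 1 alphabet: the index of each character is its base 45 value.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"


def base45_encode(data: bytes) -> str:
    """Encode an arbitrary byte string in Base45 (illustrative sketch)."""
    out = []
    # Walk the input left to right, two bytes at a time.
    for i in range(0, len(data) - 1, 2):
        n = data[i] * 256 + data[i + 1]
        e, rest = divmod(n, 45 * 45)
        d, c = divmod(rest, 45)
        out.append(ALPHABET[c] + ALPHABET[d] + ALPHABET[e])
    # An odd-length input leaves one trailing byte, encoded on two characters.
    if len(data) % 2:
        d, c = divmod(data[-1], 45)
        out.append(ALPHABET[c] + ALPHABET[d])
    return "".join(out)


print(base45_encode(b"AB"))
print(base45_encode(b"Hello!!"))
```

The output length is 3 characters per byte pair, plus 2 characters when the input has an odd number of bytes.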
For decoding a Base45 encoded string the inverse operations are performed.
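The inverse operations can be sketched in Python as shown below. The function name base45_decode and the VALUES lookup table are assumptions of this example, and rejecting malformed input with ValueError is a choice made in this sketch rather than something stated in the text above.

```python
# Table 1 alphabet: the index of each character is its base 45 value.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"
VALUES = {ch: value for value, ch in enumerate(ALPHABET)}


def base45_decode(text: str) -> bytes:
    """Decode a Base45 string back to bytes (illustrative sketch)."""
    if len(text) % 3 == 1:
        raise ValueError("a Base45 string cannot end in a single character")
    out = bytearray()
    for i in range(0, len(text), 3):
        chunk = text[i:i + 3]
        try:
            digits = [VALUES[ch] for ch in chunk]
        except KeyError as exc:
            raise ValueError(f"invalid Base45 character: {exc.args[0]!r}") from None
        # The left-most digit is the least significant.
        n = sum(v * 45 ** k for k, v in enumerate(digits))
        if len(chunk) == 3:
            if n > 0xFFFF:
                raise ValueError("three-character group exceeds 16 bits")
            out += n.to_bytes(2, "big")
        else:
            if n > 0xFF:
                raise ValueError("two-character group exceeds 8 bits")
            out.append(n)
    return bytes(out)


print(base45_decode("QED8WEX0"))
```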
If binary data is to be stored in a QR code, the suggested mechanism is to use the Alphanumeric mode that uses 11 bits for 2 characters as defined in Section 7.3.4 of [ISO18004]. The Extended Channel Interpretation (ECI) mode indicator for this encoding is 0010.
On the other hand, if the data is to be sent via some other transport, a transport encoding suitable for that transport should be used instead of Base45. For example, it is not recommended to first encode data in Base45 and then encode the resulting string in Base64 if the data is to be sent via email. Instead, the Base45 encoding should be removed, and the data itself should be encoded in Base64.
The Alphanumeric mode is defined to use 45 characters as specified in this alphabet.
+-------+----------+-------+----------+-------+----------+-------+----------+
| Value | Encoding | Value | Encoding | Value | Encoding | Value | Encoding |
+-------+----------+-------+----------+-------+----------+-------+----------+
| 00    | 0        | 12    | C        | 24    | O        | 36    | Space    |
| 01    | 1        | 13    | D        | 25    | P        | 37    | $        |
| 02    | 2        | 14    | E        | 26    | Q        | 38    | %        |
| 03    | 3        | 15    | F        | 27    | R        | 39    | *        |
| 04    | 4        | 16    | G        | 28    | S        | 40    | +        |
| 05    | 5        | 17    | H        | 29    | T        | 41    | -        |
| 06    | 6        | 18    | I        | 30    | U        | 42    | .        |
| 07    | 7        | 19    | J        | 31    | V        | 43    | /        |
| 08    | 8        | 20    | K        | 32    | W        | 44    | :        |
| 09    | 9        | 21    | L        | 33    | X        |       |          |
| 10    | A        | 22    | M        | 34    | Y        |       |          |
| 11    | B        | 23    | N        | 35    | Z        |       |          |
+-------+----------+-------+----------+-------+----------+-------+----------+
Table 1: The Base45 Alphabet
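In code, Table 1 is often conveniently represented as a single 45-character string whose indices are the values. The following Python sketch shows this representation; the names ALPHABET and VALUE_OF are illustrative and not part of this specification.

```python
# Table 1 as a string: position i holds the encoding of value i.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"

# All 45 entries are present and distinct.
assert len(ALPHABET) == 45 and len(set(ALPHABET)) == 45

# Reverse lookup, from character to value, as needed when decoding.
VALUE_OF = {ch: value for value, ch in enumerate(ALPHABET)}

print(repr(ALPHABET[36]))  # value 36 is the space character
print(VALUE_OF[":"])       # the last entry, value 44
```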
It should be noted that although the examples are all text, Base45 is an encoding for binary data where each octet can have any value 0-255.
Encoding example 1:
The string "AB" is the byte sequence [[65 66]]. If we look at all 16 bits, we get 65 * 256 + 66 = 16706. 16706 equals 11 + (11 * 45) + (8 * 45 * 45), so the sequence in base 45 is [11 11 8]. Referring to Table 1, we get the encoded string "BB8".
+-----------+-------------------------+
| AB        | Initial string          |
| [[65 66]] | Decimal value           |
| [16706]   | Value as 16-bit integer |
| [11 11 8] | Value in base 45        |
| BB8       | Encoded string          |
+-----------+-------------------------+
Table 2: Example 1 in Detail
Encoding example 2:
The string "Hello!!" as ASCII is the byte sequence [[72 101] [108 108] [111 33] [33]]. If we look at this 16 bits at a time, we get [18533 27756 28449 33]. Note the 33 for the last byte. When looking at the values in base 45, we get [[38 6 9] [36 31 13] [9 2 14] [33 0]], where the last byte is represented by two values. The resulting string "%69 VD92EX0" is created by looking up these values in Table 1. Note that the encoded string includes a space.
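+---------------------------------------+---------------------------+
| Hello!!                               | Initial string            |
| [[72 101] [108 108] [111 33] [33]]    | Decimal value             |
| [18533 27756 28449 33]                | Values as 16-bit integers |
| [[38 6 9] [36 31 13] [9 2 14] [33 0]] | Value in base 45          |
| %69 VD92EX0                           | Encoded string            |
+---------------------------------------+---------------------------+
Table 3: Example 2 in Detail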
Encoding example 3:
The string "base-45" as ASCII is the byte sequence [[98 97] [115 101] [45 52] [53]]. If we look at this two bytes at a time, we get [25185 29541 11572 53]. Note the 53 for the last byte. When looking at the values in base 45, we get [[30 19 12] [21 26 14] [7 32 5] [8 1]], where the last byte is represented by two values. Referring to Table 1, we get the encoded string "UJCLQE7W581".
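+----------------------------------------+---------------------------+
| base-45                                | Initial string            |
| [[98 97] [115 101] [45 52] [53]]       | Decimal value             |
| [25185 29541 11572 53]                 | Values as 16-bit integers |
| [[30 19 12] [21 26 14] [7 32 5] [8 1]] | Value in base 45          |
| UJCLQE7W581                            | Encoded string            |
+----------------------------------------+---------------------------+
Table 4: Example 3 in Detail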
Decoding example 1:
The string "QED8WEX0" represents, when looked up in Table 1, the values [26 14 13 8 32 14 33 0]. We arrange the numbers in chunks of three, except for the last one, which can be two numbers, and get [[26 14 13] [8 32 14] [33 0]]. Interpreting each chunk as a base 45 number, we get [26981 29798 33], where the bytes are [[105 101] [116 102] [33]]. If we look at the ASCII values, we get the string "ietf!".
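+-------------------------------+------------------------+
| QED8WEX0                      | Initial string         |
| [26 14 13 8 32 14 33 0]       | Looked up values       |
| [[26 14 13] [8 32 14] [33 0]] | Groups of three        |
| [26981 29798 33]              | Interpreted as base 45 |
| [[105 101] [116 102] [33]]    | Values as 8-bit bytes  |
| ietf!                         | Decoded string         |
+-------------------------------+------------------------+
Table 5: Example 4 in Detail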
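The intermediate rows of the decoding example can be reproduced step by step with the Python sketch below. It is written for illustration only, performs no input validation, and all variable names are assumptions of this example.

```python
# Table 1 alphabet: the index of each character is its base 45 value.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"

text = "QED8WEX0"

# Look up each character in Table 1.
values = [ALPHABET.index(ch) for ch in text]

# Arrange the values in groups of three; the last group may hold two.
groups = [values[i:i + 3] for i in range(0, len(values), 3)]

# Interpret each group as a base 45 number, least significant digit first.
numbers = [sum(v * 45 ** k for k, v in enumerate(g)) for g in groups]

# A three-value group yields two bytes; a two-value group yields one byte.
byte_groups = [
    list(n.to_bytes(2, "big")) if len(g) == 3 else [n]
    for g, n in zip(groups, numbers)
]

decoded = bytes(b for grp in byte_groups for b in grp).decode("ascii")
print(values, groups, numbers, byte_groups, decoded, sep="\n")
```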