Hexadecimal

Subtitle: "How did those letters get into my numeric addresses?"

Numbers are actually abstract concepts, but to talk about them, write them down, or do calculations with them, we need to represent them in some way. Most people are familiar with the decimal number system, which is the most widely used scheme to represent numbers today.

The Romans had a very different scheme for representing numbers, called Roman Numerals (not to be confused with Roman candles or Roman Polanski). This representation of numbers was so clumsy that they had to struggle to create even simple technology like roads and aqueducts. Quick, what is VII divided by XXII? - use only Roman Numerals. The Arabs gave us several significant contributions to math, some of them passed along from India - the zero (yes, the Arabs added "nothing" to math), al-jabr (algebra) and (of relevance here) place value notation. That gave them, and those of us who inherited these simple but far reaching ideas, a completely unfair advantage over the Romans. So we managed to go to the moon, while the Romans could only look at it. This is similar to the way that English speakers had an unfair advantage over the Chinese and Japanese in the early days of computing (before we created displays and keyboards that could handle their written language ideographs - actually keyboards are still a challenge).

The first 10 numbers have simple names (zero, one, two, three...) and simple written representations (0, 1, 2, 3...). Beyond 9, we must make numeric "words" from this simple alphabet, just like we do with letters. The "numeric word" 423 is a shorthand notation for "four hundreds plus two tens plus three". The place of a digit in the "word" determines the multiplier used (1, 10, 100, 1000, etc). If you are a mathematician, you immediately recognize these multipliers as 10⁰, 10¹, 10², 10³, etc. Each of these multipliers is ten raised to some integer power. The number 10 is the base of each of these multipliers. Hence this numbering system is called "base 10" or "decimal". We happen to use base 10 because we have ten "digits" (fingers). Computing would have been simpler if we had evolved with 16 fingers instead of 10. If it had been simple to count on our toes as well, we could be using base 20 instead today.

What about numbers after the "decimal point"? The scheme actually extends to those very nicely. The multipliers there are 10⁻¹ (0.1), 10⁻² (0.01), 10⁻³ (0.001), etc. You can do the same thing in other number bases, but fortunately, IP addresses are all non-negative integers (no signs and no fractional part), so we don't need to worry about that for now.

BTW, in case you were confused by the 10⁰ = 1, any nonzero number to the zero-th power is 1. (How do you multiply something by itself zero times?). But 0 to any positive power is 0. So what is 0 to the zero-th power? 0 or 1? Think about that one for a while! Extra credit for the right answer! Fortunately we don't need the answer to that to continue.

So the decimal number 1623 is actually

1 * 1000 = 1000
6 *  100 =  600
2 *   10 =   20
3 *    1 =    3
           ____
           1623

This looks a little obvious in decimal - you'll see why it is done like this later.
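
The same place-value sum can be sketched in a few lines of Python. This is only an illustration: the name place_value_terms is made up for this example, and it assumes digits are written with the characters '0' to '9' (plus 'a' to 'z' for bases above 10).

```python
# Hypothetical helper for illustration: break a number written in some
# base into (digit, multiplier, product) rows, most significant digit first.
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"

def place_value_terms(text, base=10):
    rows = []
    # Walk the digits from the right, so position 0 is the "ones" place.
    for position, char in enumerate(reversed(text.lower())):
        digit = DIGITS.index(char)          # raises ValueError for unknown characters
        if digit >= base:
            raise ValueError(f"{char!r} is not a digit in base {base}")
        multiplier = base ** position
        rows.append((digit, multiplier, digit * multiplier))
    rows.reverse()
    return rows

rows = place_value_terms("1623")
for digit, multiplier, product in rows:
    print(f"{digit} * {multiplier:>4} = {product:>4}")
print("total:", sum(product for _, _, product in rows))   # total: 1623
```

Passing a different base as the second argument changes only the multipliers; the digits and the method stay the same.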

We have ten "digits" in the decimal system, each with a simple name and written representation ('0' to '9'). These are used in each place in place value notation. For counting, we use 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, etc.

Base 8 (Octal)

In base 8, the multipliers are 8⁰ = 1, 8¹ = 8, 8² = 64, 8³ = 512 (multipliers written here in decimal - in base 8, they would still be 1₈, 10₈, 100₈ and 1000₈). Base 8 is also known as octal. The 8 written as a subscript indicates that those numbers are represented in base 8. So in the previous example, the decimal number 1623 could have been written 1623₁₀.

So, the octal number 1623₈ is converted to decimal like this:

1 * 512 = 512
6 *  64 = 384
2 *   8 =  16
3 *   1 =   3
          ___
          915₁₀

In base 8, there are only 8 distinct digits ('0' to '7'). The digits '8' and '9' from decimal are not used. You would count like this: 1, 2, 3, 4, 5, 6, 7, 10, 11, 12, 13, 14, 15, 16, 17, 20, 21, etc. The brilliant comedian Tom Lehrer once put it "base 8 is just like base 10, if you are missing two fingers!" (from the song "New Math").
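
If you have Python handy, the octal conversion and the counting sequence can be checked with its built-in int() (which accepts a base as a second argument) and format() (where the "o" code means octal). A small sketch:

```python
# int(text, 8) reads the text as an octal number.
octal_value = int("1623", 8)
print(octal_value)                     # 915

# Count from 1 to 17 (decimal), writing each number in octal.
counting = [format(n, "o") for n in range(1, 18)]
print(", ".join(counting))
```

The second line of output skips straight from 7 to 10, because '8' and '9' are not octal digits.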

Base 2 - Binary

Taking this to the extreme, we can have a representation with only two digits: '0' and '1'. The multipliers here are 2⁰ = 1, 2¹ = 2, 2² = 4, 2³ = 8, etc. Base 2 is also known as binary (as in the "binary language of moisture vaporators", which C-3PO happened to speak).

I can't tell you what 1623 base 2 is, because some of those digits don't exist in binary. On the other hand, I can tell you that 1623₁₀ is 11001010111₂. This can be seen from the following:

1 * 1024 = 1024
1 *  512 =  512
0 *  256 =    0
0 *  128 =    0
1 *   64 =   64
0 *   32 =    0
1 *   16 =   16
0 *    8 =    0
1 *    4 =    4
1 *    2 =    2
1 *    1 =    1
           ____
           1623

As we said, in binary, there are only two "digits", '0' and '1'. We count like this: 0, 1, 10, 11, 100, 101, 110, 111, 1000, 1001, etc (especially if our name happens to be Bender).

I used my Windows calculator in programmer mode to convert 1623 to binary, but you can do it by hand by repeatedly dividing by 2 and writing down the remainders (basically the above table reversed - the remainders are the binary digits, produced from right to left).
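
One way to do the by-hand conversion is to divide by 2 over and over, collecting the remainders. A minimal sketch (the function name to_binary is just for this example; Python's own format(n, "b") does the same job):

```python
def to_binary(n):
    """Convert a non-negative integer to a string of binary digits."""
    if n == 0:
        return "0"
    bits = []
    while n > 0:
        # The remainder is the next binary digit, working right to left.
        n, remainder = divmod(n, 2)
        bits.append(str(remainder))
    # The digits were collected least significant first, so reverse them.
    return "".join(reversed(bits))

print(to_binary(1623))   # 11001010111
```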

Base 16 - Hexadecimal


Hexadecimal is just like the other numbering systems covered, except that the multipliers are powers of 16, not of 10, 8 or 2. Those are (written in decimal) 16⁰ = 1, 16¹ = 16, 16² = 256, 16³ = 4096, etc. Base 16 is also known as hexadecimal (or often, inaccurately, "hex" for short - really "hex" would be base 6).

Hexadecimal also has 16 distinct digits. Now we have a problem. With octal, we could just leave out the decimal digits 8 and 9. Now we need six extra digits, for a total of 16 distinct one character symbols. We could create some new digit names, like "foo", "bar", "baz", "spa", "fon" and "chachie". We could even create six new characters (I won't try to draw any but you can imagine some). These would be a problem to display on computers, or enter on keyboards. So, instead we steal some of the other ASCII characters - 'a', 'b', 'c', 'd', 'e' and 'f', and make them do double duty as additional digits. So now you know how those letters have gotten mixed up in your numeric addresses. It's also acceptable to use 'A'-'F' (they have the same values as 'a'-'f'). Your Windows calculator can work with hexadecimal.

In Hexadecimal, we count like this: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b, c, d, e, f, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 1a, 1b, 1c, 1d, 1e, 1f, 20, 21, etc. Once a friend of mine's four year old son came up and told me he had learned to count. He proceeded to count to 20 in hexadecimal while his father (another computer geek) was rolling on the floor laughing. The poor kid probably confused the heck out of his first grade teacher. On the other hand, last year my wife was happy to learn that in hexadecimal she was only 29. This year I couldn't find an 'A' candle for her cake. It seems birthday cake numerals are available only in decimal.

There is also something in computing called base64 - the digits there are 'A'-'Z', 'a'-'z', '0'-'9', '+' and '/' (that's 26 + 26 + 10 + 2 = 64 distinct digits). Unlike base 16, base64 is case sensitive. Why would anyone need base64? You can only encode four binary digits in each ASCII character using hexadecimal, but you can encode six binary digits in each ASCII character using base64. It takes 6 hexadecimal characters to encode each 3 bytes (24 bits) of binary, but it takes only 4 base64 characters to encode the same 3 bytes - a savings of 33%. In large files or transmissions (when data is stored in ASCII) this is significant. Don't worry, we won't be using base64 in this training.
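
The size difference is easy to see with Python's standard base64 module. In this sketch the three sample bytes are arbitrary, chosen only for illustration:

```python
import base64

data = bytes([0xCA, 0xFE, 0x42])      # three arbitrary bytes (24 bits)

hex_text = data.hex()                                # 4 bits per character
b64_text = base64.b64encode(data).decode("ascii")    # 6 bits per character

print(hex_text, len(hex_text))        # cafe42 6
print(b64_text, len(b64_text))        # four characters long
```

Six characters versus four for the same three bytes is the 33% savings mentioned above.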

Conversions Between Different Bases

There is something cool about any base that is a power of two (2, 4, 8, 16) - each digit of a number in such a base corresponds to a whole number of binary digits. Base 8 digits contain 3 binary digits each (000 - 111), because 2³ is 8. Base 16 digits contain 4 binary digits each (0000 - 1111), because 2⁴ is 16. On the other hand, decimal digits each contain approximately 3.321928 binary digits, because 2 raised to the power 3.321928 is 10, or alternatively, log₂(10) = 3.321928. If that is confusing, the point is that it is quick and easy to convert between binary and octal or hexadecimal representations, while it is difficult to convert between decimal and any of binary, octal or hexadecimal.
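
The "binary digits per digit" figures are just base-2 logarithms, which Python's math.log2 can compute. A quick sketch:

```python
import math

# How many binary digits does one digit carry in each base?
bits_per_digit = {base: math.log2(base) for base in (2, 8, 16, 10)}

for base, bits in bits_per_digit.items():
    print(f"base {base:>2}: {bits:.6f} bits per digit")
```

Bases 2, 8 and 16 come out as whole numbers (1, 3 and 4), while base 10 lands awkwardly between 3 and 4.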

To convert between binary and hexadecimal, you only need one short table:

0 = 0000     4 = 0100     8 = 1000    c = 1100
1 = 0001     5 = 0101     9 = 1001    d = 1101
2 = 0010     6 = 0110     a = 1010    e = 1110
3 = 0011     7 = 0111     b = 1011    f = 1111

To convert from hexadecimal to binary, just replace each hexadecimal digit with the corresponding four bits in the table. So, 1623₁₆ is 0001 0110 0010 0011₂, and cafe₁₆ is 1100 1010 1111 1110₂. Note: with binary (especially if you are converting to or from hexadecimal) it is handy to add a space every four digits, kind of like the comma every three digits in decimal.

To convert from binary to hexadecimal, start on the right (least significant bits), and arrange the bits into groups of four. Replace each group of four bits with the corresponding hexadecimal digit from the above table. So 0110 1011 0101₂ is 6b5₁₆.
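
Both directions of the table lookup can be sketched in Python. The names HEX_TO_BITS, hex_to_binary and binary_to_hex are invented for this example, and the table is generated rather than typed in:

```python
# The sixteen-entry table: hex digit -> four bits, and the reverse.
HEX_TO_BITS = {format(n, "x"): format(n, "04b") for n in range(16)}
BITS_TO_HEX = {bits: digit for digit, bits in HEX_TO_BITS.items()}

def hex_to_binary(hex_text):
    # Replace each hex digit with its four bits, spaced for readability.
    return " ".join(HEX_TO_BITS[d] for d in hex_text.lower())

def binary_to_hex(bit_text):
    bits = bit_text.replace(" ", "")
    # Pad with zeros on the left so the bits split evenly into groups of four.
    bits = bits.zfill((len(bits) + 3) // 4 * 4)
    groups = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "".join(BITS_TO_HEX[g] for g in groups)

print(hex_to_binary("cafe"))             # 1100 1010 1111 1110
print(binary_to_hex("0110 1011 0101"))   # 6b5
```

Padding on the left has the same effect as starting the grouping from the right, which is what the by-hand method asks for.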

With a little practice, you can convert between binary and hexadecimal in your head, on the fly. Until then, use the above table, or your Windows calculator in programmer mode.

Dotted Decimal - Base 256

Just for fun, you can consider the dotted decimal notation used to represent IPv4 addresses as base 256. There are 256 distinct digits ('0' to '255'). Unlike the other bases we looked at, some of these digits take more than one character. Because of that, something is needed to separate "digits" - in this case a "dot" (period). The multipliers (in decimal) are 256⁰ = 1, 256¹ = 256, 256² = 65,536 and 256³ = 16,777,216.

So, to convert from dotted decimal (base 256) to real decimal (base 10), do the same as we did before. So the address 172.20.2.1 (a four digit number in base 256), converted to base 10 would be:

172 * 16,777,216 = 2,885,681,152
 20 *     65,536 =     1,310,720
  2 *        256 =           512
  1 *          1 =             1
                   _____________
                   2,886,992,385
 
This number, converted to hexadecimal is AC140201₁₆. As a check, AC₁₆ is 172₁₀, 14₁₆ is 20₁₀, 2₁₆ is 2₁₀ and 1₁₆ is 1₁₀. Voila - 172.20.2.1.
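
Since the calculator won't help here, a few lines of Python can do the base 256 arithmetic. This is a sketch only: dotted_to_int is a name made up for this example, and it does not check that there are exactly four parts.

```python
def dotted_to_int(address):
    """Treat an IPv4 dotted decimal address as a number in base 256."""
    value = 0
    for part in address.split("."):
        digit = int(part)
        if not 0 <= digit <= 255:
            raise ValueError(f"{part!r} is not a base 256 digit")
        # Shift what we have one place left (times 256), then add the new digit.
        value = value * 256 + digit
    return value

n = dotted_to_int("172.20.2.1")
print(n)                  # 2886992385
print(format(n, "08X"))   # AC140201
```

Python's standard ipaddress module gives the same number with int(ipaddress.IPv4Address("172.20.2.1")).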
 
Sorry, the programmer mode in Windows calculator doesn't speak base 256. If they had used hexadecimal to represent IPv4 addresses, it would have been much simpler. Fortunately in IPv6, they did.