## IPv4 Address External Data Representation

- Written by Lawrence Hughes

Internally IP addresses are simply strings of bits (internal binary). This is what they are in memory or "on the wire". Humans cannot easily work with strings of 32 bits, let alone 128, even if they were strings of ASCII "0" and "1" characters. So for humans to read in output, or to enter as input, we have "external data representation".

IPv4 uses "dotted decimal" notation for external data representation, which breaks the 32 bits into four groups of eight bits each. In the original "classful allocation" scheme, this actually made some sense. Because the creators of IPv4 thought that many people would be confused by hexadecimal, they (unfortunately) chose to represent each 8 bit group with a base 10 (decimal) number. Eight bits can hold binary (base 2) values from 0000 0000 to 1111 1111. In decimal (base 10) this is from 0 to 255. In hexadecimal (base 16) this is from 0x00 to 0xff.

If you just ran the four decimal groups together with no separators, you would not be able to leave off leading zeros, and the result would *look* like a regular 12 digit decimal number in "place value notation" (which it isn't). Without separators, a typical address would look like: *172020002001*. They added the "dot" (period, decimal point) as a separator between the 8 bit groups, which made it clearly not just a 12 digit decimal number, and also allowed dropping leading zeros in each 8 bit group. So, the final external data representation for IPv4 looks like *172.20.2.1*.
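As a rough illustration of the relationship between the internal 32 bits and the dotted form, here is a minimal Python sketch. The function names `dotted_to_int` and `int_to_dotted` are made up for this example and are not from any standard.

```python
def dotted_to_int(address: str) -> int:
    """Pack a dotted-decimal IPv4 string into one 32-bit integer."""
    fields = address.split(".")
    if len(fields) != 4:
        raise ValueError(f"expected four fields, got {len(fields)}")
    value = 0
    for field in fields:
        octet = int(field)  # leading zeros are accepted and ignored
        if not 0 <= octet <= 255:
            raise ValueError(f"field out of range: {field}")
        value = (value << 8) | octet  # shift earlier fields left, append this one
    return value


def int_to_dotted(value: int) -> str:
    """Unpack a 32-bit integer into dotted decimal, dropping leading zeros."""
    octets = [(value >> shift) & 0xFF for shift in (24, 16, 8, 0)]
    return ".".join(str(octet) for octet in octets)


# The 32 bits as they would sit in memory, then the round trip back.
print(f"{dotted_to_int('172.20.2.1'):032b}")
print(int_to_dotted(dotted_to_int("172.020.002.001")))  # prints 172.20.2.1
```

The dot does the work here: `split(".")` finds the field boundaries, which is exactly what the run-together form *172020002001* cannot offer once leading zeros are dropped.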

**Why Was Decimal a Bad Choice for IPv4?**

I said it was unfortunate that they chose decimal because it is often necessary to convert dotted decimal addresses to and from 32 bit binary. For example, masking is used to split a 32 bit address into a prefix (network number) and an address suffix (interface identifier). This involves doing a Boolean AND operation with a 32 bit mask value, which works OK with dotted decimal so long as the mask can only be 255.0.0.0 (Class A, first 8 bits are 1), 255.255.0.0 (Class B, first 16 bits are 1) or 255.255.255.0 (Class C, first 24 bits are 1). You basically keep the fields where there was a 255 in the mask and set the other fields to zero. Before CIDR these were the only masks allowed. It was clumsy, but it worked. Until CIDR came along.

All IP addresses in a given subnet have the same prefix (network number), but each interface must have a unique suffix (interface identifier). For example, in a /24 subnet (mask 255.255.255.0), the network number is the first 24 bits and the interface identifier is the last 8 bits. So IP address 192.168.1.100 has network number 192.168.1.0 and interface identifier 0.0.0.100. The network number is the IP address bitwise ANDed with the subnet mask (255.255.255.0). The interface identifier is the IP address bitwise ANDed with the bitwise NOT of the subnet mask (0.0.0.255).

The Boolean AND function produces 1 if both inputs are 1, otherwise it produces 0.

The Boolean OR (inclusive OR) function produces 1 if either or both of the inputs are 1.

The Boolean XOR (exclusive OR) function produces 1 if either of the inputs is 1, but not both.

The Boolean NOT function (ones' complement) inverts the input.

Boolean Operations (NOT, AND, OR and XOR)

A "bitwise" operation applies a Boolean operation (NOT, AND, OR, XOR) to each bit of the inputs to produce each bit of the output.
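To make the four operations concrete, here is a small Python sketch that prints the truth tables for single bits and then applies the same operations bitwise to 8 bit values. The helper name `not8` and the sample values 100 and 240 are chosen only for illustration.

```python
from itertools import product

# Truth table for the two-input operations on single bits.
for a, b in product((0, 1), repeat=2):
    print(f"{a} {b} | AND={a & b} OR={a | b} XOR={a ^ b}")

# NOT on a single bit: XOR with 1 flips it.
for a in (0, 1):
    print(f"NOT {a} = {a ^ 1}")


def not8(value: int) -> int:
    """Bitwise NOT limited to 8 bits.

    Python integers are unbounded, so ~value is negative;
    ANDing with 0xFF keeps only the low eight bits.
    """
    return ~value & 0xFF


# The same operations applied to every bit of an 8 bit group at once.
octet, mask = 100, 240
print(f"{octet:08b} AND {mask:08b} = {octet & mask:08b}")
print(f"NOT {mask:08b} = {not8(mask):08b}")
```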

So, "NOT 255.255.0.0" is "0.0.255.255". Also "192.168.1.100 AND 255.255.255.0" is "192.168.1.0". This makes sense if you realize that 255 decimal is 1111 1111 binary.

Doing a bitwise AND operation with a mask retains the bits in the IP address where there is a 1 bit in the mask, but clears any bits in the IP address where there is a 0 bit in the mask. Hence, "192.168.1.100 AND 255.255.255.0" retains the first three fields, but sets the last field to zero (producing the network number, 192.168.1.0). Likewise, "192.168.1.100 AND NOT 255.255.255.0", which is the same as "192.168.1.100 AND 0.0.0.255", retains the last field, but sets the first three fields to zero (producing the interface identifier, 0.0.0.100). This process is called "masking", and the second operand (e.g. 255.255.255.0) is called a "mask", or "subnet mask".
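The masking described above can be sketched in Python by doing the work on the 32 bit values rather than field by field. This uses the standard library `ipaddress` module only to convert between dotted decimal and integers; the function name `split_address` is a hypothetical one for this example.

```python
import ipaddress


def split_address(address: str, mask: str) -> tuple[str, str]:
    """Return (network number, interface identifier) in dotted decimal."""
    addr_bits = int(ipaddress.IPv4Address(address))
    mask_bits = int(ipaddress.IPv4Address(mask))

    # AND with the mask keeps the prefix bits and clears the rest.
    network = addr_bits & mask_bits
    # AND with NOT mask keeps the suffix bits; 0xFFFFFFFF limits NOT to 32 bits.
    interface = addr_bits & (~mask_bits & 0xFFFFFFFF)

    return str(ipaddress.IPv4Address(network)), str(ipaddress.IPv4Address(interface))


print(split_address("192.168.1.100", "255.255.255.0"))
# prints ('192.168.1.0', '0.0.0.100')
```

Because the AND is done on the whole 32 bit value, the same function works for masks that do not fall on a field boundary, such as 255.240.0.0.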

If the people who created IPv4 had gone with "dotted hex" instead of dotted decimal, the dotted-decimal address *192.168.1.100* would instead have been represented as *c0.a8.1.64*. The subnet mask *255.255.255.0* would have been *ff.ff.ff.0*. There is nothing stopping someone from writing software that uses these external data representations, but most people would be confused by them, and it would not match any training or documentation. It sure would be easier to understand and do subnetting though. Unfortunately we are pretty much stuck with dotted-decimal for IPv4.
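Since nothing stops software from using such a representation, here is a minimal Python sketch of the hypothetical "dotted hex" form. Dotted hex is not a standard notation, and the function names `to_dotted_hex` and `from_dotted_hex` are invented for this example.

```python
def to_dotted_hex(dotted_decimal: str) -> str:
    """Rewrite each decimal field in hexadecimal, without leading zeros."""
    return ".".join(format(int(field), "x") for field in dotted_decimal.split("."))


def from_dotted_hex(dotted_hex: str) -> str:
    """Rewrite each hexadecimal field in decimal."""
    return ".".join(str(int(field, 16)) for field in dotted_hex.split("."))


print(to_dotted_hex("192.168.1.100"))   # prints c0.a8.1.64
print(to_dotted_hex("255.255.255.0"))   # prints ff.ff.ff.0
print(from_dotted_hex("c0.a8.1.64"))    # prints 192.168.1.100
```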

**CIDR Arrives**

With CIDR, it became possible to have masks with any number of 1 bits from the left (from 8 to 30), with the remaining bits 0. For example, in a "/12" mask the first 12 bits are 1 and the remaining 20 bits are 0. In hex, that is simple: 0xfff00000. If there were something called "dotted hex" it would be ff.f0.0.0. Very simple. In dotted decimal, a /12 mask is 255.240.0.0. Where the heck does 240 come from? Well, 1111 1111 binary is 255 decimal, but 1111 0000 binary is 240 decimal (128 + 64 + 32 + 16).
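A short Python sketch shows how a prefix length turns into a mask, and why the hex form is so much easier to read than the dotted decimal one. The function names `prefix_to_mask` and `mask_to_dotted` are assumptions made for this example; the last line cross-checks the result against the standard library `ipaddress` module.

```python
import ipaddress


def prefix_to_mask(prefix_len: int) -> int:
    """Build a 32-bit mask with prefix_len 1 bits on the left, 0 bits after."""
    if not 0 <= prefix_len <= 32:
        raise ValueError("prefix length must be between 0 and 32")
    # Shift 32 one-bits left, then trim back to 32 bits.
    return (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF


def mask_to_dotted(mask: int) -> str:
    """Render a 32-bit mask in dotted decimal."""
    return ".".join(str((mask >> shift) & 0xFF) for shift in (24, 16, 8, 0))


mask = prefix_to_mask(12)
print(f"{mask:032b}")        # twelve 1 bits followed by twenty 0 bits
print(f"0x{mask:08x}")       # prints 0xfff00000
print(mask_to_dotted(mask))  # prints 255.240.0.0

# Cross-check with the standard library.
print(ipaddress.ip_network("0.0.0.0/12").netmask)
```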

**Converting Between Decimal and Binary**

To convert a binary number to decimal, you must add together the products of each bit and the corresponding power of two (place value notation). In other words, 1100 0101 binary is:

**1** × 128 = 128

**1** × 64 = 64

**0** × 32 = 0

**0** × 16 = 0

**0** × 8 = 0

**1** × 4 = 4

**0** × 2 = 0

**1** × 1 = 1

Adding the products gives 128 + 64 + 4 + 1 = 197, so 1100 0101 binary is 197 decimal.
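The place value sum above can be sketched in Python as follows. The function name `binary_to_decimal` is made up for this example; Python's built-in `int(text, 2)` does the same job and is used here only as a cross-check.

```python
def binary_to_decimal(bits: str) -> int:
    """Sum each bit times its power of two (place value notation)."""
    digits = bits.replace(" ", "")  # allow grouping such as "1100 0101"
    total = 0
    # The rightmost bit has place value 2**0, the next 2**1, and so on.
    for position, digit in enumerate(reversed(digits)):
        total += int(digit) * 2 ** position
    return total


print(binary_to_decimal("1100 0101"))  # prints 197
print(int("11000101", 2))              # built-in conversion, same result
```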

To convert a decimal number to binary, you have to do successive integer division by powers of two and carry the remainders down to the next row.

197 ÷ 128 = **1**, remainder = 69

69 ÷ 64 = **1**, remainder = 5

5 ÷ 32 = **0**, remainder = 5

5 ÷ 16 = **0**, remainder = 5

5 ÷ 8 = **0**, remainder = 5

5 ÷ 4 = **1**, remainder = 1

1 ÷ 2 = **0**, remainder = 1

1 ÷ 1 = **1**, remainder = 0

The binary value is the series of quotients, or 1, 1, 0, 0, 0, 1, 0, 1. So 197 decimal = 1100 0101 binary. See, easy. Right? Wrong! Compared to converting hex to binary or binary to hex, this is a major pain. I definitely can't do it in my head. Converting even an 8 bit value between decimal and binary is a pain, let alone a 32 bit value. This is because 10 is not a power of two (2, 4, 8, 16, 32, 64, etc.). So each decimal digit converts not into some integer number of bits but into about 3.321928 bits, and every 3.321928 bits converts into one decimal digit. This is because *log base 2 of 10 is 3.321928...* (2 to the 3.321928 power is roughly 10).
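For comparison, here is a minimal Python sketch of the successive division procedure, followed by the hex equivalent and the logarithm mentioned above. The function name `decimal_to_binary` and its `width` parameter are assumptions made for this example.

```python
import math


def decimal_to_binary(value: int, width: int = 8) -> str:
    """Divide by descending powers of two, collecting the quotients."""
    if not 0 <= value < 2 ** width:
        raise ValueError(f"value does not fit in {width} bits")
    quotients = []
    remainder = value
    for power in range(width - 1, -1, -1):
        # Each quotient is one bit; the remainder carries to the next row.
        quotient, remainder = divmod(remainder, 2 ** power)
        quotients.append(str(quotient))
    return "".join(quotients)


print(decimal_to_binary(197))  # prints 11000101

# Hex needs no arithmetic: each hex digit is exactly four bits.
# 197 decimal is 0xc5, and c -> 1100, 5 -> 0101.
print(" ".join(f"{int(digit, 16):04b}" for digit in "c5"))  # prints 1100 0101

# Bits per decimal digit: not a whole number.
print(math.log2(10))
```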