While Virginia Tech has had a strong presence in IPv6 deployment for several years, the skills, awareness, and competency to support it at the department level remain variable. Nearly the entire campus infrastructure has had natively routed IPv6 — the current generation of the Internet Protocol, as we like to say — since around 2004, and I recall early deployments at some scale as far back as 1998. Now, in mid-2013, with Google, Facebook, and Netflix traffic largely traveling over IPv6, approximately half of the campus’s aggregate Internet traffic uses it. When I ceased being a network engineer as my day job in Spring 2006, I gave a talk at our campus IT support symposium (DCSS) on advanced networking, focused on Internet2, National LambdaRail, and IPv6. Even then, we had gone beyond the killer app being “I can ping a funny-looking address,” and today IPv6 is a mainstream Internet technology. Yet many departmental IT support personnel and system administrators still find IPv6 an obscure technology.

In response to discussions on VT’s “techsupport” discussion list (e.g. http://listserv.vt.edu/cgi-bin/wa?A2=ind1308&L=TECHSUPPORT&F=&S=&P=96646), I wanted to share the experience we have had at VTTI with IPv6. A particularly salient point made by one discussant was the relative dearth of guidelines, documentation, and best practices for the support community, and I hope to address some of that here. I think the community would benefit from a simple FAQ of recommended “do this” best practices on the VT computing site, and hopefully this article will get that started; certainly, adding the v6 blocks to the article I started years ago on “VT campus addresses” (https://computing.vt.edu/content/ip-addresses-virginia-tech) would be well received. On the other hand, no one tells you much about IPv4 per se, so such a FAQ should not be considered a necessary condition for the average admin learning what they need to know.

Over the last five years, VTTI has migrated most of its Windows servers from Server 2003 to Server 2008(R2) or 2012, its Active Directory domain from Windows 2000 to 2003 and now 2008 (soon a new domain running 2012), its Exchange environment from 2003 to 2010 (soon to go to Office 365), and its workstations mostly from XP to Windows 7. At the same time, we have added several Mac and Linux systems, especially several Linux servers, and we have begun using virtualization and cloud methodologies. Throughout these changes, our use of IPv6 has steadily increased. The largest headaches have been poor support in Exchange 2003 (even when the underlying operating system supported IPv6 fairly well) and the occasional misconfigured static AAAA record. We did have IPv6 working with XP and Server 2003, but the support was not great — I recommend not spending much effort on those platforms.

With the current state of the art — Windows 7 and Server 2008, Mac OS X, modern Linux (even relatively ancient 2.6.18 kernels) — the support is much better, and our consensus is “just use it.” The biggest issue now is the lack of dynamic DNS outside the Active Directory environment, but we assign static addresses to the servers we care about anyway.

I would also add that IPv6 has not been a major effort among our advances — non-zero, to be sure, but not major. It is just something we have done on the margins along the way, like drinking water with the meal. Compared to the changes to our environment described above, we have spent far more time and effort developing our significant computing, database, and file system environments. Along the way, we have (with the help of many others) deployed compute clusters, parallel database systems, automated data collection systems, complex project-specific servers, etc., while VTTI has been collecting and processing petabytes of research data in pursuit of our research mission. IPv6 does have some strategic value to us, as it has a place in intelligent vehicle communications, but it is far less critical than our data-intensive research computing environment. I don’t point this out so much to brag about our work — though I confess I am (justifiably, I think) proud of what my team has accomplished — but rather to point out that our adoption of IPv6 has not been a major distraction, nor need it be for you. It does, however, require a little effort and awareness. Hopefully, if you are an IT professional, that is a Good Thing.

A challenge has simply been getting comfortable using and seeing IPv6 addresses. All I can say is that you get used to it. Familiarize yourself with RFC 4291 (http://tools.ietf.org/rfc/rfc4291). In particular, know what link-local addresses are (FE80:…), what tunnel addresses look like (2001:0:… or 2002:…), what the campus prefixes are (2001:468:c80:… and 2607:b400:…), how an Ethernet MAC address can be used (http://packetlife.net/blog/2008/aug/4/eui-64-ipv6/), and how zero-fill (::) works (http://tools.ietf.org/html/rfc5952). Filtering source addresses to allow only campus addresses is still not a horrible idea, though filtering down to individual hosts is a foolish one. More on that later.
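If a hands-on exercise helps, the pieces above can be poked at with Python’s standard ipaddress module. This is just an illustrative sketch — the MAC address is made up — showing RFC 5952 zero-fill compression and the modified EUI-64 derivation of an interface ID from an Ethernet MAC:

```python
import ipaddress

# Zero-fill (::) per RFC 5952: str() always emits the canonical
# compressed form of an IPv6 address.
addr = ipaddress.IPv6Address("2001:0468:0c80:0000:0000:0000:0000:0001")
print(addr)                                             # 2001:468:c80::1

# Link-local addresses (FE80:...) live in fe80::/10.
print(ipaddress.IPv6Address("fe80::1").is_link_local)   # True

def eui64_interface_id(mac):
    """Modified EUI-64 (RFC 4291, Appendix A): flip the universal/local
    bit of the first MAC byte, then splice ff:fe into the middle."""
    b = bytes(int(x, 16) for x in mac.split(":"))
    eui = bytes([b[0] ^ 0x02]) + b[1:3] + b"\xff\xfe" + b[3:6]
    return ":".join(f"{(eui[i] << 8) | eui[i + 1]:x}" for i in range(0, 8, 2))

print(eui64_interface_id("00:11:22:33:44:55"))          # 211:22ff:fe33:4455
```

Run the derivation against your own machine’s MAC and compare it to what your system reports; after a few rounds of that, the addresses stop looking funny.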

We have had a few lessons learned, beyond “just use it,” that continue to be reflected in our operating practices. The most significant of these is “don’t tunnel.” In general, you are better off using whatever your local LAN environment supports, and if you are stuck at some remote site with antiquated legacy-IP-only infrastructure, you are probably better off just using that. So, unless you specifically know you want to establish 6to4 or Teredo tunnels, disable these services (in the Windows world, we set a domain policy to disable the IP Helper service). As an IT geek, if you want to play with tunnels at home to reach your v6-only ssh server, by all means, that is a Good Thing — but the Great Unwashed should not be volunteered for that kind of treatment. Your helpdesk will appreciate not getting those support calls.
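As a complement to (not a substitute for) disabling the tunnel services by policy, tunnel addresses are easy to spot when auditing, because 6to4 and Teredo each use a well-known prefix. A minimal Python sketch, with invented sample addresses — note that the Teredo block 2001::/32 does not catch native campus 2001:468:… addresses:

```python
import ipaddress

# 6to4 uses 2002::/16; Teredo uses 2001::/32 (RFC 4380). A host whose
# global address falls in either block is tunneling, not native v6.
TUNNEL_PREFIXES = (ipaddress.ip_network("2002::/16"),
                   ipaddress.ip_network("2001::/32"))

def is_tunneled(addr):
    a = ipaddress.IPv6Address(addr)
    return any(a in net for net in TUNNEL_PREFIXES)

print(is_tunneled("2002:80ad:1::1"))       # True  (6to4)
print(is_tunneled("2001:0:4136:e378::1"))  # True  (Teredo)
print(is_tunneled("2001:468:c80::1"))      # False (native campus)
```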

VTTI has several — i.e., “more than one” — v6-only hosts, especially where they need only internal communication (e.g. hypervisors), and I have spoken with admins on campus who bind ssh only to v6, etc. Most of our systems at VTTI are dual-stack, but where we don’t need to, we try not to burn scarce legacy addresses. We use DHCP for most legacy address assignment and dynamic assignment for v6; we must actively manage our DHCP server for legacy IP assignments, but IPv6 dynamic assignment and discovery mostly Just Works. We seldom think about it. Our Windows domain uses DDNS, and much intradomain traffic is v6. We seldom think about it. A very large portion of our global Internet traffic (Google, Facebook, Netflix, Apple, other edus) is v6. We seldom think about it. I hope you get the idea — we use IPv6, and we seldom think about it.

So in general, we use dual stack and take the native networking of the environment. What about IPv6-only environments? Here, we think about it more. I mentioned that we occasionally use v6-only hosts, but that is a relative corner case. On the other hand, we have recently upgraded the infrastructure of our Smart Road network to enable our research into vehicle-to-vehicle and vehicle-to-infrastructure applications, especially using Dedicated Short-Range Communications (DSRC). For this environment, we deploy a v6-only environment for new systems; we do use RFC 1918 addresses for legacy devices, e.g. older traffic signals, but we resist any IPv4 NAT rules. We felt everything moving forward *must* be v6 — this isn’t really my requirement; it is the industry standard. As you would expect, there are issues due to legacy equipment, but the more significant challenge has been getting technical staff acclimated. In two or three years, your car will talk to other vehicles and to infrastructure with v6 only.

For static assignment of hosts, e.g. database servers, we have also spent some time thinking about this. I will admit this is easy to “get wrong.” The wrongness is related to why IP filtering just doesn’t matter so much: what you don’t want is too much predictability in your numbering scheme. We do like to see some relatively obvious mapping from a host’s static IPv4 address to its static IPv6 address, so we map the lower 32 bits to the IPv4 value, then add a random byte elsewhere in the address to provide some randomness. This isn’t perfect, but it helps prevent the obvious “scan the lower 32 bits” attack. For more details, see RFC 5157 (http://www.ietf.org/rfc/rfc5157.txt). As that RFC points out, hostiles aren’t going to stumble on your randomly addressed hosts, but they will attack your well-known named hosts — www, the MX record, etc. Note that we do use MAC-address-based addressing for dynamic hosts, though I think it would be a good idea to dispense with it.
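Here is a sketch of the scheme in Python. The prefix, the position of the random byte, and the sample IPv4 address are all illustrative assumptions, not our actual values; the point is only that the low 32 bits track the legacy address while one random byte keeps the full interface ID unguessable:

```python
import ipaddress
import random

PREFIX = ipaddress.ip_network("2001:468:c80:1::/64")  # example subnet

def static_v6(v4, salt=None):
    """Map the host's IPv4 address into the low 32 bits of the
    interface ID, salted with one random byte just above them."""
    if salt is None:
        salt = random.randrange(1, 256)  # pick once, record it -- then it is static
    iid = (salt << 32) | int(ipaddress.IPv4Address(v4))
    return PREFIX[iid]

print(static_v6("198.51.100.25", salt=0x5e))  # 2001:468:c80:1:0:5e:c633:6419
```

The mapping stays legible (c633:6419 is hex for 198.51.100.25), but a scanner sweeping prefix::0:0 through prefix::ffff:ffff finds nothing.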

So for a “best practice,” I would recommend using only random assignment of addresses in conjunction with DDNS, even though this isn’t strictly what VTTI does. Most hosts default to MAC-address-based addressing, and it would probably be best if we turned this off. For static assignment of well-known hosts, you may as well use predictable addresses (e.g. www = local:network::80), but you may equally well use random ones, since these hosts will be in DNS.

What about firewall rules? The issue of address predictability is related to why IP filtering is increasingly less sensible with IPv6. To understand this, we need first to come to grips with why we use IP source address filtering in the first place. Part of what has shaped our belief that filtering on source address is a good idea is that it is easy to sweep a 16-bit address space (or even a 32-bit one) — we’ve all seen the ssh probes that happen every day. Now, it is true that allowing hosts to connect based on their source IP is insufficient to actually grant them *access* to any resource. All it means is “if you send me a SYN, I send you a SYN ACK.” It doesn’t imply you will then provide a legitimate credential to get access, and the whole concept of “trust” has nothing whatsoever to do with the accident of your network address. So, when you filter based on source address, these are not “trusted” networks; it is just a convenient (and quite minimal) risk mitigation in the event that someone with a zero-day exploit happens to find you. Since randomly addressed IPv6 hosts won’t be found, this is far less important with IPv6. So, if it makes you feel better to filter source addresses, do so with a very wide filter, like the entire campus. Anything like trying to list individual hosts will encourage you to use predictable host addresses, and that is a Bad Idea.
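The arithmetic behind “random IPv6 hosts won’t be found” is worth doing once. Assuming a generous million probes per second (my number, purely for scale):

```python
# Sweeping an address space at one million probes per second:
# a 32-bit space falls in about an hour, but the 64-bit interface-ID
# space of a single /64 subnet takes over half a million years.
PROBES_PER_SEC = 1_000_000
SECONDS_PER_YEAR = 86400 * 365.25

for bits in (16, 32, 64):
    years = 2**bits / PROBES_PER_SEC / SECONDS_PER_YEAR
    print(f"{bits}-bit space: {years:.3g} years to sweep")
```

This is why the daily ssh sweeps you see on v4 simply do not translate to a well-numbered v6 subnet.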

Admittedly, this discussion of firewall rules is far too cursory to do it justice. But I am trying to keep this article to IPv6 operations issues, not the questionable practice of filtering network traffic based on source IP address. [I will just add: it isn’t so much that the practice itself is questionable; it is that far too much emphasis is placed upon it as a “security” practice, and the costs of over-filtering are seldom appreciated.] For that discussion, we will need much more flame retardancy. So suffice it to say for now, as a best practice: if you feel you must filter based on source address, filter only to the campus address blocks.

Speaking of filtering, the true mark of stupidity is the admin who blanket-filters ICMP. If you do this with IPv6, the protocol doesn’t work. Period. (Neighbor Discovery and path MTU discovery both ride on ICMPv6.) So Stop Doing Things That Don’t Work!

So, to recap:

  • IPv6 Just Works
  • Learn
  • Don’t disable IPv6
  • Get familiar with IPv6 addresses
  • Use IPv6-only hosts that don’t need legacy IP (e.g. servers with known v6 capable clients)
  • Don’t use tunnels (turn off Helper service), unless you know what you’re doing
  • Use random IPv6 address assignment for non-well-known services
  • Use DDNS where you can
  • Don’t filter on individual host addresses; if you must filter, use big network blocks
  • Stop filtering ICMP!

[This post follows a discussion with my operations team about different IP address formats: dotted decimal, hex, octal, etc, and how DHCP-assigned hostnames work at Virginia Tech.]

As a follow-up to today’s tutorial on IP address number formats, it is fair to ask “where did you learn this?” Honest, I wasn’t born insane; it has taken years of striving effort to amass the esoterica that is my brain. But the answer is one of the true beauties of Unix, and one that today is such a lost concept: the manual — yes, Virginia, there is a manual! (Significantly, the answer is not “from the RFC” or “from the spec” … for comparison, I encourage you to read the IP RFC some time.)

One of the greatest innovations of the Unix revolution was the online manual. In 1971, Thompson (he wrote what became Unix in 1969) and Ritchie (he invented C — with Brian Kernighan he later wrote the book on it — and ported Ken’s system to it around 1973) were coerced by their manager Doug McIlroy into writing the system documentation, notably the “Unix Programmer’s Manual” for Unix version 1. The typesetting for this manual was done with a markup language (in the same vein that HTML is a markup language): roff (for “run off,” as the pages were run off the printing press). There were two formatting engines for roff: troff for high-quality, camera-ready printing devices, and nroff for character-mode terminals and similar devices (terminals at the time were not CRTs, but were often paper printers, more like a typewriter with a big continuous spool of paper feeding into it). So, since the manual had to be written, and since it was written in a manner that allowed it to be typeset equally well on high-quality paper or on lo-fi paper terminals, the on-line manual was born. Later, the “man” command would automatically find the manual entry you desired, format it through nroff, and display it on your terminal.

One of the first computer systems I used (or at least one of the first I used to accomplish real work) was a Prime 9650 mini-computer. Like any system at the time, the system documentation was essential; it was also voluminous. There was some online help, but if you wanted to know how something worked, you were walking across campus to one of the rooms where the manuals were kept (literally bolted to a large chest-high table). What Ken and Dennis had done, though, was take this many-volume set for Unix (Prime didn’t run Unix) and put it *all* at the fingertips of any Unix user. Arguably, this innovation, more than any of the numerous others (sometimes brilliantly elegant, sometimes half-assed kludges), has had the most to do with the ultimate success of Unix-type systems. [The relatively poor quality of many Linux man pages for years was part of what led many of us old-timers to dismiss it as a serious Unix-type OS — this has thankfully largely changed. Some time you can get me to rant about “info” vs. “man.”]

Now, when I say all of the documentation, I mean *all* — from the lowly “who” command to the deep arcana of the C library and kernel system calls … even to the intro page itself. There it was; you just typed “man intro” and you had a few screens on what Unix was; “man who” and you learned that “who -m” was the same thing as “who am I”; “man signal” showed you how to use the signal function in your C program as well as inventorying the signals you can send processes and which ones create core dumps. It was truly magical. And, after 4.2BSD in 1983 (this all pre-dates me, btw), you could even type “man inet” and learn about this new internet thingy (note the lower-case “i”).

So what does it look like? http://www.gsp.com/cgi-bin/man.cgi?section=3&topic=inet (btw, section=3 indicates this is the “programmers guide” volume of the manual).

Note the section on INTERNET ADDRESSES, and we have today’s lesson.

Pretty freaking cool, eh?


By the way, if you do some googling, you can find some interesting articles about how obscure IP address formats are used in phishing and similar escapades.

    Values specified using the `.' notation take one of the following
    forms:

          a.b.c.d
          a.b.c
          a.b
          a

    When four parts are specified, each is interpreted as a byte of data
    and assigned, from left to right, to the four bytes of an Internet
    address. Note that when an Internet address is viewed as a 32-bit
    integer quantity on the VAX the bytes referred to above appear as
    ``d.c.b.a''. That is, VAX bytes are ordered from right to left.

    When a three part address is specified, the last part is interpreted
    as a 16-bit quantity and placed in the right-most two bytes of the
    network address. This makes the three part address format convenient
    for specifying Class B network addresses as ``128.net.host''.

    When a two part address is supplied, the last part is interpreted as
    a 24-bit quantity and placed in the right-most three bytes of the
    network address. This makes the two part address format convenient
    for specifying Class A network addresses as ``net.host''.

    When only one part is given, the value is stored directly in the
    network address without any byte rearrangement.

    All numbers supplied as ``parts'' in a `.' notation may be decimal,
    octal, or hexadecimal, as specified in the C language (i.e., a
    leading 0x or 0X implies hexadecimal; otherwise, a leading 0 implies
    octal; otherwise, the number is interpreted as decimal).

    The inet_aton() and inet_ntoa() functions are semi-deprecated in
    favor of the addr2ascii(3) family. However, since those functions
    are not yet widely implemented, portable programs cannot rely on
    their presence and will continue to use the inet(3) functions for
    some time.