Building a Redundant and Manageable DHCP infrastructure

Table of contents

  • Introduction
  • Designing your TCP/IP network
    • Performance considerations and the proper sizing of Subnets
    • Designing a clean “Binary nice” subnet site
    • Routing of Subnets with Layer 3 switches
  • DHCP Relay
    • Using an NT or Win2k DHCP relay server in each subnet
    • Using a Layer 3 switch to relay DHCP for all VLANs
  • DHCP Redundancy and Configuration
    • Setting up two non-overlapping DHCP servers for maximum availability
    • Building DHCP scopes and setting the scope and server options
  • Using MAC address “security” on DHCP
  • Advanced Switches with 802.1x Port Based Access Control and EAP
  • Conclusion

Introduction

This document is an introduction to, and overview of, building a redundant and easy to manage DHCP infrastructure with modern technology. DHCP is a critical service that needs to be thoroughly integrated into a good network design for a practical and functional network infrastructure. Because it is impossible to talk about DHCP without talking about network infrastructure, I will start off by covering some basic TCP/IP network design. To fully comprehend all of the material you will need a basic understanding of TCP/IP networking concepts and of Cisco layer 3 switch configuration, but you can still read it at a high level to get a good basic understanding of this technology. Doing so will help you work better with professional networking consultants.

Designing your TCP/IP network

Performance and Sizing of Subnets:

In order to design a high performance and low congestion network, we must understand what the enemies of network performance are. The biggest enemy of network performance in the past was data collisions caused by Ethernet repeaters (AKA hubs). Any time one computer transmits data to another, the hub repeats that data to every single port, which causes congestion for everyone. In today’s networks this is a thing of the past, because Ethernet switches have reached such economies of scale and are so affordable that it would almost be silly to continue to purchase Ethernet repeater technology. Data collisions have all but become moot on modern Ethernet networks. Ethernet switches isolate traffic between two computers while keeping all other ports clear and open for all the other computers on the switch.

Because of this, the new king of congestion is the broadcast storm. Computers (especially the ones running NetBEUI) have a nasty habit of calling out or announcing to the entire subnet, forcing the Ethernet switch, which normally likes to keep traffic isolated, to send that data to every port in the same subnet. Even worse, sometimes every computer on that subnet has to respond to the sender, so the original broadcast is amplified by as many replies as there are hosts on the subnet. Unfortunately, this sometimes puts us back into the same predicament that Ethernet hubs had to constantly deal with.

The only way to combat this is to keep the number of hosts in a single broadcast domain to a minimum. That probably means no more than 128 computers on a single TCP/IP subnet. I have seen sites with thousands of computers on a single subnet, and I can tell you it is not pretty when monitoring the broadcast storms. In fact, the network instability was bad enough to kick people out of their terminal server sessions a dozen times a day!

Designing a clean “Binary nice” subnet site:

We will start with the premise that we have a single LAN site on a single campus. While it is possible to run DHCP over Wide Area Networks, it is not considered best practice, so we will stick to a single LAN in this paper. The site will have up to 1000 users with 1000 computers, broken down into VLANs (Virtual Local Area Networks, created by logically segmenting a network with a managed layer 2 or layer 3 switch) of 256 addresses each (a /24, which gives 254 usable host addresses), with no more than 100 users per VLAN so there is room to spare. This means we will require a minimum of 10 VLANs on this site. Additionally, because we want to be able to summarize this site into a single supernet when routing, we will round up to the next “nice” binary number, 16.

We will use the private class A scheme of 10.x.x.x for our company, so this entire site will run under the network ID of 10.0.0.0/20. For those of you new to this terminology, this is shorthand for the network ID 10.0.0.0 with a subnet mask of 255.255.240.0, which defines all IP addresses from 10.0.0.0 to 10.0.15.255. By using “binary nice” numbers like 2, 4, 8, 16, 32, and so on, I am able to define the entire site by the single network ID of 10.0.0.0/20. The reason for this is not solely aesthetic: it greatly simplifies routing and security rules because I can define the entire network with a single statement. This not only simplifies management, but also improves performance and reduces the chance of mistakes.

Some of you at this point may be balking at the idea of running 10 separate subnets for “only” 1000 users, but bear with me; it is not that difficult to handle if you use the right technology. Also keep in mind that there are 65,536 subnets of this size in the 10.0.0.0/8 class A private network, which means that you can have 4096 of these sites with 16 subnets each. The next campus LANs of similar size will be defined as 10.0.16.0/20, 10.0.32.0/20, 10.0.48.0/20, and so on.
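The subnet arithmetic above is easy to check with a few lines of code. The following is only a sketch using Python’s standard ipaddress module; the variable names are my own and are not part of any tool discussed in this paper.

```python
import ipaddress

# The whole campus, summarized as one "binary nice" supernet.
site = ipaddress.ip_network("10.0.0.0/20")

# Split the site into /24 subnets, one per VLAN.
vlans = list(site.subnets(new_prefix=24))

print(site.netmask)              # subnet mask of the /20
print(site[0], "-", site[-1])    # first and last address of the site
print(len(vlans))                # number of /24 subnets in the site

# How many /20 sites fit in the 10.0.0.0/8 private network,
# and where the first few of them start.
private = ipaddress.ip_network("10.0.0.0/8")
site_count = 2 ** (site.prefixlen - private.prefixlen)
site_iter = private.subnets(new_prefix=20)
first_sites = [str(next(site_iter)) for _ in range(4)]

print(site_count)
print(first_sites)
```

Running it should confirm the figures in the text: a mask of 255.255.240.0, 16 subnets per site, and 4096 possible sites.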

Routing of Subnets with Layer 3 switches

Now that we have the basic network laid out, we must build it. The best way to handle this is with a managed Ethernet layer 3 switch such as a Cisco Catalyst 6500 series with MSFC, but a Cisco 3550-12G can be used instead for smaller networks or tighter budgets. (Note that Cisco is not the only company that can fill this job, but for the purposes of this paper I will use the Cisco example. Additionally, the 3550-12G makes for a great poor man’s core/distribution layer switch at 1/10th the cost.) Both of these switches can act as the core, or as the core and distribution layers, of the network. We can then connect access layer switches such as the Cisco 2980 to the core switch via gigabit Ethernet uplinks (you can use cheaper unmanaged switches for this too, but understand that you can’t break them up into additional VLANs and they have no trunking support). Distribute these access layer switches around the campus so that the actual Cat5e or Cat6 copper runs to the clients are kept to a minimum length, vastly reducing cabling cost in material and labor while increasing signal reliability.

Once this 2 or 3 tier design is in place, we can proceed to configuring the switches. The Cisco 2980 access layer switch has VLAN or bridge group capabilities, but has no routing capabilities of its own. For that, it can connect or trunk into the core switch using 802.1q trunking over the gigabit uplink via Cat6 copper or full duplex fiber. The core switch, whether the 6500 MSFC or the 3550-12G, can act as a massive VLAN router that handles all routing requests and acts as the default gateway for every VLAN on all tiers, using a single static routing table and/or a routing protocol such as EIGRP, RIP, or OSPF. It can also act as the DHCP relay agent for all the VLANs, which is definitely easier and cheaper than setting up at least 10 separate Windows or Linux boxes to act as DHCP relay agents.

[Figure: example with six VLANs, using six 2980 L2 switches and a 3550-12G as the core/distribution layer switch]

DHCP relay:
A DHCP relay agent sits in place of an actual DHCP server in a TCP/IP subnet. It extends the reach of the DHCP server, without the need for a DHCP server on each subnet, by acting as the DHCP server’s helper agent in a remote subnet. A DHCP relay does not manage IP addresses itself; it relays the DHCP request to the DHCP server on behalf of the client, obtains the IP address, and then hands the IP address to the asking client on behalf of the DHCP server. The only other option is a single DHCP server with multiple Ethernet ports, one sitting on each VLAN, but that has some serious limitations in scalability. On Cisco layer 3 switches, DHCP relay can easily be achieved with the single command “ip helper-address 10.0.14.255” entered into each VLAN interface as shown below. 10.0.14.255 is the broadcast address of the VLAN that will be home to my DHCP servers. You can use a specific IP address here instead of a broadcast address; with a single helper address that would mean either having only one active DHCP server or clustering two or more DHCP servers on a single IP address, although IOS also accepts more than one ip helper-address line per interface. For our example, the following are configuration examples with VLAN definitions (AKA bridge groups), default gateways, and DHCP relay settings, in the IEEE standard and Cisco configuration styles.

IEEE standard configuration on a Cisco 2948-L3 switch used as a Core/Distribution layer switch:

Bridge group declarations

  ! Declares VLAN 1
  bridge 1 protocol ieee
  ! Enables routing in VLAN 1
  bridge 1 route ip
  ! Declares VLAN 2
  bridge 2 protocol ieee
  ! Enables routing in VLAN 2
  bridge 2 route ip
  ! ... declare VLANs 3 - 14 yourself ...
  ! Declares VLAN 15
  bridge 15 protocol ieee
  ! Enables routing in VLAN 15
  bridge 15 route ip

Interface configurations

  ! Defines VLAN 1
  interface BVI1
   ! Sets the default gateway listener for VLAN 1 as 10.0.1.1
   ip address 10.0.1.1 255.255.255.0
   ! Sets the DHCP relay agent to forward to 10.0.14.255
   ip helper-address 10.0.14.255
   no ip directed-broadcast
  ! Defines VLAN 2
  interface BVI2
   ! Sets the default gateway listener for VLAN 2 as 10.0.2.1
   ip address 10.0.2.1 255.255.255.0
   ! Sets the DHCP relay agent to forward to 10.0.14.255
   ip helper-address 10.0.14.255
   no ip directed-broadcast
  ! ... fill in VLANs 3 - 14 yourself ...
  ! Defines VLAN 15
  interface BVI15
   ! Sets the default gateway listener for VLAN 15 as 10.0.15.1
   ip address 10.0.15.1 255.255.255.0
   ! Sets the DHCP relay agent to forward to 10.0.14.255
   ip helper-address 10.0.14.255
   no ip directed-broadcast

Port configurations

  ! Defines Fast Ethernet Interface 1
  interface FastEthernet1
   no ip address
   no ip directed-broadcast
   ! Binds Interface 1 to VLAN 1
   bridge-group 1
  ! Defines Fast Ethernet Interface 2
  interface FastEthernet2
   no ip address
   no ip directed-broadcast
   ! Binds Interface 2 to VLAN 2
   bridge-group 2
  ! ... fill in Interfaces 3 - 14 yourself ...
  ! Defines Fast Ethernet Interface 15
  interface FastEthernet15
   no ip address
   no ip directed-broadcast
   ! Binds Interface 15 to VLAN 15
   bridge-group 15

(Note that VLAN 14 is the only bridge group interface that does not need the helper-address, because it contains the DHCP servers themselves. Also note that VLANs 11 through 15 will be used for spares, DMZ, or server farms. I didn’t have a 3550-12G with Gigabit Ethernet handy, so I used a 2948-L3 with Fast Ethernet as the core switch for this example, but it is the same idea.)
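If you would rather not type the elided VLANs by hand, the three configuration sections above are regular enough to generate. The following is a rough sketch, not a vetted tool: it assumes that every VLAN follows the same 10.0.N.1 pattern shown for VLANs 1, 2, and 15, that port N belongs to bridge group N, and that VLAN 14 simply omits the helper-address as described in the note. The function names are my own. Review the output against your own hardware before pasting it into a switch.

```python
HELPER = "10.0.14.255"   # broadcast address of the DHCP server VLAN
DHCP_VLAN = 14           # the VLAN that holds the DHCP servers


def bridge_lines(vlan):
    """Bridge group declaration for one VLAN."""
    return [f"bridge {vlan} protocol ieee", f"bridge {vlan} route ip"]


def bvi_lines(vlan):
    """Routed interface (default gateway plus DHCP relay) for one VLAN."""
    lines = [f"interface BVI{vlan}",
             f" ip address 10.0.{vlan}.1 255.255.255.0"]
    if vlan != DHCP_VLAN:
        # The DHCP servers' own VLAN needs no relay.
        lines.append(f" ip helper-address {HELPER}")
    lines.append(" no ip directed-broadcast")
    return lines


def port_lines(port):
    """Bind Fast Ethernet port N to bridge group N (assumed mapping)."""
    return [f"interface FastEthernet{port}",
            " no ip address",
            " no ip directed-broadcast",
            f" bridge-group {port}"]


def build_config(vlans=range(1, 16)):
    """Emit the three sections in the same order as the example above."""
    config = []
    for build in (bridge_lines, bvi_lines, port_lines):
        for vlan in vlans:
            config.extend(build(vlan))
    return config


print("\n".join(build_config()))
```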

Additionally, some of the Cisco L3 switches use a different type of command line interface.  The following is an example with a Cisco 6509 MSFC L3 module:

  interface Vlan1
   description Subnet 1
   ip address 10.0.1.1 255.255.255.0
   ip helper-address 10.0.14.255
   no ip redirects
   no ip directed-broadcast

This looks quite a bit different from the 2948-L3 configuration, but it still uses the same DHCP relay command. The interface Vlan command accomplishes the same thing as the interface BVI command, but it is a little easier with the 6509 type CLI (command line interface) because you don’t need to declare the IEEE bridge protocol. The Cisco 2948-L3 CLI must manage the routing as well as the switching and port configurations. The 6509 MSFC module is more of a dedicated routing and management module, with the physical switch ports handled by a separate CLI. Consult your switch manual or Cisco’s web site for more information on your particular hardware.

While it is possible to use a Windows or Linux server as a DHCP relay agent, it would seem to be overkill to dedicate 15 or more separate machines to do the job of a single command on your layer 3 switch. Note that without the L3 switch, it would be extremely impractical to implement this degree of TCP/IP segmentation on a LAN: you would need 15 separate servers as DHCP relay agents and 15 traditional routers to join the 15 VLANs, all of which would be absurd. The point is, take the easy route and use a layer 3 switch at the heart of your network. It opens up all sorts of possibilities.
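To make the relay agent’s job more concrete, here is a toy model of what it does with a client broadcast. This is only a sketch: the giaddr (“gateway IP address”) and hops fields come from the BOOTP/DHCP packet format rather than from anything configured above, the DhcpRequest class and relay function are names I made up, and the MAC address is a placeholder.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DhcpRequest:
    client_mac: str
    giaddr: str = "0.0.0.0"   # relay agent address; all zeros when the client sends it
    hops: int = 0


def relay(request, interface_ip, helper_address):
    """Return (destination, forwarded request) for a client broadcast.

    interface_ip is the address of the VLAN interface the broadcast
    arrived on; helper_address is the ip helper-address target.
    """
    forwarded = request
    if request.giaddr == "0.0.0.0":
        # The first relay stamps its own address so the server can tell
        # which subnet the client lives on.
        forwarded = replace(forwarded, giaddr=interface_ip)
    forwarded = replace(forwarded, hops=forwarded.hops + 1)
    return helper_address, forwarded


discover = DhcpRequest(client_mac="00:11:22:33:44:55")
destination, forwarded = relay(discover, "10.0.1.1", "10.0.14.255")
print(destination, forwarded.giaddr, forwarded.hops)
```

The important part is the giaddr stamp: it is what later lets a single DHCP server tell VLAN 1 clients apart from VLAN 2 clients.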

DHCP redundancy and configuration:

Setting up two non-overlapping DHCP servers for maximum availability
As I mentioned earlier, the DHCP servers will reside in the subnet of 10.0.14.0/24 along with many of your other servers. Since DHCP is an extremely low activity service, my recommendation is to host the DHCP servers on your Windows NT or Windows 2000 domain controllers or file servers, along with other common services like DNS and WINS. You only need to find two servers to give it a home. Once you do, simply install the DHCP service and configure each server to serve only half of each subnet with non-overlapping scopes (clustering your DHCP servers is also good, but that requires Windows 2000 Advanced Server, which may not be an option for everyone). The first DHCP server will be configured with a scope of host numbers 10-109, and the second DHCP server will host 110-219. This leaves hosts 1-9 reserved and 220-254 for static IP addresses for things like printers. This is what is called a 50/50 configuration; you may also hear recommendations for an 80/20 configuration where IP addresses are a bit scarcer. In this architecture, however, we are leaving so much breathing room that only 50% of the total subnet is more than enough for all DHCP clients.

I also recommend not using DHCP reservations, because they make the management of DHCP servers extremely messy by fragmenting the scopes. I would much rather assign people addresses in the 220-254 range (make this range as large as you need) than let other system administrators or users pick their favorite number.

Because the DHCP relay agent is forwarding to the broadcast address of the subnet where these two DHCP servers reside, it is basically a first to respond, first to serve environment. But it doesn’t matter, since either server’s range alone can hold all of the users on a VLAN. Statistically, if the two servers have equal load and are equal in speed, users will end up split half and half between them.
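The split is simple enough to tabulate for all ten user VLANs. The sketch below assumes the host ranges given above (10-109 on the first server, 110-219 on the second) and the 10.0.N.0/24 numbering from the switch examples; the function names are my own.

```python
import ipaddress


def split_scopes(vlan):
    """Return the (first, last) address pair each DHCP server leases on a VLAN."""
    base = ipaddress.ip_address(f"10.0.{vlan}.0")
    server_1 = (base + 10, base + 109)
    server_2 = (base + 110, base + 219)
    return server_1, server_2


def overlaps(a, b):
    """True if two inclusive (first, last) ranges share any address."""
    return a[0] <= b[1] and b[0] <= a[1]


for vlan in range(1, 11):
    s1, s2 = split_scopes(vlan)
    # The two servers must never be able to lease the same address.
    assert not overlaps(s1, s2)
    print(f"VLAN{vlan}: server 1 {s1[0]}-{s1[1]}, server 2 {s2[0]}-{s2[1]}")
```

The overlap check is the part worth keeping if you adapt the ranges: two independent servers have no way of knowing what the other has leased, so the ranges must be disjoint by construction.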

Building DHCP scopes and setting the scope and server options

For our example, we will use Windows 2000. On each of these DHCP servers, you will need to create 10 new scopes using the create scope wizard. During the creation of these scopes, simply name them VLAN1 through VLAN10 and enter the corresponding IP ranges. Be sure to enter only the default gateway for each scope and don’t enter any other DHCP options. The reason lies in the difference between scope options and server (global) options, which I often see people confuse. It is possible to put any type of DHCP attribute, like default gateway, DNS server, or WINS, under either scope or server options, but there is only one proper way to do it. The default gateway should always be put under scope options, as you have already done during the creation of the scopes. All other standard attributes like WINS and DNS should be placed under server options (formerly known as global options under NT4). The server options will then automatically be inherited by all of the scopes, saving you a lot of manual entry and possibility for errors. To configure the server options, simply right click on server options and hit “configure options” to get the following window:

Set 006 for your DNS servers

Set 015 for your default domain suffix

Set 044 for your WINS servers

Set 046 to 0x8 (hybrid node) for your WINS/NBT node type

Now imagine doing this 10 times, once for each scope; it would be silly. Putting these additional settings under Server Options is the best way to go. Then repeat this procedure on the second DHCP server, doing everything the same. The only difference is that the host range will be 110-219 instead of the 10-109 used on the first DHCP server.

Some of you astute readers at this point may be wondering how to actually bind all the different scopes to their respective network IDs. The answer is surprisingly simple: you do nothing! When you created the scopes, you had to define the IP range each one operates in. That alone is enough configuration to match up the scopes with the subnets they will serve. When the DHCP server receives the forwarded request from the DHCP relay agent (or IP helper), it simply examines the relay agent’s IP address, which the relay agent records in the giaddr field of the forwarded request, then matches it up to the scope that serves the subnet of the relay agent and grants an IP configuration set back to the relay agent. That IP configuration set is then passed on by the DHCP relay agent to the original client that made the DHCP request in the first place.
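Both ideas in this section, option inheritance and scope matching, fit in a few lines of code. This is a simplified model of the behavior rather than how the Windows DHCP service is actually implemented. The relay agent’s address is carried in the giaddr field of the forwarded request; the DNS, WINS, and domain values below are placeholders I invented for illustration.

```python
import ipaddress

# Server (global) options, entered once. All values are hypothetical.
SERVER_OPTIONS = {
    "006 DNS Servers": ["10.0.14.10", "10.0.14.11"],
    "015 DNS Domain Name": "example.com",
    "044 WINS/NBNS Servers": ["10.0.14.10", "10.0.14.11"],
    "046 WINS/NBT Node Type": "0x8",
}

# One scope per user VLAN; the only scope-level option is the default gateway.
SCOPES = {
    ipaddress.ip_network(f"10.0.{n}.0/24"): {"003 Router": f"10.0.{n}.1"}
    for n in range(1, 11)
}


def options_for(giaddr):
    """Pick the scope whose subnet contains the relay agent's address."""
    relay_ip = ipaddress.ip_address(giaddr)
    for subnet, scope_options in SCOPES.items():
        if relay_ip in subnet:
            # Scope options are layered on top of the inherited server options.
            return subnet, {**SERVER_OPTIONS, **scope_options}
    return None, {}


subnet, options = options_for("10.0.2.1")
print(subnet)
print(options["003 Router"], options["046 WINS/NBT Node Type"])
```

A request relayed from 10.0.2.1 lands in the VLAN2 scope and receives that scope’s gateway plus every server option, without any explicit binding between scope and VLAN.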

Finally, after all that, be sure you “activate” your scopes and authorize your DHCP servers by right clicking on the DHCP server and choosing “authorize”. Once they are authorized and activated, you have set up two DHCP servers to serve 10 separate subnets with the aid of a single layer 3 switch. Note that this type of infrastructure is extremely scalable and could just as easily serve 1000 scopes if needed. A DHCP server only has to do one transaction per user per week, so even 1000 scopes is not a lot of work for a slow 486 computer.
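A quick back-of-the-envelope calculation shows how light that load is. The sketch below takes the one transaction per user per week figure above at face value and assumes 100 clients per scope, matching the 100 users per VLAN used earlier; both numbers are assumptions, not measurements.

```python
SECONDS_PER_WEEK = 7 * 24 * 60 * 60


def transactions_per_second(scopes, clients_per_scope, per_client_per_week=1):
    """Average DHCP transaction rate under the stated assumptions."""
    return scopes * clients_per_scope * per_client_per_week / SECONDS_PER_WEEK


rate = transactions_per_second(scopes=1000, clients_per_scope=100)
print(f"{rate:.3f} transactions per second")
```

Even with 1000 scopes, the average works out to well under one transaction per second, although real traffic will bunch up at times such as the start of the working day.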

Be aware that this type of architecture absolutely mandates a good DNS and WINS infrastructure. You cannot rely on the old broadcast discovery techniques like you could under a flat subnet where everyone lived. But that is a great performance advantage, and it means you no longer rely on luck with the “broadcast and pray for a response” approach. Rest assured that having a disciplined TCP/IP name resolution infrastructure will pay great dividends when all the inconsistencies and mysteries of legacy style Windows networking disappear.

Using MAC address “security” on DHCP:

You can set up casual “security” for your network by only issuing IP addresses to pre-reserved MAC addresses. The reason I put “security” in quotes is that it is security through obscurity. It can only be used for casual security because it is based on the honor system. MAC addresses can be changed on any network adapter within seconds. Your MAC address is what you declare it to be. This is the same reason why MAC addresses can’t really secure wireless networks: they are so easy to forge. The other problem with this “security” scheme is that even if you don’t assign someone an IP address, that doesn’t mean they can’t simply type in an IP address manually and still participate on your network. Additionally, maintaining a 12 digit hex number for each of a thousand users gets to be quite cumbersome. This technique keeps the non-technical person out, but it has no security capabilities beyond that. Real security needs to be handled at the switch level with 802.1x and EAP.
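A small sketch shows both the bookkeeping involved and why the check is so weak. It is an illustration only, not how any particular DHCP server implements reservations; the MAC addresses are made up, and normalize_mac and may_lease are names of my own.

```python
import re


def normalize_mac(text):
    """Reduce the common MAC notations to 12 lowercase hex digits."""
    digits = re.sub(r"[-:.\s]", "", text).lower()
    if not re.fullmatch(r"[0-9a-f]{12}", digits):
        raise ValueError(f"not a 12 digit MAC address: {text!r}")
    return digits


# Hypothetical list of registered adapters, in whatever notation people send in.
ALLOWED = {normalize_mac(mac) for mac in ["00-11-22-33-44-55", "0011.2233.4466"]}


def may_lease(claimed_mac):
    """The server can only check the address the client claims to have."""
    return normalize_mac(claimed_mac) in ALLOWED


print(may_lease("00:11:22:33:44:55"))   # a registered adapter, or anyone claiming to be it
print(may_lease("00:11:22:33:44:77"))   # an unregistered adapter
```

Nothing in may_lease can distinguish the real owner of a registered address from someone who typed that address into their adapter settings, which is exactly the honor system problem described above.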

Advanced Switches with 802.1x Port Based Access Control and EAP:

Some advanced switches like the Cisco Catalyst 6500 support 802.1x port based access control and the Extensible Authentication Protocol (EAP). Basically, this means no authentication, no access. The Ethernet port remains closed until you have authenticated successfully over EAP. Unlike the previous method of using MAC reservations on the DHCP server, you can’t get in by forging a MAC address or manually entering an IP address. Either hack is useless when 802.1x/EAP is employed on the access switch. When 802.1x is employed, a client connecting to a port on the switch must support the 802.1x protocol. Currently, Windows XP is the only operating system that natively supports 802.1x, but Microsoft is promising 802.1x support for legacy operating systems like 98, NT, and 2000 by the end of 2002. Basically, when the 802.1x capable client connects to the switch, it must send EAP credentials to the switch. The switch then forwards the EAP message to the RADIUS (Remote Authentication Dial-In User Service) server. If the RADIUS server accepts the credentials, it will respond with an EAP success message to the switch. Only then will the switch transition the port to an open state and permit DHCP requests and full network participation. Additionally, this same RADIUS infrastructure can be used to provide enterprise grade wireless security.
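The port behavior described above can be modeled as a tiny state machine. This is a deliberately simplified sketch: a real switch speaks EAP over the wire and talks to a real RADIUS server, whereas here a dictionary lookup stands in for RADIUS, and the class, function, user name, and password are all invented for the illustration.

```python
from enum import Enum


class PortState(Enum):
    UNAUTHORIZED = "unauthorized"
    AUTHORIZED = "authorized"


# Hypothetical stand-in for the RADIUS server's user database.
RADIUS_USERS = {"alice": "correct-password"}


def radius_accepts(username, password):
    return RADIUS_USERS.get(username) == password


class SwitchPort:
    def __init__(self):
        # Every port starts closed.
        self.state = PortState.UNAUTHORIZED

    def eap_login(self, username, password):
        """Open the port only if the RADIUS stand-in accepts the credentials."""
        if radius_accepts(username, password):
            self.state = PortState.AUTHORIZED
        else:
            self.state = PortState.UNAUTHORIZED
        return self.state

    def forwards(self, frame_type):
        """EAP is always let through; everything else needs an open port."""
        if frame_type == "EAP":
            return True
        return self.state is PortState.AUTHORIZED


port = SwitchPort()
print(port.forwards("DHCP"))    # before authentication
port.eap_login("alice", "correct-password")
print(port.forwards("DHCP"))    # after a successful login
```

Note that the DHCP request is blocked at the port itself, so neither a forged MAC address nor a hand-typed IP address gets a client any further than the authentication exchange.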

For more information on 802.1x and Cisco Switches, see this Cisco configuration guide for port based authentication.

Conclusion:

All the old concepts and ideas on hubs, switches, routers, and DHCP servers have been revolutionized by this new approach.  Not only are we able to create a more manageable and robust DHCP and Network infrastructure, we are able to do it with less money, equipment, and time.  It is simply a matter of taking advantage of what new technology has to offer.

2 thoughts on “Building a Redundant and Manageable DHCP infrastructure”

  1. Another option is to simply use the DHCP server built into most routers and layer 3 switches. Cisco IOS DHCP is reliable and quite easy to use if you are comfortable with the command line.
    This way, if you have a meltdown in your server room, DHCP is still available as it is part of the infrastructure.

    1. The problem with using the switches or routers is that it’s not centralized. When you have one central DHCP server, it’s easy to find any dynamic IP address in the company. Furthermore, the DHCP server can update the dynamic names in Active Directory.
