
What is DHCP Failover and How Does it Work?

written by Asterfusion

August 8, 2025

Every device on a network depends on being able to obtain and keep an IP address. In a large data center or enterprise network, if a vital service such as DHCP (Dynamic Host Configuration Protocol) fails, new clients cannot join the network and existing clients lose connectivity as their leases expire. DHCP Failover keeps the service resilient and available in this situation. This article explains what DHCP Failover is, what benefits it brings, and how it works.

What is DHCP Failover?

DHCP Failover is a technology designed to enhance the reliability and availability of DHCP (Dynamic Host Configuration Protocol) services.
In traditional DHCP deployments, typically only one DHCP server assigns IP addresses, leases, default gateways, DNS server addresses, and other configuration information to network clients. DHCP Failover establishes a cooperative relationship between two servers, allowing them to work collaboratively to handle scenarios where a single server experiences failure or excessive load.

What are the benefits of using DHCP Failover in networking?

Enhanced Network Reliability and Availability

  • Eliminate single points of failure: In enterprise networks and data centers, if the sole DHCP server suffers a hardware failure, software crash, or network outage, clients will be unable to obtain new IP addresses or renew existing leases, disrupting connectivity. DHCP Failover ensures that if one server fails, the other can quickly take over, maintaining service continuity.
  • Reduce downtime: Traditional recovery methods may require manual intervention (server restarts, reconfiguration, etc.), leading to extended outages. DHCP Failover automates failover, minimizing downtime and business impact.

Load Sharing Under High Demand

  • Handle large request volumes: As the number of network devices grows—such as in large campuses, schools, and hotels—DHCP servers can become overloaded. The load balancing mode in DHCP Failover distributes requests evenly between the two servers, improving performance and response times.

Taken together, DHCP Failover provides:

  • Automatic failover between the primary and secondary DHCP servers
  • Real-time lease synchronization across failover partners
  • Load balancing across the two servers
  • Continued DHCP service during server maintenance
  • Reliable address assignment for mission-critical enterprise environments

How does DHCP Failover work?

1. Message Exchange Process

DHCP Failover achieves high availability and load balancing through coordinated operation between two servers, exchanging specific control messages.

[Figure: DHCP Failover operating principles and interaction process]

CONTACT – Probe the communication integrity between partners.

  • Purpose: Periodically checks the connectivity between the Primary Server and Secondary Server to ensure the communication link is active.
  • Scenario: Keeps the connection active during periods with no data transfer to prevent timeout and disconnection.

CONNECT – Establish a connection with the secondary server.

  • Purpose: Used when initiating the initial connection between the two servers to negotiate communication parameters and perform authentication.
  • Scenario: Ensures both sides can communicate properly when starting failover operations or reconnecting after a disruption.

CONNECTACK – Acknowledge a connection attempt.

  • Purpose: Sent in response to a CONNECT message to confirm that the connection has been successfully established.
  • Scenario: Completes the connection setup process, enabling subsequent message exchanges.

STATE – Notify the peer of the current state or a state change.

  • Purpose: Exchanges failover state information between servers, such as whether the sender is in the Normal, communications_interrupted, or partner_down state.
  • Scenario: Supports load balancing adjustments and failover recovery, ensuring both sides share a consistent view of the system status.

BNDUPD (Binding Update) – Update binding information to the peer.

  • Purpose: Synchronizes lease update information between the two servers to keep their lease databases consistent.
  • Scenario: During address allocation, renewal, or release, the server uses BNDUPD to replicate lease changes to its partner.

BNDACK (Binding Acknowledgement) – Confirm receipt of a binding update.

  • Purpose: Sent after receiving a BNDUPD message to acknowledge that the lease update has been successfully received and applied.
  • Scenario: Serves as the response to BNDUPD, ensuring the reliability of lease synchronization.
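
The request and acknowledgement pairs described above (CONNECT/CONNECTACK and BNDUPD/BNDACK) can be illustrated with a small in-memory model. This is only a sketch: the FailoverServer class, its handle and assign methods, and the payload keys are hypothetical names chosen for illustration, and a real implementation exchanges these messages over a network connection with authentication and retransmission.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class MessageType(Enum):
    """The six control messages described in this section."""
    CONTACT = "CONTACT"
    CONNECT = "CONNECT"
    CONNECTACK = "CONNECTACK"
    STATE = "STATE"
    BNDUPD = "BNDUPD"
    BNDACK = "BNDACK"


@dataclass
class Message:
    type: MessageType
    payload: dict = field(default_factory=dict)


@dataclass
class FailoverServer:
    """Hypothetical in-memory model of one failover partner."""
    name: str
    leases: dict = field(default_factory=dict)  # IP address -> client id

    def handle(self, msg: Message) -> Optional[Message]:
        # A CONNECT is answered with a CONNECTACK.
        if msg.type is MessageType.CONNECT:
            return Message(MessageType.CONNECTACK)
        # A BNDUPD is applied to the local lease table, then acknowledged.
        if msg.type is MessageType.BNDUPD:
            ip = msg.payload["ip"]
            self.leases[ip] = msg.payload["client_id"]
            return Message(MessageType.BNDACK, {"ip": ip})
        # The other message types need no reply in this simplified model.
        return None

    def assign(self, ip: str, client_id: str, peer: "FailoverServer") -> bool:
        """Record a lease locally and replicate it to the peer via BNDUPD."""
        self.leases[ip] = client_id
        update = Message(MessageType.BNDUPD, {"ip": ip, "client_id": client_id})
        reply = peer.handle(update)
        # The update counts as synchronized only once a BNDACK comes back.
        return reply is not None and reply.type is MessageType.BNDACK
```

In this model, calling primary.assign("192.0.2.10", "client-1", secondary) leaves both lease tables with the same entry, which is the consistency that the BNDUPD and BNDACK exchange is meant to guarantee.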

2. DHCP Failover Operational States

2.1 Normal

When CONTACT messages are exchanged normally, the DHCP Failover state is Normal.

Using a split value of 50 on the Primary Server as an example, the operation is as follows:

  • Under normal conditions, each server holds half of the address pool and handles half of the client requests.
  • When a client requests an address, both servers receive the DHCP request, but only one responds. The responding server is determined by computing a hash value from the CLIENT_IDENTIFIER to decide which server will assign the address to that client.
  • The server that assigns the address synchronizes the lease information with its partner via DHCP Failover. The partner server receives and acknowledges the update.
  • When a client renews a lease, the request can be processed by either server.

Note: The split value defines the load-sharing ratio between the Primary and Secondary Servers. It is expressed as a percentage (0–100) and indicates the share of clients assigned to the Primary Server; the Secondary Server handles the remainder.
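
The hash-based selection described above can be sketched as follows. The article does not specify the hash function, so this example uses SHA-256 purely as a stand-in; the function names bucket_for and responder are hypothetical, and the mapping of the hash onto 100 buckets is an assumption made to match the percentage-style split value.

```python
import hashlib


def bucket_for(client_identifier: bytes) -> int:
    """Map a client identifier to a bucket in the range 0..99.

    SHA-256 is a stand-in here; the hash used by a real server may differ.
    """
    digest = hashlib.sha256(client_identifier).digest()
    return int.from_bytes(digest[:4], "big") % 100


def responder(client_identifier: bytes, split: int = 50) -> str:
    """Return which server answers this client for a given split percentage."""
    if not 0 <= split <= 100:
        raise ValueError("split must be between 0 and 100")
    # Buckets below the split value belong to the primary, the rest to the secondary.
    return "primary" if bucket_for(client_identifier) < split else "secondary"
```

Because the hash depends only on the client identifier, both servers compute the same answer independently, so exactly one of them responds to a given client without any per-request coordination.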

2.2 Communications-interrupted

Both servers periodically send CONTACT messages to verify that the link between them is working. If communication is interrupted, meaning that three consecutive CONTACT messages from the peer are not received within the duration specified by max-response-delay, the server enters the communications_interrupted state.

  • In the communications_interrupted state, each server retains half of the address pool.
  • When a client requests an address, the server first checks whether it is the server responsible for that client. If it is, it assigns the address immediately. If it is not, it holds the request; if the same client is still sending DHCPDISCOVER messages after 3 seconds, the server assigns an address from its own pool.
  • When communication is restored, each server synchronizes lease information with its peer. Clients can renew their leases through either server.

Note: The max-response-delay parameter (in seconds) specifies the maximum time a server waits without hearing from its failover peer. If the server does not receive any message from the peer within the configured number of seconds, it considers the communication link interrupted. The default value is 60 seconds.
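
The two decisions described in this state can be sketched as a pair of small functions. The function and parameter names are hypothetical; the 60-second default and the 3-second hold time are the values given above, and treating the boundary as "strictly longer than max-response-delay" is an assumption.

```python
def peer_link_interrupted(last_peer_message_at: float, now: float,
                          max_response_delay: float = 60.0) -> bool:
    """Return True when the peer has been silent for longer than max-response-delay.

    Both timestamps are in seconds on the same clock.
    """
    return now - last_peer_message_at > max_response_delay


def should_answer_discover(is_own_client: bool, seconds_client_has_waited: float,
                           hold_seconds: float = 3.0) -> bool:
    """Decide whether to answer a DHCPDISCOVER while communications are interrupted."""
    if is_own_client:
        # The client hashes to this server, so it answers immediately.
        return True
    # Otherwise answer only if the client is still asking after the hold time,
    # which suggests that the responsible server is not responding.
    return seconds_client_has_waited >= hold_seconds
```

The hold time gives the responsible server a chance to answer first if it is still alive, which helps avoid both servers offering addresses to the same client while they cannot coordinate.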

2.3 Partner_down

In the communications_interrupted state, if communication is still not restored after the duration specified by auto-partner-down (in seconds), the surviving DHCP server will assume that its peer is offline and enter the partner_down state. After the Maximum Client Lead Time (MCLT) expires, the surviving server will take control of the entire address pool.

  • Client address allocation: The surviving DHCP server assigns addresses based on client information and the current status of the address pool. Once the exchange is complete, it updates its local lease database.
  • Client lease renewal: Clients renew leases only with the surviving server.

Notes: The auto-partner-down parameter (in seconds) specifies the delay timer that starts when the server enters the communications_interrupted state. When the timer expires, the server automatically transitions to the partner_down state. MCLT (Maximum Client Lead Time) defines the maximum amount of time by which one server may extend a client's lease beyond the expiry time known to its partner; the default value is 3600 seconds.
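
The progression through the three states can be summarized as a timeline driven by how long the peer has been silent. The sketch below is an illustration under stated assumptions: the names are hypothetical, auto-partner-down has no default because the article does not give one, and the MCLT wait is assumed to start at the moment the server enters partner_down.

```python
from enum import Enum


class FailoverState(Enum):
    NORMAL = "normal"
    COMMUNICATIONS_INTERRUPTED = "communications_interrupted"
    PARTNER_DOWN = "partner_down"


def state_after_silence(silence: float, *, auto_partner_down: float,
                        max_response_delay: float = 60.0) -> FailoverState:
    """Return the failover state after the peer has been silent for `silence` seconds."""
    if silence <= max_response_delay:
        return FailoverState.NORMAL
    # The auto-partner-down timer starts on entering communications_interrupted.
    if silence <= max_response_delay + auto_partner_down:
        return FailoverState.COMMUNICATIONS_INTERRUPTED
    return FailoverState.PARTNER_DOWN


def full_pool_available_after(*, auto_partner_down: float,
                              max_response_delay: float = 60.0,
                              mclt: float = 3600.0) -> float:
    """Seconds of peer silence before the survivor controls the whole pool.

    Assumes the MCLT wait begins when the server enters partner_down.
    """
    return max_response_delay + auto_partner_down + mclt
```

For example, with an illustrative auto-partner-down of 600 seconds and the defaults above, the surviving server would enter partner_down after 660 seconds of silence and take over the entire pool after 4260 seconds.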

In conclusion, DHCP Failover is a fundamental technology for building resilient and highly available network infrastructures. It removes the single point of failure inherent in a single-server DHCP deployment. Through a defined set of message exchanges and three operational states (Normal, Communications-interrupted, and Partner_down), it allows network clients to keep obtaining and renewing IP addresses even when a server fails or is taken offline for maintenance. With DHCP Failover in place, organizations can keep DHCP service available, support large-scale device deployments, and maintain business continuity during server failures or maintenance activities.

Learn more about how to configure DHCP Failover.


Author:

Serena Guo

Technical Marketing Manager

Bridging R&D and Market for Open Campus Networks Asterfusion | 5+ Years at the Forefront of Campus Networking
