Using BFD to Track WAN Status and Change HSRP Priority

July 29, 2015

It’s been five years since I started this blog! Time flies, and a lot has happened since then. Thanks for coming along for the ride. What better way to celebrate than with a blog post?

This post is going to be short and to the point.

Many of us run HSRP or VRRP. It is quite common to run it in a topology with dual routers and dual exits to the WAN, where you don’t want to black-hole your traffic.


One traditional way of achieving this is to track the interface that faces the WAN. There are a couple of drawbacks to this approach, though:

  • You may not get a link-down event on failure (e.g. when the router connects through a switch)
  • An error may occur that does not produce a link-down event

The next option is to use IP SLA to send ICMP Echo probes towards the next-hop of the WAN or some destination further into the network. Enhanced Object Tracking (EOT) can then be used to create a track object that decrements the priority of the HSRP active router when the ICMP Echo probe fails. This works better, but there are still some drawbacks to this approach:

  • The frequency can’t be set lower than one second
  • There is no multiplier; a single failed ping triggers the track, which can lead to false positives
  • False positives lead to the state changing more often than necessary
  • The two issues above can be mitigated by configuring a delay on the tracking object
  • IP SLA is likely more CPU intensive than BFD
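For reference, the IP SLA approach described above could look something like the following sketch. The addresses, timers and interface names here are hypothetical, purely for illustration:

```
! Probe a hypothetical WAN next-hop once per second
ip sla 1
 icmp-echo 203.0.113.1 source-interface GigabitEthernet1.100
 frequency 1
ip sla schedule 1 life forever start-time now

! Track the probe result; the delay dampens false positives
track 1 ip sla 1 reachability
 delay down 3 up 5

! Decrement HSRP priority when the track goes down
interface GigabitEthernet1.200
 standby 1 track 1 decrement 11
```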

Unfortunately, there is no way to directly configure HSRP to check the status of BFD running on your WAN interface. That does not mean we can’t solve the task at hand, though. BFD is supported over static routes. What if we insert a dummy route into the RIB while BFD is running successfully over the WAN link, and track that this route is installed in the RIB? If it is not installed, it must mean that BFD has failed and that the HSRP priority of the active router should be decremented.

The configuration is quite simple. In my lab I have an ISP router with the following config:

interface GigabitEthernet1.100
encapsulation dot1Q 100
ip address
bfd interval 500 min_rx 500 multiplier 3


router bgp 1
bgp log-neighbor-changes
network mask
neighbor remote-as 2
neighbor fall-over bfd

I’m using BGP in this case simply to have BFD packets sent over the link; a protocol must be registered with BFD for the packets to be sent. In practice, it would be more likely for the ISP to configure a static route using BFD as well. If you are already running BGP, this extra configuration may be overkill, since you could track routes learned via BGP instead.
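Since the addresses were stripped from the output above, a filled-in version of the ISP configuration could look like this. The addressing is hypothetical and only meant to show where the values go:

```
interface GigabitEthernet1.100
 encapsulation dot1Q 100
 ip address 192.0.2.1 255.255.255.0
 bfd interval 500 min_rx 500 multiplier 3
!
router bgp 1
 bgp log-neighbor-changes
 network 198.51.100.0 mask 255.255.255.0
 neighbor 192.0.2.2 remote-as 2
 neighbor 192.0.2.2 fall-over bfd
```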

This is then the configuration of the active HSRP router:

track 1 ip route reachability
interface GigabitEthernet1
no ip address
negotiation auto
interface GigabitEthernet1.100
encapsulation dot1Q 100
ip address
bfd interval 500 min_rx 500 multiplier 3
interface GigabitEthernet1.200
encapsulation dot1Q 200
ip address
standby 1 ip
standby 1 priority 110
standby 1 preempt
standby 1 track 1 decrement 11
router bgp 2
bgp log-neighbor-changes
neighbor remote-as 1
neighbor fall-over bfd
ip route static bfd GigabitEthernet1.100
ip route GigabitEthernet1.100

To trigger the BFD packets being sent over the WAN link, we first configure a static route that points out the egress interface and the next-hop.

ip route static bfd GigabitEthernet1.100

Then we add a standard static route, which inserts the dummy route. Note that the egress interface must be specified for single-hop BFD.

ip route GigabitEthernet1.100

EOT is used to track if the route is installed into the RIB or not.

track 1 ip route reachability

The HSRP priority is decremented if the route is not in the RIB, which makes the standby router become active.

standby 1 track 1 decrement 11
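Putting the pieces together, the tracking-related part of the active router’s configuration could look like this with hypothetical addresses filled in (192.0.2.1 as the ISP next-hop, 172.16.99.1/32 as the dummy prefix, 10.0.0.1 as the HSRP virtual IP):

```
! Register the interface/next-hop pair with BFD (single-hop)
ip route static bfd GigabitEthernet1.100 192.0.2.1
! Dummy route; it is withdrawn from the RIB when BFD fails
ip route 172.16.99.1 255.255.255.255 GigabitEthernet1.100 192.0.2.1
! EOT tracks whether the dummy route is installed in the RIB
track 1 ip route 172.16.99.1 255.255.255.255 reachability
!
interface GigabitEthernet1.200
 standby 1 ip 10.0.0.1
 standby 1 priority 110
 standby 1 preempt
 standby 1 track 1 decrement 11
```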

We can then test if it works by shutting down the interface on the ISP router.

ISP1(config)#int gi1

The change in the RIB is detected by the HSRP active router:

*Jul 29 15:57:23.278: %TRACK-6-STATE: 1 ip route reachability Up -> Down
*Jul 29 15:57:24.872: %HSRP-5-STATECHANGE: GigabitEthernet1.200 Grp 1 state Active -> Speak
*Jul 29 15:57:36.582: %HSRP-5-STATECHANGE: GigabitEthernet1.200 Grp 1 state Speak -> Standby

The standby router takes over:

*Jul 29 15:57:24.872: %HSRP-5-STATECHANGE: GigabitEthernet1.200 Grp 1 state Standby -> Active

What are the advantages of this setup compared to IP SLA?

  • A lightweight protocol designed to test reachability
  • Packets can be sent at sub-second intervals
  • The multiplier is configurable
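With BFD, the worst-case detection time is simply the negotiated interval multiplied by the multiplier. The timers used in this lab, 500 ms × 3, give roughly 1.5 seconds; tightening them as sketched below would bring detection down to 750 ms, assuming the platform supports it:

```
interface GigabitEthernet1.100
 ! 250 ms x 3 = 750 ms worst-case detection time
 bfd interval 250 min_rx 250 multiplier 3
```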

The drawback may be that you have to get your ISP to run BFD and that they need to put a static route in on their side as well. This can be a real route or a dummy route though.

Hopefully this post was somewhat useful and you’ll stay with me for another five years. Thanks for reading!


Detecting Network Failure

September 26, 2013


In today’s networks, reliability is critical: it needs to be high and
convergence needs to be fast. There are several ways of detecting network
failure, but not all of them scale. This post looks at different methods of
detection and discusses when each should be used.

Routing Convergence Components

There are mainly four components of routing convergence:

  1. Failure detection
  2. Failure propagation (flooding)
  3. Topology/Routing recalculation
  4. Update of the routing and forwarding table (RIB and FIB)

With modern networking equipment and CPUs, it’s actually the first
step that takes the most time, not the flooding or the recalculation of the topology.

Failure can be detected at different levels of the OSI model: layer 1, 2
or 3. When designing the network, it’s important to weigh complexity and cost
against the convergence gain. A more complex solution could increase the Mean Time
Between Failures (MTBF) but also increase the Mean Time To Repair (MTTR), leading
to a lower overall reliability in the end.


Layer 1 Failure Detection – Ethernet

Ethernet has built-in detection of link failure. This works by sending
pulses across the link to test its integrity. This is dependent on
auto-negotiation, so don’t hard-code links unless you must! When
running a P2P link over a CWDM/DWDM network, make sure that link failure
detection is still operational, or use higher-layer methods for detecting
failure.
Carrier Delay

  • Runs in software
  • Filters link up and down events, notifies protocols
  • In most IOS versions, the default is 2 seconds to suppress flapping
  • Not recommended to set it to 0 on SVI
  • Router feature

Debounce Timer

  • Delays link down event only
  • Runs in firmware
  • 100 ms default in NX-OS
  • 300 ms default on copper in IOS and 10 ms for fiber
  • Recommended to keep it at default
  • Switch feature

IP Event Dampening

If you modify the carrier delay and/or debounce timer, look at implementing IP
event dampening. Otherwise there is a risk of the interface flapping a lot
if the timers are too fast.
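As a sketch, tuning carrier delay and enabling IP event dampening on a router interface could look like this. The interface name is hypothetical and the values are examples, not recommendations:

```
interface GigabitEthernet0/1
 ! Report link transitions immediately (avoid 0 on SVIs, as noted above)
 carrier-delay msec 0
 ! Suppress a flapping interface with the default dampening parameters
 dampening
```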

Layer 2 Failure Detection

Some layer 2 protocols have their own keepalives like Frame Relay and PPP. This
post only looks at Ethernet.


UDLD

  • Detects one-way connections due to hardware failure
  • Detects one-way connections due to soft failure
  • Detects miswiring
  • Runs on any single Ethernet link even inside a bundle
  • Typically centralized implementation

UDLD is not a fast protocol; detecting a failure can take more than 20 seconds, so
it shouldn’t be relied on for fast convergence. There is a fast version of UDLD that
supports sub-second detection, but it still runs centralized, so it does not scale
well and should only be used on a select few ports.
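Enabling UDLD typically looks something like the sketch below, with normal mode globally on fiber ports and aggressive mode per port; the exact syntax varies by platform, and the interface name is hypothetical:

```
! Enable UDLD (normal mode) on all fiber ports
udld enable
!
interface GigabitEthernet0/1
 ! Aggressive mode err-disables the port on detection
 udld port aggressive
```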

Spanning Tree Bridge Assurance

  • Turns STP into a bidirectional protocol
  • Ensures spanning tree fails “closed” rather than “open”
  • If the port type is “network”, BPDUs are sent regardless of state
  • If a network port stops receiving BPDUs, it is put in the BA-inconsistent state


Bridge Assurance (BA) can help protect against bridging loops where a port becomes
designated because it has stopped receiving BPDUs. This is similar to the function
of loop guard.
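On platforms that support Bridge Assurance, such as NX-OS, it is active on ports whose type is set to network. A minimal sketch with a hypothetical interface:

```
! Enabled globally by default on NX-OS
spanning-tree bridge assurance
!
interface Ethernet1/1
 ! BPDUs are now sent and expected in both directions on this port
 spanning-tree port type network
```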


LACP

It’s not common knowledge that LACP has built-in mechanisms to detect failures.
This is why you should never hard-code EtherChannels between switches; always
use LACP. LACP is used to:

  • Ensure configuration consistency across bundle members on both ends
  • Ensure wiring consistency (bundle members between 2 chassis)
  • Detect unidirectional links
  • Bundle member keepalive

LACP peers will negotiate the requested send rate through the use of PDUs.
If keepalives are not received a port will be suspended from the bundle.
LACP is not a fast protocol: the default timers are usually 30 seconds for keepalive
and 90 seconds for dead. The timers can be tuned, but this doesn’t scale well if you
have many links, because LACP is a control-plane protocol. IOS XR supports
sub-second timers for LACP.
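A sketch of an LACP bundle requesting the faster 1-second PDU rate from the peer, supported on many IOS and NX-OS platforms (interface and channel numbers are hypothetical):

```
interface GigabitEthernet0/1
 ! Active mode: actively negotiate the bundle with LACP
 channel-group 1 mode active
 ! Ask the peer to send LACP PDUs every second instead of every 30
 lacp rate fast
```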

Layer 3 Failure Detection

There are plenty of protocol timers available at layer 3: OSPF, EIGRP, IS-IS,
HSRP and so on. Tuning these from their default values is common, and many of
these protocols support sub-second timers, but because they must run on the
RP/CPU they don’t scale well if you have many interfaces enabled. Tuning these
timers can work well in small and controlled environments, though. These are
some reasons not to tune layer 3 timers too low:

  • Each interface may have several protocols like PIM, HSRP, OSPF running
  • Increased supervisor CPU utilization leading to false positives
  • More complex configuration and bandwidth wasted
  • Might not support ISSU/SSO
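For completeness, this is what tuning a layer 3 timer can look like with OSPF fast hellos (hellos every 250 ms, dead interval of one second), keeping the scaling caveats above in mind; the interface name is hypothetical:

```
interface GigabitEthernet0/1
 ! Dead interval fixed at 1 second; 4 hellos per second
 ip ospf dead-interval minimal hello-multiplier 4
```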


BFD

Bidirectional Forwarding Detection (BFD) is a lightweight protocol designed to
detect liveliness over links/bundles. BFD is:

  • Designed for sub second failure detection
  • Any interested client (OSPF, HSRP, BGP) registers with BFD and is notified when BFD detects loss
  • All registered clients benefit from uniform failure detection
  • Uses UDP ports 3784 (control) and 3785 (echo)
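Registering a client with BFD is typically a one-liner per protocol. A sketch with hypothetical interface, AS number and neighbor address:

```
interface GigabitEthernet0/1
 ! BFD session parameters for this interface
 bfd interval 50 min_rx 50 multiplier 3
 ! Register OSPF as a BFD client on this interface
 ip ospf bfd
!
router bgp 65000
 ! Register BGP as a BFD client for this neighbor
 neighbor 192.0.2.2 fall-over bfd
```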

Because any interested protocol can register with BFD, fewer packets go
across the link, which wastes less bandwidth; the packets are also small,
which reduces the overhead even further.

Many platforms also support offloading BFD to line cards which means that the
CPU does not get increased load when BFD is enabled. It also supports ISSU/SSO.

BFD negotiates the transmit and receive intervals. If we have a router R1
that wants to transmit at a 50 ms interval but R2 can only receive at 100 ms,
then R1 has to transmit at a 100 ms interval.

BFD can run in asynchronous mode or echo mode. In asynchronous mode the BFD
packets go to the control plane to detect liveliness. This can also be combined
with echo mode which sends a packet with a source and destination IP of the
sending router itself. This way the packet is looped back at the other end
testing the data plane. When echo mode is enabled the control plane packets
are sent at a slower pace.
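Echo mode is on by default on many IOS platforms; it can be disabled per interface, and the slower control-plane rate used alongside it is governed by the slow timer. A sketch with hypothetical values:

```
! Control-plane BFD packets every 2 seconds while echo mode runs
bfd slow-timers 2000
!
interface GigabitEthernet0/1
 ! Disable echo mode on this interface if it causes problems
 no bfd echo
```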

Link bundles

There can be challenges running BFD over link bundles. Due to CEF polarization,
control-plane/data-plane packets might always hash onto the same member link,
which means that not all links in the bundle are properly tested. There is
a per-link BFD mode, but it seems to have limited support so far.

Event Driven vs Polled

Generally, event-driven mechanisms are both faster and scale better than
polling-based mechanisms for detecting failure. Rely on event-driven detection
if you have the option, and only use polled mechanisms when necessary.


Summary

Detecting a network failure is a very important part of network convergence; it
is generally the step that takes the most time. Which protocols to use depends
on the network design and the platforms used. Don’t enable all protocols on a link
without knowing what they actually do. Don’t tune timers too low unless you
know why you are tuning them. Use BFD if you can, as it is faster and uses
fewer resources. For more information, refer to Cisco Live session BRKRST-2333.