Archive for the ‘OSPF’ Category

OSPF Design Considerations

March 6, 2015 2 comments

Introduction

Open Shortest Path First (OSPF) is a link state protocol that has been around for a long time. It is generally well understood, but design considerations often focus on the maximum number of routers in an area. What other design considerations are important for OSPF? What can we do to build a scalable network with OSPF as the Interior Gateway Protocol (IGP)?

Prefix Suppression

The main goal of any IGP is to be stable, converge quickly and provide loop-free connectivity. OSPF is a link state protocol and all routers within an area maintain an identical Link State Database (LSDB). How the LSDB is built is out of scope for this post, but one relevant factor is that OSPF by default advertises stub links for all OSPF-enabled interfaces. This means that every router running OSPF installs the prefixes of these transit links into the routing table. In most networks these routes are not needed; only connectivity between loopbacks is needed because peering is set up between the loopbacks. What are the drawbacks of this default behavior?

  • Larger LSDB
  • SPF run time increased
  • Growth of the routing table

To change this behavior, there is a feature called prefix suppression. When this feature is enabled, the stub links for transit networks are no longer advertised. The benefits of using prefix suppression are:

  • Smaller LSDB
  • Shorter SPF run time
  • Faster convergence
  • Remote attack vector decreased

If there needs to be connectivity to these prefixes for the sake of monitoring or other reasons, these prefixes can be carried in BGP.

How Many Routers in an Area?

The most common question is “How many routers in an area?”. As usual, the answer is: it depends… In the past, router hardware such as CPU and memory severely limited the scalability of an IGP, but these are not much of a factor on modern devices. So what factors decide how many routers can be deployed in an area?

  • Number of adjacent neighbors
  • How much information is flooded in the area? How many LSAs are in the area?
  • Keep router LSA under MTU size
    – Implies lots of interfaces (and possibly lots of neighbors)
    – Exceeding the MTU leads to IP fragmentation which should be avoided

It’s impossible to give an exact answer to how many routers fit into an area. There are ISPs out there running hundreds or even thousands of routers in the same area. Doing so creates a very large failure domain though: a misbehaving router or link will cause all routers in the area to run SPF. To create a smaller failure domain, multiple areas could be used. On the other hand, MPLS does not play well with areas… So it depends… That is also why we see technologies like BGP-LS where you can have IGP islands glued together by BGP.

How many ABRs in an Area?

How many ABRs are suitable in an area? The ABR is critical in OSPF due to the distance vector behavior between areas: inter-area traffic must pass through an ABR. Having one ABR may not be enough, but adding too many adds complexity, increases LSA flooding and grows the LSDB.

ABR1

  • More ABRs create more Type 3 LSA replication within the backbone and in other areas
  • This can create scalability issues in large scale routing
  • 10 prefixes in area 0 and 10 prefixes in area 1 would generate 60 summary LSAs with just 3 ABRs
  • Increasing the number of areas or the number of ABRs would worsen the situation

How Many Areas per ABR?

Based on what we learned above, increasing the number of areas on an ABR quickly adds up to a lot of LSAs.

ABR2

This ABR is in four areas. If every area contains 10 prefixes, the ABR has to generate 120 Type 3 summary LSAs in total.

  • More areas per ABR puts more burden on the ABR
  • More type 3 LSAs will be generated by the ABR
  • Increasing the number of areas will worsen the situation
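The 60 and 120 figures quoted above can be reproduced with a small calculation. This is a simplified sketch, assuming no summarization or filtering, one Type 3 LSA per prefix, and every ABR attached to every area; the function name `type3_lsa_count` is made up for illustration.

```python
def type3_lsa_count(prefixes_per_area, abr_count=1):
    """Count Type 3 LSAs originated when every ABR attaches to every area.

    prefixes_per_area maps an area ID to its number of intra-area prefixes.
    Simplified model: no summarization or filtering, one LSA per prefix.
    """
    other_areas = len(prefixes_per_area) - 1
    # Each ABR advertises every area's prefixes into each of the other areas.
    per_abr = sum(count * other_areas for count in prefixes_per_area.values())
    return per_abr * abr_count


# Two areas with 10 prefixes each and three ABRs.
print(type3_lsa_count({0: 10, 1: 10}, abr_count=3))  # 60
# One ABR attached to four areas with 10 prefixes each.
print(type3_lsa_count({0: 10, 1: 10, 2: 10, 3: 10}))  # 120
```

The count grows with both the number of ABRs and the number of areas per ABR, which is why summarization and Type 3 filtering matter in larger designs.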

Considerations for Intra-Area Routing Scalability

To build a stable and scalable intra-area network, take the following parameters into consideration:

  • Physical link flaps can cause instability in OSPF
    – Consider using IP dampening

  • Avoid having physical links in OSPF through the use of prefix suppression
  • BGP can be used to carry transit links for monitoring purpose

Considerations for Inter-Area Routing Scalability

  • Filter physical links outside the area through type 3 filtering feature
  • Every area should only carry loopback addresses for all routers
  • NMS station may keep track of physical links if needed
  • These can be redistributed into BGP

OSPF Border Connections

OSPF always prefers intra-area paths to inter-area paths, regardless of metric. This may cause suboptimal routing under certain conditions.

Border1

  • Assume the link between D and E is in area 0
  • If the link between D and F fails, traffic will follow the intra area path D -> G, G -> E and E -> F

This could be solved by adding an extra interface/subinterface between D and E in area 1. This would increase the number of LSAs though…
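The preference rule behind this behavior can be sketched as a two-step comparison: path type first, metric second. The metrics below are invented for illustration and are not taken from the topology; `best_path` is a hypothetical helper.

```python
# Lower rank wins; the path type is compared before the metric.
PATH_TYPE_RANK = {"intra-area": 0, "inter-area": 1}


def best_path(candidates):
    """Pick the preferred path from (name, path_type, metric) tuples."""
    return min(candidates, key=lambda c: (PATH_TYPE_RANK[c[1]], c[2]))


candidates = [
    ("D -> G -> E -> F", "intra-area", 30),  # longer, but stays in the area
    ("D -> E -> F", "inter-area", 20),       # shorter, but crosses area 0
]
print(best_path(candidates)[0])  # D -> G -> E -> F
```

The intra-area path wins even though its metric is higher, which is the suboptimal routing described above.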

OSPF Hub and Spoke

  • Make spoke areas as stubby as possible
    – If possible, make the area totally stubby
    – If redistribution is needed, make the area totally not-so-stubby

  • Be aware of reachability issues; make sure the hub router becomes the DR and use the Point to Multipoint (P2MP) network type if needed
    – P2MP has smaller DB size compared to Point to Point (P2P)
    – P2P will use more address space and increase the DB size compared to P2MP but it may be beneficial for other reasons when trying to achieve fast convergence

  • If the number of spokes is small, the hub and spokes can be placed within an area such as the backbone
  • If the number of spokes is large, make the hub an ABR and split off the spokes in area(s)

Summary

OSPF is a link state protocol that can scale to large networks if the network is designed according to the characteristics of OSPF. This post described design considerations and features such as prefix suppression that will help in scaling OSPF. For a deeper look at OSPF design, go through BRKRST-2337, available at Cisco Live 365.

Categories: CCDP, OSPF

OSPF – Non Broadcast and Point to Multipoint

May 4, 2014 2 comments

Introduction

There was a discussion on a forum about the point to multipoint network type in
OSPF. What is the purpose of it and why are /32 endpoints advertised? To understand
the solution we must first understand the problem. This topology has an NBMA network
where R1 is the hub. R1 has been elected the DR due to having the highest priority.

Topology

NBMA Networks

When using the network type non broadcast, which is the default for main interfaces
with frame relay enabled, a DR and a BDR are elected. When using a hub and spoke
topology it is important that the hub is elected the DR. Why is this? If a
spoke is elected the DR, the flooding process will fail. On broadcast and non broadcast
segments the DROTHERs flood the LSAs to the DR/BDR and then the DR floods them to
the other DROTHERs on the segment. A spoke only has a PVC to the hub, so it cannot
reach the other spokes to do this.
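The flooding problem can be illustrated with a toy model. This is only a sketch under simplifying assumptions: a DROTHER sends its LSA to the DR only, the DR re-floods to every router it has a PVC to, and the BDR is ignored. The function name and the PVC representation are made up for the example.

```python
def routers_reached(origin, dr, pvcs):
    """Return the routers that receive an LSA originated by `origin`.

    pvcs is a set of frozensets, each holding the two routers joined by a PVC.
    Simplified model: a DROTHER sends the LSA only to the DR, and the DR
    re-floods it to every router it has a PVC to. The BDR is ignored.
    """
    def connected(a, b):
        return frozenset((a, b)) in pvcs

    reached = {origin}
    if origin != dr:
        if not connected(origin, dr):
            return reached  # The LSA never arrives at the DR.
        reached.add(dr)
    routers = set().union(*pvcs)
    reached |= {r for r in routers if connected(dr, r)}
    return reached


# Hub and spoke: R1 is the hub, R2 and R3 are spokes.
pvcs = {frozenset(("R1", "R2")), frozenset(("R1", "R3"))}
print(sorted(routers_reached("R3", dr="R1", pvcs=pvcs)))  # ['R1', 'R2', 'R3']
print(sorted(routers_reached("R3", dr="R2", pvcs=pvcs)))  # ['R3']
```

With the hub as DR every router receives the LSA. With a spoke as DR, the other spoke has no PVC over which to reach it.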

The routing has already been set up. R2 and R3 are each advertising a loopback,
2.2.2.2/32 and 3.3.3.3/32 respectively. First we will have a look at the router
LSAs that are generated.

R2#sh ip ospf data router 3.3.3.3

            OSPF Router with ID (2.2.2.2) (Process ID 1)

                Router Link States (Area 0)

  LS age: 118
  Options: (No TOS-capability, DC)
  LS Type: Router Links
  Link State ID: 3.3.3.3
  Advertising Router: 3.3.3.3
  LS Seq Number: 80000006
  Checksum: 0x5849
  Length: 48
  Number of Links: 2

    Link connected to: a Stub Network
     (Link ID) Network/subnet number: 3.3.3.3
     (Link Data) Network Mask: 255.255.255.255
      Number of TOS metrics: 0
       TOS 0 Metrics: 1

    Link connected to: a Transit Network
     (Link ID) Designated Router address: 10.0.0.1
     (Link Data) Router Interface address: 10.0.0.3
      Number of TOS metrics: 0
       TOS 0 Metrics: 64

Interfaces that are connected and don’t have an OSPF adjacency are considered
stub networks. These are advertised with the network and the mask. There is
also a transit network that contains the IP of the DR and the local router’s IP
for that network. Note that there is no network mask for the transit network within this LSA.

If we have a look at the routing table of R2, the next-hop to reach 3.3.3.3/32
is 10.0.0.3.

R2#sh ip route 3.3.3.3                
Routing entry for 3.3.3.3/32
  Known via "ospf 1", distance 110, metric 65, type intra area
  Last update from 10.0.0.3 on Serial1/0, 00:34:24 ago
  Routing Descriptor Blocks:
  * 10.0.0.3, from 3.3.3.3, 00:34:24 ago, via Serial1/0
      Route metric is 65, traffic share count is 1

On broadcast and non broadcast segments all routers are assumed to be fully
meshed but here we have a hub and spoke topology. When R2 needs to send traffic
to R3 it will try to encapsulate the frame but it does not know which DLCI to
use for 10.0.0.3 because there is no frame mapping for that.

R2#sh frame map
Serial1/0 (up): ip 10.0.0.1 dlci 201(0xC9,0x3090), dynamic,
              broadcast,, status defined, active

If we debug the sending of frame relay frames we will see that the encapsulation
is failing.

R2#debug frame-relay packet
Frame Relay packet debugging is on
R2#ping 3.3.3.3

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 3.3.3.3, timeout is 2 seconds:

*Mar  1 00:42:30.495: Serial1/0(o): dlci 201(0x3091), pkt type 0x800(IP), datagramsize 84
*Mar  1 00:42:32.303: Serial1/0:Encaps failed--no map entry link 7(IP).
*Mar  1 00:42:34.299: Serial1/0:Encaps failed--no map entry link 7(IP).
*Mar  1 00:42:36.299: Serial1/0:Encaps failed--no map entry link 7(IP).
*Mar  1 00:42:38.299: Serial1/0:Encaps failed--no map entry link 7(IP).
*Mar  1 00:42:40.299: Serial1/0:Encaps failed--no map entry link 7(IP).
Success rate is 0 percent (0/5)

The layer 3 topology is not consistent with the layer 2 topology. The layer 3 is
assumed to be fully meshed but our layer 2 is in fact hub and spoke.

The next-hop is not changed on broadcast and non broadcast segments because
all routers are assumed to be fully meshed.

Solving the Inconsistency

One way of solving the inconsistency is by adding static mappings for
the IP of R2 and R3 respectively to the correct DLCI.

R2#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
R2(config)#int s1/0
R2(config-if)#frame map ip 10.0.0.3 201
R3#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
R3(config)#int s1/0
R3(config-if)#frame map ip 10.0.0.2 301
R2#ping 3.3.3.3

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 3.3.3.3, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 36/51/76 ms

This solved our problem but requires manual intervention and if new routers are
added then new mappings would have to be added as well. Before we leave the
network type non-broadcast there is one more thing I would like to point out.

Network LSA

What is the role of the network LSA? It is twofold: it conveys both topology
information and reachability information. Let’s look at the network LSA that R1
generates.

R2#sh ip ospf data net 10.0.0.1

            OSPF Router with ID (2.2.2.2) (Process ID 1)

                Net Link States (Area 0)

  Routing Bit Set on this LSA
  LS age: 634
  Options: (No TOS-capability, DC)
  LS Type: Network Links
  Link State ID: 10.0.0.1 (address of Designated Router)
  Advertising Router: 1.1.1.1
  LS Seq Number: 80000003
  Checksum: 0x46C6
  Length: 36
  Network Mask: /24
        Attached Router: 1.1.1.1
        Attached Router: 2.2.2.2
        Attached Router: 3.3.3.3

We see all of the attached routers and also the network mask for the segment.
This network LSA will be converted by R4 into a type 3 summary LSA. If we look
from R5’s perspective we can see this prefix.

R5#sh ip route 10.0.0.0
Routing entry for 10.0.0.0/24, 1 known subnets

O IA    10.0.0.0 [110/84] via 45.45.45.45, 00:45:27, FastEthernet0/0
R5#sh ip ospf data sum 10.0.0.0

            OSPF Router with ID (5.5.5.5) (Process ID 1)

                Summary Net Link States (Area 1)

  Routing Bit Set on this LSA
  LS age: 831
  Options: (No TOS-capability, DC, Upward)
  LS Type: Summary Links(Network)
  Link State ID: 10.0.0.0 (summary Network Number)
  Advertising Router: 4.4.4.4
  LS Seq Number: 80000002
  Checksum: 0x7364
  Length: 28
  Network Mask: /24
        TOS: 0  Metric: 74 

R5 can reach this network which is demonstrated by the ping.

R5#ping 10.0.0.3

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.0.0.3, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 60/64/72 ms

What would happen if R2 or R3 became the DR instead of R1? Let’s try it out.

R1#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
R1(config)#int s1/0
R1(config-if)#ip ospf prio 1
R2#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
R2(config)#int s1/0
R2(config-if)#ip ospf prio 100
R3#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
R3(config)#int s1/0
R3(config-if)#ip ospf prio 0

Then clear the process to see the effect.

The segment has now been split into two. R2 does not know about R3 so we have
a segment with R1 and R2 and then a segment with R1 and R3. This can be seen
from the neighbor table and from the network LSA that is generated.

R1#sh ip ospf nei

Neighbor ID     Pri   State           Dead Time   Address         Interface
4.4.4.4           1   FULL/DR         00:00:31    14.14.14.4      FastEthernet0/0
2.2.2.2         100   FULL/DR         00:01:44    10.0.0.2        Serial1/0
3.3.3.3           0   FULL/DROTHER    00:01:45    10.0.0.3        Serial1/0
R1#sh ip ospf data net 10.0.0.2

            OSPF Router with ID (1.1.1.1) (Process ID 1)

                Net Link States (Area 0)

  Routing Bit Set on this LSA
  LS age: 600
  Options: (No TOS-capability, DC)
  LS Type: Network Links
  Link State ID: 10.0.0.2 (address of Designated Router)
  Advertising Router: 2.2.2.2
  LS Seq Number: 80000001
  Checksum: 0x43D6
  Length: 32
  Network Mask: /24
        Attached Router: 2.2.2.2
        Attached Router: 1.1.1.1

This also means that SPF can’t run properly: R3 is no longer connected to
the topology because it’s not listed in the network LSA for 10.0.0.0/24. This
means that R5 can’t ping R3.

R5#ping 10.0.0.3

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.0.0.3, timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)

It can still reach R1 and R2 though.

R5#ping 10.0.0.1

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.0.0.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 40/52/84 ms
R5#ping 10.0.0.2

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.0.0.2, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 56/66/84 ms

So the network LSA is used both for building the SPF topology and for reachability
information.
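The topology role of the network LSA can be sketched as a graph walk. This is a simplified model, not the full SPF algorithm: it assumes two routers are connected over a transit network only if the same network LSA lists both of them. The data comes from the network LSA shown above; the function name is made up.

```python
from collections import deque


def reachable_routers(root, network_lsas):
    """Breadth-first walk over routers joined by transit networks.

    network_lsas maps a DR interface address to the set of attached router
    IDs listed in that network LSA. Simplified model: two routers are
    connected only if the same network LSA lists both of them.
    """
    seen = {root}
    queue = deque([root])
    while queue:
        router = queue.popleft()
        for attached in network_lsas.values():
            if router in attached:
                for peer in attached - seen:
                    seen.add(peer)
                    queue.append(peer)
    return seen


# Network LSA from the output above, originated by R2 as DR.
network_lsas = {"10.0.0.2": {"2.2.2.2", "1.1.1.1"}}
print(sorted(reachable_routers("1.1.1.1", network_lsas)))
# ['1.1.1.1', '2.2.2.2']
```

R3 (3.3.3.3) never appears in the result because it is missing from the attached router list, which matches the failed ping from R5.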

Point to Multipoint Network

There is a point to multipoint network type. This overcomes the limitations of
partially meshed networks by accomplishing two things. The first thing that happens
is that when an LSA is received on an interface, the next-hop is changed to the IP of
the neighbor that the local router connects to, taken from that neighbor’s router LSA.
In our case this is 10.0.0.1.

R2#sh ip route 3.3.3.3
Routing entry for 3.3.3.3/32
  Known via "ospf 1", distance 110, metric 129, type intra area
  Last update from 10.0.0.1 on Serial1/0, 00:03:59 ago
  Routing Descriptor Blocks:
  * 10.0.0.1, from 3.3.3.3, 00:03:59 ago, via Serial1/0
      Route metric is 129, traffic share count is 1
R2#sh ip ospf data router 1.1.1.1

            OSPF Router with ID (2.2.2.2) (Process ID 1)

                Router Link States (Area 0)

  LS age: 483
  Options: (No TOS-capability, DC)
  LS Type: Router Links
  Link State ID: 1.1.1.1
  Advertising Router: 1.1.1.1
  LS Seq Number: 80000018
  Checksum: 0x6E61
  Length: 72
  Number of Links: 4

    Link connected to: a Transit Network
     (Link ID) Designated Router address: 14.14.14.4
     (Link Data) Router Interface address: 14.14.14.1
      Number of TOS metrics: 0
       TOS 0 Metrics: 10

    Link connected to: another Router (point-to-point)
     (Link ID) Neighboring Router ID: 2.2.2.2
     (Link Data) Router Interface address: 10.0.0.1
      Number of TOS metrics: 0
       TOS 0 Metrics: 64

    Link connected to: another Router (point-to-point)
     (Link ID) Neighboring Router ID: 3.3.3.3
     (Link Data) Router Interface address: 10.0.0.1
      Number of TOS metrics: 0
       TOS 0 Metrics: 64

    Link connected to: a Stub Network
     (Link ID) Network/subnet number: 10.0.0.1
     (Link Data) Network Mask: 255.255.255.255
      Number of TOS metrics: 0
       TOS 0 Metrics: 0

This is described in RFC 2328.

If the destination is a router which connects to the
calculating router via a Point-to-MultiPoint network, the
destination's next hop IP address(es) can be determined by
examining the destination's router-LSA: each link pointing
back to the calculating router and having a Link Data field
belonging to the Point-to-MultiPoint network provides an IP
address of the next hop router.

From the router LSA above you can see that each router in the point to multipoint
network advertises a stub network with a /32 mask. On point to multipoint segments
the network is described as a collection of point to point links. With a regular
point to point network the stub network is described with the actual mask of
the interface that is running OSPF. What is the use of these /32 routes?
First let’s test the reachability.

R2#ping 3.3.3.3 so lo0

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 3.3.3.3, timeout is 2 seconds:
Packet sent with a source address of 2.2.2.2 
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 48/60/72 ms
R2#ping 10.0.0.3

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.0.0.3, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 36/48/76 ms
R2#show ip route 10.0.0.3
Routing entry for 10.0.0.3/32
  Known via "ospf 1", distance 110, metric 128, type intra area
  Last update from 10.0.0.1 on Serial1/0, 00:12:41 ago
  Routing Descriptor Blocks:
  * 10.0.0.1, from 3.3.3.3, 00:12:41 ago, via Serial1/0
      Route metric is 128, traffic share count is 1

Full reachability. What happens if we filter the /32 route to R3 on R2?

R2(config)#ip prefix-list DENY_R3 deny 10.0.0.3/32         
R2(config)#ip prefix-list DENY_R3 permit 0.0.0.0/0 le 32   
R2(config)#router ospf 1
R2(config-router)#distribute-list prefix DENY_R3 in

The host route to R3 is now gone. The route to R3 is known via the connected
network.

R2#show ip route 10.0.0.3
Routing entry for 10.0.0.0/24
  Known via "connected", distance 0, metric 0 (connected, via interface)
  Routing Descriptor Blocks:
  * directly connected, via Serial1/0
      Route metric is 0, traffic share count is 1

What about reachability?

R2#ping 3.3.3.3 so lo0

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 3.3.3.3, timeout is 2 seconds:
Packet sent with a source address of 2.2.2.2 
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 28/57/84 ms
R2#ping 10.0.0.3

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.0.0.3, timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)

The ping went through when we used the loopback as a source. That is because
R3 still has a host route to R2 in its routing table. We couldn’t ping 10.0.0.3
though? Why? Because without the host route pointing to a next-hop of 10.0.0.1,
R2 will try to see what DLCI to use when encapsulating the frame to 10.0.0.3
and there is no mapping for this. If we ping R2’s loopback from R3 sourcing
with the 10.0.0.3 IP, it will fail for the same reason.

R3#ping 2.2.2.2

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 2.2.2.2, timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)
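The effect of the /32 host route can be sketched as a two-step lookup: longest-prefix match in the routing table, then a lookup of the resulting next hop in the frame relay map. This is a rough model of R2’s behavior using the addresses and DLCI from the output above; `resolve_dlci` and the data structures are made up for illustration.

```python
import ipaddress


def resolve_dlci(destination, routes, frame_map):
    """Longest-prefix match, then look up the next hop in the frame map.

    routes is a list of (prefix, next_hop) pairs; next_hop None means the
    prefix is directly connected, so the destination itself is the next hop.
    Returns the DLCI, or None when encapsulation would fail.
    """
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), next_hop)
        for prefix, next_hop in routes
        if dest in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    _, next_hop = max(matches, key=lambda m: m[0].prefixlen)
    return frame_map.get(next_hop or destination)


frame_map = {"10.0.0.1": 201}  # R2 only has a mapping for the hub
connected = ("10.0.0.0/24", None)
host_route = ("10.0.0.3/32", "10.0.0.1")

print(resolve_dlci("10.0.0.3", [connected, host_route], frame_map))  # 201
print(resolve_dlci("10.0.0.3", [connected], frame_map))  # None
```

With the host route, the next hop is the hub and a DLCI is found. Once the host route is filtered, the connected /24 wins, the next hop becomes 10.0.0.3 itself, and there is no mapping for it.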

Conclusion

OSPF has various network types that accomplish different things.
The point to multipoint network type is used to overcome limitations at
layer 2. The network is described as a collection of point to point links
where each router advertises a stub network with a /32 mask. By doing this
the spokes can maintain connectivity between each other when sourcing traffic
from the interface connected to their common network.

The next-hop is also changed on incoming LSAs to the IP contained in the router
LSA from the router that originated the LSA. This solves the next hop issue
on non broadcast networks where the next hop is maintained because full mesh
connectivity is assumed.

Why fast IGP timers aren’t always beneficial

March 31, 2014 4 comments

Introduction

When tuning their IGP of choice, the first thing people look at is usually the hello
and dead intervals. This logic is flawed. It is true that it can help in certain
cases, but convergence consists of much more than just hello timers.

Why tune timers?

Detecting that the other side of the link is down is an important part of converging.
That’s why your design should avoid putting any bump in the wire, such as converters
or an L2 cloud, between the L3 endpoints. If you avoid such things, then when one end of the
link goes down the other end will as well, which provides fast detection of failure.

In rare cases you can have the link being up but traffic is not passing over it. For
such cases or for those cases where there was no chance of avoiding a converter or
L2 cloud, tuning the hello timers can help with failure detection. The answer is almost
always BFD though, if the platform supports it.

Topologies where tuning timers is bad

When using a topology where VSS is involved such as Catalyst 6500 or Catalyst 4500,
tuning the timers is very bad. A common topology might look like this:

VSS1

The L3 switches are dually connected to the VSS. These L3 switches might be in the
distribution layer and the VSS is part of the core. The distribution switches run
LACP towards the VSS which acts as one device from an outside perspective.

The VSS runs Stateful Switchover (SSO) which syncs configuration, boots the standby
supervisor with the software and has the line cards ready to go in case of failure
of the primary chassis. Hardware forwarding tables are also synchronized. An SSO
switchover takes up to 10 seconds.

SSO

The active VSS chassis runs the control plane. Routing protocols such as OSPF are not
HA aware, meaning that the state of the routing protocols is not synchronized between
the chassis.

When using fast timers and a switchover occurs, what happens is that OSPF detects that the
neighbor is not replying and tears down the adjacency. The secondary chassis then has to
bring the adjacency back up by sending out hello packets, exchanging LSAs and updating
RIB/FIB. This may take as long as 20 seconds with the time included from the switchover.

VSS_failure

Non Stop Forwarding (NSF)

NSF combined with graceful restart is a technology used to forward packets when
a switchover has occurred. The goal of NSF is to delay the failure detection which
may sound strange from a convergence perspective. Remember though that the VSS acts
as one device.

With NSF the forwarding is done according to the last known FIB entries. After a
switchover the secondary VSS will use graceful restart to inform its neighbors that
it has restarted and needs to synchronize its LSDB. This is done by sending hello packets
with a special bit set and the synchronization is done Out Of Band (OOB) to not tear
down the existing adjacency. The neighbors exchange LSAs and run SPF as normal. The
RIB and FIB can then be updated and normal forwarding ensues.

This process depends on the neighbors also being NSF aware, otherwise they
would tear down the adjacency when the secondary VSS is restarting its routing
processes. So the key here is that the adjacency must stay up and that’s why timers
should be left at default if running VSS. This goes for both the VSS and any routers
that are neighbors to the VSS.
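The timer argument boils down to a simple comparison. As a rough sketch, assume the neighbor keeps the adjacency only if its dead interval is longer than the time the control plane is silent during the switchover. The 10 second figure is the upper bound quoted in this post, 40 seconds is the default OSPF dead interval on broadcast and point to point interfaces, and the function name is made up.

```python
def adjacency_survives(dead_interval_s, switchover_s):
    """True if the neighbor's dead timer outlasts the control-plane outage."""
    return dead_interval_s > switchover_s


SWITCHOVER_S = 10  # upper bound for SSO switchover quoted in this post

print(adjacency_survives(40, SWITCHOVER_S))  # True: default 40 s dead interval
print(adjacency_survives(1, SWITCHOVER_S))   # False: fast hellos, 1 s dead interval
```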

Conclusion

When using VSS, always leave IGP timers at the default. Fast timers ruin the NSF
process and will lead to much higher convergence times than leaving them at the
default.

Some pointers on OSPF as PE to CE protocol

February 23, 2014 5 comments

There was a discussion at the Cisco Learning Network (CLN) about OSPF as PE to CE
protocol.
I wanted to provide some pointers on using OSPF as PE to CE protocol.

RFC 4577 describes how to use OSPF as PE to CE protocol. When using BGP to carry the
OSPF routes the MPLS backbone is seen as a super backbone. This adds another level of
hierarchy making OSPF three levels compared to the usual two when using plain OSPF.

Superbackbone

Because the MPLS backbone is seen as a super area 0, OSPF routes
going across the MPLS backbone can never be better than a type 3 summary LSA. Even if
the same area is used on both sides of the backbone and the input is a type 1 or type 2
LSA, it will be advertised as a summary LSA on the other side.

LSA across superbackbone

The only way to keep the type 1 or type 2 LSAs as they are is to use a sham link.
A sham link sets up a control plane mechanism acting as a tunnel for the LSAs passing
over the MPLS backbone. Sham links are outside the scope of this article.

An LSA can never be “better” than it originally was when input. This means that if the input
to the PE is a type 3 LSA, this can never be converted to a type 1 or type 2 LSA on the other
side. If the LSA was type 5 external to begin with, it will be sent as type 5 on the other side
as well.

To understand how the LSAs are sent over the backbone, look at this picture.

MPBGP

OSPF LSA is sent to PE which is running OSPF in a VRF with the CPE. The PE installs
the LSA as a route in the OSPF RIB. If the route is the best one known to the router
it can install it to the global RIB.

The PE redistributes from OSPF into BGP. Only routes that are installed as OSPF in
the RIB will be redistributed. To be able to carry OSPF specific information the PE
has to add extended communities. To make the IPv4 route a VPNv4 route the PE has
to add the RD and RT values. The OSPF specific communities consist of:

Domain-ID

The domain ID can either be hard coded or derived from the OSPF process running.
It is used to identify if LSAs are sent into the same domain as they originated
from. If the domain ID matches then type 3 summary LSAs can be sent for routes
that were internal or inter area. If the domain ID does not match then all routes
must be sent as external.
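The rules described so far can be condensed into a small decision function. This is a sketch of the logic as stated in this post, ignoring sham links and NSSA areas; the function name and the string labels are made up for illustration.

```python
def lsa_type_toward_ce(route_type, domain_id_matches):
    """Return the LSA type a PE would originate toward the CE.

    route_type is "intra-area", "inter-area" or "external", describing the
    route as it entered BGP on the remote PE. Sham links and NSSA are ignored.
    """
    if route_type == "external":
        return 5  # External routes stay external.
    if domain_id_matches:
        return 3  # Internal routes come back as summary LSAs.
    return 5  # Domain ID mismatch: everything is sent as external.


print(lsa_type_toward_ce("intra-area", domain_id_matches=True))   # 3
print(lsa_type_toward_ce("intra-area", domain_id_matches=False))  # 5
print(lsa_type_toward_ce("external", domain_id_matches=True))     # 5
```

Note that no input ever produces a type 1 or type 2 LSA, which reflects the superbackbone behavior described earlier.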

Domain ID match

Domain ID 1

Domain ID non match

Domain ID 2

OSPF Route Type

The route type consists of area number, route type and options.

Route Type

If we look at a MPBGP update we can see the route type encoded.

R4#sh bgp vpnv4 uni rd 1:1 1.1.1.1/32
BGP routing table entry for 1:1:1.1.1.1/32, version 5
Paths: (1 available, best #1, table cust)
Flag: 0x820
  Not advertised to any peer
  Local
    2.2.2.2 (metric 21) from 2.2.2.2 (2.2.2.2)
      Origin incomplete, metric 11, localpref 100, valid, internal, best
      Extended Community: RT:1:1 OSPF DOMAIN ID:0x0005:0x000000020200 
        OSPF RT:0.0.0.0:2:0 OSPF ROUTER ID:22.22.22.22:0
      mpls labels in/out nolabel/18

Something that is a bit peculiar is that this update has a route type of 2 even though
it originated from a type 1 LSA. In the end it doesn’t make a difference because it will
be advertised as type 3 LSA to the CPE.
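To make the three fields easier to pick out, the value printed after `OSPF RT:` in the output above can be split on the colons. This is a minimal sketch that only handles the textual form shown in the show command, not the binary encoding of the extended community; the function name is made up.

```python
def parse_ospf_route_type(value):
    """Split 'area:route_type:options' as printed after 'OSPF RT:'."""
    area, route_type, options = value.split(":")
    return {"area": area, "route_type": int(route_type), "options": int(options)}


print(parse_ospf_route_type("0.0.0.0:2:0"))
# {'area': '0.0.0.0', 'route_type': 2, 'options': 0}
```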

OSPF Router ID

The router ID of the router that originated the LSA (PE) is also carried as an extended
community.

R4#sh bgp vpnv4 uni rd 1:1 1.1.1.1/32
BGP routing table entry for 1:1:1.1.1.1/32, version 5
Paths: (1 available, best #1, table cust)
Flag: 0x820
  Not advertised to any peer
  Local
    2.2.2.2 (metric 21) from 2.2.2.2 (2.2.2.2)
      Origin incomplete, metric 11, localpref 100, valid, internal, best
      Extended Community: RT:1:1 OSPF DOMAIN ID:0x0005:0x000000020200 
        OSPF RT:0.0.0.0:2:0 OSPF ROUTER ID:22.22.22.22:0
      mpls labels in/out nolabel/18

MED

The MED is set to the OSPF metric + 1 as defined by the RFC.


R4#sh bgp vpnv4 uni rd 1:1 1.1.1.1/32
BGP routing table entry for 1:1:1.1.1.1/32, version 5
Paths: (1 available, best #1, table cust)
Flag: 0x820
  Not advertised to any peer
  Local
    2.2.2.2 (metric 21) from 2.2.2.2 (2.2.2.2)
      Origin incomplete, metric 11, localpref 100, valid, internal, best
      Extended Community: RT:1:1 OSPF DOMAIN ID:0x0005:0x000000020200 
        OSPF RT:0.0.0.0:2:0 OSPF ROUTER ID:22.22.22.22:0
      mpls labels in/out nolabel/18

The goal of these extended communities is to extend BGP so that OSPF LSAs can be
carried transparently as if BGP hadn’t been involved at all. LSAs are translated
to BGP updates and then translated back to LSAs.

If we look at a packet capture we can see the extended communities attached.
This BGP Update originated from a type 5 external LSA with metric-type 1.

Capture

When using OSPF as the PE to CE protocol it is important to remember the design
rules of OSPF. Because of that you should avoid designs like this:

OSPF1

In this design area 1 is used on both sides but the CPE is then connected to area 0
which makes it an ABR. The rules of OSPF dictate that an ABR only considers summary
LSAs received over area 0. This means this topology is broken and would
require changing the area or using a virtual link.

OSPF as PE to CE protocol has some complexity but most of it is still plain OSPF,
which is in itself a complicated protocol. Combine that with BGP and MPLS and
it is easy to get confused about which protocol is responsible for what. That is also
one of the reasons that I recommend using eBGP or static routing when customers connect
to their ISP.

Categories: BGP, MPLS, OSPF

Why OSPF FA is only set on broadcast networks

April 10, 2013 6 comments

A friend of mine asked me about the OSPF forwarding address. The question was why
must the network type be broadcast for the FA to be set? Why are the point to point
and point to multipoint network types not valid?

First of all, what is the point of having a forwarding address? Look at the topology
below.

Forwarding_address_BGP

R3 is the only one running BGP to R4. If the FA is not set then there will be an
extra hop compared to R2 sending the traffic directly to R4.

R1#sh ip route 10.10.4.0
Routing entry for 10.10.4.0/24
  Known via "ospf 1", distance 110, metric 1
  Tag 4, type extern 2, forward metric 20
  Last update from 10.10.12.2 on FastEthernet0/0, 00:00:23 ago
  Routing Descriptor Blocks:
  * 10.10.12.2, from 10.10.23.3, 00:00:23 ago, via FastEthernet0/0
      Route metric is 1, traffic share count is 1
      Route tag 4

R1#sh ip ospf data ex 10.10.4.0

            OSPF Router with ID (10.10.12.1) (Process ID 1)

                Type-5 AS External Link States

  Routing Bit Set on this LSA
  LS age: 35
  Options: (No TOS-capability, DC)
  LS Type: AS External Link
  Link State ID: 10.10.4.0 (External Network Number )
  Advertising Router: 10.10.23.3
  LS Seq Number: 80000001
  Checksum: 0xEB7D
  Length: 36
  Network Mask: /24
        Metric Type: 2 (Larger than any link state path)
        TOS: 0 
        Metric: 1 
        Forward Address: 0.0.0.0
        External Route Tag: 4

R1#traceroute 10.10.4.4 num

Type escape sequence to abort.
Tracing the route to 10.10.4.4

  1 10.10.12.2 44 msec 44 msec 32 msec
  2 10.10.23.3 60 msec 36 msec 40 msec
  3 10.10.234.4 84 msec *  76 msec

Because the forwarding address is set to 0 the traffic must flow through the
ASBR originating the LSA.

Which conditions must be met to set the FA?

The interface on the ASBR must have OSPF enabled. It must not be passive and it
must be broadcast. Let’s enable this on R3.

R3#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
R3(config)#int f0/1
R3(config-if)#ip ospf 1 area 0

Now check the external LSA on R1 and run the traceroute again.

R1#sh ip ospf data ex 10.10.4.0

            OSPF Router with ID (10.10.12.1) (Process ID 1)

                Type-5 AS External Link States

  Routing Bit Set on this LSA
  LS age: 243
  Options: (No TOS-capability, DC)
  LS Type: AS External Link
  Link State ID: 10.10.4.0 (External Network Number )
  Advertising Router: 10.10.23.3
  LS Seq Number: 80000002
  Checksum: 0xF66E
  Length: 36
  Network Mask: /24
        Metric Type: 2 (Larger than any link state path)
        TOS: 0 
        Metric: 1 
        Forward Address: 10.10.234.4
        External Route Tag: 4

R1#traceroute 10.10.4.4 num

Type escape sequence to abort.
Tracing the route to 10.10.4.4

  1 10.10.12.2 48 msec 32 msec 64 msec
  2 10.10.234.4 96 msec *  88 msec

The traffic now flows from R2 directly to R4, bypassing R3. The key point here is that
in broadcast networks all routers can communicate with each other (full mesh). We can
see this by looking at the type 2 (network) LSA.

R1#sh ip ospf data net 10.10.234.3

            OSPF Router with ID (10.10.12.1) (Process ID 1)

                Net Link States (Area 0)

  Routing Bit Set on this LSA
  LS age: 179
  Options: (No TOS-capability, DC)
  LS Type: Network Links
  Link State ID: 10.10.234.3 (address of Designated Router)
  Advertising Router: 10.10.23.3
  LS Seq Number: 80000001
  Checksum: 0x3485
  Length: 32
  Network Mask: /24
        Attached Router: 10.10.23.3
        Attached Router: 10.10.12.2

Why isn’t a point-to-point network valid? Well, the name pretty much says it all.
With point-to-point there can only be two routers connected, so there is no use
in setting the FA because the traffic must flow through the router originating
the LSA.

If we look at the router LSA from R2 when we have broadcast network type it looks
like this:

R1#sh ip ospf data router 10.10.12.2

            OSPF Router with ID (10.10.12.1) (Process ID 1)

                Router Link States (Area 0)

  LS age: 7
  Options: (No TOS-capability, DC)
  LS Type: Router Links
  Link State ID: 10.10.12.2
  Advertising Router: 10.10.12.2
  LS Seq Number: 8000000A
  Checksum: 0x977B
  Length: 60
  Number of Links: 3

    Link connected to: a Transit Network
     (Link ID) Designated Router address: 10.10.234.3
     (Link Data) Router Interface address: 10.10.234.2
      Number of TOS metrics: 0
       TOS 0 Metrics: 1

    Link connected to: a Transit Network
     (Link ID) Designated Router address: 10.10.23.2
     (Link Data) Router Interface address: 10.10.23.2
      Number of TOS metrics: 0
       TOS 0 Metrics: 10

    Link connected to: a Transit Network
     (Link ID) Designated Router address: 10.10.12.1
     (Link Data) Router Interface address: 10.10.12.2
      Number of TOS metrics: 0
       TOS 0 Metrics: 10

You can see that 10.10.234.0 is a transit network, and the type 2 LSA shows which
routers are connected to it and the network mask. Now we change the network type to
point-to-point.

R1#sh ip ospf data router 10.10.12.2

            OSPF Router with ID (10.10.12.1) (Process ID 1)

                Router Link States (Area 0)

  LS age: 59
  Options: (No TOS-capability, DC)
  LS Type: Router Links
  Link State ID: 10.10.12.2
  Advertising Router: 10.10.12.2
  LS Seq Number: 8000000B
  Checksum: 0xF2E3
  Length: 72
  Number of Links: 4

    Link connected to: another Router (point-to-point)
     (Link ID) Neighboring Router ID: 10.10.23.3
     (Link Data) Router Interface address: 10.10.234.2
      Number of TOS metrics: 0
       TOS 0 Metrics: 1

    Link connected to: a Stub Network
     (Link ID) Network/subnet number: 10.10.234.0
     (Link Data) Network Mask: 255.255.255.0
      Number of TOS metrics: 0
       TOS 0 Metrics: 1

    Link connected to: a Transit Network
     (Link ID) Designated Router address: 10.10.23.2
     (Link Data) Router Interface address: 10.10.23.2
      Number of TOS metrics: 0
       TOS 0 Metrics: 10

    Link connected to: a Transit Network
     (Link ID) Designated Router address: 10.10.12.1
     (Link Data) Router Interface address: 10.10.12.2
      Number of TOS metrics: 0
       TOS 0 Metrics: 10

The 10.10.234.0 network is now a stub network, which means it can’t be used for transit.
Usually there should only be two routers connected here. We shouldn’t use the P2P network
type if there is an Ethernet segment with multiple routers.

So finally, why is P2MP not valid? Because P2MP is used in NBMA networks. These networks
are usually partially meshed, and from the perspective of OSPF they are a collection of
point-to-point links. This is how the LSA looks.

R1#sh ip ospf data router 10.10.12.2

            OSPF Router with ID (10.10.12.1) (Process ID 1)

                Router Link States (Area 0)

  LS age: 8
  Options: (No TOS-capability, DC)
  LS Type: Router Links
  Link State ID: 10.10.12.2
  Advertising Router: 10.10.12.2
  LS Seq Number: 8000000D
  Checksum: 0xFCD6
  Length: 72
  Number of Links: 4

    Link connected to: another Router (point-to-point)
     (Link ID) Neighboring Router ID: 10.10.23.3
     (Link Data) Router Interface address: 10.10.234.2
      Number of TOS metrics: 0
       TOS 0 Metrics: 1

    Link connected to: a Stub Network
     (Link ID) Network/subnet number: 10.10.234.2
     (Link Data) Network Mask: 255.255.255.255
      Number of TOS metrics: 0
       TOS 0 Metrics: 0

    Link connected to: a Transit Network
     (Link ID) Designated Router address: 10.10.23.2
     (Link Data) Router Interface address: 10.10.23.2
      Number of TOS metrics: 0
       TOS 0 Metrics: 10

    Link connected to: a Transit Network
     (Link ID) Designated Router address: 10.10.12.1
     (Link Data) Router Interface address: 10.10.12.2
      Number of TOS metrics: 0
       TOS 0 Metrics: 10

It looks very similar to P2P with the difference that the stub network has a mask
of /32. This is useful in partial mesh where spokes need to reach each other via
the hub and don’t have a DLCI between them.

So it only makes sense to use the FA in broadcast networks. A broadcast network is by
nature fully meshed, so it is the only place where all routers are guaranteed to be able
to communicate directly with each other.
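
The conditions covered in this post can be summarized as a small checklist. This is only a sketch of the three conditions named above (the function and parameter names are invented for illustration), not a full model of what IOS checks:

```python
def forwarding_address_is_set(ospf_enabled: bool, passive: bool, network_type: str) -> bool:
    """Check the conditions from this post on the ASBR's interface
    towards the external next hop."""
    return ospf_enabled and not passive and network_type == "broadcast"


# Before "ip ospf 1 area 0" was configured on R3's f0/1: FA stays 0.0.0.0.
print(forwarding_address_is_set(False, False, "broadcast"))
# After enabling OSPF on the broadcast interface: FA is set.
print(forwarding_address_is_set(True, False, "broadcast"))
# The same interface as point-to-point or point-to-multipoint: FA stays 0.0.0.0.
print(forwarding_address_is_set(True, False, "point-to-point"))
print(forwarding_address_is_set(True, False, "point-to-multipoint"))
```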

Categories: OSPF

Tiebreakers with routes from different OSPF processes

March 15, 2013 17 comments

This post is inspired by a discussion on Twitter with Ivan Pepelnjak and
Nicolas Michel. Nicolas asked what happens when the same route is learned from two
different OSPF processes. Which one will be selected? Ivan explained how
to use the distance command. Before I show how it works and why, we
need to explain a few basic concepts.

LSDB – Link State Database – All OSPF LSAs populate the LSDB
RIB – Routing Information Base – The best routes from every protocol
compete to get installed to the RIB
FIB – Forwarding Information Base – Routes are copied from the RIB
and used for forwarding (CEF)
CEF – Cisco Express Forwarding – The algorithm that Cisco uses for
the forwarding (FIB)

Taking OSPF as an example, this is how a route gets selected for the global RIB.
The routers exchange LSAs with each other. Within an area every router has the same
view of the network. These LSAs populate the LSDB. If there are multiple paths to
a destination they compete with each other, unless they are of the same type and equal
cost. Intra-area routes are preferred first, then inter-area and finally external routes.
There is no way of modifying this behaviour. The best route then goes to the OSPF RIB;
there can be several if they are equal. From there the route competes with routes from
other routing protocols and the AD decides which one is installed. If the OSPF route is
best then that one goes to the global RIB. Finally the RIB populates the FIB with this
information and forwarding can ensue.
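
The two selection stages described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the class and function names are made up, the prefixes and next hops in the example are hypothetical, and details such as E1 versus E2 externals are ignored.

```python
from dataclasses import dataclass

# Lower rank is preferred inside OSPF: intra-area, then inter-area, then external.
OSPF_TYPE_RANK = {"intra": 0, "inter": 1, "external": 2}


@dataclass(frozen=True)
class OspfPath:
    prefix: str
    route_type: str  # "intra", "inter" or "external"
    cost: int
    next_hop: str


def ospf_best_paths(paths):
    """Stage 1: pick the path(s) OSPF places in its local RIB for one prefix.

    Route type is compared first and cost second; ties are all kept (ECMP).
    """
    def key(p):
        return (OSPF_TYPE_RANK[p.route_type], p.cost)

    best = min(key(p) for p in paths)
    return [p for p in paths if key(p) == best]


def global_rib_winner(admin_distances):
    """Stage 2: among protocols offering the prefix, the lowest AD wins."""
    return min(admin_distances, key=admin_distances.get)


# Hypothetical example: the intra-area path wins despite its higher cost.
paths = [
    OspfPath("10.0.0.0/24", "inter", 5, "192.0.2.1"),
    OspfPath("10.0.0.0/24", "intra", 20, "192.0.2.2"),
]
print(ospf_best_paths(paths))
# OSPF (AD 110) against EIGRP internal (AD 90): EIGRP is installed.
print(global_rib_winner({"ospf": 110, "eigrp": 90}))
```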

This is a picture I made that describes the process.

Route_selection

We start out with a very basic topology looking like this.

Multiple_OSPF_1

R1 and R3 will announce the same network 1.1.1.1/32. R2 will use two different OSPF processes.
We start out with the basic configuration:

R1

R1(config)#int f1/0
R1(config-if)#ip add 12.12.12.1 255.255.255.0
R1(config-if)#no sh
R1(config-if)#ip ospf 1 area 0
R1(config-if)#int lo0
R1(config-if)#ip add 1.1.1.1 255.255.255.255
R1(config-if)#ip ospf 1 area 0

R2

R2(config)#int f1/0
R2(config-if)#ip add 12.12.12.2 255.255.255.0
R2(config-if)#no sh
R2(config-if)#ip ospf 1 area 0
R2(config-if)#int f1/1
R2(config-if)#ip add 23.23.23.2 255.255.255.0
R2(config-if)#no sh
R2(config-if)#ip ospf 3 area 0
%OSPF-5-ADJCHG: Process 1, Nbr 12.12.12.1 on FastEthernet1/0 from LOADING to FULL, Loading Done

We see the adjacency coming up immediately. Now let’s bring up R3 as well.

R3

R3(config)#int f1/0
R3(config-if)#ip add 23.23.23.3 255.255.255.0
R3(config-if)#no sh
R3(config-if)#ip ospf 3 area 0
R3(config-if)#int lo0
R3(config-if)#ip add 1.1.1.1 255.255.255.255
R3(config-if)#ip ospf 3 area 0
%OSPF-5-ADJCHG: Process 3, Nbr 23.23.23.2 on FastEthernet1/0 from LOADING to FULL, Loading Done

Both OSPF adjacencies are up. Now let’s follow the steps that were shown in
the picture above, starting by looking at the database.

R2#sh ip ospf data router 12.12.12.1

            OSPF Router with ID (23.23.23.2) (Process ID 3)

            OSPF Router with ID (12.12.12.2) (Process ID 1)

                Router Link States (Area 0)

  LS age: 184
  Options: (No TOS-capability, DC)
  LS Type: Router Links
  Link State ID: 12.12.12.1
  Advertising Router: 12.12.12.1
  LS Seq Number: 80000003
  Checksum: 0xF78
  Length: 48
  Number of Links: 2

    Link connected to: a Stub Network
     (Link ID) Network/subnet number: 1.1.1.1
     (Link Data) Network Mask: 255.255.255.255
      Number of MTID metrics: 0
       TOS 0 Metrics: 1

    Link connected to: a Transit Network
     (Link ID) Designated Router address: 12.12.12.1
     (Link Data) Router Interface address: 12.12.12.1
      Number of MTID metrics: 0
       TOS 0 Metrics: 1

We see that R1 is announcing 1.1.1.1/32 with a stub cost of 1. Adding the cost of 1
on our own interface gives R2 a total metric of 2 to it.
Do we see R3 announcing that as well?

R2#sh ip ospf data router 23.23.23.3

            OSPF Router with ID (23.23.23.2) (Process ID 3)

                Router Link States (Area 0)

  LS age: 148
  Options: (No TOS-capability, DC)
  LS Type: Router Links
  Link State ID: 23.23.23.3
  Advertising Router: 23.23.23.3
  LS Seq Number: 80000003
  Checksum: 0x54A7
  Length: 48
  Number of Links: 2

    Link connected to: a Stub Network
     (Link ID) Network/subnet number: 1.1.1.1
     (Link Data) Network Mask: 255.255.255.255
      Number of MTID metrics: 0
       TOS 0 Metrics: 1

    Link connected to: a Transit Network
     (Link ID) Designated Router address: 23.23.23.2
     (Link Data) Router Interface address: 23.23.23.3
      Number of MTID metrics: 0
       TOS 0 Metrics: 1

Yes, it’s there. Now we take a look at the OSPF RIB. Which ones do we see there?

R2#sh ip ospf rib

            OSPF Router with ID (23.23.23.2) (Process ID 3)


                Base Topology (MTID 0)

OSPF local RIB
Codes: * - Best, > - Installed in global RIB

*   1.1.1.1/32, Intra, cost 2, area 0
      via 23.23.23.3, FastEthernet1/1
*   23.23.23.0/24, Intra, cost 1, area 0, Connected
      via 23.23.23.2, FastEthernet1/1

            OSPF Router with ID (12.12.12.2) (Process ID 1)


                Base Topology (MTID 0)

OSPF local RIB
Codes: * - Best, > - Installed in global RIB

*>  1.1.1.1/32, Intra, cost 2, area 0
      via 12.12.12.1, FastEthernet1/0
*   12.12.12.0/24, Intra, cost 1, area 0, Connected
      via 12.12.12.2, FastEthernet1/0

The greater-than sign indicates that the route from OSPF process 1 was selected.
Why? A first theory is that when running multiple OSPF processes, the one that
installs its route first is the one selected for the global RIB. Now we confirm
by looking in the global RIB.

R2# show ip route ospf
Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route, H - NHRP, l - LISP
       + - replicated route, % - next hop override

Gateway of last resort is not set

      1.0.0.0/32 is subnetted, 1 subnets
O        1.1.1.1 [110/2] via 12.12.12.1, 00:06:35, FastEthernet1/0

Yes, that looks correct. The final step is to verify that the FIB is also updated.

R2#sh ip cef 1.1.1.1/32
1.1.1.1/32
  nexthop 12.12.12.1 FastEthernet1/0

So it looks like the one that first writes to the global RIB wins. Now let’s bring
down the process that is currently winning.

R2(config)#int f1/0
R2(config-if)#sh
R2(config-if)#

The OSPF RIB and global RIB should now be updated.

R2#show ip ospf rib

            OSPF Router with ID (23.23.23.2) (Process ID 3)


                Base Topology (MTID 0)

OSPF local RIB
Codes: * - Best, > - Installed in global RIB

*>  1.1.1.1/32, Intra, cost 2, area 0
      via 23.23.23.3, FastEthernet1/1
*   23.23.23.0/24, Intra, cost 1, area 0, Connected
      via 23.23.23.2, FastEthernet1/1

            OSPF Router with ID (12.12.12.2) (Process ID 1)


                Base Topology (MTID 0)

OSPF local RIB
Codes: * - Best, > - Installed in global RIB

R2#show ip route ospf
Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route, H - NHRP, l - LISP
       + - replicated route, % - next hop override

Gateway of last resort is not set

      1.0.0.0/32 is subnetted, 1 subnets
O        1.1.1.1 [110/2] via 23.23.23.3, 00:00:42, FastEthernet1/1

Now if we bring back OSPF process 1, what will happen? If the first-to-install theory
holds, process 3 should still be winning since it installed to the global RIB first.

R2(config)#int f1/0
R2(config-if)#no sh
R2#sh ip ospf rib

            OSPF Router with ID (2.2.2.2) (Process ID 11)


                Base Topology (MTID 0)

OSPF local RIB
Codes: * - Best, > - Installed in global RIB


            OSPF Router with ID (23.23.23.2) (Process ID 3)


                Base Topology (MTID 0)

OSPF local RIB
Codes: * - Best, > - Installed in global RIB

*   1.1.1.1/32, Intra, cost 2, area 0
      via 23.23.23.3, FastEthernet1/1
*   23.23.23.0/24, Intra, cost 1, area 0, Connected
      via 23.23.23.2, FastEthernet1/1

            OSPF Router with ID (12.12.12.2) (Process ID 1)


                Base Topology (MTID 0)

OSPF local RIB
Codes: * - Best, > - Installed in global RIB

*>  1.1.1.1/32, Intra, cost 2, area 0
      via 12.12.12.1, FastEthernet1/0
*   12.12.12.0/24, Intra, cost 1, area 0, Connected
      via 12.12.12.2, FastEthernet1/0

Now process 1 is winning, which is odd. Let’s enable debug ip routing to see what is
really happening. We shut down the interface in process 1.

*Mar 14 23:26:36.555: RT: del 1.1.1.1 via 12.12.12.1, ospf metric [110/2]
*Mar 14 23:26:36.559: RT: delete subnet route to 1.1.1.1/32
*Mar 14 23:26:36.579: RT: updating ospf 1.1.1.1/32 (0x0):
    via 23.23.23.3 Fa1/1
*Mar 14 23:26:36.583: RT: add 1.1.1.1/32 via 23.23.23.3, ospf metric [110/2]

Now we bring back process 1.

*Mar 14 23:29:04.163: RT: updating ospf 1.1.1.1/32 (0x0):
    via 12.12.12.1 Fa1/0
*Mar 14 23:29:04.171: RT: closer admin distance for 1.1.1.1, flushing 1 routes
*Mar 14 23:29:04.175: RT: add 1.1.1.1/32 via 12.12.12.1, ospf metric [110/2]

We can see that IOS claims the admin distance is closer (lower), which it clearly is not.
What happens if we change process 1 to process 11 and shut down the interface
in process 3?

R2(config)#int f1/1
R2(config-if)#sh
R2(config-if)#int f1/0
R2(config-if)#ip ospf 11 area 0

Now we look at the output from the debug.

*Mar 14 23:33:27.615: RT: updating ospf 1.1.1.1/32 (0x0):
    via 12.12.12.1 Fa1/0

*Mar 14 23:33:27.619: RT: add 1.1.1.1/32 via 12.12.12.1, ospf metric [110/2]
*Mar 14 23:33:39.927: RT: updating connected 23.23.23.0/24 (0x0):
    via 0.0.0.0 Fa1/1
*Mar 14 23:33:39.931: RT: add 23.23.23.0/24 via 0.0.0.0, connected metric [0/0]
*Mar 14 23:33:39.939: RT: interface FastEthernet1/1 added to routing table
*Mar 14 23:33:39.947: RT: updating connected 23.23.23.2/32 (0x0):
    via 0.0.0.0 Fa1/1
*Mar 14 23:33:39.951: RT: network 23.0.0.0 is now variably masked
*Mar 14 23:33:39.951: RT: add 23.23.23.2/32 via 0.0.0.0, connected metric [0/0]
*Mar 14 23:33:55.447: RT: updating ospf 1.1.1.1/32 (0x0):
    via 23.23.23.3 Fa1/1
*Mar 14 23:33:55.455: RT: closer admin distance for 1.1.1.1, flushing 1 routes
*Mar 14 23:33:55.455: RT: add 1.1.1.1/32 via 23.23.23.3, ospf metric [110/2]

We can see that at first process 11 is the only option available, so the 1.1.1.1/32
route is installed via f1/0. Then f1/1 comes back up and now 1.1.1.1/32 is reachable
via f1/1 and is chosen because of “closer admin distance”, which is not true. This
suggests that the OSPF process number is the tiebreaker.

We take a look at the OSPF RIB and global RIB to verify once more.

R2#sh ip ospf rib

            OSPF Router with ID (22.22.22.22) (Process ID 11)


                Base Topology (MTID 0)

OSPF local RIB
Codes: * - Best, > - Installed in global RIB

*   1.1.1.1/32, Intra, cost 2, area 0
      via 12.12.12.1, FastEthernet1/0
*   12.12.12.0/24, Intra, cost 1, area 0, Connected
      via 12.12.12.2, FastEthernet1/0

            OSPF Router with ID (23.23.23.2) (Process ID 3)


                Base Topology (MTID 0)

OSPF local RIB
Codes: * - Best, > - Installed in global RIB

*>  1.1.1.1/32, Intra, cost 2, area 0
      via 23.23.23.3, FastEthernet1/1
*   23.23.23.0/24, Intra, cost 1, area 0, Connected
      via 23.23.23.2, FastEthernet1/1

            OSPF Router with ID (12.12.12.2) (Process ID 1)


                Base Topology (MTID 0)

OSPF local RIB
Codes: * - Best, > - Installed in global RIB

R2#sh ip route ospf
Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route, H - NHRP, l - LISP
       + - replicated route, % - next hop override

Gateway of last resort is not set

      1.0.0.0/32 is subnetted, 1 subnets
O        1.1.1.1 [110/2] via 23.23.23.3, 00:09:02, FastEthernet1/1

What if we change the AD of process 11?

R2(config)#router ospf 11
R2(config-router)#distance ospf intra-area 100
*Mar 14 23:43:31.315: RT: updating ospf 1.1.1.1/32 (0x0):
    via 12.12.12.1 Fa1/0

*Mar 14 23:43:31.319: RT: closer admin distance for 1.1.1.1, flushing 1 routes
*Mar 14 23:43:31.323: RT: add 1.1.1.1/32 via 12.12.12.1, ospf metric [100/2]

That makes process 11 win again. So these tests seem to indicate that if everything
else is the same, the tiebreaker is the lowest process number. For EIGRP it is the
lowest AS number, so maybe Cisco chose to make it comparable.
Also take a look at what Ivan is saying at IOS hints.
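
The behaviour observed in these tests can be written down as a tiny model. It only encodes what the lab showed (lowest AD first, then lowest process number); it is not taken from Cisco documentation, and the function name is made up for illustration.

```python
def winning_process(candidates):
    """candidates: list of (process_id, admin_distance) offering the same prefix.

    Models the lab results: the lowest AD wins, and on an AD tie the
    lowest OSPF process number wins.
    """
    process_id, _ = min(candidates, key=lambda c: (c[1], c[0]))
    return process_id


print(winning_process([(1, 110), (3, 110)]))   # process 1 vs process 3
print(winning_process([(11, 110), (3, 110)]))  # process 11 vs process 3
print(winning_process([(11, 100), (3, 110)]))  # process 11 with AD 100
```

The three calls correspond to the three tests in the post: process 1 beating process 3, process 3 beating process 11, and process 11 winning once its intra-area distance is lowered to 100.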

Categories: OSPF, Routing

ASBR in NSSA – Choosing what IP to use as forwarding address

September 20, 2012 5 comments

OSPF is one of the protocols where the details are very important. It has lots
of bits and pieces to make it run in a proper way. I have described the forwarding
address in an earlier post and this time I want to show how the IP that is used
as the forwarding address is selected. We start out with this simple topology.

It’s a very basic config where R1 is redistributing a route and running in an
NSSA area.

R1#sh run | s router ospf|ip route
router ospf 1
 router-id 1.1.1.1
 log-adjacency-changes
 area 10 nssa
 redistribute static subnets
ip route 100.0.0.0 255.0.0.0 Null0

Which IP will R1 use for its forwarding address? We look at R3.

R3#sh ip route ospf | i E2
O E2 100.0.0.0/8 [110/20] via 23.23.23.2, 00:57:59, FastEthernet0/0
R3#sh ip ospf data ex 100.0.0.0

            OSPF Router with ID (3.3.3.3) (Process ID 1)

                Type-5 AS External Link States

  Routing Bit Set on this LSA
  LS age: 120
  Options: (No TOS-capability, DC)
  LS Type: AS External Link
  Link State ID: 100.0.0.0 (External Network Number )
  Advertising Router: 2.2.2.2
  LS Seq Number: 80000005
  Checksum: 0x4AC0
  Length: 36
  Network Mask: /8
        Metric Type: 2 (Larger than any link state path)
        TOS: 0
        Metric: 20
        Forward Address: 12.12.12.1
        External Route Tag: 0

It has chosen its interface address towards R2. What if we enable OSPF on the other
Ethernet interface of R1?

R1(config)#int f0/1
R1(config-if)#ip ospf 1 area 10

We check R3 again.

R3#sh ip ospf data ex 100.0.0.0

            OSPF Router with ID (3.3.3.3) (Process ID 1)

                Type-5 AS External Link States

  Routing Bit Set on this LSA
  LS age: 25
  Options: (No TOS-capability, DC)
  LS Type: AS External Link
  Link State ID: 100.0.0.0 (External Network Number )
  Advertising Router: 2.2.2.2
  LS Seq Number: 80000006
  Checksum: 0x6676
  Length: 36
  Network Mask: /8
        Metric Type: 2 (Larger than any link state path)
        TOS: 0
        Metric: 20
        Forward Address: 112.112.112.1
        External Route Tag: 0

The forwarding address has changed. It selected the IP of the other Ethernet interface
of R1. We can see that it prefers to choose a higher IP address. What if we announce
the loopback of R1 in the NSSA area?

R1(config-if)#int lo0
R1(config-if)#ip ospf 1 area 10
R3#sh ip ospf data ex 100.0.0.0

            OSPF Router with ID (3.3.3.3) (Process ID 1)

                Type-5 AS External Link States

  Routing Bit Set on this LSA
  LS age: 27
  Options: (No TOS-capability, DC)
  LS Type: AS External Link
  Link State ID: 100.0.0.0 (External Network Number )
  Advertising Router: 2.2.2.2
  LS Seq Number: 80000007
  Checksum: 0xAE53
  Length: 36
  Network Mask: /8
        Metric Type: 2 (Larger than any link state path)
        TOS: 0
        Metric: 20
        Forward Address: 11.11.11.11
        External Route Tag: 0

Now the loopback IP is chosen instead. The loopback has a lower IP than the Ethernet
interface but is still selected, so loopbacks are clearly preferred in the selection.
To see this defined in words we reference RFC 3101 section 2.3.

When a router is forced to pick a forwarding address for a Type-7
LSA, preference should be given first to the router's internal
addresses (provided internal addressing is supported).  If internal
addresses are not available, preference should be given to the
router's active OSPF stub network addresses.  These choices avoid the
possible extra hop that may happen when a transit network's address
is used.  When the interface whose IP address is the LSA's forwarding
address transitions to a Down state (see [OSPF] Section 9.3), the
router must select a new forwarding address for the LSA and then re-
originate it.  If one is not available the LSA should be flushed.

So the selection process is to choose the highest IP of a loopback advertised
into the NSSA area. If no loopback is advertised then choose the highest
physical interface IP advertised into the NSSA area.
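
The selection order observed above can be sketched with the standard `ipaddress` module. This is a model of the lab results only; the function name is invented, and the name of R1's first Ethernet interface (Fa0/0) is an assumption since the post does not show it.

```python
import ipaddress


def pick_nssa_forwarding_address(interfaces):
    """interfaces: list of (name, ip, is_loopback) advertised into the NSSA area.

    Prefer the highest loopback address; if there is no loopback,
    fall back to the highest address among the remaining interfaces.
    Returns None when nothing is available.
    """
    loopbacks = [ip for _, ip, is_loopback in interfaces if is_loopback]
    pool = loopbacks or [ip for _, ip, _ in interfaces]
    if not pool:
        return None
    return str(max(ipaddress.ip_address(ip) for ip in pool))


# The three stages from the post (Fa0/0 is an assumed interface name).
stage1 = [("Fa0/0", "12.12.12.1", False)]
stage2 = stage1 + [("Fa0/1", "112.112.112.1", False)]
stage3 = stage2 + [("Lo0", "11.11.11.11", True)]

for stage in (stage1, stage2, stage3):
    print(pick_nssa_forwarding_address(stage))
```

Comparing `ip_address` objects instead of strings matters here, because a plain string comparison would rank "12.12.12.1" above "112.112.112.1".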

I hope that I have provided another piece of the OSPF puzzle and that you now have
a good understanding of the forwarding address.

Route redistribution – Route-maps and tagging

August 16, 2012 2 comments

I have previously written some posts on route redistribution and on
route filtering in different protocols. I wanted to expand on this
by showing different ways we can tag and filter with route-maps
when doing route redistribution.

We start out with this topology where two different OSPF segments are
separated by an EIGRP segment.

R2 will redistribute between OSPF and EIGRP mutually. R1 is redistributing
its loopback so it will be an external OSPF route. R4 and R5 will mutually
redistribute between EIGRP and OSPF. One interesting aspect of EIGRP is
that the EIGRP packet shows which protocol originally sourced the
route. Take a look at this output showing the R1 loopback
in the EIGRP domain.

R4#sh ip eigrp topo 10.10.1.0/24
IP-EIGRP (AS 100): Topology entry for 10.10.1.0/24
  State is Passive, Query origin flag is 1, 1 Successor(s), FD is 2560002816
  Routing Descriptor Blocks:
  10.10.24.2 (FastEthernet0/0), from 10.10.24.2, Send flag is 0x0
      Composite metric is (2560002816/2560000256), Route is External
      Vector metric:
        Minimum bandwidth is 1 Kbit
        Total delay is 110 microseconds
        Reliability is 1/255
        Load is 1/255
        Minimum MTU is 1
        Hop count is 1
      External data:
        Originating router is 10.10.24.2
        AS number of route is 1
        External protocol is OSPF, external metric is 20
        Administrator tag is 0 (0x00000000)

We can see that it came from OSPF 1 and that the ASBR is 10.10.24.2.
We also see that it had a metric of 20 and no tag applied to it. Where
is this information carried? Take a look at this packet capture.

We can see that a lot of information is carried for external routes.
This gives us options when doing tagging and filtering.

We configure redistribution on R2 and then look at our options for doing
tagging and filtering.

R2#sh run | s router
router eigrp 100
 redistribute ospf 1 metric 1 1 1 1 1

If we look at R4 we should have two external routes with AD 170 going towards
the OSPF domain.

R4#sh ip route eigrp | i EX
D EX    10.10.1.0 [170/2560002816] via 10.10.24.2, 00:07:22, FastEthernet0/0
D EX    10.10.12.0 [170/2560002816] via 10.10.24.2, 00:07:22, FastEthernet0/0

If we traceroute, the traffic goes straight to R2.

R4#traceroute 10.10.1.1 num

Type escape sequence to abort.
Tracing the route to 10.10.1.1

  1 10.10.24.2 32 msec 40 msec 28 msec
  2 10.10.12.1 60 msec *  64 msec

What if we want external routes to go through R5 instead? We can
match on the route-type and incoming interface to block R2 routes.
This is a pretty blunt tool but can be good for some scenarios.

R4(config)#route-map RM_DENY_EXT_FA0/0 deny 10
R4(config-route-map)#match route-type external
R4(config-route-map)#route-map RM_DENY_EXT_FA0/0 permit 100
R4(config-route-map)#router eigrp 100
R4(config-router)#distribute-list route-map RM_DENY_EXT_FA0/0 in fa0/0

So what we just did is filter all external routes coming in on Fa0/0.
Did we achieve the wanted result?

R4#sh ip route eigrp | i EX
D EX    10.10.1.0 [170/2560030976] via 10.10.45.5, 00:00:21, FastEthernet0/1
D EX    10.10.12.0 [170/2560030976] via 10.10.45.5, 00:00:21, FastEthernet0/1
R4#traceroute 10.10.1.1 num

Type escape sequence to abort.
Tracing the route to 10.10.1.1

  1 10.10.45.5 40 msec 36 msec 16 msec
  2 10.10.35.3 56 msec 40 msec 40 msec
  3 10.10.23.2 48 msec 28 msec 44 msec
  4 10.10.12.1 68 msec *  68 msec

Now all external routes will go through R5 instead.
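
To make the effect of the distribute-list easier to see, here is a rough Python model of it. It is only a sketch: the function names are invented, the 192.0.2.0/24 internal route is a hypothetical entry added for contrast, and real EIGRP update processing is far more involved.

```python
def route_map_permits(route):
    """Model of RM_DENY_EXT_FA0/0."""
    # Sequence 10: deny anything EIGRP marks as external.
    if route["type"] == "external":
        return False
    # Sequence 100: permit everything else.
    return True


def filter_inbound(updates, interface="Fa0/0"):
    """Apply the route-map only to updates received on the given interface."""
    kept = []
    for route in updates:
        if route["interface"] == interface and not route_map_permits(route):
            continue
        kept.append(route)
    return kept


updates = [
    {"prefix": "10.10.1.0/24", "type": "external", "interface": "Fa0/0"},
    {"prefix": "10.10.1.0/24", "type": "external", "interface": "Fa0/1"},
    {"prefix": "192.0.2.0/24", "type": "internal", "interface": "Fa0/0"},  # hypothetical
]

for route in filter_inbound(updates):
    print(route["prefix"], route["type"], route["interface"])
```

Only the external route learned on Fa0/0 is dropped, which is why R4 falls back to the copy learned from R5 on Fa0/1.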

Currently we are not doing redistribution on R4 and R5. What will
happen with the EIGRP external routes when we do redistribution?
First we remove the previous configuration and then we configure
redistribution.

R4(config)#router eigrp 100
R4(config-router)#no distribute-list route-map RM_DENY_EXT_FA0/0 in FastEthernet0/0
R4(config-router)#redistribute ospf 1 metric 1 1 1 1 1
R4(config-router)#router ospf 10
R4(config-router)#redistribute eigrp 100 sub
R5(config)#router eigrp 100
R5(config-router)#redistribute ospf 10 metric 1 1 1 1 1
R5(config-router)#router ospf 10
R5(config-router)#redistribute eigrp 100 sub

From R4 we now look at how it reaches 10.10.1.0/24.

R4#sh ip route 10.10.1.0
Routing entry for 10.10.1.0/24
  Known via "eigrp 100", distance 170, metric 2560002816, type external
  Redistributing via eigrp 100, ospf 10
  Advertised by ospf 10 subnets
  Last update from 10.10.45.5 on FastEthernet0/1, 00:00:52 ago
  Routing Descriptor Blocks:
    10.10.45.5, from 10.10.45.5, 00:00:52 ago, via FastEthernet0/1
      Route metric is 2560002816, traffic share count is 1
      Total delay is 110 microseconds, minimum bandwidth is 1 Kbit
      Reliability 1/255, minimum MTU 1 bytes
      Loading 1/255, Hops 1
  * 10.10.24.2, from 10.10.24.2, 00:00:52 ago, via FastEthernet0/0
      Route metric is 2560002816, traffic share count is 1
      Total delay is 110 microseconds, minimum bandwidth is 1 Kbit
      Reliability 1/255, minimum MTU 1 bytes
      Loading 1/255, Hops 1

Why does it have two entries for 10.10.1.0/24? Take a look in the
EIGRP topology table.

R4#sh ip eigrp topo 10.10.1.0/24
IP-EIGRP (AS 100): Topology entry for 10.10.1.0/24
  State is Passive, Query origin flag is 1, 2 Successor(s), FD is 2560002816
  Routing Descriptor Blocks:
  10.10.24.2 (FastEthernet0/0), from 10.10.24.2, Send flag is 0x0
      Composite metric is (2560002816/2560000256), Route is External
      Vector metric:
        Minimum bandwidth is 1 Kbit
        Total delay is 110 microseconds
        Reliability is 1/255
        Load is 1/255
        Minimum MTU is 1
        Hop count is 1
      External data:
        Originating router is 10.10.24.2
        AS number of route is 1
        External protocol is OSPF, external metric is 20
        Administrator tag is 0 (0x00000000)
  10.10.45.5 (FastEthernet0/1), from 10.10.45.5, Send flag is 0x0
      Composite metric is (2560002816/2560000256), Route is External
      Vector metric:
        Minimum bandwidth is 1 Kbit
        Total delay is 110 microseconds
        Reliability is 1/255
        Load is 1/255
        Minimum MTU is 1
        Hop count is 1
      External data:
        Originating router is 10.10.56.5
        AS number of route is 10
        External protocol is OSPF, external metric is 20
        Administrator tag is 0 (0x00000000)

We can see that one route is originating from OSPF 1, which is the true
source of the route and one is originating via OSPF 10. R5 is learning
this route via OSPF and then redistributing it into EIGRP and R4 is
learning that via EIGRP. Let us confirm that R5 sees this as an OSPF
route.

R5#sh ip route 10.10.1.0
Routing entry for 10.10.1.0/24
  Known via "ospf 10", distance 110, metric 20, type extern 2, forward metric 2
  Redistributing via eigrp 100
  Advertised by eigrp 100 metric 1 1 1 1 1
  Last update from 10.10.56.6 on FastEthernet1/0, 00:08:12 ago
  Routing Descriptor Blocks:
  * 10.10.56.6, from 10.10.46.4, 00:08:12 ago, via FastEthernet1/0
      Route metric is 20, traffic share count is 1

Which it does. What would happen if R5 was redistributing with a
better metric than R2 is doing? First we enable debugging of ip
routing on R4. Remember that in a stable topology where everything
is converged there should be no changes.

R4#debug ip routing
IP routing debugging is on

Then we change the metric on R5.

R5(config)#router eigrp 100
R5(config-router)#redistribute ospf 10 metric 100000 10 255 1 1500
RT: eigrp's 10.10.1.0/24 (via 10.10.45.5) metric changed from distance/metric [170/2560002816] to [170/30720]
RT: del 10.10.1.0/24 via 10.10.24.2, eigrp metric [170/2560002816]
RT: NET-RED 10.10.1.0/24
RT: NET-RED 10.10.1.0/24
RT: eigrp's 10.10.12.0/24 (via 10.10.45.5) metric changed from distance/metric [170/2560002816] to [170/30720]
RT: del 10.10.12.0/24 via 10.10.24.2, eigrp metric [170/2560002816]
RT: NET-RED 10.10.12.0/24
RT: NET-RED 10.10.12.0/24

We can see that the metric changed but at least we have no flapping.

R4#sh ip route 10.10.1.0
Routing entry for 10.10.1.0/24
  Known via "eigrp 100", distance 170, metric 30720, type external
  Redistributing via eigrp 100, ospf 10
  Advertised by ospf 10 subnets
  Last update from 10.10.45.5 on FastEthernet0/1, 00:02:33 ago
  Routing Descriptor Blocks:
  * 10.10.45.5, from 10.10.45.5, 00:02:33 ago, via FastEthernet0/1
      Route metric is 30720, traffic share count is 1
      Total delay is 200 microseconds, minimum bandwidth is 100000 Kbit
      Reliability 255/255, minimum MTU 1500 bytes
      Loading 1/255, Hops 1
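
The two metrics in the debug output can be reproduced by hand. The sketch
below assumes the classic EIGRP composite metric with default K values
(only bandwidth and delay count). The function name is my own; the inputs
are the bandwidth and delay values shown in the outputs on R4.

```python
def eigrp_classic_metric(min_bandwidth_kbit: int, total_delay_usec: int) -> int:
    """Classic EIGRP composite metric, assuming default K values (K1 = K3 = 1)."""
    scaled_bandwidth = 10_000_000 // min_bandwidth_kbit  # 10^7 / slowest link in Kbit
    scaled_delay = total_delay_usec // 10                # delay in tens of microseconds
    return 256 * (scaled_bandwidth + scaled_delay)


# Before the change: minimum bandwidth 1 Kbit, total delay 110 microseconds.
old_metric = eigrp_classic_metric(min_bandwidth_kbit=1, total_delay_usec=110)

# After the change: minimum bandwidth 100000 Kbit, total delay 200 microseconds.
new_metric = eigrp_classic_metric(min_bandwidth_kbit=100_000, total_delay_usec=200)

print(old_metric, new_metric)
```

The results match the [170/2560002816] and [170/30720] values in the debug,
which is why R4 now prefers the path via R5.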

Do we still have reachability?

R4#traceroute 10.10.1.1 num

Type escape sequence to abort.
Tracing the route to 10.10.1.1

  1 10.10.45.5 40 msec 16 msec 28 msec
  2 10.10.56.6 48 msec 32 msec 28 msec
  3 10.10.46.4 24 msec 44 msec 28 msec
  4 10.10.45.5 52 msec 48 msec 52 msec
  5 10.10.56.6 60 msec 60 msec 64 msec
  6 10.10.46.4 64 msec 60 msec 60 msec
  7 10.10.45.5 84 msec 108 msec 100 msec

Now we have a routing loop. What is happening here is that R4
is learning the route first via EIGRP and redistributes it into
OSPF. R5 learns this route via OSPF and then redistributes it
back into EIGRP and then R4 learns this route. They are now
both pointing at each other which means we have a loop.
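
The loop in the traceroute can be sketched as a walk over per-router next
hops. This is only an illustration: the router names are inferred from the
last octet of the addresses in the traceroute output, and the helper
function is hypothetical.

```python
def trace_path(next_hop, start, max_hops=10):
    """Follow next hops from start; return (path, loop_detected)."""
    path = [start]
    visited = {start}
    current = start
    for _ in range(max_hops):
        current = next_hop.get(current)
        if current is None:
            # No further next hop: the destination is directly connected here.
            return path, False
        path.append(current)
        if current in visited:
            return path, True
        visited.add(current)
    return path, False


# Next hops for 10.10.1.0/24 while the loop is present (from the traceroute).
looping = {"R4": "R5", "R5": "R6", "R6": "R4"}

# Next hops once R4 uses the EIGRP route via R2 again.
healthy = {"R4": "R2", "R2": "R1"}

print(trace_path(looping, "R4"))
print(trace_path(healthy, "R4"))
```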

What are our options for solving this? One way is to increase the
OSPF external AD on R5. R5 then prefers the EIGRP route, so it
should not redistribute the OSPF route back to R4.

R5(config-router)#distance ospf external 180

R4#sh ip route 10.10.1.1
Routing entry for 10.10.1.0/24
  Known via "ospf 10", distance 110, metric 20, type extern 2, forward metric 2
  Last update from 10.10.46.6 on FastEthernet1/0, 00:00:38 ago
  Routing Descriptor Blocks:
  * 10.10.46.6, from 10.10.56.5, 00:00:38 ago, via FastEthernet1/0
      Route metric is 20, traffic share count is 1

R4#traceroute 10.10.1.1 num

Type escape sequence to abort.
Tracing the route to 10.10.1.1

  1 10.10.46.6 56 msec 36 msec 16 msec
  2 10.10.56.5 48 msec 40 msec 40 msec
  3 10.10.35.3 52 msec 52 msec 52 msec
  4 10.10.23.2 112 msec 76 msec 72 msec
  5 10.10.12.1 96 msec *  120 msec

That solved the loop, but changing the distance is a bit of a hack unless
we apply the same policy on all devices. At the very least, all devices
involved in redistribution should have the same policy.
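
Why raising the distance helps can be shown with a small sketch of route
selection by administrative distance. It assumes the router has two
candidates for the same prefix, one OSPF external and one EIGRP external;
the class and function names are made up for illustration.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    source: str
    admin_distance: int


def best_route(candidates):
    """The candidate with the lowest administrative distance wins."""
    return min(candidates, key=lambda c: c.admin_distance)


# Default distances: OSPF (110) beats EIGRP external (170).
default_ad = [Candidate("ospf 10 external", 110), Candidate("eigrp 100 external", 170)]

# After "distance ospf external 180": EIGRP external (170) now wins.
tuned_ad = [Candidate("ospf 10 external", 180), Candidate("eigrp 100 external", 170)]

print(best_route(default_ad).source)
print(best_route(tuned_ad).source)
```

Once the EIGRP route wins, the OSPF route is no longer in the routing
table, so it is not redistributed from OSPF back into EIGRP.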

R4(config)#router ospf 10
R4(config-router)#distance ospf external 180

R4#traceroute 10.10.1.1 num

Type escape sequence to abort.
Tracing the route to 10.10.1.1

  1 10.10.24.2 28 msec 32 msec 12 msec
  2 10.10.12.1 44 msec *  60 msec

Yes, that solved it. A more elegant way is to use tagging and
filtering. We remove the previous distance commands.

What we can do now is to tag all external routes coming from OSPF 1
and then deny those routes from coming in if they have a tag set.
On R4 we tag routes with tag 444 and on R5 we will tag with 555.
First we confirm that the loop is back. Note that with redistribution
you may see different results than I did, due to the order of
operations. If that happens, you could shut down R5's link to R3 and
the loop should be back.

R4#traceroute 10.10.1.1 num

Type escape sequence to abort.
Tracing the route to 10.10.1.1

  1 10.10.45.5 16 msec 44 msec 24 msec
  2 10.10.56.6 36 msec 28 msec 36 msec
  3 10.10.46.4 32 msec 40 msec 32 msec
  4 10.10.45.5 48 msec 56 msec 48 msec
  5 10.10.56.6 64 msec 56 msec 68 msec
  6 10.10.46.4 48 msec 64 msec 60 msec

It is still there. Time for some route-maps.

R4(config)#route-map RM_DENY_EXT_FROM_R5 deny 10
R4(config-route-map)#match tag 444
R4(config-route-map)#route-map RM_DENY_EXT_FROM_R5 permit 100
R4(config-route-map)#route-map RM_SET_TAG_444 permit 10
R4(config-route-map)#match source-protocol ospf 1
R4(config-route-map)#match route-type external
R4(config-route-map)#set tag 444
R4(config-route-map)#route-map RM_SET_TAG_444 permit 100
R4(config-route-map)#router eigrp 100
R4(config-router)#distribute-list route-map RM_DENY_EXT_FROM_R5 in
R4(config-router)#router ospf 10
R4(config-router)#redistribute eigrp 100 route-map RM_SET_TAG_444 sub

First we will confirm on R5 that we now see a tag.

R5#sh ip route 10.10.1.0
Routing entry for 10.10.1.0/24
  Known via "ospf 10", distance 110, metric 20
  Tag 444, type extern 2, forward metric 2
  Redistributing via eigrp 100
  Advertised by eigrp 100 metric 100000 10 255 1 1500
  Last update from 10.10.56.6 on FastEthernet1/0, 00:00:48 ago
  Routing Descriptor Blocks:
  * 10.10.56.6, from 10.10.46.4, 00:00:48 ago, via FastEthernet1/0
      Route metric is 20, traffic share count is 1
      Route tag 444

We now see the tag. There should be no tag on EIGRP internal routes.
We can confirm this on R6.

R6#sh ip route 10.10.24.0
Routing entry for 10.10.24.0/24
  Known via "ospf 10", distance 110, metric 20, type extern 2, forward metric 1
  Last update from 10.10.56.5 on FastEthernet0/1, 08:28:39 ago
  Routing Descriptor Blocks:
    10.10.56.5, from 10.10.56.5, 08:28:39 ago, via FastEthernet0/1
      Route metric is 20, traffic share count is 1
  * 10.10.46.4, from 10.10.46.4, 08:30:34 ago, via FastEthernet0/0
      Route metric is 20, traffic share count is 1

There should be no loop on R4 now. We will test with a traceroute.

R4#traceroute 10.10.1.1 num

Type escape sequence to abort.
Tracing the route to 10.10.1.1

  1 10.10.24.2 28 msec 44 msec 12 msec
  2 10.10.12.1 36 msec *  48 msec

The loop is gone. We should implement the same policy on R5 so that
if R4 sends routes back to R5, R5 does not learn them.

R5(config)#route-map RM_DENY_EXT_FROM_R4 deny 10
R5(config-route-map)#match tag 555
R5(config-route-map)#route-map RM_DENY_EXT_FROM_R4 permit 100
R5(config-route-map)#route-map RM_SET_TAG_555 permit 10
R5(config-route-map)#match source-protocol ospf 1
R5(config-route-map)#match route-type external
R5(config-route-map)#set tag 555
R5(config-route-map)#route-map RM_SET_TAG_555 permit 100
R5(config-route-map)#router eigrp 100
R5(config-router)#distribute-list route-map RM_DENY_EXT_FROM_R4 in
R5(config-router)#router ospf 10
R5(config-router)#redistribute eigrp 100 route-map RM_SET_TAG_555 sub
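
The tag-and-filter policy can be modelled in a few lines. This is a
simplified sketch, not IOS behavior in full: routes are plain dictionaries
and the two functions only mimic what the route-maps above are meant to do.

```python
def redistribute_eigrp_into_ospf(route, own_tag):
    """Mimics RM_SET_TAG_*: tag EIGRP externals that originated in OSPF 1."""
    if route["source_protocol"] == "ospf 1" and route["route_type"] == "external":
        return {**route, "tag": own_tag}
    return dict(route)  # the permit 100 clause: pass everything else untagged


def accept_inbound_eigrp(route, own_tag):
    """Mimics RM_DENY_EXT_FROM_*: drop EIGRP updates carrying our own tag."""
    return route.get("tag") != own_tag


# 10.10.1.0/24 as R4 first learns it via EIGRP from R2 (no tag yet).
from_r2 = {"prefix": "10.10.1.0/24", "source_protocol": "ospf 1",
           "route_type": "external", "tag": None}

# R4 redistributes it into OSPF 10 with tag 444; the tag is carried along
# when R5 redistributes the route back into EIGRP.
in_ospf_10 = redistribute_eigrp_into_ospf(from_r2, own_tag=444)
back_from_r5 = dict(in_ospf_10)

print(accept_inbound_eigrp(from_r2, own_tag=444))       # original route is accepted
print(accept_inbound_eigrp(back_from_r5, own_tag=444))  # looped-back route is dropped
```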

And that concludes this lesson. Route redistribution is always fun 🙂
You can look at some of my older posts for more ideas about filtering
routes.

OSPF – Use of forwarding address

August 6, 2012 29 comments

In OSPF and other routing protocols we have something called a forwarding address.
It can be used to send traffic to a next hop other than the router that
originated the LSA. We start with the following topology.

It’s a basic OSPF setup where area 1 is an NSSA. As you can see we have
two ABRs. Remember that in an NSSA, redistributed routes will be seen as N
inside the area but as E outside the area. To make this happen the ABR must translate
the type 7 LSA to a type 5 LSA. If we have multiple ABRs, which one is responsible
for this task? The ABR with the highest RID will do the translation.

If we look at the LSA at R1, this is what it looks like.

R1#sh ip ospf data ex 10.10.4.0

            OSPF Router with ID (10.10.13.1) (Process ID 1)

                Type-5 AS External Link States

  Routing Bit Set on this LSA
  LS age: 1373
  Options: (No TOS-capability, DC)
  LS Type: AS External Link
  Link State ID: 10.10.4.0 (External Network Number )
  Advertising Router: 3.3.3.3
  LS Seq Number: 80000001
  Checksum: 0x7306
  Length: 36
  Network Mask: /24
        Metric Type: 2 (Larger than any link state path)
        TOS: 0
        Metric: 20
        Forward Address: 10.10.234.4
        External Route Tag: 0

So R3 is the ABR doing the translation but the forward address is set to
10.10.234.4 which is the address of R4. This means that traffic doesn’t need
to pass through R3 to reach the R4 network. The router will look up the
10.10.234.0/24 prefix and use that routing information to reach the
10.10.4.0 network. This is proven by a traceroute.

R1#traceroute 10.10.4.4

Type escape sequence to abort.
Tracing the route to 10.10.4.4

  1 10.10.12.2 44 msec 44 msec 20 msec
  2 10.10.234.4 60 msec *  72 msec

What happens if the forwarding address network is not advertised? We will
do some filtering on R2 and R3.

R2(config-router)#area 1 range 10.10.234.0 255.255.255.0 not-advertise
R3(config-router)#area 1 range 10.10.234.0 255.255.255.0 not-advertise

R1#sh ip route 10.10.4.0
% Subnet not in table

There is no longer any reachability for the network. How can we resolve
this without removing the filtering?

We can tell R3 to suppress the FA in the LSA.

R3(config-router)#area 1 nssa translate type7 suppress-fa

The network is back and we have reachability but now traffic must pass
through R3 since the FA is not set.

R1#sh ip route 10.10.4.0
Routing entry for 10.10.4.0/24
  Known via "ospf 1", distance 110, metric 20, type extern 2, forward metric 2
  Last update from 10.10.12.2 on FastEthernet0/0, 00:00:07 ago
  Routing Descriptor Blocks:
  * 10.10.12.2, from 3.3.3.3, 00:00:07 ago, via FastEthernet0/0
      Route metric is 20, traffic share count is 1

R1#traceroute 10.10.4.4

Type escape sequence to abort.
Tracing the route to 10.10.4.4

  1 10.10.12.2 52 msec 76 msec 48 msec
  2 10.10.23.3 36 msec 48 msec 40 msec
  3 10.10.234.4 72 msec *  72 msec

So by setting the FA we achieve more efficient routing. The reason to have
a forwarding address is to reduce the number of LSAs needed. If all ABRs were
doing type 7 to type 5 translation then there would be more LSAs than
necessary.
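
The three cases seen so far (FA set and reachable, FA set but filtered, FA
suppressed) can be summarised in a small sketch. It is a simplification:
real OSPF needs an intra-area or inter-area route to the forwarding
address, which is reduced here to a list of known prefixes. The function
name and return strings are made up for illustration.

```python
import ipaddress


def resolve_external(forward_address, advertising_router, known_prefixes):
    """Return how an external LSA resolves, or None if it cannot be used."""
    fa = ipaddress.ip_address(forward_address)
    if fa == ipaddress.ip_address("0.0.0.0"):
        # No FA: traffic goes to the router that advertised the LSA.
        return f"toward ASBR {advertising_router}"
    for prefix in known_prefixes:
        if fa in ipaddress.ip_network(prefix):
            # FA set and reachable: traffic goes straight to the FA.
            return f"toward forwarding address {forward_address}"
    # FA set but no route to it: the LSA is not used.
    return None


print(resolve_external("10.10.234.4", "3.3.3.3", ["10.10.12.0/24", "10.10.234.0/24"]))
print(resolve_external("10.10.234.4", "3.3.3.3", ["10.10.12.0/24"]))
print(resolve_external("0.0.0.0", "3.3.3.3", ["10.10.12.0/24"]))
```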

Let's take a look at the LSA now. Note that the FA will be set to 0.0.0.0.

R1#sh ip ospf data ex 10.10.4.0

            OSPF Router with ID (10.10.13.1) (Process ID 1)

                Type-5 AS External Link States

  Routing Bit Set on this LSA
  LS age: 212
  Options: (No TOS-capability, DC)
  LS Type: AS External Link
  Link State ID: 10.10.4.0 (External Network Number )
  Advertising Router: 3.3.3.3
  LS Seq Number: 80000003
  Checksum: 0x6218
  Length: 36
  Network Mask: /24
        Metric Type: 2 (Larger than any link state path)
        TOS: 0
        Metric: 20
        Forward Address: 0.0.0.0
        External Route Tag: 0

By default the FA is always set when using NSSA areas. Now we take a look
at another use case where we have another routing protocol involved and
redistribution is done between the routing domains.

This is our example topology. Very similar to before. We just changed from
OSPF to RIP on the lefthand side.

R3 will be the router doing mutual redistribution between RIP and OSPF.
We will see that the FA will be set to 0.0.0.0. We check the route on R1.

R1#sh ip route 10.10.4.0
Routing entry for 10.10.4.0/24
  Known via "ospf 1", distance 110, metric 20, type extern 2, forward metric 2
  Last update from 10.10.12.2 on FastEthernet0/0, 00:01:07 ago
  Routing Descriptor Blocks:
  * 10.10.12.2, from 3.3.3.3, 00:01:07 ago, via FastEthernet0/0
      Route metric is 20, traffic share count is 1

R1#sh ip ospf data ex 10.10.4.0

            OSPF Router with ID (10.10.13.1) (Process ID 1)

                Type-5 AS External Link States

  Routing Bit Set on this LSA
  LS age: 79
  Options: (No TOS-capability, DC)
  LS Type: AS External Link
  Link State ID: 10.10.4.0 (External Network Number )
  Advertising Router: 3.3.3.3
  LS Seq Number: 80000001
  Checksum: 0x6616
  Length: 36
  Network Mask: /24
        Metric Type: 2 (Larger than any link state path)
        TOS: 0
        Metric: 20
        Forward Address: 0.0.0.0
        External Route Tag: 0

As expected the FA is set to 0.0.0.0. This means that traffic must traverse
R3. We confirm with a traceroute.

R1#traceroute 10.10.4.4

Type escape sequence to abort.
Tracing the route to 10.10.4.4

  1 10.10.12.2 64 msec 28 msec 24 msec
  2 10.10.23.3 68 msec 40 msec 40 msec
  3 10.10.234.4 96 msec *  76 msec

Now what happens if we enable OSPF on R3's interface towards R4?

R3(config-if)#ip ospf 1 area 0

R1#traceroute 10.10.4.4

Type escape sequence to abort.
Tracing the route to 10.10.4.4

  1 10.10.12.2 56 msec 32 msec 24 msec
  2 10.10.234.4 60 msec *  72 msec

Traceroute is now taking the shorter path. Why? Take a
look at the LSA on R1.

R1#sh ip ospf data ex 10.10.4.0

            OSPF Router with ID (10.10.13.1) (Process ID 1)

                Type-5 AS External Link States

  Routing Bit Set on this LSA
  LS age: 59
  Options: (No TOS-capability, DC)
  LS Type: AS External Link
  Link State ID: 10.10.4.0 (External Network Number )
  Advertising Router: 3.3.3.3
  LS Seq Number: 80000002
  Checksum: 0x7107
  Length: 36
  Network Mask: /24
        Metric Type: 2 (Larger than any link state path)
        TOS: 0
        Metric: 20
        Forward Address: 10.10.234.4
        External Route Tag: 0

The FA has now been set. The FA is set for external routes when all of
the following conditions are met.

  • OSPF is enabled on the ASBR’s next hop interface AND
  • ASBR’s next hop interface is non-passive under OSPF AND
  • ASBR’s next hop interface is not point-to-point AND
  • ASBR’s next hop interface is not point-to-multipoint AND
  • ASBR’s next hop interface address falls under the network range specified in the router ospf command.
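
The conditions above translate directly into a predicate. This is a
sketch only: the dataclass, its field names, and the interface address
10.10.234.3 are hypothetical and not taken from the lab.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class NextHopInterface:
    address: str
    ospf_enabled: bool
    passive: bool
    network_type: str  # e.g. "broadcast", "point-to-point", "point-to-multipoint"


def forwarding_address_is_set(iface, ospf_network_ranges):
    """True only when every condition from the list is met."""
    in_range = any(
        ipaddress.ip_address(iface.address) in ipaddress.ip_network(prefix)
        for prefix in ospf_network_ranges
    )
    return (
        iface.ospf_enabled
        and not iface.passive
        and iface.network_type not in ("point-to-point", "point-to-multipoint")
        and in_range
    )


# R3's interface towards R4 before and after OSPF is enabled on it.
before = NextHopInterface("10.10.234.3", ospf_enabled=False, passive=False, network_type="broadcast")
after = NextHopInterface("10.10.234.3", ospf_enabled=True, passive=False, network_type="broadcast")

print(forwarding_address_is_set(before, ["10.10.234.0/24"]))
print(forwarding_address_is_set(after, ["10.10.234.0/24"]))
```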

 

So we have met all the conditions needed to set the FA. I hope that
you now have a better understanding of the forwarding address and,
as usual, post any questions or feedback in the comments field.

Some interesting facts of OSPF

July 25, 2012 5 comments

OK so clearly I haven’t been updating a lot lately due to my very busy situation. I’m sorry for that but my former colleague Henri keeps nagging me for an update so I decided to write on some interesting tidbits of OSPF that I gathered in my notes recently.

We start with doing MD5 authentication to show something interesting.

R1#sh run int f0/0
Building configuration...

Current configuration : 151 bytes
!
interface FastEthernet0/0
 ip address 12.12.12.1 255.255.255.0
 ip ospf authentication message-digest
 ip ospf 1 area 0

The adjacency comes up even though no key has been configured, everything should be fine right? CCIE candidates don’t get off that easily, we need to verify.

%OSPF-5-ADJCHG: Process 1, Nbr 12.12.12.2 on FastEthernet0/0 from LOADING to FULL, Loading Done
R1#sh ip ospf int f0/0 | be Message
  Message digest authentication enabled
      No key configured, using default key id 0

So we are using a default key even though we didn’t configure any key! What is the next step in verifying? Debug…

R1#debug ip ospf adj
OSPF adjacency events debugging is on
R1#
*Mar  1 00:08:11.327: OSPF: Send with youngest Key 0

Indeed, authentication is working without a key! Not that useful but still an interesting fact.

OSPF does not use key chains like EIGRP and RIP. What can we do if we want to change the key used without disrupting the adjacency? We start by defining a key with ID 3.

R1(config-if)#ip ospf message-digest-key 3 md5 cisco

Then we configure another key with ID 1.

R1(config-if)#ip ospf message-digest-key 1 md5 cisco1
R1#sh ip ospf int f0/0 | be Message
  Message digest authentication enabled
    Youngest key id is 1
    Rollover in progress, 1 neighbor(s) using the old key(s):
      key id 3

So the old key is using ID 3 and the router will accept that key until the other side is also configured for key with ID 1. Now what happens if we configure another key with ID 5?

R1#sh ip ospf int f0/0 | be Message
  Message digest authentication enabled
    Youngest key id is 5
    Rollover in progress, 1 neighbor(s) using the old key(s):
      key id 3
      key id 1

So now we have two old keys and the newest one is the one with ID 5. I just wanted to show that the ID itself does not decide which one is newer, the last one you enter is the youngest key. So if we want to do a rollover we simply configure one side with the newer key and then the other side and the adjacency won’t flap.
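
The rule that the youngest key is simply the last one configured, whatever its ID, can be captured in a tiny model. This is a sketch of the behavior observed above, not of the IOS implementation; the class is made up, and the secret for key 5 is a placeholder because the post does not show it.

```python
class OspfMd5Keys:
    """Tracks key IDs in the order they were configured on an interface."""

    def __init__(self):
        self._order = []    # key IDs, oldest first
        self._secrets = {}  # key ID -> secret

    def configure(self, key_id, secret):
        # Reconfiguring an existing key ID is not modelled here.
        self._order.append(key_id)
        self._secrets[key_id] = secret

    def youngest_key_id(self):
        return self._order[-1] if self._order else None

    def old_key_ids(self):
        return self._order[:-1]


keys = OspfMd5Keys()
keys.configure(3, "cisco")
keys.configure(1, "cisco1")
print(keys.youngest_key_id(), keys.old_key_ids())

keys.configure(5, "placeholder-secret")  # hypothetical secret
print(keys.youngest_key_id(), keys.old_key_ids())
```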

When the keys are matching the output will look like this:

R1#sh ip ospf int f0/0 | be Message
  Message digest authentication enabled
    Youngest key id is 5

You might have heard that OSPF is distance vector between areas. How can we prove this? Let's try a simple three-router setup that looks like this.

We configure OSPF according to the topology and check that R3 is receiving the loopback of R1.

R3#sh ip route 1.1.1.1
Routing entry for 1.1.1.1/32
  Known via "ospf 1", distance 110, metric 3, type inter area
  Last update from 23.23.23.2 on FastEthernet0/0, 00:00:02 ago
  Routing Descriptor Blocks:
  * 23.23.23.2, from 12.12.12.2, 00:00:02 ago, via FastEthernet0/0
      Route metric is 3, traffic share count is 1

Which it is. Now what happens if we use a distribute-list on R2? OSPF is link state, so shouldn't the LSA still be advertised?

R2(config)#ip prefix-list DENY_R1_LO deny 1.1.1.1/32
R2(config)#ip prefix-list DENY_R1_LO permit 0.0.0.0/0 le 32
R2(config)#router ospf 1
R2(config-router)#distribute-list prefix DENY_R1_LO in

Is the prefix still in R3's routing table?

R3#sh ip route 1.1.1.1
% Network not in table
R3#sh ip ospf data sum 1.1.1.1

            OSPF Router with ID (23.23.23.3) (Process ID 1)

There is not even an LSA there. What about R2?

R2#sh ip route 1.1.1.1
% Network not in table
R2#sh ip ospf data sum 1.1.1.1

            OSPF Router with ID (12.12.12.2) (Process ID 1)

                Summary Net Link States (Area 0)

  Routing Bit Set on this LSA
  LS age: 344
  Options: (No TOS-capability, DC, Upward)
  LS Type: Summary Links(Network)
  Link State ID: 1.1.1.1 (summary Network Number)
  Advertising Router: 12.12.12.1
  LS Seq Number: 80000001
  Checksum: 0x3ED4
  Length: 28
  Network Mask: /32
        TOS: 0  Metric: 1

There is a type 3 LSA originating from R1 but R2 is not originating one for area 2. It is proven that OSPF is distance vector between areas!
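
The behavior seen on R2 can be sketched as two steps: the ABR installs routes from its LSDB, then originates summaries into the other area only for routes it actually installed. This is a deliberately simplified model of what the outputs above show; both function names are made up.

```python
def install_routes(lsdb_prefixes, denied_prefixes):
    """Inbound distribute-list: the LSDB is untouched, the routing table is filtered."""
    return [prefix for prefix in lsdb_prefixes if prefix not in denied_prefixes]


def summaries_for_other_area(routing_table):
    """The ABR only advertises what it has installed, like a distance vector protocol."""
    return list(routing_table)


area0_lsdb_on_r2 = ["1.1.1.1/32"]

# Without the distribute-list, R2 installs the route and advertises it into area 2.
unfiltered = summaries_for_other_area(install_routes(area0_lsdb_on_r2, denied_prefixes=[]))

# With DENY_R1_LO applied, the route is not installed, so nothing is advertised.
filtered = summaries_for_other_area(install_routes(area0_lsdb_on_r2, denied_prefixes=["1.1.1.1/32"]))

print(unfiltered, filtered)
```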

Finally I want to show something that can be useful when you want to take a router out of service gracefully. Rather than just rebooting or shutting down links, it can be done this way. First we announce a loopback from R3 and verify that it is seen on R1.

R1#sh ip route 3.3.3.3
Routing entry for 3.3.3.3/32
  Known via "ospf 1", distance 110, metric 3, type inter area
  Last update from 12.12.12.2 on FastEthernet0/0, 00:00:00 ago
  Routing Descriptor Blocks:
  * 12.12.12.2, from 12.12.12.2, 00:00:00 ago, via FastEthernet0/0
      Route metric is 3, traffic share count is 1

OK, so the route is there. Now assume that we want to take R3 out of service. How can we do that? By setting the LSA to the maximum metric available. If there is any other path to reach the prefix that will be preferred.

R3(config-router)#max-metric router-lsa

Now we have a look at R2. All router LSAs from R3 now have a maximum metric of 65535. So the route is not installed in the RIB.

R2#sh ip ospf data router 23.23.23.3

            OSPF Router with ID (12.12.12.2) (Process ID 1)

                Router Link States (Area 2)

  Routing Bit Set on this LSA
  LS age: 30
  Options: (No TOS-capability, DC)
  LS Type: Router Links
  Link State ID: 23.23.23.3
  Advertising Router: 23.23.23.3
  LS Seq Number: 80000005
  Checksum: 0xEC21
  Length: 36
  Area Border Router
  Number of Links: 1

    Link connected to: a Transit Network
     (Link ID) Designated Router address: 23.23.23.3
     (Link Data) Router Interface address: 23.23.23.3
      Number of TOS metrics: 0
       TOS 0 Metrics: 65535

This means that we can do work on a router and announce all router LSAs with the maximum metric and when we are done we remove the maximum metric and traffic will once again flow through the router. It’s a good option for those planned maintenance windows.
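
The effect on path selection can be illustrated with a small shortest-path sketch. The four-node topology (S, A, B, D) and its costs are hypothetical and not part of the lab; the point is only that once router A advertises its links with the maximum metric, a more expensive alternate path wins.

```python
import heapq

MAX_METRIC = 65535


def shortest_path(graph, source, target):
    """Plain Dijkstra over {node: {neighbor: cost}}; returns (cost, path) or None."""
    queue = [(0, source, [source])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for neighbor, link_cost in graph.get(node, {}).items():
            if neighbor not in settled:
                heapq.heappush(queue, (cost + link_cost, neighbor, path + [neighbor]))
    return None


def apply_max_metric(graph, router):
    """Copy the graph with every link advertised by router set to the maximum metric."""
    updated = {node: dict(links) for node, links in graph.items()}
    updated[router] = {neighbor: MAX_METRIC for neighbor in graph[router]}
    return updated


# Hypothetical topology: a cheap path via A and an expensive path via B.
graph = {
    "S": {"A": 1, "B": 10},
    "A": {"S": 1, "D": 1},
    "B": {"S": 10, "D": 10},
    "D": {"A": 1, "B": 10},
}

print(shortest_path(graph, "S", "D"))
print(shortest_path(apply_max_metric(graph, "A"), "S", "D"))
```

Removing the maximum metric restores the original costs, and the path via A is preferred again.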

That’s all for this time!

Categories: CCIE, OSPF