Archive

Archive for the ‘OSPF’ Category

OSPF Design Considerations

March 6, 2015 2 comments

Introduction

Open Shortest Path First (OSPF) is a link state protocol that has been around for a long time. It is geneally well understood, but design considerations often focus on the maximum number of routers in an area. What other design considerations are important for OSPF? What can we do to build a scalable network with OSPF as the Interior Gateway Protocol (IGP)?

Prefix Suppression

The main goal of any IGP is to be stable, converge quickly and to provide loop free connectivity. OSPF is a link state protocol and all routers within an area maintain an identical Link State Data Base (LSDB). How the LSDB is built it out of scope for this post but one relevant factor is that OSPF by default advertises stub links for all the OSPF enabled interfaces. This means that every router running OSPF installs these transit links into the routing table. In most networks these routes are not needed, only connectivity between loopbacks is needed because peering is setup between the loopbacks. What is the drawback of this default behavior?

  • Larger LSDB
  • SPF run time increased
  • Growth of the routing table

To change this behavior, there is a feature called prefix suppression. When this feature is enabled the stub links are no longer advertised. The benefits of using prefix suppression is:

  • Smaller LSDB
  • Shorter SPF run time
  • Faster convergence
  • Remote attack vector decreased

If there needs to be connectivity to these prefixes for the sake of monitoring or other reasons, these prefixes can be carried in BGP.

How Many Routers in an Area?

The most common question is “How many routers in an Area?”. As usual, the answer is, it depends… In the past the hardware of routers such as the CPU and memory severely limited the scalability of an IGP but these are not much of an factor on modern devices. So what factors decide how many routers can be deployed in an area?

  • Number of adjacent neighbors
  • How much information is flooded in the area? How many LSAs are in the area?
  • Keep router LSA under MTU size
    – Implies lots of interfaces (and possibly lots of neighbors)
    – Exceeding the MTU leeds to IP fragmentation which should be avoided

It’s impossible to give an exact answer to how many routers that fit into an area. There are ISPs out there running hundreds or even thousands in the same area. Doing so creates a very large failure domain though, a misbehaving router or link will cause all routers in the area to run SPF. To create a smaller failure domain, areas could be used, on the other hand MPLS does not play well with areas… So it depends…That is also why we see technologies like BGP-LS where you can have IGP islands glued together by BGP.

How many ABRs in an Area?

How many ABRs are suitable in an area? The ABR is very critical in OSPF due to the distance vector behavior between areas in OSPF. Traffic must pass through the ABR. Having one ABR may not be enough but adding too many adds complexity and adds flooding of LSAs and increases the size of the LSDB.

ABR1

  • More ABRs create more Type 3 LSA replication within the backbone and in other areas
  • This can create scalability issues in large scale routing
  • 10 prefixes in area 0 and 10 prefixes in area 1 would generate 60 summary LSAs with just 3 ABRs
  • Increasing the number of areas or the number of ABRs would worsen the situation

How Many Areas per ABR?

Based on what we learned above, increasing the number of areas on an ABR quickly adds up to a lot of LSAs.

ABR2

This ABR is in four areas, if every area contains 10 prefixes, the ABR has to generate 120 Type 3 summary LSAs in total.

  • More areas per ABR puts more burden on the ABR
  • More type 3 LSAs will be generated by the ABR
  • Increasing the number of areas will worsen the situation

Considerations for Intra-Area Routing Scalability

To build a stable and scalable intra-area network, take the following parameters into consideration:

  • Physical link flaps can cause instability in OSPF
    – Consider using IP dampening

  • Avoid having physical links in OSPF through the use of prefix suppression
  • BGP can be used to carry transit links for monitoring purpose

Considerations for Inter-Area Routing Scalability

  • Filter physical links outside the area through type 3 filtering feature
  • Every area should only carry loopback addresses for all routers
  • NMS station may keep track of physical links if needed
  • These can be redistributed into BGP

OSPF Border Connections

OSPF always prefers intra-area paths to inter-area paths, regardless of metric. This may cause suboptimal routing under certain conditions.

Border1

  • Assume the link between D and E is in area 0
  • If the link between D and F fails, traffic will follow the intra area path D -> G, G -> E and E -> F

This could be solved by adding an extra interface/subinterface between D and E in area 1. This would increase the number of LSAs though…

OSPF Hub and Spoke

  • Make spoke areas as stubby as possible
    – If possible, make the area totally stubby
    – If redistribution is needed, make the area totally not-so-stubby

  • Be aware of reachability issues, make sure hub router becomes DR and use network type of Point to Multipoint (P2MP) if needed
    – P2MP has smaller DB size compared to Point to Point (P2P)
    – P2P will use more address space and increase the DB size compared to P2MP but it may be beneficial for other reasons when trying to achieve fast convergence

  • If the number of spokes is small, the hub and spokes can be placed within an area such as the backbone
  • If the number of spokes is large, make the hub an ABR and split off the spokes in area(s)

Summary

OSPF is a link state protocol that can scale to large networks if the network is designed according to the characteristics of OSPF. This post described design considerations and features such as prefix suppression that will help in scaling OSPF. For a deeper look at OSPF design, go through BRKRST-2337, available at Cisco Live 365.

Categories: CCDP, OSPF Tags: , ,

OSPF – Non Broadcast and Point to Multipoint

May 4, 2014 2 comments

Introduction

There was a discussion on a forum about the point to multipoint network type in
OSPF. What is the purpose of it and why are /32 endpoints advertised? To understand
the solution we must first understand the problem. This topology has a NBMA network
where R1 is the hub. R1 has been elected the hub due to having the highest priority.

Topology

NBMA Networks

When using the network type non broadcast, which is the default for main interfaces
with frame relay enabled, a DR and BDR is elected. When using a hub and spoke
topology it is important that the hub is elected the DR. Why is this? If a
spoke is elected the DR the flooding process will fail. On broadcast and non broadcast
segments the DROTHERs flood the LSAs to the DR/BDR and then the DR floods them to
the other DROTHERs on the segment.

The routing has already been setup, R2 and R3 are advertising a loopback each, they
are 2.2.2.2/32 and 3.3.3.3/32 respectively. First we will have a look at the router
LSAs that are generated.

R2#sh ip ospf data router 3.3.3.3

            OSPF Router with ID (2.2.2.2) (Process ID 1)

                Router Link States (Area 0)

  LS age: 118
  Options: (No TOS-capability, DC)
  LS Type: Router Links
  Link State ID: 3.3.3.3
  Advertising Router: 3.3.3.3
  LS Seq Number: 80000006
  Checksum: 0x5849
  Length: 48
  Number of Links: 2

    Link connected to: a Stub Network
     (Link ID) Network/subnet number: 3.3.3.3
     (Link Data) Network Mask: 255.255.255.255
      Number of TOS metrics: 0
       TOS 0 Metrics: 1

    Link connected to: a Transit Network
     (Link ID) Designated Router address: 10.0.0.1
     (Link Data) Router Interface address: 10.0.0.3
      Number of TOS metrics: 0
       TOS 0 Metrics: 64

Interfaces that are connected and don’t have an OSPF adjacency are considered
stub networks. These are advertised with the network and the mask. There is
also a transit network that contains the IP of the DR and the local routers IP
for that network. Note that there is no network mask within this LSA.

If we have a look at the routing table of R2, the next-hop to reach 3.3.3.3/32
is 10.0.0.3.

R2#sh ip route 3.3.3.3                
Routing entry for 3.3.3.3/32
  Known via "ospf 1", distance 110, metric 65, type intra area
  Last update from 10.0.0.3 on Serial1/0, 00:34:24 ago
  Routing Descriptor Blocks:
  * 10.0.0.3, from 3.3.3.3, 00:34:24 ago, via Serial1/0
      Route metric is 65, traffic share count is 1

On broadcast and non broadcast segments all routers are assumed to be fully
meshed but here we have a hub and spoke topology. When R2 needs to send traffic
to R3 it will try to encapsulate the frame but it does not know which DLCI to
use for 10.0.0.3 because there is no frame mapping for that.

R2#sh frame map
Serial1/0 (up): ip 10.0.0.1 dlci 201(0xC9,0x3090), dynamic,
              broadcast,, status defined, active

If we debug the sending of frame relay frames we will see that the encapsulation
is failing.

R2#debug frame-relay packet
Frame Relay packet debugging is on
R2#ping 3.3.3.3

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 3.3.3.3, timeout is 2 seconds:

*Mar  1 00:42:30.495: Serial1/0(o): dlci 201(0x3091), pkt type 0x800(IP), datagramsize 84
*Mar  1 00:42:32.303: Serial1/0:Encaps failed--no map entry link 7(IP).
*Mar  1 00:42:34.299: Serial1/0:Encaps failed--no map entry link 7(IP).
*Mar  1 00:42:36.299: Serial1/0:Encaps failed--no map entry link 7(IP).
*Mar  1 00:42:38.299: Serial1/0:Encaps failed--no map entry link 7(IP).
*Mar  1 00:42:40.299: Serial1/0:Encaps failed--no map entry link 7(IP).
Success rate is 0 percent (0/5)

The layer 3 topology is not consistent with the layer 2 topology. The layer 3 is
assumed to be fully meshed but our layer 2 is in fact hub and spoke.

The next-hop is not changed on broadcast and non broadcast segments because
all routers are assumed to be fully meshed.

Solving the Inconsistency

One way of solving the inconsistency is by adding static mappings for
the IP of R2 and R3 respectively to the correct DLCI.

R2#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
R2(config)#int s1/0
R2(config-if)#frame map ip 10.0.0.3 201
R3#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
R3(config)#int s1/0
R3(config-if)#frame map ip 10.0.0.2 301
R2#ping 3.3.3.3

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 3.3.3.3, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 36/51/76 ms

This solved our problem but requires manual intervention and if new routers are
added then new mappings would have to be added as well. Before we leave the
network type non-broadcast there is one more thing I would like to point out.

Network LSA

What is the role of the network LSA? It is twofold, it conveys both topology
information and reachability information. If we look at the network LSA that R1
generates.

R2#sh ip ospf data net 10.0.0.1

            OSPF Router with ID (2.2.2.2) (Process ID 1)

                Net Link States (Area 0)

  Routing Bit Set on this LSA
  LS age: 634
  Options: (No TOS-capability, DC)
  LS Type: Network Links
  Link State ID: 10.0.0.1 (address of Designated Router)
  Advertising Router: 1.1.1.1
  LS Seq Number: 80000003
  Checksum: 0x46C6
  Length: 36
  Network Mask: /24
        Attached Router: 1.1.1.1
        Attached Router: 2.2.2.2
        Attached Router: 3.3.3.3

We see all of the attached routers and also the network mask for the segment.
This network LSA will be converted by R4 into a type3 summary LSA. If we look
from R5’s perspective we can see this prefix.

R5#sh ip route 10.0.0.0
Routing entry for 10.0.0.0/24, 1 known subnets

O IA    10.0.0.0 [110/84] via 45.45.45.45, 00:45:27, FastEthernet0/0
R5#sh ip ospf data sum 10.0.0.0

            OSPF Router with ID (5.5.5.5) (Process ID 1)

                Summary Net Link States (Area 1)

  Routing Bit Set on this LSA
  LS age: 831
  Options: (No TOS-capability, DC, Upward)
  LS Type: Summary Links(Network)
  Link State ID: 10.0.0.0 (summary Network Number)
  Advertising Router: 4.4.4.4
  LS Seq Number: 80000002
  Checksum: 0x7364
  Length: 28
  Network Mask: /24
        TOS: 0  Metric: 74 

R5 can reach this network which is demonstrated by the ping.

R5#ping 10.0.0.3

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.0.0.3, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 60/64/72 ms

What would happen if R2 or R3 became the DR instead of R1? Let’s try it out.

R1#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
R1(config)#int s1/0
R1(config-if)#ip ospf prio 1
R2#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
R2(config)#int s1/0
R2(config-if)#ip ospf prio 100
R3#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
R3(config)#int s1/0
R3(config-if)#ip ospf prio 0

Then clear the process to see the effect.

The segment has now been split into two. R2 does not know about R3 so we have
a segment with R1 and R2 and then a segment with R1 and R3. This can be seen
from the neighbor table and from the network LSA that is generated.

R1#sh ip ospf nei

Neighbor ID     Pri   State           Dead Time   Address         Interface
4.4.4.4           1   FULL/DR         00:00:31    14.14.14.4      FastEthernet0/0
2.2.2.2         100   FULL/DR         00:01:44    10.0.0.2        Serial1/0
3.3.3.3           0   FULL/DROTHER    00:01:45    10.0.0.3        Serial1/0
R1#sh ip ospf data net 10.0.0.2

            OSPF Router with ID (1.1.1.1) (Process ID 1)

                Net Link States (Area 0)

  Routing Bit Set on this LSA
  LS age: 600
  Options: (No TOS-capability, DC)
  LS Type: Network Links
  Link State ID: 10.0.0.2 (address of Designated Router)
  Advertising Router: 2.2.2.2
  LS Seq Number: 80000001
  Checksum: 0x43D6
  Length: 32
  Network Mask: /24
        Attached Router: 2.2.2.2
        Attached Router: 1.1.1.1

This also means that SPF can’t run properly because R3 is now not connected to
the topology because it’s not part of the network LSA for 10.0.0.0/24. This
means that R5 can’t ping R3.

R5#ping 10.0.0.3

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.0.0.3, timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)

It can still reach R1 and R2 though.

R5#ping 10.0.0.1

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.0.0.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 40/52/84 ms
R5#ping 10.0.0.2

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.0.0.2, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 56/66/84 ms

So the network LSA is used both for building the SPF topology and for reachibility
information.

Point to Multipoint Network

There is a point to multipoint network type. This overcomes the limitations of
partially meshed networks by accomplishing two things. The first thing that happens
is that when a LSA is received on an interface, the next-hop is changed to IP of
the router LSA that the local router is connecting to. In our case this is 10.0.0.1.

R2#sh ip route 3.3.3.3
Routing entry for 3.3.3.3/32
  Known via "ospf 1", distance 110, metric 129, type intra area
  Last update from 10.0.0.1 on Serial1/0, 00:03:59 ago
  Routing Descriptor Blocks:
  * 10.0.0.1, from 3.3.3.3, 00:03:59 ago, via Serial1/0
      Route metric is 129, traffic share count is 1
R2#sh ip ospf data router 1.1.1.1

            OSPF Router with ID (2.2.2.2) (Process ID 1)

                Router Link States (Area 0)

  LS age: 483
  Options: (No TOS-capability, DC)
  LS Type: Router Links
  Link State ID: 1.1.1.1
  Advertising Router: 1.1.1.1
  LS Seq Number: 80000018
  Checksum: 0x6E61
  Length: 72
  Number of Links: 4

    Link connected to: a Transit Network
     (Link ID) Designated Router address: 14.14.14.4
     (Link Data) Router Interface address: 14.14.14.1
      Number of TOS metrics: 0
       TOS 0 Metrics: 10

    Link connected to: another Router (point-to-point)
     (Link ID) Neighboring Router ID: 2.2.2.2
     (Link Data) Router Interface address: 10.0.0.1
      Number of TOS metrics: 0
       TOS 0 Metrics: 64

    Link connected to: another Router (point-to-point)
     (Link ID) Neighboring Router ID: 3.3.3.3
     (Link Data) Router Interface address: 10.0.0.1
      Number of TOS metrics: 0
       TOS 0 Metrics: 64

    Link connected to: a Stub Network
     (Link ID) Network/subnet number: 10.0.0.1
     (Link Data) Network Mask: 255.255.255.255
      Number of TOS metrics: 0
       TOS 0 Metrics: 0

This is described in RFC 2328.

If the destination is a router which connects to the
calculating router via a Point-to-MultiPoint network, the
destination's next hop IP address(es) can be determined by
examining the destination's router-LSA: each link pointing
back to the calculating router and having a Link Data field
belonging to the Point-to-MultiPoint network provides an IP
address of the next hop router.

From the router LSA above you can see that each router in the point to multipoint
network advertises a stub network with a /32 mask. On point to multipoint segments
the network is described as a collection of point to point links. With a regular
point to point network the stub network is described with the actual mask of
the interface that is running OSPF. What is the use of these /32 routes?
First let’s test the reachability.

R2#ping 3.3.3.3 so lo0

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 3.3.3.3, timeout is 2 seconds:
Packet sent with a source address of 2.2.2.2 
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 48/60/72 ms
R2#ping 10.0.0.3

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.0.0.3, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 36/48/76 ms
R2#show ip route 10.0.0.3
Routing entry for 10.0.0.3/32
  Known via "ospf 1", distance 110, metric 128, type intra area
  Last update from 10.0.0.1 on Serial1/0, 00:12:41 ago
  Routing Descriptor Blocks:
  * 10.0.0.1, from 3.3.3.3, 00:12:41 ago, via Serial1/0
      Route metric is 128, traffic share count is 1

Full reachability. What happens if we filter the /32 route to R3 on R2?

R2(config)#ip prefix-list DENY_R3 deny 10.0.0.3/32         
R2(config)#ip prefix-list DENY_R3 permit 0.0.0.0/0 le 32   
R2(config)#router ospf 1
R2(config-router)#distribute-list prefix DENY_R3 in

The host route to R3 is now gone. The route to R3 is known via the connected
network.

R2#show ip route 10.0.0.3
Routing entry for 10.0.0.0/24
  Known via "connected", distance 0, metric 0 (connected, via interface)
  Routing Descriptor Blocks:
  * directly connected, via Serial1/0
      Route metric is 0, traffic share count is 1

What about reachability?

R2#ping 3.3.3.3 so lo0

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 3.3.3.3, timeout is 2 seconds:
Packet sent with a source address of 2.2.2.2 
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 28/57/84 ms
R2#ping 10.0.0.3

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.0.0.3, timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)

The ping went through when we used the loopback as a source. That is because
R3 still has a host route to R2 in its routing table. We couldn’t ping 10.0.0.3
though? Why? Because without the host route pointing to a next-hop of 10.0.0.1,
R2 will try to see what DLCI to use when encapsulating the frame to 10.0.0.3
and there is no mapping for this. If we ping R2’s loopback from R3 sourcing
with the 10.0.0.3 IP, it will fail for the same reason.

R3#ping 2.2.2.2

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 2.2.2.2, timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)

Conclusion

OSPF has various network types that accomplish different things.
The point to multi point network type is used to overcome limitations at
layer 2. The network is described as a collection of point to point links
where each router advertises a stub network with a /32 mask. By doing this
the spokes can maintain connectivity between each other when sourcing traffic
from the interface connected to their common network.

The next-hop is also changed on incoming LSAs to the IP contained in the router
LSA from the router that originated the LSA. This solves the next hop issue
on non broadcast networks where the next hop is maintained because full mesh
connectivity is assumed.

Why fast IGP timers aren’t always beneficial

March 31, 2014 4 comments

Introduction

When tuning your IGP of choice, the first thing people look at is usually the hello
and dead interval. This is a flawed logic, it is true that it can help in certain
cases but convergence consists of much more than just hello timers.

Why tune timers?

Detecting that the other side of the link is down is an important part of converging.
That’s why your design should avoid putting any bump in the wires such as converters
or a L2 cloud between the L3 endpoints. If you avoid such things when one end of the
link goes down the other end will as well which provides fast detection of failure.

In rare cases you can have the link being up but traffic is not passing over it. For
such cases or for those cases where there was no chance of avoiding a converter or
L2 cloud, tuning the hello timers can help with failure detection. The answer is almost
always BFD though, if the platform supports it.

Topologies where tuning timers is bad

When using a topology where VSS is involved such as Catalyst 6500 or Catalyst 4500,
tuning the timers is very bad. A common topology might look like this:

VSS1

The L3 switches are dually connected to the VSS. These L3 switches might be in the
distribution layer and the VSS is part of the core. The distribution switches run
LACP towards the VSS which acts as one device from an outside perspective.

The VSS runs Stateful Switchover (SSO) which syncs configuration, boots the standby
supervisor with the software and has the line cards ready to go in case of failure
of the primary chassis. Hardware forwarding tables are also synchronized, SSO
switchover takes somewhere up to 10 seconds.

SSO

The active VSS chassis runs the control plane. Routing protocols such as OSPF are not
HA aware, meaning that the state of the routing protocols is not synchronized between
the chassis.

When using fast timers and a switchover occurs, what happens is that OSPF detects that the
neighbor is not replying and tears down the adjacency. The secondary chassis then has to go
bring the adjacency back up by sending out hello packets, exchanging LSAs and updating
RIB/FIB. This may take as long as 20 seconds with the time included from the switchover.

VSS_failure

Non Stop Forwarding (NSF)

NSF combined with graceful restart is a technology used to forward packets when
a switchover has occured. The goal of NSF is to delay the failure detection which
may sound strange from a convergence perspective. Remember though that the VSS acts
as one device.

With NSF the forwarding is done according to the last known FIB entries. After a
switchover the secondary VSS will use graceful restart to inform its neighbors that
it has restarted and needs to synchronize its LSDB. This is done by sending hello packets
with a special bit set and the synchronization is done Out Of Band (OOB) to not tear
down the existing adjacency. The neighbors exchange LSAs and run SPF as normal. The
RIB and FIB can then be updated and and normal forwarding ensues.

This process is dependant on that the neighbors are also NSF aware otherwise they
would tear down the adjacency when the secondary VSS is restarting its routing
processes. So the key here is that the adjacency must stay up and that’s why timers
should be left at default if running VSS. This goes for both the VSS and any routers
that are neighbors to the VSS.

Conclusion

When using VSS always leave IGP timers at the default. Fast timers ruins the NSF
process and will lead to much higher convergence times than leaving them at the
default.

Some pointers on OSPF as PE to CE protocol

February 23, 2014 5 comments

There was a discussion at the Cisco Learning Network (CLN) about OSPF as PE to CE
protocol.
I wanted to provide some pointers on using OSPF as PE to CE protocol.

RFC 4577 describes how to use OSPF as PE to CE protocol. When using BGP to carry the
OSPF routes the MPLS backbone is seen as a super backbone. This adds another level of
hierarchy making OSPF three levels compared to the usual two when using plain OSPF.

Superbackbone

Because the the MPLS backbone is seen as a super area 0, that means that OSPF routes
going across the MPLS backbone can never be better than type 3 summary LSA. Even if
the same area is used on both sides of the backbone and the input is a type 1 or type 2
LSA it will be advertised as a summary LSA on the other side.

LSA across superbackbone

The only way to keep the type 1 or type 2 LSAs as they are is to use a sham link.
Sham links sets up a control plane mechanism acting as a tunnel for the LSAs passing
over the MPLS backbone. Sham links are outside the scope of this article.

A LSA can never be “better” than it originally was input as. This means that if the input
to the PE isa type 3 LSA this can never be converted to a type 1 or type 2 LSA on the other
side. If the LSA was type 5 external to begin it will be sent as type 5 on the other side
as well.

To understand how the LSAs are sent over the backbone, look at this picture.

MPBGP

OSPF LSA is sent to PE which is running OSPF in a VRF with the CPE. The PE installs
the LSA as a route in the OSPF RIB. If the route is the best one known to the router
it can install it to the global RIB.

The PE redistributes from OSPF into BGP. Only routes that are installed as OSPF in
the RIB will be redistributed. To be able to carry OSPF specific information the PE
has to add extended communities. To make the IPv4 route a VPNv4 route the PE has
to add the RD and RT values. The OSPF specific communities consist of:

Domain-ID

The domain ID can either be hard coded or derived from the OSPF process running.
It is used to identify if LSAs are sent into the same domain as they originated
from. If the domain ID matches then type 3 summary LSAs can be sent for routes
that were internal or inter area. If the domain ID does not match then all routes
must be sent as external.

Domain ID match

Domain ID 1

Domain ID non match

Domain ID 2

OSPF Route Type

The route type consists of area number, route type and options.

Route Type

If we look at a MPBGP update we can see the route type encoded.

R4#sh bgp vpnv4 uni rd 1:1 1.1.1.1/32
BGP routing table entry for 1:1:1.1.1.1/32, version 5
Paths: (1 available, best #1, table cust)
Flag: 0x820
  Not advertised to any peer
  Local
    2.2.2.2 (metric 21) from 2.2.2.2 (2.2.2.2)
      Origin incomplete, metric 11, localpref 100, valid, internal, best
      Extended Community: RT:1:1 OSPF DOMAIN ID:0x0005:0x000000020200 
        OSPF RT:0.0.0.0:2:0 OSPF ROUTER ID:22.22.22.22:0
      mpls labels in/out nolabel/18

Something that is a bit peculiar is that this update has a route type of 2 even though
it originated from a type 1 LSA. In the end it doesn’t make a difference because it will
be advertised as type 3 LSA to the CPE.

OSPF Router ID

The router ID of the router that originated the LSA (PE) is also carried as an extended
community.

R4#sh bgp vpnv4 uni rd 1:1 1.1.1.1/32
BGP routing table entry for 1:1:1.1.1.1/32, version 5
Paths: (1 available, best #1, table cust)
Flag: 0x820
  Not advertised to any peer
  Local
    2.2.2.2 (metric 21) from 2.2.2.2 (2.2.2.2)
      Origin incomplete, metric 11, localpref 100, valid, internal, best
      Extended Community: RT:1:1 OSPF DOMAIN ID:0x0005:0x000000020200 
        OSPF RT:0.0.0.0:2:0 OSPF ROUTER ID:22.22.22.22:0
      mpls labels in/out nolabel/18

MED

The MED is set to the OSPF metric + 1 as defined by the RFC.


R4#sh bgp vpnv4 uni rd 1:1 1.1.1.1/32
BGP routing table entry for 1:1:1.1.1.1/32, version 5
Paths: (1 available, best #1, table cust)
Flag: 0x820
  Not advertised to any peer
  Local
    2.2.2.2 (metric 21) from 2.2.2.2 (2.2.2.2)
      Origin incomplete, metric 11, localpref 100, valid, internal, best
      Extended Community: RT:1:1 OSPF DOMAIN ID:0x0005:0x000000020200 
        OSPF RT:0.0.0.0:2:0 OSPF ROUTER ID:22.22.22.22:0
      mpls labels in/out nolabel/18

The goal of these extended communities is to extend BGP so that OSPF LSAs can be
carried transparently as if BGP hadn’t been involved at all. LSAs are translated
to BGP updates and then translated back to LSAs.

If we look at a packet capture we can see the extended communities attached.
This BGP Update originated from a type 5 external LSA with metric-type 1.

Capture

When using OSPF as the PE to CE protocol it is important to remember the design
rules of OSPF. Because of that you should avoid designs like this:

OSPF1

In this design area 1 is used on both sides but the CPE is then connected to area 0
which makes it an ABR. The rules of OSPF dictate that summary LSAs must only be
received over area 0 if it is an ABR. This means this topology is broken and would
require changing area or using a virtual link.

OSPF as PE to CE protocol has some complexity but must of it is still plain OSPF
which is in itself a complicated protocol. Combine that with BGP and MPLS and
it is easy to get confused which protocol is responsible for what. That is also
one of the reasons that I recommend to use eBGP or static when customers connect
to their ISP.

Categories: BGP, MPLS, OSPF Tags: , , , , ,

Why OSPF FA is only set on broadcast networks

April 10, 2013 6 comments

A friend of mine asked me about the OSPF forwarding address. The question was why
must the network type be broadcast for the FA to be set? Why is not point to point
and point to multipoint network type valid?

First of all, what is the point of having a forwarding address? Look at the topology
below.

Forwarding_address_BGP

R3 is the only one running BGP to R4. If the FA is not set then there will be an
extra hop compared to R2 sending the traffic directly to R4.

R1#sh ip route 10.10.4.0
Routing entry for 10.10.4.0/24
  Known via "ospf 1", distance 110, metric 1
  Tag 4, type extern 2, forward metric 20
  Last update from 10.10.12.2 on FastEthernet0/0, 00:00:23 ago
  Routing Descriptor Blocks:
  * 10.10.12.2, from 10.10.23.3, 00:00:23 ago, via FastEthernet0/0
      Route metric is 1, traffic share count is 1
      Route tag 4

R1#sh ip ospf data ex 10.10.4.0

            OSPF Router with ID (10.10.12.1) (Process ID 1)

                Type-5 AS External Link States

  Routing Bit Set on this LSA
  LS age: 35
  Options: (No TOS-capability, DC)
  LS Type: AS External Link
  Link State ID: 10.10.4.0 (External Network Number )
  Advertising Router: 10.10.23.3
  LS Seq Number: 80000001
  Checksum: 0xEB7D
  Length: 36
  Network Mask: /24
        Metric Type: 2 (Larger than any link state path)
        TOS: 0 
        Metric: 1 
        Forward Address: 0.0.0.0
        External Route Tag: 4

R1#traceroute 10.10.4.4 num

Type escape sequence to abort.
Tracing the route to 10.10.4.4

  1 10.10.12.2 44 msec 44 msec 32 msec
  2 10.10.23.3 60 msec 36 msec 40 msec
  3 10.10.234.4 84 msec *  76 msec

Because the forwarding address is set to 0 the traffic must flow through the
ASBR originating the LSA.

Which conditions must be met to set the FA?

The interface on the ASBR must have OSPF enabled. It must not be passive and it
must be broadcast. Let’s enable this on R3.

R3#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
R3(config)#int f0/1
R3(config-if)#ip ospf 1 area 0

Now check the external LSA on R1 and a traceroute.

R1#sh ip ospf data ex 10.10.4.0

            OSPF Router with ID (10.10.12.1) (Process ID 1)

                Type-5 AS External Link States

  Routing Bit Set on this LSA
  LS age: 243
  Options: (No TOS-capability, DC)
  LS Type: AS External Link
  Link State ID: 10.10.4.0 (External Network Number )
  Advertising Router: 10.10.23.3
  LS Seq Number: 80000002
  Checksum: 0xF66E
  Length: 36
  Network Mask: /24
        Metric Type: 2 (Larger than any link state path)
        TOS: 0 
        Metric: 1 
        Forward Address: 10.10.234.4
        External Route Tag: 4

R1#traceroute 10.10.4.4 num

Type escape sequence to abort.
Tracing the route to 10.10.4.4

  1 10.10.12.2 48 msec 32 msec 64 msec
  2 10.10.234.4 96 msec *  88 msec

The traffic is now flowing directly via R2. The key point here is that in broadcast
networks all routers can communicate with each other (full mesh). We can see this by
looking at the type2 LSA.

R1#sh ip ospf data net 10.10.234.3

            OSPF Router with ID (10.10.12.1) (Process ID 1)

                Net Link States (Area 0)

  Routing Bit Set on this LSA
  LS age: 179
  Options: (No TOS-capability, DC)
  LS Type: Network Links
  Link State ID: 10.10.234.3 (address of Designated Router)
  Advertising Router: 10.10.23.3
  LS Seq Number: 80000001
  Checksum: 0x3485
  Length: 32
  Network Mask: /24
        Attached Router: 10.10.23.3
        Attached Router: 10.10.12.2

Why isn’t a point to point network valid? Well, the name pretty much says it all.
With point-to-point there can only be two routers connected so there is no use
in setting the FA because the traffic must flow through the router originating
the LSA.

If we look at the router LSA from R2 when we have broadcast network type it looks
like this:

R1#sh ip ospf data router 10.10.12.2

            OSPF Router with ID (10.10.12.1) (Process ID 1)

                Router Link States (Area 0)

  LS age: 7
  Options: (No TOS-capability, DC)
  LS Type: Router Links
  Link State ID: 10.10.12.2
  Advertising Router: 10.10.12.2
  LS Seq Number: 8000000A
  Checksum: 0x977B
  Length: 60
  Number of Links: 3

    Link connected to: a Transit Network
     (Link ID) Designated Router address: 10.10.234.3
     (Link Data) Router Interface address: 10.10.234.2
      Number of TOS metrics: 0
       TOS 0 Metrics: 1

    Link connected to: a Transit Network
     (Link ID) Designated Router address: 10.10.23.2
     (Link Data) Router Interface address: 10.10.23.2
      Number of TOS metrics: 0
       TOS 0 Metrics: 10

    Link connected to: a Transit Network
     (Link ID) Designated Router address: 10.10.12.1
     (Link Data) Router Interface address: 10.10.12.2
      Number of TOS metrics: 0
       TOS 0 Metrics: 10

You can see that the 10.10.234.0 is a transit network and then the type 2 LSA shows
which routers are connected and the network mask. Now if we change to point to point.

R1#sh ip ospf data router 10.10.12.2

            OSPF Router with ID (10.10.12.1) (Process ID 1)

                Router Link States (Area 0)

  LS age: 59
  Options: (No TOS-capability, DC)
  LS Type: Router Links
  Link State ID: 10.10.12.2
  Advertising Router: 10.10.12.2
  LS Seq Number: 8000000B
  Checksum: 0xF2E3
  Length: 72
  Number of Links: 4

    Link connected to: another Router (point-to-point)
     (Link ID) Neighboring Router ID: 10.10.23.3
     (Link Data) Router Interface address: 10.10.234.2
      Number of TOS metrics: 0
       TOS 0 Metrics: 1

    Link connected to: a Stub Network
     (Link ID) Network/subnet number: 10.10.234.0
     (Link Data) Network Mask: 255.255.255.0
      Number of TOS metrics: 0
       TOS 0 Metrics: 1

    Link connected to: a Transit Network
     (Link ID) Designated Router address: 10.10.23.2
     (Link Data) Router Interface address: 10.10.23.2
      Number of TOS metrics: 0
       TOS 0 Metrics: 10

    Link connected to: a Transit Network
     (Link ID) Designated Router address: 10.10.12.1
     (Link Data) Router Interface address: 10.10.12.2
      Number of TOS metrics: 0
       TOS 0 Metrics: 10

The 10.10.234.0 network is now a stub network which means it can’t be used for transit.
Usually there should only be two routers connected here, we shouldn’t use P2P network
type if there is an Ethernet segment with multiple routers.

So finally why is P2MP not valid? Because P2MP is used in NBMA networks. These networks
are usually partially meshed and from the perspective of OSPF it is a collection of
point to point links. This is how the LSA looks.

R1#sh ip ospf data router 10.10.12.2

            OSPF Router with ID (10.10.12.1) (Process ID 1)

                Router Link States (Area 0)

  LS age: 8
  Options: (No TOS-capability, DC)
  LS Type: Router Links
  Link State ID: 10.10.12.2
  Advertising Router: 10.10.12.2
  LS Seq Number: 8000000D
  Checksum: 0xFCD6
  Length: 72
  Number of Links: 4

    Link connected to: another Router (point-to-point)
     (Link ID) Neighboring Router ID: 10.10.23.3
     (Link Data) Router Interface address: 10.10.234.2
      Number of TOS metrics: 0
       TOS 0 Metrics: 1

    Link connected to: a Stub Network
     (Link ID) Network/subnet number: 10.10.234.2
     (Link Data) Network Mask: 255.255.255.255
      Number of TOS metrics: 0
       TOS 0 Metrics: 0

    Link connected to: a Transit Network
     (Link ID) Designated Router address: 10.10.23.2
     (Link Data) Router Interface address: 10.10.23.2
      Number of TOS metrics: 0
       TOS 0 Metrics: 10

    Link connected to: a Transit Network
     (Link ID) Designated Router address: 10.10.12.1
     (Link Data) Router Interface address: 10.10.12.2
      Number of TOS metrics: 0
       TOS 0 Metrics: 10

It looks very similar to P2P with the difference that the stub network has a mask
of /32. This is useful in partial mesh where spokes need to reach each other via
the hub and don’t have a DLCI between them.

So it only makes sense to use FA in broadcast networks because that is the only
place where routers are guaranteed to be able to communicate to each other because
it is by nature fully meshed.

Categories: OSPF Tags: , ,

Tiebreakers with routes from different OSPF processes

March 15, 2013 17 comments

This post is inspired by a discussion at Twitter with Ivan Pepelnjak and
Nicolas Michel. Nicolas asked what happens when there is the same route from two
different OSPF processes. Which one will be selected? Ivan explained how
to use the distance command. First before I show how it works and why we
need to get some few basic concepts explained.

LSDB – Link State Database – All OSPF LSAs populate the LSDB
RIB – Routing Information Base – The best routes from every protocol
compete to get installed to the RIB
FIB – Forwarding Information Base – Routes are copied from the RIB
and used for forwarding (CEF)
CEF – Cisco Express Forwarding – The algorithm that Cisco uses for
the forwarding (FIB)

If we have for example OSPF, this is how a route gets selected to the RIB(global).
The routers exchange LSAs with each other. Within an area every router has the same
view of the network. These LSAs populate the LSDB. If there are multiple paths to
a destination they will compete with each other unless they are of same type and equal
cost. Intra area is preferred first, then inter and finally external routes. There is no
way of modifying this behaviour. The best route then goes to the OSPF RIB, could be several
if they are equal. From there this route will compete with other routing protocols and the
AD will decide which one is installed. If the OSPF one is best then that one goes to the global
RIB. Then finally the RIB populates FIB with this information and forwarding can ensue.

This is a picture I made that describes the process.

Route_selection

We start out with a very basic topology looking like this.

Multiple_OSPF_1

R1 and R3 will announce the same network 1.1.1.1/32. R2 will use two different OSPF processes.
We start out with the basic configuration:

R1

R1(config)#int f1/0
R1(config-if)#ip add 12.12.12.1 255.255.255.0
R1(config-if)#no sh
R1(config-if)#ip ospf 1 area 0
R1(config-if)#int lo0
R1(config-if)#ip add 1.1.1.1 255.255.255.255
R1(config-if)#ip ospf 1 area 0

R2

R2(config)#int f1/0
R2(config-if)#ip add 12.12.12.2 255.255.255.0
R2(config-if)#no sh
R2(config-if)#ip ospf 1 area 0
R2(config-if)#int f1/1
R2(config-if)#ip add 23.23.23.2 255.255.255.0
R2(config-if)#no sh
R2(config-if)#ip ospf 3 area 0
%OSPF-5-ADJCHG: Process 1, Nbr 12.12.12.1 on FastEthernet1/0 from LOADING to FULL, Loading Done

We see the session coming up immediately. Now lets bring up R3 as well.

R3

R3(config)#int f1/0
R3(config-if)#ip add 23.23.23.3 255.255.255.0
R3(config-if)#no sh
R3(config-if)#ip ospf 3 area 0
R3(config-if)#int lo0
R3(config-if)#ip add 1.1.1.1 255.255.255.255
R3(config-if)#ip ospf 3 area 0
%OSPF-5-ADJCHG: Process 3, Nbr 23.23.23.2 on FastEthernet1/0 from LOADING to FULL, Loading Done

Both OSPF peerings are up. Now lets follow the steps that was shown in
the picture above starting by looking at the database.

R2#sh ip ospf data router 12.12.12.1

            OSPF Router with ID (23.23.23.2) (Process ID 3)

            OSPF Router with ID (12.12.12.2) (Process ID 1)

                Router Link States (Area 0)

  LS age: 184
  Options: (No TOS-capability, DC)
  LS Type: Router Links
  Link State ID: 12.12.12.1
  Advertising Router: 12.12.12.1
  LS Seq Number: 80000003
  Checksum: 0xF78
  Length: 48
  Number of Links: 2

    Link connected to: a Stub Network
     (Link ID) Network/subnet number: 1.1.1.1
     (Link Data) Network Mask: 255.255.255.255
      Number of MTID metrics: 0
       TOS 0 Metrics: 1

    Link connected to: a Transit Network
     (Link ID) Designated Router address: 12.12.12.1
     (Link Data) Router Interface address: 12.12.12.1
      Number of MTID metrics: 0
       TOS 0 Metrics: 1

We see that R1 is announcing 1.1.1.1/32 and we have a metric of 2 to it.
Do we see R3 announcing that as well?

R2#sh ip ospf data router 23.23.23.3

            OSPF Router with ID (23.23.23.2) (Process ID 3)

                Router Link States (Area 0)

  LS age: 148
  Options: (No TOS-capability, DC)
  LS Type: Router Links
  Link State ID: 23.23.23.3
  Advertising Router: 23.23.23.3
  LS Seq Number: 80000003
  Checksum: 0x54A7
  Length: 48
  Number of Links: 2

    Link connected to: a Stub Network
     (Link ID) Network/subnet number: 1.1.1.1
     (Link Data) Network Mask: 255.255.255.255
      Number of MTID metrics: 0
       TOS 0 Metrics: 1

    Link connected to: a Transit Network
     (Link ID) Designated Router address: 23.23.23.2
     (Link Data) Router Interface address: 23.23.23.3
      Number of MTID metrics: 0
       TOS 0 Metrics: 1

Yes, it’s there. Now we take a look at the OSPF RIB. Which ones do we see there?

R2#sh ip ospf rib

            OSPF Router with ID (23.23.23.2) (Process ID 3)


                Base Topology (MTID 0)

OSPF local RIB
Codes: * - Best, > - Installed in global RIB

*   1.1.1.1/32, Intra, cost 2, area 0
      via 23.23.23.3, FastEthernet1/1
*   23.23.23.0/24, Intra, cost 1, area 0, Connected
      via 23.23.23.2, FastEthernet1/1

            OSPF Router with ID (12.12.12.2) (Process ID 1)


                Base Topology (MTID 0)

OSPF local RIB
Codes: * - Best, > - Installed in global RIB

*>  1.1.1.1/32, Intra, cost 2, area 0
      via 12.12.12.1, FastEthernet1/0
*   12.12.12.0/24, Intra, cost 1, area 0, Connected
      via 12.12.12.2, FastEthernet1/0

The greater than sign indicates that the one from OSPF process 1 was selected.
Why? When running multiple OSPF processes the one that first installs to the
RIB will be selected to the global RIB. Now we confirm by looking in the
global RIB.

R2# show ip route ospf
Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route, H - NHRP, l - LISP
       + - replicated route, % - next hop override

Gateway of last resort is not set

      1.0.0.0/32 is subnetted, 1 subnets
O        1.1.1.1 [110/2] via 12.12.12.1, 00:06:35, FastEthernet1/0

Yes, that looks correct. Final step is to verify that FIB is also updated.

R2#sh ip cef 1.1.1.1/32
1.1.1.1/32
  nexthop 12.12.12.1 FastEthernet1/0

So the one that first writes to the global RIB wins. Now lets bring down the
process that is currently winning.

R2(config)#int f1/0
R2(config-if)#sh
R2(config-if)#

The OSPF RIB and global RIB should now be updated.

R2#show ip ospf rib

            OSPF Router with ID (23.23.23.2) (Process ID 3)


                Base Topology (MTID 0)

OSPF local RIB
Codes: * - Best, > - Installed in global RIB

*>  1.1.1.1/32, Intra, cost 2, area 0
      via 23.23.23.3, FastEthernet1/1
*   23.23.23.0/24, Intra, cost 1, area 0, Connected
      via 23.23.23.2, FastEthernet1/1

            OSPF Router with ID (12.12.12.2) (Process ID 1)


                Base Topology (MTID 0)

OSPF local RIB
Codes: * - Best, > - Installed in global RIB
R2#show ip route ospf
Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route, H - NHRP, l - LISP
       + - replicated route, % - next hop override

Gateway of last resort is not set

      1.0.0.0/32 is subnetted, 1 subnets
O        1.1.1.1 [110/2] via 23.23.23.3, 00:00:42, FastEthernet1/1

Now if we bring back OSPF process 1, what will happen? Process 3 should still be
winning since it installed to global RIB first.

R2(config)#int f1/0
R2(config-if)#no sh
R2#sh ip ospf rib

            OSPF Router with ID (2.2.2.2) (Process ID 11)


                Base Topology (MTID 0)

OSPF local RIB
Codes: * - Best, > - Installed in global RIB


            OSPF Router with ID (23.23.23.2) (Process ID 3)


                Base Topology (MTID 0)

OSPF local RIB
Codes: * - Best, > - Installed in global RIB

*   1.1.1.1/32, Intra, cost 2, area 0
      via 23.23.23.3, FastEthernet1/1
*   23.23.23.0/24, Intra, cost 1, area 0, Connected
      via 23.23.23.2, FastEthernet1/1

            OSPF Router with ID (12.12.12.2) (Process ID 1)


                Base Topology (MTID 0)

OSPF local RIB
Codes: * - Best, > - Installed in global RIB

*>  1.1.1.1/32, Intra, cost 2, area 0
      via 12.12.12.1, FastEthernet1/0
*   12.12.12.0/24, Intra, cost 1, area 0, Connected
      via 12.12.12.2, FastEthernet1/0

Now process 1 is winning, which is odd. Lets debug ip routing to see what is
really happening. We shutdown interface in process 1.

*Mar 14 23:26:36.555: RT: del 1.1.1.1 via 12.12.12.1, ospf metric [110/2]
*Mar 14 23:26:36.559: RT: delete subnet route to 1.1.1.1/32
*Mar 14 23:26:36.579: RT: updating ospf 1.1.1.1/32 (0x0):
    via 23.23.23.3 Fa1/1
*Mar 14 23:26:36.583: RT: add 1.1.1.1/32 via 23.23.23.3, ospf metric [110/2]

Now we bring back process 1.

*Mar 14 23:29:04.163: RT: updating ospf 1.1.1.1/32 (0x0):
    via 12.12.12.1 Fa1/0
*Mar 14 23:29:04.171: RT: closer admin distance for 1.1.1.1, flushing 1 routes
*Mar 14 23:29:04.175: RT: add 1.1.1.1/32 via 12.12.12.1, ospf metric [110/2]

We can see that IOS is claiming that distance is lower which it is clearly not.
What happens if we change process 1 to process 11 and we shutdown the interface
in process 3?

R2(config)#int f1/1
R2(config-if)#sh
R2(config-if)#int f1/0
R2(config-if)#ip ospf 11 area 0

Now we look at the output from the debug.

*Mar 14 23:33:27.615: RT: updating ospf 1.1.1.1/32 (0x0):
    via 12.12.12.1 Fa1/0

*Mar 14 23:33:27.619: RT: add 1.1.1.1/32 via 12.12.12.1, ospf metric [110/2]
*Mar 14 23:33:39.927: RT: updating connected 23.23.23.0/24 (0x0):
    via 0.0.0.0 Fa1/1
*Mar 14 23:33:39.931: RT: add 23.23.23.0/24 via 0.0.0.0, connected metric [0/0]
*Mar 14 23:33:39.939: RT: interface FastEthernet1/1 added to routing table
*Mar 14 23:33:39.947: RT: updating connected 23.23.23.2/32 (0x0):
    via 0.0.0.0 Fa1/1
*Mar 14 23:33:39.951: RT: network 23.0.0.0 is now variably masked
*Mar 14 23:33:39.951: RT: add 23.23.23.2/32 via 0.0.0.0, connected metric [0/0]
*Mar 14 23:33:55.447: RT: updating ospf 1.1.1.1/32 (0x0):
    via 23.23.23.3 Fa1/1
*Mar 14 23:33:55.455: RT: closer admin distance for 1.1.1.1, flushing 1 routes
*Mar 14 23:33:55.455: RT: add 1.1.1.1/32 via 23.23.23.3, ospf metric [110/2]

We can see that first process 11 is the only option available so the 1.1.1.1/32
route is installed via f1/0. Then f1/1 comes back up and now 1.1.1.1/32 is reachable
via f1/1 and is chosen because of “closer admin distance” which is not true. This must
mean that the OSPF process number is the tie breaker.

We take a look at the OSPF RIB and global RIB to verify once more.

R2#sh ip ospf rib

            OSPF Router with ID (22.22.22.22) (Process ID 11)


                Base Topology (MTID 0)

OSPF local RIB
Codes: * - Best, > - Installed in global RIB

*   1.1.1.1/32, Intra, cost 2, area 0
      via 12.12.12.1, FastEthernet1/0
*   12.12.12.0/24, Intra, cost 1, area 0, Connected
      via 12.12.12.2, FastEthernet1/0

            OSPF Router with ID (23.23.23.2) (Process ID 3)


                Base Topology (MTID 0)

OSPF local RIB
Codes: * - Best, > - Installed in global RIB

*>  1.1.1.1/32, Intra, cost 2, area 0
      via 23.23.23.3, FastEthernet1/1
*   23.23.23.0/24, Intra, cost 1, area 0, Connected
      via 23.23.23.2, FastEthernet1/1

            OSPF Router with ID (12.12.12.2) (Process ID 1)


                Base Topology (MTID 0)

OSPF local RIB
Codes: * - Best, > - Installed in global RIB

R2#sh ip route ospf
Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route, H - NHRP, l - LISP
       + - replicated route, % - next hop override

Gateway of last resort is not set

      1.0.0.0/32 is subnetted, 1 subnets
O        1.1.1.1 [110/2] via 23.23.23.3, 00:09:02, FastEthernet1/1

What if we change the AD of process 11?

R2(config)#router ospf 11
R2(config-router)#distance ospf intra-area 100
*Mar 14 23:43:31.315: RT: updating ospf 1.1.1.1/32 (0x0):
    via 12.12.12.1 Fa1/0

*Mar 14 23:43:31.319: RT: closer admin distance for 1.1.1.1, flushing 1 routes
*Mar 14 23:43:31.323: RT: add 1.1.1.1/32 via 12.12.12.1, ospf metric [100/2]

That makes process 11 win again. So these tests seems to indicate that if everything
is the same then the tiebreaker is the lowest process number. For EIGRP it is the
lowest AS number so maybe Cisco chose to make it comparable.
Also take a look at what Ivan is saying at IOS hints

Categories: OSPF, Routing Tags: , , , ,

ASBR in NSSA – Choosing what IP to use as forwarding address

September 20, 2012 5 comments

OSPF is one of the protocols where the details are very important. It has lots
of bits and pieces to make it run in a proper way. I have described the forwarding
address in an earlier post and this time I want to show how the IP that is used
as the forwarding address is selected. We start out with this simple topology.

It’s a very basic config where R1 is redistributing a route and running in a
NSSA area.

R1#sh run | s router ospf|ip route
router ospf 1
 router-id 1.1.1.1
 log-adjacency-changes
 area 10 nssa
 redistribute static subnets
ip route 100.0.0.0 255.0.0.0 Null0

Which IP will R1 use for its forwarding address? We look at R3.

R3#sh ip route ospf | i E2
O E2 100.0.0.0/8 [110/20] via 23.23.23.2, 00:57:59, FastEthernet0/0
R3#sh ip ospf data ex 100.0.0.0

            OSPF Router with ID (3.3.3.3) (Process ID 1)

                Type-5 AS External Link States

  Routing Bit Set on this LSA
  LS age: 120
  Options: (No TOS-capability, DC)
  LS Type: AS External Link
  Link State ID: 100.0.0.0 (External Network Number )
  Advertising Router: 2.2.2.2
  LS Seq Number: 80000005
  Checksum: 0x4AC0
  Length: 36
  Network Mask: /8
        Metric Type: 2 (Larger than any link state path)
        TOS: 0
        Metric: 20
        Forward Address: 12.12.12.1
        External Route Tag: 0

It has chosen its interface address towards R2. What if we enable OSPF on the other
Ethernet interface of R1?

R1(config)#int f0/1
R1(config-if)#ip ospf 1 area 10

We check R3 again.

R3#sh ip ospf data ex 100.0.0.0

            OSPF Router with ID (3.3.3.3) (Process ID 1)

                Type-5 AS External Link States

  Routing Bit Set on this LSA
  LS age: 25
  Options: (No TOS-capability, DC)
  LS Type: AS External Link
  Link State ID: 100.0.0.0 (External Network Number )
  Advertising Router: 2.2.2.2
  LS Seq Number: 80000006
  Checksum: 0x6676
  Length: 36
  Network Mask: /8
        Metric Type: 2 (Larger than any link state path)
        TOS: 0
        Metric: 20
        Forward Address: 112.112.112.1
        External Route Tag: 0

The forwarding address has changed. It selected the IP of the other Ethernet interface
of R1. We can see that it prefers to choose a higher IP address. What if we announce
the loopback of R1 in the NSSA area?

R1(config-if)#int lo0
R1(config-if)#ip ospf 1 area 10
R3#sh ip ospf data ex 100.0.0.0

            OSPF Router with ID (3.3.3.3) (Process ID 1)

                Type-5 AS External Link States

  Routing Bit Set on this LSA
  LS age: 27
  Options: (No TOS-capability, DC)
  LS Type: AS External Link
  Link State ID: 100.0.0.0 (External Network Number )
  Advertising Router: 2.2.2.2
  LS Seq Number: 80000007
  Checksum: 0xAE53
  Length: 36
  Network Mask: /8
        Metric Type: 2 (Larger than any link state path)
        TOS: 0
        Metric: 20
        Forward Address: 11.11.11.11
        External Route Tag: 0

Now the loopback IP is chosen instead. So since the loopback has a lower IP but still
is preferred we can see that loopbacks are preferred in the selection. To see this
clearly defined in words we reference RFC 3101 section 2.3.

When a router is forced to pick a forwarding address for a Type-7
LSA, preference should be given first to the router's internal
addresses (provided internal addressing is supported).  If internal
addresses are not available, preference should be given to the
router's active OSPF stub network addresses.  These choices avoid the
possible extra hop that may happen when a transit network's address
is used.  When the interface whose IP address is the LSA's forwarding
address transitions to a Down state (see [OSPF] Section 9.3), the
router must select a new forwarding address for the LSA and then re-
originate it.  If one is not available the LSA should be flushed.

So the selection process is to choose the highest IP of a loopback advertised
into the NSSA area. If no loopback is advertised then choose the highest
physical interface IP advertised into the NSSA area.

I hope that I have provide another piece to the OSPF puzzle and you now have
a good understanding of the forwarding address.