Archive for September, 2012

ASBR in NSSA – Choosing what IP to use as forwarding address

September 20, 2012 5 comments

OSPF is one of those protocols where the details really matter. It has lots
of bits and pieces that must fit together for it to run properly. I described the
forwarding address in an earlier post, and this time I want to show how the IP that
is used as the forwarding address is selected. We start out with this simple topology.

It’s a very basic config where R1 is redistributing a static route and running in
an NSSA area.

R1#sh run | s router ospf|ip route
router ospf 1
 router-id 1.1.1.1
 log-adjacency-changes
 area 10 nssa
 redistribute static subnets
ip route 100.0.0.0 255.0.0.0 Null0

Which IP will R1 use for its forwarding address? We look at R3.

R3#sh ip route ospf | i E2
O E2 100.0.0.0/8 [110/20] via 23.23.23.2, 00:57:59, FastEthernet0/0
R3#sh ip ospf data ex 100.0.0.0

            OSPF Router with ID (3.3.3.3) (Process ID 1)

                Type-5 AS External Link States

  Routing Bit Set on this LSA
  LS age: 120
  Options: (No TOS-capability, DC)
  LS Type: AS External Link
  Link State ID: 100.0.0.0 (External Network Number )
  Advertising Router: 2.2.2.2
  LS Seq Number: 80000005
  Checksum: 0x4AC0
  Length: 36
  Network Mask: /8
        Metric Type: 2 (Larger than any link state path)
        TOS: 0
        Metric: 20
        Forward Address: 12.12.12.1
        External Route Tag: 0

It has chosen its interface address towards R2. What if we enable OSPF on the other
Ethernet interface of R1?

R1(config)#int f0/1
R1(config-if)#ip ospf 1 area 10

We check R3 again.

R3#sh ip ospf data ex 100.0.0.0

            OSPF Router with ID (3.3.3.3) (Process ID 1)

                Type-5 AS External Link States

  Routing Bit Set on this LSA
  LS age: 25
  Options: (No TOS-capability, DC)
  LS Type: AS External Link
  Link State ID: 100.0.0.0 (External Network Number )
  Advertising Router: 2.2.2.2
  LS Seq Number: 80000006
  Checksum: 0x6676
  Length: 36
  Network Mask: /8
        Metric Type: 2 (Larger than any link state path)
        TOS: 0
        Metric: 20
        Forward Address: 112.112.112.1
        External Route Tag: 0

The forwarding address has changed. It selected the IP of the other Ethernet interface
of R1, so among the physical interfaces the higher IP address is preferred. What if we
announce the loopback of R1 in the NSSA area?

R1(config-if)#int lo0
R1(config-if)#ip ospf 1 area 10
R3#sh ip ospf data ex 100.0.0.0

            OSPF Router with ID (3.3.3.3) (Process ID 1)

                Type-5 AS External Link States

  Routing Bit Set on this LSA
  LS age: 27
  Options: (No TOS-capability, DC)
  LS Type: AS External Link
  Link State ID: 100.0.0.0 (External Network Number )
  Advertising Router: 2.2.2.2
  LS Seq Number: 80000007
  Checksum: 0xAE53
  Length: 36
  Network Mask: /8
        Metric Type: 2 (Larger than any link state path)
        TOS: 0
        Metric: 20
        Forward Address: 11.11.11.11
        External Route Tag: 0

Now the loopback IP is chosen instead. The loopback has a lower IP than
112.112.112.1 but is still preferred, so loopbacks win over physical interfaces in
the selection. To see this clearly defined in words we reference RFC 3101 section 2.3.

When a router is forced to pick a forwarding address for a Type-7
LSA, preference should be given first to the router's internal
addresses (provided internal addressing is supported).  If internal
addresses are not available, preference should be given to the
router's active OSPF stub network addresses.  These choices avoid the
possible extra hop that may happen when a transit network's address
is used.  When the interface whose IP address is the LSA's forwarding
address transitions to a Down state (see [OSPF] Section 9.3), the
router must select a new forwarding address for the LSA and then re-
originate it.  If one is not available the LSA should be flushed.

So the selection process is to choose the highest IP of a loopback advertised
into the NSSA area. If no loopback is advertised then choose the highest
physical interface IP advertised into the NSSA area.
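
The selection rule above can be sketched in a few lines of Python. This is only
a model of what we observed in the outputs, not how IOS implements it. The
Interface type, the field names and the interface labels ("to_R2", "f0/1",
"lo0") are made up for the illustration.

```python
from ipaddress import IPv4Address
from typing import List, NamedTuple, Optional


class Interface(NamedTuple):
    name: str          # label only, hypothetical
    address: str
    is_loopback: bool
    in_nssa: bool      # OSPF enabled on the interface in the NSSA area


def pick_forwarding_address(interfaces: List[Interface]) -> Optional[str]:
    """Model of the observed rule: loopbacks first, then highest IP."""
    candidates = [i for i in interfaces if i.in_nssa]
    loopbacks = [i for i in candidates if i.is_loopback]
    pool = loopbacks or candidates  # fall back to physical interfaces
    if not pool:
        return None  # nothing usable; RFC 3101 says the LSA should be flushed
    return max(pool, key=lambda i: IPv4Address(i.address)).address


# Step 1: only the interface towards R2 runs OSPF in area 10.
r1 = [
    Interface("to_R2", "12.12.12.1", False, True),
    Interface("f0/1", "112.112.112.1", False, False),
    Interface("lo0", "11.11.11.11", True, False),
]
print(pick_forwarding_address(r1))  # 12.12.12.1

# Step 2: enable OSPF on f0/1, the higher physical IP wins.
r1[1] = r1[1]._replace(in_nssa=True)
print(pick_forwarding_address(r1))  # 112.112.112.1

# Step 3: enable OSPF on lo0, the loopback wins despite the lower IP.
r1[2] = r1[2]._replace(in_nssa=True)
print(pick_forwarding_address(r1))  # 11.11.11.11
```

The three printed addresses match the Forward Address field seen on R3 in the
three outputs above.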

I hope that I have provided another piece of the OSPF puzzle and that you now have
a good understanding of the forwarding address.

Some important details of BGP

September 14, 2012 15 comments

We start out with a basic topology of 3 routers.

R2 and R3 will peer to each other’s loopbacks. I have set up OSPF for full reachability
in the network. First we test connectivity.

R2#ping 3.3.3.3 so lo0

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 3.3.3.3, timeout is 2 seconds:
Packet sent with a source address of 2.2.2.2
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 40/53/80 ms

There is connectivity. We set up the peering and set ebgp-multihop to 2 since
this is what most people do. I will explain why this is not a good idea.

R2(config)#router bgp 1
R2(config-router)#nei 3.3.3.3 remote-as 3
R2(config-router)#nei 3.3.3.3 update-source loopback 0
R2(config-router)#nei 3.3.3.3 ebgp-multihop 2
R3(config)#router bgp 3
R3(config-router)#nei 2.2.2.2 remote-as 1
R3(config-router)#nei 2.2.2.2 update-source loopback 0
R3(config-router)#nei 2.2.2.2 ebgp-multihop 2

The session comes up.

 %BGP-5-ADJCHANGE: neighbor 2.2.2.2 Up

All good so far. We are not advertising anything yet. We add another loopback
on R3 and advertise that into BGP. We check if R2 is receiving it.

R2#sh bgp ipv4 uni
BGP table version is 3, local router ID is 2.2.2.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 33.33.33.33/32   3.3.3.3                  0             0 3 i

It looks good so far. Now let’s think for a while about what ebgp-multihop
actually does. The default setting for eBGP is to check that incoming BGP
packets are destined for a directly connected interface. So the default is
to do a connected-check and ebgp-multihop = 1. When we set ebgp-multihop 2
the outgoing TTL is set to 2 and the connected-check is disabled. We confirm
this with a packet capture.

So the TTL is set to 2. Is this really necessary? The common argument is that
because we are peering to a loopback the TTL must be set to 2, since the
TTL is decremented before reaching the loopback. But when do routers modify packets
before transmitting them? On the egress interface, right? We test this theory by
setting up a peering between R1 and R3. We will not use ebgp-multihop to begin
with, and we will run debug ip icmp. We have to disable the connected-check,
otherwise BGP will just stay idle because a loopback can never be directly
connected.

R1(config-router)#nei 3.3.3.3 remote-as 3
R1(config-router)#nei 3.3.3.3 update-source lo0
R1(config-router)#nei 3.3.3.3 disable-connected-check
R3(config-router)#nei 1.1.1.1 remote-as 1
R3(config-router)#nei 1.1.1.1 update lo0
R3(config-router)#nei 1.1.1.1 disable-connected-check

We can now see that R2 is sending ICMP time exceeded messages to R1 and R3.

R1: ICMP: time exceeded rcvd from 12.12.12.2
R3: ICMP: time exceeded rcvd from 23.23.23.2

This is because the TTL was set to 1. The TTL expired in transit at R2.

Now we set up a peering between R1 and R2 using the loopbacks. We will disable
the connected-check.

R1(config-router)#nei 2.2.2.2 remote-as 1
R1(config-router)#nei 2.2.2.2 update lo0
R1(config-router)#nei 2.2.2.2 disable-connected-check
R2(config-router)#nei 1.1.1.1 remote-as 1
R2(config-router)#nei 1.1.1.1 update lo0
R2(config-router)#nei 1.1.1.1 disable-connected-check

Some people say that the TTL must be 2 for the peering to come up when using
loopbacks. We will prove that this is wrong. The real reason the peering does not
come up is that BGP checks whether the peer is directly connected. We take a
look at a BGP packet sent when using the disable-connected-check.

We clearly see that the TTL is 1 but the session still comes up. This proves
that it is not the TTL that is expiring when peering to loopbacks!

R1#sh bgp all sum
For address family: IPv4 Unicast
BGP router identifier 1.1.1.1, local AS number 1
BGP table version is 9, main routing table version 9
2 network entries using 240 bytes of memory
2 path entries using 104 bytes of memory
3/2 BGP path/bestpath attribute entries using 372 bytes of memory
1 BGP AS-PATH entries using 24 bytes of memory
0 BGP route-map cache entries using 0 bytes of memory
0 BGP filter-list cache entries using 0 bytes of memory
Bitfield cache entries: current 1 (at peak 2) using 32 bytes of memory
BGP using 772 total bytes of memory
BGP activity 5/3 prefixes, 5/3 paths, scan interval 60 secs

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
2.2.2.2         4     2      83      80        9    0    0 00:02:45        1
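
The two experiments can be summed up in a small Python model. It is a sketch
of the reasoning in this post and not of any real BGP code. The function names
deliver and session_comes_up are invented for the illustration. The assumption
is that only a router that forwards a packet decrements the TTL, and the router
that owns the destination address (loopback or not) just receives it.

```python
from typing import List


def deliver(ttl: int, transit_routers: List[str]) -> str:
    """Walk a packet through the routers that have to forward it."""
    for router in transit_routers:
        ttl -= 1  # decremented only by a router that forwards the packet
        if ttl <= 0:
            return f"time exceeded from {router}"
    return "delivered"


def session_comes_up(ttl: int, transit_routers: List[str],
                     connected_check: bool,
                     peer_is_connected: bool = False) -> bool:
    """A loopback peer is never 'directly connected', so the check fails first."""
    if connected_check and not peer_is_connected:
        return False  # BGP stays idle, the TTL never comes into play
    return deliver(ttl, transit_routers) == "delivered"


# R1 -> R3 loopback with TTL 1: R2 has to forward the packet and drops it.
print(deliver(1, ["R2"]))                                # time exceeded from R2
# R1 -> R2 loopback with TTL 1: no router in between, nothing decrements.
print(deliver(1, []))                                    # delivered

# Neighbor loopback, default settings (TTL 1, connected-check on).
print(session_comes_up(1, [], connected_check=True))     # False
# disable-connected-check: TTL is still 1 and the session comes up.
print(session_comes_up(1, [], connected_check=False))    # True
# ebgp-multihop 2: also comes up, but the extra TTL was never needed.
print(session_comes_up(2, [], connected_check=False))    # True
```

The point of the model is that the failing case for directly connected routers
is the connected-check, not the TTL.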

Finally I want to bring up another disadvantage of using the ebgp-multihop
command when peering between directly connected routers using loopbacks.
We have a peering between R2 and R3. What happens when we shut down the
interface on either router?

R2(config-router)#int f1/0
R2(config-if)#sh
R2(config-if)#
%OSPF-5-ADJCHG: Process 1, Nbr 3.3.3.3 on FastEthernet1/0 from FULL to DOWN, Neighbor Down: Interface down or detached
R2(config-if)#
%LINK-5-CHANGED: Interface FastEthernet1/0, changed state to administratively down
%LINEPROTO-5-UPDOWN: Line protocol on Interface FastEthernet1/0, changed state to down
R2(config-if)#do sh bgp ipv4 uni
BGP table version is 11, local router ID is 2.2.2.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*  33.33.33.33/32   3.3.3.3                  0             0 3 i

When we shut down the interface the peering still stays up. This is because
fast-external-fallover cannot be used together with ebgp-multihop. This could
lead to blackholes since the peering stays up until the hold time expires (180s). In our
case we have no valid next-hop, so the route has lost its best marker (>), but what if we
put in a default route?

R2(config)#ip route 0.0.0.0 0.0.0.0 12.12.12.1
R2(config)#int f1/0
R2(config-if)#sh
R2(config-if)#do
%OSPF-5-ADJCHG: Process 1, Nbr 3.3.3.3 on FastEthernet1/0 from FULL to DOWN, Neighbor Down: Interface down or detached
R2(config-if)#do
%LINK-5-CHANGED: Interface FastEthernet1/0, changed state to administratively down
%LINEPROTO-5-UPDOWN: Line protocol on Interface FastEthernet1/0, changed state to down
R2(config-if)#do sh bgp ipv4 uni
BGP table version is 12, local router ID is 2.2.2.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 33.33.33.33/32   3.3.3.3                  0             0 3 i
R2(config-if)#do sh bgp ipv4 uni
BGP table version is 12, local router ID is 2.2.2.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 33.33.33.33/32   3.3.3.3                  0             0 3 i

Now the route stays in the BGP table as best (*>) until the hold time expires, which
creates a black hole. The default route makes sure a next-hop is always
resolvable, even though the link to R3 is down.
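
Why the default route matters can be shown with a small longest-prefix-match
sketch in Python. The resolve function is made up for the illustration and the
prefix lengths (/32 and /24) are assumptions, since the masks are not shown in
the outputs above. It only models whether the BGP next hop 3.3.3.3 can be
resolved in the routing table.

```python
from ipaddress import ip_address, ip_network
from typing import List, Optional


def resolve(next_hop: str, rib: List[str]) -> Optional[str]:
    """Return the longest matching prefix for next_hop, or None."""
    addr = ip_address(next_hop)
    matches = [ip_network(p) for p in rib if addr in ip_network(p)]
    if not matches:
        return None
    return str(max(matches, key=lambda n: n.prefixlen))


# Link to R3 up: OSPF has installed the loopback of R3 (assumed to be a /32).
print(resolve("3.3.3.3", ["12.12.12.0/24", "23.23.23.0/24", "3.3.3.3/32"]))
# -> 3.3.3.3/32

# Link to R3 shut down, no default route: the next hop cannot be resolved.
print(resolve("3.3.3.3", ["12.12.12.0/24"]))
# -> None

# Link to R3 shut down, default route towards R1: the next hop still resolves.
print(resolve("3.3.3.3", ["12.12.12.0/24", "0.0.0.0/0"]))
# -> 0.0.0.0/0
```

In the last case the next hop resolves through the default route, so the BGP
route keeps looking usable while traffic is sent the wrong way until the
session finally times out.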

With this post I hope you have gained a better understanding of these BGP features
and how a router handles control plane packets. As usual, post in the comments
if you have any feedback or questions.

INE 10 day bootcamp – Review

September 4, 2012 10 comments

I’m back from London and it’s been a great experience. Many readers are interested in what
the bootcamp is like. It is a big investment, so it is understandable that you
want to know if it will be worth it. I’ll start by describing the teacher and his teaching
methods.

Brian Dennis is a well known and respected man in the networking industry. He is CCIE #2210
and holds 5 CCIEs, which puts him among the very best in the world. Brian is not one of those
academic guys that only knows what is written in a book. He has a solid background in the
industry, which means he can explain WHY things are the way they are instead of just stating
facts without any reasoning behind them. There will be NO powerpoints. It is CLI only, and
although he has a topology he is using, the configurations are not prebuilt. He will do
them live, which means there will be issues, which is GOOD. You get to see a 5x CCIE
troubleshooting, and since he hasn’t prepared the faults beforehand you will see how he would
troubleshoot a live problem, which is very good practice for the TS section of the CCIE lab.
Brian is a strong believer that there are no tips and tricks. If you have an
instructor teaching you all these tips and tricks then that instructor is a fake.
If you know the technology there are no tips and tricks. Sure, he can teach you some
useful commands, but there are no tips and tricks in routing protocols.

Jeremy Brown is the bootcamp coordinator. He’s a very nice guy and he will help you
with any queries you have about the bootcamp. If you are attending you will be
talking to him for sure.

When you start the class the first day you will be handed a folder with paper and
a pen and some contact information. Brian will introduce himself and give some
general guidelines and explain how the real lab works with the TS section,
the configuration section etc. Then everyone gets to introduce themselves. My class
had a lot of nationalities: Bolivia, France, Venezuela, Sweden, UK, Ireland,
Norway and Hungary were all represented.

The bootcamp runs from 9 AM to about 7-8 PM.
There will be some 15 minute breaks and a 1.5h lunch break. These are long
days indeed so make sure to get enough sleep at night. This is a pure
learning experience, leave the partying for another time. If you want to
have some fun there will be time on the weekend for that.

The first day is about layer two. Since the configuration is built from
scratch it makes sense to start out with layer two. The topology used
is based on Cisco 360 with 5 routers and 4 switches. The routers are ISR
routers and the switches are 3560’s. It is good that this topology is
used since that is very similar to what is being used in the real lab.
When attending the bootcamp you are expected to have a good knowledge
of the protocols and to have watched the INE ATC videos. This is so
that you don’t get overwhelmed by the information in the bootcamp.
The layer two section focused on MST, PPP and frame relay and
spanning tree features like BPDU guard, BPDU filter etc. One piece of advice
that Brian gave is to try to mix in things like PPP, PPPoE, PPPoFR
etc in your labs so that you get used to using these technologies.

Later in the week we moved on to IGPs. OSPF will be the main topic.
This is natural since OSPF is guaranteed to be in your lab and you
REALLY need to know OSPF to pass the lab. Brian is an OSPF
machine, he knows the LSDB like the back of his hand. He is very
methodical and will confirm each step and show you in the LSDB
what we are seeing and why we are seeing it. He’s not one of
those guys that clears the routing table when he runs into a
glitch, he will explain how and why it is there. He had a very
good section about the forwarding address, this is an important
part of OSPF and Brian explained why it is used. He had a very
good analogy with BGP where basically if the FA is not set then
you are using next-hop-self and if it is set then the next-hop
is preserved. He also had a good explanation of the capability
transit feature and he did some great diagrams showing which
LSAs go where. This is basic knowledge but he put it so well in
that diagram. We also talked about virtual links and things like
that. One good command he showed was the show ip ospf rib
command. EIGRP and RIP will be shorter sections, he will only
show some more advanced configuration since these protocols are
a bit simpler to understand. For EIGRP he showed how to do
unequal cost load balancing and how to calculate the metric
if you want to get a certain ratio. He showed how to do
offset-list, leak maps and authentication.

After we were done with IGPs we moved on to route redistribution.
This topic alone is enough to provide a good bootcamp experience.
Brian will in detail explain the difference between control plane
and data plane loops and why loops can occur. The important thing
to remember is that we are trying to protect the routes with a
high AD from being learned in a protocol with a lower AD. Usually
RIP is involved or EIGRP external routes since those have a high
AD. Brian will show you how to take any INE Vol2 lab topology
diagram and just look at it and identify potential issues.
This is a very good practice and when you can look at a diagram
and know what to do without even thinking about configuration
yet then you are in a good place. Brian will with his diagrams
show you where every command lives like the OSPF LSDB, OSPF RIB,
RIB, FIB etc. This is very good practice to make sure you have
a full understanding of what is going on.

BGP is of course an important topic and Brian covers that
for sure. Brian starts by describing peering and goes through
some common misconceptions. One of them is that BGP has
authentication. It does not, wait for it… TCP does. It is
TCP providing the authentication of packets and not BGP.
He will explain concepts like hot potato vs cold potato routing.
He will show you the difference between disable-connected-check and
ebgp-multihop. He will teach you about route reflectors and
confederations and why you want to use one or the other.
He will also explain MED in detail, something I found very useful,
explaining how deterministic MED works and always-compare-med.
He has such knowledge of everything and one thing I didn’t know
before is that networks in the BGP table are sorted by age where
the youngest network is listed first.

Building on BGP means MPLS comes naturally. These go hand in
hand and for the v4 CCIE lab you need to know MPLS. Brian
will of course explain the use of RD and RT. Remember that RD
only has a use in BGP. He shows where all the commands and
routes live and how to do troubleshooting for MPLS. The good
thing is that you will run into things that you maybe didn’t
think about and that will provide great troubleshooting practice. OSPF
is the most complicated PE-CE protocol and he will give you all
the details how to use Domain-ID, sham links and how the
external route tag and DN bit works.

First week is over. Time for some recovery. Have some fun and
go for some sightseeing or just do labs, the choice is yours.
Just make sure that you are well rested for when Monday comes.

The second week started out with multicast. This was maybe my
favourite topic and I learned a lot from this section.
As I mentioned earlier Brian doesn’t believe in tips and tricks
and multicast is one of those topics where people have a lack
of understanding and that is why they go looking for tricks.
Multicast is 90% about PIM, you need to know PIM if you want
to be good with multicast. Brian shows common errors like having
a broken SPT or RPF failures and things like that. These usually
occur when hub and spoke frame relay is involved. With just a
few commands you can become very good at analyzing multicast.
Show ip pim interface, show ip pim neighbor,
show ip rpf x.x.x.x and show ip pim rp mapping will give you most
of the information you will need. The best thing about the
multicast section was that when we ran into errors Brian was very
methodical, instead of just pinging over and over he showed us
what was wrong and then cleared the mroute table, this will
make the mtree build again so that you always go back to a
well known state. It is probably common to have the correct
configuration but move away from it due to lack of patience
or lack of understanding of what is really going on.

Time for the killer topic, probably the most hated topic in
the entire blueprint for most candidates. You guessed it, it is
time for PfR. Where does this hate come from? Well it comes
from the fact that the 12.4 implementation of PfR is just so
incredibly bad. If I were to select one topic that is difficult
to study on your own and that you can really benefit from going
to a bootcamp then that would be PfR. Brian starts out with some
basic topologies and then moves on to some more advanced scenarios.
This topic runs for one day or even a bit more. You WILL run into
a lot of issues due to the implementation of PfR in 12.4. If you
have seen the PfR Vseminar then this will be a lot like that
with the added benefit that you can ask Brian questions of course.

The next big topic is QoS. Brian goes through frame relay
traffic shaping using both legacy syntax and MQC. He will go
through how to use policing and shaping. The coolest thing
about this part was how we configured values for policing
like Bc and then Brian showed by sending ICMP packets how the
token buckets are really working. You might be in for some
surprises here! No powerpoints here for sure! He will explain
the difference between single rate and dual rate policers and
why you would configure them for which scenarios. Then he will
go through the Catalyst QoS. This is a confusing section for
many since the Catalyst QoS is a bit convoluted. Brian shows
how the L2 QoS is very similar to MQC but the syntax is just
a bit strange. He shows how to use the priority queue and how
to use the share and shape queues for the SRR queues.

Whatever time is left will be spent on topics like EEM and
services that you would like to go through. If you feel that
you are weak in some service then this would be a good time to
ask Brian to go through it. I left the bootcamp at 3 PM on
Friday and I probably missed a couple of hours at the end.
If you can find a later flight or go home on Saturday that
could be a good option.

So now you have gone through a wall of text and you are
wondering what I think about it? Well, if it wasn’t obvious
from my text then: Yes! Go for it! Yes, it costs a lot to go, and
with everything to account for like living expenses and hotel,
it is costly. However, look just at the price for the
bootcamp, which is around 5990$. That is actually a good price.
If you consider that you can get 1500$ paid for your lab then
the cost is actually around 4500$. Where I live, one week of training
at Global Knowledge is usually around 3000$, and
then often you get some PowerPoint guy reading slides or you
doing labs while the instructor is watching. The one thing I
found best about the bootcamp was that you learn how to think
at a higher level. Being a CCIE is not about knowing a lot of
commands, it is about thinking at a high level. You get to pick
the brain of a 5x CCIE with real world experience, you won’t
find many guys like that in the world and from what I’ve seen
I would rank Brian among the very best of them. The IGP, Multicast,
Redistribution, PfR sections were very good and you will learn a lot
for sure even if you were strong in these areas before.

Hopefully in class you will meet some new friends. I met some
people in class I had only seen online before and also made
some new friends. I had a great time with David Rothera, Gian Paolo,
Jose Leitao, Susana and Harald. I also met Darren
for the first time, we have known each other online for a while now
but never met. I also had the chance to meet Patrick Barnes who is
another of my online friends 🙂

I’ve tried to cover as much as I can remember but always feel free
to ask questions in the comments section if you have anything you are
still thinking about.