Archive for May, 2013

NAT translation – What happened to my traceroute?

May 29, 2013

This post describes something interesting that happens to traceroute
when you are doing NAT. The inspiration came from a thread at IEOC
started by Nick O’neill. He was doing ip nat outside translation, but I
have found that the same behavior also occurs for inside translations.

We will use this topology:

[Figure: NAT_traceroute]

R1 has a loopback 1.1.1.1/32 on the inside. On the outside this will be known as
100.100.100.100. The networks connecting the routers are 12.12.12.0/24,
23.23.23.0/24 and 34.34.34.0/24. R3 will be the router doing NAT and this is its
configuration:

interface FastEthernet0/0
 ip address 23.23.23.3 255.255.255.0
 ip nat inside
!
interface FastEthernet0/1
 ip address 34.34.34.3 255.255.255.0
 ip nat outside
!
ip route 1.1.1.1 255.255.255.255 23.23.23.2
!
ip nat inside source static 1.1.1.1 100.100.100.100

This is not a post describing the basics, but make sure that you have routing set up.
I am using static routing here. Static NAT like this sets up bidirectional
communication. Remember that for outside-to-inside translation NAT is performed
before the routing lookup, so R3 must have a route to 1.1.1.1.
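
To make the order of operations concrete, here is a minimal Python sketch. It is a toy model and not how IOS implements NAT: the addresses and routes come from the configuration above, while the names GLOBAL_TO_LOCAL, ROUTES, lookup and outside_to_inside are made up for illustration.

```python
import ipaddress

# Static NAT entry from the post, keyed by inside global address.
GLOBAL_TO_LOCAL = {"100.100.100.100": "1.1.1.1"}

# R3's routes as (prefix, next hop). The two connected networks and the
# static route are taken from the configuration shown in the post.
ROUTES = [
    (ipaddress.ip_network("23.23.23.0/24"), "connected"),
    (ipaddress.ip_network("34.34.34.0/24"), "connected"),
    (ipaddress.ip_network("1.1.1.1/32"), "23.23.23.2"),
]


def lookup(destination):
    """Longest-prefix match; returns the next hop or None if there is no route."""
    address = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in ROUTES if address in net]
    if not matches:
        return None
    return max(matches, key=lambda match: match[0].prefixlen)[1]


def outside_to_inside(destination):
    """Translate the destination first, then route on the translated address."""
    translated = GLOBAL_TO_LOCAL.get(destination, destination)
    return translated, lookup(translated)


print(outside_to_inside("100.100.100.100"))
print(lookup("100.100.100.100"))
```

The first call yields the inside local address 1.1.1.1 with next hop 23.23.23.2. The second shows that in this model R3 has no route for 100.100.100.100 itself and does not need one, because the lookup happens after translation.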

The command above translates the inside local address to an inside global address.
R4 will traceroute to 100.100.100.100 and we will see what happens.

R4#traceroute 100.100.100.100 num prob 1

Type escape sequence to abort.
Tracing the route to 100.100.100.100

  1 34.34.34.3 28 msec
  2 100.100.100.100 124 msec
  3 100.100.100.100 84 msec

Interesting. Hops 2 and 3 are the same, and both show the inside global address as the
source. What happened here? The first hop is correct: the ICMP TTL exceeded message should
come from the outgoing interface used to reach 34.34.34.4, which was the source of the UDP
traceroute packet. Hop 2 should have been 23.23.23.2.

If we compare it to MPLS this almost looks like some core hiding feature as it does not
reveal the internal addressing of our network. If we look at debug on R3 we will see
what is happening.

R3# debug ip nat det
IP NAT detailed debugging is on
NAT: i: icmp (23.23.23.2, 33435) -> (34.34.34.4, 49164) [1]  
NAT: s=23.23.23.2->100.100.100.100, d=34.34.34.4 [1]
NAT: i: icmp (12.12.12.1, 33436) -> (34.34.34.4, 49165) [2]  
NAT: s=12.12.12.1->100.100.100.100, d=34.34.34.4 [2]

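We clearly see that the inside addresses are being translated to 100.100.100.100
even though there is no matching NAT statement. In the debug output 33435 is shown
as the source port and 49164 as the destination port. These are the UDP ports of the
original traceroute probe carried inside the ICMP message, displayed in the return
direction: the probe from R4 used 49164 as its source port and 33435 as its
destination port.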

This is the IOS version used for this lab:

R3#sh ver | i IOS
Cisco IOS Software, 3700 Software (C3725-ADVENTERPRISEK9-M), Version 12.4(15)T10, RELEASE SOFTWARE (fc3)

Now if we try it on mainline IOS instead…

R3#sh ver | i IOS
Cisco IOS Software, 3700 Software (C3725-ADVIPSERVICESK9-M), Version 12.4(25d), RELEASE SOFTWARE (fc1)

R4#trace 100.100.100.100 num prob 1

Type escape sequence to abort.
Tracing the route to 100.100.100.100

  1 34.34.34.3 68 msec
  2 23.23.23.2 164 msec
  3 100.100.100.100 196 msec

As you can see, with the mainline IOS the real addresses are shown because they are
not being translated.
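
The difference between the two traceroute outputs can be summarized in a short Python sketch. It only models what R4 displayed in the two tests and says nothing about how IOS decides internally; the names PATH, displayed_hops and rewrite_icmp_errors are hypothetical.

```python
INSIDE_GLOBAL = "100.100.100.100"

# Hops as seen from R4: (real source of the ICMP reply, side of R3 it comes from).
# The last entry is R1, which answered from 12.12.12.1 in the debug output.
PATH = [
    ("34.34.34.3", "outside"),
    ("23.23.23.2", "inside"),
    ("12.12.12.1", "inside"),
]


def displayed_hops(path, rewrite_icmp_errors):
    """Return the address R4 prints for each hop.

    rewrite_icmp_errors=True models the 12.4(15)T10 result, where every ICMP
    reply from the inside shows up as the inside global address.
    rewrite_icmp_errors=False models the 12.4(25d) result, where only the final
    hop (the traced target) shows up as the inside global address.
    """
    hops = []
    for index, (source, side) in enumerate(path):
        is_target = index == len(path) - 1
        if side == "inside" and (rewrite_icmp_errors or is_target):
            hops.append(INSIDE_GLOBAL)
        else:
            hops.append(source)
    return hops


print(displayed_hops(PATH, rewrite_icmp_errors=True))
print(displayed_hops(PATH, rewrite_icmp_errors=False))
```

With the flag set, hops 2 and 3 both come out as 100.100.100.100, matching the first traceroute. Without it, hop 2 is 23.23.23.2, matching the mainline result.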

Knowing me you know that I like to dig deep to find out what is going on. I reached
out to a contact at Cisco to see if he could find any reference to what is happening
here. It turns out that this is related to bug CSCsu37097. Basically the customer
didn’t want the internal addressing revealed and so this was incorporated into IOS.
The symptom was described as: “Symptom: Traceroute/ICMP unreachable doesn’t translate
properly and can cause security problem.”

The following quote is also available from Cisco: “In Cisco IOS Release 15.1(3)T and
later releases, when you configure the traceroute command, NAT returns the same
inside global IP address for all inside local IP addresses.”
The source is this document.

So basically this was implemented as a security feature for NAT. I hope you
learned something interesting.

Networkengineering @ Stack Exchange

May 27, 2013

Some of you might have heard of Stack Exchange. It’s a network of Q&A sites
covering a range of different topics. Recently a new section was started
for network engineering. It was in private beta and has now gone into
public beta.

I hope this site stays because so far the content has been really good. There
are lots of skilled engineers from different companies such as Cisco and various
ISPs.

The basic idea is to ask a question and people will try to answer it. Good
questions and answers will be “upvoted”, which means that the person who wrote
them gets “reputation”. With reputation comes more power and responsibility.

I was part of the private beta and I’m currently a moderator over there, so tell
me if you run into any issues. You can reach the site at
networkengineering.stackexchange.com.

STP convergence – MST

May 8, 2013

In the comments I received a request to compare RPVST+ with MST.
RPVST+ is Cisco’s proprietary STP running one instance per VLAN over
802.1Q trunks. MST is an industry standard which can run multiple
instances but not one per VLAN. MST runs RSTP as the underlying
protocol, so in theory there should be no difference at all. Let’s
give it a try. The topology is very similar to last time but a couple
of extra routers are involved. We’ll get back to these later. This is
the topology:

[Figure: STP-convergence-MST]

These are the current port roles:

[Figure: STP-port-roles-MST]

I have just put some basic MST configuration and NTP on the switches.

SW3(config)#ntp server 13.13.13.1
SW3(config)#span mode mst
SW3(config)#span mst 0 prio 16384
SW3(config)#span mst 1 prio 16384
SW3(config)#span mst conf
SW3(config-mst)#name TST       
SW3(config-mst)#revision 1

Verify initial reachability between the routers.

R1#ping 13.13.13.3

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 13.13.13.3, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/1/1 ms

R2#ping 25.25.25.5

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 25.25.25.5, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/1/4 ms

Now let’s shut down Gi0/21 on SW3, which leads to SW2’s root port.
Debug spanning-tree events will show the sequence of events.

May  7 20:32:18.975: MST[0]: Fa0/21 state change forwarding -> disabled
May  7 20:32:18.975: MST[0]: updt roles, root port Fa0/21 going down
May  7 20:32:18.975: MST[0]: Fa0/23 is now root port
May  7 20:32:18.975: MST[0]: Fa0/21 state change disabled -> blocking
May  7 20:32:18.975: MST[0]: Fa0/23 state change blocking -> forwarding
May  7 20:32:18.979: MST[0]: sending proposal on Fa0/3
May  7 20:32:18.983: MST[0]: sending proposal on Fa0/5

The switchover is immediate as expected. Now let’s try to simulate a passive
error by enabling BPDU filter.

SW3(config-if)#span bpdufilter enable
SW3(config-if)#do sh clock
20:36:14.354 UTC Tue May 7 2013

This is from SW2:

May  7 20:36:20.008: MST[0]: updt roles, information on root port Fa0/21 expired
May  7 20:36:20.008: MST[0]: Fa0/23 is now root port
May  7 20:36:20.008: MST[0]: Fa0/21 state change forwarding -> blocking
May  7 20:36:20.008: MST[0]: Fa0/3 state change forwarding -> blocking
May  7 20:36:20.008: MST[0]: Fa0/5 state change forwarding -> blocking
May  7 20:36:20.008: MST[0]: Fa0/23 state change blocking -> forwarding
May  7 20:36:20.008: MST[0]: Fa0/21 is now designated
May  7 20:36:20.012: MST[0]: sending proposal on Fa0/21
May  7 20:36:20.012: MST[0]: sending proposal on Fa0/3
May  7 20:36:20.012: MST[0]: sending proposal on Fa0/5

So it took roughly 6 seconds, which was expected. Because MST runs
RSTP the results are the same as for RPVST+. The only real difference
with MST is that the information for all instances is piggybacked on the BPDU
of the IST (instance 0). If you have VLANs mapped to instance 0 and there is a
change, then the other instances (MSTIs) may have to recalculate as well.

So using MST is no different than using RPVST+ from a convergence standpoint.
In future posts I will look at running a mix of RPVST+ and MST and see how
they interconnect.

Spanning tree convergence

May 7, 2013

Someone asked the other day how fast STP converges depending on whether PVST+,
RPVST+ or MST is running. Usually the answer for PVST+ is 30-50 seconds
and for RPVST+ it’s fast, maybe less than a second. I thought I would
explore this and check the difference between PVST+ and RPVST+, and also
look at PVST+ with features like uplinkfast.

This post assumes you already have a good basic understanding of STP. This
is not an introductory post on STP.

This is the topology being used:

[Figure: STP-convergence]

SW1 is the root and ports towards the routers have been configured with VLAN 23
and portfast. I will run NTP to have the clocks properly synchronized. Currently
the port roles look like this:

[Figure: STP-port-roles]

I will configure the routers in the 23.23.23.0/24 subnet and do a ping to verify connectivity.

R2#ping 23.23.23.3

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 23.23.23.3, timeout is 2 seconds:
.!!!!
Success rate is 80 percent (4/5), round-trip min/avg/max = 1/3/4 ms

Working fine so far. Now let’s take a look at some different failure scenarios.
We turn on logging to a buffer to not flood the console. We will be looking at
spanning tree events.

SW1(config)#logging con 6
SW1(config)#logging buff 7
SW1(config)#logging buff 32768
SW1(config)#do debug spanning-tree events
Spanning Tree event debugging is on

What happens when the root port is shut down? In theory, when the switch detects
loss of carrier on the link it should look at the BPDUs received on the alternate
port and start to take that port through the different port states. This should
take around 30 seconds.

This is output from SW2.

May  7 10:27:03.314: STP: VLAN0023 new root port Fa0/16, cost 38
May  7 10:27:18.321: STP: VLAN0023 Fa0/16 -> learning
May  7 10:27:33.329: STP: VLAN0023 sent Topology Change Notice on Fa0/16
May  7 10:27:33.329: STP: VLAN0023 Fa0/16 -> forwarding

The timing is almost perfect. The port goes through listening and learning
for 15 seconds each before it goes to forwarding almost exactly 30 seconds after
the port was shut down.

What happens when there is an indirect failure? The switch has to expire the root BPDU
before it believes other BPDUs with worse cost. This should take around 20 seconds,
because Max Age defaults to 20 seconds.

SW1#sh span | i Age
             Hello Time   2 sec  Max Age 20 sec  Forward Delay 15 sec
SW2#sh span int f0/13 det | i age
   Timers: message age 1, forward delay 0, hold 0

This time we will simulate a passive error by configuring BPDU filter on SW1 towards
SW2.

SW1(config-if)#span bpdufilter enable   
SW1(config-if)#do sh clock
10:39:05.598 UTC Tue May 7 2013

This has created a bridging loop but in this case we just want to see how long it
takes before the alternate port comes up as the root port.

May  7 10:39:24.046: STP: VLAN0023 new root port Fa0/16, cost 38
May  7 10:39:24.046: STP: VLAN0023 Fa0/16 -> listening
May  7 10:39:39.053: STP: VLAN0023 Fa0/16 -> learning
May  7 10:39:54.061: STP: VLAN0023 sent Topology Change Notice on Fa0/16
May  7 10:39:54.061: STP: VLAN0023 Fa0/16 -> forwarding

So it took almost 20 seconds for the BPDU to expire. Then the port goes through
the ordinary state changes. Roughly 48.5 seconds after the filter was applied
the port went into forwarding. For passive failures when running PVST+ the
maximum recovery time should be 50 seconds.
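
The arithmetic behind these numbers can be checked with a few lines of Python. This is just a sketch: the timer values are the defaults shown in the sh span output and the timestamps are copied from the log above, while the helper name seconds_between is my own.

```python
from datetime import datetime

MAX_AGE = 20        # seconds, default shown by "sh span"
FORWARD_DELAY = 15  # seconds, spent once in listening and once in learning


def seconds_between(start, end):
    """Elapsed seconds between two HH:MM:SS.mmm timestamps on the same day."""
    fmt = "%H:%M:%S.%f"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()


# Expected values from the timers.
direct_failure = 2 * FORWARD_DELAY               # listening + learning
indirect_failure = MAX_AGE + 2 * FORWARD_DELAY   # BPDU expiry + listening + learning

# Measured values from the passive-failure test (filter applied at 10:39:05.598).
expiry = seconds_between("10:39:05.598", "10:39:24.046")
listening = seconds_between("10:39:24.046", "10:39:39.053")
learning = seconds_between("10:39:39.053", "10:39:54.061")
total = seconds_between("10:39:05.598", "10:39:54.061")

print(direct_failure, indirect_failure)
print(round(expiry, 3), round(listening, 3), round(learning, 3), round(total, 3))
```

The timers give 30 seconds for a direct failure and 50 seconds as the worst case for an indirect one. The log breaks down into about 18.4 seconds waiting for the BPDU to expire plus 15 seconds each in listening and learning, 48.463 seconds in total.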

Now let’s look at PVST+ with Uplinkfast configured. The theory is that when a
root port fails the alternate port should bypass the listening and learning
states and go directly to forwarding. Let’s try this out.

SW2(config)#spanning-tree uplinkfast
May  7 10:46:37.260: STP: VLAN0023 new root port Fa0/16, cost 3038
May  7 10:46:38.249: %LINEPROTO-5-UPDOWN: Line protocol on Interface FastEthernet0/13, changed state to down
May  7 10:46:39.264: %LINK-3-UPDOWN: Interface FastEthernet0/13, changed state to down
May  7 10:46:39.264: STP: VLAN0023 sent Topology Change Notice on Fa0/16

It took only 2 seconds from realizing the port was down to putting the alternate
port into forwarding. For PVST+ this is a great enhancement. What if there is
a passive error?

SW1(config-if)#span bpdufilter enable
SW1(config-if)#do sh clock
10:51:11.870 UTC Tue May 7 2013
May  7 10:51:30.216: STP: VLAN0023 new root port Fa0/16, cost 3038
May  7 10:51:30.216: STP: VLAN0023 sent Topology Change Notice on Fa0/16

There is nothing to be done about the Maxage expiring but the port is
brought up after that. So instead of 50 seconds it takes about 20 seconds.

That’s it for PVST+. Now let’s move on to RPVST+. RPVST+ works by synchronizing
the topology and it has optimizations built in. If a port fails then it should
converge almost instantly.

May  7 10:56:34.421: RSTP(1): updt roles, root port Fa0/13 going down
May  7 10:56:34.421: RSTP(1): Fa0/16 is now root port
May  7 10:56:34.421: RSTP(1): syncing port Fa0/4
May  7 10:56:34.421: RSTP(1): syncing port Fa0/6
May  7 10:56:34.421: RSTP(1): syncing port Fa0/24
May  7 10:56:34.421: RSTP(23): updt roles, root port Fa0/13 going down
May  7 10:56:34.421: RSTP(23): Fa0/16 is now root port
May  7 10:56:34.438: RSTP(1): transmitting a proposal on Fa0/4
May  7 10:56:34.438: RSTP(1): transmitting a proposal on Fa0/6
May  7 10:56:34.438: RSTP(1): transmitting a proposal on Fa0/24
May  7 10:56:35.419: %LINEPROTO-5-UPDOWN: Line protocol on Interface FastEthernet0/13, changed state to down
May  7 10:56:35.578: RSTP(1): transmitting a proposal on Fa0/4
May  7 10:56:35.578: RSTP(1): transmitting a proposal on Fa0/6
May  7 10:56:35.578: RSTP(1): transmitting a proposal on Fa0/24
May  7 10:56:36.434: %LINK-3-UPDOWN: Interface FastEthernet0/13, changed state to down

It instantly fails over to the alternate port and then starts synchronizing
the topology by sending out proposals. What if there was a passive failure?
In theory after RPVST+ misses 3 BPDUs it should realize that it needs to
start using the alternate path. Let’s try it out.

SW1(config-if)#span bpdufilter enable
SW1(config-if)#do sh clock
11:01:12.960 UTC Tue May 7 2013
May  7 11:01:16.648: RSTP(1): Fa0/13 rcvd info expired
May  7 11:01:16.648: RSTP(1): updt roles, information on root port Fa0/13 expired
May  7 11:01:16.648: RSTP(1): Fa0/16 is now root port
May  7 11:01:16.648: RSTP(1): Fa0/13 blocked by re-root
May  7 11:01:16.648: RSTP(1): syncing port Fa0/4
May  7 11:01:16.648: RSTP(1): syncing port Fa0/6
May  7 11:01:16.648: RSTP(1): syncing port Fa0/24
May  7 11:01:16.648: RSTP(1): Fa0/13 is now designated
May  7 11:01:16.648: RSTP(23): Fa0/13 rcvd info expired
May  7 11:01:16.648: RSTP(23): updt roles, information on root port Fa0/13 expired
May  7 11:01:16.648: RSTP(23): Fa0/16 is now root port
May  7 11:01:16.648: RSTP(23): Fa0/13 blocked by re-root
May  7 11:01:16.648: RSTP(23): Fa0/13 is now designated

Around 4 seconds later the topology has already converged. It should take
at most 6 seconds, depending on when the last BPDU was received before the
failure.
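
Where the 6 second upper bound comes from can be sketched in Python. The hello time is the 2 second default from the sh span output and the three missed BPDUs are the rule mentioned above. The lower bound is an idealized assumption of mine (BPDUs arriving exactly once per hello interval), and the function names are hypothetical.

```python
from datetime import datetime

HELLO_TIME = 2      # seconds, from the "sh span" output earlier in the post
MISSED_HELLOS = 3   # information is aged out after three missed BPDUs


def expiry_window(hello_time, missed_hellos):
    """(shortest, longest) time from the failure until the stored BPDU expires.

    Assumes BPDUs arrive exactly every hello_time seconds, so the last good
    BPDU arrived anywhere from just before the failure to a full hello earlier.
    """
    longest = missed_hellos * hello_time
    return longest - hello_time, longest


def seconds_between(start, end):
    """Elapsed seconds between two HH:MM:SS.mmm timestamps on the same day."""
    fmt = "%H:%M:%S.%f"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()


window = expiry_window(HELLO_TIME, MISSED_HELLOS)
# Clock reading after the filter was applied, and the "rcvd info expired" log line.
observed = seconds_between("11:01:12.960", "11:01:16.648")

print(window, round(observed, 3))
```

The model gives a window of 4 to 6 seconds. The measured 3.688 seconds is counted from the sh clock output, which was taken after the filter had already been applied, so the real interval was somewhat longer than the measured one.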

As you can see it’s very important to detect carrier down. If you do detect it
and are running RPVST+ then convergence is almost immediate. So when designing your
network try to avoid using fiber converters and similar devices that won’t shut down
the RJ45 side if the optical side goes down. Designing for convergence is not just
about protocols; you also need to consider the physical infrastructure.

I hope this post has given you a good insight into the convergence of STP.

Routing-bits SP handbook now available


Many of us CCIE RS candidates have used Ruhann’s RS handbook to
help us pass the CCIE lab. Ruhann has now released an SP handbook
as well to help all SP candidates.

Who is Ruhann?

Ruhann du Plessis, 2x CCIE #24163 (RS, SP), is an experienced engineer
who designs and works with large MPLS VPN networks, intra/inter-AS
routing, large data centers and so on.

The book was written to be used as a kind of quick reference. You
will find both theory and, most importantly, config sets that describe
how to configure the different features. Relevant show commands
and troubleshooting steps are also shown, which is really good. Links
to the DOCCD are included as well, so that it becomes easy to find where
all features are located.

The book starts by describing a feature/protocol with some theory and
facts, often in bullet point form. At the top of the page there is a
reference to the DOCCD to find the relevant feature. Then the config set
shows how to configure the feature, and finally show commands and
troubleshooting steps are shown at the end of the section. There is also a
reference to relevant RFCs describing the features/protocols.

From what I’ve seen this book looks great! The RS book is a great help
in passing the RS lab and now there is an equally good book to help
in passing the SP lab as well.

I really like to use the book as a reference. It’s sometimes easier to
find the information in the handbook than by going to the Cisco documentation.
The config sets are even better than what is shown in the Cisco docs.

There is a sample available of the SP handbook here.

To buy it go to Ruhann’s site. It’s only $98.