Archive for the ‘Troubleshooting’ Category

MPLS troubleshooting scenario

April 23, 2012 16 comments

I’m in final preparation for my second attempt and have been doing a lot of troubleshooting scenarios lately. I created an MPLS topology in GNS3 and sent it to my friend Darren for testing. He is taking his lab very soon and he performed well on this one. The lab contains multiple faults, but I won’t say how many since that would spoil some of the surprise.

The assignment is to make sure CE1 can ping CE2 loopback 6.6.6.6.

Post in the comments what you did to make it work, or ask if you need a hint to get you going in the right direction. You need to edit the .net file to use your own working directory and IOS image, and you need IOS images for the 3725 and 7200. Start with the configurations provided, either by importing the configs or by simply pasting them in, whichever you prefer, but do not look at the startup config before starting.

Download the .net and config files here.

This is what the topology looks like.

INE TS Vol4 – two labs done

August 18, 2011 Leave a comment

Did some of these TS labs this week. They are very challenging but they are designed by Petr Lapukhov so they should be… I found these to be more difficult than the ASET TS labs. From what I’ve heard from Petr these labs are not specifically designed to help you do the TS at the lab but more of a learning tool to become very proficient in TS. If you can solve these tasks my guess is that the TS at the lab should not be that difficult except that the lab topology is much larger.

While doing one of the tasks I learned a cool new command from the solution guide: debug ip packet detail dump. The dump keyword at the end is hidden and shows the contents of the packet. This can be useful when troubleshooting authentication if the key is in plain text. Newer versions of IOS seem to show the key in log messages, but it could still be handy. This is how we use it.

As you can see, I do all of my labs with PuTTY to be comfortable with it on the exam.

Another TS ASET lab done

August 11, 2011 1 comment

Did another TS lab yesterday. This one was a bit more challenging than the first. If I grade myself (there is no auto grading) I would have 6 or 7 out of 10 correct, which is not too far away from the 80% passing score. I did not expect to be an expert at TS yet, but this shows that I am on the right path. One thing that is annoying is that you don’t know the initial configurations and there is no solutions guide. You only get the final configurations, which include the correct configuration.

The next time I do one of these I think I will do a show run on all routers and download the config and then download final configs and do a diff to see what is different.
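The diff itself does not need any special tooling. Here is a minimal sketch in Python using the standard difflib module. The two configurations are made up for illustration and are not taken from any ASET lab; in practice they would be the saved show run output and the downloaded final config.

```python
import difflib

def config_diff(initial: str, final: str) -> list:
    """Return unified-diff lines between two router configurations."""
    return list(difflib.unified_diff(
        initial.splitlines(),
        final.splitlines(),
        fromfile="initial",
        tofile="final",
        lineterm="",
    ))

# Hypothetical configs, invented for this example only.
initial = """interface Serial0/0
 ip address 10.0.0.1 255.255.255.0
 shutdown
"""
final = """interface Serial0/0
 ip address 10.0.0.1 255.255.255.0
 no shutdown
"""

diff = config_diff(initial, final)
for line in diff:
    print(line)
```

Lines starting with - exist only in the initial config and lines starting with + exist only in the final config, so the injected faults should stand out quickly.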

During these labs I have noticed that the wording is very important. Look at the following example:

.6 TROUBLE TICKET 6
R10 and R11 are not seeing routes for the R21-R28 network or for the VPN “Foo” host 1.1.1.1. Determine the cause and correct the issue.

When I did the lab I interpreted this to mean that the networks behind routers R21 to R28 are not reachable. However, I think what they really mean is that the link connecting R21 to R28 (running RIP) is not reachable in the domain. Depending on how you read the task you will get a very different result and may do a lot of unnecessary steps.

I like that the topology is large since that is what we can expect at the lab. The user experience with topology diagram and connecting to routers seems to be similar to the real thing if we compare to the lab exam demo.

I might try some of the configuration labs later but for now I am mainly focusing on INE material.

Categories: Announcement, CCIE, Troubleshooting

Troubleshooting multicast – RPF failure

February 16, 2011 2 comments

In unicast routing we are interested in how to forward packets to their destination. In multicast routing we are interested in where the packets came from: multicast packets need to pass an RPF (Reverse Path Forwarding) check. When a packet is received on an interface, the router checks that the route back to the source points out the same interface; otherwise the RPF check fails. A failing RPF check is one of the most common errors in multicast networks. Let’s look at the topology.

The goal of the scenario is that R6 should be able to ping SW4, which has joined multicast group 224.10.10.10. PIM dense mode has been enabled between R6 and R4. Between R4 and R5, PIM is only enabled on the frame-relay connection. PIM is also enabled between R5 and SW2 and between SW2 and SW4. This is the configuration from SW4.

Rack11SW4#sh run int vlan 10
Building configuration…
Current configuration : 96 bytes
!
interface Vlan10
 ip address 155.11.10.10 255.255.255.0
 ip igmp join-group 224.10.10.10
end

Ping from R6 to SW4.

Rack11R6#ping 224.10.10.10 re 3
Type escape sequence to abort.
Sending 3, 100-byte ICMP Echos to 224.10.10.10, timeout is 2 seconds:

Not successful. Let’s look at which multicast packets are being sent. We need to disable fast switching on the interface to see any packets.

Rack11R5(config)#int s0/0/0
Rack11R5(config-if)#no ip mroute-cache
Rack11R5(config-if)#^Z
Rack11R5#debug ip mpacket
IP multicast packets debugging is on
Rack11R5#
Feb 15 10:30:53.727: IP(0): s=155.11.146.6 (Serial0/0/0) d=224.10.10.10 id=93, ttl=253, prot=1, len=104(100), RPF lookup failed for source
Feb 15 10:30:53.727: IP(0): s=155.11.146.6 (Serial0/0/0) d=224.10.10.10 id=93, ttl=253, prot=1, len=104(100), not RPF interface
Rack11R5#
Feb 15 10:30:55.723: IP(0): s=155.11.146.6 (Serial0/0/0) d=224.10.10.10 id=94, ttl=253, prot=1, len=104(100), not RPF interface
Rack11R5#
Feb 15 10:30:57.723: IP(0): s=155.11.146.6 (Serial0/0/0) d=224.10.10.10 id=95, ttl=253, prot=1, len=104(100), not RPF interface

Packets are not coming in on the RPF interface. Let’s look at the multicast routing table.

Rack11R5#sh ip mroute
IP Multicast Routing Table
Flags: D – Dense, S – Sparse, B – Bidir Group, s – SSM Group, C – Connected,
       L – Local, P – Pruned, R – RP-bit set, F – Register flag,
       T – SPT-bit set, J – Join SPT, M – MSDP created entry,
       X – Proxy Join Timer Running, A – Candidate for MSDP Advertisement,
       U – URD, I – Received Source Specific Host Report,
       Z – Multicast Tunnel, z – MDT-data group sender,
       Y – Joined MDT-data group, y – Sending to MDT-data group,
       V – RD & Vector, v – Vector
Outgoing interface flags: H – Hardware switched, A – Assert winner
 Timers: Uptime/Expires
 Interface state: Interface, Next-Hop or VCD, State/Mode
(*, 224.10.10.10), 00:03:41/stopped, RP 0.0.0.0, flags: D
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    FastEthernet0/0, Forward/Dense, 00:03:41/00:00:00
    Serial0/0/0, Forward/Dense, 00:03:41/00:00:00
(155.11.146.6, 224.10.10.10), 00:00:02/00:02:57, flags:
  Incoming interface: Null, RPF nbr 155.11.45.4
  Outgoing interface list:
    FastEthernet0/0, Forward/Dense, 00:00:02/00:00:00
    Serial0/0/0, Forward/Dense, 00:00:02/00:00:00
(*, 224.0.1.40), 00:39:35/00:02:54, RP 0.0.0.0, flags: DCL
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    FastEthernet0/0, Forward/Dense, 00:38:30/00:00:00
    Serial0/0/0, Forward/Dense, 00:14:20/00:00:00 

We are interested in the (155.11.146.6, 224.10.10.10) entry, which is a dense mode group. Our RPF neighbor is 155.11.45.4, which is R4’s address on the link connected to our S0/1/0 interface. That interface is not enabled for PIM, so the incoming interface is Null. How do we reach 155.11.146.6?

Rack11R5#sh ip route 155.11.146.6
Routing entry for 155.11.146.0/24
  Known via “eigrp 100”, distance 90, metric 2172416, type internal
  Redistributing via eigrp 100, ospf 1
  Advertised by ospf 1 subnets
  Last update from 155.11.45.4 on Serial0/1/0, 00:55:04 ago
  Routing Descriptor Blocks:
  * 155.11.45.4, from 155.11.45.4, 00:55:04 ago, via Serial0/1/0
      Route metric is 2172416, traffic share count is 1
      Total delay is 20100 microseconds, minimum bandwidth is 1544 Kbit
      Reliability 255/255, minimum MTU 1500 bytes
      Loading 1/255, Hops 1

Traffic to R6 is sent over the S0/1/0 interface, which is not enabled for PIM. This is a problem. How can we pass the RPF check? By adding a static mroute we can make the frame-relay interface a valid RPF interface for this source.

Rack11R5(config)#ip mroute 155.11.146.6 255.255.255.255 155.11.0.4

The ping should now be successful.

Rack11R6#ping 224.10.10.10 re 3
Type escape sequence to abort.
Sending 3, 100-byte ICMP Echos to 224.10.10.10, timeout is 2 seconds:
Reply to request 0 from 155.11.108.10, 52 ms
Reply to request 1 from 155.11.108.10, 44 ms
Reply to request 2 from 155.11.108.10, 44 ms

Traffic is flowing. One final look at the mroute table.

Rack11R5#sh ip mroute
IP Multicast Routing Table
Flags: D – Dense, S – Sparse, B – Bidir Group, s – SSM Group, C – Connected,
       L – Local, P – Pruned, R – RP-bit set, F – Register flag,
       T – SPT-bit set, J – Join SPT, M – MSDP created entry,
       X – Proxy Join Timer Running, A – Candidate for MSDP Advertisement,
       U – URD, I – Received Source Specific Host Report,
       Z – Multicast Tunnel, z – MDT-data group sender,
       Y – Joined MDT-data group, y – Sending to MDT-data group,
       V – RD & Vector, v – Vector
Outgoing interface flags: H – Hardware switched, A – Assert winner
 Timers: Uptime/Expires
 Interface state: Interface, Next-Hop or VCD, State/Mode
(*, 224.10.10.10), 00:06:53/stopped, RP 0.0.0.0, flags: D
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    FastEthernet0/0, Forward/Dense, 00:06:53/00:00:00
    Serial0/0/0, Forward/Dense, 00:06:53/00:00:00
(155.11.146.6, 224.10.10.10), 00:03:14/00:02:35, flags: T
  Incoming interface: Serial0/0/0, RPF nbr 155.11.0.4, Mroute
  Outgoing interface list:
    FastEthernet0/0, Forward/Dense, 00:03:14/00:00:00
(*, 224.0.1.40), 00:42:47/00:02:47, RP 0.0.0.0, flags: DCL
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    FastEthernet0/0, Forward/Dense, 00:41:42/00:00:00
    Serial0/0/0, Forward/Dense, 00:17:32/00:00:00

The RPF neighbor is now 155.11.0.4, which is the next-hop over frame-relay. When doing multicast we need to think more about traffic patterns and make sure that every interface in the multicast transit path is PIM enabled. If an interface is not PIM enabled, the unicast route back to the source must not point out of it, or we need a static mroute to override the RPF lookup.
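To make the logic concrete, here is a rough Python model of the RPF check on R5. It is only a sketch under simplifying assumptions: the function names are my own, a static mroute is always preferred when one matches, and an RPF interface without PIM is treated as a failed check (which is how the Null incoming interface behaved above). The addresses and interface names come from the outputs in this post.

```python
import ipaddress

def longest_match(table, source):
    """Return the (interface, next_hop) of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(source)
    best = None
    for prefix, entry in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, entry)
    return best[1] if best else None

def rpf_check(source, arrival_if, unicast, mroutes, pim_interfaces):
    """True if a packet from source arriving on arrival_if passes RPF."""
    # Simplification: a matching static mroute always wins over unicast.
    entry = longest_match(mroutes, source) or longest_match(unicast, source)
    if entry is None:
        return False
    rpf_if, _next_hop = entry
    # The RPF interface must run PIM and must be where the packet arrived.
    return rpf_if in pim_interfaces and rpf_if == arrival_if

# State on R5 as seen in the outputs above.
unicast = {"155.11.146.0/24": ("Serial0/1/0", "155.11.45.4")}
pim_interfaces = {"Serial0/0/0", "FastEthernet0/0"}
mroutes = {"155.11.146.6/32": ("Serial0/0/0", "155.11.0.4")}

before = rpf_check("155.11.146.6", "Serial0/0/0", unicast, {}, pim_interfaces)
after = rpf_check("155.11.146.6", "Serial0/0/0", unicast, mroutes, pim_interfaces)
print(before, after)
```

Without the static mroute the lookup lands on Serial0/1/0 and the check fails. With it the lookup lands on Serial0/0/0, the interface the packets actually arrive on.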

BGP troubleshooting – route not installed

February 12, 2011 Leave a comment

Sometimes prefixes in BGP do not get installed into the routing table. If the route is also known via an IGP, that might be the reason, but then a RIB-failure would be indicated. This scenario shows another possible source of problems. Once again, this is the topology.

All internal routers are running iBGP in a full mesh. Routers R4 and R6 have eBGP peerings to the backbone routers, which are injecting external prefixes into the AS. All internal routers are announcing their loopbacks into BGP. SW3 is trying to reach 119.0.0.1 in the prefix 119.0.0.0/8 but is unable to do so. Let’s look at some output.

Rack1SW3#ping 119.0.0.1 so lo0
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 119.0.0.1, timeout is 2 seconds:
Packet sent with a source address of 150.1.9.9
.....
Success rate is 0 percent (0/5)

SW3 can’t reach 119.0.0.1, why?

Rack1SW3#sh ip route 119.0.0.0
% Network not in table

We have no route there. What routes can we see from BGP?

Rack1SW3#sh ip route bgp
     150.1.0.0/24 is subnetted, 10 subnets
B       150.1.7.0 [200/0] via 155.1.79.7, 01:26:58
B       150.1.6.0 [200/0] via 155.1.67.6, 01:26:58
B       150.1.5.0 [200/0] via 155.1.45.5, 01:26:58
B       150.1.4.0 [200/0] via 155.1.146.4, 01:26:58
B       150.1.3.0 [200/0] via 155.1.37.3, 01:26:58
B       150.1.2.0 [200/0] via 155.1.23.2, 01:26:58
B       150.1.1.0 [200/0] via 155.1.146.1, 01:26:58
B       150.1.10.0 [200/0] via 155.1.108.10, 01:26:45
B       150.1.8.0 [200/0] via 155.1.58.8, 01:26:45

We can see all the loopbacks just fine but we have no route to the external prefixes. What is R6 announcing to us?

Rack1SW3# sh ip bgp nei 155.1.67.6 routes
BGP table version is 11, local router ID is 150.1.9.9
Status codes: s suppressed, d damped, h history, * valid, > best, i – internal,
              r RIB-failure, S Stale
Origin codes: i – IGP, e – EGP, ? – incomplete
   Network          Next Hop            Metric LocPrf Weight Path
* i28.119.16.0/24   54.1.1.254               0    100      0 54 i
* i28.119.17.0/24   54.1.1.254               0    100      0 54 i
* i112.0.0.0        54.1.1.254               0    100      0 54 50 60 i
* i113.0.0.0        54.1.1.254               0    100      0 54 50 60 i
* i114.0.0.0        54.1.1.254               0    100      0 54 i
* i115.0.0.0        54.1.1.254               0    100      0 54 i
* i116.0.0.0        54.1.1.254               0    100      0 54 i
* i117.0.0.0        54.1.1.254               0    100      0 54 i
* i118.0.0.0        54.1.1.254               0    100      0 54 i
* i119.0.0.0        54.1.1.254               0    100      0 54 i
*150.1.6.0/24       155.1.67.6               0    100      0 i
Total number of prefixes 11

R6 is announcing the external prefixes to us but what do we have in our BGP table? Output has been abbreviated.

Rack1SW3#sh ip bgp
BGP table version is 11, local router ID is 150.1.9.9
Status codes: s suppressed, d damped, h history, * valid, > best, i – internal,
              r RIB-failure, S Stale
Origin codes: i – IGP, e – EGP, ? – incomplete
   Network          Next Hop            Metric LocPrf Weight Path
* i119.0.0.0        204.12.1.254             0    100      0 54 i
* i                 54.1.1.254               0    100      0 54 i

So we do have 119.0.0.0/8 via 204.12.1.254 and 54.1.1.254, but how do we get to the next-hops? Remember that route recursion will occur and that the first rule of BGP best path selection is that we must have a valid next-hop. We can see that the paths are marked valid (*) but neither is marked best (>).

Rack1SW3#sh ip route 54.1.1.254
% Network not in table

We have no route to the next-hop, so it is invalid, and that is why the route is not being installed. Let’s fix this.

Rack1R6(config)#router eigrp 100
Rack1R6(config-router)#network 54.1.1.0 0.0.0.255
Rack1R4(config)#router eigrp 100
Rack1R4(config-router)#network 204.12.1.0 0.0.0.255

That should take care of the next-hops. Let’s check the routing table.

Rack1SW3#sh ip route 54.1.1.254
Routing entry for 54.1.1.0/24
  Known via “eigrp 100”, distance 90, metric 2174976, type internal
  Redistributing via eigrp 100
  Last update from 155.1.79.7 on Vlan79, 00:02:04 ago
  Routing Descriptor Blocks:
  * 155.1.79.7, from 155.1.79.7, 00:02:04 ago, via Vlan79
      Route metric is 2174976, traffic share count is 1
      Total delay is 20200 microseconds, minimum bandwidth is 1544 Kbit
      Reliability 255/255, minimum MTU 1500 bytes
      Loading 1/255, Hops 2

We now have a route for the next-hop. Let’s look at the BGP table again.

Rack1SW3#sh ip bgp
BGP table version is 31, local router ID is 150.1.9.9
Status codes: s suppressed, d damped, h history, * valid, > best, i – internal,
              r RIB-failure, S Stale
Origin codes: i – IGP, e – EGP, ? – incomplete
   Network          Next Hop            Metric LocPrf Weight Path
*>i119.0.0.0        204.12.1.254             0    100      0 54 i
* i                 54.1.1.254               0    100      0 54 i

So we now have a best path. Is it in the routing table?

Rack1SW3#sh ip route bgp
B    119.0.0.0/8 [200/0] via 204.12.1.254, 00:02:27

Route is installed, we should be good to go.

Rack1SW3#ping 119.0.0.1 so lo0

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 119.0.0.1, timeout is 2 seconds:
Packet sent with a source address of 150.1.9.9
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 12/37/84 ms

Success. Always remember to have a valid next-hop in BGP. Next-hops are modified over eBGP peerings but not over iBGP. To resolve this kind of problem, either advertise the connected link to the external peer into the IGP (with a network statement as above, or by redistributing connected) or use next-hop-self on the iBGP peerings. A route-map can also be used to achieve the same thing. I hope this post has shown you how to do BGP troubleshooting step by step.
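The next-hop rule is easy to model. The Python sketch below is only an illustration of the idea, not how IOS implements it: a path is usable only if its next-hop falls inside some prefix in the routing table. The function names are my own, the RIB lists are a small subset of SW3’s real table, and details such as default routes are ignored.

```python
import ipaddress

def resolves(next_hop, rib):
    """True if next_hop is covered by at least one prefix in the RIB."""
    addr = ipaddress.ip_address(next_hop)
    return any(addr in ipaddress.ip_network(prefix) for prefix in rib)

def usable_paths(paths, rib):
    """Keep only the next-hops that the RIB can resolve."""
    return [nh for nh in paths if resolves(nh, rib)]

# Next-hops of the two paths SW3 holds for 119.0.0.0/8.
paths = ["204.12.1.254", "54.1.1.254"]

# Subset of SW3's routing table before and after the EIGRP fix.
rib_before = ["150.1.4.0/24", "150.1.6.0/24", "155.1.79.0/24"]
rib_after = rib_before + ["54.1.1.0/24", "204.12.1.0/24"]

print(usable_paths(paths, rib_before))
print(usable_paths(paths, rib_after))
```

Before the fix no path survives, so there is nothing to pick a best path from. After the fix both next-hops resolve and best path selection can run.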

BGP troubleshooting – peer address not matching

February 8, 2011 2 comments

Yesterday I did some Internetwork Expert vol1 labs on BGP. I was having trouble getting some of the peers to come up and had to troubleshoot. This post will describe how to troubleshoot when peers won’t form. First, let’s look at the topology. Thanks to DennisD on IEOC forums for the image.

R2 and R5 should peer with each other in AS 100. R2 is set up to peer with R5’s IP 155.1.45.5 and R5 is set up to peer with R2’s IP 155.1.23.2. It would have been better to peer over the 155.1.0.0/24 subnet directly, but this is to show the steps of troubleshooting. The session will not form. Why? Let’s look at some output from debug ip tcp transactions.

Rack1R5#*Mar 1 01:31:39.291: TCP: sending SYN, seq 478218125, ack 0
*Mar 1 01:31:39.291: TCP0: Connection to 155.1.23.2:179, advertising MSS 536
*Mar 1 01:31:39.291: TCP0: state was CLOSED -> SYNSENT [56275 -> 155.1.23.2(179)]
*Mar 1 01:31:39.311: Released port 56275 in Transport Port Agent for TCP IP type 1 delay 240000
*Mar 1 01:31:39.311: TCP0: state was SYNSENT -> CLOSED [56275 -> 155.1.23.2(179)]
*Mar 1 01:31:39.311: TCP0: bad seg from 155.1.23.2 — closing connection: port 56275 seq 0 ack 478218126 rcvnxt 0
rcvwnd 0 len 0
*Mar 1 01:31:39.311: TCP0: connection closed – remote sent RST
*Mar 1 01:31:39.311: TCB 0x651784FC destroyed

We can see that R5 is initiating the connection. It sends a TCP SYN to R2 on port 179, but R2 responds with a TCP RST, which resets the connection. This could indicate either that R2 is not running BGP or that there is a problem with the neighbor statements.

So we want to know what IP R5 is using when sending TCP packets to R2. Let’s debug IP packets.

Rack1R5(config)#access-list 101 permit tcp any host 155.1.23.2
Rack1R5#debug ip packet 101
*Mar 1 01:36:01.611: IP: tableid=0, s=155.1.0.5 (local), d=155.1.23.2 (Serial0/0), routed via FIB
*Mar 1 01:36:01.615: IP: s=155.1.0.5 (local), d=155.1.23.2 (Serial0/0), len 44, sending

R5 is using its IP of 155.1.0.5 to communicate with 155.1.23.2, but R2 expects R5 to set up the BGP session from the IP of 155.1.45.5. Let’s verify why R5 is using 155.1.0.5 to get to 155.1.23.2. This is a look at the routing table.

Rack1R5#sh ip route 155.1.23.0
Routing entry for 155.1.23.0/24
  Known via “eigrp 100”, distance 90, metric 2681856, type internal
  Redistributing via eigrp 100
  Last update from 155.1.0.3 on Serial0/0, 00:42:06 ago
  Routing Descriptor Blocks:
    155.1.0.3, from 155.1.0.3, 00:42:06 ago, via Serial0/0
      Route metric is 2681856, traffic share count is 1
      Total delay is 40000 microseconds, minimum bandwidth is 1544 Kbit
      Reliability 255/255, minimum MTU 1500 bytes
      Loading 1/255, Hops 1
  * 155.1.0.2, from 155.1.0.2, 00:42:06 ago, via Serial0/0
      Route metric is 2681856, traffic share count is 1
      Total delay is 40000 microseconds, minimum bandwidth is 1544 Kbit
      Reliability 255/255, minimum MTU 1500 bytes
      Loading 1/255, Hops 1

We can see that R5 has two equal cost paths to reach the IP of R2. The next hop is either 155.1.0.2 or 155.1.0.3 and both are reachable via the connected subnet on Serial0/0. That is why R5 is using the IP of 155.1.0.5 to source packets. How can we solve this? Either we change the neighbor statement on R2 to point at 155.1.0.5, or we change the update-source on R5.

Rack1R5(config-router)#neighbor 155.1.23.2 update-source s0/1

A debug IP packet confirms that the right interface is now being used.

*Mar 1 01:36:31.663: IP: tableid=0, s=155.1.45.5 (local), d=155.1.23.2 (Serial0/0), routed via FIB
*Mar 1 01:36:31.663: IP: s=155.1.45.5 (local), d=155.1.23.2 (Serial0/0), len 44, sending

Show ip bgp summary confirms that they are now peers.

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
155.1.23.2      4   100       5       5        9    0    0 00:00:05        1

Show tcp brief is a good command to see TCP sessions to/from the router.

Rack1R5#show tcp brief
TCB Local Address Foreign Address (state)
651791FC 155.1.45.5.26655 155.1.23.2.179 ESTAB

And this is how to do basic BGP troubleshooting.
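The address mismatch in this scenario can be summed up in a few lines of Python. This is a toy model with names I made up, assuming that a router only accepts a BGP session whose TCP source address matches a configured neighbor, and that the source defaults to the address of the outgoing interface unless update-source is set. The addresses are the ones from the debugs above.

```python
def session_source(outgoing_if, interface_addrs, update_source=None):
    """Pick the TCP source address: update-source wins, else the outgoing interface."""
    return interface_addrs[update_source or outgoing_if]

def peer_accepts(configured_neighbors, source_ip):
    """The remote side only accepts sessions from configured neighbor addresses."""
    return source_ip in configured_neighbors

# R5's interface addresses and R2's configured neighbor, from this post.
r5_addrs = {"Serial0/0": "155.1.0.5", "Serial0/1": "155.1.45.5"}
r2_neighbors = {"155.1.45.5"}

default_src = session_source("Serial0/0", r5_addrs)
fixed_src = session_source("Serial0/0", r5_addrs, update_source="Serial0/1")

print(default_src, peer_accepts(r2_neighbors, default_src))
print(fixed_src, peer_accepts(r2_neighbors, fixed_src))
```

The packets still leave through Serial0/0 in both cases. Only the source address changes, and that is all R2 cares about.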