Archive for the ‘Catalyst’ Category

Why fast IGP timers aren’t always beneficial

March 31, 2014

Introduction

When tuning your IGP of choice, the first thing people usually look at is the hello
and dead intervals. That logic is flawed: tuning timers can certainly help in some
cases, but convergence consists of much more than just hello timers.

Why tune timers?

Detecting that the other side of the link is down is an important part of convergence.
That's why your design should avoid putting any bump in the wire, such as media
converters or an L2 cloud, between the L3 endpoints. If you avoid such things, then
when one end of the link goes down the other end goes down as well, which gives you
fast failure detection.

In rare cases the link can be up while traffic is not passing over it. For such
cases, or for those where a converter or L2 cloud could not be avoided, tuning the
hello timers can help with failure detection. The better answer is almost always
BFD, though, if the platform supports it.
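
As a rough sketch of the two options (interface names and values are only examples),
tuning the OSPF timers or enabling BFD could look something like this:

! Aggressive hello/dead timers - only where a bump in the wire hides link failures
interface GigabitEthernet0/1
 ip ospf hello-interval 1
 ip ospf dead-interval 3

! The usually better option: let BFD handle fast failure detection
interface GigabitEthernet0/1
 bfd interval 300 min_rx 300 multiplier 3
 ip ospf bfd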

Topologies where tuning timers is bad

When using a topology where VSS is involved, such as the Catalyst 6500 or Catalyst 4500,
tuning the timers is a very bad idea. A common topology might look like this:

[Figure: two L3 switches dual-homed to a VSS pair]

The L3 switches are dually connected to the VSS. These L3 switches might be in the
distribution layer and the VSS is part of the core. The distribution switches run
LACP towards the VSS which acts as one device from an outside perspective.

The VSS runs Stateful Switchover (SSO), which syncs the configuration, boots the standby
supervisor with the software and has its line cards ready to go in case the primary
chassis fails. Hardware forwarding tables are also synchronized. An SSO switchover
takes somewhere up to 10 seconds.

[Figure: SSO state synchronization between the VSS chassis]

The active VSS chassis runs the control plane. Routing protocols such as OSPF are not
HA aware, meaning that the state of the routing protocols is not synchronized between
the chassis.

When fast timers are used and a switchover occurs, OSPF detects that the neighbor is
not replying and tears down the adjacency. The secondary chassis then has to bring
the adjacency back up by sending out hello packets, exchanging LSAs and updating the
RIB/FIB. Including the switchover itself, this may take as long as 20 seconds.

[Figure: OSPF adjacency torn down during a VSS switchover when fast timers are used]

Non Stop Forwarding (NSF)

NSF combined with graceful restart is a technology used to keep forwarding packets
while a switchover occurs. The goal of NSF is to delay the failure detection, which
may sound strange from a convergence perspective. Remember, though, that the VSS acts
as one device.

With NSF the forwarding is done according to the last known FIB entries. After a
switchover the secondary VSS will use graceful restart to inform its neighbors that
it has restarted and needs to synchronize its LSDB. This is done by sending hello packets
with a special bit set, and the synchronization is done Out Of Band (OOB) so that the
existing adjacency is not torn down. The neighbors exchange LSAs and run SPF as normal.
The RIB and FIB can then be updated and normal forwarding resumes.

This process depends on the neighbors also being NSF aware; otherwise they would tear
down the adjacency while the secondary VSS is restarting its routing processes. So the
key here is that the adjacency must stay up, and that's why timers should be left at
their defaults when running VSS. This goes for both the VSS and any routers that are
neighbors to the VSS.
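
On platforms and releases where it is not on by default, NSF/graceful restart for OSPF
is enabled under the routing process. A minimal sketch (the process ID is just an
example, and depending on the release the keyword is nsf, nsf cisco or nsf ietf):

router ospf 1
 nsf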

Conclusion

When using VSS, always leave the IGP timers at their defaults. Fast timers ruin the NSF
process and will lead to much higher convergence times than leaving them at the
default.


Cisco updates the Catalyst 2960 – Catalyst 2960-X and Catalyst 2960-XR

June 12, 2013

The Catalyst 2960 is a very common switch in any environment that has
Cisco devices. A couple of years ago the 2960 got stacking via the
2960-S model. It also got the ability to configure static routes, which
was a nice feature. I used it in some deployments to do routing
locally in the 2960 and then add a default route towards the WAN provider.
That way I didn't have to go through a slow CPE to route my local
VLANs.
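
For reference, a hedged sketch of that kind of setup (VLAN numbers and addresses are
made up; on the 2960 the lanbase-routing SDM template must be enabled first, which
requires a reload):

sdm prefer lanbase-routing
! reload for the SDM template to take effect
ip routing
!
interface Vlan10
 ip address 10.10.10.1 255.255.255.0
!
ip route 0.0.0.0 0.0.0.0 192.0.2.1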

The 2960-X and -XR are available in 24 or 48 port configurations.
Uplinks are either 2x 10 Gbit SFP+ or 4x 1 Gbit SFP. The PoE models
can supply 370W or 740W of power.

The 2960-X provides up to 80 Gbps of stack bandwidth, which is 2x more than
the 2960-S. It is now also possible to stack up to 8 switches
compared to the earlier maximum of 4. The 2960-S model uses FlexStack while
the newer -X and -XR models use FlexStack-Plus. FlexStack-Plus can detect
the stack port operational state in hardware and change the forwarding
accordingly. This takes 100 ms or less. The older model does this in the CPU,
which can take 1 or 2 seconds.

Here are some notable differences between 2960-X and -XR compared to 2960-S.

  • Dual core CPU @ 600 MHz. 2960-S has single core
  • 2960-XR has support for dual power supplies
  • 256 MB of flash for -XR, 128 MB for -X. The S model has 64 MB
  • 512 MB of DRAM compared to 256 for 2960-S
  • 1k active VLANs compared to 255 for 2960-S
  • 48 Etherchannel groups for -XR, 24 for -X and 6 for -S
  • 4 MB of egress buffers instead of 2 MB
  • 4 SPAN sessions instead of 2
  • 32k MACs for -XR, 16k for -X and 8k for -S
  • 24k unicast routes for -XR, 16 static routes for -X and -S

The newer models also support NetFlow Lite, hibernation mode and EEE.

The 2960-XR does support dynamic routing. It has support for RIP, OSPF stub,
OSPFv3 stub, EIGRP stub, HSRP, VRRP and PIM.

Here are some performance numbers:

2960-X LAN Lite has 100 Gbps of switching bandwidth and 64 active VLANs.
2960-X LAN Base has 216 Gbps of switching bandwidth and 1023 active VLANs.
The same holds true for the 2960-XR with the IP Lite feature set. The 2960-S had
a maximum of 255 VLANs and 176 Gbps switching bandwidth. Depending on
model the 2960-X tops out at 130.9 Mpps compared to 101.2 for the 2960-S.

The switches also have added support for IPv6. Notable features are:

  • IPv6 MLDv1 and v2 snooping
  • IPv6 First Hop Security (RA guard, source guard, and binding integrity guard)
  • IPv6 ACLs
  • IPv6 QoS
  • HTTP/HTTPS over IPv6
  • SNMP over IPv6
  • Syslog over IPv6

I’m expecting more information to come out as it gets presented during Cisco Live
in Orlando.

Cisco releases new switch – Catalyst 3850

January 30, 2013

Cisco is releasing a new Catalyst switch and it will be called the 3850. Some of
you might already have heard about it. I've seen some presentations about it
but now it's official. Let's look at some of the highlights of this new switch.
If you want the full info, check out BRKCRS-2887 from Cisco Live in London.

  • Integrated wireless controller
  • CleanAir
  • Radio Resource Management (RRM)
  • 802.11ac ready
  • Stacking, StackPower
  • Flexible NetFlow
  • Granular QoS
  • EnergyWise

It terminates CAPWAP and DTLS in hardware. One switch/stack can support up to
50 APs and 2000 clients. Wireless capacity is 40G/switch. Supports IPv4 and
IPv6 client mobility. IP base license level is required to use wireless capabilities.

The stack supports 480 Gbps, and the fans and power supplies are field replaceable.
It also has support for StackPower and line rate on all ports. The switch supports
SFP and SFP+ modules. Different network modules with different capabilities can be
inserted. WS-C3850-NM4-1G has 4x 1G ports (SFP). WS-C3850-NM-2-10G has 2x 10G
OR 4x 1G OR 2x 1G AND 1x 10G. WS-C3850-NM-4-10G is autosensing and supports all
combinations up to 4x 10G; it is only supported on the WS-C3850-48.

Power modules are available at 350W, 715W and 1100W and are called PWR-C1-350WAC,
PWR-C1-715WAC, PWR-C1-1100WAC.

Here is a comparison to the 3750-X.

[Figure: Catalyst 3850 compared with the Catalyst 3750-X]

As you can see it’s a pretty good improvement compared to the 3750.

Some more features:

  • Cavium 6230 800 MHz 4-core CPU
  • IOS XE
  • 2GB flash, 4GB DRAM
  • 84Mpps per ASIC
  • Line rate for 64-byte packets
  • 8 queues per port (wired), 4 queues per port (AP ports)
  • Flexible Netflow

 

So one major thing here is that it is actually running IOS XE, with IOS running as a
daemon on top. This enables support for the multicore CPU and
allows for hosted applications like Wireshark. Here is a look under the hood
of the switch.

[Figure: Catalyst 3850 software architecture under the hood]

Finally, the Catalyst 3850 uses MQC and not MLS QoS, which is nice to see. This
means the QoS features will be more comparable to those of a router. A nice
feature is that you can apply different QoS settings depending on the SSID.
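
As a rough illustration of the MQC style (class and policy names are made up, and the
exact queuing keywords vary by IOS XE release), an egress policy on a port could look
something like this:

class-map match-any CM-VOICE
 match dscp ef
!
policy-map PM-EGRESS
 class CM-VOICE
  priority level 1
 class class-default
  bandwidth remaining percent 100
!
interface GigabitEthernet1/0/1
 service-policy output PM-EGRESS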

All in all this looks like a very interesting switch for the enterprise that has
both wired and wireless needs.

Catalyst QoS – A deeper look at the egress queues

October 8, 2012

I've done an earlier post on Catalyst QoS. That post described how to
configure the QoS features on the Catalyst but didn't describe
in detail how the buffers work on the platform. In this
post I will go into more detail about the buffers and thresholds
that are used.

By default, QoS is disabled. When we enable QoS all ports
will be assigned to queue-set 1. We can configure up to two
different queue-sets.

sh mls qos queue-set 
Queueset: 1
Queue     :       1       2       3       4
----------------------------------------------
buffers   :      25      25      25      25
threshold1:     100     200     100     100
threshold2:     100     200     100     100
reserved  :      50      50      50      50
maximum   :     400     400     400     400
Queueset: 2
Queue     :       1       2       3       4
----------------------------------------------
buffers   :      25      25      25      25
threshold1:     100     200     100     100
threshold2:     100     200     100     100
reserved  :      50      50      50      50
maximum   :     400     400     400     400

These are the default settings. Every port on the Catalyst has
4 egress queues (TX). When a port is experiencing congestion
it needs to place the packet into a buffer. If a packet gets
dropped it is because there were not enough buffers to store it.

So by default each queue gets 25% of the buffers. The value is
in percent to make it usable across different versions of the Catalyst,
since they may have different buffer sizes. The ASIC will have
buffers of some size, maybe a couple of megabytes, but this size is not known
to us so we have to work with the percentages.

Of the buffers we assign to a queue we can make the buffers reserved.
This means that no other queue can borrow from these buffers. If we
compare it to CBWFQ it would be the same as the bandwidth percent command
because that guarantees X percent of the bandwidth but it may use more
if there is bandwidth available. The buffers work the same way. There is
a common pool of buffers. The buffers that are not reserved go into the
common pool. By default 50% of the buffers are reserved and the rest go
into the common pool.

There is a maximum for how many buffers a queue may use, and by default this
is set to 400%. This means that the queue may use up to 4x more buffers than
it has been allocated (25%).

To differentiate between packets assigned to the same queue the thresholds
can be used. You can configure two thresholds and then there is an implicit
threshold that is not configurable (threshold3). It is always set to the maximum the queue
can support. If a threshold is set to 100% that means it can use 100% of
the buffers allocated to a queue. It is not recommended to put a low value
for the thresholds. IOS enforces a limit of at least 16 buffers assigned
to a queue. Every buffer is 256 bytes which means that 4096 bytes are
reserved.

           Q1%  Q1buf   Q2%  Q2buf   Q3%  Q3buf   Q4%  Q4buf
buffers     25           25           25           25
threshold1  100   50     100   50     100   50     100   50
threshold2  100   50     100   50     100   50     100   50
reserved     50   25      50   25      50   25      50   25
maximum     400  200     400  200     400  200     400  200

This table explains how the buffers work. Let's say that this port
on the ASIC has been assigned 200 buffers. Every queue gets 25% of the
buffers, which is 50 buffers. However, out of these 50 buffers only 50%
are reserved, which means 25 buffers. The rest of the buffers go to the
common pool. The thresholds are set to 100%, which means they can use 100%
of the buffers allocated to the queue, which was 50 buffers. For packets
that hit threshold3, 400% of the buffers can be used, which means 200 buffers.
This means that a single queue can use up all the non-reserved buffers
if the other queues are not using them.
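
As a hedged example of tuning this (the numbers are arbitrary), a second queue-set
could give queue 1 a larger share of the buffers and then be applied to a port:

mls qos queue-set output 2 buffers 40 30 20 10
mls qos queue-set output 2 threshold 1 100 100 50 400
!
interface GigabitEthernet1/0/25
 queue-set 2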

To see which queue packets are getting queued to we can use the show
platform port-asic stats enqueue command.

Switch#show platform port-asic stats enqueue gi1/0/25
Interface Gi1/0/25 TxQueue Enqueue Statistics
Queue 0
Weight 0 Frames 2
Weight 1 Frames 0
Weight 2 Frames 0
Queue 1
Weight 0 Frames 3729
Weight 1 Frames 91
Weight 2 Frames 1894
Queue 2
Weight 0 Frames 0
Weight 1 Frames 0
Weight 2 Frames 0
Queue 3
Weight 0 Frames 0
Weight 1 Frames 0
Weight 2 Frames 577

In this output we have the four queues with three thresholds. Note that queue 0
here is actually queue 1, queue 1 is queue 2 and so on. Weight 0 is
threshold1, weight 1 is threshold2 and weight 2 is the maximum threshold.

We can also list which frames are being dropped. To do this we use the
show platform port-asic stats drop command.

Switch-38#show platform port-asic stats drop gi1/0/25
Interface Gi1/0/25 TxQueue Drop Statistics
Queue 0
Weight 0 Frames 0
Weight 1 Frames 0
Weight 2 Frames 0
Queue 1
Weight 0 Frames 5
Weight 1 Frames 0
Weight 2 Frames 0
Queue 2
Weight 0 Frames 0
Weight 1 Frames 0
Weight 2 Frames 0
Queue 3
Weight 0 Frames 0
Weight 1 Frames 0
Weight 2 Frames 0

The queues are displayed in the same way here, where queue 0 = queue 1.
This command is useful for finding out whether important traffic, such as
IPTV, is being dropped in a certain queue.

The documentation for Catalyst QoS can be a bit murky, and with this post I
hope that you now have a better understanding of how the egress queueing works.


Catalyst QoS

March 7, 2012

I’m back studying and I have already booked a new lab date.
I won’t make an announcement until I get back.
This is to keep a bit of the pressure off from taking the lab.

The last couple of days I have been studying Catalyst QoS. It can get a bit messy.

There are no Catalyst 3550s in the lab any longer, only the 3560. So when practicing,
forget about the 3550. If you have a 3750 that is fine, since it is basically the same
switch as the 3560 but with stacking.

The Catalyst has 2 ingress queues, of which one can be used as a priority queue. The
switch also has 4 egress queues, where one can be used for priority. It is more likely
to end up with congestion on egress than on ingress, but we have the option of
configuring both.

Let's assume that we have a switch with a Cisco IP phone connected to it.
The switchport will have a general configuration like the one below.

interface FastEthernet0/10
switchport mode access
switchport access vlan 10
switchport voice vlan 100

Even though the port is configured as access, this is actually a form of trunk since
voice and data are using different VLANs. By default QoS is disabled. This means that
the switch will be transparent to the QoS markings. Whatever markings the phone or
computers set will be transparently forwarded through the switch. As soon as we turn
on QoS with the mls qos command this behaviour is no longer true.
If we just enable QoS and do nothing more, all markings will be set to BE (0).
This is true both for CoS and DSCP. To check if QoS is enabled use show mls qos.

Rack18SW1#sh mls qos
QoS is disabled
QoS ip packet dscp rewrite is enabled
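
Enabling it globally is a single command; after that, show mls qos should report that
QoS is enabled.

Rack18SW1(config)#mls qos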

If we trust the device connecting to a port, most likely a phone, then we set up
a trust boundary. We can trust CoS, IP precedence or DSCP. CoS or DSCP will be
more common than precedence.
CoS is a layer 2 marking, sometimes also called the 802.1p priority bits.
CoS is only available in tagged frames, like on an 802.1Q trunk. There is a risk
of losing the marking when the frame gets forwarded across different media, from
Ethernet to Frame Relay or PPP or whatever your links are running. Because of this
it makes much sense to either trust DSCP or use the CoS value to map to a DSCP value.
If we want to configure trusting of CoS on a port we configure it like this.

interface FastEthernet0/10
mls qos trust cos

This means that the CoS marking coming in to the port is trusted. Untagged frames
will receive BE treatment since there is no marking of those packets. If we want to
mark the untagged frames we use the following configuration.

interface FastEthernet0/10
mls qos trust cos
mls qos cos 3

All untagged frames will get a CoS marking of 3. What if we want the port to
mark all packets the same no matter what comes into the port?
We can use the override command for this.

interface FastEthernet0/10
mls qos cos 1
mls qos cos override

This will effectively set the CoS value to 1 for all frames entering the port.
We can also use the switchport priority command to instruct the Cisco phone to set a
CoS marking on packets from the computer (data) entering the IP phone.

interface FastEthernet0/10
switchport priority extend cos 1

This will set all frames from the computer entering the phone to have a
marking of 1 no matter what the computer tries to set them to.

It is important to know that the Catalyst switch uses the concept of an
internal QoS label. This is a DSCP value which is used internally and
defines which queues the traffic ends up in.
If you type show mls qos map you will see a lot of different maps that
the Catalyst uses. The CoS-to-DSCP map is used by the switch, so if we trust
CoS then a DSCP value will be derived from it, and when the frame exits
the switch towards another switch the CoS value will be set according
to the DSCP-to-CoS mapping table. This effectively keeps the QoS labels synchronized.
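
As an example of working with the internal label (a common tweak, not the default),
the CoS-to-DSCP map can be changed so that CoS 5 maps to DSCP 46 (EF) instead of DSCP 40:

Rack18SW1(config)#mls qos map cos-dscp 0 8 16 24 32 46 48 56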

Now let's take a look at the ingress queues. We have two of them.
By default queue 2 will be the priority queue. To see the default settings,
use the show mls qos input-queue command. We can manipulate which queue becomes
the priority queue, and this is done with the
mls qos srr-queue input priority-queue bandwidth command.
If you want to use queue 1 as the priority queue then enter a 1 in the command.
The weight defines how much bandwidth the priority queue can use. By default it uses 10%.
You can set this value from 0 to 40, so it can never starve all of the bandwidth.

Rack18SW1#sh mls qos input-queue
Queue     :       1       2
----------------------------------------------
buffers   :      90      10
bandwidth :       4       4
priority  :       0      10
threshold1:     100     100
threshold2:     100     100
Rack18SW1#
Rack18SW1(config)#mls qos srr-queue input priority-queue 1 bandwidth ?
    enter bandwidth number [0-40]

The switch uses buffers if there is a need to queue packets. Remember the basic
premise of QoS: without congestion there is no queueing to begin with, only forwarding.
Unfortunately Cisco does not tell us a lot about how many buffers are available
on the Catalyst platforms. To tune the buffers we use mls qos srr-queue input buffers.
We should not assign too much of the buffers to the priority queue.
Finding optimal values depends a lot on your network and takes a lot of testing.
The safest bet might be to use Auto QoS and look at what Cisco uses.
Those values have been researched by Cisco and should be safe to use.
Let's temporarily enable Auto QoS and look at which values we get.

Rack18SW2#sh mls qos input-queue
Queue     :       1       2
----------------------------------------------
buffers   :      67      33
bandwidth :      90      10
priority  :       0      10
threshold1:       8      34
threshold2:      16      66

With Auto QoS configured the priority queue gets 10% of the bandwidth
and 33% of the buffers. The thresholds for the non-priority queue are significantly
lower than the default settings. So the buffers command assigns buffer space to the
queues, but it does not say how much bandwidth is available to each queue.
We control this with mls qos srr-queue input bandwidth.
It is important to note here that the priority queue gets served first and then a
Shared Round Robin (SRR) algorithm is used to divide the traffic between the
two queues according to the weights. These are just weights and not necessarily
percentages, although you could configure them that way.

Rack18SW1(config)#mls qos srr-queue input bandwidth ?
    enter bandwidth weight for queue id 1
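
Putting the pieces together, a sketch that mirrors the Auto QoS style values above
could look like this (the weights are only examples):

Rack18SW1(config)#mls qos srr-queue input buffers 67 33
Rack18SW1(config)#mls qos srr-queue input bandwidth 90 10
Rack18SW1(config)#mls qos srr-queue input priority-queue 2 bandwidth 10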

If we look at show mls qos map cos-input-q and show mls qos map dscp-input-q
we can see the maps that are used to define which queue the traffic ends up in.
We can of course set these values according to our needs.

 Rack18SW1#sh mls qos map cos-input-q
   Cos-inputq-threshold map:
              cos:  0   1   2   3   4   5   6   7
              ------------------------------------
  queue-threshold: 1-1 1-1 1-1 1-1 1-1 2-1 1-1 1-1

Everything is by default mapped to queue 1 except for CoS 5, which is
mapped to queue 2. The general idea is to map VoIP to queue 2 and everything
else to queue 1. Let's look at the DSCP table as well.

Rack18SW1#sh mls qos map dscp-input-q
   Dscp-inputq-threshold map:
     d1 :d2    0     1     2     3     4     5     6     7     8     9
     ------------------------------------------------------------
      0 :    01-01 01-01 01-01 01-01 01-01 01-01 01-01 01-01 01-01 01-01
      1 :    01-01 01-01 01-01 01-01 01-01 01-01 01-01 01-01 01-01 01-01
      2 :    01-01 01-01 01-01 01-01 01-01 01-01 01-01 01-01 01-01 01-01
      3 :    01-01 01-01 01-01 01-01 01-01 01-01 01-01 01-01 01-01 01-01
      4 :    02-01 02-01 02-01 02-01 02-01 02-01 02-01 02-01 01-01 01-01
      5 :    01-01 01-01 01-01 01-01 01-01 01-01 01-01 01-01 01-01 01-01
      6 :    01-01 01-01 01-01 01-01

To read this table, combine the d1 value in the left column (the tens digit) with a
d2 value from the header row (the ones digit). Almost everything is mapped
to queue 1 except for DSCP 40-47, which are mapped to queue 2.

Up until now we have only discussed queues. The Catalyst switch also uses
a congestion avoidance mechanism called Weighted Tail Drop (WTD).
The switch has three thresholds for every queue, where the third threshold
is not configurable; it is always set to 100%.
We can set the other two thresholds to values of our liking. Now we will
map CoS 6 to queue 2, threshold 3. We don't want this traffic to get dropped unless
there is no other option.

Rack18SW1(config)#mls qos srr-queue input cos-map queue 2 threshold 3 ?
    8 cos values separated by spaces

Rack18SW1(config)#mls qos srr-queue input cos-map queue 2 threshold 3 6

Always confirm your result with the show mls qos map command.

Rack18SW1#sh mls qos map cos-input-q
   Cos-inputq-threshold map:
              cos:  0   1   2   3   4   5   6   7
              ------------------------------------
  queue-threshold: 1-1 1-1 1-1 1-1 1-1 2-1 2-3 1-1

Now let's try to map DSCP EF to queue 1, threshold 3.

Rack18SW1(config)#mls qos srr-queue input dscp-map queue 1 threshold 3 ?
    dscp values separated by spaces (up to 8 values total)

Rack18SW1(config)#mls qos srr-queue input dscp-map queue 1 threshold 3 46

That covers the ingress queues. Note that all these commands affect all ports on the
switch; there is no way of setting port-specific QoS settings for the input queues.

Now let's look at our options for the egress queues. We have four queues
where every queue has three thresholds. We start by looking at the default settings.

Rack18SW1#sh mls qos map cos-output-q
   Cos-outputq-threshold map:
              cos:  0   1   2   3   4   5   6   7
              ------------------------------------
  queue-threshold: 2-1 2-1 3-1 3-1 4-1 1-1 4-1 4-1
Rack18SW1#sh mls qos map dscp-output-q
   Dscp-outputq-threshold map:
     d1 :d2    0     1     2     3     4     5     6     7     8     9
     ------------------------------------------------------------
      0 :    02-01 02-01 02-01 02-01 02-01 02-01 02-01 02-01 02-01 02-01
      1 :    02-01 02-01 02-01 02-01 02-01 02-01 03-01 03-01 03-01 03-01
      2 :    03-01 03-01 03-01 03-01 03-01 03-01 03-01 03-01 03-01 03-01
      3 :    03-01 03-01 04-01 04-01 04-01 04-01 04-01 04-01 04-01 04-01
      4 :    01-01 01-01 01-01 01-01 01-01 01-01 01-01 01-01 04-01 04-01
      5 :    04-01 04-01 04-01 04-01 04-01 04-01 04-01 04-01 04-01 04-01
      6 :    04-01 04-01 04-01 04-01

The egress queueing is a bit more flexible. With the SRR algorithm we
can do some port-specific bandwidth control. When it comes to egress queues
we can either shape or share a queue. A shaped queue is guaranteed an amount
of bandwidth but is also policed to that value. Even if there is no
congestion, that queue still can't use more bandwidth than it has been assigned.
The shaped value is calculated against the physical interface speed.
Look at the following command.

Rack18SW1(config-if)#srr-queue bandwidth shape 25 0 0 0

How much bandwidth did we just assign to queue 1? 25 Mbit?
No, we assigned 4 Mbit, since the value is an inverse weight: (1/25)*100 = 4.
When we set the other queues to 0 they operate in shared mode instead of shaped mode.
Now we configure the three other queues.

Rack18SW1(config-if)#srr-queue bandwidth share 33 33 33 33

How much does every queue get? Notice I put a 33 for queue 1, but
that will do nothing since it is operating in shaped mode. That leaves
us with the other three queues. To calculate their share we use
(33/(33+33+33))*96 = 32 Mbit. So these values are just weights,
and we have to subtract the shaped queue's bandwidth from the interface speed when
calculating how much the other queues get. When operating in shared mode, if one
queue is not using all of its bandwidth the other queues may cut into it.
This is different compared to the shaped mode.

Assigning values to the egress queue depending on CoS or DSCP works
the same way as for ingress.

Rack18SW1(config)#mls qos srr-queue output cos-map queue 3 threshold 3 ?
    8 cos values separated by spaces

Rack18SW1(config)#mls qos srr-queue output cos-map queue 3 threshold 3 5
Rack18SW1(config)#mls qos srr-queue output dscp-map queue 2 threshold 3 ?
    dscp values separated by spaces (up to 8 values total)

Rack18SW1(config)#mls qos srr-queue output dscp-map queue 2 threshold 3 46

The egress queues also use buffers. These can be tuned by configuring a
queue-set. By default all ports will use queue-set 1 with these settings.

Rack18SW1#show mls qos queue-set
Queueset: 1
Queue     :       1       2       3       4
----------------------------------------------
buffers   :      25      25      25      25
threshold1:     100     200     100     100
threshold2:     100     200     100     100
reserved  :      50      50      50      50
maximum   :     400     400     400     400

We can configure one of our own queue-sets and tell a port to use this instead.

Rack18SW1(config)#mls qos queue-set output 2 buffers 10 10 40 40

We can also configure thresholds and how much of the buffers are reserved.

Rack18SW1(config)#mls qos queue-set output 2 threshold ?
    enter queue id in this queue set

Rack18SW1(config)#mls qos queue-set output 2 threshold 1 ?
    enter drop threshold1 1-3200

Rack18SW1(config)#mls qos queue-set output 2 threshold 1 50 ?
    enter drop threshold2 1-3200

Rack18SW1(config)#mls qos queue-set output 2 threshold 1 50 200 ?
    enter reserved threshold 1-100

Rack18SW1(config)#mls qos queue-set output 2 threshold 1 50 200 75 ?
    enter maximum threshold 1-3200

Rack18SW1(config)#mls qos queue-set output 2 threshold 1 50 200 75 300

Then we need to actually assign the queue-set to an interface.

Rack18SW1(config)#interface FastEthernet0/10
Rack18SW1(config-if)#queue-set 2

We can check our settings with show mls qos queue-set

Rack18SW1#sh mls qos queue-set 2
Queueset: 2
Queue     :       1       2       3       4
----------------------------------------------
buffers   :      10      10      40      40
threshold1:      50     200     100     100
threshold2:     200     200     100     100
reserved  :      75      50      50      50
maximum   :     300     400     400     400

The buffer values can be a bit confusing. First we define how big a share of the
buffers the queue gets, in percent. The thresholds define when
traffic will be dropped. For queue 1 we start dropping traffic in threshold 1 at 50%.
Then we drop traffic in threshold 2 at 200%.
How can a queue get to 200%?! The secret here is that a queue can outgrow
the buffers we assign it if there are buffers available in the common pool.
This is where the reserved value comes into play.
Every queue gets assigned buffers, but we can define that only 50% of
these buffers are strictly reserved for the queue.
The other 50% goes into a common pool and can be used by the other queues as well.
We then set a maximum value for the queue which says that it can grow
up to 400% but no more than that.

Earlier in this post I talked about the priority queue for the egress queues.
This is how we enable it.

Rack18SW1(config)#int f0/10
Rack18SW1(config-if)#priority-queue out

This will always be queue 1 and is not configurable.

Now let's move on to some other things we can do with QoS. Let's assume that we
have a customer connecting to the switch, and internally they are using totally
different DSCP values than we want. We can use a DSCP mutation map for that.

Rack18SW1(config)#mls qos map dscp-mutation MUTATE 40 ?
    DSCP values separated by spaces (up to 8 values total)
  to      to keyword

Rack18SW1(config)#mls qos map dscp-mutation MUTATE 40 to 46
Rack18SW1(config)#int f0/10
Rack18SW1(config-if)#mls qos trust dscp
Rack18SW1(config-if)#mls qos dscp-mutation MUTATE

So in this example we are mutating DSCP 40 (CS5) to DSCP 46 (EF).

We also have the option of using policy maps just like on routers,
and we can even police traffic. This policy-map will match all ICMP and police
it to 128k with a marking of EF; any exceeding traffic will be remarked
to DSCP 0.

Rack18SW1(config)#ip access-list extended ICMP
Rack18SW1(config-ext-nacl)#permit icmp any any
Rack18SW1(config-ext-nacl)#class-map CM_ICMP
Rack18SW1(config-cmap)#match access-group name ICMP
Rack18SW1(config-cmap)#policy-map POLICE
Rack18SW1(config-pmap)#class CM_ICMP
Rack18SW1(config-pmap-c)#police 128000 32000 exceed-action policed-dscp-transmit
Rack18SW1(config-pmap-c)#set dscp ef
Rack18SW1(config-pmap-c)#exit
Rack18SW1(config-pmap)#exit
Rack18SW1(config)#mls qos map policed-dscp 46 ?
    DSCP values separated by spaces (up to 8 values total)
  to      to keyword
Rack18SW1(config)#mls qos map policed-dscp 46 to 0
Rack18SW1(config)#int f0/10
Rack18SW1(config-if)#service-policy input POLICE

If you are used to configuring MQC on routers, then you will be surprised
to learn that show policy-map does not give you the statistics on these switches.
We need to use show mls qos interface statistics instead.

Rack18SW1#sh mls qos int f0/10 statistics
FastEthernet0/10 (All statistics are in packets)

  dscp: incoming
-------------------------------

  0 -  4 :           0            0            0            0            0
  5 -  9 :           0            0            0            0            0
 10 - 14 :           0            0            0            0            0
 15 - 19 :           0            0            0            0            0
 20 - 24 :           0            0            0            0            0
 25 - 29 :           0            0            0            0            0
 30 - 34 :           0            0            0            0            0
 35 - 39 :           0            0            0            0            0
 40 - 44 :           0            0            0            0            0
 45 - 49 :           0            0            0            0            0
 50 - 54 :           0            0            0            0            0
 55 - 59 :           0            0            0            0            0
 60 - 64 :           0            0            0            0
  dscp: outgoing
-------------------------------

  0 -  4 :           0            0            0            0            0
  5 -  9 :           0            0            0            0            0
 10 - 14 :           0            0            0            0            0
 15 - 19 :           0            0            0            0            0
 20 - 24 :           0            0            0            0            0
 25 - 29 :           0            0            0            0            0
 30 - 34 :           0            0            0            0            0
 35 - 39 :           0            0            0            0            0
 40 - 44 :           0            0            0            0            0
 45 - 49 :           0            0            0            0            0
 50 - 54 :           0            0            0            0            0
 55 - 59 :           0            0            0            0            0
 60 - 64 :           0            0            0            0
  cos: incoming
-------------------------------

  0 -  4 :           2            0            0            0            0
  5 -  7 :           0            0            0
  cos: outgoing
-------------------------------

  0 -  4 :           0            0            0            0            0
  5 -  7 :           0            0            0
Policer: Inprofile:            0 OutofProfile:            0

This is a huge table showing how much traffic with different markings
is coming in and going out. At the very end you see the policer counters, which
show how much traffic is in profile and out of profile.

So that is one way of configuring policy-maps. The Catalyst switches can apply
QoS in either VLAN-based mode or port-based mode.
If we use VLAN-based mode we apply the policy to an SVI instead.
This might be more scalable depending on your setup.
The caveat with using a policy-map on an SVI is that you can't police in the parent map.
You need a child map for that. Let's look at an example using a parent and child map.
Any IP traffic from the trunks (Fa0/13 - 21) may use 256k, and traffic from Fa0/6 will
be restricted to 56k. This will all be configured for VLAN 146.

Rack18SW1(config)#int range fa0/13 -21 , fa0/6
Rack18SW1(config-if)#mls qos vlan-based
Rack18SW1(config-if)#exit
Rack18SW1(config)#ip access-list extended IP_ANY
Rack18SW1(config-ext-nacl)#permit ip any any
Rack18SW1(config-ext-nacl)#class-map CM_IP_ANY
Rack18SW1(config-cmap)#match access-group name IP_ANY
Rack18SW1(config-cmap)#class-map CM_TRUNKS
Rack18SW1(config-cmap)#match input-interface fa0/13 - fa0/21
Rack18SW1(config-cmap)#class-map CM_R6
Rack18SW1(config-cmap)#match input-interface fa0/6
Rack18SW1(config-cmap)#policy-map CHILD
Rack18SW1(config-pmap)#class CM_TRUNKS
Rack18SW1(config-pmap-c)#police 256000 32000
Rack18SW1(config-pmap-c)#class CM_R6
Rack18SW1(config-pmap-c)#police 56000 28000
Rack18SW1(config-pmap-c)#policy-map PARENT
Rack18SW1(config-pmap)#class CM_IP_ANY
Rack18SW1(config-pmap-c)#service-policy CHILD
Rack18SW1(config-pmap-c)#set dscp cs1
Rack18SW1(config-pmap-c)#int vlan 146
Rack18SW1(config-if)#service-policy input PARENT

The final thing I want to show is how to use an aggregate policer.
We can use this if we want several classes to share a single policed rate instead
of policing each class separately. Take a look at this.

Rack18SW1(config)#ip access-list extended ICMP
Rack18SW1(config-ext-nacl)#permit icmp any any
Rack18SW1(config-ext-nacl)#ip access-list extended HTTP
Rack18SW1(config-ext-nacl)#permit tcp any eq www any
Rack18SW1(config-ext-nacl)#permit tcp any any eq www
Rack18SW1(config-ext-nacl)#class-map CM_ICMP
Rack18SW1(config-cmap)#match access-group name ICMP
Rack18SW1(config-cmap)#class-map CM_HTTP
Rack18SW1(config-cmap)#match access-group name HTTP
Rack18SW1(config-cmap)#exit
Rack18SW1(config)#mls qos aggregate-policer AGG256k ?
    Bits per second (postfix k, m, g optional; decimal point
                      allowed)

Rack18SW1(config)#mls qos aggregate-policer AGG256k 256000 ?
    Normal burst bytes

Rack18SW1(config)#mls qos aggregate-policer AGG256k 256000 32000 ?
  exceed-action  action when rate is exceeded

Rack18SW1(config)#mls qos aggregate-policer AGG256k 256000 32000 exceed-action drop
Rack18SW1(config)#policy-map AGG_POLICER
Rack18SW1(config-pmap)#class CM_ICMP
Rack18SW1(config-pmap-c)#police aggregate AGG256k
Rack18SW1(config-pmap-c)#class CM_HTTP
Rack18SW1(config-pmap-c)#police aggregate AGG256k
Rack18SW1(config-if)#int f0/2
Rack18SW1(config-if)#service-policy input AGG_POLICER

And before I leave you, there is one final thing I want to show:
how to limit the egress traffic on an interface with the SRR command.
Let's say that we have a 100 Mbit interface but the customer only pays for 4 Mbit.
We can use this command.

Rack18SW1(config-if)#srr-queue bandwidth limit ?
    enter bandwidth limit for interface  as percentage

The lowest value we can set is 10, though. On a port running at 100 Mbit
that leaves us with 10 Mbit at best. Instead, we can set the port speed to 10 Mbit
and configure the limit to 40%. Then we achieve the value that was requested.
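
In other words, a sketch for the 4 Mbit case could look like this (the port is just
an example):

Rack18SW1(config)#interface FastEthernet0/10
Rack18SW1(config-if)#speed 10
Rack18SW1(config-if)#srr-queue bandwidth limit 40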

This post turned out to be very long but I hope it has been informative
and I know for sure it helped me to solidify the concepts.
I hope it will help a lot of other people as well.