CCIE Blog

Helping you become a Cisco Certified Internetwork Expert


Welcome to Internetwork Expert's CCIE Blog

Welcome to Internetwork Expert’s CCIE Blog! This site is dedicated to helping you in your pursuit of becoming a Cisco Certified Internetwork Expert in Routing & Switching, Voice, Security, Service Provider, and Storage. Through this blog you can submit questions to our expert instructors, Brian Dennis - Quintuple CCIE #2210, Scott Morris - Quad CCIE #4713, Brian McGahan – Triple CCIE #8593, Petr Lapukhov - Quad CCIE #16379 and Anthony Sequeira - CCIE #15626. Check back daily as this blog will be updated frequently.


August 31st, 2008

Stealing The Internet - A Routed, Wide-area, Man in the Middle Attack

Below is a link to a presentation from this year’s Defcon on the inherent insecurity of BGP prefix advertisements on the Internet. This is of course old news to most of us, but it is still an interesting presentation.

https://www.defcon.org/images/defcon-16/dc16-presentations/defcon-16-pilosov-kapela.pdf

Additionally, there was another good presentation, “Developments in Cisco IOS Forensics”, that is an interesting read.

https://www.defcon.org/images/defcon-16/dc16-presentations/defcon-16-fx.pdf

August 29th, 2008

Policy Change to Payment for CCIE Labs - Good News!

<quote>

Policy Change to Payment for CCIE Labs

In effort to improve the availability of CCIE lab exams Cisco has updated the CCIE lab payment process.

On September 6, 2008 the payment policy for CCIE labs will be as follows:

Payment in full is due 90 days (calendar) prior to your lab date. Payment must be received to confirm your date. After 90 days refunds will not be available for canceled lab dates.

The change in this policy will allow for lab seats to be open in a timely manner and create more desirable time frames.

If you have questions or want to confirm you are within the 90+ day window please contact customer support.

</quote>

This is good news as it should open up more lab dates 90 days out as opposed to 30 days out.

August 28th, 2008

CCIE Lab Interviews?

I ran across this email on a mailing list today about Cisco interviewing candidates before allowing them to take the exam. Its authenticity is unconfirmed, but there have been stories of problems at certain CCIE lab locations (e.g. someone taking the lab for 4 or 5 other people, or someone with a very good memory but no Cisco networking skills taking the exam just to brain dump it).

Dear Candidate:

On August 27, Cisco will introduce a pilot for the CCIE Routing and
Switching lab exam in Beijing, China. The pilot will add a 10-minute
interview that will assess the candidate's ability to apply expert-level
networking skills and knowledge to networking problems that are encountered
on the job. After the lab orientation, a panel of three experts will conduct
a verbal interview with each candidate, asking a series of expert-level
networking questions (questions and answers will be in English). The ability
to correctly answer these questions will affect the exam score. After
completing the interview, the candidate will have the entire 8 hours to
complete the lab portion of the exam.  These scores will then be
calculated and then combined for a total score which will decide a pass
or a fail.

Our goal with this email is to let you know that your day will extend beyond
the normal testing day by approximately one hour.  The additional hour will
be at the end of the day. We hope you find this interview process
enlightening and helpful as we continue to strive for the standard the world
has come to expect from CCIE.

August 27th, 2008

The War is On Between R4 and SW4!

Today my routers finally passed the point of no return. Negotiations between R4 and SW4 broke down, and the course of action we were all trying to avoid was now inevitable… all out war.

R4#
%OSPF-4-FLOOD_WAR: Process 1 re-originates LSA ID 204.12.1.0 type-5 adv-rtr 223.255.255.255 in area 0
%OSPF-4-FLOOD_WAR: Process 1 re-originates LSA ID 31.2.0.0 type-5 adv-rtr 223.255.255.255 in area 0
%OSPF-4-FLOOD_WAR: Process 1 re-originates LSA ID 31.3.0.0 type-5 adv-rtr 223.255.255.255 in area 0
%OSPF-4-FLOOD_WAR: Process 1 re-originates LSA ID 204.12.1.0 type-5 adv-rtr 223.255.255.255 in area 0
%OSPF-4-FLOOD_WAR: Process 1 re-originates LSA ID 31.2.0.0 type-5 adv-rtr 223.255.255.255 in area 0
%OSPF-4-FLOOD_WAR: Process 1 re-originates LSA ID 31.3.0.0 type-5 adv-rtr 223.255.255.255 in area 0

Who will be the winner? Only time will tell. What sent them over the edge though? Did the diplomat in charge of DTP negotiation fail?

Be the first person to tell me why R4 and SW4 declared all out WAR on each other and win a $50 amazon gift card! Post your comments now!

Update:

Congratulations to Patrik Berglund, winner of a $50 amazon gift card!

R4 and SW4 declared war on each other because they had duplicate OSPF Router-IDs. When R4 redistributed routes into OSPF, it generated type-5 (AS-external) LSAs with its own Router-ID, 223.255.255.255, as the advertising router. Per RFC 2328, OSPFv2:

    13.4.  Receiving self-originated LSAs

        It is a common occurrence for a router to receive self-
        originated LSAs via the flooding procedure. A self-originated
        LSA is detected when either 1) the LSA's Advertising Router is
        equal to the router's own Router ID or 2) the LSA is a network-
        LSA and its Link State ID is equal to one of the router's own IP
        interface addresses.

        However, if the received self-originated LSA is newer than the
        last instance that the router actually originated, the router
        must take special action.  The reception of such an LSA
        indicates that there are LSAs in the routing domain that were
        originated by the router before the last time it was restarted.
        In most cases, the router must then advance the LSA's LS
        sequence number one past the received LS sequence number, and
        originate a new instance of the LSA.

        It may be the case the router no longer wishes to originate the
        received LSA. Possible examples include: 1) the LSA is a
        summary-LSA or AS-external-LSA and the router no longer has an
        (advertisable) route to the destination, 2) the LSA is a
        network-LSA but the router is no longer Designated Router for
        the network or 3) the LSA is a network-LSA whose Link State ID
        is one of the router's own IP interface addresses but whose
        Advertising Router is not equal to the router's own Router ID
        (this latter case should be rare, and it indicates that the
        router's Router ID has changed since originating the LSA).  In
        all these cases, instead of updating the LSA, the LSA should be
        flushed from the routing domain by incrementing the received
        LSA's LS age to MaxAge and reflooding (see Section 14.1).

In this case, SW4 received an external LSA with its own Router-ID (223.255.255.255) as the advertising router. Since SW4 had no route to the destination in that LSA, it concluded that it had originated the LSA earlier, had since lost the route, and was now seeing a stale copy still circulating in the topology. In response, SW4 set the age of the LSA to MaxAge and reflooded it, effectively flushing it. When R4 received this copy back, it saw its own LSA being aged out, but since it still had a local route to the destination it re-originated the LSA. The fight between the legitimate LSA and the MaxAge copy repeats over and over, resulting in the FLOOD_WAR messages on the console.
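The loop is easier to see in a toy model. The Python sketch below is only an illustration of the section 13.4 rule quoted above, not how IOS implements it: the class and function names are made up for this example, sequence numbers are small integers rather than real LS sequence numbers, and only the self-originated LSA case is modelled.

```python
from dataclasses import dataclass

MAX_AGE = 3600  # LS age (seconds) at which an LSA is flushed, per RFC 2328


@dataclass
class Lsa:
    ls_id: str
    adv_router: str
    seq: int       # simplified: real LS sequence numbers start at 0x80000001
    age: int = 0


class Router:
    """Hypothetical toy router that only applies the self-originated LSA rule."""

    def __init__(self, name, router_id, external_routes=()):
        self.name = name
        self.router_id = router_id
        # Prefixes this router redistributes into OSPF as type-5 LSAs.
        self.external_routes = set(external_routes)

    def originate(self, ls_id, seq):
        return Lsa(ls_id, self.router_id, seq)

    def receive(self, lsa):
        """Return the LSA flooded in response, or None if nothing is flooded."""
        if lsa.adv_router != self.router_id:
            return None  # not self-originated; normal flooding is not modelled
        if lsa.ls_id in self.external_routes:
            # Still has the route: originate a newer instance of the LSA.
            return self.originate(lsa.ls_id, lsa.seq + 1)
        if lsa.age >= MAX_AGE:
            return None  # already flushed, nothing more to do
        # No route to the destination: flush the LSA by setting age to MaxAge.
        return Lsa(lsa.ls_id, lsa.adv_router, lsa.seq, MAX_AGE)


def simulate(rounds):
    """Replay the R4 / SW4 exchange for a number of rounds and log each step."""
    r4 = Router("R4", "223.255.255.255", external_routes={"204.12.1.0"})
    sw4 = Router("SW4", "223.255.255.255")  # duplicate Router-ID, no such route
    log = []
    lsa = r4.originate("204.12.1.0", seq=1)
    for _ in range(rounds):
        flushed = sw4.receive(lsa)
        log.append(("SW4 flushes", flushed.seq))
        lsa = r4.receive(flushed)
        log.append(("R4 re-originates", lsa.seq))
    return log


for step, seq in simulate(3):
    print(step, "seq", seq)
```

Each round SW4 flushes the LSA and R4 answers with a higher sequence number, so the exchange never settles. Giving the two devices unique Router-IDs breaks the loop, because SW4 then no longer treats R4's LSA as self-originated.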

For more detailed information and lab scenarios like this check out the new IEWB-RS Volume 1 Version 5.0!

August 26th, 2008

Cisco engineering units are the emerging measure of global power

There is an interesting article regarding CCIEs on www.pbs.org.   Here is an excerpt from it:

Leading indicators are measurements that change over time and suggest future trends for important second-order results like population growth and economic development. Economists in particular are often looking for indicators that have been known historically to lead the overall economy. If unemployment goes down, for example, it is a good bet that shortly thereafter income will rise and the economy will improve. It’s for this very reason, then, that economists and Wall Street fund managers are always looking for newer and better leading indicators. But such indicators needn’t be limited to the economy: they can apply to technology and technical culture, too, which has its own feedback loop to economic development. My friend George Morton, who figured this all out, says that by knowing the right numbers to look at we can have a good idea what countries will be leading in technology — and presumably in economic development and power — in the years ahead. The measure George likes is the number of Cisco Certified Internetwork Experts or CCIEs.

You can read the rest of the article at:

http://www.pbs.org/cringely/pulpit/2008/pulpit_20080822_005393.html

August 26th, 2008

Understanding the “shape peak” command

Note: The following post is an excerpt from the full QoS section of IEWB-RS VOL1 version 5.

Peak shaping may look confusing at first sight; however, its function becomes clear once you think of oversubscription. As we discussed before, oversubscription means selling customers more bandwidth than the network can supply, in the hope that not all connections will use their maximum sending rate at the same time. With oversubscription, the traffic contract usually specifies three parameters: PIR, CIR and Tc (peak rate, committed rate and the averaging time interval for rate measurements). The SP allows customers to send traffic at rates up to PIR, but guarantees only CIR when the network is congested. Inside the network, the SP uses any of the max-min scheduling procedures to share bandwidth in such a manner that oversubscribed traffic has lower preference than conforming traffic. Additionally, the SP generally assumes that customers respond to congestion notifications from the network (either explicit, such as FECN/BECN or TCP ECN, or implicit, such as packet drops in TCP) by slowing down their sending rate.

Commonly, customers implement traffic shaping to conform to the traffic contract, and the provider uses traffic policing to enforce it. If a contract specifies PIR, then it makes sense for the customer to shape traffic at the PIR rate. However, this makes it difficult to deduce the CIR value just by looking at the router configuration. In some circumstances, as with Frame-Relay networks, a secondary parameter known as minCIR may help to understand the configuration quickly. In general, it would be helpful to see CIR and PIR in the shaping configuration at the same time. This is exactly the idea behind shape peak. When you configure

shape peak <CIR> <Bc> <Be>

the actual maximum sending rate is limited to:

PIR = CIR*(1+Be/Bc).

That is, during each time interval Tc=Bc/CIR the shaper allows sending up to Bc+Be bits of data. If you omit the value for Be, it defaults to Bc, and thus PIR=2*CIR. However, due to a discrepancy in IOS output, this is NOT reflected in “show” command output unless you explicitly specify the Be value on the command line. With shape peak configured this way, you can see both CIR as the “average rate” and PIR as the “target rate” when issuing the “show policy-map” command.

Rack1R6#show policy-map interface fastEthernet 0/0.146
 FastEthernet0/0.146 

  Service-policy output: POLICY_VLAN146_OUT

    Class-map: HTTP (match-all)
      6846 packets, 4065413 bytes
      5 minute offered rate 63000 bps, drop rate 0 bps
      Match: access-group 180
      Traffic Shaping
           Target/Average   Byte   Sustain   Excess    Interval  Increment
             Rate           Limit  bits/int  bits/int  (ms)      (bytes)
           128000/64000     1600   6400      6400      100       1600
…
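As a quick sanity check of the formula against this output, here is a small Python sketch of the arithmetic. The function name and the returned field names are made up for this example; the only inputs are the three values passed to shape peak, with Be defaulting to Bc as described above.

```python
def shape_peak(cir_bps, bc_bits, be_bits=None):
    """Derive Tc and PIR from the arguments of 'shape peak <CIR> <Bc> <Be>'."""
    if be_bits is None:
        be_bits = bc_bits  # omitted Be defaults to Bc, so PIR = 2 * CIR
    bits_per_interval = bc_bits + be_bits
    return {
        "tc_ms": 1000 * bc_bits / cir_bps,             # Tc = Bc / CIR
        "pir_bps": cir_bps * (1 + be_bits / bc_bits),  # PIR = CIR * (1 + Be/Bc)
        "bits_per_interval": bits_per_interval,
        "bytes_per_interval": bits_per_interval // 8,
    }


params = shape_peak(64000, 6400, 6400)
print(params)
```

For shape peak 64000 6400 6400 this gives Tc = 100 ms and PIR = 128000 bps, which agrees with the Interval and Target Rate columns above, and 1600 bytes per interval, which agrees with the Increment column.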

All other shaping functions remain the same as with classic GTS; shape peak is just better suited to oversubscription scenarios. Also, in Frame-Relay networks you may want to use a configuration similar to the following to respond to congestion notifications:

shape peak <CIR> <Bc> <Be>
shape adaptive <CIR>

To illustrate the use of shape peak, let’s look at the following scenario. Here, R4 serves two customers (R1 and R6) sending their traffic across one 128Kbps serial link between R4 and R5. The fictitious ISP sells 128Kbps (PIR) to each of the customers, guaranteeing only 64Kbps (CIR). Let’s assume a measurement interval of 100ms for this configuration. The serial link, which is the oversubscribed resource, uses WFQ for fair bandwidth sharing between the two flows.

Oversubscription scenario

R1:
access-list 180 permit tcp any eq 80 any
!
class-map HTTP
 match access-group 180
!
policy-map POLICY_VLAN146_OUT
  class HTTP
    shape peak 64000 6400 6400
!
interface FastEthernet 0/0
  service-policy output POLICY_VLAN146_OUT

R6:
access-list 180 permit tcp any eq 80 any
!
class-map HTTP
 match access-group 180
!
policy-map POLICY_VLAN146_OUT
  class HTTP
    shape peak 64000 6400 6400
!
interface FastEthernet 0/0.146
  service-policy output POLICY_VLAN146_OUT

R4:
!
! All HTTP traffic
!
ip access-list extended HTTP
 permit tcp any eq 80 any
!
class-map HTTP
 match access-group name HTTP

!
! Traffic from R1 and R6 respectively
!
ip access-list extended FROM_R1
 permit ip host 155.1.146.1 any
!
ip access-list extended FROM_R6
 permit ip host 155.1.146.6 any
!
!
!
class-map FROM_R1
 match access-group name FROM_R1
!
class-map FROM_R6
 match access-group name FROM_R6

!
! Subrate policers
!
policy-map SUBRATE_POLICER
 class FROM_R1
  police cir 64000 bc 3200 pir 128000 be 6400
   conform-action set-prec-transmit 1
   exceed-action set-prec-transmit 0
   violate-action drop
 class FROM_R6
  police cir 64000 bc 3200 pir 128000 be 6400
   conform-action set-prec-transmit 1
   exceed-action set-prec-transmit 0
   violate-action drop

!
! Policer configuration using MQC syntax.
!
policy-map POLICE_VLAN146
 class HTTP
   service-policy SUBRATE_POLICER
!
interface FastEthernet 0/1
  service-policy input POLICE_VLAN146

The idea is to allow R1 and R6 to send up to 128Kbps if there is enough bandwidth on the serial link. However, if both sources start streaming at the same time, the SP may only guarantee up to 64Kbps to each of the sending routers. The implementation meters each flow against 64Kbps and 128Kbps meters, and marks all conforming traffic with IP precedence 1. All exceeding traffic is marked with IP precedence 0. Since the serial link uses WFQ, traffic marked with IP precedence 0 gets a smaller share of the scheduler than traffic marked with precedence 1. Thus, if IP precedence 1 traffic exists on the link, it is given preference over the low-priority traffic (precedence 0).
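To make the conform/exceed/violate decision concrete, here is a minimal Python sketch of a two-rate, two-bucket meter in the style of RFC 2698 (color-blind mode). It is an assumption that this matches the IOS policer closely enough for illustration; the class name, method names and the action table are invented for this example, and only the CIR/Bc/PIR/Be values come from the configuration above.

```python
class TwoRatePolicer:
    """Toy two-rate meter: one bucket filled at CIR, one filled at PIR."""

    def __init__(self, cir_bps, bc_bytes, pir_bps, be_bytes):
        self.cir = cir_bps / 8.0    # committed fill rate, bytes per second
        self.pir = pir_bps / 8.0    # peak fill rate, bytes per second
        self.bc = bc_bytes          # committed bucket depth
        self.be = be_bytes          # peak bucket depth
        self.tc = float(bc_bytes)   # both buckets start full
        self.tp = float(be_bytes)
        self.last = 0.0

    def police(self, now, size):
        """Classify a packet of 'size' bytes arriving at time 'now' (seconds)."""
        elapsed = now - self.last
        self.last = now
        # Refill both buckets, capped at their configured depths.
        self.tc = min(self.bc, self.tc + elapsed * self.cir)
        self.tp = min(self.be, self.tp + elapsed * self.pir)
        if self.tp < size:
            return "violate"        # above PIR
        if self.tc < size:
            self.tp -= size
            return "exceed"         # between CIR and PIR
        self.tc -= size
        self.tp -= size
        return "conform"            # within CIR


# Actions from the SUBRATE_POLICER policy-map above.
ACTIONS = {
    "conform": "set-prec-transmit 1",
    "exceed": "set-prec-transmit 0",
    "violate": "drop",
}

policer = TwoRatePolicer(cir_bps=64000, bc_bytes=3200, pir_bps=128000, be_bytes=6400)
burst = [policer.police(0.0, 580) for _ in range(12)]
for result in ("conform", "exceed", "violate"):
    print(result, burst.count(result), "->", ACTIONS[result])
# A back-to-back burst of twelve 580-byte packets: 5 conform, 6 exceed, 1 violate.
```

The burst drains the committed bucket first (packets marked precedence 1), then the peak bucket (precedence 0), and only then starts dropping. A sender paced at or below 64Kbps never leaves the conform branch.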

To verify our configuration in action, start downloading a large file from R1 across R4 and see the statistics on R1 and R4:

Rack1R4#show policy-map interface fastEthernet 0/1
 FastEthernet0/1 

  Service-policy input: POLICE_VLAN146

    Class-map: HTTP (match-all)
      20451 packets, 12066090 bytes
      30 second offered rate 126000 bps, drop rate 0 bps
      Match: access-group name HTTP

      Service-policy : SUBRATE_POLICER

        Class-map: FROM_R1 (match-all)
          20451 packets, 12066090 bytes
          30 second offered rate 126000 bps, drop rate 0 bps
          Match: access-group name FROM_R1
          police:
              cir 64000 bps, bc 3200 bytes
              pir 128000 bps, be 6400 bytes
            conformed 11113 packets, 6556670 bytes; actions:
              set-prec-transmit 1
            exceeded 9338 packets, 5509420 bytes; actions:
              set-prec-transmit 0
            violated 0 packets, 0 bytes; actions:
              drop
            conformed 64000 bps, exceed 62000 bps, violate 0 bps

        Class-map: FROM_R6 (match-all)
          0 packets, 0 bytes
          30 second offered rate 0 bps, drop rate 0 bps
          Match: access-group name FROM_R6
          police:
              cir 64000 bps, bc 3200 bytes
              pir 128000 bps, be 6400 bytes
            conformed 0 packets, 0 bytes; actions:
              set-prec-transmit 1
            exceeded 0 packets, 0 bytes; actions:
              set-prec-transmit 0
            violated 0 packets, 0 bytes; actions:
              drop
            conformed 0 bps, exceed 0 bps, violate 0 bps

        Class-map: class-default (match-any)
          0 packets, 0 bytes
          30 second offered rate 0 bps, drop rate 0 bps
          Match: any

!
! The above statistics demonstrate that R1 uses almost all available bandwidth.
! From the output below we can see that R1 is set to CIR 64Kbps and PIR 128Kbps.
! We may also notice that the shaper was active for some time, delaying hundreds of
! exceeding packets. This usually happens at the beginning of a TCP session when
! the sender aggressively increases its sending rate.
!

Rack1R1#show policy-map interface fastEthernet 0/0
 FastEthernet0/0 

  Service-policy output: POLICY_VLAN146_OUT

    Class-map: HTTP (match-all)
      3225 packets, 1897929 bytes
      30 second offered rate 124000 bps, drop rate 0 bps
      Match: access-group 180
      Traffic Shaping
           Target/Average   Byte   Sustain   Excess    Interval  Increment
             Rate           Limit  bits/int  bits/int  (ms)      (bytes)
           128000/64000     1600   6400      6400      100       1600     

        Adapt  Queue     Packets   Bytes     Packets   Bytes     Shaping
        Active Depth                         Delayed   Delayed   Active
        -      0         3225      1897929   348       205320    no

    Class-map: class-default (match-any)
      29 packets, 4378 bytes
      30 second offered rate 0 bps, drop rate 0 bps
      Match: any

Now start another file transfer, this time from R6 to a host behind R5, across the serial link. This will make both flows compete for the link bandwidth, and result in fair sharing of the link. Now verify the policer statistics once again:

Rack1R4#show policy-map interface fastEthernet 0/1
 FastEthernet0/1 

  Service-policy input: POLICE_VLAN146

    Class-map: HTTP (match-all)
      35113 packets, 20715559 bytes
      30 second offered rate 126000 bps, drop rate 0 bps
      Match: access-group name HTTP

      Service-policy : SUBRATE_POLICER

        Class-map: FROM_R1 (match-all)
          29986 packets, 17691740 bytes
          30 second offered rate 63000 bps, drop rate 0 bps
          Match: access-group name FROM_R1
          police:
              cir 64000 bps, bc 3200 bytes
              pir 128000 bps, be 6400 bytes
            conformed 18466 packets, 10894940 bytes; actions:
              set-prec-transmit 1
            exceeded 11520 packets, 6796800 bytes; actions:
              set-prec-transmit 0
            violated 0 packets, 0 bytes; actions:
              drop
            conformed 63000 bps, exceed 0 bps, violate 0 bps

        Class-map: FROM_R6 (match-all)
          5127 packets, 3023819 bytes
          30 second offered rate 63000 bps, drop rate 0 bps
          Match: access-group name FROM_R6
          police:
              cir 64000 bps, bc 3200 bytes
              pir 128000 bps, be 6400 bytes
            conformed 5124 packets, 3022049 bytes; actions:
              set-prec-transmit 1
            exceeded 3 packets, 1770 bytes; actions:
              set-prec-transmit 0
            violated 0 packets, 0 bytes; actions:
              drop
            conformed 63000 bps, exceed 0 bps, violate 0 bps

        Class-map: class-default (match-any)
          0 packets, 0 bytes
          30 second offered rate 0 bps, drop rate 0 bps
          Match: any

!
! Verify statistics for both traffic shapers on R1 and R6. Both are set for PIR=128Kbps.
! However, the metered rate is close to CIR, and the shaping is inactive. The sending rate
! went down thanks to TCP's implicit congestion management procedure, which makes the
! protocol's sending rate adapt to congestion in the network.
!

Rack1R6#show policy-map interface fastEthernet 0/0.146
 FastEthernet0/0.146 

  Service-policy output: POLICY_VLAN146_OUT

    Class-map: HTTP (match-all)
      6846 packets, 4065413 bytes
      5 minute offered rate 63000 bps, drop rate 0 bps
      Match: access-group 180
      Traffic Shaping
           Target/Average   Byte   Sustain   Excess    Interval  Increment
             Rate           Limit  bits/int  bits/int  (ms)      (bytes)
           128000/64000     1600   6400      6400      100       1600     

        Adapt  Queue     Packets   Bytes     Packets   Bytes     Shaping
        Active Depth                         Delayed   Delayed   Active
        -      0         6846      4065413   3         1782      no

    Class-map: class-default (match-any)
      191 packets, 43930 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
      Match: any

Rack1R1#show policy-map interface fastEthernet 0/0
 FastEthernet0/0 

 Service-policy output: POLICY_VLAN146_OUT

    Class-map: HTTP (match-all)
      33062 packets, 19505469 bytes
      30 second offered rate 63000 bps, drop rate 0 bps
      Match: access-group 180
      Traffic Shaping
           Target/Average   Byte   Sustain   Excess    Interval  Increment
             Rate           Limit  bits/int  bits/int  (ms)      (bytes)
           128000/64000     1600   6400      6400      100       1600     

        Adapt  Queue     Packets   Bytes     Packets   Bytes     Shaping
        Active Depth                         Delayed   Delayed   Active
        -      0         33062     19505469  2632      1552858   no

    Class-map: class-default (match-any)
      7641 packets, 7385752 bytes
      30 second offered rate 0 bps, drop rate 0 bps
      Match: any

Now let’s confirm that WFQ is actually working on the serial interface between R4 and R5 and provides a truly fair division of the bandwidth:

Rack1R4#show queueing interface serial 0/1
Interface Serial0/1 queueing strategy: fair
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: weighted fair
  Output queue: 12/1000/64/0 (size/max total/threshold/drops)
     Conversations  2/3/256 (active/max active/max total)
     Reserved Conversations 0/0 (allocated/max allocated)
     Available Bandwidth 96 kilobits/sec

(depth/weight/total drops/no-buffer drops/interleaves) 6/16192/0/0/0
Conversation 134, linktype: ip, length: 580
source: 155.1.146.1, destination: 155.1.58.8, id: 0xEB41, ttl: 254,
TOS: 32 prot: 6, source port 80, destination port 11001

(depth/weight/total drops/no-buffer drops/interleaves) 6/16192/0/0/0
Conversation 192, linktype: ip, length: 580
source: 155.1.146.6, destination: 155.1.108.10, id: 0x70CA, ttl: 254,
TOS: 32 prot: 6, source port 80, destination port 11002

To summarize, shape peak is a special form of shaping specifically adapted to oversubscription scenarios. All other properties of GTS remain the same.

August 25th, 2008

Documentation Update for the CCIE Lab

From Cisco:

<Quote>

CCIE labs changing from UniversCD to Cisco Documentation

On Sept 24 2008 CCIE labs will no longer support using the UniversCD documentation for the lab exam.

All labs are migrating to Cisco Documentation only. For those scheduled to take the CCIE lab prior to Sept 24 access will still be available for UniversCD.

The Cisco Documentation pages have the same information that currently resides on UniversCD, please refer to the links on the CCIE web pages to view these pages and become familiar with the new format.

After Sept 24 2008 only the Cisco Documentation web pages will be available for CCIE labs.

</Quote>

So what does this mean for people taking the lab after Sept 24th?  It means you will still have access to everything needed in relation to the documentation but you will need to access it using the link below:

http://cisco.com/web/psa/products/tsd_products_support_configure.html

August 21st, 2008

IEWB-RS Vol 1 V5 Alpha OSPF Labs Posted

A large portion of the OSPF section from the new IEWB-RS Volume 1 Version 5.0 has been posted on the members site.  This section is still “alpha” because it’s only about 1/3rd of the total content, but I wanted to get something posted due to the large number of requests from eager candidates I’m getting :)  The sections posted so far are complete, the alpha designation just means that there is much more to come later.

Please post all questions and comments on www.IEOC.com under the CCIE R&S Lab Workbook Volume I Version 5.0 forum.

Happy Labbing!

August 20th, 2008

CCIE Wireless Officially Announced!

GENERAL ANNOUNCEMENT:

Cisco is now soliciting beta candidates for Cisco’s upcoming CCIE Wireless Written Exam. We are looking for an exclusive set of professional and expert level Wireless Networking Engineers who can dedicate 3 hours of their time to take the beta exam.

The CCIE Wireless certification, to be launched later this year, will validate that professionals have the expertise to design, manage and support mission and business critical wireless networks and the job skills and technical knowledge required of expert level network IT practitioners. This written exam is the first step in obtaining the CCIE Wireless certification. Successful candidates will have mastered broad theoretical knowledge of wireless networking and demonstrated a readiness for the CCIE Wireless lab examination. Be one of the first wireless professionals to get a peek at the new CCIE Wireless exam.

  • Location: Pearson Vue Testing Centers, Worldwide
  • Registration Date: October 7
  • Last Test Date: November 14, 2008
  • Cost: $US50
  • Exam Number/Name: 351-050 CCIE Wireless Beta Written Exam


August 17th, 2008

Insights on CBWFQ

Think you know everything about CBWFQ? :) OK, then look at the following configuration:


class-map match-all HTTP_R6
 match access-group name HTTP_R6
!
policy-map CBWFQ
 class HTTP_R6
  bandwidth remaining percent 5
!
interface Serial 0/1
  bandwidth 128
  clock rate 128000
  service-policy output CBWFQ

and try answering a question about the following imaginary scenario: Two TCP flows (think of them as HTTP file transfers) are going across the Serial 0/1 interface. One of the flows matches the class HTTP_R6, and the other flow, marked with IP Precedence 7, does not match any class. The traffic overwhelms the interface, so the system engages CBWFQ. Now the question is: how will CBWFQ share the interface bandwidth among the flows?

This is not an easy question. It is hard to answer if you try to stick with the bandwidth allocation logic described on the DocCD. First, look at the answer:


Rack1R4#show policy-map interface serial 0/1
 Serial0/1 

  Service-policy output: CBWFQ

    Class-map: HTTP_R6 (match-all)
      10982 packets, 6368035 bytes
      30 second offered rate 95000 bps, drop rate 0 bps
      Match: access-group name HTTP_R6
      Queueing
        Output Queue: Conversation 41
        Bandwidth remaining 5 (%)Max Threshold 64 (packets)
        (pkts matched/bytes matched) 10973/6363227
        (depth/total drops/no-buffer drops) 8/0/0

    Class-map: class-default (match-any)
      3429 packets, 1765978 bytes
      30 second offered rate 29000 bps, drop rate 0 bps
      Match: any

The bandwidth is shared approximately in the proportion “3.3:1”. The user-configured class gets more bandwidth than “class-default”, even though we gave it just 5% of the available bandwidth… That does not look very predictable. What about the idea that unused bandwidth goes to “class-default”? :) Right now, we are going to show you how CBWFQ (well, at least the 12.4 implementation) works.

The following holds true when thinking of CBWFQ:

1) CBWFQ works the same way as WFQ. You just have the option to use flexible criteria for flow classification using MQC syntax.
2) CBWFQ shares interface bandwidth in inverse proportion to flow weights. If you have N flows, where flow “i” has a weight value of Weight(i), CBWFQ will guarantee flow “i” the following share of bandwidth: Share(i)=(Weight(1)+…+Weight(i)+…+Weight(N))/Weight(i). Thus, flows with smaller weights get more bandwidth. Note that you should treat those values as relative to each other, not as absolute shares.
3) CBWFQ assigns weights to dynamic conversations (flows that don’t match any user-defined class) using the formula Weight(i) = 32384/(IP_Precedence(i)+1).
4) CBWFQ assigns a weight to a user-defined class using either of the following formulas:
4.1) Weight(i) = Const*Interface_BW/Class_BW if the class is configured with an explicit bandwidth value.
4.2) Weight(i) = Const*100/Bandwidth_Percent if the class is configured with either bandwidth percent or bandwidth remaining percent.

Here Const is a special constant that depends on the number of flow queues in WFQ. Cisco never gave any explicit formula, but it looks like we may choose the constant according to the following table:

Number of flows | Constant
16 | 64
32 | 64
64 | 57
128 | 30
256 | 16
512 | 8
1024 | 4
2048 | 2
4096 | 1
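Plugging the opening example into these formulas shows where the odd-looking split comes from. The Python sketch below is just that arithmetic: the function and dictionary names are invented for this example, and it assumes 32 dynamic queues (constant 64), which is consistent with the weight of 1280 that the router reports for the HTTP_R6 conversation in the queue output further down.

```python
# Empirical constants from the table above, keyed by number of dynamic flow queues.
CONST_BY_FLOWS = {16: 64, 32: 64, 64: 57, 128: 30, 256: 16,
                  512: 8, 1024: 4, 2048: 2, 4096: 1}


def dynamic_weight(ip_precedence):
    """Weight of an unclassified (dynamic) flow: 32384 / (precedence + 1)."""
    return 32384 / (ip_precedence + 1)


def class_weight_percent(bandwidth_percent, dynamic_flows=32):
    """Weight of a class using 'bandwidth percent' or 'bandwidth remaining percent'."""
    return CONST_BY_FLOWS[dynamic_flows] * 100 / bandwidth_percent


def shares(weights):
    """Normalised bandwidth shares, inversely proportional to the weights."""
    inverse = {name: 1 / weight for name, weight in weights.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


weights = {
    "HTTP_R6": class_weight_percent(5),   # bandwidth remaining percent 5
    "prec7_flow": dynamic_weight(7),      # unclassified flow, IP Precedence 7
}
result = shares(weights)
print(weights)
print(result)
print("ratio", result["HTTP_R6"] / result["prec7_flow"])
```

The class weight comes out as 1280 and the dynamic flow weight as 4048, so the predicted split is about 3.16:1 in favour of HTTP_R6 (roughly 76% versus 24% of the link). That is close to the 95000 bps versus 29000 bps measured above.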

That’s all the mechanics behind the beautiful meaning of the “bandwidth” statement. As you can see, user-configurable classes are nothing more than separate conversations within the CBWFQ flow pool. Each flow is simply a FIFO queue, scheduled according to its sequence number, which is proportional to the flow weight (btw, the formula for the sequence number is the same as with WFQ). All flows share the buffer pool that the system allocates to CBWFQ using the hold-queue N out interface-level command, where N is the number of buffers. In addition to that, you can even specify the WFQ Congestive Discard Threshold using the command queue-limit under the “class-default” of your policy map.


!
! Implementing pure WFQ using MQC syntax
!
policy-map WFQ
 class class-default
   !
   ! number of dynamic flows
   !
   fair-queue 256
   !
   ! WFQ Congestive Discard Threshold
   !
   queue-limit 32
!
interface Serial 0/1
 no fair-queue
 service-policy output WFQ
 !
 ! WFQ total size
 !
 hold-queue 4096 out

Now look at the following table:

Flow/Conversation Numbers | Weight | Description
Below 2^N | Weight(i)=32384/(IP_Precedence(i)+1) | Dynamic flows, unclassified traffic. This is the classic “fair-queue”.
2^N…2^N+7 | Weight(i)=1024 | Link Queues. Routing updates, Layer 2 keepalives etc. Basically it’s the traffic marked as PAK_PRIORITY inside the router.
2^N+8 | Weight(i)=0 | LLQ or the priority queue. CBWFQ always services this queue first, but de-queued packets are policed using the defined token bucket parameters.
Above 2^N+8 | Weight(i) = Const*Interface_BW/Class_BW OR Weight(i)=Const*100/Bandwidth_Percent | User-defined classes. CBWFQ treats those classes like RSVP flows, with relatively low weights. Their weights are almost always better than the weights of dynamic flows.

A few notes here. The value of “N” is the base parameter that defines the number of dynamic flows for CBWFQ. Remember that you can only specify the number of flows as a power of 2, and that “N” is this power value. You configure the number of dynamic flows using the command fair-queue under “class-default”. Next, CBWFQ uses a special hash function to distribute unclassified packets among dynamic conversations. They have the same weights as they would have with classic WFQ. Now the Link Queues: we remember they were present with WFQ as well. The system uses those queues to send critical control plane traffic. The link queues have weight values of 1024, which is much better than any dynamic flow’s weight and, as we will see later, almost on par with the weights of user-defined classes. Since control plane traffic is intermittent (unless you pump huge BGP tables ;) those flows do not affect bandwidth distribution too much. By the way, the well-known max-reserved-bandwidth 75% rule specifically ensures that the link queues will not starve.
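The conversation number ranges from the table can be expressed as a small lookup. This Python sketch is only a restatement of the table; the function name and the returned labels are made up for this example.

```python
def classify_conversation(number, dynamic_queues):
    """Map a conversation number to its queue type, given 2^N dynamic queues."""
    if dynamic_queues <= 0 or dynamic_queues & (dynamic_queues - 1):
        raise ValueError("number of dynamic queues must be a power of 2")
    if number < dynamic_queues:
        return "dynamic flow"            # below 2^N
    if number <= dynamic_queues + 7:
        return "link queue"              # 2^N ... 2^N+7
    if number == dynamic_queues + 8:
        return "priority queue (LLQ)"    # 2^N+8
    return "user-defined class"          # above 2^N+8


# With 32 dynamic queues, the first user-defined class lands on conversation 41.
print(classify_conversation(41, 32))
# With 256 dynamic queues, conversations 134 and 192 are ordinary dynamic flows.
print(classify_conversation(134, 256), classify_conversation(192, 256))
```

This lines up with the outputs in these posts: the HTTP_R6 class is reported as "Output Queue: Conversation 41" on an interface with 32 total conversations, while the two flows in the plain WFQ example use conversations 134 and 192 out of 256.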

Now, take a quick look at the user-defined classes. Think of two extreme cases:

a) We assign all interface bandwidth to the class (you may need max-reserved-bandwidth 100). Then the weight value is 64*100/100=64 in the worst case of just 16 flow queues, which is better than almost any other possible weight. This class will get what it wants (almost), no matter what :)
b) We assign a small amount of interface bandwidth to the class, e.g. 2%. Then the class weight is 64*100/2=3200 in the worst case of 16 or 32 flow queues. This is getting close to 32384/(7+1)=4048, which is the weight value for the “best” dynamic queue with IP Precedence value of 7.

From those two facts, we may conclude that user-defined classes dominate dynamic flows almost all the time, unless they have small shares of bandwidth configured. Of course, the priority queue beats them all, but it is rate-limited, so it can’t starve other conversations unless you set the policer rate to the whole interface bandwidth. By the way, you need to account for layer 2 overhead when setting the rate-limit bandwidth for a priority class. This is important when you are working with voice traffic flows, which have small packet sizes, so the layer 2 header size is significant compared to the payload.
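To play with the two extreme cases, here is a small Python sketch of the weight formulas. It assumes the constant of 64 used in the worked cases here (the constant depends on the number of flow queues), and the helper names are invented for the illustration. Lower weight means better scheduling.

```python
WFQ_BASE = 32384   # numerator of the dynamic-flow weight formula in the text
CLASS_CONST = 64   # constant from the worked cases in the text (an assumption here)


def dynamic_weight(ip_precedence: int) -> float:
    """Weight of a dynamic (class-default) flow for a given IP Precedence."""
    return WFQ_BASE / (ip_precedence + 1)


def class_weight(bandwidth_percent: float, const: int = CLASS_CONST) -> float:
    """Weight of a user-defined class configured with a bandwidth percent."""
    return const * 100 / bandwidth_percent


def precedences_beating_class(bandwidth_percent: float) -> list:
    """IP Precedence values whose dynamic flows get a better (lower) weight."""
    w = class_weight(bandwidth_percent)
    return [p for p in range(8) if dynamic_weight(p) < w]


for pct in (100, 5, 2, 1):
    print(pct, class_weight(pct), precedences_beating_class(pct))
```

Under these assumptions a class with 2% still beats every dynamic flow, while at 1% the dynamic flows with the highest precedence values start to win.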

Keeping all those facts in mind, let’s look at the following output – the CBWFQ queue contents from the first configuration sample:


Rack1R4#show queueing interface serial 0/1
Interface Serial0/1 queueing strategy: fair
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: Class-based queueing
  Output queue: 12/1000/64/0 (size/max total/threshold/drops)
     Conversations  2/5/32 (active/max active/max total)
     Reserved Conversations 1/1 (allocated/max allocated)
     Available Bandwidth 96 kilobits/sec

  (depth/weight/total drops/no-buffer drops/interleaves) 5/1280/0/0/0
  Conversation 41, linktype: ip, length: 580
  source: 155.1.146.6, destination: 155.1.108.10, id: 0x47F9, ttl: 254,
  TOS: 0 prot: 6, source port 80, destination port 11003

  (depth/weight/total drops/no-buffer drops/interleaves) 7/4048/0/0/0
  Conversation 17, linktype: ip, length: 580
  source: 155.1.146.1, destination: 155.1.45.5, id: 0xADC9, ttl: 254,
  TOS: 224 prot: 6, source port 80, destination port 56546

Note the weight values for each flow. The weight for the user-defined HTTP conversation is 64*100/5=1280, while the weight for the dynamic flow is 32384/(7+1)=4048. Thus, using the formula for bandwidth shares, we obtain the following:

Share(1)=(4048+1280)/1280=4.16
Share(2)=(4048+1280)/4048=1.32

Normalize that by dividing by the smallest number, which is 1.32, and you get the proportion “3.16:1”, which is pretty close to the distribution we’ve seen above. Some unfairness is probably due to the slow line and large serialization delays.
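The same arithmetic in Python, as a minimal sketch: each queue’s share is taken to be proportional to the sum of weights divided by its own weight, which is the same as being proportional to 1/weight. The function name relative_shares is just an illustrative helper.

```python
def relative_shares(weights):
    """Return each queue's fraction of the link, proportional to 1/weight."""
    total_weight = sum(weights)
    raw = [total_weight / w for w in weights]   # Share(i) as in the text
    norm = sum(raw)
    return [r / norm for r in raw]


# Weights from the show output: user-defined HTTP class and a dynamic flow.
http_class, dynamic_flow = relative_shares([1280, 4048])
print(f"HTTP class: {http_class:.3f}, dynamic flow: {dynamic_flow:.3f}")
print(f"ratio: {http_class / dynamic_flow:.2f}:1")
```

The ratio works out to 4048/1280, i.e. the inverse ratio of the two weights.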

To summarize what we learned so far:

1) CBWFQ is nothing else than WFQ on steroids ;)
2) User-defined classes have much better scheduling weights than any dynamic flow queue. Therefore, the bandwidth allocated to a dynamic queue is usually small compared to any user-defined class.
3) The scheduler shares interface bandwidth in relative proportions. For example, if you have two classes configured with bandwidth values of “32” and “64” and the interface bandwidth is 128, that does not mean the system will allocate 32Kbps and 64Kbps to the classes. It means that in case of congestion CBWFQ will share bandwidth in proportions 32:64=1:2 between the two classes, plus some small amount to class-default. If you want bandwidth to be “realistic”, ensure all your bandwidth values sum to the interface bandwidth. The same goes for bandwidth percents.
4) If you want the scheduler to honor class-default traffic, assign it an explicit bandwidth value. This will effectively disable dynamic flow queues (though preserve Link Queues) and assign all unclassified traffic to a single FIFO queue.

Now a few simple rules to understand how the various CBWFQ commands apply in case of interface congestion. All those rules assume that the bandwidth weights are large enough to make the dynamic flow weights negligible.

1) If you have priority bandwidth configured in your policy map, subtract this value from the total interface bandwidth to yield the amount of bandwidth available to the other classes. The priority queue is only rate-limited under interface congestion, and in that case it cannot get more bandwidth than configured with the priority statement. Note that in the following text we refer to priority bandwidth as configured in Kbps, but you may replace its value with priority-percent*interface-bandwidth/100 if you configured the rate in percent.

2) Suppose that you configured user-defined classes with the bandwidth statement. First, the IOS CLI will check that:


bandwidth(1)+…+bandwidth(N) + priority <= max_reserved_bandwidth*interface_bandwidth/100.

In case of congestion, the scheduler allocates the following amount of bandwidth to class “k”.


share(k)=(interface_bandwidth - priority) * bandwidth(k)/(bandwidth(1)+…+bandwidth(N))  Kbps.

Therefore, as mentioned above, if you want the share to be equal to the bandwidth you set for the class, make sure all bandwidth settings sum to the interface bandwidth.
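Here is a small Python sketch of the check and the share formula for the bandwidth (Kbps) case. The function names are made up for this illustration, and the default of 75 for max-reserved-bandwidth is the well-known default mentioned earlier; treat it as a model of the formulas in this post, not of the actual IOS code.

```python
def admit(bandwidths_kbps, priority_kbps, interface_kbps, max_reserved_pct=75):
    """Admission check the CLI performs for classes using 'bandwidth <Kbps>'."""
    reserved = sum(bandwidths_kbps) + priority_kbps
    return reserved <= max_reserved_pct * interface_kbps / 100


def congested_shares(bandwidths_kbps, priority_kbps, interface_kbps):
    """Kbps each class gets under congestion, per the share(k) formula."""
    available = interface_kbps - priority_kbps
    total = sum(bandwidths_kbps)
    return [available * bw / total for bw in bandwidths_kbps]


# Two classes with bandwidth 32 and 64 on a 128Kbps interface, no priority class.
print(admit([32, 64], 0, 128))
print(congested_shares([32, 64], 0, 128))
```

With these numbers the classes share the link 1:2 (about 42.7 and 85.3 Kbps), not 32 and 64 Kbps, because the configured values do not sum to the interface bandwidth.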

3) Another case: you configured your classes with bandwidth percent. The IOS CLI performs the following assertion:


[bw_percent(1)+…+bw_percent(N)]*interface_bandwidth/100 + priority <= max_reserved_bandwidth*interface_bandwidth/100

In case of congestion, the scheduler allocates the following amount of bandwidth to class “k”.


share(k)= (interface_bandwidth-priority) * bw_percent(k)/(bw_percent(1)+…+bw_percent(N)).

4) Final case: you configured your classes with bandwidth remaining percent. The IOS CLI performs the following assertion:


bw_rem_percent(1)+…+bw_rem_percent(N) <= 100%

In case of congestion, the scheduler allocates the following amount of bandwidth to class “k”.


share(k)= (interface_bandwidth - priority) * bw_rem_percent(k)/(bw_rem_percent(1)+…+bw_rem_percent(N)).

The funniest thing is that this is the same formula as in the case of simple bandwidth percent. However, the verification is rather simple, and it lets you forget about all those bandwidth computations.
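A quick Python sketch makes the point concrete. It assumes percent values are converted to Kbps by dividing by 100 in the bandwidth percent check, and a max-reserved-bandwidth of 75; the function names and the sample numbers are illustrative only.

```python
def shares_from_percents(percents, priority_kbps, interface_kbps):
    """Kbps per class under congestion; same formula for both percent styles."""
    available = interface_kbps - priority_kbps
    total = sum(percents)
    return [available * p / total for p in percents]


def admit_bandwidth_percent(percents, priority_kbps, interface_kbps,
                            max_reserved_pct=75):
    """Check for 'bandwidth percent': depends on interface and priority rates."""
    reserved = sum(percents) * interface_kbps / 100 + priority_kbps
    return reserved <= max_reserved_pct * interface_kbps / 100


def admit_remaining_percent(percents):
    """Check for 'bandwidth remaining percent': percents must not exceed 100."""
    return sum(percents) <= 100


# 128Kbps interface with a 32Kbps priority class and two other classes.
print(admit_bandwidth_percent([10, 30], 32, 128),
      shares_from_percents([10, 30], 32, 128))
print(admit_remaining_percent([25, 75]),
      shares_from_percents([25, 75], 32, 128))
```

Both configurations end up with the same 1:3 split of the 96 Kbps left after the priority class; only the admission check differs.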

Now you know how to answer those tricky CBWFQ questions :) Cisco never published (at least I have never seen it) the details of the CBWFQ algorithm. All the information in this post is based on simulations and on what is publicly available about Cisco WFQ. Hope it helps!