logo CCIE Blog

Helping you become a Cisco Certified Internetwork Expert


rss Entries (RSS) | rss Comments (RSS)
Welcome to Internetwork Expert's CCIE Blog

Welcome to Internetwork Expert’s CCIE Blog! This site is dedicated to helping you in your pursuit of becoming a Cisco Certified Internetwork Expert in Routing & Switching, Voice, Security, Service Provider, and Storage. Through this blog you can submit questions to our expert instructors, Brian Dennis - Quad-CCIE #2210, Brian McGahan – Triple CCIE #8593, and Petr Lapukhov - Quad-CCIE #16379. Check back daily as this blog will be updated frequently.

Click here to submit a question.

May 5th, 2008

Understanding BGP Outbound Route Filtering (BGP ORF)

Hi Brian,

I’m having a problem with Workbook Volume 1 Version 4.1. ORF (Outbound Route Filtering) isn’t working for me. Any help would be appreciated.

Thank you,

JoeT

Hi Joe,

First off let’s talk a little bit about what BGP ORF (Outbound Route Filtering) is designed to do for us, and then we’ll take a look at some implementation examples.

From a customer’s point of view there are typically a limited amount of choices for what routes you can receive from your Service Provider via BGP. Usually the Service provider will give the customer the option of sending them a full table view (currently about 260,000 prefixes), just a default route, or some specific subset of the table such as a default route and the Service Provider’s locally originated prefixes. In other words a BGP Service Provider generally will not implement a complex outbound filtering policy for the customer. Instead, if the customer wants to receive just a subset view of the BGP table, the Customer Edge (CE) router has to filter prefixes inbound as they are received from the upstream Provider Edge (PE) router.

From the SP’s point of view this is the optimal design for administration. They don’t need to worry about change requests constantly coming from the customer about what routes they want to see and what routes they don’t want to see. Likewise from the customer’s point of view this is the optimal administrative design, as they do not need to send change control requests to the provider, and can arbitrarily change their filtering design on the fly. However from a device resource point of view this is not optimal from both the PE and CE routers’ perspective. The SP’s PE router must still send the full BGP table to the customer, even if the CE router filters out 99% of it. Likewise the CE router must still process all of the BGP UPDATE messages, even if the majority of them are ultimately filtered out.

Let’s take this a look at the result of this in the context of the following design:

AS100 — AS200
(PE) -– (CE)

AS 200 has an upstream peering to its Service Provider, AS 100. The BGP table of AS 200 appears as follows:

AS200_CE#show ip bgp
BGP table version is 12, local router ID is 10.0.0.200
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 0.0.0.0          10.0.0.100               0             0 100 i
*> 28.119.16.0/24   10.0.0.100                             0 100 54 i
*> 28.119.17.0/24   10.0.0.100                             0 100 54 i
*> 112.0.0.0        10.0.0.100                             0 100 54 50 60 i
*> 113.0.0.0        10.0.0.100                             0 100 54 50 60 i
*> 114.0.0.0        10.0.0.100                             0 100 54 i
*> 115.0.0.0        10.0.0.100                             0 100 54 i
*> 116.0.0.0        10.0.0.100                             0 100 54 i
*> 117.0.0.0        10.0.0.100                             0 100 54 i
*> 118.0.0.0        10.0.0.100                             0 100 54 i
*> 119.0.0.0        10.0.0.100                             0 100 54 i

Let’s suppose that from AS 200’s perspective the only routes that they want to receive from AS 100 are the default route plus the networks 28.119.16.0/24 and 28.119.17.0/24. Traditional filtering would dictate that on the CE router a prefix-list would be configured and applied as follows:

router bgp 200
 neighbor 10.0.0.100 remote-as 100
 neighbor 10.0.0.100 prefix-list AS_100_INBOUND in
!
ip prefix-list AS_100_INBOUND seq 5 permit 0.0.0.0/0
ip prefix-list AS_100_INBOUND seq 10 permit 28.119.16.0/24
ip prefix-list AS_100_INBOUND seq 15 permit 28.119.17.0/24

The result of this configuration in AS 200’s BGP table is as follows:

AS200_CE#show ip bgp
BGP table version is 4, local router ID is 10.0.0.200
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 0.0.0.0          10.0.0.100               0             0 100 i
*> 28.119.16.0/24   10.0.0.100                             0 100 54 i
*> 28.119.17.0/24   10.0.0.100                             0 100 54 i

Although the filtering goal is achieved, efficiency is not. From the below debug output we can see exactly how AS 200’s CE router processes the updates from the upstream PE and makes a decision on what to install:

AS200_CE#debug ip bgp updates
BGP updates debugging is on for address family: IPv4 Unicast
AS200_CE#clear ip bgp 100
%BGP-5-ADJCHANGE: neighbor 10.0.0.100 Down User reset
%BGP-5-ADJCHANGE: neighbor 10.0.0.100 Up
BGP(0): 10.0.0.100 rcvd UPDATE w/ attr: nexthop 10.0.0.100, origin i, metric 0, path 100
BGP(0): 10.0.0.100 rcvd 0.0.0.0/0
BGP(0): Revise route installing 1 of 1 routes for 0.0.0.0/0 -> 10.0.0.100(main) to main IP table
BGP(0): 10.0.0.100 rcvd UPDATE w/ attr: nexthop 10.0.0.100, origin i, path 100 54
BGP(0): 10.0.0.100 rcvd 115.0.0.0/8 -- DENIED due to: distribute/prefix-list;
BGP(0): 10.0.0.100 rcvd 114.0.0.0/8 -- DENIED due to: distribute/prefix-list;
BGP(0): 10.0.0.100 rcvd UPDATE w/ attr: nexthop 10.0.0.100, origin i, path 100 54
BGP(0): 10.0.0.100 rcvd 119.0.0.0/8 -- DENIED due to: distribute/prefix-list;
BGP(0): 10.0.0.100 rcvd 118.0.0.0/8 -- DENIED due to: distribute/prefix-list;
BGP(0): 10.0.0.100 rcvd 117.0.0.0/8 -- DENIED due to: distribute/prefix-list;
BGP(0): 10.0.0.100 rcvd 116.0.0.0/8 -- DENIED due to: distribute/prefix-list;
BGP(0): 10.0.0.100 rcvd UPDATE w/ attr: nexthop 10.0.0.100, origin i, path 100 54
BGP(0): 10.0.0.100 rcvd 28.119.17.0/24
BGP(0): 10.0.0.100 rcvd 28.119.16.0/24
BGP(0): 10.0.0.100 rcvd UPDATE w/ attr: nexthop 10.0.0.100, origin i, path 100 54 50 60
BGP(0): 10.0.0.100 rcvd 113.0.0.0/8 -- DENIED due to: distribute/prefix-list;
BGP(0): 10.0.0.100 rcvd 112.0.0.0/8 -- DENIED due to: distribute/prefix-list;
BGP(0): Revise route installing 1 of 1 routes for 28.119.16.0/24 -> 10.0.0.100(main) to main IP table
BGP(0): Revise route installing 1 of 1 routes for 28.119.17.0/24 -> 10.0.0.100(main) to main IP table
AS200_CE#

Note that the AS200_CE router generates the log message “DENIED due to: distribute/prefix-list;” for every prefix that is filtered. This means that if this were the public BGP table of 260,000+ routes this router would have to process every update message just to discard it. This is where BGP Outbound Route Filtering (ORF) comes in.

Outbound Route Filtering Capability for BGP-4 is currently an IETF draft (http://www.ietf.org/internet-drafts/draft-ietf-idr-route-filter-16.txt) that describes an optimization on how prefix filtering can occur between a Customer Edge (CE) router and a Provider Edge (PE) router that are exchanging IPv4 unicast BGP prefixes. In the design we saw above the upstream PE router sent the full BGP table downstream to the CE router, and filtering was applied inbound on the downstream CE. With BGP ORF the downstream CE router dynamically tells the upstream PE router what routes to filter *outbound*. This means that the downstream CE router will only receive update messages about the prefixes that it wants.

Implementation wise the first step of this feature is for the BGP neighbors to negotiate that they both support the BGP ORF capability. Configuration-wise this looks as follows:

AS100_PE#
router bgp 100
 neighbor 10.0.0.200 remote-as 200
 !
 address-family ipv4
 neighbor 10.0.0.200 capability orf prefix-list receive
 neighbor 204.12.25.254 activate
 exit-address-family

AS200_CE#
router bgp 200
 neighbor 10.0.0.100 remote-as 100
 !
 address-family ipv4
 neighbor 10.0.0.100 capability orf prefix-list send
 neighbor 10.0.0.100 prefix-list AS_100_INBOUND in
 exit-address-family
!

The result of this configuration on AS 200’s CE is the same, however the behind the scenes mechanism by which it is accomplished is different. First, AS100_PE and AS200_CE negotiate the BGP ORF capability during initial BGP peering establishment. The success of this negotiation can be seen as follows.

AS100_PE#show ip bgp neighbors 10.0.0.200 | begin AF-dependant capabilities:
  AF-dependant capabilities:
    Outbound Route Filter (ORF) type (128) Prefix-list:
      Send-mode: received
      Receive-mode: advertised
  Outbound Route Filter (ORF): received (3 entries)
                                 Sent       Rcvd
  Prefix activity:               ----       ----
    Prefixes Current:               2          0
    Prefixes Total:                 2          0
    Implicit Withdraw:              0          0
    Explicit Withdraw:              0          0
    Used as bestpath:             n/a          0
    Used as multipath:            n/a          0

                                   Outbound    Inbound
  Local Policy Denied Prefixes:    --------    -------
    ORF prefix-list:                      8        n/a
    Total:                                8          0
  Number of NLRIs in the update sent: max 4, min 2

*OUTPUT OMITTED*

AS200_CE#show ip bgp neighbors 10.0.0.100 | begin AF-dependant capabilities:
  AF-dependant capabilities:
    Outbound Route Filter (ORF) type (128) Prefix-list:
      Send-mode: advertised
      Receive-mode: received
  Outbound Route Filter (ORF): sent;
  Incoming update prefix filter list is AS_100_INBOUND
                                 Sent       Rcvd
  Prefix activity:               ----       ----
    Prefixes Current:               0          3 (Consumes 156 bytes)
    Prefixes Total:                 0          4
    Implicit Withdraw:              0          1
    Explicit Withdraw:              0          0
    Used as bestpath:             n/a          3
    Used as multipath:            n/a          0

                                   Outbound    Inbound
  Local Policy Denied Prefixes:    --------    -------
    Suppressed duplicate:                 0          1
    Bestpath from this peer:              3        n/a
    Total:                                3          1
  Number of NLRIs in the update sent: max 0, min 0

*OUTPUT OMITTED*

Next, AS 200’s CE router tells AS 100’s PE router which prefixes it wants to receive. The logic of this configuration is that AS 200 is “sending” a prefix-list of what routes it wants, while AS 100 is “receiving” the prefix-list of what the downstream neighbor wants. The reception of the prefix-list by the upstream PE can be verified as follows.

AS100_PE#show ip bgp neighbors 10.0.0.200 received prefix-filter
Address family: IPv4 Unicast
ip prefix-list 10.0.0.200: 3 entries
   seq 5 permit 0.0.0.0/0
   seq 10 permit 28.119.16.0/24
   seq 15 permit 28.119.17.0/24

AS100_PE#show ip prefix-list

Note that AS 100’s PE router received the list from AS 200’s CE, but the prefix-list does not show up locally in the running config. AS 100’s PE router then turns around and uses the prefix-list as an outbound filter towards the downstream CE. This can be verified two ways, by viewing the UPDATE messages on the downstream CE, and by looking at what the upstream PE is sending.

AS100_PE#show ip bgp neighbors 10.0.0.200 advertised-routes
BGP table version is 11, local router ID is 10.0.0.100
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

Originating default network 0.0.0.0

   Network          Next Hop            Metric LocPrf Weight Path
*> 28.119.16.0/24   204.12.25.254            0             0 54 i
*> 28.119.17.0/24   204.12.25.254            0             0 54 i

Total number of prefixes 2
AS100_PE#

AS200_CE#debug ip bgp updates
BGP updates debugging is on for address family: IPv4 Unicast
AS200_CE#clear ip bgp 100
AS200_CE#
BGP(0): no valid path for 0.0.0.0/0
BGP(0): no valid path for 28.119.16.0/24
BGP(0): no valid path for 28.119.17.0/24
%BGP-5-ADJCHANGE: neighbor 10.0.0.100 Down User reset
BGP(0): nettable_walker 0.0.0.0/0 no best path
BGP(0): nettable_walker 28.119.16.0/24 no best path
BGP(0): nettable_walker 28.119.17.0/24 no best path
%BGP-5-ADJCHANGE: neighbor 10.0.0.100 Up
BGP(0): 10.0.0.100 rcvd UPDATE w/ attr: nexthop 10.0.0.100, origin i, metric 0, path 100
BGP(0): 10.0.0.100 rcvd 0.0.0.0/0
BGP(0): Revise route installing 1 of 1 routes for 0.0.0.0/0 -> 10.0.0.100(main) to main IP table
BGP(0): 10.0.0.100 rcvd UPDATE w/ attr: nexthop 10.0.0.100, origin i, path 100 54
BGP(0): 10.0.0.100 rcvd 28.119.17.0/24
BGP(0): 10.0.0.100 rcvd 28.119.16.0/24
BGP(0): Revise route installing 1 of 1 routes for 28.119.16.0/24 -> 10.0.0.100(main) to main IP table
BGP(0): Revise route installing 1 of 1 routes for 28.119.17.0/24 -> 10.0.0.100(main) to main IP table
BGP(0): 10.0.0.100 rcvd UPDATE w/ attr: nexthop 10.0.0.100, origin i, metric 0, path 100
BGP(0): 10.0.0.100 rcvd 0.0.0.0/0...duplicate ignored
AS200_CE#

Note that the above output is different from the previous debug of AS 200’s CE, because now it does not receive the extra update messages. AS 200 instead now receives only the routes that it has requested of the upstream PE.

If edits of the filter are required the downstream CE can change the prefix-list, and then through the BGP Route Refresh capability, advertise the new prefix-list upstream to the PE to be used as a new downstream filter. Configuration wise this is accomplished as follows.

AS200_CE#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
AS200_CE(config)#ip prefix-list AS_100_INBOUND permit 114.0.0.0/8
AS200_CE(config)#end
AS200_CE#
%SYS-5-CONFIG_I: Configured from console by console
AS200_CE#clear ip bgp 100 in prefix-filter
AS200_CE#
BGP(0): 10.0.0.100 rcvd UPDATE w/ attr: nexthop 10.0.0.100, origin i, path 100 54
BGP(0): 10.0.0.100 rcvd 114.0.0.0/8
BGP(0): 10.0.0.100 rcvd UPDATE w/ attr: nexthop 10.0.0.100, origin i, path 100 54
BGP(0): 10.0.0.100 rcvd 28.119.17.0/24...duplicate ignored
BGP(0): 10.0.0.100 rcvd 28.119.16.0/24...duplicate ignored
BGP(0): Revise route installing 1 of 1 routes for 114.0.0.0/8 -> 10.0.0.100(main) to main IP table
BGP(0): 10.0.0.100 rcvd UPDATE w/ attr: nexthop 10.0.0.100, origin i, metric 0, path 100
BGP(0): 10.0.0.100 rcvd 0.0.0.0/0...duplicate ignored
AS200_CE#

From the “debug ip bgp updates” output we can now see that the upstream PE added the update 114.0.0.0/8, in addition to the previous three prefixes that were installed. Upstream verification on the PE also indicates this.

AS100_PE#show ip bgp neighbors 10.0.0.200 received prefix-filter
Address family: IPv4 Unicast
ip prefix-list 10.0.0.200: 4 entries
   seq 5 permit 0.0.0.0/0
   seq 10 permit 28.119.16.0/24
   seq 15 permit 28.119.17.0/24
   seq 20 permit 114.0.0.0/8
AS100_PE#show ip bgp neighbors 10.0.0.200 advertised-routes
BGP table version is 11, local router ID is 10.0.0.100
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

Originating default network 0.0.0.0

   Network          Next Hop            Metric LocPrf Weight Path
*> 28.119.16.0/24   204.12.25.254            0             0 54 i
*> 28.119.17.0/24   204.12.25.254            0             0 54 i
*> 114.0.0.0        204.12.25.254                          0 54 i

Total number of prefixes 3
AS100_PE#

For more information on this feature:

Outbound Route Filtering Capability for BGP-4
http://www.ietf.org/internet-drafts/draft-ietf-idr-route-filter-16.txt

BGP Prefix-Based Outbound Route Filtering
http://www.cisco.com/en/US/docs/ios/12_2t/12_2t11/feature/guide/ft11borf.html

February 22nd, 2008

Using TCL and Macro Ping Scripts for CCIE Lab Reachability Testing

One common problem that causes candidates to fail the CCIE Routing & Switching Lab Exam is the lack of complete IP reachability to various segments used in the network topology. However, due to the short time constraints of the lab exam itself it can be difficult to dedicate enough time to properly verify that reachability exists between all relevant segments. In order to solve this problem two very useful features can be implemented during the lab exam, TCL scripting on the routers and macro scripting on the Catalyst switches.

TCL (Tool Control Language) is a scripting language used extensively by Cisco to facilitate the testing and automating of various functions in the IOS. For example advanced implementations on IOS can go as far as programming a router to send you an email when its interface utilization exceeds the normally defined average. In our case we will be using very basic TCL programming to sequentially run the “ping” command.

Macro scripting on the Catalyst switches is a simple way to define templates of configuration that can be applied globally or to interfaces by issuing a single command. Examples of predefined macros include the “switchport host” command, which enables portfast, sets an interface to access mode, and disabled DTP, and the “auto qos” feature. For our implementation we will be using the macros to run pings commands sequentially.

The first step in configuring a ping script is to collect the IP addresses that will be tested in the topology. There are two simple ways to do this, either through the “show ip interface brief” command or the “show ip alias”. Both of these commands show local IP addresses allocated on the device and any addresses that are being proxied for (i.e. dns proxy or NAT). The “show ip interface brief” output can be filtered through quickly by using the “show ip interface brief | exclude unassigned” command so that only interfaces with IP addresses are listed.

Next we’ll need to copy these addresses into a text editor (i.e. Windows notepad in the CCIE lab exam), and do some basic manipulation. To do so use the column select feature of SecureCRT (the terminal emulator used in the lab exam) by holding down the ALT key and then selecting with your left mouse button. This will minimize the amount of text that we have to sort through. Once you’re done you should have a neat list of addresses in notepad that looks something like this:


192.168.255.1
192.168.255.2
192.168.255.3
192.168.255.4
192.168.255.5
192.168.255.6
192.168.255.7
192.168.255.8
192.168.255.9
192.168.255.10

Next we need to manipulate these addresses for TCL on the routers and macro scripting on the switches. For TCL we are going to define the IP addresses as an array, and then use a “for” loop to run the ping command with the array as its argument. The syntax in IOS 12.4 for this implementation is as follows:


foreach VAR {
192.168.255.1
192.168.255.2
192.168.255.3
192.168.255.4
192.168.255.5
192.168.255.6
192.168.255.7
192.168.255.8
192.168.255.9
192.168.255.10
} { puts [exec "ping $VAR"] }

Specifically we are defining the array “VAR”, which represents the IP addresses, then sending the exec command “ping” with “VAR” as the argument. Now to actually run the script in IOS we first must invoke the TCL shell. This is accomplished with the exec command “tclsh”.


Rack17R1#tclsh
+>

Now paste the script from notepad into the router and it should run the pings:


Rack17R1#tclsh
+>foreach VAR {
+>192.168.255.1
+>192.168.255.2
+>192.168.255.3
+>192.168.255.4
+>192.168.255.5
+>192.168.255.6
+>192.168.255.7
+>192.168.255.8
+>192.168.255.9
+>192.168.255.10
+>} { puts [exec "ping $VAR"] }

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.255.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/2/4 ms

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.255.2, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 56/58/60 ms

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.255.3, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 84/86/89 ms

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.255.4, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 140/147/164 ms

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.255.5, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 140/142/144 ms

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.255.6, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 56/59/65 ms

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.255.7, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/2/4 ms

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.255.8, timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.255.9, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/2/4 ms

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.255.10, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 56/57/60 ms

Rack17R1(tcl)#

Note that in the above test we can see that the address 192.168.255.8 is unreachable. Once TCL is completed we must now make sure to exit the shell. This can be accomplished with either the “tclquit” command, or simply typing “exit” to leave the exec process. Unless TCL is exited the IOS can have some unexpected behavior, like the below example:


Rack17R1(tcl)#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
Rack17R1(config)#route-map TCL_BREAKS_SET_COMMAND

Rack17R1(config-route-map)#set local-preference 100
100
Rack17R1(config-route-map)#end

Rack17R1(tcl)#
*Mar  1 05:45:49.906: %SYS-5-CONFIG_I: Configured from console by console
Rack17R1(tcl)#show run | section route-map
route-map TCL_BREAKS_SET_COMMAND permit 10

Rack17R1(tcl)#tclquit
Rack17R1#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
Rack17R1(config)#route-map SET_FIXED_WITH_TCL_EXITED
Rack17R1(config-route-map)#set local-preference 100
Rack17R1(config-route-map)#end
Rack17R1#
*Mar  1 05:46:28.882: %SYS-5-CONFIG_I: Configured from console by console
Rack17R1#show run | section route-map
route-map TCL_BREAKS_SET_COMMAND permit 10
route-map SET_FIXED_WITH_TCL_EXITED permit 10
 set local-preference 100

The above output shows that TCL was invoked, then a global route-map named “TCL_BREAKS_SET_COMMAND” was defined. In this route-map the “set” command was issued. However with TCL still running the “set” command is first interpreted by TCL as an attempt to set an environment variable, instead of sent to the exec process itself. The result is that instead of setting local-preference for the purpose of BGP manipulation we are actually creating a variable named “local-preference” that has a value of 100. Once the “tclquit” command is issued we can see that the next route-map, “SET_FIXED_WITH_TCL_EXITED” successfully uses the set command.

Next for macro scripting on the Catalyst switches we will take the same list of IP addresses used before and prefix them with the commands “do ping”. The resulting list in notepad should look like this:


do ping 192.168.255.1
do ping 192.168.255.2
do ping 192.168.255.3
do ping 192.168.255.4
do ping 192.168.255.5
do ping 192.168.255.6
do ping 192.168.255.7
do ping 192.168.255.8
do ping 192.168.255.9
do ping 192.168.255.10

Next we’ll insert these commands as a sequence in a global macro as follows:


Rack17SW1#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
Rack17SW1(config)#macro name PING_SCRIPT
Enter macro commands one per line. End with the character '@'.
do ping 192.168.255.1
do ping 192.168.255.2
do ping 192.168.255.3
do ping 192.168.255.4
do ping 192.168.255.5
do ping 192.168.255.6
do ping 192.168.255.7
do ping 192.168.255.8
do ping 192.168.255.9
do ping 192.168.255.10
@
Rack17SW1(config)#

We now have a script named “PING_SCRIPT” that can be run from global configuration. To apply the script use the following syntax:


Rack17SW1(config)#macro global apply PING_SCRIPT

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.255.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/4/9 ms
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.255.2, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 51/57/59 ms
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.255.3, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 25/30/34 ms
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.255.4, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 84/89/101 ms
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.255.5, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 83/87/93 ms
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.255.6, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 50/57/59 ms
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.255.7, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/1/1 ms
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.255.8, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/1/1 ms
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.255.9, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/1/1 ms
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.255.10, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 50/57/59 ms
Rack17SW1(config)#

Once you are proficient in this methodology building both the TCL and macro scripts should only take about 2 – 3 minutes in the lab, with another 5 minutes to run them. The resulting time savings versus running manual pings to all segments, and the resulting information on what is reachable and what is not can be the difference between passing and failing the CCIE lab exam.

February 6th, 2008

Understanding BGP Port Numbers

Guys,

I ran into a task in a lab to configure an ACL to allow BGP and the book has it configured like this:

Permit tcp host 150.1.5.5 eq bgp host 150.1.4.4
Permit tcp host 150.1.5.5 host 150.1.4.4 eq bgp

Is there a reason why it is configured like that instead of ‘Permit tcp host 150.1.5.5 eq bgp host 150.1.4.4 eq bgp’ ?

The only reason I could think of was to allow source/destination traffic from tcp 179 to a different port.

Ted

Hi Ted,

Your reasoning is correct; it is because BGP uses different source and destination ports other than 179 depending on who originates the session. BGP is essentially a standard TCP based protocol, which means that it is client and server based.

When a TCP client attempts to establish a connection to a TCP server it first sends a TCP SYN packet to the server with the destination port as the well known port. This first SYN essentially is a request to open a session. If the server permits the session it will respond with a TCP SYN ACK saying that it acknowledges the request to open the session, and that it also wants to open the session. In this SYN ACK response the server uses the well known port as the source port, and a randomly negotiated destination port. The last step of the three way handshake is the client responding to the server with a TCP ACK, which acknowledges the server’s response and completes the connection establishment.

Now from the perspective of BGP specifically the TCP clients and servers are routers. When the “client” router initiates the BGP session is sends a request to the server with a destination port of 179 and a random source port X. The server then responds with a source port of 179 and a destination port of X. Therefore all client to server traffic uses destination 179, while all server to client traffic uses source 179. We can also verify this from the debug output in IOS.

In the below topology R1 and R2 are directly connected BGP peers on an Ethernet segment with configurations as follows:

R1:
interface Ethernet0/0
 ip address 10.0.0.1 255.255.255.0
!
interface Loopback0
 ip address 1.1.1.1 255.255.255.255
!
router bgp 1
 network 1.1.1.1 mask 255.255.255.255
 neighbor 10.0.0.2 remote-as 1

R2:
interface Ethernet0/0
 ip address 10.0.0.2 255.255.255.0
!
interface Loopback0
 ip address 2.2.2.2 255.255.255.255
!
router bgp 1
 network 2.2.2.2 mask 255.255.255.255
 neighbor 10.0.0.1 remote-as 1

From the “debug ip packet detail” output we can see the initial TCP session establishment for BGP between these neighbors as follows:

R1:
IP: tableid=0, s=10.0.0.2 (Ethernet0/0), d=10.0.0.1 (Ethernet0/0), routed via RIB
IP: s=10.0.0.2 (Ethernet0/0), d=10.0.0.1 (Ethernet0/0), len 44, rcvd 3
    TCP src=11000, dst=179, seq=3564860120, ack=0, win=16384 SYN

R2:
IP: tableid=0, s=10.0.0.1 (Ethernet0/0), d=10.0.0.2 (Ethernet0/0), routed via RIB
IP: s=10.0.0.1 (Ethernet0/0), d=10.0.0.2 (Ethernet0/0), len 44, rcvd 3
    TCP src=179, dst=11000, seq=1977460611, ack=3564860121, win=16384 ACK SYN

R1:
IP: tableid=0, s=10.0.0.2 (Ethernet0/0), d=10.0.0.1 (Ethernet0/0), routed via RIB
IP: s=10.0.0.2 (Ethernet0/0), d=10.0.0.1 (Ethernet0/0), len 40, rcvd 3
    TCP src=11000, dst=179, seq=3564860121, ack=1977460612, win=16384 ACK

In the first code snippet we can see that R1 has received a TCP packet from R2. This packet was sourced from 10.0.0.2 (R2), destined to 10.0.0.1 (R1), has a TCP source port of 11000 (the random port), a destination TCP port of 179 (the well known port), and is a SYN packet (1st step of three way handshake).

Next R2 receives the response from R1 that is a TCP SYN ACK, but in this case we can see that the port numbers are swapped. R1, the TCP server, now uses TCP port 179 as the source port, and the randomly negotiated port 11000 as the destination.

Further debugging of the BGP flow between these neighbors will show R1 using destination 179 (as the TCP client) and R2 using source 179 (as the TCP server).

The implication of this operation is that if BGP needs to be matched for some reason, i.e. to filter it out in an access-list, to match it in a QoS policy, we must account for traffic that is both going to TCP port 179 and coming from TCP port 179.

January 25th, 2008

BGP Time-Based Policy Routing

Sometimes people need to conditionally advertise routes into BGP table based on time of day. Say, we may want to adversite IGP prefix 150.1.1.0/24 with community 1:100 during daytime and with community 1:200 at the other time. Back in days, the procedure was easy - you had to create time based ACL, and use it in route-map to set communities:


time-range DAY
 periodic daily 9:00 to 18:00

access-list 101 permit ip any any time-range DAY

route-map SET_COMMUNITY 10
 match ip address 101
 set community 1:100
!
route-map SET_COMMUNITY 20
 set community 1:200

This construct worked fine back in days with 12.2T and 12.3 IOSes up to 12.3(17)T. However, since 12.3(17)T, BGP scanner behavior has changed significally. Up to the new version, redistribution into BGP table was based on BGP scanner periodically polling the IGP routes every scan-interval (one minute by default). With the new IOS code, redistribution is purely event driven: a new route is added/deleted from BGP table based on event, signaled by IGP (e.g. IGP route withdrawn, next-hop change etc). This change in BGP scanner behavior was not clearly documented, unlike the related BGP support for next-hop address tracking feature. Ovbsiously, a change in time-range is not treated as an IGP event, hence the filter does not work anymore.

Still, there is a number of workarounds. Here is one of them: we use time-based ACL to filter or permit ICMP packets, and advertise routers based on that virtual “reachability” info.

First, we create time-range and time-based access-list:


time-range DAY
 periodic daily 9:00 to 18:00
!
access-list 101 permit ip any any time-range DAY

Next we create a special loopback interface, which is used send ICMP echo packets to “ourself” and attach the ACL to the interface to filter incoming (looped back) packets:


interface Loopback0
 ip address 150.1.1.1 255.255.255.255
 ip access-group 101 in

We create a new IP SLA monitor, to send ICMP echo packets over loopback interface. If the time-based ACL permit pings, the monitor state will be “reachable”


ip sla monitor 1
 type echo protocol ipIcmpEcho 150.1.1.1
 timeout 100
 frequency 1

Next we track our “pinger” state. The first tracker is “on” when the loopback is “open” by packet filter, the second one is active when the time-based ACL filters packets:


track 1 rtr 1 reachability
!
! Inverse logic
!
track 2 list boolean and
 object 1 not

The we create two static routes, bound to the mentioned trackets. That is, the static route with tag 100 is only active when loopback is “open”, i.e. time-based ACL permits packets. The other static route is active only when time-range is inactive (the second tracker tells that the destination is “reachable”):


ip route 150.1.1.0 255.255.255.0 Loopback0 150.1.1.254 tag 100 track 1
ip route 150.1.1.0 255.255.255.0 Loopback0 150.1.1.253 tag 200 track 2

Now we redistribute static routes into BGP, based on tag values, and also set communities based on the tags:


router bgp 1
 redistribute static route-map STATIC_TO_BGP
!
route-map STATIC_TO_BGP permit 10
 match tag 100
 set community 1:100
!
route-map STATIC_TO_BGP permit 20
 match tag 200
 set community 1:200

This is also a funny example of how you can tie up together multiple IOS features at the same time.

January 11th, 2008

BGP Order of Preference

For inbound updates the order of preference is:
1. route-map
2. filter-list
3. prefix-list, distribute-list

For outbound updates the order of preference is:
1. prefix-list, distribute-list
2. filter-list
3. route-map

January 8th, 2008

Using Extended ACLs for BGP Filtering

Prior to the support of prefix-lists in the IOS advanced filtering for BGP needed to be done using extended ACLs.  The syntax for using extended ACLs is shown below:

access-list <ACL #> permit ip <network> <wildcard mask of network> <subnet mask> <wildcard mask of subnet mask>

The source portion of the extended ACL is used to match the network portion of the BGP route and the destination portion of the ACL is used to match the subnet mask of the BGP route.  Here are some examples:

access-list 100 permit ip 10.0.0.0 0.0.0.0 255.255.0.0 0.0.0.0
Matches 10.0.0.0/16 - Only

access-list 100 permit ip 10.0.0.0 0.0.0.0 255.255.255.0 0.0.0.0
Matches 10.0.0.0/24 - Only

access-list 100 permit ip 10.1.1.0 0.0.0.0 255.255.255.0 0.0.0.0
Matches 10.1.1.0/24 - Only

access-list 100 permit ip 10.0.0.0 0.0.255.0 255.255.255.0 0.0.0.0
Matches 10.0.X.0/24 - Any number in the 3rd octet of the network with a /24 subnet mask.

access-list 100 permit ip 10.0.0.0 0.255.255.0 255.255.255.0 0.0.0.0
Matches 10.X.X.0/24 - Any number in the 2nd & 3rd octet of the network with a /24 subnet mask.

access-list 100 permit ip 10.0.0.0 0.255.255.255 255.255.255.240 0.0.0.0
Matches 10.X.X.X/28 - Any number in the 2nd, 3rd & 4th octet of the network with a /28 subnet mask.

access-list 100 permit ip 10.0.0.0 0.255.255.255 255.255.255.0 0.0.0.255
Matches 10.X.X.X/24 to 10.X.X.X/32 - Any number in the 2nd, 3rd & 4th octet of the network with a /24 to /32 subnet mask.

access-list 100 permit ip 10.0.0.0 0.255.255.255 255.255.255.128 0.0.0.127
Matches 10.X.X.X/25 to 10.X.X.X/32 - Any number in the 2nd, 3rd & 4th octet of the network with a /25 to /32 subnet mask

January 6th, 2008

Understanding BGP Regular Expressions

Hi Brian,

Can you explain the easiest way to construct a regular expression in BGP?

Thanks,

Rowan

Hi Rowan,

Regular expressions are strings of special characters that can be used to search and find character patterns. Within the scope of BGP in Cisco IOS regular expressions can be used in show commands and AS-Path access-lists to match BGP prefixes based on the information contained in their AS-Path.

In order to understand how to build regular expressions we first need to know what the character definitions are for the regex function of IOS. The below table illustrates the regex characters and their usage. This information is contained in the Cisco IOS documentation under the Appendix of Cisco IOS Terminal Services Configuration Guide, Release 12.2.

+------------------------------------------------------+
| CHAR | USAGE                                         |
+------------------------------------------------------|
|  ^   | Start of string                               |
|------|-----------------------------------------------|
|  $   | End of string                                 |
|------|-----------------------------------------------|
|  []  | Range of characters                           |
|------|-----------------------------------------------|
|  -   | Used to specify range ( i.e. [0-9] )          |
|------|-----------------------------------------------|
|  ( ) | Logical grouping                              |
|------|-----------------------------------------------|
|  .   | Any single character                          |
|------|-----------------------------------------------|
|  *   | Zero or more instances                        |
|------|-----------------------------------------------|
|  +   | One or more instance                          |
|------|-----------------------------------------------|
|  ?   | Zero or one instance                          |
|------|-----------------------------------------------|
|  _   | Comma, open or close brace, open or close     |
|      | parentheses, start or end of string, or space |
+------------------------------------------------------+

Some commonly used regular expressions include:

+-------------+---------------------------+
| Expression  | Meaning                   |
|-------------+---------------------------|
| .*          | Anything                  |
|-------------+---------------------------|
| ^$          | Locally originated routes |
|-------------+---------------------------|
| ^100_       | Learned from AS 100       |
|-------------+---------------------------|
| _100$       | Originated in AS 100      |
|-------------+---------------------------|
| _100_       | Any instance of AS 100    |
|-------------+---------------------------|
| ^[0-9]+$    | Directly connected ASes   |
+-------------+---------------------------+

Let’s break some of the above expressions down step-by-step. The first one “.*” says to match any single character (“.”), and then find zero or more instances of that single character (“*”). This means zero or more instances or any character, which effectively means anything.

The next string “^$” says to match the beginning of the string (“^”), and then immediately match the end of the string (“$”). This means that the string is null. Within the scope of BGP the only time that the AS-Path is null is when you are looking at a route within your own AS that you or one of your iBGP peers has originated. Hence this matches locally originated routes.

The next string “^100_” says to match the beginning of the string (“^”), the literal characters 100, and then a comma, an open or close brace, an open or close, a parentheses, the start or end of the string, or a space (“_”). This means that the string must start with the number 100 followed by any non-alphanumeric character. In the scope of BGP this means that routes which are learned from the AS 100 will be matched, as 100 will be the first AS in the path when AS 100 is sending us routes.

The next string “_100$” is the exact opposite of the previous one. This string says to start with any non-alphanumeric character (“_”), followed by the literal characters 100, followed by the end of the string (“$”). This means that AS 100 is the last AS in the path, or in other words that the prefix in question was originated by AS 100.

The next string “_100_” is the combination of the two previous strings with some extra matches. This string means that the literal characters 100 are set between any two non-alphanumeric characters. The first of these could be the start of the string, which would match routes learned from AS 100, while the second of these could be the end of the string, which would match routes originated in AS 100. Another case could be that the underscores represent spaces, in which the string would match any other AS path information as long as “ 100 ” is included somewhere. This would match any routes which transit AS 100, and therefore “_ASN_” is generally meant to match routes that transit a particular AS as defined by the number “ASN”.

The final string “^[0-9]+$” is a little more complicated match. Immediately we can see that the string starts (“^”), and we can see later that it ends (“$”). In the middle we see a range of numbers 0-9 in brackets, followed by the plus sign. The numbers in brackets mean that any number from zero to nine can be matched, or in other words, any number. Next we have the plus sign which means one or more instances. This string “[0-9]+” therefore means one or more instance of any number, or in other words any number including numbers with multiple characters (i.e. 1, 12, 123, 1234, 12345678, etc.). When we combine these all together this string means routes originated in any directly connected single AS, or in other words, the routes directly originated by the peers of your AS.

Now let’s look at a more complicated match, and using the above character patterns we will see how we can construct the expression step by step. Suppose we have the following topology below, where we are looking at the network from the perspective of AS 100.

+--------+ +--------+ +--------+ +--------+
| AS 200 |-| AS 201 |-| AS 202 |-| AS 203 |\
+--------+ +--------+ +--------+ +--------+ \
                                             \
           +--------+ +--------+ +--------+\  \
           | AS 300 |-| AS 301 |-| AS 302 | \  \
           +--------+ +--------+ +--------+  \  -+--------+
                                              >--| AS 100 |
                      +--------+ +--------+  /  -+--------+
                      | AS 400 |-| AS 401 | /  /
                      +--------+ +--------+/  /
                                             /
                                 +--------+ /
                                 | AS 500 |/
                                 +--------+

AS 100 peers with ASes 203, 302, 401, and 500, who each have peers as diagramed above. AS 100 wants to match routes originated from its directly connected customers (ASes 203, 302, 401, and 500) in addition to routes originated from their directly connected customers (ASes 202, 301, and 400). The easiest way to create this regular expression would be to think about what we are first trying to match, and then write out all possibilities of these matches. In our case these possibilities are:

203
203 202
302
302 301
401
401 400
500

Now we could simply create an expression with multiple lines (7 lines to be exact) that would match all of the possible AS paths, but suppose that AS 100 wants to keep this match as flexible as possible so that it will apply to any other ASes in the future. Now let’s try to generalize the above AS-Path information into a regex.

First off we know that each of the matches is going to start and going to end. This means that the first character we will have is “^” and the last character is “$”. Next we know that between the “^” and “$” there will be either one AS or two ASes. We don’t necessarily know what numbers these ASes will be, so for the time being let’s use the placeholder “X”. Based on this our new possible matches are:

^X$
^X X$

Next let’s reason out what X can represent. Since X is only one single AS, there will be no spaces, commas, parentheses, or any other special type characters. In other words, X must be a number. However, since we don’t know what the exact path is, we must take into account that X may be a number with more than one character (i.e. 10, 123, or 10101). This essentially equates to one or more instance of any number zero through nine. In regular expression syntax our two matches would therefore now read:

^[0-9]+$
^[0-9]+ [0-9]+$

This expressions reads that we either have a number consisting of one or more characters zero through nine, or a number consisting of one or more characters zero through nine followed by a space and then another number consisting of one or more characters zero through nine. This brings our expression down to two lines as opposed to our original seven, but let’s see how we can combine the above two as well. To combine them, first let us compare what is different between them.

^[0-9]+$
^[0-9]+ [0-9]+$

From looking at the expressions it is evident that the sequence “ [0-9]+” is the difference. In the first case “ [0-9]+” does not exist in the expression. In the second case “ [0-9]+” does exist in the expression. In other words, “ [0-9]+” is either true or false. True or false (0 or 1) is represented by the character “?” in regex syntax. Therefore we can reduce our expression to:

^[0-9]+ [0-9]+?$

At this point we run into a problem with the order of operations of the regex. As denoted above the question mark will apply only to the plus sign, and not to the range [0-9]. Instead, we want the question mark to apply to the string “ [0-9]+” as a whole. Therefore this string needs to be grouped together using parentheses. Parentheses are used in regular expressions as simply a logical grouping. Therefore our final expression reduces to:

^[0-9]+( [0-9]+)?$

Note that to match a question mark in IOS, the escape sequence CTRL-V or ESC-Q must be entered first, otherwise the IOS parser will interpret the question mark as an attempt to invoke the context sensitive help.

December 26th, 2007

How do I stop a confederation from being used as transit?

Suppose we have the following scenario:

R1—R2–R3–R4—R5

R1 is AS 100
R2, R3, R4 are AS 200
R5 is AS 300

R2, R3, R4 are confederated, with sub as’s 65002, 65003, and 65004 respectively. They are also originating prefixes A, B, & C respectively. If AS 200 does not want to be transit, we must only advertise out prefixes originated in these three sub AS’s.

From R2’s perspective, we see the following prefixes, and the following AS-Path’s:

A - EMPTY
B - (65003)
C - (65003,65004)

From R4’s perspective, we see the following prefixes, and the following AS-Path’s:

A - (65002,65003)
B - (65003)
C - EMPTY

Now we must consider how to match all of these cases in a single line. Remember that parentheses are special characters within the as-path list.

Our minimum case to match would be:

^$

This is our empty AS-PATH, which is prefixes locally originated in our sub-as.

Our maximum case to match would be:

\(X\)

where X is any number of AS’s, or a comma. Remember that we need to escape the parens.

To satisfy our condition of X, we should be matching 1 or more instance of any character, which equates to:

.+

Therefore our maximum case is now:

^\(.+\)$

However, we must match the minimum case at the same time. Therefore, our current expression \(.+\) is either true or false. True or false (0 or 1 instance) is covered by the expression ?.

Therefore, our final regular expression will read:

^(\(.+\))?$

Tada!

Advertise only prefixes which match this expression outbound on your border routers, and your confederated AS’s will not be transit.

-->