CCIE Blog

Helping you become a Cisco Certified Internetwork Expert


Welcome to Internetwork Expert's CCIE Blog

Welcome to Internetwork Expert’s CCIE Blog! This site is dedicated to helping you in your pursuit of becoming a Cisco Certified Internetwork Expert in Routing & Switching, Voice, Security, Service Provider, and Storage. Through this blog you can submit questions to our expert instructors, Brian Dennis - Quad-CCIE #2210, Brian McGahan – Triple CCIE #8593, and Petr Lapukhov - Quad-CCIE #16379. Check back daily as this blog will be updated frequently.


May 8th, 2008

Using NBAR for Application Filtering

Hi Brian,

Can we use NBAR on the gateway router to prevent internal users from watching video streams from any video web site (like Youtube.com)?

Ahmed

Hi Ahmed,

Yes, NBAR can be used to apply application based filters such as blocking youtube.com traffic. To accomplish this we can categorize traffic based on the HTTP hostname. Next we will create a policy-map that matches the youtube.com class and drops the traffic. Lastly the policy is applied outbound to the Internet. Syntax-wise this would read:

R1#
class-map match-all YOUTUBE
 match protocol http host "*youtube.com*"
!
policy-map DROP_YOUTUBE
 class YOUTUBE
   drop
!
interface FastEthernet0/0
 description TO INTERNET
 service-policy output DROP_YOUTUBE

NBAR for HTTP can also be used to match based on URL string or IANA MIME type. For more information see:

Network-Based Application Recognition and Distributed Network-Based Application Recognition
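As a rough illustration of what the wildcard pattern in the YOUTUBE class-map will and will not catch, the host match can be approximated in Python. This is only a sketch: it assumes that shell-style globbing (the standard fnmatch module) behaves like the NBAR host match for this simple pattern, and the function name and sample hostnames are made up for the example.

```python
from fnmatch import fnmatchcase


def host_matches(pattern: str, host: str) -> bool:
    """Approximate a wildcard HTTP Host match.

    Assumption: glob semantics, compared case-insensitively.
    """
    return fnmatchcase(host.lower(), pattern.lower())


# Hypothetical Host header values seen on the gateway router.
hosts = [
    "www.youtube.com",
    "youtube.com",
    "img.youtube.com.example.net",
    "www.cisco.com",
]

for host in hosts:
    action = "drop" if host_matches("*youtube.com*", host) else "forward"
    print(f"{host:30} {action}")
```

Because the pattern has a wildcard on both sides it behaves like a substring match, so any hostname that merely contains youtube.com is caught as well.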

May 6th, 2008

Understanding the IP Multicast Helper-Map Command

Hi Brian,

I have a problem with the multicast helper topic, the case when a broadcast network is separated by a multicast network, and then again it continues. Can you discuss this topic?

Thanks,

Nizami

Hi Nizami,

The multicast helper-map command is similar in theory to how the unicast “ip helper-address” command works. With the IP helper address feature, IP broadcast packets, such as UDP based DHCP requests, have their destination addresses translated to a unicast address, such as that of the DHCP server. With the IP multicast helper map feature, IP broadcast packets have their destination addresses translated to a multicast address.

The common design application of this feature is in financial trading networks where a legacy stock ticker application sends packets out as broadcast UDP. The router on the attached segment can then convert the broadcast destination to multicast, send the packet into the multicast transit network, and then on the last hop router attached to the receiver translate the multicast packet back to a broadcast. This allows the network to scale above a flat layer 2 design where all application senders and receivers are in the same IP subnet, to a hierarchical layer 3 routed multicast network, without the application itself being modified.

Configuration-wise the feature is implemented on two devices, the first hop router attached to the broadcast sender, and the last hop router attached to the broadcast receiver. The first hop router listens for broadcast packets to be received on the incoming interface attached to the sender. Based on an access-list match (usually the UDP port of the application), the router translates the destination address to a user defined multicast address, and forwards the packet out interfaces running PIM according to the multicast routing table. This design therefore assumes that the underlying PIM topology is built end-to-end. Once the last hop router receives the traffic on the incoming interface facing the multicast network, the traffic is again categorized by an access-list, and additionally by the multicast group used on the first hop. Based on the directed broadcast address defined on the last hop router the traffic is then dropped off on the LAN segment facing the receiver.

In our particular design the network looks like this:

SW1 --- R4 --- R3 --- R2 --- R1 --- SW2

SW1 is the broadcast sender (i.e. the source application), SW2 is the receiver (i.e. the destination application), R4 is the first hop router, and R1 is the last hop router. IGP and PIM adjacencies exist between R4 – R3, R3 – R2, and R2 – R1.

R4’s configuration, the first hop router, looks as follows:

R4#
interface FastEthernet0/0
 description TO SENDER APPLICATION – SW1
 ip address 173.20.47.4 255.255.255.0
 ip multicast helper-map broadcast 224.1.2.3 100
!
ip forward-protocol udp 31337
access-list 100 permit udp any any eq 31337

This configuration means that if R4 receives a UDP broadcast going to port 31337 inbound on Fa0/0 it will be translated to the multicast address 224.1.2.3. Note that the use of the “ip forward-protocol” command is necessary in order to process switch UDP traffic going to the port in question. Without process switching the helper-map feature cannot correctly categorize and translate the traffic.

R1’s configuration, the last hop router, looks as follows:

R1#
interface Serial0/0.102 point-to-point
 description TO R2
 ip address 173.20.12.1 255.255.255.0
 ip pim dense-mode
 ip multicast helper-map 224.1.2.3 173.20.18.255 100
 frame-relay interface-dlci 102
!
interface FastEthernet0/0
 description TO RECEIVER – SW2
 ip address 173.20.18.1 255.255.255.0
 ip directed-broadcast
!
ip forward-protocol udp 31337
access-list 100 permit udp any any eq 31337

This configuration means that if R1 receives a UDP multicast going to the group address 224.1.2.3 at port 31337 inbound on S0/0.102 it will be translated to the directed broadcast address 173.20.18.255. Since the link 173.20.18.0/24 is directly connected and has the directed broadcast address of 173.20.18.255 by default, the configuration implies that traffic matching the helper map on S0/0.102 will be sent as a broadcast out Fa0/0.

Note the use of the “ip forward-protocol” command as before in order to process switch the UDP traffic. Additionally the “ip directed-broadcast” command is enabled on the last hop outgoing interface since in current IOS versions this is disabled by default for security purposes.
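The two translations can be summarized with a small Python model. It is a sketch of the logic only, not of how IOS implements the feature: the Packet class and function names are invented for the example, the access-list is reduced to a protocol and port check, and only the all-ones broadcast is treated as a broadcast on the first hop.

```python
import ipaddress
from dataclasses import dataclass, replace

GROUP = "224.1.2.3"   # multicast group from the helper-map commands
APP_PORT = 31337      # UDP port permitted by access-list 100


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    proto: str
    dport: int


def acl_100_permits(pkt: Packet) -> bool:
    """Stand-in for: access-list 100 permit udp any any eq 31337"""
    return pkt.proto == "udp" and pkt.dport == APP_PORT


def first_hop(pkt: Packet) -> Packet:
    """R4: a broadcast matched by the ACL is rewritten to the multicast group."""
    if pkt.dst == "255.255.255.255" and acl_100_permits(pkt):
        return replace(pkt, dst=GROUP)
    return pkt


def last_hop(pkt: Packet, lan: str = "173.20.18.1/24") -> Packet:
    """R1: matching multicast is rewritten to the receiver LAN's directed broadcast."""
    if pkt.dst == GROUP and acl_100_permits(pkt):
        bcast = ipaddress.ip_interface(lan).network.broadcast_address
        return replace(pkt, dst=str(bcast))
    return pkt


probe = Packet(src="173.20.47.7", dst="255.255.255.255", proto="udp", dport=31337)
in_transit = first_hop(probe)
delivered = last_hop(in_transit)
print(in_transit.dst, "->", delivered.dst)   # 224.1.2.3 -> 173.20.18.255
```

A packet to any other UDP port fails the access-list check at the first hop and is left untouched, which is why the same port must be permitted on both routers.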

To verify the functionality of this feature we can use the IP SLA feature in the IOS to generate broadcast UDP traffic on the sender. This configuration on SW1 is as follows:

rtr 1
 type udpEcho dest-ipaddr 255.255.255.255 dest-port 31337 source-ipaddr 173.20.47.7 source-port 12345 control disable
 timeout 0
 frequency 5
rtr schedule 1 life forever start-time now

This config means that SW1 will generate a UDP packet sourced from the address 173.20.47.7 at port 12345 going to the address 255.255.255.255 at port 31337 every 5 seconds, and will not wait for a response back. The following debug on R4, the first hop router, verifies that the packet is received and is translated into multicast.
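If the sender in your own lab is a host rather than an IOS device, a comparable probe can be generated with a few lines of Python. This is a minimal sketch under that assumption: the function names and payload are made up, and unlike the IP SLA operation it sends an arbitrary payload rather than a udpEcho packet.

```python
import socket

DEST = ("255.255.255.255", 31337)   # same destination address and port as the probe


def open_sender(source_port: int = 0) -> socket.socket:
    """Create a UDP socket that is allowed to send to a broadcast address.

    source_port=0 lets the OS choose a port; pass 12345 to mirror the probe.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    sock.bind(("", source_port))
    return sock


def send_probe(sock: socket.socket, payload: bytes = b"tick") -> int:
    """Send one broadcast datagram and return the number of bytes sent."""
    return sock.sendto(payload, DEST)
```

Calling send_probe(open_sender(12345)) in a loop with a five second sleep between iterations gives roughly the same traffic pattern as the configuration above.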

Rack20R4#debug ip packet detail
IP packet debugging is on (detailed)
IP: s=173.20.47.7 (FastEthernet0/0), d=255.255.255.255, len 44, rcvd 2
    UDP src=12345, dst=31337
Rack20R4#undebug all
All possible debugging has been turned off

Rack20R4#debug ip mpacket
IP multicast packets debugging is on
IP(0): s=173.20.47.7 (FastEthernet0/0) d=224.1.2.3 (Serial0/0) id=0, ttl=254, prot=17, len=44(44), mforward
Rack20R4#undebug all
All possible debugging has been turned off

From the unicast “debug ip packet detail” we can see the packet is received in Fa0/0 from SW1 with the proper destination and port information. Next the multicast “debug ip mpacket” shows us that the packet has been translated to multicast address 224.1.2.3 and is forwarded out Serial0/0 towards R3.

As R4, R3, R2, and R1 receive the multicast packet the multicast routing table is populated as follows.

Rack20R4#show ip mroute 224.1.2.3
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
       L - Local, P - Pruned, R - RP-bit set, F - Register flag,
       T - SPT-bit set, J - Join SPT, M - MSDP created entry,
       X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
       U - URD, I - Received Source Specific Host Report,
       Z - Multicast Tunnel, z - MDT-data group sender,
       Y - Joined MDT-data group, y - Sending to MDT-data group
Outgoing interface flags: H - Hardware switched, A - Assert winner
 Timers: Uptime/Expires
 Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 224.1.2.3), 01:24:42/stopped, RP 0.0.0.0, flags: D
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    Serial0/0, Forward/Dense, 01:24:42/00:00:00

(173.20.47.7, 224.1.2.3), 00:01:27/00:02:58, flags: T
  Incoming interface: FastEthernet0/0, RPF nbr 0.0.0.0
  Outgoing interface list:
    Serial0/0, Forward/Dense, 00:01:27/00:00:00

Rack20R3#show ip mroute 224.1.2.3
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
       L - Local, P - Pruned, R - RP-bit set, F - Register flag,
       T - SPT-bit set, J - Join SPT, M - MSDP created entry,
       X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
       U - URD, I - Received Source Specific Host Report,
       Z - Multicast Tunnel, z - MDT-data group sender,
       Y - Joined MDT-data group, y - Sending to MDT-data group
Outgoing interface flags: H - Hardware switched, A - Assert winner
 Timers: Uptime/Expires
 Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 224.1.2.3), 01:25:36/stopped, RP 0.0.0.0, flags: D
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    Serial1/1.312, Forward/Dense, 01:25:36/00:00:00
    Serial1/0, Forward/Dense, 01:25:36/00:00:00

(173.20.47.7, 224.1.2.3), 00:02:22/00:02:54, flags: T
  Incoming interface: Serial1/0, RPF nbr 173.20.0.4
  Outgoing interface list:
    Serial1/1.312, Forward/Dense, 00:02:23/00:00:00

Rack20R2#show ip mroute 224.1.2.3
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
       L - Local, P - Pruned, R - RP-bit set, F - Register flag,
       T - SPT-bit set, J - Join SPT, M - MSDP created entry,
       X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
       U - URD, I - Received Source Specific Host Report,
       Z - Multicast Tunnel, z - MDT-data group sender,
       Y - Joined MDT-data group, y - Sending to MDT-data group
Outgoing interface flags: H - Hardware switched, A - Assert winner
 Timers: Uptime/Expires
 Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 224.1.2.3), 01:25:27/stopped, RP 0.0.0.0, flags: D
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    Serial0/0.213, Forward/Dense, 01:25:27/00:00:00
    Serial0/0.201, Forward/Dense, 01:25:27/00:00:00

(173.20.47.7, 224.1.2.3), 00:02:12/00:02:54, flags: T
  Incoming interface: Serial0/0.213, RPF nbr 173.20.23.3
  Outgoing interface list:
    Serial0/0.201, Forward/Dense, 00:02:13/00:00:00

Rack20R1#show ip mroute 224.1.2.3
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
       L - Local, P - Pruned, R - RP-bit set, F - Register flag,
       T - SPT-bit set, J - Join SPT, M - MSDP created entry,
       X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
       U - URD, I - Received Source Specific Host Report,
       Z - Multicast Tunnel, z - MDT-data group sender,
       Y - Joined MDT-data group, y - Sending to MDT-data group
Outgoing interface flags: H - Hardware switched, A - Assert winner
 Timers: Uptime/Expires
 Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 224.1.2.3), 01:25:42/stopped, RP 0.0.0.0, flags: DCL
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    Serial0/0.102, Forward/Dense, 01:25:42/00:00:00

(173.20.47.7, 224.1.2.3), 00:02:27/00:02:57, flags: PLTX
  Incoming interface: Serial0/0.102, RPF nbr 173.20.12.2
  Outgoing interface list: Null

Once the packet is received on R1, the last hop router, the “debug ip mpacket” shows the packet coming in as multicast, while the “debug ip packet detail” shows the packet being converted back into a broadcast. This is also verified by the “debug ip packet” output on SW2, the receiver of the packet.

Rack20R1#debug ip mpacket
IP multicast packets debugging is on
IP(0): s=173.20.47.7 (Serial0/0.102) d=224.1.2.3 id=0, ttl=251, prot=17, len=48(44), mroute olist null
Rack20R1#undebug all
All possible debugging has been turned off

Rack20R1#debug ip packet detail
IP packet debugging is on (detailed)
IP: tableid=0, s=173.20.47.7 (Serial0/0.102), d=173.20.18.255 (FastEthernet0/0), routed via RIB
Rack20R1#undebug all
All possible debugging has been turned off

Rack20SW2#debug ip packet
IP packet debugging is on
IP: s=173.20.47.7 (Vlan18), d=255.255.255.255, len 44, rcvd 2
IP: s=173.20.47.7 (Vlan18), d=255.255.255.255, len 44, stop process pak for forus packet
Rack20SW2#undebug all
All possible debugging has been turned off

This feature can also be used in the opposite manner, where a multicast packet is received, converted to broadcast, and then converted back to multicast. In either case the configuration depends on the design and functionality of the source and destination application.

May 5th, 2008

Understanding BGP Outbound Route Filtering (BGP ORF)

Hi Brian,

I’m having a problem with Workbook Volume 1 Version 4.1. ORF (Outbound Route Filtering) isn’t working for me. Any help would be appreciated.

Thank you,

JoeT

Hi Joe,

First off let’s talk a little bit about what BGP ORF (Outbound Route Filtering) is designed to do for us, and then we’ll take a look at some implementation examples.

From a customer’s point of view there are typically a limited number of choices for which routes you can receive from your Service Provider via BGP. Usually the Service Provider will give the customer the option of receiving a full table view (currently about 260,000 prefixes), just a default route, or some specific subset of the table such as a default route and the Service Provider’s locally originated prefixes. In other words a BGP Service Provider generally will not implement a complex outbound filtering policy for the customer. Instead, if the customer wants to receive just a subset view of the BGP table, the Customer Edge (CE) router has to filter prefixes inbound as they are received from the upstream Provider Edge (PE) router.

From the SP’s point of view this is the optimal design for administration. They don’t need to worry about change requests constantly coming from the customer about what routes they want to see and what routes they don’t want to see. Likewise from the customer’s point of view this is the optimal administrative design, as they do not need to send change control requests to the provider, and can arbitrarily change their filtering design on the fly. However from a device resource point of view this is not optimal for either the PE or the CE router. The SP’s PE router must still send the full BGP table to the customer, even if the CE router filters out 99% of it. Likewise the CE router must still process all of the BGP UPDATE messages, even if the majority of them are ultimately filtered out.

Let’s take a look at the result of this in the context of the following design:

AS100 --- AS200
(PE) --- (CE)

AS 200 has an upstream peering to its Service Provider, AS 100. The BGP table of AS 200 appears as follows:

AS200_CE#show ip bgp
BGP table version is 12, local router ID is 10.0.0.200
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 0.0.0.0          10.0.0.100               0             0 100 i
*> 28.119.16.0/24   10.0.0.100                             0 100 54 i
*> 28.119.17.0/24   10.0.0.100                             0 100 54 i
*> 112.0.0.0        10.0.0.100                             0 100 54 50 60 i
*> 113.0.0.0        10.0.0.100                             0 100 54 50 60 i
*> 114.0.0.0        10.0.0.100                             0 100 54 i
*> 115.0.0.0        10.0.0.100                             0 100 54 i
*> 116.0.0.0        10.0.0.100                             0 100 54 i
*> 117.0.0.0        10.0.0.100                             0 100 54 i
*> 118.0.0.0        10.0.0.100                             0 100 54 i
*> 119.0.0.0        10.0.0.100                             0 100 54 i

Let’s suppose that from AS 200’s perspective the only routes that they want to receive from AS 100 are the default route plus the networks 28.119.16.0/24 and 28.119.17.0/24. Traditional filtering would dictate that on the CE router a prefix-list would be configured and applied as follows:

router bgp 200
 neighbor 10.0.0.100 remote-as 100
 neighbor 10.0.0.100 prefix-list AS_100_INBOUND in
!
ip prefix-list AS_100_INBOUND seq 5 permit 0.0.0.0/0
ip prefix-list AS_100_INBOUND seq 10 permit 28.119.16.0/24
ip prefix-list AS_100_INBOUND seq 15 permit 28.119.17.0/24

The result of this configuration in AS 200’s BGP table is as follows:

AS200_CE#show ip bgp
BGP table version is 4, local router ID is 10.0.0.200
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 0.0.0.0          10.0.0.100               0             0 100 i
*> 28.119.16.0/24   10.0.0.100                             0 100 54 i
*> 28.119.17.0/24   10.0.0.100                             0 100 54 i

Although the filtering goal is achieved, efficiency is not. From the below debug output we can see exactly how AS 200’s CE router processes the updates from the upstream PE and makes a decision on what to install:

AS200_CE#debug ip bgp updates
BGP updates debugging is on for address family: IPv4 Unicast
AS200_CE#clear ip bgp 100
%BGP-5-ADJCHANGE: neighbor 10.0.0.100 Down User reset
%BGP-5-ADJCHANGE: neighbor 10.0.0.100 Up
BGP(0): 10.0.0.100 rcvd UPDATE w/ attr: nexthop 10.0.0.100, origin i, metric 0, path 100
BGP(0): 10.0.0.100 rcvd 0.0.0.0/0
BGP(0): Revise route installing 1 of 1 routes for 0.0.0.0/0 -> 10.0.0.100(main) to main IP table
BGP(0): 10.0.0.100 rcvd UPDATE w/ attr: nexthop 10.0.0.100, origin i, path 100 54
BGP(0): 10.0.0.100 rcvd 115.0.0.0/8 -- DENIED due to: distribute/prefix-list;
BGP(0): 10.0.0.100 rcvd 114.0.0.0/8 -- DENIED due to: distribute/prefix-list;
BGP(0): 10.0.0.100 rcvd UPDATE w/ attr: nexthop 10.0.0.100, origin i, path 100 54
BGP(0): 10.0.0.100 rcvd 119.0.0.0/8 -- DENIED due to: distribute/prefix-list;
BGP(0): 10.0.0.100 rcvd 118.0.0.0/8 -- DENIED due to: distribute/prefix-list;
BGP(0): 10.0.0.100 rcvd 117.0.0.0/8 -- DENIED due to: distribute/prefix-list;
BGP(0): 10.0.0.100 rcvd 116.0.0.0/8 -- DENIED due to: distribute/prefix-list;
BGP(0): 10.0.0.100 rcvd UPDATE w/ attr: nexthop 10.0.0.100, origin i, path 100 54
BGP(0): 10.0.0.100 rcvd 28.119.17.0/24
BGP(0): 10.0.0.100 rcvd 28.119.16.0/24
BGP(0): 10.0.0.100 rcvd UPDATE w/ attr: nexthop 10.0.0.100, origin i, path 100 54 50 60
BGP(0): 10.0.0.100 rcvd 113.0.0.0/8 -- DENIED due to: distribute/prefix-list;
BGP(0): 10.0.0.100 rcvd 112.0.0.0/8 -- DENIED due to: distribute/prefix-list;
BGP(0): Revise route installing 1 of 1 routes for 28.119.16.0/24 -> 10.0.0.100(main) to main IP table
BGP(0): Revise route installing 1 of 1 routes for 28.119.17.0/24 -> 10.0.0.100(main) to main IP table
AS200_CE#

Note that the AS200_CE router generates the log message “DENIED due to: distribute/prefix-list;” for every prefix that is filtered. This means that if this were the public BGP table of 260,000+ routes this router would have to process every update message just to discard it. This is where BGP Outbound Route Filtering (ORF) comes in.

Outbound Route Filtering Capability for BGP-4 is currently an IETF draft (http://www.ietf.org/internet-drafts/draft-ietf-idr-route-filter-16.txt) that describes an optimization on how prefix filtering can occur between a Customer Edge (CE) router and a Provider Edge (PE) router that are exchanging IPv4 unicast BGP prefixes. In the design we saw above the upstream PE router sent the full BGP table downstream to the CE router, and filtering was applied inbound on the downstream CE. With BGP ORF the downstream CE router dynamically tells the upstream PE router what routes to filter *outbound*. This means that the downstream CE router will only receive update messages about the prefixes that it wants.
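The difference in work done by the CE can be shown with a small Python model of the two approaches, using the prefixes from the “show ip bgp” output above. This is an illustrative sketch only: the function names are invented, and a prefix-list is reduced to a set of exact prefixes.

```python
# Prefixes the PE holds for the CE (from the AS200_CE "show ip bgp" output).
PE_TABLE = ["0.0.0.0/0", "28.119.16.0/24", "28.119.17.0/24"] + [
    f"{n}.0.0.0/8" for n in range(112, 120)
]

# What the CE actually wants (the AS_100_INBOUND prefix-list).
CE_PERMIT = {"0.0.0.0/0", "28.119.16.0/24", "28.119.17.0/24"}


def inbound_filter(pe_table, permit):
    """Without ORF: the PE sends everything and the CE discards on receipt."""
    sent = list(pe_table)
    installed = [p for p in sent if p in permit]
    return {
        "sent_by_pe": len(sent),
        "installed_on_ce": len(installed),
        "denied_on_ce": len(sent) - len(installed),
    }


def orf(pe_table, permit):
    """With ORF: the CE pushes its list to the PE, which filters outbound."""
    sent = [p for p in pe_table if p in permit]
    return {"sent_by_pe": len(sent), "installed_on_ce": len(sent), "denied_on_ce": 0}


print("inbound prefix-list:", inbound_filter(PE_TABLE, CE_PERMIT))
print("ORF:                ", orf(PE_TABLE, CE_PERMIT))
```

In both cases three prefixes end up installed, but with the inbound prefix-list the CE receives eleven prefixes and denies eight of them, while with ORF it receives only the three it asked for.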

Implementation-wise the first step of this feature is for the BGP neighbors to negotiate that they both support the BGP ORF capability. Configuration-wise this looks as follows:

AS100_PE#
router bgp 100
 neighbor 10.0.0.200 remote-as 200
 !
 address-family ipv4
 neighbor 10.0.0.200 capability orf prefix-list receive
 neighbor 204.12.25.254 activate
 exit-address-family

AS200_CE#
router bgp 200
 neighbor 10.0.0.100 remote-as 100
 !
 address-family ipv4
 neighbor 10.0.0.100 capability orf prefix-list send
 neighbor 10.0.0.100 prefix-list AS_100_INBOUND in
 exit-address-family
!

The result of this configuration on AS 200’s CE is the same; however, the behind-the-scenes mechanism by which it is accomplished is different. First, AS100_PE and AS200_CE negotiate the BGP ORF capability during initial BGP peering establishment. The success of this negotiation can be seen as follows.

AS100_PE#show ip bgp neighbors 10.0.0.200 | begin AF-dependant capabilities:
  AF-dependant capabilities:
    Outbound Route Filter (ORF) type (128) Prefix-list:
      Send-mode: received
      Receive-mode: advertised
  Outbound Route Filter (ORF): received (3 entries)
                                 Sent       Rcvd
  Prefix activity:               ----       ----
    Prefixes Current:               2          0
    Prefixes Total:                 2          0
    Implicit Withdraw:              0          0
    Explicit Withdraw:              0          0
    Used as bestpath:             n/a          0
    Used as multipath:            n/a          0

                                   Outbound    Inbound
  Local Policy Denied Prefixes:    --------    -------
    ORF prefix-list:                      8        n/a
    Total:                                8          0
  Number of NLRIs in the update sent: max 4, min 2

*OUTPUT OMITTED*

AS200_CE#show ip bgp neighbors 10.0.0.100 | begin AF-dependant capabilities:
  AF-dependant capabilities:
    Outbound Route Filter (ORF) type (128) Prefix-list:
      Send-mode: advertised
      Receive-mode: received
  Outbound Route Filter (ORF): sent;
  Incoming update prefix filter list is AS_100_INBOUND
                                 Sent       Rcvd
  Prefix activity:               ----       ----
    Prefixes Current:               0          3 (Consumes 156 bytes)
    Prefixes Total:                 0          4
    Implicit Withdraw:              0          1
    Explicit Withdraw:              0          0
    Used as bestpath:             n/a          3
    Used as multipath:            n/a          0

                                   Outbound    Inbound
  Local Policy Denied Prefixes:    --------    -------
    Suppressed duplicate:                 0          1
    Bestpath from this peer:              3        n/a
    Total:                                3          1
  Number of NLRIs in the update sent: max 0, min 0

*OUTPUT OMITTED*

Next, AS 200’s CE router tells AS 100’s PE router which prefixes it wants to receive. The logic of this configuration is that AS 200 is “sending” a prefix-list of what routes it wants, while AS 100 is “receiving” the prefix-list of what the downstream neighbor wants. The reception of the prefix-list by the upstream PE can be verified as follows.

AS100_PE#show ip bgp neighbors 10.0.0.200 received prefix-filter
Address family: IPv4 Unicast
ip prefix-list 10.0.0.200: 3 entries
   seq 5 permit 0.0.0.0/0
   seq 10 permit 28.119.16.0/24
   seq 15 permit 28.119.17.0/24

AS100_PE#show ip prefix-list

Note that AS 100’s PE router received the list from AS 200’s CE, but the prefix-list does not show up locally in the running config. AS 100’s PE router then turns around and uses the prefix-list as an outbound filter towards the downstream CE. This can be verified two ways, by viewing the UPDATE messages on the downstream CE, and by looking at what the upstream PE is sending.

AS100_PE#show ip bgp neighbors 10.0.0.200 advertised-routes
BGP table version is 11, local router ID is 10.0.0.100
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

Originating default network 0.0.0.0

   Network          Next Hop            Metric LocPrf Weight Path
*> 28.119.16.0/24   204.12.25.254            0             0 54 i
*> 28.119.17.0/24   204.12.25.254            0             0 54 i

Total number of prefixes 2
AS100_PE#

AS200_CE#debug ip bgp updates
BGP updates debugging is on for address family: IPv4 Unicast
AS200_CE#clear ip bgp 100
AS200_CE#
BGP(0): no valid path for 0.0.0.0/0
BGP(0): no valid path for 28.119.16.0/24
BGP(0): no valid path for 28.119.17.0/24
%BGP-5-ADJCHANGE: neighbor 10.0.0.100 Down User reset
BGP(0): nettable_walker 0.0.0.0/0 no best path
BGP(0): nettable_walker 28.119.16.0/24 no best path
BGP(0): nettable_walker 28.119.17.0/24 no best path
%BGP-5-ADJCHANGE: neighbor 10.0.0.100 Up
BGP(0): 10.0.0.100 rcvd UPDATE w/ attr: nexthop 10.0.0.100, origin i, metric 0, path 100
BGP(0): 10.0.0.100 rcvd 0.0.0.0/0
BGP(0): Revise route installing 1 of 1 routes for 0.0.0.0/0 -> 10.0.0.100(main) to main IP table
BGP(0): 10.0.0.100 rcvd UPDATE w/ attr: nexthop 10.0.0.100, origin i, path 100 54
BGP(0): 10.0.0.100 rcvd 28.119.17.0/24
BGP(0): 10.0.0.100 rcvd 28.119.16.0/24
BGP(0): Revise route installing 1 of 1 routes for 28.119.16.0/24 -> 10.0.0.100(main) to main IP table
BGP(0): Revise route installing 1 of 1 routes for 28.119.17.0/24 -> 10.0.0.100(main) to main IP table
BGP(0): 10.0.0.100 rcvd UPDATE w/ attr: nexthop 10.0.0.100, origin i, metric 0, path 100
BGP(0): 10.0.0.100 rcvd 0.0.0.0/0...duplicate ignored
AS200_CE#

Note that the above output is different from the previous debug of AS 200’s CE, because now it does not receive the extra update messages. AS 200 instead now receives only the routes that it has requested of the upstream PE.

If edits of the filter are required the downstream CE can change the prefix-list and then, through the BGP Route Refresh capability, advertise the new prefix-list upstream to the PE to be used as a new downstream filter. Configuration-wise this is accomplished as follows.

AS200_CE#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
AS200_CE(config)#ip prefix-list AS_100_INBOUND permit 114.0.0.0/8
AS200_CE(config)#end
AS200_CE#
%SYS-5-CONFIG_I: Configured from console by console
AS200_CE#clear ip bgp 100 in prefix-filter
AS200_CE#
BGP(0): 10.0.0.100 rcvd UPDATE w/ attr: nexthop 10.0.0.100, origin i, path 100 54
BGP(0): 10.0.0.100 rcvd 114.0.0.0/8
BGP(0): 10.0.0.100 rcvd UPDATE w/ attr: nexthop 10.0.0.100, origin i, path 100 54
BGP(0): 10.0.0.100 rcvd 28.119.17.0/24...duplicate ignored
BGP(0): 10.0.0.100 rcvd 28.119.16.0/24...duplicate ignored
BGP(0): Revise route installing 1 of 1 routes for 114.0.0.0/8 -> 10.0.0.100(main) to main IP table
BGP(0): 10.0.0.100 rcvd UPDATE w/ attr: nexthop 10.0.0.100, origin i, metric 0, path 100
BGP(0): 10.0.0.100 rcvd 0.0.0.0/0...duplicate ignored
AS200_CE#

From the “debug ip bgp updates” output we can now see that the upstream PE added the update 114.0.0.0/8, in addition to the previous three prefixes that were installed. Upstream verification on the PE also indicates this.

AS100_PE#show ip bgp neighbors 10.0.0.200 received prefix-filter
Address family: IPv4 Unicast
ip prefix-list 10.0.0.200: 4 entries
   seq 5 permit 0.0.0.0/0
   seq 10 permit 28.119.16.0/24
   seq 15 permit 28.119.17.0/24
   seq 20 permit 114.0.0.0/8
AS100_PE#show ip bgp neighbors 10.0.0.200 advertised-routes
BGP table version is 11, local router ID is 10.0.0.100
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

Originating default network 0.0.0.0

   Network          Next Hop            Metric LocPrf Weight Path
*> 28.119.16.0/24   204.12.25.254            0             0 54 i
*> 28.119.17.0/24   204.12.25.254            0             0 54 i
*> 114.0.0.0        204.12.25.254                          0 54 i

Total number of prefixes 3
AS100_PE#
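Note that the 114.0.0.0/8 entry was configured on the CE without a sequence number, yet it shows up on the PE as seq 20. By default IOS numbers a new prefix-list entry five higher than the highest existing one. A minimal sketch of that numbering, with a made-up helper name:

```python
def next_seq(existing, step=5):
    """Next auto-assigned sequence number: highest existing one plus the step."""
    return (max(existing) if existing else 0) + step


# AS_100_INBOUND as originally configured on the CE.
prefix_list = {5: "0.0.0.0/0", 10: "28.119.16.0/24", 15: "28.119.17.0/24"}

# "ip prefix-list AS_100_INBOUND permit 114.0.0.0/8" with no explicit seq.
prefix_list[next_seq(prefix_list)] = "114.0.0.0/8"

for seq in sorted(prefix_list):
    print(f"seq {seq} permit {prefix_list[seq]}")
```

The last line printed is the seq 20 entry, matching what the PE reports in its received prefix-filter.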

For more information on this feature:

Outbound Route Filtering Capability for BGP-4
http://www.ietf.org/internet-drafts/draft-ietf-idr-route-filter-16.txt

BGP Prefix-Based Outbound Route Filtering
http://www.cisco.com/en/US/docs/ios/12_2t/12_2t11/feature/guide/ft11borf.html

May 2nd, 2008

Using RTP Loopback for VoIP/PSTN Call Testing

A voice lab rack usually uses a dedicated piece of hardware to simulate a PSTN switch. Commonly, you can find a Cisco router in this role, with a number of E1/T1 cards set to emulate the ISDN network side. It suits the function perfectly, switching ISDN connections between the endpoints. Additionally, it is often required to have an “independent” PSTN phone connected to the PSTN switch, in order to represent “outside” dialing patterns, such as 911, 999, 411, and 1-800/900 numbers. The most obvious way to do this is to enable CallManager Express on the PSTN router, and register either a hardware IP Phone or one of the IP soft-phones (such as IP Blue or CIPC) with the CME system.

However, there is another way to accomplish the same goal using IOS functionality alone. It relies on an IP-to-IP gateway feature, the “RTP loopback” session target. It is intended to be used for VoIP call testing, but can easily be used to loop incoming PSTN calls back to themselves. Let’s say we want the PSTN router to respond to incoming calls to the emergency number 911. Here is what the configuration would look like:

PSTN:
voice service voip
 allow-connections h323 to h323
!
interface Loopback0
 ip address 177.254.254.254 255.255.255.255
!
dial-peer voice 911 voip
 destination-pattern 911
 session target ipv4:177.254.254.254
 incoming called-number 999
 tech-prefix 1#
!
dial-peer voice 1911 voip
 destination-pattern 1#911
 session target loopback:rtp
 incoming called-number 1#911

The trick is that only IP-to-IP calls can be looped back. Because of that, we need to redirect the incoming PSTN call to the router itself first, in order to establish an incoming VoIP call leg.
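The two-step path a call to 911 takes through these dial-peers can be traced with a short Python model. It is a deliberately simplified sketch: patterns are matched exactly, only outbound matching on destination-pattern is modeled, and the function and field names are invented for the example.

```python
# Outbound view of the two VoIP dial-peers from the configuration above.
DIAL_PEERS = [
    {"tag": 911, "pattern": "911", "target": "ipv4:177.254.254.254", "tech_prefix": "1#"},
    {"tag": 1911, "pattern": "1#911", "target": "loopback:rtp", "tech_prefix": ""},
]


def trace_call(called: str, max_legs: int = 4):
    """Follow a called number through the dial-peers until it is looped back."""
    legs = []
    for _ in range(max_legs):
        peer = next((p for p in DIAL_PEERS if p["pattern"] == called), None)
        if peer is None:
            legs.append((called, None, "no matching dial-peer"))
            break
        legs.append((called, peer["tag"], peer["target"]))
        if peer["target"] == "loopback:rtp":
            break
        # The call is sent to the router's own Loopback0 with the
        # tech-prefix prepended, arriving as a new VoIP call leg.
        called = peer["tech_prefix"] + called
    return legs


for number, tag, target in trace_call("911"):
    print(f"called {number:6} -> dial-peer {tag} -> {target}")
```

The first leg turns the PSTN call into a VoIP call addressed to the router itself, and the tech-prefix makes the second leg match the dial-peer that performs the RTP loopback.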

While this approach permits VoIP call testing, it lacks one important feature available with a “real” PSTN phone: placing calls from the PSTN phone to in-rack phones. While this may seem a serious issue, you can always use the “csim start” command on the PSTN router to overcome this obstacle. Have fun!

April 28th, 2008

IP Manager Assistant Proxy Mode Explained

IPMA is yet another well-known CCM application that you may encounter on your CCIE Voice lab exam. While IPMA Proxy mode is clearly a legacy approach to configuring this application, it’s still a topic you could see in the lab. Before we discuss the configuration steps, let’s take a quick look at a simplified model of IPMA Proxy mode operations. Refer to the diagram for IP Phone extension numbers.

IPMA Proxy Mode

The whole purpose of IPMA proxy is to intercept calls going directly to the manager’s IP Phone primary line (“1001”), and process them using IPMA’s configurable call routing logic, usually diverting calls to the assistant’s so-called Proxy line (“1112”). In other words, calls placed to the manager’s phone get re-routed to the assistant’s “proxy” line. Of course, the manager has some control of the call-routing logic from the IP Phone, using a special set of softkeys (plus a Web interface for advanced configurations).

Now the whole idea of Proxy mode is to put the IPMA application in the call routing path between a caller and the manager’s primary extension. To accomplish this goal, the manager’s primary line should be isolated in a separate partition; let’s call it PT_MANAGER. No other IP Phone in the system should have direct access to this partition, so their respective CSSes should not contain it. Let’s name the CSS used by all IP Phones in the system CSS_DEFAULT.

Now recall that IPMA is a Java application running in the Cisco Tomcat server. IPMA uses the CTI interface to control various call-routing components in CallManager. Specifically, a CTI Route Point should be created in the CallManager system, and the IPMA application should take control of it. Next, a “wildcard” extension “100X” should be associated with the CTI RP line and placed in partition PT_INTERNAL, the default partition used for all IP Phone lines within the system. (The DocCD recommends using a separate partition for the CTI RP, and this is indeed a more flexible approach. However, for the sake of configuration speed, it makes sense to use the minimum set of partitions.) The “wildcard” extension number is used in configurations where many managers with primary extension numbers in the range “100X” should be covered by the IPMA application. If you are providing call coverage for just one manager’s phone, you can use “1001” here. Also, you may want to set the CFNA number to “100X” or “1001”; this provides a call routing backup in case the IPMA application fails.

With the above configuration, when any phone in the system calls “1001”, the manager’s primary line, the call gets routed to “100X/PT_INTERNAL” and eventually hits the IPMA application. At this point, the IPMA application may want to direct the call to the manager’s real line, “1001/PT_MANAGER”, and this is why the CTI RP should have a special CSS assigned, one that has access to the PT_MANAGER partition. Let’s name this CSS CSS_IPMA. As a minimum, CSS_IPMA should contain PT_MANAGER and PT_INTERNAL, since IPMA may need to redirect the call to some other internal extension. (Note that “1001/PT_MANAGER” precedes “100X/PT_INTERNAL” when using CSS_IPMA. This order resolves the ambiguity, even when one assigns the number “1001” to the CTI RP.)
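The partition and CSS logic above can be sketched in a few lines of Python. This is only a simplified model, not CallManager's actual digit analysis: it assumes that the closest matching pattern among the partitions in a CSS wins, and that the order of partitions within the CSS is used only to break ties. The dictionary layout and the function names are made up for the illustration; the extensions and partition names come from the text.

```python
# Simplified model of CSS lookup (an assumption-laden sketch, not CCM code).
# "X" in a pattern matches any single digit.

DIAL_PLAN = {
    "PT_INTERNAL": [
        ("100X", "CTI RP (IPMA application)"),
        ("1002", "assistant primary line"),
        ("1112", "assistant proxy line"),
    ],
    "PT_MANAGER": [
        ("1001", "manager primary line"),
    ],
}

CSS_DEFAULT = ["PT_INTERNAL"]
CSS_IPMA = ["PT_MANAGER", "PT_INTERNAL"]


def matches(pattern, digits):
    """Return True if the dialed digits fit the pattern."""
    if len(pattern) != len(digits):
        return False
    return all(p == d or (p == "X" and d.isdigit())
               for p, d in zip(pattern, digits))


def breadth(pattern):
    """How many numbers the pattern can match; lower means more specific."""
    return 10 ** pattern.count("X")


def route(dialed, css):
    """Pick the most specific match; partition order breaks ties."""
    candidates = []
    for order, partition in enumerate(css):
        for pattern, target in DIAL_PLAN.get(partition, []):
            if matches(pattern, dialed):
                candidates.append((breadth(pattern), order, pattern, partition, target))
    if not candidates:
        return None
    _, _, pattern, partition, target = min(candidates)
    return pattern, partition, target


print(route("1001", CSS_DEFAULT))  # an ordinary phone calling the manager
print(route("1001", CSS_IPMA))     # the CTI RP or the assistant's proxy line
```

With CSS_DEFAULT the call to “1001” can only land on “100X/PT_INTERNAL”, while with CSS_IPMA it reaches “1001/PT_MANAGER” directly, which is exactly the interception behavior described above.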

To complete the picture, recall the proxy line on the assistant’s phone. This is where the IPMA application directs calls by default. Since the assistant may need to direct the call back to the manager’s phone, this proxy line should be configured to use CSS_IPMA as the Line CSS. With this setup, the proxy number “1112” is placed in the PT_INTERNAL partition (so the CTI RP can reach it) and is allowed to call the manager’s primary line directly. Of course, the primary line of the assistant’s phone (“1002”) has no special Line CSS configured, and will therefore hit the IPMA application when calling “1001”.

Per the recommended design, you should also create intercom lines on both the manager’s and the assistant’s IP Phones. An intercom is simply a line that has auto-answer with speakerphone turned on. On the opposite side, you just add a speed dial to reach the intercom number. Thus, you need two intercom lines plus two speed dials to accomplish the intercom configuration.

Now let’s move to the actual configuration. While CallManager has a special built-in IPMA Wizard, personally I’d prefer not to mess with it unless you’re absolutely sure about what you are doing: the wizard will modify your partitions and CSSes, and may do so in a way you don’t expect. Configuring IPMA proxy mode manually takes a little more time, but once you understand it completely, it won’t take that much. Plus, you get full control of your configuration. So it’s a good idea to create your own IPMA configuration checklist and use it during your practice. Here is what such a checklist may look like.

[Call Intercept]

1) Create Partitions & CSSes: PT_MANAGER & CSS_IPMA
2) Create CTI RP, assign extension number “100X/PT_INTERNAL” to it, set CSS_IPMA as the device CSS. You may also use “1001” extension to cover just one manager
2.1) Set CFNA to “100X” or to “1001” if you provide call coverage for just one manager. This will provide call backup if the IPMA application fails.

[IP Phones]

1) Create a new Button Template, say “3+3 7960” to allow more than two lines on an IP Phone. You will need this template for assistant’s phone, to accommodate three lines: primary, proxy and intercom.

2) Configure the Manager’s Phone
2.1) Set Softkey template to “Standard IPMA Manager”
2.2) Configure the primary line in “PT_MANAGER”
2.3) Add an intercom line, “*1001” and a speed-dial to “*1002” to reach the assistant
2.4) Create IPMA IP Phone service & subscribe the IP Phone to it (URL could be found on DocCD)

3) Configure the Assistant’s Phone
3.1) Set Softkey template to “Standard IPMA Assistant”
3.2) Set Button Template to “3+3 7960” (assistant needs extra lines)
3.3) Add a proxy line “1112/PT_INTERNAL” and set the Line CSS to “CSS_IPMA” for this line
3.4) Add an intercom line, “*1002” and a speed-dial to “*1001” to reach the manager

[Users]

1) Create a new user named “manager”
1.1) Allow it the use of CTI Application & CTI Super Provider
1.2) Associate this user with manager’s IP Phone

2) Create a new user named “assistant”
2.1) Allow it the use of CTI Application & CTI Super Provider
2.2) Associate this user with assistant’s IP Phone

3) Get back to “manager” user
3.1) Start the Cisco IPMA configuration dialog and disable automatic configuration
3.2) Configure the settings per your setup
3.3) Add a new assistant to the manager
3.4) Configure the assistant, matching proper primary manager’s line against the assistant’s proxy line

[IPMA Application]

1) Choose Service Parameters for Cisco IPMA Application
2) Configure CTI Manager IP Addresses (primary/backup). In simplest case just use your Publisher IP
3) Configure IPMA Application IP Addresses (primary/backup). In simplest case just use your Publisher IP
4) Set the CTI RP name for the IPMA application
5) Restart the Cisco Tomcat Windows Service or go to Tomcat manager interface at http://[IPMA server IP Address]/manager/list and restart the service there

[Verification]

1) Check that manager’s phone has IPMA softkey set on its screen
2) Install the Cisco IPMA Console Application and log in there as “assistant”
3) Place a call to manager’s primary line, ensure it gets routed to the assistant phone, and pick it up from the IPMA console. Forward the call back to manager’s primary line
4) Configure from the manager’s phone to accept all calls and place a call to manager’s primary line once again

Making checklists for complex tasks is a must when preparing for the CCIE Voice lab. The above list suggests a simplified manual approach to configuring all IPMA application settings, in an order specifically optimized for speed. However, if you are comfortable with the IPMA Wizard, you can use it for your setup. Just make sure you perform a thorough verification afterwards.

The final note is about the interaction of IPMA proxy mode with the voicemail system. Since we isolate the manager’s primary line in a separate partition, we need to make sure the MWI CSS is able to access it in order to light the MWI lamp. Make sure you don’t forget about it, since this may cost you some precious points.

Further reading:

Cisco IP Manager Assistant With Proxy Line Support

April 24th, 2008

GLBP Explained

GLBP, an acronym for Gateway Load Balancing Protocol, is a virtual gateway protocol similar to HSRP and VRRP. However, unlike its little brothers, GLBP is capable of utilizing multiple physical gateways at the same time. As we know, a single HSRP or VRRP group represents one virtual gateway, with a single virtual IP/MAC address pair. Only one physical gateway in a standby/redundancy group is responsible for packet forwarding; the others remain inactive in the standby/backup state. Say you have R1, R2 and R3 sharing the segment 174.X.123.0/24 with the physical IP addresses 174.X.123.1, 174.X.123.2 and 174.X.123.3; you may configure them to represent one single virtual gateway with the IP address 174.X.123.254. The physical gateway priority settings determine which physical gateway takes the role of packet forwarder. The hosts on the segment set their default gateway to 174.X.123.254 and are thereby insulated from physical gateway failures.

GLBP further develops this idea, allowing multiple gateways to participate in packet forwarding simultaneously. Considering the example above, imagine you want the hosts on the segment to fully utilize all existing physical gateways, yet still have gateway failure recovery. For instance, you may want 50% of outgoing packets to be sent to R1, 30% to R2 and 20% to R3. At the same time, you want to ensure that hosts using any one of the gateways will automatically switch to another if their gateway fails. On top of that, all hosts in the segment should reference the virtual gateway using the same IP address, 174.X.123.254. This is a complicated task, and it is the one the GLBP design addresses.

To begin with, we should recall that each host on the segment needs to resolve the virtual gateway IP address 174.X.123.254 to a MAC address using ARP. When we use HSRP or VRRP, the ARP response carries the virtual MAC address, which is assigned to the active physical gateway. At this point, GLBP differs in that it may respond with different virtual MAC addresses, belonging to various physical gateways in the GLBP group. So the key idea with GLBP is that load balancing is accomplished by responding to ARP requests with different virtual MAC addresses.

Here is how GLBP actually implements the above idea. One of the routers in a GLBP group is elected as the AVG (Active Virtual Gateway). There is only one active AVG in a group, and its task is to respond to ARP requests sent to the virtual gateway IP address (e.g. 174.X.123.254), replying with different virtual MAC addresses in the response packets. The AVG also implements the load-sharing algorithm, e.g. by sending the replies in proportion to the weights configured for the physical gateways. Aside from the AVG, the other logical component of a GLBP implementation is the AVF (Active Virtual Forwarder). Any physical gateway in a GLBP group may act as an AVF; in fact, all physical gateways are usually AVFs. Every AVF has a virtual MAC address assigned by the AVG and a weight value configured by an operator.
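To make the "replies in proportion to weights" idea concrete, here is a small Python sketch. It is not Cisco's actual algorithm: a generic smooth weighted round-robin is swapped in purely for illustration, and the class and method names are hypothetical. The virtual MACs and weights are the ones used for R1, R2 and R3 in this post.

```python
from collections import Counter

# vMAC -> weight, as used in this post:
# forwarder 1 = R2 (30), forwarder 2 = R1 (50), forwarder 3 = R3 (20).
FORWARDERS = {
    "0007.b400.7b01": 30,
    "0007.b400.7b02": 50,
    "0007.b400.7b03": 20,
}


class WeightedAvg:
    """Toy AVG that answers ARP requests with vMACs in proportion to weights.

    Uses a generic smooth weighted round-robin, an illustrative stand-in
    rather than the algorithm IOS really implements.
    """

    def __init__(self, forwarders):
        self.weights = dict(forwarders)
        self.total = sum(self.weights.values())
        self.credit = {vmac: 0 for vmac in self.weights}

    def arp_reply(self):
        # Every forwarder earns credit equal to its weight...
        for vmac, weight in self.weights.items():
            self.credit[vmac] += weight
        # ...the one with the most credit is handed out and pays for it.
        chosen = max(self.credit, key=self.credit.get)
        self.credit[chosen] -= self.total
        return chosen


avg = WeightedAvg(FORWARDERS)
replies = Counter(avg.arp_reply() for _ in range(10))
for vmac, count in sorted(replies.items()):
    print(vmac, count)
```

Over ten ARP requests the R1 vMAC is handed out five times, the R2 vMAC three times and the R3 vMAC twice, which is the 50/30/20 split described earlier. Note that the balancing is per ARP reply (per host), not per packet.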

Now let’s discuss redundancy, the primary goal of any virtual router protocol. There are two logical entities used to build a GLBP group, AVGs and AVFs, and each of them needs a backup scheme. Since there is just one AVG per GLBP group, the procedure is pretty simple: each candidate AVG has a priority value assigned; the highest priority router becomes the active AVG, and the next by priority becomes the standby AVG. You may configure AVG preemption, so that a newly configured router with the highest priority value becomes the AVG, preempting the old one.
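The AVG election rule is simple enough to express as a short Python sketch. The function names are hypothetical, tie-breaking between equal priorities is deliberately not modeled, and the priorities are the ones used for R1, R2 and R3 in this post (R1 keeps the default of 100).

```python
def elect_avg(priorities):
    """Return (active, standby) AVG by priority; ties are not modeled."""
    ranked = sorted(priorities, key=priorities.get, reverse=True)
    active = ranked[0]
    standby = ranked[1] if len(ranked) > 1 else None
    return active, standby


def avg_after_join(active, priorities, newcomer, preempt):
    """A newcomer takes over only if preemption is on and it has higher priority."""
    if preempt and priorities[newcomer] > priorities[active]:
        return newcomer
    return active


priorities = {"R1": 100, "R2": 50, "R3": 25}
print(elect_avg(priorities))  # R1 is the active AVG, R2 the standby

# A hypothetical R4 with priority 150 joins later.
priorities["R4"] = 150
print(avg_after_join("R1", priorities, "R4", preempt=True))
print(avg_after_join("R1", priorities, "R4", preempt=False))
```

Without preemption the existing AVG keeps its role even though a higher priority router has appeared, which is why the sample configuration enables "glbp 123 preempt" on every router.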

What about AVF redundancy? First, we need to understand that AVFs are always “active” in the sense that they are always used by the load-balancing algorithm. (However, by letting an AVF’s weight value drop below the threshold, we may effectively take that AVF out of service. The weight value can be combined with object tracking to provide powerful traffic manipulation options.) Next, with respect to redundancy, all AVFs back each other up. For instance, take any AVF: with respect to the other AVFs it is “Active”, and the remaining AVFs are in the “Listen” state. If the AVF fails, the other gateways detect the event through Hold timer expiration and immediately try to take over the failed AVF’s virtual MAC address. Among the competitors, the AVF with the highest weight value wins, and the remaining AVFs switch back to the “Listen” state. At this point, the “winner” starts accepting packets for two virtual MAC addresses: its own, and the one it has obtained from the failed AVF. At the same moment, two timers start: Redirect and Secondary Hold. The Redirect timer determines how long the AVG continues to respond to ARP requests with the virtual MAC of the failed AVF. The Secondary Hold timer sets the amount of time the backup AVF continues to accept packets for the virtual MAC address taken from the failed AVF.

This is basically how GLBP works. Different load-balancing algorithms are supported – the default one is round robin, with options for weighted load balancing and source-MAC based. The last one will always respond with the same vMAC to the same source MAC address, thereby defining sort of host-gateway “stickiness”. Now for a sample GLBP configuration, for the above mentioned R1, R2 and R3:

!
!  We set load-balancing to weighted only on R1
!  So if R2 will become the AVG, it will use round-robin
!  load-balancing technique
!
R1:
interface FastEthernet0/0
 ip address 174.1.123.1 255.255.255.0
 glbp 123 ip 174.1.123.254
 glbp 123 preempt
 glbp 123 weighting 50
 glbp 123 load-balancing weighted
!
!
!
R2:
interface FastEthernet0/0
 ip address 174.1.123.2 255.255.255.0
 glbp 123 ip 174.1.123.254
 glbp 123 priority 50
 glbp 123 preempt
 glbp 123 weighting 30
!
!
!
R3:
interface Ethernet0/0
 ip address 174.1.123.3 255.255.255.0
 glbp 123 ip 174.1.123.254
 glbp 123 priority 25
 glbp 123 preempt
 glbp 123 weighting 20

Some show output:

Rack1R1#show glbp
FastEthernet0/0 - Group 123
  State is Active
    2 state changes, last state change 03:12:05
  Virtual IP address is 174.1.123.254
  Hello time 3 sec, hold time 10 sec
    Next hello sent in 0.916 secs
  Redirect time 600 sec, forwarder time-out 14400 sec
  Preemption enabled, min delay 0 sec
  Active is local
  Standby is 174.1.123.2, priority 50 (expires in 8.936 sec) <-- Standby AVG
  Priority 100 (default)
  Weighting 50 (configured 50), thresholds: lower 1, upper 50 <--
<-- Should the weight go below thresh, AVF is taken offline
  Load balancing: weighted
  Group members:
    ca00.0156.0000 (174.1.123.1) local <--   Hardware MACs
    ca01.0156.0000 (174.1.123.2)
    cc02.0156.0000 (174.1.123.3)
  There are 3 forwarders (1 active)
  Forwarder 1
    State is Listen <--  All other AVFs Listen to us
    MAC address is 0007.b400.7b01 (learnt) <--  Virtual MAC 
    Owner ID is ca01.0156.0000 <--  This is R2
    Redirection enabled, 598.928 sec remaining (maximum 600 sec) <--
<-- ARP replies with this vMAC are being sent by AVG
    Time to live: 14398.376 sec (maximum 14400 sec)
    Preemption enabled, min delay 30 sec
    Active is 174.1.123.2 (primary), weighting 30 (expires in 8.368 sec) <--
   <--  The AVF reports its own IP as active
    Arp replies sent: 1
  Forwarder 2
    State is Active <--  Active means it’s us
      1 state change, last state change 03:12:45
    MAC address is 0007.b400.7b02 (default)
    Owner ID is ca00.0156.0000 <--  R1 MAC address
    Redirection enabled
    Preemption enabled, min delay 30 sec
    Active is local, weighting 50
    Arp replies sent: 1
  Forwarder 3
    State is Listen <--  All other AVFs Listen to us
    MAC address is 0007.b400.7b03 (learnt)
    Owner ID is cc02.0156.0000 <--  This is R3
    Redirection enabled, 597.916 sec remaining (maximum 600 sec)
    Time to live: 14397.916 sec (maximum 14400 sec)
    Preemption enabled, min delay 30 sec
    Active is 174.1.123.3 (primary), weighting 20 (expires in 7.916 sec)

Rack1R2#show glbp
FastEthernet0/0 - Group 123
  State is Standby
    4 state changes, last state change 03:16:56
  Virtual IP address is 174.1.123.254
  Hello time 3 sec, hold time 10 sec
    Next hello sent in 0.236 secs
  Redirect time 600 sec, forwarder time-out 14400 sec
  Preemption enabled, min delay 0 sec
  Active is 174.1.123.1, priority 100 (expires in 9.148 sec)
  Standby is local <-- We are the standby AVG
  Priority 50 (configured)
  Weighting 30 (configured 30), thresholds: lower 1, upper 30
  Load balancing: round-robin
  Group members:
    ca00.0156.0000 (174.1.123.1)
    ca01.0156.0000 (174.1.123.2) local
    cc02.0156.0000 (174.1.123.3)
  There are 3 forwarders (1 active)
  Forwarder 1
    State is Active
      1 state change, last state change 03:18:06
    MAC address is 0007.b400.7b01 (default)
    Owner ID is ca01.0156.0000 <-- This is R2
    Preemption enabled, min delay 30 sec
    Active is local, weighting 30
  Forwarder 2
    State is Listen
    MAC address is 0007.b400.7b02 (learnt)
    Owner ID is ca00.0156.0000
    Time to live: 14398.644 sec (maximum 14400 sec)
    Preemption enabled, min delay 30 sec
    Active is 174.1.123.1 (primary), weighting 50 (expires in 8.636 sec)
  Forwarder 3
    State is Listen
    MAC address is 0007.b400.7b03 (learnt)
    Owner ID is cc02.0156.0000
    Time to live: 14399.260 sec (maximum 14400 sec)
    Preemption enabled, min delay 30 sec
    Active is 174.1.123.3 (primary), weighting 20 (expires in 9.260 sec)

Now let’s check how ARP redirection works:

Rack1SW1#ping 174.1.123.254

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 174.1.123.254, timeout is 2 seconds:
..!!!
Success rate is 60 percent (3/5), round-trip min/avg/max = 8/12/16 ms

Rack1SW1#sh ip arp
Protocol  Address          Age (min)  Hardware Addr   Type   Interface
Internet  174.1.123.254           0   0007.b400.7b01  ARPA   Vlan1
Internet  174.1.123.7             -   cc06.0156.0000  ARPA   Vlan1
Internet  174.1.123.2             0   ca01.0156.0000  ARPA   Vlan1
Rack1SW1#clear arp-cache 

Rack1SW1#ping 174.1.123.254

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 174.1.123.254, timeout is 2 seconds:
.!!!!
Success rate is 80 percent (4/5), round-trip min/avg/max = 4/13/32 ms

Rack1SW1#sh ip arp
Protocol  Address          Age (min)  Hardware Addr   Type   Interface
Internet  174.1.123.254           0   0007.b400.7b02  ARPA   Vlan1
Internet  174.1.123.7             -   cc06.0156.0000  ARPA   Vlan1
Internet  174.1.123.2             0   ca01.0156.0000  ARPA   Vlan1

Repeat the above actions a few more times

Rack1SW1#sh ip arp
Protocol  Address          Age (min)  Hardware Addr   Type   Interface
Internet  174.1.123.254           0   0007.b400.7b03  ARPA   Vlan1
Internet  174.1.123.7             -   cc06.0156.0000  ARPA   Vlan1
Internet  174.1.123.2             0   ca01.0156.0000  ARPA   Vlan1
Internet  174.1.123.3             0   cc02.0156.0000  ARPA   Vlan1

To summarize, GLBP is a virtual gateway protocol, with built-in load-balancing capabilities. Load balancing is based on manipulating ARP responses to the requests sent to the virtual gateway IP address. AVG role is used to load-balance and respond to ARP requests. AVF role manages one or more virtual MACs and is responsible for packet forwarding. AVG redundancy is controlled by GLBP priority and AVF redundancy is implemented using AVF weight value and two additional timers.

Further reading:

GLBP Overview

April 18th, 2008

Understanding IPv6 NAT-PT

IPv6 NAT-PT is intended for IPv4 to IPv6 migration scenarios, and its purpose is to provide bi-directional connectivity between IPv4 and IPv6 domains. A dual-stack router with interfaces in both IPv4 and IPv6 networks is capable of performing this task. The difference from classic IPv4 NAT is that translations should be done both ways: IPv6 packets routed towards IPv4 hosts should have their src/dst addresses changed to some IPv4 equivalents, and vice versa: IPv4 packets sent toward IPv6 hosts should get both src and dst addresses replaced with IPv6 addresses.

The first question that arises is how in the world the IPv6 domain learns about IPv4 hosts and how the v4 domain knows about the existence of v6. Well, the first idea that comes to mind is to provide static bi-directional mappings. For example, we can manually program the router to rewrite destination addresses in IPv6 packets sent to IPv6 address 2000::960B:0202 (a sample address, but note that 960B is 150.11 in decimal) to 150.11.2.2. What about the source address? To translate the source address (e.g. 3001:11:0:1::1) we set up another mapping that tells the router to rewrite IPv4 packets (opposite direction) sent to 150.11.1.1 to 3001:11:0:1::1. Since the mapping is bi-directional, IPv6 packets with src/dst address pair [3001:11:0:1::1, 2000::960B:0202] get rewritten to IPv4 packets with address pair [150.11.1.1, 150.11.2.2] and vice versa: an IPv4 packet with src/dst [150.11.2.2, 150.11.1.1] is rewritten to [2000::960B:0202, 3001:11:0:1::1].
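The bi-directional rewrite just described can be sketched as a pair of lookup tables in Python. This is only a toy model of the address rewriting (no header translation, no ALGs), the function names are hypothetical, and the addresses are the ones from the paragraph above.

```python
import ipaddress

# Static bi-directional mappings (IPv6 address <-> IPv4 address).
V6_TO_V4 = {
    ipaddress.IPv6Address("3001:11:0:1::1"): ipaddress.IPv4Address("150.11.1.1"),
    ipaddress.IPv6Address("2000::960B:0202"): ipaddress.IPv4Address("150.11.2.2"),
}
# The reverse table is derived from the same entries.
V4_TO_V6 = {v4: v6 for v6, v4 in V6_TO_V4.items()}


def translate_v6_to_v4(src, dst):
    """Rewrite an IPv6 (src, dst) pair into its IPv4 equivalents."""
    return (str(V6_TO_V4[ipaddress.IPv6Address(src)]),
            str(V6_TO_V4[ipaddress.IPv6Address(dst)]))


def translate_v4_to_v6(src, dst):
    """Rewrite an IPv4 (src, dst) pair into its IPv6 equivalents."""
    return (str(V4_TO_V6[ipaddress.IPv4Address(src)]),
            str(V4_TO_V6[ipaddress.IPv4Address(dst)]))


# IPv6 host towards the IPv4 host, then the reply.
print(translate_v6_to_v4("3001:11:0:1::1", "2000::960B:0202"))
print(translate_v4_to_v6("150.11.2.2", "150.11.1.1"))
```

Python prints IPv6 addresses in compressed lowercase form, so 2000::960B:0202 shows up as 2000::960b:202; it is the same address.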

Here is what it looks like in IOS configuration. First, note that the IPv6 stack classifies packets for NAT-PT via a special IPv6 NAT prefix. This prefix represents the whole IPv4 address space (2^32) embedded within the IPv6 super-space, and always has a length of 96 bits (128-32=96). Every IPv6 packet sent to this prefix is inspected by the NAT-PT engine.

IPv6 NAT-PT

Next, using the configuration depicted on the diagram, we aim to provide connectivity between IPv6 Loopback100 of R1 and IPv4 Loopback0 of R2. In the simplest case of static v6v4 mapping, the configuration would look like the following:

R3:
!
! Enable NAT-PT on the interfaces
!
interface FastEthernet 0/0
 ipv6 nat
!
interface FastEthernet 0/1
 ipv6 nat

!
! Static translation for R1 Loopback100
!
ipv6 nat v6v4 source 3001:11:0:1::1 150.11.3.1

!
! Static translation for R2 Loopback0 
!
ipv6 nat v4v6 source static 150.11.2.2 2000::960b:0202

!
! IPv6 NAT prefix, needed to enable NAT-PT classification
!
ipv6 nat prefix 2000::/96

However, more flexible solutions are available for other deployment models. Suppose we want to provide access to an IPv4 server for a large group of IPv6 hosts. We may set up access to the IPv4 server using static IPv4 to IPv6 mapping, and translate the IPv6 hosts’ source addresses into IPv4 address pool. This way, only the IPv6 hosts will be able to initiate sessions to the IPv4 server, using dynamically allocated IPv4 addresses, but not vice-versa – the IPv6 hosts will not have any persistent mappings to IPv4 address space.

R3:
!
! Enable NAT-PT on the interfaces
!
interface FastEthernet 0/0
 ipv6 nat
!
interface FastEthernet 0/1
 ipv6 nat

!
! Dynamic NAT for IPv6 to IPv4 traffic (the hosts) 
!
ipv6 nat v6v4 source list NAT_TRAFFIC pool IPV6_TO_IPV4

!
! Static translation for R2 Loopback0 (the server)
!
ipv6 nat v4v6 source static 150.11.2.2 2000::960b:0202

!
! Dynamic NAT IPv4 pool
!
ipv6 nat v6v4 pool IPV6_TO_IPV4 150.11.3.128 150.11.3.254 prefix-length 24

!
! IPv6 NAT prefix
!
ipv6 nat prefix 2000::/96 

!
!
!
ipv6 access-list NAT_TRAFFIC
   permit ipv6 any 2000::/96

All right. But what if we want to allow the IPv6 domain to access ANY arbitrary IPv4 host? We will need some automated translation logic to do that, mapping every host under IPv4 address space to a host under our IPv6 NAT prefix, since we can’t provide manual mapping to each and every IPv4 host. The most obvious way to achieve this is to take the last 32 bits of IPv6 destination address and use them as the corresponding IPv4 address. For example, the IPv6 address 2000::960b:0202 corresponds to 150.11.2.2 under this interpretation (960b:0202 = 150.11.2.2). Using this approach, we fully utilize the IPv6 /96 NAT prefix address space. However, we need to make sure all IPv6 hosts are aware of that logic using some mechanism external to IPv6. Here is a configuration example:

R3:
!
! Enable NAT-PT on the interfaces
!
interface FastEthernet 0/0
 ipv6 nat
!
interface FastEthernet 0/1
 ipv6 nat

!
! Dynamic NAT for IPv6 to IPv4 traffic
!
ipv6 nat v6v4 source list NAT_TRAFFIC pool IPV6_TO_IPV4

!
! Dynamic NAT IPv4 pool
!
ipv6 nat v6v4 pool IPV6_TO_IPV4 150.11.3.128 150.11.3.254 prefix-length 24

!
! IPv6 NAT prefix with v4-mapped flag
! the access-list specifies IPv6 traffic eligible to
! access the IPv4 mapped addresses
!
ipv6 nat prefix 2000::/96 v4-mapped NAT_TRAFFIC
!
ipv6 access-list NAT_TRAFFIC
   permit ipv6 any 2000::/96
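
The v4-mapped logic, taking the last 32 bits of the IPv6 destination as the IPv4 address, is plain bit arithmetic and can be checked with Python's standard ipaddress module. The function names below are hypothetical; the 2000::/96 prefix and the sample addresses come from the configuration above.

```python
import ipaddress

NAT_PREFIX = ipaddress.IPv6Network("2000::/96")


def v6_to_v4(addr):
    """Extract the IPv4 address embedded in the low 32 bits."""
    v6 = ipaddress.IPv6Address(addr)
    if v6 not in NAT_PREFIX:
        raise ValueError(f"{v6} is not inside the NAT prefix {NAT_PREFIX}")
    return ipaddress.IPv4Address(int(v6) & 0xFFFFFFFF)


def v4_to_v6(addr):
    """Embed an IPv4 address under the /96 NAT prefix."""
    v4 = ipaddress.IPv4Address(addr)
    return ipaddress.IPv6Address(int(NAT_PREFIX.network_address) | int(v4))


print(v6_to_v4("2000::960b:0202"))  # 150.11.2.2
print(v4_to_v6("150.11.2.2"))       # 2000::960b:202
```

Because the prefix is exactly 96 bits long, every one of the 2^32 IPv4 addresses gets its own slot, so no per-host configuration is needed for the IPv4 side.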

Verification

Rack11R1#ping 2000::960B:202 source loopback 100
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 2000::960B:202, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 8/8/8 ms
Rack11R1#

Rack11R3#debug ipv6 nat detailed
IPv6           more_flags = 0
IPv6 NAT: icmp src (3001:11:0:1::1) -> (150.11.3.128), dst (2000::960B:202) -> (150.11.2.2)
IPv6 NAT: ipv6nat_find_entry_v4tov6:
	 ref_count = 1,
                                usecount = 0, flags = 2, rt_flags = 0,
                                more_flags = 0

Note that a proper NAT-PT implementation requires a number of specific ALGs (application-level gateways) to be used along with NAT. The purpose of ALGs is to resolve application-level issues that arise from the IP address change (e.g. fixing up the FTP PORT command). Currently Cisco IOS supports only a limited number of ALGs for NAT-PT, compared to its IPv4 NAT implementation.

To summarize:

- IPv6 NAT-PT translates addresses both ways
- IPv6 NAT-PT requires IPv6 NAT /96 prefix
- IPv6 NAT-PT could be configured using static bi-directional entries
- IPv6 NAT-PT dynamic translations use IPv4 address pool to map many IPv6 addresses to a small group of IPv4 addresses
- IPv6 NAT-PT allows IPv4 address mapping inside IPv6 NAT prefix

April 16th, 2008

OSPF Virtual Links and Max Cost

OSPF virtual links are relatively simple to configure and you normally do not run into too many problems getting them up and working, but an odd issue you could run into is when trying to run the virtual link over an interface whose OSPF cost is maximized (65535 or 0xffff).  The virtual link will not come up if the only interface to reach the other end of the virtual link has a cost that is maximized.  For those of you who have not read RFC 2328 I will quote part of section 15 for you below ;)

Note that a virtual link whose underlying path has cost
greater than hexadecimal 0xffff (the maximum size of an interface
cost in a router-LSA) should be considered inoperational (i.e.,
treated the same as if the path did not exist).

Now you may ask, why would you ever set the cost to 65535 to begin with?  You may not set the cost directly, but you may be asked to use the auto-cost reference-bandwidth command for a task and indirectly set the cost of a transit interface for a virtual-link you created earlier to 65535.  So by solving the auto-cost reference-bandwidth task you broke the virtual-link you created earlier and in turn broke a big portion of your OSPF domain.  In fact now that I think about this issue I am going to write it into version 5 of the R&S material ;)
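
To see how a reference-bandwidth change can quietly max out a cost, here is a rough Python sketch. It assumes the usual cost formula (reference bandwidth divided by interface bandwidth, truncated, minimum 1) and assumes the result is capped at 65535; the function names and the sample link speeds are made up for the illustration. Following the behavior described above, a transit cost of 65535 is treated as unusable for the virtual link.

```python
MAX_COST = 0xFFFF  # 65535, the largest interface cost in a router-LSA


def ospf_cost(reference_bw_mbps, interface_bw_kbps):
    """Approximate auto-cost: reference / interface bandwidth, capped."""
    cost = (reference_bw_mbps * 1000) // interface_bw_kbps
    return max(1, min(cost, MAX_COST))


def virtual_link_usable(transit_cost):
    """Per the behavior described in this post, a maxed-out cost breaks the link."""
    return transit_cost < MAX_COST


# Default reference bandwidth (100 Mbps) on a T1 and on a 128 kbps link.
print(ospf_cost(100, 1544), ospf_cost(100, 128))

# After "auto-cost reference-bandwidth 10000" the slow link hits the cap.
slow = ospf_cost(10000, 128)
print(slow, virtual_link_usable(slow))
```

So before raising the reference bandwidth, it is worth computing the new cost of every slow transit interface that a virtual link depends on.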

April 16th, 2008

R&S Lab Diagrams

There are a lot of rumors floating around in regards to diagrams in the R&S CCIE lab.  Cisco officially has said little in regards to this other than the following “the lab document has L1/L2 diagrams for the physical connectivity as well as an IP or topology diagram and an IP Routing diagram”.  This is similar to what we provide in our labs but I would venture to say that they don’t take the time we do to ensure that they look as nice as ours ;)  What Cisco and we do not provide is a true layer 2 “logical” diagram; what Cisco and we do provide is a physical diagram of the connections in the lab.  A physical diagram is not the same as a logical layer 2 diagram.  A logical layer 2 diagram will include the VLAN assignments, trunks, EtherChannels, dot1q tunnels, VTP and possibly spanning tree information like root bridges, root ports, designated ports, etc.  The choice to draw out the spanning tree information will really come down to the lab itself.  If there are a lot of tasks that relate to spanning tree or layer 2 traffic engineering (i.e. traffic for VLAN 100 should transit SW3, etc) then adding the spanning tree information will help answer these types of tasks.

The logical layer 3 diagram will be provided BUT the diagram they provide may not have the level of detail you want or need plus you can not write on the diagram they give you.  Technically you can write on it but they will suspend you from the lab for one year ;)  We ALWAYS recommend making your own layer 3 logical diagram.  You should also draw out the diagram for every practice lab you do.  Do not wait until the real lab to draw out your first diagram.  As I have said before you never want to do anything in the CCIE lab for the first time other than get your number ;)

There are two main benefits to making your own logical layer 3 diagram.  First off you will find it is easier to remember what the network looks like when reading the tasks and secondly you will be able to draw and/or take notes on your own diagram.  Smart people fail the lab all the time because they make stupid mistakes in the lab and by drawing out the network you will hopefully lower the chances of making these stupid mistakes (i.e. configuring RIPv2 on the wrong interface, applying an ACL inbound on one interface when it should have been outbound on another, configuring a feature on the wrong router, etc).  All it takes is two or three of these little mistakes and you have lost 8 or 9 points in the lab.  We all know that it is hard enough to pass the lab without adding stupid mistakes into the mix ;)  You will also find tasks related to BGP to be easier to answer when you have a diagram that you can take notes on (i.e. who is peering with whom, which exit point to use to reach another AS, etc).  It is possible that when you get into the lab basic BGP is already done for you.  It is normally easier to work on a network that you built from the ground up, so working on a network that is 50% complete without first taking the time to discover and document what is already done will be harder.

I am sure someone will comment on this and say, “but I won’t have time to draw out the network in the real lab”.  If this is the case you should not be in the lab in the first place.  If it is taking you the full 8 hours to just configure the network you more than likely will not pass the lab to begin with so taking the 10 minutes to draw out the network is not going to really matter in this case.  The percentage of people who pass the lab while configuring the network for the full 8 hours is slim.  Most people who pass the lab complete the lab within 5.5 or 6.5 hours and have the extra time to do the diagram in the beginning.

April 11th, 2008

R&S Lab Attack Plan - Part I

First off be sure to arrive at the lab at least 15 minutes early. I’ve done both: arrived early and arrived late. I can tell you from personal experience that arriving early is the best option. ;) When you arrive you will be waiting in the lobby until the proctor comes out to escort you into the lab. When you get into the lab the proctor will give you a quick introduction before starting. This introduction will cover the facilities (i.e. restrooms, break room, etc), the start and stop times and then the proctor will normally ask if anyone has any questions before starting. If you have any general questions I would recommend that you clear them up with the proctor now. For any questions related to the lab itself (i.e. do I need to ping myself over Frame Relay) I would recommend that you wait until you read over the lab before asking, as most of these types of questions will be answered in the lab itself. After this introduction the lab will officially start. You can now go to your assigned seat and start the lab.

Bring earplugs (the small disposable type) as the lab can in some locations be a little noisy (routers and switches buzzing, phones ringing from the voice racks, etc). Dress in layers (i.e. light jacket or sweater) so that you can remain comfortable. Leave your cellphone, pager and any other electronic devices at your hotel or in your rental car. If you do bring them with you to the lab they will have a place for you to store them. This also holds true for your luggage. If you are going to the airport after the lab you can bring your luggage to the lab and they will have a place for you to store it during the exam. You may want to bring a certain drink or snack. Officially the policy is that you can not bring anything into the lab but normally the proctors will allow you to bring a drink or snack in. If you are taking the lab in RTP the lunch will be catered. You will not have a big choice as to what to eat. This being the case you may want to bring your own lunch if you are a very picky eater or on a special diet. Personally I’ve found the food to be just fine but to be honest what’s for lunch in the lab is the least of my worries as it should be for you.

But before even stepping into the lab you should have a detailed plan. This plan should include what you are going to do when you first arrive. Below is my recommended plan as to what to do when you first arrive in the lab.

1 ) After taking your seat remove the lab and diagrams from the binder. You don’t want to be flipping through the binder all day. The pages themselves will be in plastic sheet covers. Do not remove the pages from the sheet covers. I normally make two stacks for the lab material: one for the pages I haven’t completed or am still working on, and the other for the pages I have completed.

2 ) The pencils (colored and regular) and pens that they supply will be on your desk. Pick out the ones you will be using and sharpen any if needed. Since they are supplying you with the colored pencils you may end up with different colors than you are used to. This being the case, don’t use the same colors all the time while you are preparing; mix it up a little between practice labs. Now you may try to bring your own colored pencils into the lab and sometimes the proctor will allow them, but the official policy is that you can’t bring them in. In that case just use whatever they are supplying and don’t worry about it. But if you feel you can’t pass the lab without your own “magical” set of colored pencils you may want to stop preparing for the lab now and just use your $1400 for a good psychiatrist ;)

3 ) Draw out your own diagrams. Drawing out your own diagrams has two big advantages. First off, you can’t write on the diagrams they give you without being suspended from the lab for one year. By being able to write on your own diagram you can make lots of little notes when working through the lab. Secondly, drawing out the network will help you learn it and remember what it looks like when reading the tasks, without the need to repeatedly look back at the diagrams. I can’t tell you how many people I see during the mock lab classes make simple little mistakes like configuring a feature on the wrong device or on the wrong interface. Just one little mistake like this can be the difference between passing and failing.

4 ) Take a quick read over the entire lab to get a general idea of what you are going to be doing throughout the day. Don’t spend time trying to figure out the solution to each task but just get a feel for what they are looking for. You may also find that the order they give you the tasks in is not the ideal order in which you should do the tasks, and you want to figure that out now. I see people in the mock lab workshop make the mistake of just starting with the first task without reading over the lab. By doing this they run into issues when a solution they implemented in an earlier task causes problems with a later task.

5 ) Now log onto your rack using the Windows XP PC provided. You will have shortcuts on the desktop to your devices. By just clicking on the SecureCRT shortcuts on the desktop you can connect either to the console of the access server or to the individual lines. This is the same way we have our rental rack access set up. If you connect directly to the console of the access server you can then reverse telnet (i.e. R1, R2, etc.) to the console of the devices. If you want to connect directly to the devices’ consoles you can do so, but be forewarned that you will need to have ten SecureCRT windows open. Newer versions of SecureCRT support the concept of tabs but the version of SecureCRT in the CCIE lab will not support tabs. This means that you shouldn’t use tabs during the last month or so before your lab date if you plan to connect to the devices directly. Personally I think that working with ten windows is awkward to say the least, but it’s going to boil down to personal preference.
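
For anyone who has not used reverse telnet before, the sketch below shows roughly how an access server can map names such as R1 and R2 to its async lines. This is only an illustration: the hostname, loopback address, line range and port numbers are assumptions and will differ by platform. It is not the actual lab or rental rack configuration.

AccessServer#
interface Loopback0
 ip address 10.255.255.1 255.255.255.255
!
! assumed mapping: TCP port 2001 = async line 1, 2002 = async line 2
ip host R1 2001 10.255.255.1
ip host R2 2002 10.255.255.1
!
line 1 16
 no exec
 transport input telnet

With something like this in place, typing R1 at the access server prompt opens the console of R1, and pressing Ctrl+Shift+6 followed by x brings you back to the access server.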

6 ) Now that you are connected to the devices in your rack take a quick minute to ensure that the initial configurations loaded on them match the diagrams provided. To do this just compare the IP addressing on your devices with the ones on the diagram to ensure you have the correct initial configurations loaded. Although rare it does happen that someone either gets the wrong lab or initial configurations loaded.
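
As a minimal sketch of this check, using only standard IOS show commands, you could run something like the following on each device and compare the output against the diagram (R1 here is just an example device name):

R1#show ip interface brief
R1#show running-config | include ^interface|ip address

The first command gives a one-line summary of each interface and its address. The second also shows the subnet masks, which the brief output leaves out.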

7 ) Next open three Internet Explorer windows. The first one for the IOS 12.4 documentation, the second one for the 3550 documentation and finally the last one for the 3560 documentation. Do not expect to have a version of Internet Explorer that supports tabs. If the administrative privileges on the PC do not allow you to open multiple instances of Internet Explorer you can go to the one that you could open and then do “File->New Window”.

8 ) Using one of the scratch sheets of paper that they give you, make a quick table with the following columns: Task, Point Value and Notes. Use this to document your progress as you do the lab. Also note on here when you complete each section. Include the number of points you achieved in the section, the total points you feel you have and the time you completed the section. Remember that time is not your friend in the lab, so you need to make sure you don’t lose track of it and that you know exactly where you are in the lab at all times.

9 ) If you want to apply any additional configuration (alias, logging synchronous, etc.) create it in Notepad now. Then paste it into the devices. A few items I personally would recommend are below:

a) clock set (set the clock on all devices if they are not already set)
b) ensure the logging level for the console is set to debugging
c) logging buffered
d) ip tcp synwait-time 5
e) no ip domain-lookup
f) line con 0
    history size 256

Personally I don’t use aliases other than the default ones in the IOS, but it’s a personal preference.
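
As a rough sketch, the settings listed above could be typed into Notepad along the following lines and then pasted into each device. The date and time are placeholders, the TCP SYN wait timer is written out in its full form (ip tcp synwait-time), and logging synchronous is added only because it was mentioned as an example. Treat this as an illustration, not a required template.

R1#
clock set 09:00:00 8 May 2008
!
configure terminal
logging console debugging
logging buffered
ip tcp synwait-time 5
no ip domain-lookup
!
line con 0
 logging synchronous
 history size 256
end

Note that clock set is an exec command, which is why it comes before configure terminal in the paste.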

Do not take more than 30 to 35 minutes to get to this point. You should be ready to start the lab now. In the next part I will cover getting up to full reachability in the lab exam.

-->