Friday, August 10, 2007

WAN Switching

In Depth
Switches are not only used in LANs; they are also used extensively in wide area networks (WANs). In an Ethernet switching environment, the switch utilizes Carrier Sense Multiple Access with Collision Detection (CSMA/CD). The switch or host sends out a packet and detects whether a collision occurs. If there is a collision, the sender waits a random amount of time and then retransmits the packet. If the host does not detect a collision, it sends out the next packet. You may think that if the switch or host is set to full-duplex there will be no collisions, and that is correct, but the host still waits between sending packets.
In a Token Ring switching environment, a token is passed from one station to the next. A host must have possession of the token to transmit; if the token is already in use, the host passes it along and waits for it to come around again. All stations on the network must wait for an available token. An active monitor, which could be any station on the segment, performs a ring maintenance function and generates a new token if the existing token is lost or corrupted.
As you can see, both Token Ring and Ethernet switching require the node to wait: either for the token, or for the frame to reach the other nodes. This is not the most efficient use of bandwidth. In a LAN environment, this inefficiency is not a major concern; in a WAN, it becomes unacceptable. Can you imagine if your very expensive T1 link could be used only half the time? To overcome this problem, WAN links utilize serial transmission. Serial transmission sends the electric signal (bits) down the wire one after another. It does not wait for one frame to reach the other end before transmitting the next frame.
To identify the beginning and the end of a frame, a timing mechanism is used. The timing can be either synchronous or asynchronous. Synchronous signals utilize an identical clock rate, and the clocks are set to a reference clock. Asynchronous signals do not require a common clock; the timing signals come from special characters in the transmission stream. Asynchronous serial transmissions put a start bit and a stop bit around each character (usually 1 byte). This is an eight-to-two ratio of data to overhead, which is very expensive on a WAN link. Synchronous serial transmissions do not have such high overhead, because they do not require the special characters; they also have a larger payload. Are synchronous serial transmissions the perfect WAN transmission method? No; the problem lies in how to synchronize equipment miles apart. Synchronous serial transmission is only suitable for distances where the time required for data to travel the link does not distort the synchronization.
So, first we said that serial is the way to go, and now we've said that serial has either high overhead or cannot travel a long distance. What do we use? Well, we use both, and cheat a little bit. We use synchronous serial transmission for the short distance and asynchronous for the remaining, long distance. We cheat by putting multiple characters in each frame and limiting the overhead. When a frame leaves a host and reaches a router, the router uses synchronous serial transmission to pass the frame on to a WAN transmission device. The WAN device puts multiple characters into each WAN frame and sends it out. To minimize the variation in time between when frames leave the host and when they reach the end of the link, each frame is divided and put into a slot in the WAN frame. This way, a frame does not have to wait for the transmission of other frames before it is sent. (Remember, this process is designed to minimize wait time.) If there is no traffic to be carried in a slot, that slot is wasted. Figure 1 shows a diagram of a packet moving from LAN nodes to the router and the WAN device.
Figure 1: A packet’s journey from a host to a WAN device. The WAN transmission is continuous and does not have to wait for acknowledgement or permission.
Let’s take a look at how this process would work in a T1 line. T1 has 24 slots in each frame; each slot is 8 bits, and there is 1 framing bit:
24 slots x 8 bits + 1 framing bit = 193 bits
T1 frames are transmitted 8,000 frames per second, or one frame every 125 microseconds:
193 bits x 8,000 = 1,544,000 bits per second (bps)
When you have a higher bandwidth, the frame is bigger and contains more slots (for example, E1 has 32 slots). As you can see, this is a great increase in the effective use of the bandwidth.
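To make the arithmetic concrete, here is a small Python sketch (ours, not from the text) that computes the line rate for slot-based framing; the T1 and E1 parameters match those above, and E1 is modeled with no extra framing bit because its framing travels inside timeslot 0.

FRAMES_PER_SECOND = 8000  # one frame every 125 microseconds

def line_rate_bps(slots, framing_bits=1, bits_per_slot=8):
    # Line rate = (slots x bits per slot + framing bits) x frames/second.
    bits_per_frame = slots * bits_per_slot + framing_bits
    return bits_per_frame * FRAMES_PER_SECOND

print(line_rate_bps(24))                   # T1: 193 bits/frame -> 1,544,000 bps
print(line_rate_bps(32, framing_bits=0))   # E1: 32 slots -> 2,048,000 bps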
Another asynchronous serial transmission method is Asynchronous Transfer Mode (ATM). ATM is a cell-based switching technology. Each cell has a fixed size of 53 octets: 5 octets of overhead and 48 octets of payload. Bandwidth in ATM is available on demand. It uses bandwidth even more efficiently than slot-based serial transmission because a cell does not have to wait for an assigned slot in the frame. A single Ethernet frame can be carried in multiple consecutive cells. ATM also enables Quality of Service (QoS): cells can be assigned different levels of priority, and at any point of congestion, cells with higher priority get preference to the bandwidth. ATM is the most widely used WAN serial transmission method.
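As a rough illustration of the cell economics, this sketch (ours) computes how many 53-octet cells a frame needs and the fixed header overhead; it deliberately ignores ATM adaptation layer details such as the 8-octet AAL5 trailer and padding.

import math

CELL_SIZE = 53                 # octets per ATM cell
HEADER = 5                     # octets of cell header
PAYLOAD = CELL_SIZE - HEADER   # 48 octets of payload per cell

def cells_needed(frame_octets):
    # Number of cells required to carry the frame's octets.
    return math.ceil(frame_octets / PAYLOAD)

print(cells_needed(1500))      # a 1,500-octet Ethernet payload -> 32 cells
print(HEADER / CELL_SIZE)      # ~9.4% of the line is cell header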

WAN Transmission Media
The physical transmission media that carry signals in a WAN are divided into two kinds: narrowband and broadband. A narrowband transmission consists of a single channel carried by a single medium. A broadband transmission consists of multiple channels at different frequencies carried on a single medium. The most common narrowband transmission types are T1, E1, and J1. See Table 1 for the differences among the transmission types and where each is used. The number of time slots determines how much bandwidth (bit rate) a narrowband transmission has.
Table 1: Narrowband transmission types
Narrowband is most commonly used by businesses as their WAN medium because of its low cost. If more bandwidth is needed than narrowband can provide, most businesses use multiple narrowband connections. The capability of broadband to carry multiple signals enables it to have a higher transmission speed. Table 2 displays the various broadband transmission types, which require more expensive and specialized transmitters and receivers.
Table 2: The different broadband transmission types and their bandwidth

Digital signal 2 (DS2), E2, E3, and DS3 describe digital transmission across copper or fiber cables. OC/STS resides almost exclusively on fiber-optic cables. The OC designator specifies an optical transmission, whereas the STS designator specifies the characteristics of the transmission (except the optical interface). There are two types of fiber-optic media:

Single-mode fiber—Has a core of 8.3 microns and a cladding of 125 microns. A single light wave powered by a laser is used to generate the transmission. Single-mode can be used for distances up to 45 kilometers; it has no known speed limitation. Figure 2 shows an example of a single-mode fiber.



Figure 2: Single-mode fiber.
Multimode fiber—Has a core of 62.5 microns and a cladding of 125 microns. Multiple light waves generated by a light-emitting diode (LED) carry the transmission. Multimode has a distance limit of two kilometers and a maximum data transfer rate of 155Mbps in WAN applications. (It has recently been approved for use with Gigabit Ethernet.) Figure 3 shows an example of a multimode fiber. The boundary between the core and the cladding works as a mirror to reflect the light waves down the fiber.



Figure 3: Multimode fiber.
Synchronous Transport Signal (STS)
Synchronous transport signal (STS) is the basic building block of the Synchronous Optical Network (SONET). It defines the framing structure of the signal. It consists of two parts: STS overhead and STS payload. In STS-1, the frame is 9 rows of 90 octets. Each row has 3 octets of overhead and 87 octets of payload, for a total of 810 octets, or 6,480 bits per frame. A frame occurs every 125 microseconds, yielding 51.84Mbps.

STS-n is an interleaving of multiple (n) STS-1s. The size of the payload and the overhead are multiplied by n. Figure 4 displays an STS diagram.

Figure 4: The STS-1 framing and STS-n framing. The overhead and payload are proportionate to the n value, with the STS-1 frame as the base.
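A quick arithmetic check of the framing just described (9 rows of 90 octets, one frame every 125 microseconds), sketched in Python:

ROWS, COLS = 9, 90          # STS-1 frame layout
FRAMES_PER_SECOND = 8000    # one frame every 125 microseconds

def sts_rate_bps(n=1):
    # STS-n interleaves n STS-1s, so both payload and overhead scale by n.
    bits_per_frame = ROWS * COLS * 8 * n
    return bits_per_frame * FRAMES_PER_SECOND

print(sts_rate_bps(1))      # 51,840,000 bps = 51.84Mbps (STS-1/OC-1)
print(sts_rate_bps(3))      # 155.52Mbps (STS-3/OC-3)
print(sts_rate_bps(12))     # 622.08Mbps (STS-12/OC-12)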
You may wonder why we're talking about synchronous transmission when we said it is only used over short distances. Where did the asynchronous transmission go? Well, the asynchronous traffic is encapsulated in the STS payload. The asynchronous serial transmission eliminates the need to synchronize the end transmitting equipment. In SONET, most WAN links are point-to-point connections utilizing light as the signaling source, so the time required for the signal to travel the link does not distort the synchronization. The OC-n signal itself is used for synchronization between equipment. This combination of asynchronous and synchronous serial transmission enables signals to reach across long distances with minimal overhead.

Monday, August 6, 2007

Simple Network Management Protocol

Introduction
Since its introduction in 1988, the Simple Network Management Protocol (SNMP) has become the most popular network management protocol for TCP/IP-based networks. The IETF created SNMP to allow remote management of IP-based devices using a standardized set of operations. It is now widely supported by servers, printers, hubs, switches, modems, UPS systems, and (of course) Cisco routers. The SNMP set of standards defines much more than a communication protocol used for management traffic. The standards also define how management data should be accessed and stored, as well as the entire distributed framework of SNMP agents and servers. The IETF has officially recognized SNMP as a fully standard part of the IP protocol suite. The original SNMP definition is documented in RFC 1157.
In 1993, SNMP Version 2 (SNMPv2) was created to address a number of functional deficiencies that were apparent in the original protocol. The added and improved features included better error handling, larger data counters (64-bit), improved efficiency (get-bulk transfers), confirmed event notifications (informs), and most notably, security enhancements. Unfortunately, SNMPv2 did not become widely accepted because the IETF was unable to come to a consensus on the SNMP security features.

So, a revised edition of SNMPv2 was released in 1996, which included all of the proposed enhancements except for the security facility. It is discussed in RFCs 1905, 1906, and 1907. The IETF refers to this new version as SNMPv2c, and it uses the same insecure security model as SNMPv1. This model relies on passwords called community strings that are sent over the network as clear text. SNMPv2c never enjoyed widespread success throughout the IP community. Consequently, most organizations continue to use SNMPv1 except when they need to access the occasional large counter variable. The IETF recently announced that SNMPv3 would be the new standard, with SNMPv1, SNMPv2, and SNMPv2c being considered purely historical.

The compromise that became SNMPv2c left the management protocol without satisfactory security features. So, in 1998, the IETF began working on SNMPv3, which is defined in RFCs 2571–2575. Essentially, SNMPv3 is a set of security enhancements to be used in conjunction with SNMPv2c. This means that SNMPv3 is not a standalone management protocol and does not replace SNMPv2c or SNMPv1. SNMPv3 provides a secure method for accessing devices using authentication, message integrity, and encryption of SNMP packets throughout the network. We have included a recipe describing how to use the SNMPv3 security enhancements.

SNMP Management Model
SNMP defines two main types of entities, managers and agents. A manager is a server that runs network management software and is responsible for a particular network. These servers are commonly referred to as Network Management Stations (NMS). There are several excellent commercial NMS platforms on the market. Throughout this book we will refer to the freely distributed NET-SNMP system as a reference NMS.
An agent is an embedded piece of software that resides on a remote device that you wish to manage. In fact, almost every IP-capable device provides some sort of built-in SNMP agent. The agent has two main functions. First, the agent must listen for incoming SNMP requests from the NMS and respond appropriately. And second, the agent must monitor internal events and create SNMP traps to alert the NMS that something has happened. This chapter will focus mainly on how to configure the router's agent.

The NMS is usually configured to poll all of the key devices in the network periodically using SNMP Get requests. These are UDP packets sent to the agent on the well-known SNMP port 161. The SNMP Get request prompts the remote device to respond with one or more pieces of relevant operating information. However, because there could be hundreds or thousands of remote devices, it is often not practical to poll a particular remote device more often than once every few minutes (and in many networks you are lucky if you can poll each device more than a few times per hour). On a schedule like this, a remote device may suffer a serious problem that goes undetected; it is possible to crash and reboot between polls from the NMS. So, on the next poll, the NMS will see everything operating normally and never know that it completely missed a catastrophe.
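From the NMS side, a single poll might look like the following Python sketch, assuming the third-party pysnmp library is installed; the address 192.0.2.1 and the "public" community string are placeholders. (The NET-SNMP command-line equivalent is snmpget -v 1 -c public 192.0.2.1 .1.3.6.1.2.1.1.4.0.)

from pysnmp.hlapi import (SnmpEngine, CommunityData, UdpTransportTarget,
                          ContextData, ObjectType, ObjectIdentity, getCmd)

errorIndication, errorStatus, errorIndex, varBinds = next(getCmd(
    SnmpEngine(),
    CommunityData('public', mpModel=0),         # mpModel=0 selects SNMPv1
    UdpTransportTarget(('192.0.2.1', 161)),     # agent listens on UDP 161
    ContextData(),
    ObjectType(ObjectIdentity('1.3.6.1.2.1.1.4.0'))))   # sysContact.0

if errorIndication:
    print(errorIndication)
else:
    for name, value in varBinds:
        print(name.prettyPrint(), '=', value.prettyPrint())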
Therefore, an SNMP agent also has the ability to send information using an SNMP trap without having to wait for a poll. A trap is an unsolicited piece of information, usually representing a problem situation (although some traps are more informational in nature). Traps are UDP packets sent from the agent to the NMS on the other well-known SNMP port number, 162. There are many different types of traps that an agent can send, depending on what type of equipment it manages. Some traps represent non-critical issues. It is often up to the network administrator to decide which types of traps will be useful.
The NMS does not acknowledge traps, and since traps are often sent to report network problems, it is not uncommon for trap reports to get lost and never make it to the NMS. In many cases this is acceptable, because the trap represents a transient transmission problem that the NMS will discover by other means if the trap is not delivered. However, critical information can sometimes be lost when a trap is not delivered.
To address this shortcoming, SNMPv2c and SNMPv3 include another type of packet called an SNMP inform. This is nearly identical to a standard trap, except that the SNMP agent will wait for an acknowledgement. If the agent does not receive an acknowledgement within a certain amount of time, it will attempt to retransmit the inform.
SNMP informs are not common today because SNMPv2c was never widely adopted. However, SNMPv3 also includes informs. Since SNMPv3 promises to become the mainstream SNMP protocol, it seems inevitable that enhancements such as SNMP informs will start to be more common.

MIBs and OIDs
SNMP uses a special tree structure called a Management Information Base (MIB) to organize the management data. People will often talk about different MIBs, such as the T1 MIB, or an ATM MIB. In fact, these are all just branches or extensions of the same global MIB tree structure. However, the relative independence of these different branches makes it convenient to talk about them this way. A particular SNMP agent will care only about those few MIB branches that are relevant to the particular remote device this agent runs on. If the device doesn’t have any T1 interfaces, then the agent doesn’t need to know anything about the T1 branch of the global MIB tree. Similarly, the NMS for a network containing no ATM doesn’t need to be able to resolve any of the variables in the ATM branches of the MIB tree.
The MIB tree structure is defined by a long sequence of numbers separated by dots, such as .1.3.6.1.2.1.1.4.0. This number is called an Object Identifier (OID). Since we will be working with OID strings throughout this chapter, it is worthwhile to briefly review how they work and what they mean. The OID is a numerical representation of the MIB tree structure. Each digit represents a node in this tree structure. The trunk of the tree is on the left; the leaves are on the right.
In the example string, .1.3.6.1.2.1.1.4.0, the first digit, .1, signifies that this variable is part of the MIB that is administered by the International Standards Organization (ISO). There are other nodes at this top level of the tree: the International Telephone and Telegraph Consultative Committee (CCITT) administers the .0 tree structure, and the ISO and CCITT jointly administer .2. The first node under the ISO MIB tree of this example is .3, which the ISO has allocated for all other organizations. The U.S. Department of Defense (DOD) is designated by the branch number .6. The DOD, in turn, has allocated branch number .1 for the Internet Activities Board (IAB). So, just about every SNMP MIB variable you will ever see will begin with .1.3.6.1.
There are four commonly used subbranches under the IAB (also called simply "Internet") node. These are designated directory (1), mgmt (2), experimental (3), and private (4). The directory node is seldom used in practice. The mgmt node is used for all IETF-standard MIB extensions, which are documented in RFCs. This would include, for example, the T1 and ATM examples mentioned earlier. However, it would not include any vendor-specific variables such as the CPU utilization on a Cisco router. SNMP protocol and application developers use the experimental subtree to hold data that is not yet standard. This allows you to use experimental MIBs in a production network without fear of causing conflicts. Finally, the private subtree contains vendor-specific MIB variables. Before returning to the example, we want to take a brief detour down the private tree, because many of the examples in this book include Cisco-specific MIB variables.
A good example of a Cisco MIB variable is .1.3.6.1.4.1.9.2.1.8.0, which gives the amount of free memory in a Cisco router. There is only one subtree under the private node, and it is called enterprises, .1.3.6.1.4.1. Of the hundreds of registered owners of private MIB trees, Cisco is number 9, so all Cisco-specific MIB extensions begin with .1.3.6.1.4.1.9.
Referring again to the previous example string (.1.3.6.1.2.1.1.4.0), you can see this represents a variable in the mgmt subtree, .1.3.6.1.2. The next digit is .1 here, which represents an SNMP MIB variable. The following digit, .1, refers to a specific group of variables, which, in the case of mgmt variables, would be defined by an RFC. In this particular case, the value .1 refers to the system MIB, which is detailed in RFC 1450.
From this level down, a special naming convention is adopted to help you remember which MIB you are looking at. The names of every variable under the system node begin with "sys". They are sysDescr (1), sysObjectID (2), sysUpTime (3), sysContact (4), sysName (5), sysLocation (6), sysServices (7), sysORLastChange (8), and sysORTable (9). You can find detailed descriptions of what all of these mean in RFC 1450.
In fact, reading through MIB descriptions is not only an excellent way to understand the hierarchical structure of the MIB, but it’s also extremely useful when you are trying to decide what information you can and should be extracting from your equipment.
In the example string, .1.3.6.1.2.1.1.4.0, the value is .4, for sysContact. The following .0 tells the agent to send the contents of this node, rather than treating it as the root of further subtrees. So the OID string uniquely identifies a single piece of information. In this case, that information is the contact information for the device.
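The decomposition of the example OID can be printed node by node with a few lines of Python (ours; the names are the conventional labels for each node, with mib-2 being the standard name for the SNMP MIB node under mgmt):

oid = '.1.3.6.1.2.1.1.4.0'
labels = ['iso', 'org', 'dod', 'internet', 'mgmt',
          'mib-2', 'system', 'sysContact', 'instance 0']

# Pair each digit of the OID with the node it selects in the tree.
for digit, label in zip(oid.lstrip('.').split('.'), labels):
    print(f'.{digit:<2} {label}')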

How to Choose the Best Router Switching Path for Your Network (Part II)

Cisco Express Forwarding
Cisco Express Forwarding also uses a 256-way data structure to store forwarding and MAC header rewrite information, but it does not use a tree. Cisco Express Forwarding uses a trie, which means the actual information being searched for is not stored in the data structure itself; instead, the data is stored in a separate data structure, and the trie simply points to it. In other words, rather than storing the outbound interface and MAC header rewrite within the tree itself, Cisco Express Forwarding stores this information in a separate data structure called the adjacency table.

This separation of the reachability information (in the Cisco Express Forwarding table) and the forwarding information (in the adjacency table), provides a number of benefits:
· The adjacency table can be built separately from the Cisco Express Forwarding table, allowing both to be built without process switching any packets.
· The MAC header rewrite used to forward a packet isn't stored in cache entries, so changes in a MAC header rewrite string do not require invalidation of cache entries.
· Recursive routes can be resolved by pointing to the recursed next hop, rather than directly to the forwarding information.
Essentially, all cache aging is eliminated, and the cache is pre-built based on the information contained in the routing table and ARP cache. There is no need to process switch any packet to build a cache entry.
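To see why this separation helps, consider a toy model (ours, greatly simplified; real Cisco Express Forwarding indexes a 256-way trie on the octets of the destination address). The forwarding entry stores only a pointer into the adjacency table, so changing a MAC rewrite string touches one adjacency entry and never invalidates the forwarding entries that point at it.

# Adjacency table: next hop -> outbound interface and MAC rewrite string.
adjacency = {
    '10.1.1.2': {'interface': 'Ethernet0', 'mac_rewrite': '00aa.bbcc.0102'},
}

# Forwarding table: prefix -> adjacency key (a pointer, not the data).
fib = {'10.1.1.0/24': '10.1.1.2'}

def forward(prefix):
    adj = adjacency[fib[prefix]]
    return adj['interface'], adj['mac_rewrite']

# If the next hop's MAC changes, only the adjacency entry is rewritten;
# the forwarding entry pointing at it stays valid.
adjacency['10.1.1.2']['mac_rewrite'] = '00aa.bbcc.9999'
print(forward('10.1.1.0/24'))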


Other Entries in the Adjacency Table
The adjacency table can contain entries other than MAC header rewrite strings and outbound interface information. Some of the various types of entries that can be placed in the adjacency table include:
· cache A MAC header rewrite string and outbound interface used to reach a particular adjacent host or router.
· receive Packets destined to this IP address should be received by the router. This includes broadcast addresses and addresses configured on the router itself.
· drop Packets destined to this IP address should be dropped. This could be used for traffic denied by an access list, or routed to a NULL interface.
· punt Cisco Express Forwarding cannot switch this packet; pass it to the next best switching method (generally fast switching) for processing.
· glean The next hop is directly attached, but there are no MAC header rewrite strings currently available.

Glean Adjacencies
A glean adjacency entry indicates that a particular next hop should be directly connected, but there is no MAC header rewrite information available for it. How do these entries get built and used? A router running Cisco Express Forwarding and attached to a broadcast network, as shown in the figure below, builds a number of adjacency table entries by default.

The four adjacency table entries built by default are:

10.1.1.0/24, version 17, attached, connected
0 packets, 0 bytes
via Ethernet2/0, 0 dependencies
valid glean adjacency
10.1.1.0/32, version 4, receive
10.1.1.1/32, version 3, receive
10.1.1.255/32, version 5, receive

Note there are four entries: three receives, and one glean. Each receive entry represents a broadcast address or an address configured on the router, while the glean entry represents the remainder of the address space on the attached network. If a packet is received for host 10.1.1.50, the router attempts to switch it, and finds it resolved to this glean adjacency. Cisco Express Forwarding then signals that an ARP cache entry is needed for 10.1.1.50, the ARP process sends an ARP packet, and the appropriate adjacency table entry is built from the new ARP cache information. After this step is complete, the adjacency table has an entry for 10.1.1.50.

10.1.1.0/24, version 17, attached, connected
0 packets, 0 bytes
via Ethernet2/0, 0 dependencies
valid glean adjacency
10.1.1.0/32, version 4, receive
10.1.1.1/32, version 3, receive
10.1.1.50/32, version 12, cached adjacency 208.0.3.2
0 packets, 0 bytes
via 208.0.3.2, Ethernet2/0, 1 dependency
next hop 208.0.3.2, Ethernet2/0
valid cached adjacency
10.1.1.255/32, version 5, receive

The next packet the router receives destined for 10.1.1.50 is switched through this new adjacency.
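The glean-to-cached transition can be summarized in a short sketch (ours, not actual IOS logic; the stand-in MAC address is invented for illustration):

arp_cache = {}     # IP -> MAC, normally filled in by the ARP process

def arp_resolve(ip):
    # Stand-in for the ARP exchange triggered by the glean adjacency.
    return arp_cache.setdefault(ip, '0000.0c00.0001')

def switch_packet(dest_ip, adjacency):
    entry = adjacency.get(dest_ip, 'glean')
    if entry == 'glean':
        # CEF signals for an ARP entry and builds the cached adjacency.
        adjacency[dest_ip] = {'next_hop': dest_ip,
                              'mac_rewrite': arp_resolve(dest_ip)}
        entry = adjacency[dest_ip]
    return entry

adjacency = {}
print(switch_packet('10.1.1.50', adjacency))   # first packet: builds entry
print(switch_packet('10.1.1.50', adjacency))   # next packet: cached hit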

Load Sharing
Cisco Express Forwarding also takes advantage of the separation between the Cisco Express Forwarding table and the adjacency table to provide a better form of load sharing than any other interrupt context switching mode. A loadshare table is inserted between the Cisco Express Forwarding table and the adjacency table, as illustrated in the figure below.


The Cisco Express Forwarding table points to this loadshare table, which contains pointers to the various adjacency table entries for the available parallel paths. The source and destination addresses are passed through a hash algorithm to determine which loadshare table entry to use for each packet. Per-packet load sharing can be configured, in which case each packet uses a different loadshare table entry.

Each loadshare table has 16 entries, and the available paths are divided among them based on the traffic share counter in the routing table. If the traffic share counters in the routing table are all 1 (as in the case of multiple equal-cost paths), each possible next hop receives an equal number of pointers from the loadshare table. If 16 is not evenly divisible by the number of available paths, some paths will have more entries than others.
Beginning in IOS 12.0, the number of entries in the loadshare table is reduced to make certain each path has a proportionate number of loadshare table entries. For instance, if there are three equal cost paths in the routing table, only 15 loadshare table entries are used.
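A simplified version of this scheme in Python (ours; Python's built-in hash stands in for the real IOS hash, and only equal-cost paths are modeled):

def build_loadshare(paths, slots=16):
    # Trim the 16-entry table so every path gets the same number of
    # entries, mirroring the IOS 12.0 behavior described above
    # (3 equal-cost paths -> 15 usable entries).
    usable = slots - (slots % len(paths))
    return [paths[i % len(paths)] for i in range(usable)]

def pick_path(src_ip, dst_ip, table):
    # The source/destination pair hashes to one slot, so a given
    # host pair always follows the same path.
    return table[hash((src_ip, dst_ip)) % len(table)]

table = build_loadshare(['Serial0', 'Serial1', 'Serial2'])
print(len(table))                                  # 15
print(pick_path('10.1.1.5', '172.16.0.9', table))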

Which Switching Path Is Best?
Whenever possible, you want your routers to be switching in the interrupt context, because it is at least an order of magnitude faster than process-level switching. Cisco Express Forwarding switching is definitely faster and better than any other switching mode. We recommend you use Cisco Express Forwarding if the protocol and IOS you are running support it. This is particularly true if you have a number of parallel links across which traffic should be load shared.

How to Choose the Best Router Switching Path for Your Network (Part I)

Introduction
There is a plethora of switching paths available across the various Cisco routers and Cisco IOS releases. Which is the best one for your network, and how do they all work? This white paper explains each of the following switching paths so you can make the best decision about which switching path fits your network.
First, examine the forwarding process itself. There are three steps to forwarding a packet through a router:
1. Determine if the packet's destination is reachable.
2. Determine the next hop toward the destination, and the interface through which that next hop is reachable.
3. Rewrite the Media Access Control (MAC) header on the packet so it will successfully reach its next hop.
Each of these steps is critical for the packet to reach its destination.

Note: Throughout this document, we use the IP switching path as an example; virtually all the information provided here is applicable to equivalent switching paths for other protocols, if they exist.

Process Switching
Process switching is the lowest common denominator in switching paths; it is available on every version of IOS, on every platform, and for every type of traffic being switched. Process switching is defined by two essential concepts:
· The forwarding decision and information used to rewrite the MAC header on the packet are taken from the routing table (from the routing information base, or RIB) and the Address Resolution Protocol (ARP) cache, or from some other table that contains the MAC header information mapped to the IP address of each host that is directly connected to the router.
· The packet is switched by a normal process running within IOS. In other words, the forwarding decision is made by a process scheduled through the IOS scheduler and running as a peer to other processes on the router, such as routing protocols. Processes that normally run on the router aren't interrupted to process switch a packet.
The figure below illustrates the process switching path.

Examine this diagram in more detail:
1. The interface processor first detects there is a packet on the network media, and transfers this packet to the input/output memory on the router.
2. The interface processor generates a receive interrupt. During this interrupt, the central processor determines what type of packet this is (assume it is an IP packet), and copies it into processor memory if necessary (this decision is platform dependent). Finally, the processor places the packet on the appropriate process' input queue and the interrupt is released.
3. The next time the scheduler runs, it notes the packet in the input queue of ip_input, and schedules this process to run.
4. When ip_input runs, it consults the RIB to determine the next hop and the output interface, then consults the ARP cache to determine the correct physical layer address for this next hop.
5. ip_input then rewrites the packet's MAC header, and places the packet on the output queue of the correct outbound interface.
6. The packet is copied from the output queue of the outbound interface to the transmit queue of the outbound interface; any outbound quality of service takes place between these two queues.
7. The output interface processor detects the packet on its transmit queue, and transfers the packet onto the network media.
Almost all features that affect packet switching, such as Network Address Translation (NAT) and Policy Routing, make their debut in the process switching path. Once they have been proven and optimized, these features may, or may not, appear in interrupt context switching.

Interrupt Context Switching
Interrupt context switching is the second of the primary switching methods used by Cisco routers. The primary differences between interrupt context switching and process switching are:
· The process currently running on the processor is interrupted to switch the packet. Packets are switched on demand, rather than switched only when the ip_input process can be scheduled.
· The processor uses some form of route cache to find all the information needed to switch the packet.
The following figure illustrates interrupt context switching.



Examine this diagram in more detail:
1. The interface processor first detects there is a packet on the network media, and transfers this packet to the input/output memory on the router.
2. The interface processor generates a receive interrupt. During this interrupt, the central processor determines what type of packet this is (assume it is an IP packet), and then begins to switch the packet.
3. The processor searches the route cache to determine if the packet's destination is reachable, what the output interface should be, what the next hop towards this destination is, and finally, what MAC header the packet should have to successfully reach the next hop. The processor uses this information to rewrite the packet's MAC header.
4. The packet is now copied to either the transmit or output queue of the outbound interface (depending on various factors). The receive interrupt now returns, and the process that was running on the processor before the interrupt occurred continues running.
5. The output interface processor detects the packet on its transmit queue, and transfers the packet onto the network media.
The first question that comes to mind after reading this description is "What is in the cache?" There are three possible answers, depending on the type of interrupt context switching:
· Fast Switching
· Optimum Switching
· Cisco Express Forwarding
We will look at each of these route cache types (or switching paths) one at a time.

Fast Switching
Fast switching stores the forwarding information and MAC header rewrite string using a binary tree for quick lookup and reference. The following figure illustrates a binary tree.



In Fast Switching, the reachability information is indicated by the existence of a node on the binary tree for the destination of the packet. The MAC header and outbound interface for each destination are stored as part of the node's information within the tree. The binary tree can actually have 32 levels; the tree above is extremely abbreviated for the purpose of illustration.
To search a binary tree, you simply start from the left (with the most significant digit) of the binary number you are looking for, and branch right or left in the tree based on each digit. For instance, if you're looking for the information related to the number 4 in this tree, you would begin by branching right, because the first binary digit is 1. You would follow the tree down, comparing each subsequent digit of the binary number, until you reach the end.
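In code, the lookup walks one bit at a time from the most significant bit. A minimal sketch (ours; a real fast cache stores 32-bit prefixes and keeps the MAC rewrite in each node):

class Node:
    def __init__(self):
        self.children = [None, None]   # [0] = branch left, [1] = branch right
        self.rewrite = None            # outbound interface + MAC rewrite

def insert(root, value, bits, rewrite):
    node = root
    for i in range(bits - 1, -1, -1):       # most significant bit first
        bit = (value >> i) & 1
        if node.children[bit] is None:
            node.children[bit] = Node()
        node = node.children[bit]
    node.rewrite = rewrite

def lookup(root, value, bits):
    node = root
    for i in range(bits - 1, -1, -1):
        node = node.children[(value >> i) & 1]
        if node is None:
            return None                     # cache miss: process switch
    return node.rewrite

root = Node()
insert(root, 4, 3, ('Ethernet0', '00aa.bbcc.0004'))  # 4 = binary 100
print(lookup(root, 4, 3))                   # branches right, then left twice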

Characteristics of Fast Switching
Fast Switching has several characteristics that are a result of the binary tree structure and the storage of the MAC header rewrite information as part of the tree nodes.
· Since there is no correlation between the routing table and the fast cache contents (MAC header rewrite, for example), building cache entries involves all the processing that must be done in the process switching path. Therefore, fast cache entries are built as packets are process switched.
· Since there is no correlation between the MAC headers (used for rewrites) in the ARP cache and the structure of the fast cache, when the ARP table changes, some portion of the fast cache must be invalidated (and recreated through the process switching of packets).
· The fast cache can only build entries at one depth (one prefix length) for any particular destination within the routing table.
· There is no way to point from one entry to another within the fast cache (the MAC header and outbound interface information are expected to be within the node), so all routing recursions must be resolved while a fast cache entry is being built. In other words, recursive routes can't be resolved within the fast cache itself.

Aging Fast Switching Entries
To keep the fast switching entries from losing their synchronization with the routing table and ARP cache, and to keep unused entries in the fast cache from unduly consuming memory on the router, 1/20th of the fast cache is invalidated, randomly, every minute. If the router's memory drops below a very low watermark, 1/5th of the fast cache entries are invalidated every minute.
Fast Switching Prefix Length
What prefix length does fast switching build entries for, if it can only build to one prefix length for every destination? In fast switching terms, a destination is a single reachable destination within the routing table, or a major network. The rules for deciding what prefix length to use for a given cache entry are as follows (a sketch implementing them appears after the list):
· If building a fast policy entry, always cache to /32.
· If building an entry against a Multiprotocol over ATM virtual circuit (MPOA VC), always cache to /32.
· If the network is not subnetted (it is a major network entry):
. If it is directly connected, use /32;
. Otherwise, use the major net mask.
· If it is a supernet, use the supernet's mask.
· If the network is subnetted:
. If directly connected, use /32;
. If there are multiple paths to this subnet, use /32;
. In all other cases, use the longest prefix length in this major net.
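As a compact restatement, the rules might be coded as follows (a sketch; the flag and mask parameters are our invention, with defaults included only so the function runs standalone):

def cache_prefix_length(policy=False, mpoa_vc=False, supernet=False,
                        subnetted=False, directly_connected=False,
                        multiple_paths=False, major_net_mask=8,
                        supernet_mask=7, longest_in_major_net=24):
    if policy or mpoa_vc:
        return 32                      # policy and MPOA VC entries cache to /32
    if supernet:
        return supernet_mask           # use the supernet's mask
    if not subnetted:                  # major network entry
        return 32 if directly_connected else major_net_mask
    if directly_connected or multiple_paths:
        return 32
    return longest_in_major_net        # longest prefix length in this major net

print(cache_prefix_length(subnetted=True, multiple_paths=True))   # 32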

Load Sharing
Fast switching is entirely destination based; load sharing occurs on a per-destination basis. If there are multiple equal-cost paths for a particular destination network, the fast cache has one entry for each host reachable within that network, but all traffic destined to a particular host follows one link.

Optimum Switching
Optimum switching stores the forwarding information and the MAC header rewrite information in a 256-way multiway tree (256-way mtree). Using an mtree reduces the number of steps that must be taken when looking up a prefix, as illustrated in the next figure.



Each octet is used to determine which of the 256 branches to take at each level of the tree, which means there are, at most, 4 lookups involved in finding any destination. For shorter prefix lengths, only one to three lookups may be required. The MAC header rewrite and output interface information are stored as part of the tree node, so cache invalidation and aging still occur as in fast switching. Optimum switching also determines the prefix length for each cache entry in the same way as fast switching.