Always Acyclic Distributed Path Computation
ABSTRACT
Distributed routing algorithms may give rise to transient loops during path recomputation, which can pose significant stability problems in high-speed networks. We present a new algorithm, Distributed Path Computation with Intermediate Variables (DIV), which can be combined with any distributed routing algorithm to guarantee that the directed graph induced by the routing decisions remains acyclic at all times. The key contribution of DIV, besides its ability to operate with any routing algorithm, is an update mechanism using simple message exchanges between neighboring nodes that guarantees loop-freedom at all times. DIV provably outperforms existing loop-prevention algorithms in several key metrics, such as the frequency of synchronous updates and the ability to maintain paths during transitions. Simulation results quantifying these gains in the context of shortest-path routing are presented. In addition, DIV's universal applicability is illustrated by studying its use with a routing algorithm that operates according to a non-shortest-path objective. Specifically, the routing seeks robustness against failures by maximizing the number of next-hops available at each node for each destination.

Algorithms used: Distance-vector routing algorithm, Link-state routing algorithm.

INTRODUCTION

Distributed path computation is a core functionality of modern communication networks and is expected to remain so, even though some recent proposals contemplate the use of more centralized solutions. Depending on the mode of information dissemination and subsequent computation using the disseminated information, there are two broad classes of algorithms:
In both approaches, nodes choose successor (next-hop) nodes for each destination based only on local information, with the objective that the chosen paths to the destination be efficient in an appropriate sense, e.g., having the minimum cost. Because end-to-end paths are formed by concatenating computational results at individual nodes, achieving a global objective implies consistency across nodes both in computation and in the information on which those computations are based. Inconsistent information at different nodes can have dire consequences that extend beyond not achieving the desired efficiency. Of particular significance is the possible formation of transient routing loops, which can severely impact network performance, especially in networks with no or limited loop mitigation mechanisms, e.g., no Time-to-Live (TTL) field in packet headers or a TTL set to a large value. In the presence of a routing loop, a packet caught in the loop comes back to the same nodes repeatedly, thereby artificially increasing the traffic load manyfold on the affected links and nodes. The problem, a significant issue even with unicast packets, is further aggravated by broadcast packets, which not only are always caught in any loop present in the network, but also generate replicated packets on all network links. The emergence of a routing loop then often triggers network-wide congestion, which can lead to the dropping or delaying of the very same control (update) packets that are needed to terminate the loop, thereby creating a situation where a transient problem has a lasting effect. Avoiding transient routing loops remains a key requirement for path computation in both existing and emerging network technologies. Link-state algorithms, of which the OSPF protocol is a well-known embodiment, disseminate the state of each node's local links (their status and the node(s) they connect to) to all other nodes in the network by means of reliable flooding.
After receiving link-state updates from the rest of the nodes, each node independently computes a path to every destination. The period of potential information inconsistency across nodes is small (a few tens of milliseconds per node for typical present-day networks), so that routing loops, if any, are very short-lived. On the flip side, link-state algorithms can have quite high overhead in terms of communication (broadcasting updates), storage (maintaining a full network map), and computation (a change anywhere in the network triggers computations at all nodes). These are some of the reasons for investigating alternatives as embodied in distance-vector algorithms, which are the focus of this paper. Distance-vector algorithms couple information dissemination and computation. Information disseminated by a node now consists of the results of its own partial path computations (e.g., its current estimate of its cost to a given destination) that it distributes to its neighbors, which in turn perform their own computations before further propagating any updated results to their own neighbors. The Distributed Bellman-Ford (DBF) algorithm is a well-known example of a widely used distance-vector algorithm that computes a shortest-path tree from a given node to all other nodes. Coupling information dissemination and computation can reduce storage requirements (only routing information is stored), communication overhead (no relaying of flooded packets), and computations (a local change need not propagate beyond the affected neighborhood). Thus, distance-vector algorithms avoid several of the disadvantages of link-state algorithms, which can make them attractive, especially in situations of frequent local topology changes and/or when high control overhead is undesirable. The downside of coupling information dissemination and computation is that information dissemination is gated by computation speed, since a node cannot send updates before finishing its current computations.
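As a concrete illustration of the distance-vector computation described above, the following is a minimal sketch of the DBF relaxation step for a single destination. The tiny three-node graph, the `graph` dictionary layout, and all function names are our own assumptions for illustration, not taken from the paper:

```python
# Minimal sketch of the Distributed Bellman-Ford (DBF) update for one
# destination. Here the per-node iterations are simulated sequentially;
# in the real protocol each node runs its own update and exchanges
# estimates with neighbors via messages.
INF = float("inf")

def dbf(graph, dest):
    """graph: {node: {neighbor: link_cost}}.
    Returns (distance-to-dest, successor) for every node."""
    dist = {v: (0 if v == dest else INF) for v in graph}
    succ = {v: None for v in graph}
    changed = True
    while changed:  # iterate until no node can improve its estimate
        changed = False
        for v in graph:
            if v == dest:
                continue
            # Each node sees only its neighbors' advertised distances and
            # picks the neighbor minimizing link cost + neighbor's distance.
            best_cost, best_nbr = min(
                ((c + dist[u], u) for u, c in graph[v].items()),
                default=(INF, None))
            if best_cost < dist[v]:
                dist[v], succ[v] = best_cost, best_nbr
                changed = True
    return dist, succ

# Hypothetical 3-node example: a--b costs 1, b--c costs 2, a--c costs 4.
graph = {
    "a": {"b": 1, "c": 4},
    "b": {"a": 1, "c": 2},
    "c": {"a": 4, "b": 2},
}
dist, succ = dbf(graph, "c")  # a reaches c via b at cost 3, not directly at 4
```

Note how node `a`'s first estimate (the direct link of cost 4) is later improved once `b` has computed its own estimate, illustrating how results propagate neighborhood by neighborhood.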
This can in turn extend periods when nodes have inconsistent information, which, as discussed earlier and illustrated in Section V, can lead to more frequent and longer-lasting routing loops. In addition, coupling information dissemination and computation can also result in slower convergence. This is because each node depends on the (partial) computation results of its neighbors, which can introduce cyclic dependencies that increase the number of steps needed to reach a final, correct result. Indeed, when destinations become unreachable, a distance-vector algorithm may not even converge in a finite number of steps. This is known as the counting-to-infinity problem, which is absent from link-state algorithms where nodes compute paths independently. (In practice, when the cost-to-destination reaches a maximum value, the destination is declared unreachable and the computation is terminated.) Thus, we see that realizing the benefits of distance-vector based solutions, even in environments where they might be a natural fit, calls for developing approaches to overcome these problems. Such a realization is not new. Since the 1970s, several works have targeted this goal in the context of shortest-path computations. We review and contrast the most significant of these prior works in Section II, but the problem remains timely. Our research was triggered by a renewed interest in devising lightweight, loop-free path computation solutions for large-scale Ethernet networks. Specifically, we were considering extending the scalability of Ethernet networks through the introduction of distributed shortest-path algorithms in lieu of the existing distributed spanning-tree algorithm. Link-state based solutions have been proposed to improve Ethernet networks, although not necessarily for the sake of scalability, and a distance-vector solution seemed an attractive alternative.
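The counting-to-infinity behavior mentioned above can be illustrated with a toy scenario. Assume a hypothetical chain a -- b -- d in which the b--d link fails: b's best remaining "route" goes through a, which in turn routes through b, so every message exchange raises both estimates by one hop until the cap is reached. The node names and the cap value of 16 (the "infinity" used in practice, as in RIP) are illustrative assumptions:

```python
# Sketch of counting-to-infinity after a link failure. Before the failure,
# a is 2 hops from d (via b) and b is 1 hop from d. When the b--d link
# fails, b falls back on a's stale advertisement, and the two nodes keep
# incrementing each other's estimates until the unreachability cap.
MAX = 16  # cost at which the destination is declared unreachable

def count_to_infinity():
    dist_a, dist_b = 2, MAX  # b just lost its link; a still advertises 2
    rounds = 0
    while dist_a < MAX or dist_b < MAX:
        dist_b = min(MAX, dist_a + 1)  # b adopts a's (stale) route through it
        dist_a = min(MAX, dist_b + 1)  # a re-advertises, one hop worse
        rounds += 1
    return rounds

rounds = count_to_infinity()  # many exchanges just to learn "unreachable"
```

With a larger cap the number of wasted exchanges grows proportionally, which is why the cap must stay small and path costs are correspondingly constrained.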
HARDWARE REQUIREMENTS:
SOFTWARE REQUIREMENTS:
MODULE ANALYSIS

1. Distributed Time-to-Live Module

This module targets networks with no or limited loop mitigation mechanisms, e.g., no Time-to-Live (TTL) field in packet headers or a TTL set to a large value. In the presence of a routing loop, a packet caught in the loop comes back to the same nodes repeatedly, thereby artificially increasing the traffic load manyfold on the affected links and nodes. The problem, a significant issue even with unicast packets, is further aggravated by broadcast packets, which not only are always caught in any loop present in the network, but also generate replicated packets on all network links. The emergence of a routing loop then often triggers network-wide congestion, which can lead to the dropping or delaying of the very same control (update) packets that are needed to terminate the loop, thereby creating a situation where a transient problem has a lasting effect. Avoiding transient routing loops remains a key requirement for path computation in both existing and emerging network technologies.

2. Loop-Free Routing Module

Coupling routing information dissemination and computation can result in slower convergence. This is because each node depends on the computation results of its neighbors, which can introduce cyclic dependencies that increase the number of steps needed to reach a final, correct result. Indeed, when destinations become unreachable, a distance-vector algorithm may not even converge in a finite number of steps. This is known as the counting-to-infinity problem, which is absent from link-state algorithms where nodes compute paths independently.

3. Robust Routing Module

We illustrate the benefits of this decoupling using a cost function that, instead of the standard shortest-path distance function, seeks to maximize the number of next-hops available at all nodes for each destination. The availability of multiple next-hops ensures that the failure of any one link or neighbor does not impede a node's ability to continue forwarding traffic to a destination.
A failure results in the loss of at most one next-hop to a destination, so that the node can continue forwarding packets on the remaining ones without waiting for new paths to be computed. In other words, the routing is robust to local failures. This may be an appropriate objective in settings where end-to-end latency is small and bandwidth plentiful.

4. Shortest-Path Computation (or Shortest-Path Simulation) Module

The shortest-path simulations are performed on random graphs with a fixed average degree of 5, but in order to generate a reasonable range of configurations, a number of other parameters are varied. Networks with sizes ranging from 10 to 90 nodes are explored in increments of 10 nodes. For each network size, 100 random graphs are generated. Link costs are drawn from a bimodal distribution: with probability 0.5 a link cost is uniformly distributed in [0, 1]; and with probability 0.5 it is uniformly distributed in [0, 100]. For each graph, 100 random link-cost changes are introduced, again drawn from the same bimodal distribution. All three algorithms are run on the same graphs and sequences of changes. The processing time of each message is random: it is 2 s with probability 0.0001, 200 ms with probability 0.05, and 10 ms otherwise.
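The two randomized quantities above (bimodal link costs and message processing times) can be sampled as follows. This is only a sketch of the distributions stated in the text; the function names, the seed, and the use of Python's `random` module are our own assumptions:

```python
import random

def random_link_cost(rng):
    # Bimodal, as in the simulation setup: with probability 0.5 uniform
    # in [0, 1], otherwise uniform in [0, 100].
    if rng.random() < 0.5:
        return rng.uniform(0, 1)
    return rng.uniform(0, 100)

def message_processing_time(rng):
    # 2 s with probability 0.0001, 200 ms with probability 0.05,
    # 10 ms otherwise (times returned in seconds).
    u = rng.random()
    if u < 0.0001:
        return 2.0
    if u < 0.0001 + 0.05:
        return 0.2
    return 0.01

rng = random.Random(42)  # fixed seed so runs are reproducible
costs = [random_link_cost(rng) for _ in range(1000)]
times = [message_processing_time(rng) for _ in range(1000)]
```

The occasional 2 s processing time models a heavily loaded node; drawing link-cost changes from the same bimodal distribution exercises both small perturbations and drastic cost swings in one experiment.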

