
Routing Loop guidelines?


Any general recommendations on how to avoid routing loops? (And in a mesh network, what's considered a loop and what's considered an alternate path?)


Orv - W6BI


If you are referring to the comment in the Ops Notes about avoiding loops, that is really about a switching loop. The simple solution is to NEVER plug two switches into each other, or into a third switch, in a way that creates a loop. Follow your wires: ignore the VLAN configuration and trace the physical cable, and if you find any path back to your starting point, you have a loop.
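The "follow your wires" check can be sketched in a few lines of Python. This is only an illustration: the switch names and the cable list are made up, and it treats every cable as a plain link between two devices, ignoring VLANs just as described above.

```python
def find_loop(cables):
    """Return the first cable that closes a loop, or None if the wiring is loop-free.

    `cables` is a list of (device_a, device_b) pairs, one per physical cable.
    Uses union-find: a cable whose two ends are already connected closes a loop.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    for a, b in cables:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            return (a, b)  # both ends already reachable from each other
        parent[root_a] = root_b
    return None


# Hypothetical wiring: three switches cabled in a triangle.
triangle = [("sw1", "sw2"), ("sw2", "sw3"), ("sw3", "sw1")]
chain = [("sw1", "sw2"), ("sw2", "sw3")]
```

Here `find_loop(chain)` finds nothing, while `find_loop(triangle)` reports the third cable as the one that leads back to the starting point.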

Now, in regards to the mesh itself: loops may happen from time to time while nodes are not yet fully synced up. These should be short-lived in most cases. In general the solution is "make sure your links are stable," though this won't always be possible in the real world. The next best solution is "make sure your primary routes are stable." That is achievable in the real world by designing your network ahead of time (don't just plan on gear showing up in a disaster; have the network built and in daily use).

Generally, as far as I can think, this should only happen when the preferred link suddenly goes down: a node receives the packet and sends it back the way it came, because that now looks to it like the best working path. It should clear up in very short order.
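A toy model of that situation, assuming each node simply keeps a next-hop table per destination. The node names (A, B, C, D) and the tables are invented for illustration; this is not how OLSR stores its routes, just a way to see why stale information produces a short-lived loop.

```python
def trace(next_hop, src, dst):
    """Follow per-node next-hop tables from src toward dst.

    Returns (path, status), where status is "delivered", "loop", or "no route".
    """
    path = [src]
    seen = {src}
    node = src
    while node != dst:
        hop = next_hop.get(node, {}).get(dst)
        if hop is None:
            return path, "no route"
        path.append(hop)
        if hop in seen:
            return path, "loop"  # packet came back to a node it already visited
        seen.add(hop)
        node = hop
    return path, "delivered"


# Hypothetical topology: A-B-C is the preferred path, A-D-C is the alternate.
# The B-C link has just failed. B already knows and routes via A,
# but A has not heard yet and still routes via B.
stale = {"A": {"C": "B"}, "B": {"C": "A"}, "D": {"C": "C"}}

# Once A has the updated information it switches to D and the loop is gone.
converged = {"A": {"C": "D"}, "B": {"C": "A"}, "D": {"C": "C"}}
```

With the stale tables, `trace(stale, "A", "C")` bounces A, B, A and reports a loop. With the converged tables, the same packet from B goes B, A, D, C and is delivered.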


This could also happen when the link quality changes but the link stays online. The solution is the same as above: make sure the primary routes are stable and low-loss so that they are the preferred routes. If you can do this, the odds are that these good links will beat ever-changing multi-hop paths that could create loops.
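To put rough numbers on that, here is a small sketch. It assumes the ETX metric used by OLSR's link-quality extension, where a link costs 1 / (LQ x NLQ) and a path costs the sum of its links. The link-quality figures are invented, not measurements from any real network.

```python
import math


def link_etx(lq, nlq):
    """Expected transmissions for one link, given link quality in each direction (0..1)."""
    product = lq * nlq
    return math.inf if product == 0 else 1.0 / product


def path_etx(links):
    """Total cost of a path given as a list of (lq, nlq) pairs, one per hop."""
    return sum(link_etx(lq, nlq) for lq, nlq in links)


# Hypothetical figures: one decent direct link versus two very good hops.
direct = [(0.90, 0.90)]                   # about 1.23
two_hop = [(0.95, 0.95), (0.95, 0.95)]    # about 2.22
degraded = [(0.60, 0.60)]                 # about 2.78
```

Under these assumptions the stable direct link wins comfortably, even though each hop of the alternate path is individually better. If the direct link degrades to 60% each way, its cost rises above the two-hop path and the route flips, which is exactly the kind of change that can open a window for a transient loop.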

As networks get more connections, I suspect some 'loop routing' is bound to occur at some point in the network lifecycle, but it should self-correct in fairly short order.

Again, good stable links (even poor-quality ones, as long as they are stable) will significantly reduce the chances of it occurring.

AE6XE
mesh loop example

See the attached pic from "Reducing Routing Loops Under Link-State Routing in Wireless Mesh Networks" by Takuya Yoshihiro and Masanori Kobayashi.

This looks like a good example of the loops that can occur on a mesh. A link can go down, or appear to the OLSR routing protocol to go down, because of RF loss and/or high traffic. In both cases OLSR packets may be lost, causing OLSR to change the routing. Until all the nodes have the updated routing information, unintended things happen. These kinds of loops can be minimized by designing and building in higher-quality links.

When we think about emcomm needs, we also start thinking about Quality of Service (QoS) built into the node to give priority to incident traffic and to help sustain stable links. It's something we've talked about putting into a release.



routing and switching loops

If the problem is loops with Ethernet switches, there's an elegant solution: get switches that support STP (Spanning Tree Protocol). If you create a loop, STP running on the switches will automatically discover it and disable enough links to eliminate the loops, i.e., by creating a spanning tree.

Should an active link fail, the disabled links will be automatically re-enabled to restore connectivity; that's the advantage of STP over avoiding loops by manually disconnecting links. With STP, though, there's no way to split the load between links in different directions; you need IP routing for that.
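The idea can be sketched as follows. This is a deliberate simplification, not the real protocol: actual STP elects the root by bridge ID, weighs port costs, and exchanges BPDUs, whereas this sketch just grows a breadth-first tree from a root picked by hand. The switch names are made up, and it assumes at most one cable between any pair of switches.

```python
from collections import deque


def spanning_tree(links, root):
    """Return (active, blocked) sets of links for a breadth-first tree from `root`.

    Each link is an (a, b) pair; results are sets of frozensets so direction is ignored.
    """
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)

    active = set()
    visited = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for peer in sorted(neighbors.get(node, [])):
            if peer not in visited:
                visited.add(peer)
                active.add(frozenset((node, peer)))  # first path found stays enabled
                queue.append(peer)

    blocked = {frozenset(link) for link in links} - active
    return active, blocked


# Hypothetical triangle of switches, with sw1 as the root.
triangle = [("sw1", "sw2"), ("sw2", "sw3"), ("sw3", "sw1")]
active, blocked = spanning_tree(triangle, "sw1")

# The sw1-sw2 cable fails: recompute over the surviving links.
survivors = [("sw2", "sw3"), ("sw3", "sw1")]
active_after, blocked_after = spanning_tree(survivors, "sw1")
```

In the triangle, the sw2-sw3 link ends up blocked, which breaks the loop. After the sw1-sw2 failure, the recomputed tree puts sw2-sw3 back into service and nothing is blocked, which is the self-healing behaviour described above.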

Regarding IP QoS, I'm willing to help. I have a lot of experience with this under Linux, mainly as a cure for "bufferbloat" on cable and DSL routers. There's a 6-bit DSCP (Differentiated Services Code Point) field in every IP header that can be used to indicate the packet's priority or special nature (e.g., VoIP). The mechanisms are already there in the Linux kernel; they just need to be turned on and configured. Different networks can define different DSCP values, so the main thing is that everyone agree on the meaning of a set of DSCP values. There are some standard conventions that I do recommend starting with. E.g., DSCP=0 is default handling and DSCP=8 is commonly used to indicate "scavenger class", a priority below default that's good for bulk traffic like BitTorrent. With that marking in place, I can let bulk traffic saturate my cable modem without interfering with interactive response at all.
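For anyone who wants to experiment, here is a minimal sketch of how an application could mark its own packets. It relies on the standard `IP_TOS` socket option, where the 6-bit DSCP sits in the upper bits of the old TOS byte, so the value is shifted left by two. The constant and function names are mine, chosen for illustration. Marking alone does nothing until the queues along the path are configured to honour it.

```python
import socket

DSCP_DEFAULT = 0     # default handling
DSCP_SCAVENGER = 8   # "scavenger class" convention mentioned above


def dscp_to_tos(dscp):
    """Convert a 6-bit DSCP value to the byte expected by the IP_TOS option."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    return dscp << 2  # the low two bits of the byte are used for ECN


def open_marked_udp_socket(dscp):
    """Create a UDP socket whose outgoing IPv4 packets carry the given DSCP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock
```

A bulk-transfer tool would call `open_marked_udp_socket(DSCP_SCAVENGER)` and then use the socket as usual. How faithfully the marking is applied depends on the operating system, so treat this as a starting point rather than a guarantee.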


Linux supports STP

Oh, I should have mentioned that Linux has supported the 802.1D Spanning Tree Protocol for some time. It does not support the newer 802.1w Rapid STP, but switches that do support 802.1w automatically fall back to 802.1D as needed.

This means a Linux system (such as an AREDN/BBHN node) could potentially be used as an Ethernet bridge in conjunction with regular commercial Ethernet switches. Right now this isn't much use because most of the Ubiquiti nodes have only one physical Ethernet port, but I can see how it might be useful in the future with more complex node configurations. Again, the big advantage of STP over just manually avoiding switching loops is that you can add redundant links that will be automatically disabled as necessary to avoid loops, but automatically re-enabled should it be necessary to heal the network after a link failure.

Even though Ethernet is normally quite reliable, I've seen my share of RJ-45 connectors with bent, misaligned or slightly corroded pins that stop working without warning. If this happens in a remote hilltop site, STP could save you from having to make an immediate trip up to the site.
