Our local ham emcomm group is having a rather heated discussion about the topology for our network.
The terrain is a mix of hills and flat ground with lots of former creek channels running across the area, i.e. elevation differences of 20 to 30 feet over a small area. Total coverage area is around 10 square miles. There are LOTS of trees in both the hilly and flat areas.
One group feels the goal is a fairly dense network of small rooftop omnidirectional nodes, like a PicoStation with the included antenna, or perhaps a gain omni. The use of directional antennas is frowned upon. The stated goal is to be able to drop a node in anywhere and have it connect. The process is to just drop nodes wherever someone has interest and wait for the mesh to happen.
The other group proposes a backbone, or perhaps a double backbone, made up of point-to-point (i.e. directional) links, as necessary to ensure a sufficiently high signal quality. Equipment selection would be based on the specifics of each link, with a preference for MIMO stations. Links would be made off the backbone to neighborhoods desiring service. Those links would probably be directional as well, but might rely on omnidirectional antennas if there was sufficient density.
I assume this is not a unique discussion and that other groups have had to resolve this or similar issues.
I would be very interested in hearing from others who have grappled with this type of issue. Specifically, what factors did you consider, and how did you resolve the design questions?
BTW, I have not posted the arguments for and against each design philosophy, to avoid prejudicing the discussion at the start. But I would be quite willing to post them later on.
Richard - wb6tae
This technology, as adapted to our community, is in the early stages; consequently there isn't any posted data one way or the other that I've seen. What I have seen is mostly university mesh network research "in the lab" with published papers.
My crystal ball says this is sort of like decentralization vs centralization discussions--which one is better to achieve the organization's goals? Well, it depends... and oftentimes both options are in use at the same time.
Trying to tune the network performance depends greatly on the traffic patterns--which are also not well known and don't yet exist to examine for your usage. What is the purpose? Is there an EOC hub somewhere? You might need a solid high speed connection to another location you can count on--what % of the nodes in the area are off-grid powered? What can you count on?
Locally in my area, we have a lot of rolling hills, with local mountains (like mile high elevation from the beach). We will end up using all available options--high nodes to get started for everyone to join in and learn what this off-the-grid network can do. Then as we gain momentum (we surely will!) begin to leverage a node on every ham's roof.
Joe AE6XE
I would just add to Joe's comments... most meshes aren't driven by "served agency" requirements and don't typically have a large budget that affords a structured rollout. They start as grass roots efforts and then combine into larger, more effective networks. So what you end up with to start are mesh islands that need to get connected somehow. When you do that... whether you intended to or not, you end up with a backbone joining these islands.
There's nothing wrong with that, OLSR handles it just fine... but the Carrier Sense Multiple Access/Collision Detect (CSMA/CD) protocol inherent in all Ethernet networks that share the same physical layer---including AREDN---probably isn't the best choice for a backbone due to RF's hidden transmitter problem.
An alternative that avoids this problem is Time Division Multiple Access (TDMA). We are just now experimenting with this in the San Diego Mesh Working Group and will post results when we have them.
Andre, K6AH
Joe:
When you say "on every ham's roof," what kind of density do you envision? If I figure a PicoStation with the factory antenna has an effective radius of 1/4 mile (400 meters), I'd need 4 (well placed) nodes per square mile. If I figure the effective radius is more like 1/8 mile (200 meters), I'd need 16 nodes per square mile.
I've heard that "hams are dense" ;-) but, not that dense. Which suggests to me that as node sites become available each one would need to be studied for the best means of connecting, thus negating the idea of "dropping in" nodes as needed/desired.
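The density figures above can be reproduced with a rough sketch. It assumes each node covers a square cell whose side equals its coverage diameter (a simplification that ignores terrain, trees, and overlap), and the function name is made up for illustration. The 10 square mile figure is the coverage area mentioned at the top of the thread.

```python
import math

def nodes_per_square_mile(radius_miles: float) -> int:
    """Nodes per square mile, assuming each node covers a square cell
    whose side is the coverage diameter (2 * radius). Rough estimate only."""
    cell_side = 2 * radius_miles
    cells_per_side = math.ceil(1 / cell_side)
    return cells_per_side ** 2

COVERAGE_AREA_SQ_MI = 10  # total coverage area from the original post

for radius in (0.25, 0.125):
    per_sq_mile = nodes_per_square_mile(radius)
    total = per_sq_mile * COVERAGE_AREA_SQ_MI
    print(f"radius {radius} mi: {per_sq_mile} nodes/sq mi, about {total} nodes total")
```

Under these assumptions the whole area would need on the order of 40 nodes at a 1/4 mile radius and 160 at 1/8 mile.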
Richard - wb6tae
Yes, I agree that each situation needs to be engineered to optimize how a local mesh performs. Too much density would have a negative effect--if too many nodes are contending for the same RF space--the CSMA issue Andre discussed. (High relay node with many distant client nodes OR many nodes in close proximity that all see each other.)
Without a lot of data yet, a reasonable rule of thumb might be to target between 2 and ~8 neighbor link paths for everyone (of shared RF space) no matter what the density of nodes is on the ground. If everyone has 20 neighbors (and half of them are all the same neighbors as everyone else), then the traffic volume is impacted/reduced and contention occurs in the RF space. If everyone has only 1 neighbor, then more RF hop routing and no fault tolerance.
Joe AE6XE
Andre...
To make sure I understand. Are you saying: The "hidden transmitter" problem affects all meshed networks that use CSMA/CD at the physical layer and that this issue becomes particularly problematic when trying to construct a reliable "backbone"?
Richard - wb6tae
Yes. If not designed correctly. The following only applies to larger implementations.
When you have "high-ground" sites that hear lower-ground sites that don't hear each other, you will potentially have collisions at the physical layer (the RF ether in our case). This is okay for lower data rates... CSMA/CD deals with it fine, just as Ethernet does on the wire and hubs. To avoid this on a higher data-rate backbone you will want to:
Remember, you may not experience the effects of these collisions until the backbone takes on more and more data. But eventually these links will become overwhelmed with retransmissions. Those of us who were around during the AX.25 packet days will remember the frustrating effects of high-ground digipeaters due to the hidden transmitter problem. To a lesser extent APRS is still dealing with it today.
Andre
This is good info. I'm building a brand new mesh, planning on using 900MHz as a quick experiment due to antennas we already have. I think using a separate band for the core is more expensive, but in the end may make the mesh more reliable.
Good strategy. Any time you can move data traffic to another band and avoid that traffic's contribution to local RF noise and congestion, the better it's going to work.
Now that the firmware supports three bands we're having those same discussions: which band is best suited for backbone, do you alternate bands between mountaintops, which is more appropriate for the "valley floor" meshing, etc.
Knowing when we might be able to move the 2.4 GHz radios into the Part 97 portion of that band would greatly assist our planning, even if it was a SWAG.
Attached below is an image of what might be a direction for us to go in for a semi-urban network.
The blue lines are 2.4 GHz and mostly connect omnidirectional nodes, though in a few cases (backed up against a mountain) the antenna might be directional to a greater or lesser degree.
The red lines connect nodes on some frequency other than 2.4 GHz, probably 5 GHz and almost certainly directional. In fact, there might be multiple directional nodes linked together as a virtual node point.
I am assuming the well engineered "red line" links will have low ETX costs associated with them and traffic between more distant parts of the mesh will flow across those links. The entire mesh will be one logical network. Any external links will be incidental and the main function of the network is directed to community emergency services.
Are there any gotchas to look out for here?
I have also thought about setting the "red links" up as a separate IP network and routing across it. However, that would have the disadvantage of requiring the blue nodes to be broken into separate IP network blocks as well. Any thoughts on this approach?
This is a typical mesh.
The only variable you're adding is frequency separation at two co-located nodes. In the OSI model it's all about sharing layer 1, the Physical layer (the RF ether in this case). IP addressing doesn't come into play until layer 3, so that would have no impact.
So aside from these two co-located nodes, I don't see any real value in the network layout.
Andre
Andre:
I am not quite sure I understand your comment. In the example I drew, the idea behind the red links was to reduce the number of hops required between two distant nodes. In the example, the reduction was 2 or 3 hops, depending on how the route would have been chosen. Of course, in practice, that might be much higher, depending on node density and the visibility of one blue node to another. The red links do not need to be on another band; that was just to avoid adding noise. They could simply be highly directional links on 2.4 GHz.
However, my main question here is HOW do we reduce the number of hops required to connect two distant points on a dense network?
Sorry, I misunderstood the question.
OLSR defines the best route based on the estimated transmissions (ETX) required to get the data from the source node to the destination node. Each hop is assigned an ETX based on Link Quality (LQ). Mathematically, ETX is inversely proportional to LQ... for example, an LQ of 50% has an ETX of 2. All the hop ETX values are added up and the lowest path ETX is the path the data will take. Note that Ethernet links are given an ETX value of 0.1 because they are very reliable. You can see these ETX values next to the Remote Nodes listed on the Mesh Status screen.
So it's all dependent on LQ, not hop-count... although each RF hop adds a minimum of 1 to the ETX path value.
Andre
Here's for the math folks in the crowd:
LQ = % success of OLSR received packets from my neighbor
NLQ = % success of OLSR received packets my neighbor receives from me
ETX = 1/(LQ * NLQ) <- Expected transmissions to get a round trip packet (or 'link cost') where '1' is a perfect link.
A backbone that jumps over several nodes has a perfect link cost of 1. Hopping through, say, 3 nodes with perfect links has a link cost of 3.
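For anyone who wants to plug numbers in, here is a minimal sketch of the single-link formula above. The function name is made up for illustration, and LQ and NLQ are passed as fractions (0.75 for 75%).

```python
def link_etx(lq: float, nlq: float) -> float:
    """ETX for one link: 1 / (LQ * NLQ), with LQ and NLQ as fractions in (0, 1]."""
    if not (0 < lq <= 1 and 0 < nlq <= 1):
        raise ValueError("LQ and NLQ must be fractions greater than 0 and at most 1")
    return 1.0 / (lq * nlq)

# A perfect link in both directions costs 1.
print(link_etx(1.0, 1.0))

# LQ of 50% with a perfect return direction costs 2.
print(link_etx(0.5, 1.0))

# 75% in both directions costs a bit under 1.8.
print(round(link_etx(0.75, 0.75), 2))
```

Note that a 50% LQ only gives an ETX of 2 when the other direction is perfect; if both directions are at 50%, the same formula gives 4.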
I can't make the formula work. An adjacent node has an LQ of 75%. His LQ to a node one hop from him is 76%. According to the formula the ETX I should see at my site would be 1/(.75 * .76) which equals 1.75 calculated ETX at my node. But the ETX shown is consistently higher than that - in the range of 4 to 6.
What am I doing wrong?
Orv - W6BI
The NLQ is not displayed on the mesh status screen; you would have to look at the OLSR daemon.
What you need to do is take the LQ from you to the neighbor node and the LQ of the neighbor node to you (that should be the same as the NLQ, since it's looking at LQ from the remote side) and throw them into the formula.
With that value you get the 1-hop ETX. Now repeat with the hop between the neighbor node and its neighbor and add that to the ETX from the step above; you now have the two-hop ETX cost, and so on and so on.
Note: DtDLink is hard coded at .1 and is an exception to the formula.
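The per-hop procedure above can be sketched as follows. The 75% and 76% LQ values come from Orv's question; the NLQ values (60% and 55%) are made-up placeholders, since the real ones would have to be read from the OLSR daemon. Function and constant names are also just for illustration.

```python
DTD_LINK_COST = 0.1  # fixed DtDLink cost, per the note above

def hop_etx(lq: float, nlq: float) -> float:
    """ETX of a single RF hop: 1 / (LQ * NLQ), both as fractions."""
    return 1.0 / (lq * nlq)

def path_etx(hops) -> float:
    """Sum hop costs along a path.

    Each hop is either an (lq, nlq) tuple for an RF link,
    or the string "dtd" for a DtDLink hop.
    """
    total = 0.0
    for hop in hops:
        if hop == "dtd":
            total += DTD_LINK_COST
        else:
            lq, nlq = hop
            total += hop_etx(lq, nlq)
    return total

# Two RF hops: LQ values from the question, NLQ values hypothetical.
two_hop = path_etx([(0.75, 0.60), (0.76, 0.55)])
print(round(two_hop, 2))

# Same path with a DtDLink hop added at the far end.
print(round(path_etx([(0.75, 0.60), (0.76, 0.55), "dtd"]), 2))
```

With those placeholder NLQ values the two-hop total lands between 4 and 5, which shows how a path can sit well above the 1.75 obtained by multiplying the two LQ figures together.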
That's much closer. Thanks for clearing that up.
73 - Orv - W6BI
Note, however, that OLSR (and this is version 1 of OLSR) doesn't do a great job of characterizing the true 'quality' of a path through a mesh. The olsr.org folks are busy working on v2, but it is not quite 'code complete' enough to begin serious testing. OLSR v2 is months and years away for our use, and they are working on better methods to characterize the quality.
Back to olsr v1 we use today. Consider the characterization of these two mesh paths:
1) 1 hop RF link from node A to B with ETX = 5 (not a good quality link)
2) 6 hops via RF links through 6 nodes from A to B with ETX = 6 (6 perfect quality RF links)
If path 1 were the long-distance backbone, it would have great difficulty streaming video with an ETX of 5. There's too much loss. Path 2 does not have loss but, with its higher ETX of 6, would not be selected for use by OLSR. The streaming video would work with no issues on path 2.
However, in reality, will we have 6 back-to-back perfect RF paths very often? It remains to be seen if the OLSR v1 method is going to get it wrong in practice with our usage. But we can be sure that v2 will bring improvements in the future to better characterize the 'quality' of paths through the mesh.
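The two-path comparison above can be sketched in a few lines, assuming route selection is simply "lowest summed ETX wins" as described earlier in the thread. The 1/ETX figure is only a rough indicator of how often a round trip succeeds on the weakest hop, not something OLSR itself reports.

```python
# Per-hop ETX values for the two example paths from A to B.
paths = {
    "path 1 (1 hop)": [5.0],
    "path 2 (6 hops)": [1.0] * 6,
}

def path_cost(hop_etx_values):
    """Total path cost: the sum of the per-hop ETX values."""
    return sum(hop_etx_values)

def worst_hop_success(hop_etx_values):
    """Rough round-trip success rate per attempt on the weakest hop (1 / ETX)."""
    return 1.0 / max(hop_etx_values)

# Lowest total ETX wins, regardless of how lossy any single hop is.
selected = min(paths, key=lambda name: path_cost(paths[name]))

for name, hops in paths.items():
    print(name, "cost", path_cost(hops), "worst hop success", worst_hop_success(hops))
print("selected:", selected)
```

Path 1 is selected with a cost of 5 even though its single hop only succeeds about one attempt in five, while every hop on path 2 is lossless.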
Yeah, I've read up on the weaknesses of the current implementation of OLSR; lot of improvement potential there.
I've told our local group that we need to shoot for an LQ/NLQ of at least 75% on every backbone link. Does that seem a reasonable goal?
w6bi, I suspect at the end of the day, the final indicator is the quality of the services going across this link, and that if these services are functioning sufficiently, it is a good link. I do see on some of the local mesh a wide variation of LQ from day to night and from day to day. Getting an LQ measure at any one point may not give the full picture.
For example, from our local City Hall there are dual 2 GHz and 5.8 GHz 8-mile links to communicate out. The primary on 5.8 GHz is a 30dB rockdish to a 16dB sector panel on the other end. Much of the time this link is in the high 90% range and often 100%. But there are times when it will drop down to the 70% range. I'd characterize it as a good link we can count on. If a link shows 75% at a good moment, then drops to 40% at times, this is problematic. There's a ~13 mile link locally with this 40% to 70% characterization that I have frequent VoIP calls over. I suspect my low cost Grandstream ATA could handle the lower LQ better, but my VoIP calls are often fragmented and sometimes drop out.
Thus, if 75% is the bottom of the day-to-day typical link quality, the example here says that's good. If it's at the top of LQ range observed, then this suggests a few more $$s are needed :) .
Put into context, this is Orange County, CA, and heavily RF congested. I'm looking at moving to ch165 on 5.8 GHz. Note that ch149 is the default in all versions of the firmware only because it sorts first in the selection list. The 165 20 MHz channel does not have another assigned channel in the ISM space to pair with for some of the 802.11n 40 MHz bandwidth modes. There were some other reasons too, but my wifi scans show it completely unused every time I've looked. I'm hoping to see the LQ range narrow when moving to 165. The 3 GHz option is very close too and also a good option for a backbone. It looks promising to test 3 GHz in May or June on top of Pleasants PK at 4000' over Orange County.
Joe AE6XE
Higher is always better; I don't think there is a single "good" number. I will say that you can expect it to decrease as load increases.
A lot of this will also depend on your network design. If there is a key resource for the area, it should be located as few hops away from everyone as possible and on as high a quality link as possible. For example, I would say the EOC or similar central command station should be on a 100% link (if possible) to the 'backbone' network so that its own link does not degrade the system.
A backbone (which I think of as core infrastructure solely dedicated to moving data from one local area to another) should have as little loss as possible and as many direct paths as possible.
Take a 4 hop network with an LQ of 75% on each hop (local->backbone->backbone->backbone->destination): a message has only around a 31% chance (0.75^4, if I understand the math correctly) of making it from local to destination. Each time it fails it has to be retransmitted, which will slow the local access layer (not such a big deal) and fill up the backbone (more of a big deal).
Factor into this that most everything one does (like sending an email, or winlink message, etc) usually needs more than 1 packet, and it adds up.
EDIT 2305 PDT
I ran the above math as an independent occurrence calculation. However, I forgot to take into account that while LQ may only be 75%, the 802.11 protocol has methods in place to retransmit a packet at that individual hop, rather than having to wait for the packet to be fully resent from the originator. This would likely significantly increase the probability of the packet making it to its destination above the 31% quoted. However, I am not skilled enough at statistics to give a new calculation with this dependent/independent stream combined together.
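A rough way to combine the two effects, under some loud assumptions: every transmission attempt succeeds independently with probability equal to the LQ (LQ is really measured on OLSR packets, so this is only a stand-in for data frames), and each hop gets a fixed number of tries before giving up. The attempts_per_hop value of 4 below is an arbitrary illustration, not the actual 802.11 retry limit.

```python
def end_to_end_success(per_attempt_success: float, hops: int,
                       attempts_per_hop: int = 1) -> float:
    """Chance a packet crosses every hop, assuming independent attempts.

    A hop fails only if all of its attempts fail, so
    per-hop success = 1 - (1 - p) ** attempts_per_hop.
    """
    per_hop = 1.0 - (1.0 - per_attempt_success) ** attempts_per_hop
    return per_hop ** hops

# No per-hop retries: the original 4-hop, 75% calculation.
print(round(end_to_end_success(0.75, hops=4), 3))

# Hypothetical 4 tries at each hop before the packet is dropped.
print(round(end_to_end_success(0.75, hops=4, attempts_per_hop=4), 3))
```

Under these assumptions the per-hop retries lift the 4-hop delivery chance from roughly 32% to above 98%, at the price of extra airtime on every lossy hop.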