WG Tunnel Connectivity Loss

aa7cl
WG Tunnel Connectivity Loss
Is anyone seeing repeated issues with WireGuard tunnel clients/servers losing connectivity via the tunnel? I have 5 clients and 1 server configured. Periodically, I'm unable to connect to the nodes across the tunnels. This seems to be somewhat random: of the 5 tunnel clients, I may lose three of them while the other two are fine. The tunnels show as active, but no traffic passes. To try to regain connectivity, I've asked the clients to reboot, but that doesn't help. Only when I reboot my own node am I able to regain connectivity. After the reboot, all tunnel connectivity is restored for a day or two, then I lose 1-3 tunnels again.

Yesterday, I downloaded and installed the latest nightly build, and three hours later, lost one server tunnel and two clients. 

In the log files, I'm seeing duplicate IP warnings from OLSRD. The strange thing is that the duplicate IP is associated with a tunnel that is configured on one of my clients' nodes, not mine. I'm also seeing kernel out-of-memory warnings.

What, if any, troubleshooting can I perform to determine the cause of this issue? I thought the latest nightly was supposed to resolve some of these issues?
I have downloaded a support data file in case it is needed.

Thoughts and ideas would be greatly appreciated!

Jim - AA7CL


 
K7EOK
Something is definitely wrong
Something is definitely wrong right now with Babel.  I am watching three tunnels into a central location, and Babel routing had stopped working completely on all three WireGuard connections.  I believe all four nodes (3 clients and one server) are on 20250909...

The navigation page via Babel only showed the one node the user is connected to and not even the other side of the tunnel.  The OLSR side shows the entire mesh network and functions.  I could not download a report from the tunnel server.  I've rebooted the tunnel server node and Babel routing has returned.  

One of the users is on a Babel-only version, so he had absolutely nothing at all.  Thank goodness I haven't yet gone to Babel-only on any nodes I depend on; otherwise I would have zero control over my remote nodes.

The nightlies 0909 and beyond seemed to be just fine ... until this morning.

Ed

 
w6bi
Yet another nightly
Ed, 0912 (#2947) has a fix that may help.  Give it a shot.

And remember, the only difference between Babel-only builds and the regular nightly builds is the presence or absence of OLSR.   If both ends of a link are Babel capable, they'll run Babel instead of OLSR, regardless of what version of nightly build they're on.

Orv W6BI
aa7cl
Fix?
Hi Orv,
I'm not able to locate the fix you mention. Is this a current bug tracking number (#2947)?
Where might I find this info?
I attempted to update to the latest nightly; however, my hAP won't accept the firmware upgrade.

Looking at the log files, I see more complaints about duplicate IPs, plus out-of-memory kills. I even tried the Babel-only nightly build and got the same result: my hAP reboots with the same firmware it had before. Not sure what is going on.

Snippets:
Wed Sep 17 07:12:24 2025 daemon.info olsrd[22392]: olsr.org - 0.9.6.2-git_0000000-hash_c72a27c146a93beb7637d5a92ad1d1c3 successfully started
Wed Sep 17 07:12:24 2025 daemon.info olsrd[22392]: You might have another node with main ip 10.15.210.19 in the mesh!
Wed Sep 17 07:12:25 2025 daemon.info olsrd[22392]: You might have another node with main ip 10.15.210.19 in the mesh!

Wed Sep 17 07:10:53 2025 kern.err kernel: [49174.579478] Out of memory: Killed process 2096 (babeld) total-vm:2180kB, anon-rss:692kB, file-rss:4kB, shmem-rss:0kB, UID:0 pgtables:16kB oom_score_adj:0
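
For anyone who wants to tally these two kinds of messages from a saved log on a PC, here is a minimal Python sketch. The function name `summarize_log` and the regular expressions are illustrative assumptions modeled only on the lines quoted above; they are not part of any AREDN tooling, and other firmware versions may word these messages differently.

```python
import re
import sys
from collections import Counter

# Patterns modeled on the log lines quoted in this post (assumption:
# other builds may format these messages differently).
DUP_IP = re.compile(
    r"olsrd\[\d+\]: You might have another node with main ip (\d+\.\d+\.\d+\.\d+)"
)
OOM = re.compile(r"Out of memory: Killed process (\d+) \(([^)]+)\)")


def summarize_log(lines):
    """Count duplicate-main-IP warnings per IP and OOM kills per process name."""
    dup_ips = Counter()
    oom_kills = Counter()
    for line in lines:
        match = DUP_IP.search(line)
        if match:
            dup_ips[match.group(1)] += 1
            continue
        match = OOM.search(line)
        if match:
            oom_kills[match.group(2)] += 1
    return dup_ips, oom_kills


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python summarize_log.py saved_syslog.txt
    with open(sys.argv[1], errors="replace") as handle:
        dups, ooms = summarize_log(handle)
    for ip, count in dups.most_common():
        print(f"duplicate main IP {ip}: {count} warning(s)")
    for proc, count in ooms.most_common():
        print(f"OOM kill of {proc}: {count} time(s)")
```

Seeing which process the kernel kills most often, and which IP keeps being flagged, may help narrow down whether the memory problem and the duplicate IP are related.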


Thanks,
Jim - AA7CL
w6bi
Updates
I've seen a duplicate IP report once or twice in the last 10 years.  It's rare but it does happen.  Once you get your station stabilized, see if you can track down the other station.   One of you is going to have to do something about it.

If you have the internal radios in the hAP enabled, turn them off and try the upgrade.  It turns out the radio daemon is a memory hog.

There are two methods of updating an AREDN node.  One is to have the node find the firmware and download it.  If that fails, go to https://downloads.arednmesh.org/afs/www/ , enter your model and download the firmware to your PC.  Then upload it to your hAP.

Orv W6BI
K7EOK
Orv, your comment reveals one
Orv, your comment reveals one of the most difficult parts of the arednmesh.org website.

There is no repository linked or available for earlier nightly builds.  I cannot go find the 0912 nightly for a node type I have not already downloaded and archived.  Many of our problems have come from applying nightly builds that turn out to introduce new problems.

Your post was on Sept 16, and you reference a nightly from Sept 12.  Sure, I'd try that one as no one has had any emergencies (at least that I'm aware of).  But the website won't let me choose this.

Ed
