Fixing the sluggish degrading node issue

AE6XE
Fixing the sluggish degrading node issue
Good news: there is hope that the symptoms of nodes becoming sluggish or ceasing to respond will go away. The upstream OpenWrt 19.07 release is very close to its release candidate and final release. AREDN's strategy is to follow quickly with a release of its own, bringing the numerous upstream fixes and new features to our community. OpenWrt developers are tracking the remaining work here:

https://github.com/openwrt/openwrt/pull/2507

Of particular note is item #7, "Blocker: rpcd leaks memory". This defect can be reproduced by repeatedly running "iwinfo", the command to obtain wireless information. It so happens that this is what AREDN runs every minute to capture snrlog data, again every minute to mitigate the deaf-node condition, and whenever nodes are queried to produce maps. Each call consumes memory that is never released. Eventually the available memory is used up and the node doesn't have enough left to function.
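
To check for this kind of leak, one could sample the resident memory of the rpcd process once a minute and fit a line through the samples. Below is a minimal sketch in Python, assuming a Linux /proc filesystem. The helper names (parse_rss_kb, read_rss_kb, leak_slope_kb) and the sample values are hypothetical and not part of AREDN or OpenWrt. Stock nodes may not have Python installed, so this is meant for a test box or for data copied off a node.

```python
import re
from pathlib import Path


def parse_rss_kb(status_text):
    """Extract the VmRSS value (in kB) from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("VmRSS not found in status text")
    return int(match.group(1))


def read_rss_kb(pid):
    """Read the current resident set size of a process from /proc (Linux only)."""
    return parse_rss_kb(Path(f"/proc/{pid}/status").read_text())


def leak_slope_kb(samples):
    """Least-squares slope of memory samples, in kB per sample interval.

    A slope that stays clearly positive over many samples suggests memory
    is being consumed and not released.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


# Hypothetical samples, one per minute, growing by 12 kB each time.
samples = [2000 + 12 * i for i in range(10)]
slope = leak_slope_kb(samples)
```

In practice the samples would come from calling read_rss_kb on the rpcd process ID at a fixed interval. A flat series gives a slope near zero, while a steady leak shows up as a persistent positive slope.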

There is no guarantee, since this is complex enough that other factors could be at play, but I think high hopes are warranted that this upstream fix will make the symptoms we see go away.

Joe AE6XE
w6bi
Excellent!

Great news, Joe! Does this OpenWrt release still support the 32 MB models AREDN currently supports?

Orv W6BI

AE6XE
Short answer: yes. 32 MB devices on the ar71xx architecture are still supported in OpenWrt 19.07.

Long answer, for the future: 32 MB devices will phase out and at some point will no longer be supported. How long? We'll have to define the term "support". With AREDN and OpenWrt, unlike a commercial vendor, support means there is someone with the ability and interest to keep a given device model sustained and compatible with the stream of upgrades and bug fixes. The 19.07 release is still building images for the 32 MB RAM devices, and generally all these devices are considered 'supported'. AREDN has added models, e.g. LHG and LDF, that don't have compatible OpenWrt images but are very close to other models and images that are in OpenWrt.

Some models will be left behind in the future (2020+) major release of OpenWrt. The Linux kernel for all AREDN-compatible devices is being modernized to catch up with today's Linux kernel. As an example, how the hardware of a given device is defined within Linux moves from compiled C code to a data definition file called a "device tree". In OpenWrt terms, the architecture name is moving from "ar71xx" to "ath79". We'll have to consider renaming the AREDN GitHub repository, currently called "github.com/aredn/ar71xx", or starting fresh with a new one.

The ongoing discussion among OpenWrt developers includes a proposal to stop creating images for the 32 MB devices. That would mean images are not readily available to test. It would not prevent the images from building, though, and someone could still submit code to keep them 'supported'. There's also a hurdle: whether anyone has created or ported the definitions in the new ath79 architecture needed to build an image for each of the 32 MB devices compatible with AREDN today.

In a worst-case scenario, the 19.07 images for a device would get bug fixes for one to two years and still work on devices for many years beyond. AREDN stakeholders can decide whether or not to keep updating AREDN features on these devices on top of 19.07. The jury won't be in session until we get to that bridge...

Joe AE6XE
Ai6bx
Great News
Joe,

This is great news as I do have a Nano M2 XW that seems to fall prey to this. I look forward to the release candidate and testing this out. 

Orv, 

You do raise a good question as I do still have a few 32 MB nodes out there on local drops.

Keith AI6BX
AE6XE
OpenWrt has created the 19.07 release candidate: getting close
AA7AU
Size? Network interfaces??

Do you think we may have size problems affecting the already tight space in the AREDN implementations? I note: "Images for some device became too big to support a persistent overlay, causing such models to lose configuration after a reboot."

Is there anything in the new Openwrt release that better addresses some of the networking issues where DHCP on the LAN side seems to disappear after an install or even reboot even while the RF side continues to be available?

Thanks for all your continuing work on this!
- Don - AA7AU

AE6XE
OpenWrt is referring to devices with 4 MB of flash. All the AREDN images only run on devices with 8 MB of flash or more; we blew past that a while back :) .

See OpenWrt's statement setting expectations:

https://openwrt.org/supported_devices

The nightly build has a fix to address a scenario in which sysupgrade fails. There are still some issues with copying our images to Mikrotik devices that need to be looked at. Although there is still sufficient RAM, the node intermittently runs out of space on Mikrotik after the new firmware image is copied up; we need to review the limits for the various memory allocations.

Joe AE6XE
w6bi
Sysupgrades still possible?
Joe, this is in the release notes:

"Sysupgrade from ar71xx to ath79 and vice versa is not officially supported, a full manual reinstall is recommended to switch targets for devices supported by both ar71xx and ath79"

Will that impact our ability to do sysupgrades of AREDN software?

Orv W6BI
AE6XE
This is not an issue for the moment. There are two bridges we will cross:

Upcoming AREDN release based on OpenWrt 19.07: For the moment there aren't any devices in the AREDN release with the ath79 target, so this is not an issue; everything is still ar71xx. However, it becomes a consideration if an AREDN device is only supported in OpenWrt 19.07 under ath79. We would have to choose between continuing to support the device ourselves in AREDN, or moving to ath79 and benefiting from the device support that comes for free from OpenWrt. The extra sysupgrade problem to solve if we go to ath79 suggests we'd continue with ar71xx in AREDN, probably the lesser of the evils. The jury will be convening soon.

Future AREDN release on top of OpenWrt 20.xx (year.month notation): There are many groups like us that will be dealing with the same issue. With a little patience, someone upstream will likely fix the sysupgrade issues before the 20.xx release. In other words, we hope this will be a non-issue in the future. If not, when we get to that bridge, we'll figure out what to do.

Joe AE6XE
