
Performance issues with Mikrotik hAP on nightly 1307 ?

K6CCC
Is anyone else seeing serious performance issues on a Mikrotik hAP when running nightly builds 1300 or 1307?  I am.  I first noticed performance issues on my hAP with nightly 1300, but I think it is worse with nightly 1307.  Let me describe my hAP installation.

This is K6CCC-hAP-at-Home.local.mesh or 10.32.147.197 (primary address) or 10.4.158.41 (LAN address).  I live in an AREDN RF desert so my only connection to the rest of the mesh is via tunnel at this time. All five LAN ports on the hAP are in use as follows:
1)  Internet connection via switches and a router to my Spectrum cable internet.
2)  Connects to one of my routers by way of switches that allows LAN connection from my "normal" home LAN to the AREDN network.
3)  Connects to a Grandstream 1625 IP phone
4)  Connects to a Raspberry Pi-4 that mostly does nothing, but is occasionally used to access the AREDN network.
5)  DtD link to a dumb switch that also has another hAP and a Rocket M3.
Mesh RF is currently turned off, and the hAP is operating as a WiFi access point on 5 GHz, but there is nothing connected to it.
A Raspberry Pi-3 running MeshChat is connected to a switch in my data cabinet that gets to the hAP by way of port 2.

At this time, there is only one tunnel expected to be connected, and right now that is down due to a networking issue at the far end.  My live tunnel to the rest of the network is coming out of the other hAP because of the pre-1300 vs. post-1300 tunnel issue.

I am observing that I frequently cannot access the rest of the AREDN network from my desktop PC, and even accessing the hAP is at times VERY slow or non-responsive.  Yesterday the hAP would report that OLSR is not running when selecting "Mesh Status".  Manually restarting OLSR (from the Advanced Configuration page) did not change that.  I have observed CPU load on the Node Status page as high as 16.xx, although 3 to 5 is more common.

A power cycle or reboot restores service for a few hours and then it starts getting bad again.
 
K6CCC
LAN drawing

Here is a simplified extract of my LAN drawing as it applies to AREDN at this time.

K6CCC
Locked up...
About eight hours after a power cycle:
AE6XE
Jim, I'd suspect that the fixes to the ath10k 802.11ac 5 GHz driver may still be consuming too much RAM.  To isolate, disable the 5 GHz radio and leave it unused.  Does this make the symptoms go away?  OLSR is a big consumer of RAM and is at the top of the list if Linux has to start killing things to survive (an out-of-memory, or OOM, state).

If/When it gets sluggish, grab a support download file from the LAN.  This will tell if it is a RAM issue.
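
For a quick check along these lines, here is a minimal sketch in Python (meant to run on a LAN computer, not on the node) that reads text in the standard Linux /proc/meminfo format and flags low free RAM. The thread does not say what the support download contains, so treat this as an illustration only: the function names, the 8192 kB threshold, and the sample figures are all assumptions.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field name: value in kB}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def is_low_on_ram(info, threshold_kb=8192):
    """Return True if available memory is below threshold_kb.

    Prefers MemAvailable and falls back to MemFree when it is absent.
    The default threshold is an arbitrary illustrative value.
    """
    free_kb = info.get("MemAvailable", info.get("MemFree"))
    if free_kb is None:
        raise ValueError("no MemAvailable or MemFree field found")
    return free_kb < threshold_kb


# Made-up sample text, NOT taken from any node in this thread.
SAMPLE = """MemTotal:          60000 kB
MemFree:            5000 kB
Buffers:            1200 kB
"""

if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)
    print("free kB:", info["MemFree"], "low:", is_low_on_ram(info))
```

Comparing a capture taken right after a reboot with one taken while the node is sluggish is more telling than any single reading.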

Joe AE6XE
K6CCC
All RF killed for the test
Thanks Joe, I have now killed all RF from the node and rebooted.  We'll see what it does...
 
K6CCC
OK with all RF off.
Seems to be pretty good with both mesh and WiFi (both client and access point) turned off.
 
AJ6GZ
OK
I've had one running for 3 days now on 1307, with 5 GHz OFF and 2 GHz ON the live mesh. It's doing fine, although it's not really "doing" anything but sitting on the local mesh.

uptime: 2 days, 23:26
load average: 0.05, 0.04, 0.05
free space: flash = 8836 KB, /tmp = 29600 KB, memory = 39516 KB

I'm going to turn on 5 GHz and let it sit for a while.
AE6XE
If it gets sluggish or worse, try to capture a support download in this state -- should be able to see the smoking gun.

Joe AE6XE
AJ6GZ
5GHz

With 5 GHz on with no WiFi clients, and 2 GHz on the live mesh:

Hours Free
0:30 27584 KB
1:00 27708 KB
2:00 27904 KB
6:00 24372 KB
7:00 20980 KB
9:00 14664 KB
20:00 12728 KB
24:00 12508 KB
30:00 10816 KB
(unit crashed overnight)
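
To put a number on the trend in the table above, here is a rough sketch in Python that computes the overall rate of change in free memory and the steepest drop between consecutive readings. It assumes the "Hours" column is elapsed time since 5 GHz was turned on; the function names are made up for illustration.

```python
# (hours elapsed, free memory in KB), copied from the table above.
SAMPLES = [
    (0.5, 27584), (1.0, 27708), (2.0, 27904),
    (6.0, 24372), (7.0, 20980), (9.0, 14664),
    (20.0, 12728), (24.0, 12508), (30.0, 10816),
]


def overall_rate(samples):
    """Average change in free memory (KB per hour) from first to last sample."""
    (t0, m0), (t1, m1) = samples[0], samples[-1]
    return (m1 - m0) / (t1 - t0)


def steepest_drop(samples):
    """Return (start_hour, end_hour, KB per hour) for the fastest decline
    between any two consecutive samples."""
    best = None
    for (t0, m0), (t1, m1) in zip(samples, samples[1:]):
        rate = (m1 - m0) / (t1 - t0)
        if best is None or rate < best[2]:
            best = (t0, t1, rate)
    return best


if __name__ == "__main__":
    print("overall KB/hour:", round(overall_rate(SAMPLES), 1))
    print("steepest interval:", steepest_drop(SAMPLES))
```

For these readings the overall rate works out to roughly 568 KB lost per hour, with the fastest loss between the 6:00 and 7:00 readings (about 3.4 MB in one hour). Nine hand-recorded points are too few to say more than that free memory trends steadily downward.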

AE6XE
To be 100% confident, it would be good to turn 2 GHz mesh RF on but leave 5 GHz off.  Based on the results so far, however, it surely is related to the 5 GHz wireless driver still consuming too much RAM.  I'll work with the OpenWrt developers to discuss dialing back the memory allocation another notch.

Joe AE6XE
K6CCC
2 GHz is back on.
2 GHz mesh RF has been turned back on with one connection (6 feet away).  WiFi remains off.
AE6XE
fix in pipeline
I got to the bottom of this.  The patch OpenWrt applied to address this issue did not make it into 19.07.0, which we are using :).  The fix did make it into 19.07.1, now released.  I'm incorporating the 19.07.1 fixes into the AREDN nightly build, which should be available in the next few days.

Joe AE6XE
K6CCC
Glad you figured it out
Well, hot damn!  Glad you were able to find a cause.
 
