Reboot

We1btv
Reboot

I'm running V3.15.1 software on two of my Nano Loco M2s. One of the NLM2s is used for tunneling; the other connects from another room to the tunnel node over RF. The tunnel node runs for days without problems. It has 5 client connections and one server connection going.
When the other node connects to it, the tunnel node starts rebooting spontaneously, over and over. If I take the second node, connect it to the switch, and tunnel in to it through an external server, it keeps running without a problem.
I took an old 54GL with BBHN v3.1 and connected it, with the same result.

I reloaded the software several times, but it keeps rebooting when connected over RF.
I replaced the node with another NLM2 and... again the same: it reboots when connected by RF, not when tunneling in to it.

Can anyone tell me if, or what, I'm doing wrong?

Ruud
K5DLQ
Do you have any other software loaded, like MeshChat or HamChat?
 
We1btv

Hi Darryl.

Only the tunnel software is loaded on the node. 

Ruud
We1btv
No one? Just me? :-)
AE6XE
WE1BTV, I would have thought this was a hardware memory issue, but if you've swapped in another NLM2, that is very unlikely. Is there a way to download support dumps before it spontaneously reboots? We'd need to check resource consumption (memory, file system space) to see if we can isolate the issue.

If anyone else reading this is using an NLM2 for tunneling, please confirm A) your usage, and B) whether or not you can reproduce this. I can't think of why this would be unique to an NLM2, but things aren't always intuitive.

Triple-check the basics: swap the Cat 5 cables, confirm that power is sustained between 12 and 24 V, and see whether a different RF channel changes anything.

Joe AE6XE
We1btv
dump
Joe,

If you tell me how to do that, I'm more than willing to do so.
The problem is the timing. I'll try several times, as close to the reset as possible.

Ruud
AE6XE
Ruud, look for a button at the bottom of the Setup->Administration page.
We1btv
Joe.

I have tried to capture a few files as close as I could get to a spontaneous reboot. I have several files.
Attached is one of them. If you want more, just mail me and I'll send them to you. There are "a few", hi..
Before a reboot the load average goes way above normal; I have seen 7.02, 3.2, 0.68. The free memory also goes down, from around 2900 KB to less than 2100 KB. Wait 10 seconds and you can see the node reboot.

Ruud
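
The symptoms described in this post (load average climbing and free memory dropping shortly before a reboot) could be watched for automatically. The following is a minimal sketch in Python, assuming the standard Linux text formats of /proc/loadavg and /proc/meminfo. The function names, the thresholds, and the sample values are hypothetical and only loosely echo the figures above; how the text is fetched from the node is left out.

```python
def parse_loadavg(text):
    """Return the 1, 5 and 15 minute load averages from /proc/loadavg text."""
    fields = text.split()
    return float(fields[0]), float(fields[1]), float(fields[2])


def parse_meminfo(text):
    """Return a dict of /proc/meminfo values in kB, e.g. {'MemFree': 2100}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def near_reboot(loadavg_text, meminfo_text, max_load=5.0, min_free_kb=2500):
    """True when the 1 minute load is high or free memory is low.

    max_load and min_free_kb are hypothetical thresholds, not AREDN values.
    """
    load1, _, _ = parse_loadavg(loadavg_text)
    free_kb = parse_meminfo(meminfo_text).get("MemFree", 0)
    return load1 > max_load or free_kb < min_free_kb


# Illustrative samples in the kernel's text format (values are assumptions).
sample_load = "7.02 3.20 0.68 2/41 1234\n"
sample_mem = "MemTotal:        29000 kB\nMemFree:          2100 kB\n"
print(near_reboot(sample_load, sample_mem))
```

Keeping the parsing separate from the threshold check means the same functions could be reused on saved support dumps, if those contain the same /proc text.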
 
We1btv
Sorry, forgot the file..
Support File Attachments: 
We1btv
Joe, Conrad, Darryl.

 

I have changed the configuration over here.

  • First I switched off the server tunnels. The load average went down to under 0.50. No reboots, and memory stayed near 4000 KB.

  • Added 4 extra tunnels to see what would happen. The load went up marginally, with no reboots.

  • Added just 1 server tunnel and removed the 4 extra clients, and the reboot cycle started again.

  • Removed all but one client and one server tunnel, but the rebooting continued.

To make sure the node itself was not the problem, I swapped the NLM2s, but without success.

I then put the clients and the server on different NLM2s and linked the client and server nodes. They talk to each other, and there have been no reboots since. The load averages stay down: under 0.75 on the server node and 0.25 on the client node.
The memory stays near 3800 KB on both nodes.

For now the problem is solved by using two NLM2s, but it looks as if something is wrong with the tunneling software.

I know we want radio links, but tunneling is the only way of getting off my island over here. For that reason I (have to) use the tunnels, hi..

 

Ruud

AE6XE
Ruud, I'll be back on the grid Monday. I've been out camping for the week and am making my way back home. Joe AE6XE
We1btv
Reboot
Joe,

Did you ever find anything in the file I attached? The problem is far from over, and the reboots are split about 75%/25% between the server and client nodes.
It's not every 2-3 hours any more, but once a day to once every 3 days between reboots.
I still have some more files if you want or need them.

73, Ruud
AE6XE
Ruud, I've been out over the weekend and offline. I did look through the support dumps and nothing jumped out: no 'aha' moment. Let me take another pass and refresh my memory... However, since we are not seeing this widely, it looks to be a localized issue and not something in the firmware itself. Maybe it is a memory or other hardware failure on a given device? Is there any further hardware swapping on your end that might help isolate the problem, or has everything been tried at this point?
We1btv
Reboot

Hi Joe.

It seems I'm not the only one having this problem. Eddie, ZL2AQY, seems to have the same problem.
I'm getting more and more convinced it is a memory problem, like you suggested before.
This morning I sent him an e-mail and put you in the CC (at.arrl.org).

Over here the only thing both nodes do is tunnel and connect the locally used nodes over WiFi. There are NO services other than the tunneling software running on the nodes. One node carries the client connections and the other the server connections; they are connected to each other by WiFi, not DtD. The client node is the one rebooting most.

73, Ruud

ZL2AQY
Reboot


Hi Joe,

I have been doing some testing today with only MeshChat as a service and 2 tunnel clients.
All was OK until I tried to use the tunnel server.
As soon as I entered my DNS name, the node started to play up, locking up and finally rebooting.
This behavior continued until I removed my DNS name and left the tunnel server field empty.
I am running firmware version 3.16.1.0b02.
It looks like maybe it doesn't like both the server and the client being used at the same time.
73 Eddie

AE6XE
Thanks to ZL2AQY and WE1BTV, I was able to connect in and trap debug information. It is confirmed that the kernel is running out of memory and calling the Out-of-Memory Killer (OOM Killer). This then scores processes and picks one to kill :) (not completely random, but still devastating), which leads to a kernel reboot. We'll need to quantify the memory that a tunnel connection and other add-on programs consume on top of the core functionality to determine guidelines.

Note that a Rocket has 64 MB of RAM, 2x the maneuvering room of the other UBNT devices, which have only 32 MB. A typical Linux operating system has disk drives and the ability to shuttle data between memory and a disk drive (it just gets a bit slower), so it can continue operating with more space. Embedded devices that don't have disks do not have this luxury.

Joe AE6XE
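
One way to start quantifying the memory a tunnel connection consumes would be to record free memory at several tunnel counts and fit a straight line through the readings. The following is a minimal sketch in Python under stated assumptions: the linear model, the function names, the reserve figure, and the sample measurements are all hypothetical placeholders, not data from this thread.

```python
def fit_line(points):
    """Least-squares fit of y = intercept + slope * x over (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def max_tunnels(points, reserve_kb):
    """Largest tunnel count that keeps predicted free memory >= reserve_kb."""
    intercept, slope = fit_line(points)
    if slope >= 0:
        raise ValueError("free memory does not fall with tunnel count")
    # Solve intercept + slope * n >= reserve_kb for n, rounding down.
    return max(0, int((intercept - reserve_kb) // -slope))


# Hypothetical readings: (active tunnels, free memory in KB).
samples = [(0, 4000), (2, 3600), (4, 3200)]
intercept, slope = fit_line(samples)
print("estimated KB per tunnel:", -slope)
print("tunnels before reaching a 2500 KB reserve:", max_tunnels(samples, 2500))
```

Real readings would be noisy, and memory use may not be linear in the number of tunnels, so any guideline derived this way would need checking against observed behavior.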
We1btv
Thanks, Joe, for the effort put into finding the problem. Knowing where and when the problem shows up is half the solution to preventing it from happening.


Ruud WE1BTV
ZL2AQY
Thanks Joe

Thanks for working out that problem Joe.
For now I am running MeshChat on my Raspberry Pi (which also runs FreePBX), and I have had no more crashes.

73
Eddie ZL2AQY
 

AE6XE
Eddie, loved the ipCam view! Whenever I'm starting to feel a bit dried out (Southern CA is a desert), I'll be taking a look at the nice green New Zealand landscape :) .

Joe AE6XE
