3.18.9.0 Update fails

w8erd
3.18.9.0 Update fails
I tried to update my NanoStation XM from 3.16 to 3.18.9.0.
After I started the upload, it waited for several minutes and then disconnected, as usual.
Upon reconnecting, I saw that it was still running 3.16.
I used the NanoStation update for XM, the one that has the M in the name.

What have I done wrong?

Bob W8ERD
K6AH
Try shutting down any tunnels you may be running and reboot the node before attempting the sysupgrade.
 
w8erd
3.18.9.0 update failed
I have no tunnels. I tried rebooting the node. This time I got further with the update.
I get the message: The firmware is being updated. Do not remove power etc.
That message stays on the screen forever.

Bob W8ERD

 
kg6wxc
firmware update message
It will sit there forever sometimes. If it's been more than about 3.5 minutes, just try loading the status page of that node again.
w8erd
3.18 installation
I tried a different approach. I made a direct connection to the node I am trying to update. The previous connection was via wireless from another node in the same room, with 90 dB SNR. That always failed. This time the update succeeded, but with a strange side effect. MeshChat has disappeared, although it is still advertised as a service. How do we understand this?

Bob W8ERD
K6AH
The installation process likely dumped MeshChat to make room for the new firmware image.  Simply reinstall it.  Your database is still there, or it'll be rebuilt as soon as it connects again with the mesh.  BTW, the fact you hadn't purged the node of MeshChat is probably the root cause of the problems you were seeing initially.

The Service Advertisement is a DNS record & shortcut and is completely separate from the actual server software installation.

Andre
w8erd
3.18 installation
That all seems right.  I did the same thing on another node, with the same result. MeshChat is gone.

Since this is a known and reproducible situation, I suggest it be explained and included in the installation instructions.

Bob W8ERD
AE6XE
The firmware upgrade process does not reinstall the add-on packages. Before starting the firmware upgrade process, take inventory of these add-on packages (meshchat, iperf, tunnels, tcpdump, block known encryption, etc.). Then install these packages as the last step to complete the firmware upgrade, assuming you still want them.

The firmware upgrade process will attempt to preserve the config settings of these add-on packages. It errs on the side of preserving this information. You should not have to save and re-enter these settings. However, should you no longer want to install and run MeshChat, or any package for which service advertisements were created, you would have to manually remove these advertisements.

Joe AE6XE
W6GSW
Problem with 32 MB devices?
Yesterday I started updating nodes to 3.18.9.0.

The 64 MB devices (Rockets, MikroTiks, et al.) updated with no issues noted.

However, the 32 MB devices I attempted (NanoBridge, Bullet, AirRouter, etc.) exhibited issues as described in the original post.  Other than trying multiple devices I did not work further on the issue, instead moving on to other devices until available time expired.  ;-(

None of the 32 MB devices have the tunnel module installed, and all were restarted before starting the upgrade.  I can tftp these devices, but would prefer not to pull them down if there is a "trick" I am missing.


Gary
W6GSW

Los Angeles Emergency Communications Team
Pasadena-San Gabriel Valley Emcomm Mesh
 
K6AH
I'll defer to one of the developers to tackle this issue.

If you find you need to tftp, you should never need to "pull them down".  The trick is to always carry a UBNT PoE power pack.  Most all of them have a remote reset button on the back that can be used in the same manner you would use the reset button on the node itself.  Simply disconnect the bottom-end of the CAT5 cable and power it up with the PoE power pack.  When it boots, hold the reset for at least 30 seconds (more than is required... because you likely won't be able to see the LED indicators).  It doesn't save you a trip to the site, but at least it avoids climbing the tower and taking the node down.

Andre, K6AH
 
W6GSW
Thanks Andre.

That's a great reminder about the Ubiquiti PoE adapter, though we don't use them at the primary site.  We don't have AC power up top, just our 12 VDC nominal power.

I could string a long extension cord.  ;-)

Though in our case that might be a bit more work than "taking them down".  We need to climb a bit to access the area "up top" where the nodes are located, but then it's easy. Nothing compared to a tower climb.
 
K6AH
...or perhaps carry a small 12 VDC powered AC inverter.  But to your point, agreed, there's always going to be a trade-off between cost and effort.

Andre
kg6wxc
Try it again
@w8erd & @W6GSW, I have upgraded many, many 32MB Nanostations over the last few weeks and have had the same thing happen on several occasions. I just kept trying and eventually they would "take" the upgrade. One "trick" I have learned is to look at the available memory on the main status page; if about 6MB is free, you *should* have better luck. Another trick is to isolate the node(s) in question from the rest (or most) of the network if you can, which shrinks the OLSR table(s) and frees up some memory for the upgrade. Yet another trick is to go into the 3.16.xx nodes and remove any packages you can that pertain to IPv6. There are a couple of other "tricks" I have too, but they are more complex than the above and I won't post them here, sorry. :)
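
The free-memory rule of thumb above can be written down as a tiny check. This is only a sketch in Python: the 6 MB threshold is the figure quoted in the post, not an official AREDN limit, and the function name and sample values are made up for illustration.

```python
# Rule of thumb from the post above (about 6 MB free); not an official AREDN limit.
RULE_OF_THUMB_FREE_KB = 6 * 1024


def likely_enough_memory(free_memory_kb: int,
                         threshold_kb: int = RULE_OF_THUMB_FREE_KB) -> bool:
    """Return True if the reported free memory meets the rule-of-thumb threshold."""
    return free_memory_kb >= threshold_kb


if __name__ == "__main__":
    # Hypothetical values in KB, as they might be read off a node's status page.
    for free_kb in (3428, 6500):
        verdict = "worth trying" if likely_enough_memory(free_kb) else "free up memory first"
        print(f"{free_kb} KB free: {verdict}")
```

Passing the check is no guarantee the upgrade will take. It only says the node is in the range where the poster had better luck.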
w8erd
3.18 failed
I rebooted and tried again.  This time it just went back to the browser after a while, as it did the first time.
The memory says flash  2284   /tmp 14280  memory 3428

The only obviously IPv6 items are three files called ipv6 tables. Should I delete those?
MeshChat is installed.

Bob W8ERD

 
kg6wxc
meshchat
I would try removing meshchat first, but I think even removing it will leave the messages file behind (which is probably what is taking up all the space). I'll see if I can find info on the leftover files from a meshchat install...
A quick search reveals this: https://www.arednmesh.org/comment/6271#comment-6271
Basically, ssh into the node and delete the file: /tmp/meshchat/messages.MeshChat
When/if you reinstall MeshChat, it should sync its messages back up with the rest of them.
Packages safe to remove are: odhcp6c, odhcpd-ipv6only (if present), iptables6, libip6tc (if present).
If you can, you can also remove libpcap and tcpdump, and anything to do with USB (IIRC).
(There might be a couple of others on a 3.16 node that I am forgetting about.)
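
For anyone who wants the cleanup steps above as a checklist, here is a minimal Python sketch that only builds the command strings to paste into an ssh session; it does not run anything. It assumes the node uses the standard OpenWrt `opkg` package manager. The package names are copied as written in the post, so confirm what is actually installed (for example with `opkg list-installed`) before removing anything. The function and constant names are hypothetical.

```python
from typing import List

# Names copied from the post above; presence varies by node and firmware version.
IPV6_PACKAGES = ["odhcp6c", "odhcpd-ipv6only", "iptables6", "libip6tc"]
# tcpdump is listed before libpcap because it depends on it.
OPTIONAL_PACKAGES = ["tcpdump", "libpcap"]
MESHCHAT_MESSAGES = "/tmp/meshchat/messages.MeshChat"


def cleanup_commands(include_optional: bool = False) -> List[str]:
    """Build (but do not execute) shell commands to free space before an upgrade."""
    commands = [f"rm -f {MESHCHAT_MESSAGES}"]
    packages = IPV6_PACKAGES + (OPTIONAL_PACKAGES if include_optional else [])
    commands += [f"opkg remove {name}" for name in packages]
    return commands


if __name__ == "__main__":
    for command in cleanup_commands(include_optional=True):
        print(command)
```

Printing the commands rather than executing them keeps the sketch harmless: each line can be reviewed and skipped if the package is not present on a given node.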
AE6XE
I suspect, and let's try to confirm, that the 3.16.x.x nodes which do not reliably complete the upgrade process have meshchat or other add-on packages installed. For those of you who are seeing upgrades not complete, is this a common theme?

The image sizes of 3.18.9.0 are a little larger than 3.16.x.x, which means a little more temp RAM space is needed in 3.16.x.x to upload these images. (/tmp consumes RAM space when files are added or an upgrade firmware is uploaded.) meshchat will consume RAM to cache all the messages. It's not really an issue of how many packages we have installed; those consume flash memory, which is a different kind of memory. What matters is which programs are running and which temporary files have been created. Both the running programs and the temporary files consume RAM space.
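
To make the RAM point concrete, here is a rough Python sketch that reads the `MemFree` line from `/proc/meminfo`-style text (for example, pasted from `cat /proc/meminfo` on the node) and compares it with the size of the image to be uploaded. The image size and the 1 MB headroom are placeholders, not measured values, and free RAM is only an approximation of what an upload to /tmp can actually use.

```python
import re


def mem_free_kb(meminfo_text: str) -> int:
    """Return the MemFree value (in kB) from /proc/meminfo-style text."""
    match = re.search(r"^MemFree:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemFree line not found")
    return int(match.group(1))


def upload_fits(free_kb: int, image_kb: int, headroom_kb: int = 1024) -> bool:
    """True if the image plus some headroom fits in free RAM (headroom is a guess)."""
    return free_kb >= image_kb + headroom_kb


if __name__ == "__main__":
    # Hypothetical sample text and image size, for illustration only.
    sample = "MemTotal:       28532 kB\nMemFree:         3428 kB\n"
    free_kb = mem_free_kb(sample)
    print(free_kb, upload_fits(free_kb, image_kb=5000))
```

Flash usage does not appear anywhere in this check. That matches the explanation above: installed packages live in flash, while the uploaded image and running programs compete for RAM.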
wa2ise
I tried to upgrade a loco M900 with no tunnel server or client software installed, and nothing else extra either. Free space:
flash = 1584 KB
/tmp = 14288 KB
memory = 3964 KB

It seemed to be upgrading, but it came back with 3.16.1.1 as it was before. 

sysinfo says:
 node: WA2ISE9OO
model: Ubiquiti Bullet M

eth0   00:15:6D:9D:19:13
eth0.1 00:15:6D:9D:19:13
eth0.2 00:15:6D:9D:19:13
wlan0  00:15:6D:9C:19:13
wlan0-1 00-15-6D-9C-19-13-00-44-00-00-00-00-00-00-00-00

/proc/cpuinfo
system type		: Atheros AR7240 rev 2
machine			: Ubiquiti Bullet M
processor		: 0
cpu model		: MIPS 24Kc V7.4
BogoMIPS		: 259.27
wait instruction	: yes
microsecond timers	: yes
tlb_entries		: 16
extra interrupt vector	: yes
hardware watchpoint	: yes, count: 4, address/irw mask: [0x0000, 0x0ff8, 0x0ff8, 0x0ff8]
isa			: mips1 mips2 mips32r1 mips32r2
ASEs implemented	: mips16
shadow register sets	: 1
kscratch registers	: 0
core			: 0
VCED exceptions		: not available
VCEI exceptions		: not available


nvram
hsmmmesh.settings=settings
hsmmmesh.settings.wifimac=00:15:6d:9c:19:13
hsmmmesh.settings.mac2=156.25.19
hsmmmesh.settings.dtdmac=157.25.19
hsmmmesh.settings.config=mesh
hsmmmesh.settings.node=WA2ISE9OO
W2TTT
Download the sysupgrade for the M9 Nanoloco and then upload it
Bob, first get the download of 3.16.1.1 from the node's Admin screen. This will default the login and password to root/hsmm. Once it comes up, change the password, save it, and reboot. Then download the sysupgrade version for the M9 Nanoloco to your PC from the AREDNMESH.ORG web site and upload it to the node from the Admin panel. I had to do this for nodes running the defunct 3.17 (rc) code and everything worked fine.

73, Gordon Beattie W2TTT
wa2ise
Got it to work.  Seems that I had to access the node via the WAN address.  Going thru the LAN didn't seem to work.  Upgraded all my nodes.
kg6wxc
problem nodes
I can say that it has happened to me on nodes without meshchat, tunnels or anything other than "stock".
Mostly 32MB devices only though.
It's not just a network issue either...
I can understand some of ours up in Santa Barbara on very poor links failing, but even over time I have managed to upgrade those as well.
IIRC, though, I have even had it happen a time or two on my local nodes connected via wire.

I just tell everyone to keep trying: remove everything that isn't needed, reboot, and try again. It's a pain, but it works!
That's what we get sometimes for pushing the limits of WiFi. :)
w6mrr
Same problem using serial console
I have an M2 XW Rocket that I bricked doing the 3.18.9 upgrade from 3.16.x.  I have tried using the serial console to start the TFTP load of the .bin file with TFTP2.  I have tried downgrading to 5.5 with the same results.
 

73 - Martin
w6mrr
Never mind - was unable to unbrick using XM .bin file.  Label says FCCID: SWX-M2 so thought it was XW.

--Martin
W2TTT
Upgrading to 3.18.9 is smooth
Hi Folks!
So far, I've updated about a dozen nodes of various types with and without tunnels, both locally and remotely over multiple hops of various quality, and it works, and works well!  I even had the challenge of rolling back through the 3.16.1.1 release from the now withdrawn 3.17 (rc) code, and it defaulted to the original system password of hsmm, and that was about as challenging as it could be.

Great job team AREDN!

A day later, everything is stable and running well!

Vy 73,
Gordon Beattie, W2TTT
201.314.6964

 
K5DLQ
thanks Gordon!   good to hear from you!
k1ky
Rolling back challenges
Rolling back to 3.16.1.1 creates a challenge because the configuration data sets aren't very backward compatible.  That's why it starts out fresh.  Always consider moving "forward" when upgrading.
N7JYS
3.18.9.0 Update fails
Yes, I have confirmed this issue on the NS5M, as I have tried several times to upgrade from 3.16.2.0, which is stable.
Just for kicks and giggles I also tried a factory version using TFTP; Pumpkin gives a "TFTP:2 Firmware check failed". So I reloaded the factory 3.16.2.0 and all is back to normal operation.

Eric 

N7JYS
 
AE6XE
Eric, can you post the full name of the firmware image .bin file attempted?   Also, attach a support download, assuming it is live on 3.16.2.0.  A firmware check error from the tftp process could be explained by a wrong or corrupted image.

Joe AE6XE
N7JYS
3.18.9.0 Update fails
Joe, it turns out I used the XM version instead of the XW version for the NanoStation. I got version 3.18.9.0 to load via tftp; however, running Windows 7, the new browser interface is having some issues coming up. Very, very, very slow, unlike 3.16.2.0.

Eric

N7JYS 
N7JYS
3.18.9.0 Update fails
After doing a firmware update from 3.16.2.0 to 3.18.9.0 on an XW NanoStation, I find the browser is extremely slow to respond using Chrome in Windows 7.
It functions the same using Explorer. Unfortunately I will be downgrading back to 3.16.2.0. (:

Eric

N7JYS
 
N7JYS
3.18.9.0 Update fails
Time to go back to bed and start over! Problem solved! They don't call us amateurs for no reason! :)

N7JYS
AE6XE
I'm guessing a Rocket image was installed on the Nanostation M5 XW?   I did this not too long ago and saw the same symptoms.   There's an enhancement (maybe considered a defect) request submitted on this issue.

Joe AE6XE
N7JYS
3.18.9.0 Update fails
BINGO!


Eric 
N7JYS
W8XG
Upgrade

We also had an NSM2-XM that would not upgrade. It had two web links and MeshChat installed. We deleted everything and flashed it just fine. The links are back up, but we left out MeshChat.

Seems we need to work on the Pi version for that.

W6GSW
Upgrade report

Thanks to all for the recommendations.  (Andre, good suggestion about the small inverter, I had not thought of doing that!)

Saturday I completed updating 25+ nodes to 3.18.9.0.  None of the devices with 64 MB memory exhibited any issue.

Several of the 32 MB memory devices required multiple upgrade attempts, but most were successful after a couple tries.  Two I ultimately upgraded using tftp because it was easier.

I suspect the issue may be related to the size of the Southern California mesh (750+ nodes per Eric's MeshMap).  Before Saturday I was performing upgrades over the mesh.  While I was rebooting nodes before starting the upgrade, sometimes I would be waiting for the node status screen to return.  I speculate that the OLSR data could be using enough memory to prevent the upgrade if I wasn't able to gain control of the node soon enough (or I may be wrong).

Visiting sites Saturday I was able to isolate nodes from the active mesh.  Not conclusive of course, but that seemed to eliminate the issue.

Thanks to the AREDN team for the upgrade!
 

Ka3BQS
reconnect to node
Can I reconnect to my node while it's up on the roof? It's an M2.
K6CCC
Kind of a vague question in a three-year-old thread.  If you have a cable to it or RF connectivity, you can connect to it.
 
