hAP ac lite update fails with "Bad Gateway"

K2QA
hAP ac lite update fails with "Bad Gateway"

Hello,

Trying to flash a hAP ac lite (RB952Ui-5ac2nD-US) with AREDN firmware, following the Windows 10 instruction video from KK6RAY.

I can TFTP boot from the .elf file with Tiny PXE.

When I try to update using setup/administration to upload the .bin file, Wireshark shows lots of packets from Windows to the hAP, then the hAP returns "Bad Gateway" and resets the TCP connection.

I have tried:
aredn-3.18.9.0-mikrotik-rb-nor-flash-16M-ac-sysupgrade.bin
aredn-3.19.3.0-mikrotik-rb-nor-flash-16M-ac-sysupgrade.bin

I've tried many times.

Any suggestions?

Thanks

K5DLQ

Unplug your cat5 from your computer and close your browser, then try again.

K2QA
Firmware File Not Valid

Multiple attempts, always one of two outcomes; hAP always resets the connection.

1. Bad gateway message from hAP
2. Message on Administration screen: Firmware CANNOT be updated, firmware file is not valid, Failed to restart all services, please reboot this node., current version: 3.19.3.0, hardware type: mikrotik (rb-952ui-5ac2nd)

It's like the upload process crashes.

I have Wireshark and supportdata if they will be useful. (hAP, PC w/Wireshark on 192.168.1.100, PC w/TinyPXE on 192.168.1.10, all plugged into dumb switch)

I can try the Linux process, but it would be useful to find out why Windows update fails.

AE6XE

Make sure you are loading the 3.19.3.0 bin file when the node has been booted with the 3.19.3.0 elf file. Don't mix releases; the behavior would be unknown.

I've seen this symptom periodically. We need to be able to reproduce the problem reliably to fix it. If you have some command-line familiarity, you might proceed this way: scp the bin file to /tmp on the node, then telnet or ssh into the node and run the command below. You should then see any errors that might cause the Bad Gateway message.

sysupgrade -n /tmp/<image name>.bin

Joe AE6XE
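
For anyone following along, here is a minimal sketch of the command-line route described above. It assumes the node is still reachable at 192.168.1.1 after the TFTP boot and that its SSH server listens on port 2222; both are assumptions, so adjust them to your setup. The helper name `print_upgrade_cmds` is made up for illustration. It only prints the two commands so they can be reviewed and then run by hand.

```shell
#!/bin/sh
# Print (do not run) the commands for a manual sysupgrade.
# Assumptions: node address 192.168.1.1 and SSH/SCP on port 2222.
# usage: print_upgrade_cmds <image.bin> [node] [port]
print_upgrade_cmds() {
    img="$1"                      # local path to the sysupgrade .bin
    node="${2:-192.168.1.1}"      # assumed node address
    port="${3:-2222}"             # assumed SSH port
    name=$(basename "$img")
    # Step 1: copy the image into /tmp on the node
    echo "scp -P $port $img root@$node:/tmp/"
    # Step 2: flash it without keeping settings (-n), as in the post above
    echo "ssh -p $port root@$node sysupgrade -n /tmp/$name"
}

print_upgrade_cmds aredn-3.19.3.0-mikrotik-rb-nor-flash-16M-ac-sysupgrade.bin
```

Printing first, rather than executing, leaves room to double-check that the .bin matches the release of the .elf that was booted.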

K2QA
sysupgrade worked

Joe,
I've been traveling and just getting back to this.
Thanks for the info. It took me a while to find the ssh port to use, but I was able to scp the 3.19.3.0 bin file and run sysupgrade.
As soon as sysupgrade started, the connection was closed, so I couldn't watch for any messages.
Is there a secret to keeping the connection open until reboot?
Are there any log files you want to look at?

John K2QA
 

kc5hwb

Are you moving the cat5 from the internet port to one of the LAN ports when attempting to load the 'sysupgrade' firmware?

AE6XE

The only way to see further details is to connect to the serial console port. Losing the connection is expected behavior (and it rules out some issues, since you didn't see error messages). I need more details on what happened: what cables were unplugged, and when. I have found it best not to do anything for ~4 minutes after typing the sysupgrade command. Then unplug the network cable from port 1, wait 15 seconds or so, plug it into port 2, and make sure the laptop acquires a new IP address via DHCP.

Joe AE6XE 
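
The wait-then-check routine above could be sketched as a small retry helper. This is only an illustration: `wait_for` is a hypothetical name, the 192.168.1.1 address in the example is an assumption, and moving the cable and renewing the DHCP lease still have to be done by hand.

```shell
#!/bin/sh
# Retry a probe command until it succeeds or the attempts run out.
# usage: wait_for <attempts> <delay_seconds> <command> [args...]
wait_for() {
    attempts="$1"; delay="$2"; shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Run the probe quietly; success means the node answered
        if "$@" >/dev/null 2>&1; then
            echo "up after $i attempt(s)"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "no response after $attempts attempt(s)"
    return 1
}

# Example (assumed address): leave the node alone for ~4 minutes, then poll.
# sleep 240 && wait_for 30 5 ping -c 1 192.168.1.1
```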

k1ky
Initial load sysupgrade issues - "new" units??

I (we) too are experiencing similar behavior with several "new" hAP ac lite models. I noticed that the packaging box is a little bigger on the new units as well. The model number appears to be the same.

Where do we go to see the boardid version on factory MikroTik units? I intend to spend more "productive" time with this either tomorrow or over the weekend. I had been successful loading all versions of AREDN firmware on these hAP ac lite units... until these past few weeks. It "appears" that the upgrade fails or restarts anywhere from 4% up to around 96% during the load (using Chrome). It still fails with other browsers as well. I have tried old and new nightlies as well as production versions. Just a "few" stubborn nodes so far, but it's starting to sound and look like a trend.

KB9OIV

I have recently flashed two of these devices.  

I did get the 'Bad Gateway' message on both units when I tried newer initial firmware.

I was able to flash 3.19.3.0 firmware successfully, however.

I did not try to reproduce the 'Bad Gateway' message to see if it was repeatable, by starting over.  I have not tried any newer permanent firmware.

K2QA
Bad Gateway on another hAP AC Lite

Joe,

Since I was able to flash my first unit using scp, I decided to buy another one to use for debugging.
I get the same Bad Gateway error when doing upgrade via browser on the node after tftp booting with 3.19.3.0.

When I hit the upload button, I get the 'Bad Gateway' message and ssh session also disconnects.

logread isn't very useful, since the SSH session terminated.
-----------------------------------------------
  3.19.3.0, r7676-cddd7b4c77
    root@NOCALL:~# logread -f
      Fri Mar 22 19:29:00 2019 cron.info crond[1022]: USER root pid 3150 cmd /usr/local/bin/clean_zombie.sh
      Fri Mar 22 19:29:29 2019 kern.info kernel: [  217.099458] sh (3203): drop_caches: 3
      Failed to find log object: Not found 

I have also attached Wireshark conversation extract which might also give some insight. Entire trace is 6MB, so it uploads most of the .bin file.

What other tools are available to help diagnose?

Thanks,

John K2QA

AE6XE
Support data?

Can you click on the support data download link at the bottom of the Administration page? This will have the info I need. I've been traveling the last 2 weeks; I think someone sent this data on the suspected new hardware earlier. I'm getting home today and should have time to look at it this week, I think.

K2QA
Joe,

After 'Bad Gateway' I went back to 192.168.1.1 - web server still worked, so I collected support data.
I bought this second hAP just to diagnose this problem, so let me know what else I can do to help.

P.S. I checked the md5sum of the upload file and it is good, but the Wireshark trace suggests that there is an issue with the data.
....
OpenWrt kernel loader for AR7XXX/AR9XXX
..Copyright (C) 2011 Gabor Juhos <juhosg@openwrt.org>
....Incorrect LZMA stream properties!
..
System halted!
....Decompressing kernel... ....done!
..failed, ....data error!
....Starting kernel at %08x...

....m....L.;.......o......L9.i$.zn.<.}N.qB.S\.`$S6.....6...:....H.%...4.
...


John K2QA
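
The md5sum check mentioned above could be scripted roughly like this. It is a sketch that assumes the `md5sum` utility is available and that the expected hash comes from wherever the firmware was downloaded; `check_md5` is a made-up helper name.

```shell
#!/bin/sh
# Compare a file's MD5 hash with an expected value.
# usage: check_md5 <file> <expected_md5_hex>
check_md5() {
    file="$1"; expected="$2"
    # md5sum prints "<hash>  <filename>"; keep only the hash
    actual=$(md5sum "$file" | awk '{print $1}')
    if [ "$actual" = "$expected" ]; then
        echo "OK $file"
    else
        echo "MISMATCH $file (got $actual)"
        return 1
    fi
}

# Example with placeholder values (not a real hash):
# check_md5 firmware.bin 0123456789abcdef0123456789abcdef
```

Running the same check against the copy in /tmp on the node after an scp, if md5sum is present there, would show whether the file arrived intact. That only applies to the scp route, not to the browser upload.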

AA7AU

I ran into something similar when I was trying to un-brick my CPE220. If memory serves, I had used tftp to upload, which seemed successful, but I couldn't get past the upload phase. I then tried 192.168.1.1:8080 and found my CPE220 responded with my nodename and all my prior settings (including where I had turned off RF), even though it wouldn't respond to "localnode:8080". It's possible that I uploaded the upgrade image rather than the factory image (though I thought I had it right). I finally had to tftp upload the correct image all over again to get everything to reset, thankfully; the node had gone totally off into never-never land after I tried to continue with that first cycle.

This experience raises a number of questions in my mind about data cleanup during firmware upload, etc and/or how incredibly creative pilot error can be ...

But the reason I post now is that I did see that same unexpected presence at 192.168.1.1 during my flailing about.

Just thought I'd throw that in for whatever it's worth,
- Don - AA7AU

AE6XE
out of RAM

The system log is showing the device is running out of RAM and the kernel is going into Out-of-Memory (OOM) mode, killing processes to survive.  We'll need to reproduce and review the options for Mikrotik devices.   I'd expect all Mikrotik models to have the same issue with 64MB RAM.  The kernel ranks processes to decide which one to kill.
 

[  451.464626] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
[  451.599359] [ 3764]     0  3764     1229     1057       5       0        0             0 setup
[  451.634765] Out of memory: Kill process 3764 (setup) score 67 or sacrifice child
[  451.642432] Killed process 3764 (setup) total-vm:4916kB, anon-rss:1476kB, file-rss:2752kB, shmem-rss:0kB

 

[  428.586694] [ 3667]     0  3667     1128      956       4       0        0             0 admin
[  428.613178] Out of memory: Kill process 3667 (admin) score 61 or sacrifice child
[  428.620847] Killed process 3667 (admin) total-vm:4512kB, anon-rss:1080kB, file-rss:2744kB, shmem-rss:0kB


...and a bunch more.
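
To pull the killed process names out of a longer log, a filter along these lines may help. It is a sketch keyed to the "Killed process" line format shown above; `oom_victims` is a hypothetical helper name, and feeding it from `dmesg` on the node is an assumption.

```shell
#!/bin/sh
# List the names of processes terminated by the kernel OOM killer.
# Reads kernel log text on stdin and matches lines such as:
#   [  451.642432] Killed process 3764 (setup) total-vm:4916kB, ...
oom_victims() {
    # Print only the name captured between the parentheses
    sed -n 's/.*Killed process [0-9][0-9]* (\([^)]*\)).*/\1/p'
}

# On a node this might be fed from the kernel log (assumption):
# dmesg | oom_victims | sort | uniq -c
```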

As a workaround for new devices, load 3.19.3.0 using both the .elf and .bin files from this release. Once AREDN is loaded to flash and booting, go to the admin page and upload the nightly build .bin file (a 'sysupgrade'). Only the combination of the nightly build .elf and .bin should trigger this failure. This assumes the root cause really is memory in the nightly build and not something else. Can you please confirm this works?

We are consuming more flash and RAM to replace the UI. We know we'll push the limits with 2 languages supported right now. At some point, we can retire perl, a large consumer of flash and RAM that the current UI depends on. This will free up a lot of RAM/flash and, fingers crossed, give more memory headroom on 32MB RAM devices to extend their life span.

Joe AE6XE

K2QA
MikroTik hAP Testing

Joe,

Thanks.
I'm not sure exactly what you want me to do.

I was able to flash the first unit by tftp booting 3.19.3.0.elf, using scp to copy 3.19.3.0.bin, and then running sysupgrade from terminal, so I know that works.

So I bought the second unit to see if I could reproduce and debug the problem.

After tftp boot with aredn-3.19.3.0-mikrotik-vmlinux-initramfs.elf, it is the browser admin/upload of aredn-3.19.3.0-mikrotik-rb-nor-flash-16M-ac-sysupgrade.bin that is failing.

Are you saying to try to admin/upload the nightly build .bin file instead?

I can also send you the unit if you want it for testing.

John K2QA

 

AE6XE

John,  

Working around 2 issues here:

Issue 1 - installing the nightly build
Yes. As step 1, fully install 3.19.3.0 (using the 3.19.3.0 .elf and .bin) so that you have a working AREDN Mikrotik device. Then as step 2, if you want the nightly build firmware, upload the nightly build .bin from the admin page.

Issue 2 - booted with .elf and AREDN UI upload of .bin file returns 'bad gateway'
For now, the command line 'sysupgrade' workaround can be used. Further investigation is needed.

Joe      

K2QA
hAP Nightly Build Upload Successful

Joe,

I successfully updated first unit using admin/upload from 3.19.3.0 to nightly build aredn-853-94816c4-mikrotik-rb-nor-flash-16M-ac-sysupgrade.bin.

firmware version 853-94816c4
configuration       mesh
free space
flash = 9144 KB
/tmp = 30260 KB
memory = 35120 KB

John
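
For comparing numbers like these before an upload, free RAM can also be read from a shell on the node. The sketch below assumes the figure of interest is the `MemFree` line of /proc/meminfo (that this matches the "memory" value on the status page is an assumption), and the 4096 KB margin is an arbitrary illustrative number. Both helper names are made up.

```shell
#!/bin/sh
# Print the MemFree value (in kB) from a meminfo-style file.
# usage: mem_free_kb [file]   (defaults to /proc/meminfo)
mem_free_kb() {
    awk '/^MemFree:/ {print $2}' "${1:-/proc/meminfo}"
}

# Succeed only if free memory exceeds the image size plus a margin.
# usage: enough_ram <image_kb> <free_kb> [margin_kb]
enough_ram() {
    # The default margin of 4096 kB is illustrative, not a measured value
    [ "$2" -gt $(( $1 + ${3:-4096} )) ]
}

# Example (hypothetical 8000 kB image):
# enough_ram 8000 "$(mem_free_kb)" && echo "probably enough RAM"
```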

W9IKU
No luck following suggestions in forum on install

Has this been resolved?  I am trying to get the hAP installed. I can get the elf loaded. However, through the GUI and Telnet I stall out.  Using the command line, I get the "killed" response.

Please help

Greg
W9IKU
greg@w9iku.net

K2QA

I ended up using a friend's computer, and the sysupgrade to 3.19.3.0 worked fine on two hAP units.
I do not know the root cause of why it failed on my computer.
The 3.19.3.0 .elf always loaded. I tried different browsers, etc., but sysupgrade always failed.
It worked the first time on my friend's computer.

 

AA7AU
Which computers?

Please post the model and O/S type and level for each of the two that you tried. Maybe there's some pattern here someplace. I struggled hard a few months back for my three installs on the hAP units.

TIA,
- Don - AA7AU

KC0EUW
Web interface vs command line sysupgrade install

One method that has always worked for me is described here:
  https://arednmesh.readthedocs.io/en/latest/arednGettingStarted/installin...
Installing the sysupgrade image via command line seems to work when the web interface doesn't.
