Second attempt with nanostation loco M5, no Ethernet problem

AA6E

Results with "clean" install on second brand new NanoStation loco M5 (XW):

 

1. Replace AirOS 6.06 with 5.6 via TFTP

2. Replace AirOS 5.6 with 5.5 via AirOS update

3. Replace AirOS 5.5 with AREDN-develop-176-8c5be0cf-ubnt-loco-m-xw-squashfs-factory.bin

==First, with AirOS upgrade ---> leads to dead Ethernet problem (no response to DHCP discovery)

==Then repeat with TFTP install ---> dead Ethernet again

 

I do get WiFi access through the "MeshNode" SSID.

But when I Save Changes, I get this error: "parameter 'DTDMAC' in file '_setup.default' does not exist". My SYSINFO dump gives the following nvram data:
 

nvram

hsmmmesh.settings=settings
hsmmmesh.settings.wifimac='78:8a:20:aa:0e:29'
hsmmmesh.settings.mac2='170.14.41'
hsmmmesh.settings.node='AA6E-LocoM5-001'
hsmmmesh.settings.config='mesh'

And in fact there is no 

hsmmmesh.settings.dtdmac=... 

line as there should be if I compare with some other sysinfo dumps on the web.
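The comparison described above (looking for a key that other sysinfo dumps contain) can be sketched in a few lines of Python. This is only an illustration: `EXPECTED_KEYS` is an assumed list built from the keys visible in this dump plus the `dtdmac` key named in the error message, not an authoritative list of what AREDN requires.

```python
# Sketch: report which expected keys are absent from a SYSINFO nvram dump.
# EXPECTED_KEYS is an assumption for illustration: the four keys seen in
# the dump above plus "dtdmac", which the error message complains about.
EXPECTED_KEYS = {"wifimac", "mac2", "node", "config", "dtdmac"}

PREFIX = "hsmmmesh.settings."


def parse_nvram(dump: str) -> dict:
    """Return {key: value} for lines like hsmmmesh.settings.key='value'."""
    settings = {}
    for line in dump.splitlines():
        line = line.strip()
        if not line.startswith(PREFIX) or "=" not in line:
            continue  # skips blanks and the "hsmmmesh.settings=settings" header
        key, _, value = line[len(PREFIX):].partition("=")
        settings[key] = value.strip("'")
    return settings


def missing_keys(dump: str) -> list:
    """Sorted list of expected keys that the dump does not contain."""
    return sorted(EXPECTED_KEYS - set(parse_nvram(dump)))


DUMP = """\
hsmmmesh.settings=settings
hsmmmesh.settings.wifimac='78:8a:20:aa:0e:29'
hsmmmesh.settings.mac2='170.14.41'
hsmmmesh.settings.node='AA6E-LocoM5-001'
hsmmmesh.settings.config='mesh'
"""

print(missing_keys(DUMP))
```

Run against the dump quoted above, the only key reported missing is `dtdmac`, which matches the error text.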

 

So, to my feeble mind, it appears that the bin file is defective for this hardware. I get the same results with two devices. Or is there something else I can try?

 

Unfortunately, I don't have any other nodes here to work against. I'm trying to bring up a new 2-node mesh. Ideas?

 

73 Martin AA6E

AE6XE
Martin,

Hopefully I can keep up with all the details and history; I've been out traveling for work this past week. I suspect the symptoms above (from save settings) occur because the ubnt-nano-m-xw image is live on this loco M5 XW. If so, access via the 'meshnode' wifi AP should only be used to upload the "ubnt-loco-m-xw" sysupgrade image, with the "keep settings" box unchecked so the node comes up clean in firstboot.

Joe AE6XE
AA6E
Some progress, but still needing help...

Joe et al.

I did the on-line update (via meshnode wifi) to AREDN-3.16.1.1-ubnt-nano-m-xw-squashfs-sysupgrade.bin (with keep settings unchecked) and then initialized the node name, password, and max. distance fields.  I then rebooted.

However, I am still getting no LAN connection.  (No DHCP service or any other activity according to wireshark.)  And I still see the 'meshnode' wifi service.

Another (older XM type?) loco M5 is apparently working correctly.  I do get the expected LAN service with it.  And it does NOT offer 'meshnode' wifi.

Is my suspicion correct that the firmware will put up 'meshnode' iff it can't address the Ethernet hardware?  

Should I possibly be using the 'nightly' build firmware?  

Thanks for further guidance.  Has ANYONE reported success with the XW?

Martin AA6E

AA6E
one more test...

Same result using AREDN-develop-176-8c5be0cf-ubnt-loco-m-xw-squashfs-sysupgrade.bin firmware.

BTW, my AREDN status page shows Wi-Fi address = LAN Address = default gateway = "none" and it is presenting the "meshnode" wifi.

I think I've run out of things to try! :-(

Martin

AE6XE
If you are confident the device is running the "AREDN-develop-176-8c5be0cf-ubnt-loco-m-xw-squashfs-sysupgrade.bin" image, are you able to access the "meshnode" wifi and under the Setup->Administration page, click on the link at the bottom to obtain the Support data?   If so, please attach the file here.   This should give enough information to see what is going on. 
AA6E
loco m5 (xw) support data

Yes, I have the support data and also got a bunch of screenshots off my phone.

Check it out at https://aa6e.net/files/AREDN/

Thanks for the help!

NOTE ADDED:  I discovered that an older loco M5 has identical XW hardware and works fine with older software (Linux build date Apr 12 2017). I included the support data for that unit in the website above, just for reference.
 

AE6XE
Martin,   this is the message on the loco M5 XW linux kernel logs:

[    1.200000] ag71xx ag71xx.0: no PHY found with phy_mask=00000002

This means the firmware does not detect any Ethernet port-hardware.  This would be because:

A) hardware failure
B) the wrong firmware image is running that doesn't have the right Ethernet port driver

This device needs a re-install of the "loco-m-xw"  aredn image.   You can do that with one of these options:

1) use the tftp method and upload the "factory" flavor of this image
2) access the node on wifi via the Access Point "meshnode" from your laptop, go into setup and upload the "sysupgrade" flavor of this image.
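The two options above pair an install method with an image flavor, and the model part of the file name also has to match the hardware. A rough Python sketch of that check follows. The regular expression only covers the file-name patterns quoted in this thread, and the method labels `"tftp"` and `"web-upload"` are names made up for this example.

```python
import re

# Flavor to use for each install path, as described in the options above.
# The method labels are hypothetical names chosen for this sketch.
FLAVOR_FOR_METHOD = {"tftp": "factory", "web-upload": "sysupgrade"}

# Matches the "ubnt-<model>-[squashfs-]<flavor>[.bin]" tail of the file
# names quoted in this thread; other naming schemes are not handled.
_NAME_RE = re.compile(
    r"ubnt-(?P<model>[a-z0-9-]+?)(?:-squashfs)?"
    r"-(?P<flavor>factory|sysupgrade)(?:\.bin)?$"
)


def check_image(filename: str, model: str, method: str) -> list:
    """Return a list of problems; an empty list means the name looks right."""
    match = _NAME_RE.search(filename.lower())
    if match is None:
        return ["unrecognized file name"]
    problems = []
    if match.group("model") != model:
        problems.append(f"image is for {match.group('model')}, not {model}")
    wanted = FLAVOR_FOR_METHOD[method]
    if match.group("flavor") != wanted:
        problems.append(f"{method} needs the {wanted} flavor")
    return problems


print(check_image(
    "AREDN-3.16.1.1-ubnt-nano-m-xw-squashfs-sysupgrade.bin",
    "loco-m-xw", "web-upload"))
```

A name check like this only catches a mismatched download; it cannot tell which hardware revision is inside the case.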

Joe AE6XE
 

AA6E
tried everything on loco M5 XW (two units)

Have tried uploading various binary factory versions by tftp from Linux, upgrading through MeshNode to sysupgrade versions, etc. as advised.  The uploads seem to "take", but I always have the PHY device missing.  These units appear to run airOS 5.5+ just fine, and tftp of course uses the Ethernet port.  So there's no reason to suspect bad hardware, is there?

I note that a few others report success with the loco M5 recently.  It could be something related to my specific hardware batch, I suppose.  Or -- just possibly -- I'm doing something stupid.  I'm pretty experienced with Linux and small machines, but that does not preclude mistakes, of course.

I attach the ID labels from the unit in question.   If you have a locoM5 XW, maybe you could compare with your ID info?

We're at a loss unless we come up with a new bright idea.   

Thanks/73
Martin
AA6E

Image Attachments: 
AE6XE
Martin I can see in the support data this line (in the "dmesg" output) of your node:
[    0.000000] Kernel command line:  board=UBNT-NM-XW console=ttyS0,115200 mtdparts=spi0.0:256k(u-boot)ro,64k(u-boot-env)ro,7552k(firmware),256k(cfg)ro,64k(EEPROM)ro rootfstype=squashfs,jffs2 noinitrd

The "UBNT-NM-XW" indicates a wrong image file.    Here is a loco M5 XW node I recall on the SoCal network and have access to, thanks to the Jet Propulsion Lab ARC:  The correct image will show "UBNT-LOCO-XW":

[    0.000000] Kernel command line:  board=UBNT-LOCO-XW console=ttyS0,115200 mtdparts=spi0.0:256k(u-boot)ro,64k(u-boot-env)ro,7552k(firmware),256k(cfg)ro,64k(EEPROM)ro rootfstype=squashfs,jffs2 noinitrd
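The `board=` comparison above can be automated when several support dumps need checking. A minimal Python sketch, assuming the dmesg text has already been extracted from the support data archive; the function name is made up for this example.

```python
import re


def board_from_dmesg(dmesg: str):
    """Return the board= value from the 'Kernel command line' entry, or None."""
    for line in dmesg.splitlines():
        if "Kernel command line:" in line:
            match = re.search(r"\bboard=(\S+)", line)
            if match:
                return match.group(1)
    return None


# Shortened copies of the two lines quoted above.
WRONG = "[    0.000000] Kernel command line:  board=UBNT-NM-XW console=ttyS0,115200"
RIGHT = "[    0.000000] Kernel command line:  board=UBNT-LOCO-XW console=ttyS0,115200"

for sample in (WRONG, RIGHT):
    board = board_from_dmesg(sample)
    verdict = "loco image" if board == "UBNT-LOCO-XW" else "not the loco image"
    print(board, "->", verdict)
```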

This should be resolved with a tftp method with the "loco-m-xw" image from here:

http://downloads.arednmesh.org/snapshots/trunk/targets/ar71xx/generic/

Right now the image is aredn-80-014217a-b84a1c5-ubnt-loco-m-xw-factory.bin .   The next nightly build will replace build #80 with #84 or a higher number, so this link will very soon stop working.

Joe AE6XE

 

AA6E
slogging some more

Joe - I sent you the wrong support data file last night.  Sorry!  I repeated my (by now) standard process with the latest nightly files.  Here is my log.  (Sorry for the length.)

TL;DR - Looks like a firmware error to me (on two separate locoM5 units).  Except other folks seem to get good results. (But maybe not with the same hardware batch that I have?)

UBNT-AREDN firmware loads, 7/23/18, AA6E

Download aredn-84-014217a-b84a1c5-ubnt-loco-m-xw-factory.bin and ...sysupgrade.bin to "fw" directory (Google drive)

On locoM5 unit 1, go to recovery mode (flashing red-green, yellow-green).

Attach to Toshiba (Ubuntu 18.04) Ethernet port. Verify Toshiba IP (ip -4 addr) is manually set to 192.168.1.22.

Note: Using tftp-hpa 5.2

cd <fw directory>
tftp 192.168.1.20
mode binary
trace on
put aredn-84-014217a-b84a1c5-ubnt-loco-m-xw-factory.bin

response:
...
received ACK <block=12673>
Sent 6488476 bytes in 10.2 seconds [5102292 bit/s]
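As a sanity check on the trace above, the final ACK number can be compared with the number of blocks the file should need. The sketch below assumes the default 512-byte TFTP block size from RFC 1350 (no larger blksize negotiated) and that the image file is exactly the 6488476 bytes reported as sent.

```python
TFTP_BLOCK_SIZE = 512  # default DATA block size in RFC 1350 (assumed here)


def tftp_blocks(file_size: int) -> int:
    """Number of DATA blocks for a file.

    A transfer ends with a block shorter than 512 bytes, so a file whose
    size is an exact multiple of 512 needs one extra, empty block.
    """
    return file_size // TFTP_BLOCK_SIZE + 1


def last_block_size(file_size: int) -> int:
    """Payload length of the final (short) DATA block."""
    return file_size % TFTP_BLOCK_SIZE


def bits_per_second(n_bytes: int, seconds: float) -> float:
    """Average transfer rate in bits per second."""
    return n_bytes * 8 / seconds


SIZE = 6488476  # bytes, from the "Sent ..." line above
print(tftp_blocks(SIZE), last_block_size(SIZE))
print(round(bits_per_second(SIZE, 10.2)))
```

Under these assumptions the file needs 12673 blocks, with a 412-byte final block, which agrees with the last ACK in the trace, so the whole image appears to have been delivered. The computed rate comes out slightly below the figure tftp printed, presumably because the 10.2 seconds shown is rounded.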

M5 unit does fancy stuff... ending with Power, LAN, LED4 steady.

We see MeshNode appearing on 5G Wifi, showing initialization screen "NOCALL-170-14-41".

Before ANY further operations, go to "admin" screen and take FIRST SupportData download. (Downloads to Nexus 5, because Toshiba has no 5G.)

Upload SupportData to G drive.  Its label is supportdata-NOCALL-201807211859.tgz .
Examine this: "board=UBNT-LOCO-XW",
[    0.684816] libphy: Fixed MDIO Bus: probed
[    0.691820] libphy: ag71xx_mdio: probed
[    1.323815] ag71xx ag71xx.0: no PHY found with phy_mask=00000002

I.e., can't find Ethernet.

Hypothesis:  I think this gives the game away at the very beginning.  Firmware can't find Ethernet device, and all further behavior stems from this fact. This is the behavior I have gotten from all firmware releases for the locoM5 XW, I believe.

Now, initialize things via MeshNode:
nodename = AA6E-LocoM5-001
message = hello world
password = something good
max range ~ 10 km.

Save changes.
Error message:
"Configuration NOT saved! / parameter DTDMAC in file '_setup.default' does not exist"

Reboot.
Node name and initial message are as stored above, so error message lied!  Configuration was (at least partially) saved.

Take SECOND support data download. Filename is supportdata-NOCALL-201807211914.tgz .

Moving on, we try an "over the air" download of aredn-84-014217a-b84a1c5-ubnt-loco-m-xw-sysupgrade.bin .  Over the air in our case means via MeshNode wifi, since of course we have no mesh connectivity.  It's not clear why a sysupgrade.bin process should be better than the factory.bin, but we give it a try anyway.

Download to Nexus 5, upload to Me.  Then... 

Reconfigure via MeshNode.  (Reset "Keep Settings") Upload to node from file aredn-84-014217a-b84a1c5-ubnt-loco-m-xw-sysupgrade.bin .
Flash red, wait, wait, finally finishes load/boot sequence.

Node NOCALL-170-14-41
Setup as before to AA6E-....
Get same "Configuration NOT saved!" error
reboot anyway
returns as before to MeshNode login with node id AA6E-LocoM5-001

take THIRD support data download. Filename is supportdata-NOCALL-201807211859 (1).tgz

ALL support data files show similar PHY not found problem.

I have tried virtually the same procedure with a number of other bin file versions -- all for locoM5 XW hardware, and they all give the same results.



 

Support File Attachments: 
AE6XE
This would be a 3rd option: there is a new hardware REV of this device with a different Ethernet chip. To investigate this, can you tftp load the AirOS factory image back, then "ssh ubnt@192.168.1.1" (password is default 'ubnt') into the device and capture the output of "dmesg"?   Let's see what physical Ethernet device is showing.    You may be the very lucky :) first person to show us a new hardware REV of this device from ubnt.   If all else fails, there's always the option of opening the cover to inspect the chip.
AE6XE
Martin, what we're looking for is some sort of chip identification detail reported in AirOS dmesg logs.   Here is what previous loco M5 XW devices show from dmesg in AREDN firmware:
[    1.328452] ag71xx ag71xx.0: connected to PHY at ag71xx-mdio.0:01 [uid=004dd023, driver=Atheros 8032 ethernet]
These Ubiquiti devices had an unusual hardware design.  The 8032 PHY is an external chip, and the AR9342 SOC also has a physical built-in Ethernet that wasn't used for some reason.   A GPIO pin from the AR93xx is routed to the AR8032 Ethernet chip to reset it.  This was why this device would lock up a year or two ago: a special design and problem unique to this Ubiquiti hardware that had to be worked around in the firmware, separately from other AR93xx-based devices.

I'm wondering if this is a new hardware REV and they went back to using the native SOC Ethernet, lowering cost.  I see your device and older ones are still both using "AR9342 rev 2". 
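The two kinds of ag71xx lines quoted in this thread ("no PHY found" and "connected to PHY") can be told apart mechanically when comparing support dumps. A rough Python sketch follows; the regular expressions are written against the exact lines quoted here and may not match other kernel versions, and the two-digit address after the bus name is read as hexadecimal on the assumption that the kernel prints it that way.

```python
import re

_NO_PHY = re.compile(
    r"ag71xx ag71xx\.\d+: no PHY found with phy_mask=([0-9a-fA-F]+)"
)
_PHY_OK = re.compile(
    r"ag71xx ag71xx\.\d+: connected to PHY at (\S+):([0-9a-fA-F]{2}) "
    r"\[uid=([0-9a-fA-F]+), driver=([^\]]+)\]"
)


def phy_status(dmesg: str):
    """Summarize the first ag71xx PHY line found in dmesg, or return None."""
    for line in dmesg.splitlines():
        ok = _PHY_OK.search(line)
        if ok:
            bus, addr, uid, driver = ok.groups()
            return {"found": True, "bus": bus, "addr": int(addr, 16),
                    "uid": int(uid, 16), "driver": driver}
        missing = _NO_PHY.search(line)
        if missing:
            return {"found": False, "phy_mask": int(missing.group(1), 16)}
    return None


GOOD = ("[    1.328452] ag71xx ag71xx.0: connected to PHY at "
        "ag71xx-mdio.0:01 [uid=004dd023, driver=Atheros 8032 ethernet]")
BAD = "[    1.200000] ag71xx ag71xx.0: no PHY found with phy_mask=00000002"

print(phy_status(GOOD))
print(phy_status(BAD))
```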

Joe AE6XE  
 
AA6E
AirOS info

Joe,
I set up the current AirOS, which I think is v 6.1, and extracted the dmesg info, along with other "interesting" config files. Attached as a tgz.

I see a mention of an AR8035 device (not 8032).  I confirm that by inspecting the board.  I don't know what's under the heatsink or inside the big shield can.  Too chicken for that. There is a mention of an "eth1: link not ready" FWIW, but the AirOS GUI is not complaining.

If it would be helpful, I could loan you one or both of our devices.

73 Martin AA6E

Support File Attachments: 
AE6XE
There's something more going on than upgrading to the AR8035.   This by itself should work.  The kernel driver in use is the  "803x" which supports the following chips:
#define AT803X_PHY_ID_MASK            0xffffffef
#define ATH8030_PHY_ID                0x004dd076
#define ATH8031_PHY_ID                0x004dd074
#define ATH8032_PHY_ID                0x004dd023
#define ATH8035_PHY_ID                0x004dd072
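To see why the 8035 should be recognized by this driver, the masked comparison can be worked through in Python. This is a sketch of the usual "compare under mask" rule, which is how the kernel PHY layer matches IDs as far as I understand it; the helper name is made up, and only the constants come from the defines above.

```python
AT803X_PHY_ID_MASK = 0xFFFFFFEF  # every bit except bit 4 takes part

KNOWN_PHYS = {
    "ATH8030": 0x004DD076,
    "ATH8031": 0x004DD074,
    "ATH8032": 0x004DD023,
    "ATH8035": 0x004DD072,
}


def match_phy(uid: int) -> list:
    """Names of the listed chips whose ID equals uid under the mask."""
    return [
        name
        for name, phy_id in KNOWN_PHYS.items()
        if (uid & AT803X_PHY_ID_MASK) == (phy_id & AT803X_PHY_ID_MASK)
    ]


print(match_phy(0x004DD023))  # uid reported by older loco M5 XW units
print(match_phy(0x004DD072))  # uid of an AR8035
```

The first call matches only ATH8032 and the second only ATH8035, so chip identification alone should not be what prevents the PHY from being found.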

The loco-m-xw image is hardcoded to find the chip here (this is probably the issue):

static struct mdio_board_info ubnt_loco_m_xw_mdio_info[] = {
    {
        .bus_id = "ag71xx-mdio.0",
        .phy_addr = 1,
        .platform_data = &ubnt_loco_m_xw_at803x_data,
    },
};

This corresponds to older hardware that finds the chip at "ag71xx-mdio.0:01".   I just checked on a rocket M5 XW and it finds the 8035 at "ag71xx-mdio.0:04".
Guessing a little here, but you might be able to load the rocket-m-xw image, and there is a chance it will work.
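The address mismatch can be made concrete with a little bit arithmetic. The sketch below assumes that `phy_mask` in the "no PHY found" message is a bitmask of MDIO addresses to probe, with bit n standing for address n. That reading is consistent with phy_mask=00000002 appearing alongside the hardcoded address 1, but it is an assumption, not something confirmed in this thread.

```python
def phy_addresses(phy_mask: int) -> list:
    """MDIO addresses (0..31) selected by a phy_mask bitmask."""
    return [addr for addr in range(32) if phy_mask & (1 << addr)]


def mask_for_address(addr: int) -> int:
    """Bitmask that selects a single MDIO address."""
    if not 0 <= addr < 32:
        raise ValueError("MDIO addresses run from 0 to 31")
    return 1 << addr


probed = phy_addresses(0x00000002)      # mask from the loco-m-xw dmesg
print(probed)                           # addresses the loco image looks at
print(4 in probed)                      # is the rocket's address 4 covered?
print(f"{mask_for_address(4):08x}")     # mask that would cover address 4
```

Under that assumption the loco image probes only address 1, so a PHY sitting at address 4 would never be seen.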

Joe AE6XE
AA6E
When your loco is a rocket...

The good news is that my loco m5 xw does boot with the rocket image: aredn-96-014217a-3f0d44b-ubnt-rocket-m-xw-factory.bin .  That is, the Ethernet is found and the web server and DHCP start up.

The bad news is that there is no Mesh.  OLSRD seems kaput.  Advice? Support data below.

Thanks for help so far!

Support File Attachments: 
AE6XE
Bingo.   It looks like the new hardware REV of the NanoStation loco M5 XW has changed the Ethernet chip hardware to be identical to the Rocket devices.    It now finds the same chip at the same physical bus and device ID:

[    1.422839] ag71xx ag71xx.0: connected to PHY at ag71xx-mdio.0:04 [uid=004dd072, driver=Atheros 8035 ethernet]

The support data shows OLSR running.  Double-check that the configuration settings are consistent, with the same wireless channel and bandwidth as the other nodes you are meshing with.    This device is on a 5 MHz channel width on ch 149.  Are the other devices set the same?   Maybe they are on a 10 MHz or 20 MHz channel width?

Joe AE6XE
 
AA6E
olsrd working? Not?

Still slogging with the rocket image on my loco M5.   I'm not sure what "success" looks like exactly, but I think something is wrong.  First of all, there is no OLSR tab showing on the status page.  (See attached photo.)  Also, there are multiple daemon.err messages in the dmesg support data list:

Thu Jul 26 06:54:06 2018 daemon.err dnsmasq[1659]: failed to load names from /var/run/hosts_olsr: No such file or directory
Thu Jul 26 06:55:55 2018 daemon.err uhttpd[843]: Argument "-49|184| |&#x41&#x52&#x45&#x44&#x4E&#x2D&#x35&#x2D&#x76&..." isn't numeric in sort at /usr/local/bin/wscan line 300.
Thu Jul 26 06:57:05 2018 daemon.err uhttpd[843]: Argument "-52|184| |&#x41&#x52&#x45&#x44&#x4E&#x2D&#x35&#x2D&#x76&..." isn't numeric in sort at /usr/local/bin/wscan line 300.
Thu Jul 26 06:57:19 2018 daemon.err uhttpd[843]: Argument "-52|184| |&#x41&#x52&#x45&#x44&#x4E&#x2D&#x35&#x2D&#x76&..." isn't numeric in sort at /usr/local/bin/wscan line 300.
Thu Jul 26 06:57:33 2018 daemon.err uhttpd[843]: Argument "-50|184| |&#x41&#x52&#x45&#x44&#x4E&#x2D&#x35&#x2D&#x76&..." isn't numeric in sort at /usr/local/bin/wscan line 300.
Thu Jul 26 07:06:20 2018 daemon.err uhttpd[843]: uci: Entry not found
Thu Jul 26 07:06:45 2018 daemon.err uhttpd[843]: uci: Entry not found
Thu Jul 26 07:07:15 2018 daemon.err uhttpd[843]: Argument "-55|184| |&#x41&#x52&#x45&#x44&#x4E&#x2D&#x35&#x2D&#x76&..." isn't numeric in sort at /usr/local/bin/wscan line 300.
Thu Jul 26 07:07:29 2018 daemon.err uhttpd[843]: Argument "-57|184| |&#x41&#x52&#x45&#x44&#x4E&#x2D&#x35&#x2D&#x76&..." isn't numeric in sort at /usr/local/bin/wscan line 300.
Thu Jul 26 07:07:34 2018 daemon.err uhttpd[843]: uci: Entry not found
Thu Jul 26 07:07:42 2018 daemon.err uhttpd[843]: uci: Entry not found
Thu Jul 26 07:09:56 2018 daemon.err uhttpd[843]: uci: Entry not found
Thu Jul 26 07:10:08 2018 daemon.err uhttpd[843]: Argument "-55|184| |&#x41&#x52&#x45&#x44&#x4E&#x2D&#x35&#x2D&#x76&..." isn't numeric in sort at /usr/local/bin/wscan line 300.
Thu Jul 26 07:10:19 2018 daemon.err uhttpd[843]: uci: Entry not found
Thu Jul 26 07:10:22 2018 daemon.err uhttpd[843]: Argument "-53|184| |&#x41&#x52&#x45&#x44&#x4E&#x2D&#x35&#x2D&#x76&..." isn't numeric in sort at /usr/local/bin/wscan line 300.
Thu Jul 26 07:10:25 2018 daemon.err uhttpd[843]: uci: Entry not found
Thu Jul 26 07:11:08 2018 daemon.err uhttpd[843]: uci: Entry not found
Thu Jul 26 07:11:14 2018 daemon.err uhttpd[843]: uci: Entry not found
Thu Jul 26 07:20:55 2018 daemon.err uhttpd[843]: uci: Entry not found
Thu Jul 26 07:21:24 2018 daemon.err uhttpd[843]: uci: Entry not found

(See the full support data provided in most recent prior message.)
These errors do not show up on the listing from my working locoM5 node running 3.17.1.0RC1.  So maybe the "rocket" architecture is really not compatible with the "loco"?  What to do next?  Any hope of a proper loco software version that supports the new Ethernet arrangement?

Thanks again!  - Martin AA6E

Image Attachments: 
K5DLQ
The OLSR status button has been intentionally removed to mitigate a defect in the olsr_http plugin (and to conserve RAM).  This information is also available via the olsr_json plugin on port 9090 (i.e., http://localnode:9090).

AE6XE
Martin,   change all your nodes to a 10 MHz or 20 MHz channel width.  This should solve the problem.   Sorry, this didn't ring a bell before, but we had previously discovered that 5 MHz channels on the 5 GHz nodes are not functional.  This is not unique to the loco M5 XW; it affects all 5 GHz devices, including Mikrotik.    This is something we're going to have to dig into and work on with the openwrt group.    The issue does not exist on 2 GHz devices.

Joe AE6XE
AA6E
Not sure about bandwidth, but...

I seem to have my 3-node test all working, including the "new" loco/rocket units.  I found they worked fine at 5 MHz, BTW, but I have switched to 10 MHz anyway.  Now, on to trying to do actual useful stuff with AREDN.

tnx/73 Martin AA6E

AE6XE
That's good news.  It has been a couple of months since I tested 5 MHz channel width on 5 GHz.

We do find that 10 MHz is the sweet spot for most links and yields higher throughput than both 20 MHz and 5 MHz.   However, if the link is relatively short, maybe up to a mile or two, then 20 MHz may offer higher throughput.
AE6XE
Martin,  can you find any information on the device's label or motherboard that would clue us in to whether a loco M5 XW has the 8032 or the 8035 chip?    Any revision numbering, etc., anywhere?  The test date may help us zero in, worst case.
 
AA6E
labeling

I may have posted this before (this is a LONG thread!).  Here is the data I have on the "new" loco M5 XW (board and box) that seems to require the rocket binary.

73 Martin AA6E

Image Attachments: 
W2NP
Same issue, but no luck with Rocket
I'm having this same issue; the Ethernet port quit working with the latest nightly build (105) flashed as firmware.  I tried a hard reset, and the button broke off inside the box. I then tried to do a reset using the wireless PoE adapter, but even after holding in the button for 3 full minutes, the lights didn't change.   I took pictures of the board inside the Loco M5 and uploaded them here: https://imgur.com/a/4zf8gVQ

I was able to update the firmware to the sysupgrade.bin firmware using the wifi connection, but still no luck in being able to finish configuring the node.  I tried the Rocket firmware, but it threw an error saying "the firmware CANNOT be updated, the upload file is not recognized."

I've attached the support data file.
Support File Attachments: 
AE6XE
W2NP,  I can confirm from the support download that you have the XW hardware (an Atheros AR9342 chip).  It indeed cannot find the Ethernet chip.   Double-check that you are loading the "rocket-m-xw" image and, if you are using the tftp method, that it is the 'factory' flavor.     If you upload over a wifi connection to this device with the AREDN UI, make sure it is the "sysupgrade" flavor.

Joe AE6XE
W2NP
Well, this is strange.

I uploaded the image over the AREDN UI as you suggested: 

aredn-112-22e4557-5828113-ubnt-rocket-m-xw-sysupgrade.bin


But now it's not showing up over wifi.  I waited over half an hour, no change.  Then I power-cycled the air gateway and the M5, and at first it showed up; I clicked on the setup button, and it went away again.

I connected it by ethernet cable, and the same kind of thing happened: The main start screen showed up, I clicked setup, entered root/hsmm for user/pass and then it went away.

Not sure how to proceed, and I can't access the supportdata file to attach here.

W2NP
OK, well, it seems like it stays up on Ethernet for about 45 seconds or so, then it crashes.  Luckily that was just long enough to get the support data file.  I've attached it here.

Support File Attachments: 
AE6XE
W2NP,  Not finding any smoking gun in the supportdata.    Be sure to clear the browser cache and see if you can get through the setup screen.  
W2NP
Had the same problem on two different machines, and my phone, so I don't think that's it.  So weird...
W2NP
got it working
I ended up using the TFTP procedure to upload the latest nightly build of the factory rocket firmware (aredn-116-22e4557-4f30825-ubnt-rocket-m-xw-factory) and it appears to be working!

Thanks for all your help!
