
Potosi M2Rocket still off-line

AA7AU
Potosi M2Rocket still off-line

Starting a new thread to deal with on-going problem. For details, please see last few posts in this thread:
https://www.arednmesh.org/content/bullet-m2hp-xw-firmware

As of Sunday 20-Jan-2019, the Rocket M2 (with 120° sector) on Mt Potosi in the SW corner of the LV area is still unavailable. This is a very important node here (in the short-to-interim timeframe) and this problem has crippled many new mesh users who were relying upon linking thru Potosi.

Basically we can "see" the node in WiFi Scan at good signal strength and with the proper SSID etc. It just seems like it won't handshake with other nodes which used to hit it fairly easily. This removes any IP-based options for trouble-shooting.

It almost looks like it flipped into AP mode instead of Mesh mode. Physical access to this node is currently problematic, and I don't have full details yet.

I have two questions now so as to hopefully move forward on this:

1- is there any way to use that MAC address to connect over RF to the node if it's in Mesh operation?

2- if we wanted to try to connect to the node in AP mode, how would we configure one of our operational nodes to contact it (on -2/10), and then could we somehow remotely reboot that node back into proper operation?

TIA,
- Don - AA7AU

AA7AU
Still shows up in Real-Time SNR charts

The SNR charts must be MAC based as Potosi still shows up with a consistent reading for real-time SNR - so that can't be IP-based.

Just adding another data point - still need HELP!

- Don - AA7AU

AE6XE
If you see this node on 10MHz

If you see this node on a 10MHz channel width, then it can't be in AP mode -- no settings were ever defined that would put it in that state. There have been occurrences of moisture shorting out the cat5 wiring, which can put a node in firstboot state (same as pressing the remote reset button on a UBNT power brick for ~15 seconds). In that case the node would be in firstboot, but the AP would be on a standard 20MHz channel.

On a node receiving the Potosi signal, please grab a support download file. In this data is the output of a command that shows whether it is connected over an 802.11 adhoc network, "iw dev wlan0 station dump". If Potosi is listed in this output, that confirms it is still in mesh mode. If it has an 802.11n adhoc connection, then we'd be looking at the next level, OLSR activity, which exchanges IP addresses and hostnames. This gets a bit more technical, but on your local node, install the tcpdump package and, from the command line, run "tcpdump -i wlan0 port 698" and look to see if any data is coming from the Potosi node. If not, then OLSR is not functioning at Potosi.
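
The two checks above could be scripted roughly as follows. This is only a sketch: station_listed is a hypothetical helper name, PEER_MAC is a placeholder for the far node's MAC as shown in WiFi Scan, and the "-c 20" packet limit is an arbitrary choice so the capture stops on its own.

```shell
# Hypothetical helper: succeed if the given MAC appears as a station
# in "iw dev wlan0 station dump" text supplied on stdin.
# grep -i makes the MAC comparison case-insensitive.
station_listed() {
    grep -qi "^Station $1 "
}

# Usage on a node (not run here); PEER_MAC is a placeholder value:
#   PEER_MAC=aa:bb:cc:dd:ee:ff
#   if iw dev wlan0 station dump | station_listed "$PEER_MAC"; then
#       echo "802.11 adhoc link present; check OLSR next"
#       tcpdump -i wlan0 -c 20 port 698
#   else
#       echo "no 802.11 link to $PEER_MAC"
#   fi
```

If the station is listed but no port 698 packets arrive from its IP, that points at OLSR on the far node rather than at the radio link.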

Joe AE6XE

AA7AU
Potosi is the first entry in that list

Potosi's MAC is DC:9F:DB:36:81:99 - still shows up in WiFi scans. Here's your data:

root@W7HEN-HARC-M2R90-TDY:~# iw dev wlan0 station dump
Station dc:9f:db:36:81:99 (on wlan0)
        inactive time:  350 ms
        rx bytes:       2632078556
        rx packets:     12549391
        tx bytes:       821955257
        tx packets:     6158694
        tx retries:     4135115
        tx failed:      1683
        rx drop misc:   1136849
        signal:         -82 [-85, -85] dBm
        signal avg:     -82 [-84, -87] dBm
        tx bitrate:     19.5 MBit/s MCS 2
        rx bitrate:     39.0 MBit/s MCS 10
        expected throughput:    13.366Mbps
        authorized:     yes
        authenticated:  yes
        associated:     yes
        preamble:       long
        WMM/WME:        yes
        MFP:            no
        TDLS peer:      no
        DTIM period:    0
        beacon interval:100
        connected time: 1500991 seconds

What's next?

Potosi remains unresponsive on the mesh IP-layer,
- Don - AA7AU

edited to add: this data is from a node which has NOT rebooted since before Potosi went missing.
 

AE6XE
This says that there is an

This says that there is an 802.11 adhoc connection between the Potosi node and this node. It looks like about a 17 dB SNR received signal. The Potosi node is live and making a wireless link. Next step is to run the tcpdump command to see if OLSR is up and sending out hello packets. I'd suspect there are none, and thus no traffic can be exchanged, as there is no routing information to carry IP traffic.
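
For reference, the SNR figure is just signal minus noise floor. The station dump reports only the signal (-82 dBm), so a 17 dB SNR implies a noise floor near -99 dBm; that noise value is inferred here, not read from the dump. A small sketch, with station_signal and snr_db as hypothetical helper names:

```shell
# Print the "signal:" value (dBm) for one station from
# "iw dev wlan0 station dump" text supplied on stdin.
# The "signal avg:" line is skipped because its first field is "signal".
station_signal() {
    awk -v mac="$1" '
        $1 == "Station" { in_peer = (tolower($2) == tolower(mac)) }
        in_peer && $1 == "signal:" && !seen { print $2; seen = 1 }
    '
}

# SNR in dB = signal (dBm) - noise floor (dBm).
snr_db() {
    echo $(( $1 - $2 ))
}

# Usage on a node (not run here); -99 is an ASSUMED noise floor:
#   sig=$(iw dev wlan0 station dump | station_signal dc:9f:db:36:81:99)
#   snr_db "$sig" -99
```

With the values from this thread, snr_db -82 -99 prints 17.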

You'll need to locally sync with owners of the node to gain access to further investigate.  Don KE6BXT is at Quartzsite, not sure about Frank to gain access.

Joe AE6XE

AA7AU
TCPDUMP doesn't find Potosi

OK, did the tcpdump now:       tcpdump -i wlan0 port 698
just cycles thru a few known nodes except not sure about this one;
22:06:46.172362 IP 10.71.178.144.698 > 10.255.255.255.698: OLSRv4, seq 0xfde8, length 60
but no entries for Potosi and its old IP#

Looks like it's not talking .... but it still shows up in WiFi scan this eve.

Frank responded to my earlier email this evening: "... no remote resets available. Physical hill top access is not likely any time soon."

Is there anything else we can try remotely?

Thanks,
- Don - AA7AU

AE6XE
Not much that can be done at

Not much that can be done at this point. The node has a watchdog reset feature that triggers if OLSR stops responding on the node, so how it got into this state is unexplained. I'd want a support data download, which can be obtained from a laptop on the LAN of the node at the site, before rebooting it.

You might try this command to see if any traffic is coming out, "tcpdump -i wlan0 ether host dc:9f:db:36:81:99".
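
To avoid watching the terminal, the capture can be time-limited and the closing summary parsed. This is a sketch under assumptions: captured_count is a hypothetical helper, the 180-second limit is arbitrary, and a "timeout" applet is assumed to be present on the node (its argument syntax differs between BusyBox versions).

```shell
# Extract the packet count from tcpdump's closing summary
# ("N packets captured"), read from stdin.
captured_count() {
    awk '/packets captured/ { print $1 }'
}

# Usage on a node (not run here). tcpdump writes its summary to stderr,
# so stderr is folded into stdout before parsing:
#   n=$(timeout 180 tcpdump -i wlan0 ether host dc:9f:db:36:81:99 2>&1 | captured_count)
#   if [ "${n:-0}" -gt 0 ]; then
#       echo "frames seen from Potosi: $n"
#   else
#       echo "no frames captured from Potosi"
#   fi
```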

Joe AE6XE 

AA7AU
Nada

Thanks, Joe!

Ran the "tcpdump -i wlan0 ether host dc:9f:db:36:81:99" and got *nothing* for three minutes of waiting:
root@W7HEN-HARC-M2R90-TDY:~# tcpdump -i wlan0 ether host dc:9f:db:36:81:99
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on wlan0, link-type EN10MB (Ethernet), capture size 262144 bytes
^C
0 packets captured
0 packets received by filter
0 packets dropped by kernel

Understand you need a support data file from the node using LAN to try to figure this one out. I hope we can get that for you. However, I have no control over who will ultimately end up at the site and perform the power cycle. I will inform Frank (no reply yet to my last email) that this data capture needs to be done to help AREDN going forward.

I think that this Potosi failure certainly makes it real clear to all that a hub-and-spoke network design is a very poor choice - when the central focus stumbles and fails ... and even more so when that central point is only accessible at certain times of the year and then with difficulty. I have five new mesh users here in the HOA-dominated west side of Henderson, all pointed at Potosi (with no other current alternative) who are now deaf!  --sigh--

- Don - AA7AU

K6AH
Hub and Spoke Not Always a Bad Choice

Don, central site nodes are maintainable if you have alternate ways into them.  Most all such sites in the SoCal network have access through a separate channel and usually on a different band.  In addition, I would never place a node at a hard-to-get-to site without having a managed PoE switch that you can turn power off/on to each node.  It's all in designing to a set of requirements which must include maintainability. 

Andre, K6AH
 

AA7AU
Apologies for the overly broad comment.

Thanks Andre. Sorry for the overly broad comment. You are absolutely right and AREDN has good guidance on how to properly design networking.

The fellow in charge of this site now writes that they intend to implement a power-cycle type control over another RF access. But for now that node is unavailable; hopefully we'll get that data capture before power-cycle. But, I'm somewhat out of the loop on that.

Up in Idaho, our shoestring budget sometimes precludes common-sense things like remote control over PoE. But we're working on it. Luckily we have mostly a true interconnected mesh up there and the mountain top mesh node is not central to continued operations.

Thanks for all you do,
- Don - AA7AU
