After upgrading to 13.22.8.0 from 13.22.6, advertised services are being dropped after a few hours.
Rebooting the node supporting those services brings the services back for a few hours. Then they
no longer show up under "Services" again.
I have reloaded 13.22.8 several times with the same results, on both a Rocket and a NanoStation.
I did not see anything in the advanced configuration settings that would impact services.
Rolling back to 13.22.6 clears the problem.
The screen capture shows what is normal. When the services are dropped, there are no entries under Services.
Also below are the configuration entries.
Bill KA2FNK
On the default AREDN page:
https://www.arednmesh.org/
Scroll down to:
"Now run an hourly check on published service and “unpublish” any which aren’t really available. Re-enabled services will be..."
Click 'Read More':
See:
"Now run an hourly check on published service and “unpublish” any which aren’t really available. Re-enabled services will be republished automatically."
So, for some reason, your node thinks that the configured services are not available.
Are your configured services available?
73, Chuck
No changes were made to the configuration of these services between the two versions.
Yes, the services are present, so why the time out?
Under 13.22.8.0 I did disable and re-enable the services, but they still
time out.
Bill KA2FNK
You say 'time out' and 'are present', but do they work?
Documentation uses 'available'.
I think the AREDN code is using a 'ping' test on the advertised service.
Are your advertised services that are being 'dropped' PING'able?
I had to be more careful with syntax when adding in services in the current production release; 3.22.8.0.
Version 3 (three), not 13 (Thirteen)
73, Chuck
The services are not available when the "timeout" occurs. I said time out because the
service is available for a period of time after that node is rebooted, or if I delete and
reconfigure that service. Within 4 hours the service stops being advertised.
Below is with the service working:
Current neighbors   NQ     NLQ    TxMbps   Services
ka2fnk-5g-o-p1      100%   100%
KA2FNK-Cam1                                KA2FNK-Cam1
Not being advertised (timeout):
Current neighbors   NQ     NLQ    TxMbps   Services
ka2fnk-5g-o-p1      100%   100%
KA2FNK-Cam1
Note the missing KA2FNK-Cam1 under Services.
Preamble: We just spent Saturday the 4th on our mtn top repeater site (9100'+) above town where we also have two AREDN nodes installed along with two brand-X IP-cameras (which have lived there thru two very severe winters with flying colors). To be able to make this visit this time required a special permission letter from the local Forest Circus Ranger and from ICS Operations for the "Moose Fire" (100k acres and burning) so we jumped thru all the hoops and finally got our "papers" in order for the visit. While there for the entire day, we accomplished everything on our list to be ready for the next upcoming winter cycle (we generally only get at best a 4-5 month seasonal availability).
While there, I upgraded both nodes to the latest-and-greatest "stable" AREDN firmware after making some needed corrections for proper DTD in our local VLAN switch. Everything tested all OK - Hooray. We did what we needed to do and left tired/hypoxic/happy. So, happy/contented/still breathing-OK, we wrapped up and rolled back down to our little rural remote river valley.
Waking up the next morning, I find that my two (password-protected) cameras up there are now "unpublished" even though they are operational and responsive over the mesh from a direct IP-link. According to what I read here, the node seems to think that they are not responsive and therefore, now under the new v#8 protocol, the properly set up and defined entries for those cameras are NOT listed as active links, although the underlying "server" names still show up.
The probability of getting another kitchen pass from the authorities to return to that site is minimal; they are letting the forests burn until the coming winter snows put them out; and so I will likely have to wait for next spring (or buy a snowcat) to get the next physical access.
What can I do remotely to re-instate my two important mtn-top cameras as active links? (Yes, I know I can rely upon direct IP-addressing, but I like to try to stay at least one step above bush league.)
Help? (cough, cough, cough, cough ... they would rather burn the forest than properly manage/log it)
Smoke the Bear is dead and we are more blind than before,
- Don / AA7AU
ps: I guess I first need to understand what "*really*" means in this sentence:
"Now run an hourly check on published service and “unpublish” any which aren’t really available. Re-enabled services will be republished automatically."
Nope, just plain http. Both have been working for a long time with no issues. I even tried shifting one to 8080 to see if that made a difference under v.8, but it didn't.
- Don / AA7AU
Both cameras respond to pings over the mesh.
Anyone else got an idea how we can force the node to advertise perfectly good and responsive links as properly/previously set up?
Is it possible that the password protection on the camera webserver defeats the "intelligence" of this new feature?
Please correct me if I'm wrong, but changing the way a node works that breaks backward compatibility does NOT seem like a "feature"...
Help?
- Don / AA7AU
<aside>We no longer have physical access to that site (and may even lose it entirely) as the Moose Fire was busy burning closer and closer to town and our municipal watershed last night. The pyrotechnics were astounding: an entire ridge line on fire with huge group torching etc. The Forest folks would rather burn the forest than log it, but this time they lost control (but probably gained budget $$).</aside>
Both cameras respond to pings from the node itself when logging in thru 222/telnet. And why wouldn't they if the cameras remain active when the underlying link is used directly over the mesh?
The only thing I can think of is that the non-backward-compatible publishing logic does not properly allow for a password challenge as the initial response.
If nothing further is heard here, guess I gotta go file a bug report - sigh! I would revert to 3.22.6.0 but that's not an option for this node (see earlier posts); it's gotta stay at v8 until next year by the looks of things and there's no way I'll do anything remote on that isolated node.
- Don / AA7AU
ps: fire has slowed down and not yet reached our mtn top site, although it is now burning hot in the municipal watershed above town. Thankfully we now have a much more capable type *ONE* IMT on the fire as of 0600 local this morning. There are very noticeable differences between the various IMTs and it's even obvious in how well their PIO staff takes its responsibilities to keep the public informed in a timely manner.
pps: I've personally done all the major ICS courses etc, up to and including COML and even AUXCOMM. It's now my observation that the time delays inherent in the overall ICS management structure/culture directly lead to longer/hotter fires run by risk-averse personnel. Meetings can keep aircraft on the ground while group think works every morning. Also: One needs to watch the AHJ (Forest Circus in this case) closely as they sometimes may have hidden agendas that are not necessarily parallel to the values/objectives of the local civilian community.
Replied to email, but that is a less than optimum way of contact for me as the email address registered here is inundated with copies of forum posts and so I only occasionally scan it. I'd rather have the conversation here in this forum and then if necessary use the forum user-to-user direct contact here for private info. That way everyone else interested can follow along.
So far:
suggestion: curl -k -L -v
result:
root@xxxxxxx:~# curl -k -L -v
curl: no URL specified!
curl: try 'curl --help' for more information
- Don / AA7AU
> It would appear the forum software stole a bit of that line - it should read:
> curl -k -L -v URL_OF_THE_FAILING_CAMERA_GOES_HERE
OK, my guess is that it looks like your brand new code won't like the 401/authorization bit from the initial password challenge. You should be able to replicate the same on any http which has a standard pw challenge.
Please read the full details in my forum post; there is little I can do to/with that node (or cameras) without risking it all as being down/dead on the mtn until early next July. Otherwise I would immediately revert to the prior *stable* v6 release.
Ideas?
- Don
3.22.8.0, r11427-9ce6aa9d8d
----------------------------------------------
root@xxxxxxxx:~# curl -k -L -v baldycam1.local.mesh/img/snapshot.cgi?size=4
> GET /img/snapshot.cgi?size=4 HTTP/1.1
> Host: baldycam1.local.mesh
> User-Agent: curl/7.66.0
> Accept: */*
>
< HTTP/1.1 401 Unauthorized
< WWW-Authenticate: Basic realm="Authorization"
< Content-Type: text/html
< Content-Length: 351
< Date: Mon, 12 Sep 2022 00:58:46 GMT
< Server: ip-camera
<
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>401 - Unauthorized</title>
</head>
<body>
<h1>401 - Unauthorized</h1>
</body>
</html>
root@xxxxxxxx:~#
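The transcript above comes down to one fact: the camera answers, but with a 401. For anyone repeating the test on another service, a minimal sketch that prints only the final status code may be easier to read than the full -v output. The function name http_status and the 10-second limit are my own choices for illustration and have nothing to do with the AREDN code; the curl options themselves are standard.

```shell
# Hypothetical helper, not part of AREDN: print only the final HTTP status code.
#   -k             skip TLS certificate verification (as in the command above)
#   -L             follow redirects (as in the command above)
#   -s             silence the progress meter
#   -o /dev/null   discard the response body
#   -m 10          give up after 10 seconds (arbitrary choice)
#   -w ...         print the status code once the transfer ends
http_status() {
    curl -k -L -s -o /dev/null -m 10 -w '%{http_code}\n' "$1"
}

# Usage on the node (URL taken from the transcript above):
#   http_status 'baldycam1.local.mesh/img/snapshot.cgi?size=4'
```

A 401 here means the web server answered with a password challenge. curl prints 000 when it could not get any HTTP response at all.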
I've made a change which will allow 401 status codes in upcoming nightly builds.
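The reasoning behind that change can be sketched roughly as follows. This is only an illustration of the idea and not the actual AREDN code: the function name and the exact set of accepted codes are assumptions. The point is that a 401 proves a web server is listening on the configured port, so the service can reasonably stay advertised.

```shell
# Sketch only, NOT the AREDN implementation. Decide whether an HTTP status
# code should count as "something is alive on this service".
status_means_available() {
    case "$1" in
        2??|3??) return 0 ;;   # success or redirect
        401)     return 0 ;;   # password challenge: the server did answer
        *)       return 1 ;;   # 000 (no response), other 4xx, 5xx
    esac
}

# Usage:
#   if status_means_available 401; then echo "keep advertising"; fi
```

Whether other 4xx codes should also count is a judgment call; this sketch deliberately accepts only 401, since that is the case discussed in this thread.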
I have an event this weekend where we will be using these nodes and services.
After the event is over I'll reinstall 13.22.8.0 so we can continue to investigate.
Thanks for responding so quickly.
I finally had time to set up the node and cameras that were having the issue.
I have two cameras from the same manufacturer that are installed on a tower. I updated the AREDN software to
13.22.8.0 on the node that these cameras are connected to. The advertising of these cameras did not get dropped.
Comparing software versions, I noted a difference in the configuration software on the cameras.
The Tower camera (older camera software version) had both http and https enabled by default. However, the
new cameras by default only had https enabled. I had configured the node supporting the new cameras using http, just like the
older cameras. The new cameras, when receiving the http request, would respond using https, which permitted access. This masked
the issue.
Problem found:
Since the new cameras did not respond to the node on http (port 80), the node would stop advertising them as a
service after a period of time. This made sense, as that service on port 80 was not available.
Turning on http on the new cameras corrected the problem.
My takeaway from this is that 13.22.8.0 is checking for a response on the service port
that is configured on the node for that service. Which is what you want.
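For anyone wanting to check the same thing by hand from the node, here is a rough sketch that asks for the status code on exactly the protocol and port entered for the service. The helper names service_url and probe are hypothetical, CAMERA_HOST is a placeholder, and none of this is taken from the AREDN code. The -L option is left out on purpose so that a redirect shows up as a 3xx code instead of being silently followed.

```shell
# Hypothetical helpers, not part of AREDN.

# Build the URL a service entry points at.
# $1 = protocol, $2 = host, $3 = port, $4 = optional path
service_url() {
    path=${4:-}
    printf '%s://%s:%s/%s\n' "$1" "$2" "$3" "${path#/}"
}

# Ask one URL for its status code and say whether anything answered.
# No -L here, so a redirect is reported as 3xx rather than followed.
probe() {
    code=$(curl -k -s -o /dev/null -m 10 -w '%{http_code}' "$1") || true
    if [ -z "$code" ] || [ "$code" = "000" ]; then
        echo "$1 no response"
    else
        echo "$1 HTTP $code"
    fi
}

# Usage (CAMERA_HOST is a placeholder for the camera's host name):
#   probe "$(service_url http  CAMERA_HOST 80)"
#   probe "$(service_url https CAMERA_HOST 443)"
```

If the http line reports no response while the https line returns a code, the service entry on the node probably points at a protocol or port the device is not serving.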
Thanks to everyone that helped and responded.
Bill KA2FNK