this post was submitted on 21 Jul 2024
Linux
I agree with the other comment. Look into the actual logs of the services. If they send a 503, they should be able to provide an explanation.
If you're asking if your ISP is alright... You can monitor that. Monitor if DNS is working, monitor if a ping to some server has hiccups.
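A minimal sketch of that kind of check in Python, standard library only. The hostname `example.com`, the target `1.1.1.1` on port 443, and the function names are assumptions for illustration. A TCP connect stands in for an ICMP ping here because it needs no special privileges.

```python
import socket
import time


def check_dns(hostname, resolver=socket.getaddrinfo):
    """Return (ok, seconds) for one DNS lookup of `hostname`."""
    start = time.monotonic()
    try:
        resolver(hostname, None)
        ok = True
    except OSError:  # socket.gaierror is a subclass of OSError
        ok = False
    return ok, time.monotonic() - start


def check_tcp(host, port, timeout=2.0):
    """Return (ok, seconds) for one TCP connect, a rough ping substitute."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            ok = True
    except OSError:  # covers timeouts, refusals and unreachable networks
        ok = False
    return ok, time.monotonic() - start


def report(hostname="example.com", host="1.1.1.1", port=443):
    """Print one timestamped line; both targets are placeholder assumptions."""
    dns_ok, dns_s = check_dns(hostname)
    tcp_ok, tcp_s = check_tcp(host, port)
    print(f"{time.strftime('%H:%M:%S')} dns={dns_ok} ({dns_s:.3f}s) "
          f"tcp={tcp_ok} ({tcp_s:.3f}s)")
```

Calling `report()` from cron or a small loop and keeping the output in a file gives you timestamps to compare against the alerts. Slow lookups are often as telling as failed ones, which is why the duration is logged too.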
And then go about it methodically. Is it just completely random services? Then it's likely that your monitoring itself has connectivity issues. Or is there some structure to what you're seeing? Do the issues all concern the same server? Or location? Or protocol? Then that's probably where the problem is. Or it's a bit more complicated, but the affected services share something in common, like a piece of software or an infrastructure element.
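One way to look for that structure is to tally the failed checks by attribute. A rough sketch: the record fields (`server`, `location`, `protocol`) and the sample values are made up for illustration, so adapt them to whatever your monitoring actually exports.

```python
from collections import Counter


def tally(failures, keys=("server", "location", "protocol")):
    """Count how often each value of each attribute appears among the failures."""
    return {key: Counter(f[key] for f in failures) for key in keys}


def common_factors(failures, keys=("server", "location", "protocol")):
    """Return the attributes whose value is identical across every failure."""
    if not failures:
        return {}
    first = failures[0]
    return {
        key: first[key]
        for key in keys
        if all(f[key] == first[key] for f in failures)
    }


# Hypothetical failure records, not real data.
failures = [
    {"server": "home", "location": "eu", "protocol": "https"},
    {"server": "home", "location": "us", "protocol": "https"},
    {"server": "home", "location": "eu", "protocol": "tcp"},
]
print(common_factors(failures))  # {'server': 'home'}
```

If `common_factors` comes back empty and the tallies look evenly spread, that points back at the monitoring side rather than at any one service.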
Edit: Alright, I didn't notice this all concerns the same homeserver... Maybe set up some local monitoring? See whether it looks different from the perspective of the computer itself, or only when viewed from the internet. You can also monitor some performance parameters: Is there enough free RAM, is the CPU busy, are you close to maximum upload bandwidth, is there too much I/O... But I suppose the main question is: Is it a network issue? And if yes, where, and what kind?
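For the local side, something like this could log a resource snapshot alongside your other checks. It's a Linux-only sketch that reads `/proc/meminfo` and the load average. The function names and the choice of fields are my own assumptions, and it doesn't cover bandwidth or I/O.

```python
import os
import time


def parse_meminfo(text):
    """Parse /proc/meminfo content into {field: integer value} (mostly kB)."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    return values


def snapshot(meminfo_path="/proc/meminfo"):
    """Return available memory in percent plus the 1-minute load average."""
    with open(meminfo_path) as f:
        mem = parse_meminfo(f.read())
    load1, _load5, _load15 = os.getloadavg()
    return {
        "time": time.strftime("%Y-%m-%d %H:%M:%S"),
        "mem_available_pct": round(100.0 * mem["MemAvailable"] / mem["MemTotal"], 1),
        "load1": load1,
        "cpus": os.cpu_count(),
    }
```

A 1-minute load well above the CPU count, or available memory near zero at the time of an outage, would point towards resources instead of the network.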
If you're using Cloudflare or some other tunneling solution, that could also be the issue.
It's hilariously annoying, but to address your points:
The 503s are coming from Cloudflare, indicating it can't connect to the back end, which makes me think network issue again. Non-CF sites just show timeout errors.
I don't think it's resource related; it's a 10850K with 64GB of RAM, and it's currently using uh, 3% CPU and about 15GB of RAM, so there are more than enough idle resources to handle even a substantial spike in traffic (which I don't see any indication of in the logs anyway).
It's gotta be some incredibly transient network issue, but it's so transient I'm not sure how to actually determine what happens when it breaks, since it has "fixed itself" by the time I can get near enough to something to take a look.
Maybe set up a script that runs locally and pings an external service like 1.1.1.1 or 8.8.8.8 every second, to see whether the pings keep succeeding during a window in which your services alert? Perhaps it's your modem refreshing some config, which causes a blip for a few seconds, or something similar. If the pings never fail, at least you can rule out that your internet goes out completely.
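A minimal sketch of such a script in Python. It shells out to the system `ping` and assumes the Linux iputils flags (`-c` for count, `-W` for the timeout in seconds). The function names are mine, and it only logs when a host changes state so the log stays readable.

```python
import subprocess
import time
from datetime import datetime


def ping_once(host, timeout_s=1):
    """Send one ICMP echo via the system ping; True if a reply came back."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def watch(hosts, probe=ping_once, interval=1.0, rounds=None,
          sleep=time.sleep, emit=print):
    """Probe every host each round and record only up/down transitions.

    rounds=None loops forever; a number stops after that many rounds.
    """
    last = {}
    events = []
    done = 0
    while rounds is None or done < rounds:
        for host in hosts:
            up = probe(host)
            if last.get(host) != up:  # first result or a state change
                stamp = datetime.now().isoformat(timespec="seconds")
                events.append((stamp, host, up))
                emit(f"{stamp} {host} {'UP' if up else 'DOWN'}")
                last[host] = up
        done += 1
        sleep(interval)
    return events
```

Running `watch(["1.1.1.1", "8.8.8.8"])` in tmux or as a service, with the output redirected to a file, leaves you with timestamps to line up against the alerts. Note the loop is only roughly once per second, since a timed-out ping adds up to a second of its own.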
The other side of this would also be useful: run a similar check against different levels of your home network to see how far in it gets (e.g. ping your router, expose a simple TCP echo service on the server running all this and nc it, curl the status page of the reverse proxy (or set up a static page in it), curl the app behind the reverse proxy). Just make sure to use firewall rules for this rather than putting everything on the internet. Where it fails should hopefully give you something to go on.
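The layered check could look roughly like this in Python, as a stand-in for doing the ping/nc/curl steps by hand. Every address, port and URL in `LAYERS` is a placeholder assumption, as are the function names, and `ping` is again assumed to be the Linux iputils one.

```python
import http.client
import socket
import subprocess
import urllib.request


def ping_ok(host):
    """One ICMP echo via the system ping (iputils flags)."""
    result = subprocess.run(["ping", "-c", "1", "-W", "1", host],
                            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0


def tcp_ok(host, port, timeout=2.0):
    """True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_ok(url, timeout=3.0):
    """True if the URL answers with a 2xx/3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (OSError, http.client.HTTPException):
        return False


def run_layers(layers):
    """Run every check in order and return [(name, ok), ...]."""
    return [(name, check()) for name, check in layers]


def first_failure(results):
    """Name of the first layer that failed, or None if all passed."""
    for name, ok in results:
        if not ok:
            return name
    return None


# Placeholder addresses, ports and URLs; replace with your own.
LAYERS = [
    ("router", lambda: ping_ok("192.168.1.1")),
    ("server tcp echo", lambda: tcp_ok("192.168.1.10", 7)),
    ("reverse proxy", lambda: http_ok("http://192.168.1.10/status")),
    ("app", lambda: http_ok("http://192.168.1.10:8080/")),
]
```

Logging `first_failure(run_layers(LAYERS))` with a timestamp every few seconds, from a machine outside the server, shows which hop drops out first when things break.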
Maybe set up https://www.thinkbroadband.com/broadband/monitoring/quality/ to see if it registers any packet loss or increased latency at those times (although I'd still do the above as well).