What’s everyone using for status monitoring and/or status pages either in their lab or at work?
I set up a status page for my fediverse instances using Uptime Robot (I have an existing subscription), but the features are kinda lacking. I feel like they haven’t really updated anything in the last 5 years, which is unfortunate.
Uptime Kuma seems really cool. I set it up last week and it seems to work quite well.
I prefer Gatus because you can pass it a configuration file, which lets me manage my setup declaratively.
We’re looking at that at work and it seems pretty good. I’d probably want to host it outside my lab cluster though, otherwise it’s kind of pointless, eh?
That’s true. I thought of hosting it on a VPS, but then the VPN is another moving part that can fail. I ended up putting it on a mini PC on the same stack as the firewall and modem so that it is relatively stable.
This left me with the problem that I don’t want to expose the Docker socket from each host, so I have to use the network-based checks rather than the built-in Docker monitoring. If you host it in the cluster itself, that shouldn’t be a problem.
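For anyone wondering what a network-based check boils down to, here is a rough sketch in plain Python, not how any particular monitoring tool implements it. The function names and the example targets are made up for illustration. It just tries a TCP connect or an HTTP GET from the monitoring box, so nothing on the Docker hosts needs to expose a socket.

```python
import socket
import urllib.error
import urllib.request


def tcp_check(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, DNS failure, or timed out.
        return False


def http_check(url: str, timeout: float = 3.0) -> bool:
    """Return True if a GET on url ends in a 2xx response within timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        # HTTP errors, connection problems, or a malformed URL.
        return False


def check_all(targets: dict[str, str]) -> dict[str, bool]:
    """Check each named target; 'http(s)://...' uses HTTP, 'host:port' uses TCP."""
    results = {}
    for name, target in targets.items():
        if target.startswith(("http://", "https://")):
            results[name] = http_check(target)
        else:
            host, _, port = target.rpartition(":")
            results[name] = tcp_check(host, int(port))
    return results
```

You would call it with something like check_all({"web": "http://10.0.0.5:8080/", "db": "10.0.0.6:5432"}) (hypothetical addresses) and get back a dict of name to True/False. The trade-off versus the Docker socket is that you only learn whether the port answers, not the container state.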
Yeah, hosting it yourself certainly has various potential issues unfortunately :/
Maybe it’s been implemented by now, but when I set it up I was disappointed to realize there’s no API. Previously I was using Statping and had a Slack bot that employees could use to check the status of everything. I saw that there was a project you could install alongside Uptime Kuma to add API endpoints, but I didn’t take the time to set it up.