
Running Too Much Kubernetes

In my previous post, I mentioned running vouch-proxy to protect internal apps. What I didn't mention is that I over-complicated things by running it under Kubernetes with nginx as an Ingress controller, rather than just running nginx itself.

Turns out, I've been on a Kubernetes kick for a bit. As of this week, I'm officially running 3 k3s "clusters" of one node each. That means none of them is highly available, but neither is my home internet connection, so 🤷 it doesn't really matter.

Three clusters? What do they do?

Yep, two of them run in my apartment.

"Cluster" one: a NUC

This one is an old NUC (circa 2013) that I recently swapped in for an old Acer AspireRevo (a 2009 box) whose fan had kicked the bucket. It's intended to run 24/7, so I put a few "essential" services on it:

  • k3os: the whole OS is just for k3s, installed via the "takeover" method since I couldn't get the ISO to install normally.
  • certmanager: a Let's Encrypt integration that automatically issues TLS certs for my domains
  • dyndns: a CronJob that updates my domain's DNS entries daily (a sketch of its shape follows this list)
  • ingress: an nginx Ingress for hosting my websites
  • nginx: more nginx containers to host my websites (basically to serve up static files)
  • "faas" functions: a few Ingress/Deployment combinations that I use as functions in some workflow automation work I'm doing. I didn't use openfaas since it seemed like overkill since I just wanted basic auth, a URL (ingress), and a single instance. I don't need scale to zero or other fancy features, at least not yet.
  • And lastly: some ingresses to proxy subdomains to my second cluster
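
To give a concrete sense of the dyndns entry, here's a minimal sketch of a daily CronJob; the image, schedule, and Secret names are placeholders I've invented for illustration, not my actual manifests:

    apiVersion: batch/v1
    kind: CronJob
    metadata:
      name: dyndns
    spec:
      schedule: "0 6 * * *"              # once a day, early morning
      jobTemplate:
        spec:
          template:
            spec:
              restartPolicy: OnFailure
              containers:
              - name: dyndns
                image: registry.example.internal/dyndns:latest  # hypothetical image with the update script
                envFrom:
                - secretRef:
                    name: dyndns-credentials                    # hypothetical Secret holding the DNS API token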

"Cluster" two: my desktop

This is my personal desktop, the one I'm writing this on, and I installed k3s inside of the Ubuntu installation a while ago.

I run a few things here where uptime isn't important, since I likely only need them when I'm already on the desktop or am ssh'd in from the road:

  • container registry: a private Docker registry to host the few custom images I need. Both clusters use this, but it's only really needed when deploying a new image, and since I deploy from my desktop, its uptime only needs to match my desktop's.
  • ingress: an nginx Ingress (I tried traefik, but I'm just too familiar with nginx)
  • grafana: a Grafana instance for playing around. Currently, it just has some CSV files that I generated/update that I wanted to have in a dashboard
  • nginx: some nginx containers to proxy to NUT / upsd or serve static files for things like the staging versions of websites. This is basically the same as the production setup but with different hostnames.
  • vouch: some vouch-proxy instances to protect the subset of my websites that are internal apps. There are multiple instances because I have two protection "regimes": a personal one that allows certain Google email addresses, and a work one that allows a GitHub team. (A sketch of how an Ingress gets wired to vouch follows this list.)
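
For the vouch pieces, the wiring is mostly annotations on the protected Ingresses. Assuming the standard kubernetes/ingress-nginx controller, a protected app looks roughly like this; the hostnames and Service name are placeholders, not my real ones:

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: internal-app
      annotations:
        # Every request is validated against vouch-proxy first
        nginx.ingress.kubernetes.io/auth-url: "https://vouch.example.com/validate"
        # Unauthenticated users get redirected into the vouch login flow
        nginx.ingress.kubernetes.io/auth-signin: "https://vouch.example.com/login?url=$scheme://$http_host$request_uri"
    spec:
      ingressClassName: nginx
      rules:
      - host: app.example.com            # placeholder hostname
        http:
          paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: internal-app       # placeholder Service
                port:
                  number: 80

Each protection "regime" is just a separate vouch deployment with its own allow-list, and an app's annotations point at whichever one it belongs to.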

"Cluster" three: my self-owned website

This is a Linode instance I run that was pretty consistently going down or pegging 100% CPU due to some Docker Engine bug I could never figure out. So after a couple of years of neglect, I put a k3os installation on it so I can deploy to it and track it via my at-home clusters.

I haven't finished setting this up, but here's what I plan on putting on it:

  • ingress: an nginx Ingress
  • nginx/python/php: some containers to serve mostly-static files (and a WooCommerce installation) for the websites it hosts (a sketch of the static case follows this list)
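
The static-file case is deliberately boring: a stock nginx Deployment plus a Service for the Ingress to point at. A minimal sketch, with invented names and a hypothetical custom image that has the site's files baked in:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: example-site                 # placeholder name
    spec:
      replicas: 1
      selector:
        matchLabels: {app: example-site}
      template:
        metadata:
          labels: {app: example-site}
        spec:
          containers:
          - name: nginx
            image: registry.example.internal/example-site:latest  # hypothetical image containing the static files
            ports:
            - containerPort: 80
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: example-site
    spec:
      selector: {app: example-site}
      ports:
      - port: 80
        targetPort: 80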

"Cluster" four: Reviewable-owned node

This one doesn't exist yet, and its workloads will probably run on one of the above "clusters" for a while, since I don't need a dedicated machine for them. So far, I've described clusters with different uptime expectations, different uptime costs and difficulties, and different levels of interest from me in keeping them up. This cluster can likely run on any of the above for now, but once any tool here gets internal usage I'll likely set up a separate "cluster", though probably not a proper multi-node, high-availability cluster unless things drastically change.

Anyway, here's what I would want to run on this cluster/namespace:

  • retool or another internal tooling product: at the moment, mainly for helping me with customer support operations. Every time I interact with Stripe I wish I could automatically redo the last interaction but with a new customer.
  • grafana: we're looking to replace our current user analytics tool, and all current analytics tools are these walled gardens that don't let me do what I want with my data and don't let me tweak the dashboards/queries/etc. I would also throw some other internal data into here.
  • faas: some workflow automation functions. As an example, if you want to parse a long string into a JSON structure, most tools make this pretty difficult, but a bit of code (and some ChatGPT to write it) makes it 10x easier. The catch is that you then have to host the function somewhere, and I'd like to keep that overhead minimal.
  • nginx/nodejs: some copies of the product running: one at HEAD as a staging environment of sorts, and others against branches of client-only PRs to make code review, QA, and other things easier.

One for all, and all clusters for one

Some things I left out of the above that I either have, or want to have, in all of the clusters:

  • k8s dashboard: This is an incredibly useful tool. I have some scripts (and a StreamDeck) to make opening and logging in as easy as the press of a button, and getting a quick overview of what's going on (and how many pods are red!) in a cluster is worth the work. If anyone knows of a CLI that would give me that level of information, please leave a comment.
  • prometheus/alertmanager: I don't have this set up yet, but I want something that can track data about each cluster so that I can point a single grafana dashboard at all of the databases, then set up some basic alerts on things like memory usage, disk usage, failing pods, etc.
  • argocd, or another GitOps tool: I have all of these clusters configured from folders of kustomization.yaml files (a minimal example follows this list), and I've been diligent about not altering anything through the dashboard or kubectl. I set up argocd in the past for a previous employer, and while I found it to be a strange product, I definitely liked the concept: make a change, commit it, and it gets deployed automatically. prod == git. That way, if I want to move a service, recreate a cluster, create a new cluster, etc., it's just some git operations and I'm done; I don't have to manually go in and know to run other commands or, worse, do GUI operations that aren't written down anywhere. Because if anything requires me to remember a manual step 6-12 months from now, I'd be better off just nuking all the clusters and starting over.
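
To make that concrete, here's a minimal sketch of what one of those per-cluster folders could look like; the path and directory names are invented for illustration, not my actual repo:

    # clusters/nuc/kustomization.yaml  (hypothetical path and directory names)
    apiVersion: kustomize.config.k8s.io/v1beta1
    kind: Kustomization
    resources:
    - cert-manager/
    - dyndns/
    - ingress-nginx/
    - websites/

With a layout like that, deploying a cluster by hand is a single kubectl apply -k clusters/nuc, and pointing argocd at the same folder gets the prod == git behavior automatically.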

Why, man? Why?!

I do admit, running 3 "clusters" of a single node each is strange. Here's what I could have done instead:

  • I could have run a single cluster, with my Linode instance as the control plane node, and deployed everything with node affinities (roughly the shape sketched after this list). However, that would mean that whenever any part of my home network went down, the control plane would see those pods as being in an Error state, and they would likely restart when the network came back up. It would also mean that, if I ever set up alerting, I would get alerted every time my desktop shut off, or I would have to write very complicated alerting rules.
  • I could just run all of these things "normally", as system services. I used to run them as /etc/init.d/ scripts; in fact, that's what ran on the Acer AspireRevo. Since then, though, the desktop sysadmin world has left me in the dust: I'm not sure whether I should put something in /etc/init/ or create a systemd unit file, or, if I switched from Ubuntu to another distro, how I would even tell which it uses. I've tried using systemd in the past with varying levels of success, to the point where one machine's user unit file wouldn't run but the same file would run as a system unit file with a su in it. I've tried multiple times over the years and have struggled to find documentation good enough to craft the perfect .service file, and I've especially struggled with debugging why one isn't working. Maybe there's a book I could sit down and read that would explain it all, but so far no amount of googling has answered the question "Why doesn't my .service file run on bootup?"
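
For reference, pinning a workload to one physical box in that hypothetical single cluster would have meant a node affinity block like the following inside each Deployment's pod template; the hostname is a placeholder:

    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/hostname
                operator: In
                values: ["nuc"]          # placeholder node name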

One thing I wanted, and am trying to do more often in general, is to have a reproducible system. If some hardware bites the dust or a disk gets wiped, I don't want to be stuck doing a lot of manual work to set things back up the way they were. I would prefer the operations needed to replace hardware to be:

  • Get a new system/disk/etc.
  • Flash a new OS image from a Ventoy USB drive (the drive is reproducible from online ISOs)
  • Run bash {system_name}/deploy.sh from a git repository (reproducible via git clone)
  • Watch as everything reconfigures itself

Since each step uses only reproducible inputs, and there are so few steps, I can trust that I'll remember them and not mess them up. Honestly, if I didn't have such a bad history with Packer, I would probably have said it should be a single command that writes straight to the USB drive: bash {system_name}/flash.sh.

If I were going to run more than a couple of machines at home, such as a server rack, I would push this further and deploy a netboot box (a NUC or an openwrt router) serving DHCP and boot images over the network, automatically flashing and deploying the software assigned to each MAC address. Then the only step in dealing with a hardware outage would be replacing the hardware.
