Kubernetes Cluster on Raspberry Pi 3s

As a company, one of InfoSiftr’s many areas of prowess is our skill with multi-architecture support and the images used in such environments. To showcase that skill set (as well as our partnership with Arm) at KubeCon in Seattle, we used a Raspberry Pi cluster built on the then freshly released Kubernetes 1.13.0. Here we’ll go through the steps we used to build the cluster, so you can recreate it on your own.

We started from Kasper Nissen’s excellent work (https://kubecloud.io/setting-up-a-kubernetes-1-11-raspberry-pi-cluster-using-kubeadm-952bbda329c8) to make sure we had all of our bases covered (setting up SSH, setting hostnames, installing packages, etc.), and deviated from there as needed. Note that even though the scripts make use of get.docker.com (and we used them as written), this is against best practices; the instructions at https://docs.docker.com/install/linux/docker-ce/debian/ would normally be considered best practice instead.

For the physical layout, we had a set of five Raspberry Pi 3 boards, each with a 16GB U1 microSDHC card, in an enclosure with a switch and power. The enclosure also had one set of USB and HDMI connectors, which we needed for the initial installations. The uplink from the switch connected to our laptop via a USB hub with Ethernet, to keep the ensemble as portable as possible.

We chose Raspbian Lite (https://downloads.raspberrypi.org/raspbian_lite/images/) as the OS for the Pis. It is important to know, however, that Raspbian Lite may install on a Raspberry Pi 3 with an armv7l kernel instead of armv8, which limits the system to 32-bit images and can affect some modern Docker installations (including the best-practice install method mentioned above). If you end up with the v7 installation, as we did, you’ll simply need to use arm32v7 images in this environment. If this were anything other than a demo environment, we would have selected a different OS build to ensure 64-bit compatibility.
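
It’s worth confirming which kernel architecture you actually got before picking image variants; this quick check isn’t from the original setup scripts, just a generic sanity check:

# Report the kernel architecture; "armv7l" means 32-bit, so use arm32v7 image variants
uname -m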

Next, we needed to get an OS onto the boards. Burning the image to the SD card was simply a matter of correctly using dd, since the laptop conveniently had an SD card slot, which showed up as /dev/sdb in our case:

umount /dev/sdb1; time sudo dd bs=1M if=./2018-11-13-raspbian-stretch-lite.img of=/dev/sdb conv=fsync

Once the images were burned to the cards, we enabled SSH by creating the empty ssh file on the boot partition, copied over the setup scripts, and got ready to boot.
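
For reference, enabling headless SSH is just a matter of dropping an empty file named ssh onto the boot partition before first boot; a minimal sketch, assuming the card’s boot partition is mounted at /media/$USER/boot (adjust the path for your system):

# Create the empty "ssh" file so sshd is enabled on first boot
touch /media/$USER/boot/ssh
# Flush writes before pulling the card
sync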

For the first part of the networking, we wanted all static IPs, to ensure we had no issues with the generated TLS certificates. So, the five nodes (master and worker1 – worker4) were 172.23.0.50 – .54, the laptop itself was .10, and the Untangle VM we had set up on the laptop as router/gateway was .1. We opted for the VM over a physical router to minimize what we needed to bring to the show floor. We didn’t bother with DNS for the nodes, but that would simply have been a matter of adding entries to /etc/hosts on each node.
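
We won’t reproduce the network configuration script here, but on Raspbian Stretch (which uses dhcpcd) a static address boils down to a stanza like the following in /etc/dhcpcd.conf; the interface name and the address shown are illustrative, and each node gets its own ip_address:

# /etc/dhcpcd.conf -- static addressing for one node (the master at .50, for example)
interface eth0
static ip_address=172.23.0.50/24
static routers=172.23.0.1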

With the IP addressing chosen, we booted all five Pis, ran the network configuration script, and verified that we could access them all over SSH from the laptop. For the sake of ease, we also set up SSH keys for the pi and root users, which would make some later debugging much less time-consuming.
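
The key setup is standard OpenSSH fare; a minimal sketch from the laptop, assuming the default pi user and the addresses above:

# Generate a key on the laptop (once), then copy it to each node for the pi user
ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_rsa -N ""
for ip in 172.23.0.50 172.23.0.51 172.23.0.52 172.23.0.53 172.23.0.54; do
  ssh-copy-id pi@$ip
done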

Now that everything was laid out, it was time for the actual cluster installation. We didn’t add the additional kubeadm configuration from Kasper’s doc (due to version mismatches), and went with a basic 'sudo kubeadm init' instead, followed by 'sudo kubeadm join …' on all the worker nodes.
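
For completeness, the basic flow looks like the following; the join token and CA hash are placeholders for whatever kubeadm init prints for your cluster:

# On the master
sudo kubeadm init
# Make kubectl usable for the pi user (as suggested by kubeadm's own output)
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# On each worker, using the values printed by "kubeadm init"
sudo kubeadm join 172.23.0.50:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>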

Once this was done, we had a running cluster; however, it wasn’t really usable yet due to missing network functionality. In Kubernetes 1.13, the networking plugin defaults to CNI, so without a CNI plugin even basic access will fail. We ended up using Weave, as described in Kasper’s work (we were going to use Calico, but finding the Arm installation was taking more than the couple of minutes of work we wanted to put in, so we moved on). The shorthand from the document did not work as expected, so instead we manually filled in the version number: kubectl apply -f 'https://cloud.weave.works/k8s/net?k8s-version=1.13.0'. At that point we had a functional cluster we could deploy apps to. We tested with a simple NGINX deployment behind a NodePort, then also rebuilt dockercoins for armv7 to make sure everything was working smoothly (spoiler: it all worked great).
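
The NGINX smoke test can be as simple as the following; given the armv7 install described above, the arm32v7/nginx image variant is the one to use, and the NodePort number is whatever Kubernetes assigns:

# Deploy NGINX (32-bit Arm variant) and expose it through a NodePort
kubectl create deployment nginx --image=arm32v7/nginx
kubectl expose deployment nginx --port=80 --type=NodePort
# Find the assigned port, then hit any node on it
kubectl get svc nginx
curl http://172.23.0.51:<nodeport>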

Including the time spent downloading the OS image, burning it to the SD cards, and setting up the router VM, the entire cluster build took a little more than an hour. Once we got to 'kubeadm init' we were mere minutes away from a working cluster.

Now that we had a working cluster, it was on to the heart of our demo: a cross-architecture demo app showcasing the Arm build service, published at https://hub.docker.com/r/aibdemo/kubecon. The app was occasionally rebuilt on the fly, pulling source directly from GitHub; every time it was rebuilt, our build system pushed the result to Docker Hub, and we pulled the newest version any time the deployment scaled. The next step would have been to have the build also trigger the scaling and/or image pull, but as this was a demo system, we didn’t get around to writing the webhook for it. /shrug Any change to the code could kick off a build job, and the resulting image would then be displayed live on the show floor.

The visible action of our “Blinky Lights” demo was saturating network traffic on the switch, but the heart of it was the multi-arch support, so we also needed to deploy a version on the AMD64 laptop we had. The build process took care of both architectures together, combined them with a manifest list, and we had our images ready.
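
We won’t walk through the whole build pipeline here, but as a rough sketch of the manifest-list step, using the docker manifest CLI (experimental at the time, so it has to be enabled) and an illustrative image name rather than our real one:

# Build and push an architecture-specific tag from each builder
docker build -t example/blinky:amd64 . && docker push example/blinky:amd64      # on the amd64 laptop
docker build -t example/blinky:arm32v7 . && docker push example/blinky:arm32v7  # on an Arm builder
# Stitch the per-arch tags into a single multi-arch tag (requires experimental CLI features)
docker manifest create example/blinky:latest example/blinky:amd64 example/blinky:arm32v7
docker manifest push example/blinky:latest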

Once the images were built, we managed the image distribution with a DaemonSet that had a nodeSelector (app: blinky-node vs. blinky-nope) and 'imagePullPolicy: Always' in place, scaling it in and out by dynamically relabeling individual nodes. This let us demonstrate the newer image build in fewer steps than building a webhook would have taken. The UI that managed all of this was running on the laptop, and was part of the same image that was generating traffic on the Pis.
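
A minimal sketch of that pattern follows; the label values mirror the ones mentioned above, but the manifest itself (and the example/blinky image name) is illustrative rather than our exact deployment:

# DaemonSet that only lands on nodes labeled app=blinky-node, always re-pulling the image
cat <<'EOF' | kubectl apply -f -
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: blinky
spec:
  selector:
    matchLabels:
      name: blinky
  template:
    metadata:
      labels:
        name: blinky
    spec:
      nodeSelector:
        app: blinky-node
      containers:
      - name: blinky
        image: example/blinky:latest
        imagePullPolicy: Always
EOF
# Flipping a node's label in or out of the set reschedules its pod, which re-pulls the newest image
kubectl label node worker1 app=blinky-node --overwrite
kubectl label node worker2 app=blinky-nope --overwrite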

Full disclosure: some issues that popped up once we made it to the show floor (and therefore needed to be resolved early Tuesday morning) included problems passing DNS traffic, as well as the show floor network blocking outbound ICMP (which broke our manual testing). It’s hard to pull from Docker Hub if you can’t resolve the docker.io domain, so we needed to fix that ASAP, but the failure of everyone’s basic network testing tool, ping, sent us on a networking wild goose chase.

When all was said and done, the ICMP issues were a red herring that ate our time; to resolve the real issues, we only needed to reset our DNS to use the router instead of trying to pass through (based on the configuration the nodes had picked up when the initial scripts were run). This resolved all of our issues with image pulls, NTP, and a host of other inconsistencies. Remember, folks: unexplained system issues are more often than not a misconfiguration of your system’s foundations, so if you run into the same problems, you’ll know where to look.
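
In practice the fix amounted to pointing each node’s resolver at the router; a sketch, again assuming Raspbian’s dhcpcd configuration and the addressing above:

# Point DNS at the Untangle router instead of passing through to an upstream resolver
echo 'static domain_name_servers=172.23.0.1' | sudo tee -a /etc/dhcpcd.conf
sudo systemctl restart dhcpcd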

So, TL;DR: we dynamically built an image to run on both Arm and AMD64, deployed it to both a laptop and a freshly built Kubernetes cluster of Raspberry Pis, and made a whole lot of lights blink, in a very short amount of time… and you can too.

If you would like to discuss how InfoSiftr can help you with multi-architecture support, image builds, general application design, or anything else along your containerization journey, please contact us at [email protected].

 
