Linux Format

Part 1: Kubernetes

Jolyon Brown shows you how to take control of your containers before they control you with the open source version of Google’s management software.


This issue I’m going to start looking at Kubernetes, beginning with some of the concepts around it, and next month I’ll look at building some infrastructure with it. Since being open sourced by Google, Kubernetes has become a very popular project. It automates the operations around containers, such as deployment and scaling, and it runs across clusters of hosts. With the current trend for containers, agile development and micro-services architecture, Kubernetes is definitely worth knowing about and taking a look at.

From where I’m sitting at least, the whole world has gone container mad in the last twelve to eighteen months. I’ve only covered them on a couple of occasions here in the hallowed pages of Linux Format (I worried that ‘container fatigue’ might become a problem). Out in the wider world though, almost everyone I’m working with has either a fairly advanced container evaluation project on the go or has plans for one. In most of these cases this means working with Docker, and there’s a healthy ecosystem around that project as well as in the wider container space. This is driven to a certain extent by the ever-evolving idea of ‘The Cloud’ and the expected role that containers will have to play in it. “There’s money in them thar clouds”, one might say. As many readers of this column will undoubtedly be working with containers by now, I thought it was time to start looking at them on a more regular basis.

With any infrastructure technology, one of the main worries for administrators is how they are going to manage it. No-one wants to be firefighting their systems day in and day out and having to manually fix issues. From physical systems to virtual machines to containers – the number of discrete elements to control (and maintain) is going up. Naturally, solutions for taking control of this situation have begun to spring up. Plus containers are cool! Everyone wants to impress their boss/customer/significant other by being able to provide an autoscaled, resilient and redundant service using them.

Enter Google. In infrastructure terms, the huge global ‘warehouse scale’ computing facilities built by the internet giant have achieved somewhat mythical status. This was in large part due to the secretive nature of the company when it came to discussing its internal architecture (with academic papers often being the only glimpse outsiders got). As its systems evolved, Google began to hit issues with running and managing virtual machines at massive scale. The company’s solution was to reconsider the problem and ensure that failures of individual components wouldn’t cause failures in other areas, that workloads could be distributed across different compute resources and that failures could be handled automatically. Google’s internal orchestration tool was called Borg (and there is another called Omega). Kubernetes is a descendant of Borg, packaged up and open sourced for external consumption.

Pods and nodes

Kubernetes currently supports Docker and Rocket (from CoreOS) containers, with more promised in the near future. It runs these containers in collections known as ‘pods’, which are the basic building block for Kubernetes. A pod can consist of one or more containers. When multiple containers exist inside a pod, they will all be located on the same physical host. Kubernetes can deploy multiple copies of pods (based on the same configuration) and handle bringing up replacements for pods that go offline. The pod is the level at which Kubernetes performs its scheduling and orchestration tasks. This is different to, say, a vanilla Docker installation, where the container itself is the common building block. An IP address is assigned to the pod rather than to each individual container, with ports being used by applications in the pod for communication. It’s typical for a pod to consist of a single complete instance of a micro-service. Kubernetes provides service discovery so that IP addresses and DNS can be used to point at a collection of such services. A pod can also contain a volume which is visible to all containers in the pod, allowing the sharing of temporary (ephemeral) data. For more persistent data, NFS mounts can be used.
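To make the pod idea concrete, here’s a minimal sketch of a two-container pod definition sharing an ephemeral volume, using the v1 API. The names, images and paths are illustrative examples of mine, not anything from a real deployment:

```yaml
# A hypothetical pod: an nginx webserver plus a sidecar container
# that writes content into a shared, ephemeral (emptyDir) volume.
apiVersion: v1
kind: Pod
metadata:
  name: web-pod
spec:
  containers:
  - name: web
    image: nginx
    ports:
    - containerPort: 80
    volumeMounts:
    - name: shared-data
      mountPath: /usr/share/nginx/html
  - name: content-writer
    image: busybox
    command: ["sh", "-c", "while true; do date > /data/index.html; sleep 60; done"]
    volumeMounts:
    - name: shared-data
      mountPath: /data
  volumes:
  - name: shared-data
    emptyDir: {}    # lives and dies with the pod - ephemeral data only
```

Both containers land on the same host, share the pod’s IP address and see the same volume; something like `kubectl create -f web-pod.yaml` would submit it to the cluster.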

A node is a physical machine on which one or more pods reside. Certain nodes act as masters, running control software. This includes etcd – which might be familiar from the CoreOS project covered in an earlier Administeria – a key-value store for shared configuration and service discovery. An API server handles the calls which all the components of Kubernetes make and receive during cluster operations. A scheduler and controller deal with the pods on regular nodes: they make sure the correct number of pods are running, that new pods are brought up on nodes that have the capacity for them and so on. Regular nodes run something called the kubelet (which maintains the pods themselves) and kube-proxy, which is (surprise!) a simple network proxy and load balancer that passes traffic to services running in the pods. This all hangs together as shown in the diagram (see below).

Controlling Kubernetes: kubectl

The easiest way to interact with Kubernetes is via the kubectl command-line interface. With it, simple standalone examples of pods can be created with a single-line command (this example is from the Kubernetes documentation, which can be found at http://kubernetes.io):

$ kubectl run my-nginx --image=nginx --replicas=2 --port=80
CONTROLLER   CONTAINER(S)   IMAGE(S)   SELECTOR       REPLICAS
my-nginx     my-nginx       nginx      run=my-nginx   2

For anything other than trivial setups though, it’s more usual to create YAML or JSON format files which kubectl will read and act upon. These, of course, have the added advantage of being able to be versioned and stored in a code repository (which is always a good thing). These definition files can themselves be quite simple, but can also include definitions for health checks of a service (which help Kubernetes decide whether a container is in a working state or not) and user-defined key/value fields known as labels and annotations. Labels are handy for assigning identifying data to containers for the cluster administrator. These would typically be single word notes, eg environment: prod or environment: dev. Annotations are intended for longer notes – perhaps Git commit related information, the name of the creator of a particular service or URLs for relevant documentation. Labels in particular will be used for sorting and searching of services, and the advice is to get off on the right foot by using them right from the beginning.
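As a sketch of how labels and annotations sit in such a definition file, the metadata section of a hypothetical pod might look like this (all names and values here are invented for illustration):

```yaml
# Labels: short, searchable key/value pairs; annotations: longer free-form notes.
apiVersion: v1
kind: Pod
metadata:
  name: frontend
  labels:
    environment: prod
    app: frontend
  annotations:
    git-commit: "not-a-real-commit-id"
    docs: "see the team wiki page for this service"
spec:
  containers:
  - name: frontend
    image: nginx
```

The labels then pay off immediately at the command line – for example, `kubectl get pods -l environment=prod` lists only production pods.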

Replication controllers are created in Kubernetes by the same method as regular pods, with the addition of extra keywords, such as replicas: 2. Kubernetes would spin up two instances of the requested pod, and maintain that number should they crash, be killed etc.
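A replication controller equivalent to the earlier kubectl run example might be sketched like this (the run=my-nginx label matches what that command generates; the rest follows the v1 API):

```yaml
# Keeps two copies of the nginx pod running at all times;
# the selector ties the controller to pods carrying the run=my-nginx label.
apiVersion: v1
kind: ReplicationController
metadata:
  name: my-nginx
spec:
  replicas: 2
  selector:
    run: my-nginx
  template:
    metadata:
      labels:
        run: my-nginx
    spec:
      containers:
      - name: my-nginx
        image: nginx
        ports:
        - containerPort: 80
```

Kill one of the pods and the controller notices the count has dropped below two and schedules a replacement, which is exactly the self-healing behaviour described above.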

In contrast to a regular Docker setup, Kubernetes handles networking a bit differently. An admin doesn’t have to worry about port allocation to get around host-private networking, which limits communication between containers to others on the same node. All pods can speak to one another, even across nodes. However, as pods are restarted, IP address allocations change. Services should be defined (again, using config files) which handle this situation automatically for the administrator. We could easily set up a service which targets pods labelled as Apache webservers and forwards traffic to them on port 80 or 443. As the pods go through their lifecycles, the service endpoint will be updated with the new IP addresses. To have external (eg internet-based) clients access services on a Kubernetes cluster, they must have public IP addresses and be connected to what Kubernetes calls NodePorts or LoadBalancers. These two similar methods manage external-facing IP addresses, which might be provided by the underlying cloud provider, for instance. Another popular method is to run HAProxy (which remains one of my favourite pieces of open source software).
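The Apache webserver scenario above could be sketched as a service definition like the following – the app: apache label and the port numbers are assumptions for the sake of the example:

```yaml
# Routes traffic to whichever pods currently carry the app=apache label,
# regardless of their (changing) IP addresses. The NodePort type also
# exposes the service on port 30080 of every node for external clients.
apiVersion: v1
kind: Service
metadata:
  name: apache-web
spec:
  type: NodePort
  selector:
    app: apache
  ports:
  - port: 80          # port the service listens on inside the cluster
    targetPort: 80    # port the Apache containers serve on
    nodePort: 30080   # externally reachable port on each node
```

As Apache pods come and go, Kubernetes keeps the service’s endpoint list up to date, so clients always hit a live pod.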

Phew! Hopefully this wasn’t too much information to take in all at once. There’s a lot to Kubernetes, which is a really powerful system with a lot of production, real-world experience baked into it. Next issue, we’ll get Kubernetes installed and get down to some working examples, including some monitoring and examples of replication.

The website includes some handy marketing speak you can use to impress your boss when pitching a Kubernetes project.
A simplified view of the Kubernetes architecture showing how the ‘pod’ is the unit of currency.
