We have been using Amazon AWS ECS to deploy containerized applications for most of our clients. Now, we are exploring Kubernetes and its integration with Google Cloud Services, to keep up with the new possibilities out there and to choose the best tools for each project.
Lately, I have been investigating Kubernetes because at MarsBased we want to learn more about container orchestration and deployment automation, to see if it would be a good fit for any of our clients.
Here is a summary of what I have learnt, peppered with some personal opinions. Hope it's useful!
In short, Kubernetes is like an open-source, self-contained PaaS (Platform as a Service). It allows you to deploy containerized applications and handles all the heavy lifting of placing the containers, restarting them, scaling them, updating them, and other frequent actions.
From the outside to the inside, these are the parts Kubernetes is composed of:
Apart from the physical components that Kubernetes manages, there is a set of virtual components that allow you to run an application. These resources are defined in YAML configuration files.
The most important ones are:
Kubernetes works declaratively. That is, you define the state you want (these Deployments with these containers and resources, these Services to expose them, etc.) using YAML files, pass them to Kubernetes and forget about it. Kubernetes takes care of maintaining that state at all times by taking the necessary actions: starting containers, terminating containers, moving containers from one Node to another, etc.
Thus, to deploy an application, for instance, you just update the Deployment with the new image (built from the new code), and Kubernetes takes care of shutting down the old Pods and starting new ones, making sure there are always Pods available to serve requests.
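To make that concrete, here is a minimal sketch of what such a Deployment could look like and how you hand it to Kubernetes. The app name, image and registry are hypothetical, and the exact fields depend on your application:

# deployment.yml -- a minimal Deployment sketch (hypothetical app and image names)
cat <<'EOF' > deployment.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3                  # Kubernetes keeps three Pods of this template running at all times
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: registry.example.com/myapp:v1   # hypothetical image built from your code
          ports:
            - containerPort: 3000
EOF

# Declare the desired state; Kubernetes takes care of reaching and keeping it
kubectl apply -f deployment.yml

Changing the image tag in that file (or via the CLI) and applying it again is, in essence, what a deployment amounts to; Kubernetes performs the rolling update for you.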
A normal Kubernetes workflow is: build a new image from your code, push it to a registry, update the Deployment to point at the new image, and let Kubernetes roll it out.
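As a rough sketch of those steps, assuming the hypothetical registry and manifest from above:

# 1. Build a new image from the new code and push it to a registry (hypothetical names)
docker build -t registry.example.com/myapp:v2 .
docker push registry.example.com/myapp:v2

# 2. Point the Deployment at the new image (edit deployment.yml, or do it from the CLI)
kubectl set image deployment/web web=registry.example.com/myapp:v2

# 3. Watch Kubernetes converge to the new desired state
kubectl rollout status deployment/web
kubectl get pods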
Kubernetes requires a Master Node. So, in every Cluster, you will need to dedicate a Node to be the master, and you will not deploy applications onto this Node. The Master Node provides an API to interact with the Cluster and stores its state in etcd (a highly available distributed key-value store).
There is a CLI (Command Line Interface), kubectl, to operate Kubernetes, which in turn talks to this API in the background. The CLI has commands both to manage the state of the Cluster (Deployments, Services, etc.) and to interact with the Pods (execute a command, view logs, etc.).
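For example, a few of the commands you would typically run (the pod name is a placeholder):

# Manage the state of the Cluster
kubectl get nodes
kubectl get deployments
kubectl get services
kubectl apply -f deployment.yml

# Interact with the Pods
kubectl get pods
kubectl logs [pod_name]
kubectl exec -ti [pod_name] -- [command]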
Furthermore, all Nodes in the Cluster run a process called kubelet to receive commands from the master and send data to it.
Apart from that, you have the option to install add-ons - which are like optional components - such as:
Kubernetes has a clever way to manage Secrets. You can create Secrets from the CLI, either from literals or from a file, and they are securely stored in the Cluster. Then, from a Deployment configuration file, you can reference these Secrets by name.
The most common scenario is to define an environment variable in a Deployment that gets its value from a Secret (an API key, for instance).
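As a hedged sketch (the Secret, key and file names here are made up):

# Create a Secret from a literal value...
kubectl create secret generic app-secrets --from-literal=API_KEY=supersecret

# ...or from a file
kubectl create secret generic app-secret-files --from-file=./config/master.key

# Then, in the Deployment's container spec, an environment variable can reference it by name:
#   env:
#     - name: API_KEY
#       valueFrom:
#         secretKeyRef:
#           name: app-secrets
#           key: API_KEY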
Kubernetes has another useful resource: the Job. As its name implies, the idea of this resource is to execute a command once and wait for it to complete. Jobs are also defined using YAML files.
One situation for which Jobs are really useful is database migrations. However, there is a big caveat with Jobs: they (and the Pods they create) are not deleted after they finish. The Pods are shut down, but the Job and Pod resources are kept so that you can still view their status and logs. This makes Jobs harder to use for migrations, since you need to delete the Job and its Pods manually.
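For instance, a database migration Job for a Rails app could look roughly like this; the image name is hypothetical, and note the manual cleanup at the end, which is the caveat just mentioned:

# job.yml -- a one-off migration Job (hypothetical image, reusing the Secrets above for credentials)
cat <<'EOF' > job.yml
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migrate
spec:
  backoffLimit: 0              # do not retry a failed migration automatically
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: db-migrate
          image: registry.example.com/myapp:v2
          command: ["bundle", "exec", "rails", "db:migrate"]
          envFrom:
            - secretRef:
                name: app-secrets
EOF

kubectl apply -f job.yml
kubectl wait --for=condition=complete job/db-migrate   # block until the migration finishes
kubectl logs job/db-migrate                            # inspect its output
kubectl delete job db-migrate                          # manual cleanup of the Job and its Pods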
Kubernetes is a truly powerful beast. It has excellent and extensive (maybe a bit too much!) documentation and tutorials, as well as great tooling available overall.
Besides the CLI, there is Minikube, a local Kubernetes Cluster that you can install on your machine to play with, using Docker or a virtual machine. There is also a built-in dashboard, built-in monitoring and built-in logging. The community is huge, too, and you can find tonnes of courses (both free and paid), tutorials and blog posts such as this one.
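If you want to play with it locally, and assuming you have minikube and kubectl installed, getting a Cluster up is roughly this:

minikube start        # boots a local single-node Cluster using Docker or a VM
kubectl get nodes     # the local node shows up like any other Cluster
minikube dashboard    # opens the built-in dashboard in your browser
minikube stop         # shuts the local Cluster down when you are done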
In my opinion, it's a very good option for MarsBased as a Rails consultancy, and we will benefit from using it, although we have the challenge of finding the right provider to go with it.
Google Cloud seems like the best option by far - after all, Google are the creators of Kubernetes - but this means that we'll need to become as proficient in it as we already are with Amazon AWS, our cloud provider of choice.
However, Kubernetes is not meant for small deployments. It's overkill for a low-throughput application, as you have the extra overhead of keeping a Master Node, which takes up resources on its own.
Imagine an app that you want to deploy on just one small machine: the master alone would be 50% of the infrastructure. For a medium-to-large deployment, though, the overhead is next to negligible. Also, note that Google Cloud does not charge for the Cluster itself, only for the Nodes in it (including the Master Node).
As I mentioned earlier in the post, this is just the beginning. It's been an exploratory ride into the exciting world of Kubernetes.
Currently, we are deploying our first production application to Google Cloud using Kubernetes. There's more that we will cover in future posts, exploring some of the challenges we have been encountering and their solutions:
kubectl exec -ti [pod_name] [command]
I will follow up soon with more findings to keep you posted!