The 20 Kubernetes Interview Questions & Concepts You Need to Master

Kubernetes is the most popular and desired container orchestration platform among companies and software teams, with ninety-six percent of respondents to the Cloud Native Computing Foundation's 2021 annual survey saying they were using or investigating using Kubernetes.

Kubernetes knowledge is key for roles in cloud engineering and architecture.

Kubernetes is an open source system that automatically manages the lifecycle of containers, along with the underlying infrastructure for the respective runtime environment.

One of the core benefits of Kubernetes is its ability to run highly available (HA) and scalable container workloads.

For organizations that want to run containerized applications at scale, a system like Kubernetes can meet their operational and management needs. As a result, individuals with skills in operating and administering Kubernetes are highly sought after, but it's essential to be able to showcase your knowledge of the ins and outs of the Kubernetes platform.

In this post, you will learn top questions that you can expect to be asked when you're interviewing for a Kubernetes-related role.

Top Kubernetes Interview Questions

Below are some of the main questions that an interviewer might ask to test your knowledge and understanding of Kubernetes. These questions may be theoretical, but answering them comprehensively demonstrates the level of proficiency that software teams and companies look for in someone who will either be managing a Kubernetes cluster or developing applications for Kubernetes.

What is Kubernetes?

Kubernetes is an open source container orchestration tool or platform that is built to automatically manage containerized applications at scale.

It is infrastructure-agnostic and can run in different environments, such as bare-metal machines, virtual machines, cloud environments, and hybrid cloud environments.

Docker is a technology that is used for containerizing your applications. This involves packaging your software along with all of its dependencies and configurations into a single artifact, known as a Docker image. An instance of a Docker image can then be run as an isolated process with its own allocation of resources such as memory, CPU, disk space, and networking. This isolated running process is a container.

In addition to containers, Docker is a technology ecosystem with multiple tools, including a container runtime engine (CRE). Container runtime engines are system components that mount containers and work with the operating system kernel to manage the containerization process.

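Previously, Docker Engine was a commonly used container runtime for Kubernetes, connected through a component called dockershim. In December 2020, the Kubernetes project announced that dockershim was deprecated in favor of runtimes that implement the Container Runtime Interface (CRI), such as containerd. Dockershim was then removed in Kubernetes 1.24, released in May 2022. Images built with Docker continue to run on these runtimes.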

Why Would You Use Kubernetes?

The lightweight and portable nature of containers makes them very versatile for a variety of scenarios.

They can be useful for running a small number of application instances in the cloud or data centers on-premise, running CI builds for deployment pipelines, or once-off batch processes and tasks.

However, companies with sizable workloads could find themselves running up to hundreds or thousands of containers in production.

For all their benefits, there are a number of issues that containers don't inherently solve for at scale. Containers are fundamentally concerned with solving the problem of application portability and delivery so that you have consistency regardless of the runtime environment.

Managing large-scale container applications requires a system that can automatically manage tasks such as the following:

  • Deploying images and starting up containers
  • Scaling the containers and underlying hardware based on demand
  • Managing the resources available to containers
  • Providing network security and communication for the containers
  • Load balancing of network traffic for containers

As with physical shipping containers, you need a crane-like structure to automatically manage these orchestration responsibilities.

You would use a container orchestration platform like Kubernetes to automate the scheduling, deployment, networking, scaling, health monitoring, and management of your containers.

What is a Kubernetes Cluster?

A Kubernetes cluster is a deployment of the container orchestration platform that consists of the three main planes that make up the Kubernetes architecture: the control plane, the data plane, and the worker plane.

These three planes fulfill different roles and can co-exist on the same device, or be distributed across multiple devices. In most enterprise contexts, your Kubernetes cluster will be distributed across several devices to ensure high availability and mitigate the risks of a single point of failure.

  • Control Plane: The control plane is responsible for all main operations of the Kubernetes cluster, and can be referred to as the brain behind the operations.
  • Data Plane: The data plane is the memory of the cluster, and stores all of the cluster configurations and operations that have been executed.
  • Worker Plane: The worker plane is where all of the containerized workloads run.

Provisioning a Kubernetes cluster has been simplified through the use of CNCF-certified Kubernetes distributions and hosted clusters. Kubernetes distributions are software packages that provide a pre-built version of Kubernetes. Hosted clusters are solutions provided by third-party vendors that host, manage, and optimize some or all of the planes of the cluster. Examples of hosted clusters are Amazon Elastic Kubernetes Service (EKS), Google Kubernetes Engine (GKE), and Microsoft Azure Kubernetes Service (AKS).

What are the Main Components of the Control, Data, and Worker Planes in Kubernetes Architectures?

As described in the previous answer, the Kubernetes architecture consists of three main planes: the control plane, the data plane, and the worker plane. The control plane drives the main operations of the cluster and consists of the following components:

  • kube-apiserver: This component exposes the main Kubernetes API server. The Kubernetes API server is a REST-based server that handles all the CRUD (create, read, update, and delete) cluster API requests.
  • kube-scheduler: This component is responsible for assigning pods to nodes based on resource requirements and constraints, policies, and any applied affinity specifications.
  • kube-controller-manager: Controllers in Kubernetes are resources that are responsible for regulating the state of the cluster. They are set to continuously watch the live and desired state of the cluster, ensuring that the live state always matches the desired state. The controller manager is a binary responsible for managing all of the controllers.
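
To make the scheduler's role more concrete, here is a toy sketch of the "filter, then score" idea. It is not the real kube-scheduler algorithm, which considers many more constraints. The node names, the resource numbers, and the scoring heuristic (prefer the node with the most free CPU) are all assumptions made for illustration.

```python
# Toy illustration only: the real kube-scheduler uses a much richer
# filtering and scoring pipeline. Node names and numbers are made up.

def pick_node(pod_request, nodes):
    """Return the name of a node that can fit the pod, or None."""
    # Filter: keep nodes with enough free CPU (millicores) and memory (MiB).
    feasible = [
        n for n in nodes
        if n["free_cpu_m"] >= pod_request["cpu_m"]
        and n["free_mem_mi"] >= pod_request["mem_mi"]
    ]
    if not feasible:
        return None  # in a real cluster the pod would stay Pending
    # Score: prefer the node with the most free CPU (an assumed heuristic).
    return max(feasible, key=lambda n: n["free_cpu_m"])["name"]


nodes = [
    {"name": "node-a", "free_cpu_m": 500, "free_mem_mi": 1024},
    {"name": "node-b", "free_cpu_m": 2000, "free_mem_mi": 4096},
]
chosen = pick_node({"cpu_m": 1000, "mem_mi": 512}, nodes)
print(chosen)
```

In this sketch only node-b has enough free CPU for the request, so it is the only feasible candidate.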

The worker plane is responsible for running the application workloads in the cluster on the designated nodes. These nodes are machines or devices registered to the cluster, and communicate with the control plane. The worker plane nodes have the following main components:

  • kubelet: The kubelet is a node-specific agent responsible for creating pods based on the manifest specifications. It communicates with both the underlying container runtime engine and the container repositories that house the container images.
  • Container Runtime Engine (CRE): CREs are system components that mount containers and work with the underlying operating system kernel to manage the container lifecycle.
  • kube-proxy: The kube-proxy is a daemon that runs on every node and reflects all the services that have been defined in the Kubernetes cluster API. Furthermore, it maintains network rules that allow communication to pods from internal and external cluster network sessions.

Finally, the data plane is responsible for storing all the cluster configurations and state. The data plane is typically implemented with etcd, a distributed key-value store. Note that the official Kubernetes documentation lists etcd as a control plane component rather than as a separate plane. Alternatively, some lightweight Kubernetes distributions like K3s implement this data store with other database technologies like SQLite.

What is kubectl?

kubectl is Kubernetes' official CLI tool, which you use to communicate with your Kubernetes clusters. It reads your kubeconfig file and attaches the credentials it contains (for example, a client certificate or a token) to each request so that the API server can authenticate and authorize it. kubectl can be used for create, read, update, and delete (CRUD) operations in your cluster.

If your kubeconfig file is associated with multiple clusters, you can also use kubectl to update the configuration and communicate with the desired cluster.
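
As a rough sketch of how a kubeconfig ties clusters, users, and contexts together, the Python snippet below models a kubeconfig as a dictionary and resolves which API server a context points to. The cluster, user, and context names and the server URLs are invented, and the resolve_server helper is hypothetical; kubectl itself does this lookup for you.

```python
# Hypothetical kubeconfig contents modeled as a Python dict. All names,
# URLs, and credentials below are placeholders for illustration.
kubeconfig = {
    "apiVersion": "v1",
    "kind": "Config",
    "current-context": "dev",
    "clusters": [
        {"name": "dev-cluster", "cluster": {"server": "https://dev.example.com:6443"}},
        {"name": "prod-cluster", "cluster": {"server": "https://prod.example.com:6443"}},
    ],
    "users": [
        {"name": "dev-user", "user": {"token": "REDACTED"}},
        {"name": "prod-user", "user": {"token": "REDACTED"}},
    ],
    "contexts": [
        {"name": "dev", "context": {"cluster": "dev-cluster", "user": "dev-user"}},
        {"name": "prod", "context": {"cluster": "prod-cluster", "user": "prod-user"}},
    ],
}


def resolve_server(config, context_name=None):
    """Return the API server URL for the given (or the current) context."""
    name = context_name or config["current-context"]
    context = next(c["context"] for c in config["contexts"] if c["name"] == name)
    cluster = next(
        c["cluster"] for c in config["clusters"] if c["name"] == context["cluster"]
    )
    return cluster["server"]


print(resolve_server(kubeconfig))          # server for the current context
print(resolve_server(kubeconfig, "prod"))  # server for an explicit context
```

On the command line, switching the active context is done with kubectl config use-context followed by the context name.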

What is a Pod?

Pods are the smallest deployable artifacts in a Kubernetes cluster. You can think of a pod as a wrapper for one or more closely linked containers. Each pod gets its own IP address, which the containers inside it share.

Sometimes, this idea of pods wrapping containers can cause confusion. The tendency is to assume that an entire application must always go inside a single pod, but this isn't always true.

Consider an application made up of multiple containers. If your application is tightly coupled by design, and the application's functionality would be compromised if the different containers were scheduled to different nodes in the cluster, then they should live within the same pod.

If, however, your containers are loosely coupled and can exist independent of each other, then they should have a one-to-one relationship with a pod.
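
As an illustration of the tightly coupled case, the sketch below builds a pod manifest with two containers as a Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The pod name, container names, and image tags are placeholders chosen for this example.

```python
import json

# Hypothetical pod with a main container and a tightly coupled helper.
# Names and image tags are placeholders, not recommendations.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "web", "labels": {"app": "web"}},
    "spec": {
        "containers": [
            {
                "name": "app",
                "image": "nginx:1.25",
                "ports": [{"containerPort": 80}],
            },
            {
                "name": "log-shipper",
                "image": "busybox:1.36",
                # Placeholder command that just keeps the helper running.
                "command": ["sh", "-c", "tail -f /dev/null"],
            },
        ]
    },
}

manifest = json.dumps(pod, indent=2)
print(manifest)
```

Both containers are scheduled to the same node and share the pod's network identity. Loosely coupled components would each get a pod of their own instead.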

What is a Deployment?

A deployment is a special type of Kubernetes resource known as a controller. Controllers are responsible for managing the state of the cluster by continuously watching the live state, comparing it to the desired state, and ensuring that the former always matches the latter.

For example, if your deployment resource specifies that three pod replicas should be running for a particular application, the deployment will work with the associated ReplicaSet resource to ensure that there are always three pods running.
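
The comparison of live state against desired state can be sketched as a small reconciliation function. This is a toy model of the idea, not the actual controller code: the pod names are made up and pods are represented as plain strings.

```python
import itertools

# Counter used to give replacement pods unique (made-up) names.
_ids = itertools.count(1)


def reconcile(desired_replicas, live_pods):
    """Return the pod list after one reconciliation pass (toy model)."""
    pods = list(live_pods)
    # Too few pods: create replacements until the desired count is reached.
    while len(pods) < desired_replicas:
        pods.append(f"replacement-{next(_ids)}")
    # Too many pods: remove the extras.
    del pods[desired_replicas:]
    return pods


live = ["web-a", "web-b", "web-c"]
live.remove("web-b")       # simulate one pod crashing
live = reconcile(3, live)  # the controller notices and restores the count
print(live)
```

A real controller runs this kind of comparison continuously, so a crashed pod is replaced without anyone having to intervene.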

Deployments are the optimal way of managing workload releases because they can manage pods in the following ways:

  • Scaling pods
  • Maintaining pod replicas based on the desired state
  • Watching the state of a pod
  • Rolling out and rolling back application versions for pods

Deployments have a one-to-one relationship with a set of identical pods. Deployments manage a specific set of pods using properties known as selectors, which specify or reference the labels attached to the pods that they are supposed to manage.

Pod labels are simply key-value pairs that provide metadata about the pods. Labels are used by other Kubernetes resources, such as deployments or services, to discover and attach to specific pods.
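
Here is a minimal sketch of how a selector relates to pod labels. The deployment manifest is written as a Python dictionary with placeholder names and an assumed image, and the matches helper is a hypothetical stand-in for the equality-based label matching that Kubernetes performs.

```python
# Hypothetical deployment manifest; names and image are placeholders.
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "web"},
    "spec": {
        "replicas": 3,
        "selector": {"matchLabels": {"app": "web"}},
        "template": {
            "metadata": {"labels": {"app": "web", "tier": "frontend"}},
            "spec": {"containers": [{"name": "app", "image": "nginx:1.25"}]},
        },
    },
}


def matches(selector_labels, pod_labels):
    """True when every selector key/value pair is present on the pod."""
    return all(pod_labels.get(k) == v for k, v in selector_labels.items())


selector = deployment["spec"]["selector"]["matchLabels"]
template_labels = deployment["spec"]["template"]["metadata"]["labels"]

print(matches(selector, template_labels))  # pods from this template
print(matches(selector, {"app": "api"}))   # an unrelated pod
```

A pod may carry extra labels, such as tier here, and still be selected. Only the pairs listed in the selector have to match.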

What is a Service?

Services are stable network abstraction layers for proxying traffic to pods. Every pod has a unique IP address, but this presents a challenge for accessing an application because pods have a short life cycle. Services provide network stability and have an unchanging IP and receive the network traffic destined for the pods. Services balance traffic across the pods in the cluster based on a label attached to the pods, and they keep track of which pods are alive and their respective IPs. As pods go through their lifecycle, services have access to a table with the pods that they’re associated with (based on the label) and will always know the IP to forward traffic to.

There are different types of services, and they fundamentally do the same thing in providing a stable networking abstraction point and load balancing of traffic for pods. Below are the different types of services:

  • ClusterIP Service: This is the default service type. It is used for workloads that should only accept traffic from inside the cluster.
  • NodePort Service: This service opens the same port on every node in the cluster, which makes it accessible from outside of the cluster.
  • LoadBalancer Service: This service is typically used in cloud environments because it integrates well with public cloud providers to create a load balancer external to the cluster.
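
To illustrate how a service tracks the pods behind it, the sketch below defines a ClusterIP service manifest as a Python dictionary and computes which pod IPs would receive traffic. The pod list, IP addresses, and the endpoints helper are hypothetical; in a real cluster Kubernetes maintains this list for you.

```python
# Hypothetical service manifest; names and ports are placeholders.
service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "web"},
    "spec": {
        "type": "ClusterIP",
        "selector": {"app": "web"},
        "ports": [{"port": 80, "targetPort": 8080}],
    },
}

# Made-up pods with made-up IPs, standing in for the live cluster state.
pods = [
    {"ip": "10.0.0.4", "labels": {"app": "web"}, "ready": True},
    {"ip": "10.0.0.5", "labels": {"app": "web"}, "ready": False},
    {"ip": "10.0.0.6", "labels": {"app": "api"}, "ready": True},
    {"ip": "10.0.0.7", "labels": {"app": "web"}, "ready": True},
]


def endpoints(svc, pod_list):
    """Return IPs of ready pods whose labels match the service selector."""
    selector = svc["spec"]["selector"]
    return [
        p["ip"]
        for p in pod_list
        if p["ready"]
        and all(p["labels"].get(k) == v for k, v in selector.items())
    ]


backends = endpoints(service, pods)
print(backends)
```

Clients keep talking to the service's stable address while the list of backing pod IPs changes underneath it.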

What is an Ingress and Why Would You Use It?

When it comes to getting external network traffic to and from your application, services are only sufficient to a certain degree. For starters, they only operate at Layer 4 in the OSI network model, so they can only forward TCP and UDP traffic.

That's a problem if your application needs to handle HTTP traffic, which operates at Layer 7 of the network OSI model. The second issue is that every time you use the NodePort service to expose traffic to an application, you're exposing a unique port on your node.

This increases the risk of a security breach because of the open ports. Alternatively, if you opt to use the LoadBalancer service for every application, this can quickly become expensive, because the cloud provider will create a load balancer for every LoadBalancer service you create for your applications.

The solution to these problems is to make use of the Kubernetes Ingress object. With an Ingress, a single external entry point (typically one load balancer in front of an Ingress controller) listens for HTTP traffic, and the traffic is then routed to the relevant service in your cluster based on the routing rules that you define.

Kubernetes Ingress objects have no effect on their own. You first have to deploy a Kubernetes Ingress Controller. In many cases, Kubernetes installers and distributions come with one already installed.

Ingress Controllers act on the Ingress objects you create, and manage them in a similar way that deployments manage pods.
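
The routing rules of an Ingress can be sketched as follows. The manifest is a Python dictionary with an invented host and invented service names, and the route helper is a simplified, hypothetical stand-in for what an Ingress controller does. Real controllers match prefixes per path segment and support more options than shown here.

```python
# Hypothetical Ingress manifest; host and service names are placeholders.
ingress = {
    "apiVersion": "networking.k8s.io/v1",
    "kind": "Ingress",
    "metadata": {"name": "shop"},
    "spec": {
        "rules": [
            {
                "host": "shop.example.com",
                "http": {
                    "paths": [
                        {
                            "path": "/api",
                            "pathType": "Prefix",
                            "backend": {"service": {"name": "api", "port": {"number": 8080}}},
                        },
                        {
                            "path": "/",
                            "pathType": "Prefix",
                            "backend": {"service": {"name": "web", "port": {"number": 80}}},
                        },
                    ]
                },
            }
        ]
    },
}


def route(ing, host, path):
    """Pick the service with the longest matching path prefix (simplified)."""
    best = None
    for rule in ing["spec"]["rules"]:
        if rule["host"] != host:
            continue
        for candidate in rule["http"]["paths"]:
            longer = best is None or len(candidate["path"]) > len(best["path"])
            if path.startswith(candidate["path"]) and longer:
                best = candidate
    return best["backend"]["service"]["name"] if best else None


print(route(ingress, "shop.example.com", "/api/orders"))
print(route(ingress, "shop.example.com", "/cart"))
```

Both services sit behind one entry point, so you do not need a separate NodePort or LoadBalancer service per application.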

What is a StatefulSet?

StatefulSets are Kubernetes resources used to manage stateful workloads. Because pods and container file systems are ephemeral, ordinary pods lose their identity and data when they are replaced. StatefulSets allow you to create pods that maintain a sticky identity with stable network identification, as well as stable persistent storage for data.
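
The sticky identity is easiest to see in how the pods are named. StatefulSet pods get the name of the StatefulSet plus an ordinal index, and each can be reached at a predictable DNS name through a headless service. The sketch below generates those names. The StatefulSet and service names are made up, and cluster.local is assumed as the cluster domain, which is the common default but can differ per cluster.

```python
def statefulset_pod_names(name, replicas):
    """Pods are named <statefulset-name>-<ordinal>, starting at 0."""
    return [f"{name}-{ordinal}" for ordinal in range(replicas)]


def pod_dns_name(name, ordinal, headless_service, namespace="default",
                 cluster_domain="cluster.local"):
    """Stable DNS name of one StatefulSet pod behind a headless service.

    cluster_domain is an assumption; clusters may be configured differently.
    """
    return f"{name}-{ordinal}.{headless_service}.{namespace}.svc.{cluster_domain}"


names = statefulset_pod_names("db", 3)
print(names)
print(pod_dns_name("db", 0, "db-headless"))
```

If the pod db-1 is rescheduled, its replacement is again called db-1, which is what lets peers and storage find it again.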

How Would You Persist Data in Kubernetes Workloads?

Both container and pod file systems are ephemeral. That means when they reach the end of their lifecycle, any associated data on the file system is lost. For your applications to have a persistent method of data storage, you need to make use of volumes.

With volumes, you can store data outside the container file system, while still allowing the container to access the relevant data at runtime. Volumes provide external storage for your container workloads. In the context of Kubernetes, you would also make use of persistent volumes.

Persistent volumes are Kubernetes resources that allow you to treat storage as an abstract resource that can be consumed by your pods.
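
In practice, a pod usually consumes a persistent volume through a PersistentVolumeClaim, which is a request for storage of a given size and access mode. The sketch below shows a claim and a pod that mounts it, both written as Python dictionaries. The names, the image, the requested size, and the mount path are placeholders for illustration.

```python
import json

# Hypothetical claim requesting 1Gi of storage; the name is a placeholder.
claim = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "db-data"},
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": "1Gi"}},
    },
}

# Hypothetical pod that mounts the claim into its container.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "db"},
    "spec": {
        "containers": [
            {
                "name": "db",
                "image": "postgres:16",
                "volumeMounts": [
                    {"name": "data", "mountPath": "/var/lib/postgresql/data"}
                ],
            }
        ],
        "volumes": [
            # The volume refers to the claim by name.
            {"name": "data", "persistentVolumeClaim": {"claimName": "db-data"}}
        ],
    },
}

print(json.dumps([claim, pod], indent=2))
```

Data written under the mount path lives on the volume, so it outlives the container and the pod.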

How Do You Accomplish Application Configuration in Kubernetes?

Application configuration in Kubernetes can be accomplished using ConfigMaps and secrets to store data that will drive the behavior of the applications.

Both are used to store application configuration data that can be accessed by containers at runtime, either by exposing the data as environment variables or as files in a volume that gets mounted in the relevant containers.
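
Both access patterns can be sketched with a ConfigMap and a container spec written as Python dictionaries. The ConfigMap name, the key, and the mount path are invented for this example, and resolve_env is a hypothetical helper that mimics how the environment variable reference would be filled in.

```python
# Hypothetical ConfigMap; name and data are placeholders.
config_map = {
    "apiVersion": "v1",
    "kind": "ConfigMap",
    "metadata": {"name": "app-config"},
    "data": {"LOG_LEVEL": "debug"},
}

# Container that reads the same ConfigMap in two ways.
container = {
    "name": "app",
    "image": "nginx:1.25",
    # 1) As an environment variable.
    "env": [
        {
            "name": "LOG_LEVEL",
            "valueFrom": {
                "configMapKeyRef": {"name": "app-config", "key": "LOG_LEVEL"}
            },
        }
    ],
    # 2) As files in a mounted volume (one file per key).
    "volumeMounts": [{"name": "config", "mountPath": "/etc/app"}],
}
pod_volumes = [{"name": "config", "configMap": {"name": "app-config"}}]


def resolve_env(container_spec, config_maps):
    """Mimic how configMapKeyRef entries become environment variables."""
    env = {}
    for var in container_spec["env"]:
        ref = var["valueFrom"]["configMapKeyRef"]
        env[var["name"]] = config_maps[ref["name"]]["data"][ref["key"]]
    return env


env = resolve_env(container, {"app-config": config_map})
print(env)
```

Secrets can be referenced in the same two ways, using secretKeyRef for environment variables and a secret volume for files.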

How Can You Secure Secrets in Kubernetes?

By default, secrets are stored in a non-encrypted format (base64 encoding) in the etcd data store.

This presents the challenge of safely storing secret manifests in Git repositories, since the original values can easily be derived. To secure secrets, you can enable encryption of data at rest in your cluster datastore, and enable TLS between the datastore and the API server.
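
A short Python sketch shows why base64 offers no protection: anyone who can read the encoded value can decode it without a key. The secret value used here is made up.

```python
import base64

# A made-up secret value, encoded the way it would appear in a manifest.
encoded = base64.b64encode(b"s3cr3t-password").decode()
print(encoded)

# Decoding needs no key at all, so this is encoding, not encryption.
decoded = base64.b64decode(encoded).decode()
print(decoded)
```

This is why a secret manifest committed to Git in plain form should be treated as if the secret itself had been committed.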

In addition to this, you can make use of tools like Mozilla SOPS, Helm Secrets, and Sealed Secrets to encrypt your Kubernetes secrets before they are committed to Git, so that only holders of the decryption key can recover the original values.

How Would You Secure Network Traffic Between Workloads in Kubernetes?

To secure the network traffic between pods in your cluster, you can make use of network policies. Network policies (the NetworkPolicy resource) are objects that allow you to control the ingress and egress of network traffic to and from the pods in your workloads.

This creates a more secure cluster network by keeping pods isolated from traffic that they shouldn't be receiving.

To use network policies in your cluster, you must install a CNI plug-in that supports this type of Kubernetes object. Some popular CNI plug-ins that support network policies include Calico, Weave Net, and Cilium.
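
As a sketch of what such a policy looks like, the manifest below (written as a Python dictionary) allows only pods labeled app=api to reach pods labeled app=db on TCP port 5432. The labels, the port, and the is_allowed helper are assumptions made for illustration. Actual enforcement is done by the CNI plug-in, not by code like this.

```python
# Hypothetical NetworkPolicy; labels and port are placeholders.
policy = {
    "apiVersion": "networking.k8s.io/v1",
    "kind": "NetworkPolicy",
    "metadata": {"name": "db-allow-api"},
    "spec": {
        "podSelector": {"matchLabels": {"app": "db"}},  # pods being protected
        "policyTypes": ["Ingress"],
        "ingress": [
            {
                "from": [{"podSelector": {"matchLabels": {"app": "api"}}}],
                "ports": [{"protocol": "TCP", "port": 5432}],
            }
        ],
    },
}


def is_allowed(net_policy, source_labels, port):
    """Toy check: does any ingress rule admit this source pod and port?"""
    for rule in net_policy["spec"]["ingress"]:
        port_ok = any(p["port"] == port for p in rule["ports"])
        peer_ok = any(
            all(source_labels.get(k) == v
                for k, v in peer["podSelector"]["matchLabels"].items())
            for peer in rule["from"]
        )
        if port_ok and peer_ok:
            return True
    return False


print(is_allowed(policy, {"app": "api"}, 5432))
print(is_allowed(policy, {"app": "web"}, 5432))
```

Once a pod is selected by a policy of type Ingress, traffic that no rule admits is dropped, which is what isolates the database pods here.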

What is One Way to Secure the Hosts or Nodes in a Kubernetes Cluster?

There are a number of ways that you can secure your cluster nodes. The most basic security measure is to secure your hosts with firewall rules that only permit the minimum required network traffic for communication across the cluster.

In addition to this, you can use an Ingress controller and Ingress resources to ensure a single point of entry into the cluster, as opposed to using NodePort Services for incoming network traffic from external sources.

Another best practice is to make use of a security tool like AppArmor. AppArmor is a Linux kernel security module that provides granular access control for the different software programs that run on your hosts.

With AppArmor, you can control and limit what a program is allowed to do on your host operating systems using profiles, which are sets of rules that define what a program can and cannot do.

By applying these profiles, you can restrict the behavior or actions of a program on your machines. In Kubernetes, this can help prevent containers from executing commands with escalated privileges. Other tools similar to AppArmor are seccomp and SELinux.

Conclusion

In this post, you've looked at some of the top questions that can be used to test the knowledge and understanding of individuals interviewing for a Kubernetes-related role.

If you're looking to upskill and acquire the relevant knowledge for certain job opportunities, as well as receive coaching for the interview process, try Exponent. Exponent is a learning platform that offers a variety of courses and coaching material to prepare you for your desired role.

✍️
This article was written by Lukonde Mwila. He specializes in cloud and DevOps engineering and cloud-native technologies. He is passionate about sharing knowledge through various mediums and engaging with the developer community at large.