Additional Microservices and Container Security Guidelines

The following section provides additional guidelines for securing your microservices and containers.

There are already several guidance documents on secure application development, secure coding practices, and secure deployment practices. This document focuses on the security considerations specific to microservices and containers.

5.1 Securing Platform

Multi-Tenancy

Multi-tenancy within a single Kubernetes cluster is one of the key drivers for digging into the security implications of different deployment patterns. The most secure approach is to only deploy a single application per K8s cluster, but this becomes very inefficient in terms of resource usage and operational maintenance of the utilized infrastructure.

Throughout the following sections, we will illustrate different options for isolating and separating workloads, as well as best practices when establishing your application architecture.

To get started, it is worth briefly covering the difference between workload isolation in a container runtime and in traditional VMs.

Workload Isolation

The following table lists the different considerations when it comes to workload isolation in containers and VMs.

Consideration	Description
Hard Isolation	Traditional VMs offer strong, "hard" isolation due to their separate kernels.
Soft Isolation	Containers provide "soft" isolation using mechanisms like Kubernetes namespaces.

The most fundamental difference between containers and virtual machines (VMs) is related to the host, or kernel environment in which they run. Each VM runs its own guest OS and as such, has its own kernel. In contrast, a container shares the underlying operating system's kernel with all of the other containers on that host.

As a simple illustration, the following diagram shows the components of an application as it relates to either VMs or containers.

Figure 5-1 - VMs vs Containers

Hard Isolation

Key points to consider when it comes to hard isolation:

Consideration	Description
Strong Isolation	VMs provide strong isolation by running separate guest OS and kernels, offering well-understood security realms.
Resource Sharing	Containers share the host OS kernel, enabling faster startup, lower memory footprint, and efficient resource usage.

Soft Isolation

Key points to consider when it comes to soft isolation:

Consideration	Description
Namespace	Kubernetes namespaces provide basic workload separation through resource quotas and user access boundaries.
Isolation	Namespaces provide a level of isolation, but they are not as strong as VMs. Namespaces do not isolate workloads on physical hosts; workloads in different namespaces can run on the same host.

Attack Surface

The attack surface of a container is the sum of all the different points where an attacker could potentially exploit the container. The attack surface includes the container image, the container runtime, the host OS, the container network, and the Kubernetes API server.

Consider security implications of:

Consideration	Description
Kubernetes Platform Architecture	nodes, kubelet service, API service, etcd service, control plane.
Workload Organization	Pods, containers, namespaces.
Ecosystem Services	ingress, storage, certificate management, service mesh.

Figure 5-2 - Kubernetes Attack Surface

Workload Scheduling & Placement

Containers can have dependencies on each other, dependencies on nodes, and resource demands, all of which can change over time. The resources available on a cluster also vary over time, as the cluster shrinks or grows or as containers that are already placed consume capacity. The way containers are placed also affects the availability, performance, and capacity of the distributed system. This section covers the considerations that apply when a workload is scheduled on a node and how Kubernetes manages that scheduling.

The following table is a list of considerations when it comes to workload scheduling and placement in Kubernetes.

Consideration	Description
Taints and Tolerations	Taints and tolerations control which pods can be scheduled on which nodes. Use them to keep pods off unsuitable nodes (e.g., user workloads on control plane nodes).
Pod Affinity/Anti-Affinity	Pod affinity and anti-affinity allow you to constrain which nodes your pod is eligible to be scheduled on, based on the labels of pods that are already running on the node.
Node Affinity/Anti-Affinity	Node affinity and anti-affinity allow you to target specific nodes or to ensure distribution for high availability.
Built-in vs Custom	Utilize built-in taints or define custom taints based on node conditions like readiness, resource pressure, or schedulability.
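
To make the matching rule between taints and tolerations concrete, the following is a minimal Python sketch of how a scheduler might decide whether a pod's tolerations cover a node's taints. It is an illustration under simplifying assumptions, not the Kubernetes scheduler itself: the function names are hypothetical, only the `NoSchedule` and `NoExecute` effects are treated as blocking, and resource availability and affinity rules are ignored.

```python
# Effects that block placement outright. PreferNoSchedule is only a soft preference.
HARD_EFFECTS = {"NoSchedule", "NoExecute"}


def tolerates(toleration: dict, taint: dict) -> bool:
    """Return True if a single toleration matches a single taint."""
    # A toleration with no effect matches taints of every effect.
    effect = toleration.get("effect")
    if effect and effect != taint["effect"]:
        return False
    key = toleration.get("key")
    if toleration.get("operator", "Equal") == "Exists":
        # Exists with an empty key tolerates every taint.
        return not key or key == taint["key"]
    # Equal (the default operator): key and value must both match.
    return key == taint["key"] and toleration.get("value", "") == taint.get("value", "")


def can_schedule(pod_tolerations: list, node_taints: list) -> bool:
    """A pod fits a node when every blocking taint is tolerated."""
    return all(
        any(tolerates(tol, taint) for tol in pod_tolerations)
        for taint in node_taints
        if taint["effect"] in HARD_EFFECTS
    )


control_plane_taint = {"key": "node-role.kubernetes.io/control-plane", "effect": "NoSchedule"}
print(can_schedule([], [control_plane_taint]))  # a user pod without tolerations
```

A pod that declares no tolerations is kept off the tainted node in this sketch, which is the behaviour the table above relies on to keep user workloads away from control plane nodes.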

Role-Based Access Control (RBAC)

Kubernetes Cluster Authorization

Access controls are implemented at the Kubernetes API layer (kube-apiserver). When an API request arrives, the authorization permissions are checked to determine whether the user is allowed to perform the requested action.

Consideration	Description
Role-Based Access Control (RBAC)	A method of regulating access to computer or network resources based on the roles of individual users within an organization. Implement access controls at the Kubernetes API layer to govern user permissions for actions and resources.
Roles and Binding	A role is a set of permissions that define the actions that a user is allowed to perform. `RoleBindings` grant those permissions to the users. Utilize Roles and RoleBindings for namespace-specific access, and `ClusterRoles` and `ClusterRoleBindings` for cluster-wide access.

Figure 5-3 - RBAC in Kubernetes
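
As an illustration of how a Role and a RoleBinding fit together, the sketch below builds both objects as Python dictionaries and prints them as JSON, which Kubernetes tooling accepts in place of YAML. The namespace `team-a`, the user `jane`, and the object names are hypothetical. The `role_allows` helper is a simplified stand-in for the API server's rule matching, not its actual implementation.

```python
import json

# A namespaced Role that grants read-only access to pods.
role = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "Role",
    "metadata": {"namespace": "team-a", "name": "pod-reader"},
    "rules": [
        {"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list", "watch"]}
    ],
}

# A RoleBinding that grants the Role above to one user in the same namespace.
role_binding = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "RoleBinding",
    "metadata": {"namespace": "team-a", "name": "read-pods"},
    "subjects": [
        {"kind": "User", "name": "jane", "apiGroup": "rbac.authorization.k8s.io"}
    ],
    "roleRef": {
        "kind": "Role",
        "name": "pod-reader",
        "apiGroup": "rbac.authorization.k8s.io",
    },
}


def role_allows(role: dict, api_group: str, resource: str, verb: str) -> bool:
    """RBAC rules are additive: the request is allowed if any rule matches."""
    def matches(values: list, wanted: str) -> bool:
        return "*" in values or wanted in values

    return any(
        matches(rule["apiGroups"], api_group)
        and matches(rule["resources"], resource)
        and matches(rule["verbs"], verb)
        for rule in role["rules"]
    )


print(json.dumps([role, role_binding], indent=2))
```

Because RBAC has no deny rules, anything not matched by a rule is refused, so keeping the `verbs` and `resources` lists short is what enforces least privilege.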

Service Mesh

A service mesh is a programmable framework that allows you to observe, secure, and connect microservices. It doesn't establish connectivity between microservices, but instead has policies and controls that are applied on top of an existing network to govern how microservices interact. Generally a service mesh is agnostic to the language of the application and can be applied to existing applications usually with little to no code changes.

Features of a service mesh include:

Feature	Description
Traffic Management	Control the flow of traffic between services, including routing, load balancing, and retries. Enhance security through a programmable framework that governs microservice interactions.
Zero Trust	Implement Zero Trust network principles with policy-based security controls.
Separate Planes	Consider service meshes for features such as traffic shaping, resiliency, observability, and secure communication.

Figure 5-4 - Service Mesh

Access Control and Policy Enforcement

Access control policies evolve as business requirements change, so tying them to the microservice code is a bad practice.

The following table lists the considerations when it comes to access control and policy enforcement:

Consideration	Description
Open Policy Agent (OPA)	OPA is a general-purpose policy engine that can be used to enforce policies across the Kubernetes ecosystem. Use OPA to externalize and enforce fine-grained access control policies across the Kubernetes ecosystem.
Policy Enforcement	Define policies for API authorization, workload deployment restrictions, resource tagging, and other compliance controls.
API Gateway	API gateways can enforce policies before requests reach the microservices. Integrate OPA with API gateways for policy enforcement at the edge.
Cloud IAM	Use cloud IAM to manage access to cloud resources and services when deploying microservices on cloud platforms. Use service or machine accounts to authenticate and authorize services to access cloud resources.

Policy enforcement systems can be integrated with API gateways to enforce policies at the gateway level. The following figure illustrates the sequence of events that occurs when an API gateway intercepts client requests to apply authorization policies using OPA.

A diagram illustrating a microservice architecture with an API gateway and OPA engine for authorization. It shows the flow of a client request through the gateway, which sends an authorization request to the OPA engine. The OPA engine evaluates policies and sends an allow/deny response back to the gateway. The gateway then either terminates the request or forwards it to the microservice.

Figure 5-5 - API Gateway with OPA
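
The request flow in the figure can be sketched in a few lines of Python. This is a simplified model rather than a real gateway: `gateway_handle`, `example_policy`, and the request shape are hypothetical, and the `decide` callable stands in for the query a gateway would send to an OPA instance. The policy logic lives outside the service code, which is the point of externalizing it.

```python
from typing import Callable, Tuple

Decision = Callable[[dict], bool]
Response = Tuple[int, str]


def example_policy(input_doc: dict) -> bool:
    """Stand-in for a policy decision; in practice the policy engine evaluates this."""
    roles = input_doc.get("user", {}).get("roles", [])
    if input_doc.get("method") == "GET":
        return "reader" in roles or "admin" in roles
    return "admin" in roles


def gateway_handle(request: dict, decide: Decision,
                   forward: Callable[[dict], Response]) -> Response:
    """Ask the policy engine first, then terminate or forward the request."""
    input_doc = {
        "method": request["method"],
        "path": request["path"],
        "user": request.get("user", {}),
    }
    try:
        allowed = decide(input_doc)
    except Exception:
        # Fail closed: if no decision can be obtained, deny the request.
        allowed = False
    if not allowed:
        return 403, "Forbidden"
    return forward(request)


def microservice(request: dict) -> Response:
    return 200, "handled " + request["path"]


print(gateway_handle(
    {"method": "GET", "path": "/orders", "user": {"roles": ["reader"]}},
    example_policy,
    microservice,
))
```

Denying the request when the decision call fails is a design choice made for this sketch; some deployments may prefer different behaviour for availability reasons.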

5.2 Securing Container Runtime

In order to run containers securely, we aim to do the following:

Use least privilege to carry out the task at hand.
Enforce resource allocation.
Limit communication between applications, and to and from the outside world, to a defined and deterministic set of connections.

Least-Privilege Security Settings

Least privilege is a security concept that requires that every module or program in a computer system be granted the least amount of privilege needed to fulfill its function. This helps to reduce the potential attack surface of the system.

The following table lists the considerations when it comes to least-privilege security settings in containers:

Consideration	Description
Avoid Root	Run containers as non-root users unless specific privileges are required (e.g., modifying the host system, binding to privileged ports).
Read-Only Root Filesystem	Prevent attackers from writing executable files by setting the root filesystem as read-only, unless write access is essential.
Limit Host Volume Mounts	Restrict the ability to mount sensitive host directories into containers to prevent unintended modifications.
Disable Privileged Access	Set `privileged` and `allowPrivilegeEscalation` to false unless specifically required.
Restrict System Calls	Use `seccomp` profiles to limit system calls available within containers, minimizing potential attack vectors.
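
The settings in the table above map to fields of a pod specification, so they can be checked automatically before deployment. The following Python sketch audits a pod spec that has already been parsed into a dictionary. The function name is hypothetical, and the sketch makes a simplifying assumption: it inspects only container-level `securityContext` fields, although several of them can also be set at pod level.

```python
ALLOWED_SECCOMP_TYPES = {"RuntimeDefault", "Localhost"}


def audit_pod_spec(pod_spec: dict) -> list:
    """Return human-readable findings for settings that break least privilege."""
    findings = []
    # Host volume mounts expose parts of the node's filesystem to the container.
    for volume in pod_spec.get("volumes", []):
        if "hostPath" in volume:
            findings.append(f"volume {volume.get('name')}: uses a hostPath mount")
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        ctx = container.get("securityContext", {})
        if ctx.get("runAsNonRoot") is not True:
            findings.append(f"{name}: runAsNonRoot is not true")
        if ctx.get("readOnlyRootFilesystem") is not True:
            findings.append(f"{name}: root filesystem is writable")
        if ctx.get("privileged") is True:
            findings.append(f"{name}: privileged mode is enabled")
        # Treated as enabled unless it is explicitly set to false.
        if ctx.get("allowPrivilegeEscalation") is not False:
            findings.append(f"{name}: allowPrivilegeEscalation is not false")
        if ctx.get("seccompProfile", {}).get("type") not in ALLOWED_SECCOMP_TYPES:
            findings.append(f"{name}: no seccomp profile is set")
    return findings


hardened = {
    "containers": [{
        "name": "app",
        "securityContext": {
            "runAsNonRoot": True,
            "readOnlyRootFilesystem": True,
            "privileged": False,
            "allowPrivilegeEscalation": False,
            "seccompProfile": {"type": "RuntimeDefault"},
        },
    }]
}
print(audit_pod_spec(hardened))
```

A check of this kind could run in a CI pipeline or an admission policy so that exceptions have to be requested explicitly rather than granted by default.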

Resource Usage

Containers have multiple dimensions at runtime, such as:

memory usage,
CPU usage, and
other resource consumption dimensions.

The following table lists the considerations when it comes to resource usage in containers:

Consideration	Description
Define the Required Resources	Define and enforce resource requirements (CPU, memory) for each container to ensure predictable resource allocation and prevent resource starvation.
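
As a minimal sketch of enforcing this consideration, the Python function below reports every container in a parsed pod spec that lacks CPU or memory requests or limits. The function name and the example values are hypothetical, and whether both requests and limits must be present for every resource is a policy decision assumed here for illustration.

```python
REQUIRED_RESOURCES = ("cpu", "memory")
REQUIRED_SECTIONS = ("requests", "limits")


def missing_resources(pod_spec: dict) -> list:
    """List every container resource setting that is not declared."""
    problems = []
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        resources = container.get("resources", {})
        for section in REQUIRED_SECTIONS:
            declared = resources.get(section, {})
            for resource in REQUIRED_RESOURCES:
                if resource not in declared:
                    problems.append(f"{name}: {section}.{resource} is not set")
    return problems


pod_spec = {
    "containers": [
        {
            "name": "api",
            "resources": {
                "requests": {"cpu": "250m", "memory": "128Mi"},
                "limits": {"cpu": "500m", "memory": "256Mi"},
            },
        },
        {"name": "sidecar", "resources": {"requests": {"cpu": "50m"}}},
    ]
}
for problem in missing_resources(pod_spec):
    print(problem)
```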

Network Policies

The following table lists the considerations when it comes to network policies in containers:

Consideration	Description
"Layer 3" Network segmentation	Restrict pod-to-pod communication and external access using network policies to enhance security and implement network segmentation.
Set Limits	Consider network policy limitations (pod port focus, service port changes, logging capabilities, FQDN support).
Default Allow-All Policy	By default, Kubernetes has an allow-all policy, and does not restrict the ability for pods to communicate with each other.
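
Because the default is allow-all, a common starting point is a policy that denies all traffic in a namespace, with specific flows then allowed explicitly. The sketch below builds such a `NetworkPolicy` as a Python dictionary and prints it as JSON. The function name, policy name, and namespace are hypothetical, and the policy takes effect only if the cluster's network plugin enforces network policies.

```python
import json


def default_deny_policy(namespace: str) -> dict:
    """Build a NetworkPolicy that selects every pod and allows no traffic."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            # An empty podSelector selects all pods in the namespace.
            "podSelector": {},
            # Listing both types with no rules denies ingress and egress.
            "policyTypes": ["Ingress", "Egress"],
        },
    }


print(json.dumps(default_deny_policy("team-a"), indent=2))
```

Note that denying all egress also blocks DNS lookups, so an additional policy that allows the required egress is usually needed alongside this one.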

5.3 Securing Traffic

Microservice developers should focus on the business functionality of a microservice, while other concerns such as security, observability, and resiliency should be handled by specialized components. The API gateway and the service mesh are two architectural patterns that help achieve this goal.

North-South Traffic

North-South traffic is any traffic between client or consumer applications and the APIs. To secure north-south traffic, an API gateway is typically deployed at the edge of a network.

Consideration	Description
North-South Traffic (Client-API Communication)	Deploy API gateways at the network edge to secure and manage incoming requests to microservices. Implement authentication and authorization using protocols like OAuth 2.0 to control access to resources. Consider API gateways for features like rate limiting, request throttling, analytics, and API monetization.

East-West Traffic

East-West traffic is the communication between microservices.

Securing this type of traffic has three aspects:

Consideration	Description
East-West Traffic (Inter-Microservice Communication)	Implement mutual TLS (mTLS) to secure communication between microservices. Use JSON Web Tokens (JWTs) for request authentication to verify end-user identities and claims. Leverage service meshes to decouple security, observability, routing control, and resiliency from the microservice implementation.
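
When a service mesh is not available to handle mTLS, a service can require client certificates itself. The following Python sketch configures the server side with the standard `ssl` module. The function name and its parameters are hypothetical, and the certificate paths are optional only so that the sketch runs without files; a real service would always supply them.

```python
import ssl
from typing import Optional


def build_mtls_server_context(certfile: Optional[str] = None,
                              keyfile: Optional[str] = None,
                              cafile: Optional[str] = None) -> ssl.SSLContext:
    """Create a server-side TLS context that rejects clients without a valid certificate."""
    context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    # This setting turns plain TLS into mutual TLS: the client must present
    # a certificate that chains to a trusted certificate authority.
    context.verify_mode = ssl.CERT_REQUIRED
    if certfile:
        # The service's own certificate and private key.
        context.load_cert_chain(certfile=certfile, keyfile=keyfile)
    if cafile:
        # The internal CA that issues certificates to the other microservices.
        context.load_verify_locations(cafile=cafile)
    return context


context = build_mtls_server_context()
print(context.verify_mode)
```

The returned context can be used to wrap a listening socket with `context.wrap_socket(sock, server_side=True)`. Trusting only an internal certificate authority, rather than the system's public trust store, limits which clients can authenticate.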

Event-Driven Systems

Event-driven systems rely on data streams and the propagation of events to trigger actions. These systems are composed of event producers, event consumers, and event brokers.

The following table lists the considerations when it comes to securing event-driven systems:

Consideration	Description
Message broker	Secure event traffic between microservices using message brokers such as Kafka, Google Pub/Sub, Azure Service Bus, or AWS SNS.
Transport Layer Security	Use TLS/mTLS to encrypt data in transit between microservices and the message broker.
Control access	Use `IAM` to control which microservices are permitted to connect to the message broker and to authenticate the clients connecting to it.
Access Control Lists (ACLs)	ACLs permit or deny the actions that individual microservices may perform on message broker resources such as topics and queues.
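
The idea behind broker ACLs can be shown with a small Python model. It does not use any broker's API: the entry format, the service names, and the `is_authorized` function are hypothetical. It assumes two rules for illustration, namely that nothing is permitted unless an entry allows it and that a matching deny entry overrides any allow entry.

```python
ACL_ENTRIES = [
    {"principal": "orders-service", "resource": "topic:orders", "operation": "write", "permission": "allow"},
    {"principal": "billing-service", "resource": "topic:orders", "operation": "read", "permission": "allow"},
    {"principal": "billing-service", "resource": "topic:orders", "operation": "write", "permission": "deny"},
]


def is_authorized(principal: str, resource: str, operation: str, entries: list) -> bool:
    """Deny by default; an explicit deny wins over any allow."""
    matching = [
        entry for entry in entries
        if entry["principal"] == principal
        and entry["resource"] == resource
        and entry["operation"] == operation
    ]
    if any(entry["permission"] == "deny" for entry in matching):
        return False
    return any(entry["permission"] == "allow" for entry in matching)


print(is_authorized("billing-service", "topic:orders", "read", ACL_ENTRIES))
print(is_authorized("reporting-service", "topic:orders", "read", ACL_ENTRIES))
```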

5.4 Secure Coding Practices

Secure coding practices are a set of guidelines and best practices that help developers write secure code. The Government of Canada provides guidance on secure coding practices in the Secure Coding Practices Guide on GCPedia, which is based in large part on the OWASP Top 10.

In addition to the Secure Coding Practices Guide, the following list should be considered when developing microservices:

Consideration	Description
Crypto Libraries	Don't write your own crypto code! Use well-known and tested libraries.
Data in Transit	Use TLS and OAuth tokens to protect your APIs and the data in transit. Use mutual TLS (mTLS) for secure communication between microservices.
Protect Production Branch	Only verified commits should be merged into the production branch. Use branch protection rules to prevent unauthorized changes to the production branch.
Secrets Management	Never hard-code secrets in your code. Use a secret management tool to store and retrieve secrets.
Input Validation	Always validate input data to prevent injection attacks.
Error Handling	Implement proper error handling to prevent information leakage.
Logging	Implement proper logging to help with debugging and auditing.
Dependency Management	Keep your dependencies up to date to avoid vulnerabilities.
Secure Configuration	Ensure that your configuration is secure and does not expose sensitive information.
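
As one example of the input validation consideration, the Python sketch below combines an allow-list check with a parameterized SQL query so that user input is never concatenated into the statement. The `users` table, the `find_user` function, and the username format are hypothetical and chosen only for illustration.

```python
import re
import sqlite3

# Assumed format for this example: 1 to 32 letters, digits, or underscores.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{1,32}")


def find_user(conn: sqlite3.Connection, username: str):
    """Look up a user by name, rejecting malformed input before it reaches the database."""
    if not USERNAME_PATTERN.fullmatch(username):
        raise ValueError("invalid username")
    # The ? placeholder passes the value separately from the SQL text,
    # so it cannot change the structure of the query.
    cursor = conn.execute("SELECT id, name FROM users WHERE name = ?", (username,))
    return cursor.fetchone()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('alice')")
print(find_user(conn, "alice"))
```

The two measures are complementary: validation rejects obviously malformed input early, while the parameterized query remains safe even for input that passes validation.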

5.5 Architecting Your Application for Cloud

The purpose of this section is to outline the characteristics of scalable, resilient, and portable microservices that make them easier to deploy to, and monitor in, the cloud.

The following principles are recommended:

Principle	Description
Single Concern	Containers should address a single concern and do it well. Use patterns like sidecar and init-containers to address multiple concerns. This promotes reusability and replaceability.
Immutable Container Images	Container images should be immutable and not contain any environment-specific configuration. Separate application code from configuration.
Self-Contained Image	Container images should be self-contained and not rely on external dependencies. Minimize the attack surface by using minimal images.
Lifecycle Conformance	React to lifecycle events like readiness, liveness, and termination.
Process Disposability	Containers should be stateless and externalize state to a persistent data store. Minimize startup time by reducing the amount of work done on container startup.
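
Lifecycle conformance and process disposability both depend on the application reacting to the termination signal it receives from the platform. The following Python sketch shows one way to stop a worker loop cleanly when `SIGTERM` arrives. The function names and the polling interval are hypothetical, and the handler is installed by an explicit call so that the module can be imported without side effects.

```python
import signal
import threading

# Set when the platform asks the process to stop.
shutdown_requested = threading.Event()


def handle_sigterm(signum, frame) -> None:
    """Record the request; the main loop finishes its current unit of work first."""
    shutdown_requested.set()


def install_signal_handler() -> None:
    # Must be called from the main thread of the process.
    signal.signal(signal.SIGTERM, handle_sigterm)


def serve(do_work, poll_seconds: float = 0.1) -> None:
    """Run do_work repeatedly until termination is requested."""
    while not shutdown_requested.is_set():
        do_work()
        # Wait between iterations, waking immediately if shutdown is requested.
        shutdown_requested.wait(poll_seconds)
    # Flush buffers and close connections here before the process exits.
```

A service would call `install_signal_handler()` at startup and then `serve(...)`. Keeping shutdown work short matters because the platform forcibly stops a container that does not exit within its grace period.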

5.6 Securing Container Images

The following table provides a list of considerations when it comes to securing container images:

Consideration	Description
Image Signing	Use image signing to validate the provenance of the application software running in a deployment. For self-managed containers use a tool such as Sigstore, Notary, or Docker Content Trust to store and assure image metadata. For cloud-hosted images, see your cloud provider's documentation for signing images.
Software Bill of Materials (SBOM)	Include a software bill of materials (SBOM) with your container image to provide transparency into the components and dependencies of the image.
Scan Images for Vulnerabilities	Use a container image scanner to inspect packages and report on known vulnerabilities. Scan third-party container images as well as the ones built by your organization. Re-scan images on a regular basis to detect new vulnerabilities.
Patch Container Image	When a vulnerable package is discovered, update the container image to use a fixed version of the package. Automate the build process through a CI pipeline.
Container Image Storage	Store images in a private registry that is protected by credentials and not exposed to the internet. Minimize the impact of public images changing by referencing images from a private repository.

The following figure illustrates the system components, activities, and artifacts involved in securing container images.

A diagram illustrating a container security pipeline. It shows the flow of activities and artifacts between system components: CI/CD services, container registry, and various security tools. The pipeline includes stages for building, scanning, patching, storing, and signing container images, as well as generating Software Bill of Materials (SBOMs).

Figure 5-6 - Securing Container Images
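
One check that such a pipeline might include is a test of where each image reference comes from. The Python sketch below flags images that are not pulled from an approved private registry. It also flags references that are not pinned by digest, which is an additional assumption made here as one way to limit the impact of an image changing behind a tag. The registry host and the function name are hypothetical.

```python
def image_findings(image: str, trusted_registries: list) -> list:
    """Return reasons why an image reference does not meet the storage guidance."""
    findings = []
    trusted = any(
        image.startswith(registry.rstrip("/") + "/")
        for registry in trusted_registries
    )
    if not trusted:
        findings.append("image is not pulled from a trusted registry")
    # A digest identifies exact image content; a tag can be moved to a different image.
    if "@sha256:" not in image:
        findings.append("image is not pinned by digest")
    return findings


TRUSTED = ["registry.example.internal"]
print(image_findings("docker.io/library/nginx:latest", TRUSTED))
```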

5.7 Observability

Observability is a measure of how well internal states of a system can be inferred from knowledge of its external outputs. In the context of microservices, observability is the ability to understand the internal state of the system by observing the outputs of its components. Observability is a key requirement for microservices, as it allows you to monitor, debug, and optimize your services.

The following table lists the considerations when it comes to observability in microservices:

Consideration	Description
Health Check and Auto Healing	Expose APIs for the runtime environment to observe the container health and act accordingly. Use liveness and readiness probes to make a service more robust and resilient.
Logging	Write logs to standard output and standard error streams. Use a separate storage and lifecycle for logs.
Monitoring and Custom Metrics	Use observability metrics to determine if the production deployment is functioning correctly. Monitor application health, performance, and availability.
Tracing	Use tracing to view the latency encountered on every instrumented application or service. Implement a trace that represents the life of a request in a distributed system.
Anomaly Detection	Use AI-supported tools to detect anomalies in the system. Use anomaly detection to identify performance issues and security threats.
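
To illustrate the logging consideration, the following Python sketch writes one JSON object per log line to standard output, leaving collection, storage, and retention to the platform. The field names and the `build_logger` helper are hypothetical; the exact schema would depend on the log aggregation tooling in use.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as a single line of JSON."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


def build_logger(name: str, stream=None) -> logging.Logger:
    """Create a logger that writes structured lines to stdout by default."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonFormatter())
    # Replace existing handlers so repeated calls do not duplicate output.
    logger.handlers = [handler]
    logger.propagate = False
    return logger


log = build_logger("orders")
log.info("order %s accepted", "A-1001")
```

Writing to the standard streams keeps the container free of log files and lets the runtime attach whatever collection agent the platform provides.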

5.8 Secrets Management

Secrets management is a critical component of container security. A secret, in the context of containers, is any information that will put your organization, customers, or application at risk if exposed to an unauthorized person or entity. This includes API keys, SSH keys, and passwords.

The following table lists the considerations when it comes to secrets management in microservices:

Consideration	Description
Secure Storage	Store secrets in a secure location that is encrypted at rest. Use a secret management tool to store and retrieve secrets.
Dynamic Secret Distribution	Use dynamic secrets to automate encryption and authentication of keys.
Access Control	Implement multi-level role-based access to secrets.
Audit	Log all access to secrets.
Secret Rotation	Rotate secrets frequently to reduce the risk of compromise.
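
On the application side, the practical consequence is that a service reads secrets from its runtime environment rather than from its source code or image. The Python sketch below looks for a secret in a mounted file first and then in an environment variable, and fails loudly if neither is present. The `load_secret` function, the `/run/secrets` directory, and the `DB_PASSWORD` name are assumptions for illustration; the actual location depends on how the secret management tool delivers secrets.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def load_secret(name: str,
                env: Optional[Mapping[str, str]] = None,
                secrets_dir: str = "/run/secrets") -> str:
    """Return a secret from a mounted file or the environment, never from code."""
    env = os.environ if env is None else env
    secret_file = Path(secrets_dir) / name
    if secret_file.is_file():
        return secret_file.read_text().strip()
    value = env.get(name)
    if value:
        return value
    # Fail at startup instead of running with a missing or default credential.
    raise RuntimeError(f"secret {name!r} is not configured")


password = load_secret("DB_PASSWORD", env={"DB_PASSWORD": "example-only"})
print(len(password))  # print the length, never the value itself
```

Reading the value on each use, rather than caching it for the life of the process, makes it easier to pick up rotated secrets without a redeployment.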

Secret Management Tools

Secret management solutions fall into two broad categories:

Category	Tools
Cloud Provider Tools	AWS Secrets Manager, Google Cloud Platform KMS, Azure Key Vault
Open Source Tools	HashiCorp Vault, Bitwarden Secrets Manager

5.9 Continuous Integration/Continuous Deployment (CI/CD)

Your CI/CD pipeline plays a pivotal role in securing your microservices. The following table lists the considerations when it comes to CI/CD security:

Consideration	Description
Code Commit Signing	Sign your commits to prove that the code you submitted came from you and wasn't altered while you were transferring it.
Use Machine-to-Machine (M2M) Authentication	Secure access between your CI/CD pipeline and your secret manager, using M2M authentication such as OAuth 2.0.
Integrate Security Testing in Pipeline	Integrate security testing into your CI/CD pipeline to identify security flaws before they can be exploited.
Code analysis and scanning	Use static code analysis, dynamic analysis, and third-party dependency scanning to identify vulnerabilities.

5.10 Infrastructure as Code (IaC)

Infrastructure as Code (IaC) is the process of managing and provisioning compute infrastructure through machine-readable definition files rather than physically configuring the hardware or by using interactive configuration tools.

The following table lists the considerations when it comes to IaC security:

Consideration	Description
Use IaC Tools	Use IaC tools like Terraform, OpenTofu, Ansible, Azure Resource Manager, AWS CloudFormation, or GCP Deployment Manager to automate the provisioning of infrastructure. Use IaC tools to define and manage infrastructure in a declarative manner.
Secure IaC Configuration	Secure your IaC configuration files by using version control, access control, and encryption. Use a secret management tool to store and retrieve secrets.
Automate Security Compliance	Automate security compliance checks in your IaC pipeline to ensure that your infrastructure is secure.
