When Jordan Liggitt at Google posted details of a serious Kubernetes vulnerability in November 2018, it was a wake-up call for security teams that had adopted cloud-native infrastructure without putting security at the heart of the endeavor.
For such a significant milestone in Kubernetes history, the vulnerability didn’t have a suitably alarming name comparable to the likes of Spectre, Heartbleed or the Linux kernel’s recent SACK Panic; it was simply a CVE post on the Kubernetes GitHub repo. But CVE-2018-1002105 was a privilege escalation vulnerability that enabled a normal user to steal data from any container in a cluster. It even enabled an unauthorized user to create an unapproved service on Kubernetes, run the service in a default configuration, and inject malicious code into that service.
The first approach took advantage of pod exec/attach/portforward privileges to escalate a normal user to cluster-admin. The second was possible because a bad actor could use the Kubernetes API server – essentially the front end of Kubernetes, through which all other components interact – to establish a connection to a back-end server and then reuse that same connection. Crucially, this meant the attacker could ride on the connection’s established TLS credentials to create their own service instances.
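Because the first path hinged on exec/attach/portforward privileges, one common hardening step – a sketch of standard RBAC practice, not the official remediation, which was patching – is to grant pod read access while withholding exactly those subresources. The namespace and role names here are illustrative:

```yaml
# Hypothetical RBAC Role: read-only pod access in one namespace,
# deliberately omitting the exec/attach/portforward subresources
# that the first attack path relied on.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: dev
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods", "pods/log"]
  verbs: ["get", "list", "watch"]
  # no "pods/exec", "pods/attach" or "pods/portforward" granted
```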
This was perfect privilege escalation in action: any requests were made through an established and trusted connection, and therefore appeared in neither the Kubernetes API server audit logs nor its server log. While they were theoretically visible in kubelet or aggregated API server logs, they looked no different from authorized requests, blending in seamlessly with the constant stream of traffic.
Of course, open source versions of Kubernetes were patched quickly for this vulnerability, and cloud service providers sprang into action to patch their managed services, but this was the first time Kubernetes had experienced a critical vulnerability. It was also, as Jordan Liggitt noted in his CVE post at the time, notable in that there was no way to detect who had exploited the vulnerability, or how often.
Unfortunately, this CVE also highlighted how unprepared many traditional enterprise IT organizations were when it came to applications housed in containers. Remediation required an immediate update to Kubernetes clusters, but Kubernetes isn’t backward-compatible with every previous release. Some organizations therefore faced two issues: not only did they have to provision new Kubernetes clusters, but they also found their applications no longer worked.
The rise of containers for apps – with their clever use of namespaces and cgroups, which respectively limit what system resources you can see and what you can use – has ushered in an era of hyper-scale and flexibility for enterprises.
According to Sumo Logic’s Continuous Intelligence Report, which is derived from data on 2,000 companies, the use of Docker containers in production among enterprises has grown from 18 per cent in 2016 to almost 30 per cent in 2019. Docker owes much of its success to Kubernetes. The platform, born of Google’s internal Borg project and open-sourced for all to use, has abstracted away much of the complexity of managing thousands of containers. However, it has created security challenges of its own.
Since this high-profile vulnerability, other Kubernetes flaws have been found, each exposing previously unnoticed gaps in how companies apply security to their container-based applications. There was the runc container exploit in February, which allowed a malicious container to overwrite the runc binary and gain root on the container host. This was followed by an irritating – though limited by authorization – denial-of-service (DoS) flaw that exploited patch requests.
The most recent vulnerability, uncovered by StackRox, was another DoS vector, this time hitting the Kubernetes API server. It exploited the way kube-apiserver parses YAML manifests: because kube-apiserver performed no input validation on manifests and applied no manifest file size limit, it was susceptible to the unfunny “Billion Laughs” DoS attack.
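To see why a tiny manifest can be so punishing, consider the shape of a “Billion Laughs” document: each line defines a YAML anchor that aliases the previous anchor many times, so a naive parser materializes exponentially many nodes. This is an illustrative sketch (the anchor names such as `a0` are invented for the example), not the actual exploit payload:

```python
# Sketch of the "Billion Laughs" shape: each level aliases the
# previous level `fanout` times, so the file stays tiny while the
# fully expanded document grows exponentially.
def billion_laughs_yaml(levels: int, fanout: int = 9) -> str:
    lines = ['a0: &a0 "lol"']
    for i in range(1, levels + 1):
        refs = ", ".join(f"*a{i - 1}" for _ in range(fanout))
        lines.append(f"a{i}: &a{i} [{refs}]")
    return "\n".join(lines)

def expanded_leaves(levels: int, fanout: int = 9) -> int:
    # number of "lol" scalars a naive parser would materialize
    return fanout ** levels

doc = billion_laughs_yaml(9)
print(len(doc.splitlines()))   # 10 lines of input...
print(expanded_leaves(9))      # ...387,420,489 scalars once expanded
```

A ten-line file expanding to hundreds of millions of nodes is why an input size limit alone is insufficient – the defense also has to bound alias expansion during parsing.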
Container security requires continuous security
Among the lessons to be learned from the growing number of issues discovered in Kubernetes is that there will be more, and that they will surface across the different stages of the software development lifecycle (SDLC). In other words, Kubernetes is just like any other new, critical infrastructure component introduced into an application development environment.
Discovering and addressing this new class of vulnerabilities will require continuous security monitoring across development, test and production environments. It will also require collaboration and integrated workflows between previously siloed teams, from initial planning and coding all the way through testing and into production. Many use the term DevSecOps to describe this evolution of the DevOps transition that often accompanies modern application development with containers and orchestration.
Choosing a common analytics platform for your DevSecOps projects can result in substantial operational savings while also providing the fabric to deal with the unique security challenges of containers. For example, integrated insight across the tool chain and technology stacks can be leveraged to pinpoint infected nodes, run compliance checks that pick up anonymous access to the API, and apply run-time defenses for containers. In many cases, container security tooling will automatically detect and stop unusual binaries or behavior, for instance attempts to access the API from an application within a compromised container.
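A compliance check for anonymous API access can be as simple as scanning audit events for the built-in anonymous identity. This is a minimal sketch assuming Kubernetes audit events arrive as JSON Lines; the field names follow the audit event schema, but the log source and surrounding pipeline are left out:

```python
import json

# Flag audit events made by Kubernetes' built-in anonymous identity
# ("system:anonymous"), returning the verb and request path of each hit.
def anonymous_requests(audit_lines):
    hits = []
    for line in audit_lines:
        event = json.loads(line)
        user = event.get("user", {}).get("username", "")
        if user == "system:anonymous":
            hits.append((event.get("verb"), event.get("requestURI")))
    return hits

sample = [
    '{"user": {"username": "system:anonymous"}, "verb": "get", "requestURI": "/api/v1/pods"}',
    '{"user": {"username": "jane"}, "verb": "list", "requestURI": "/api/v1/nodes"}',
]
print(anonymous_requests(sample))  # [('get', '/api/v1/pods')]
```

In a real deployment this filter would run continuously inside the analytics platform rather than as a one-off script, feeding alerts back into the same workflow as the rest of the tool chain.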
Building, running and securing containerized apps in this DevSecOps model requires a new approach to the core visibility, detection and investigation workflows that make up the defense. DevSecOps requires tools that supply deep visibility into your systems and can identify, investigate and prioritize security and compliance threats across the SDLC. This level of observability comes from integrated, large-scale, real-time analytics aggregated from structured and unstructured data across all the systems in the complex SDLC tool chain.
While straightforward as a strategy, its execution is often frustrated by fragmented analysis tools across logs, metrics, tracing, application performance, code analysis, integrated testing, runtime testing, CI/CD and more. This often leaves teams juggling several products to connect the dots between, for example, low-level Kubernetes issues and their potential impact on security at the application layer. Traditional analytics tools often lack the basic scale and ingestion capacity to integrate the data; equally important, they also lack the native understanding of these modern data sources required to generate insight without excessive programming or human analysis.
Even when adopting a smaller set of application development and testing platforms with the scale and insight required, DevSecOps needs capabilities specifically designed for the container/orchestration problem space. First, from a discoverability standpoint, the platform must provide multiple views of the data for situational awareness. For example, visual representations of both the low-level infrastructure and the higher-level service view help connect the macro and micro security pictures. From an observability standpoint, the system must also integrate with the wide array of tools that handle various aspects of collection and detection (such as Prometheus, Fluentd and Falco).
Metadata in Kubernetes, in the form of labels and annotations, is used for organizing and understanding the way containers are orchestrated, so leveraging this to gain security insight with automated detection and tagging is an important capability. Finally, the system needs to assimilate the insight and data from the various discrete container security systems to provide a comprehensive view.
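Automated tagging from Kubernetes metadata can start very simply: group workloads by a label and flag anything that escapes the scheme. This sketch assumes pod metadata in the same shape as a Pod manifest; the label key and pod names are invented for illustration:

```python
# Group pod names by the value of a metadata label (e.g. the common
# "app" label), bucketing pods that lack the label so gaps in the
# tagging scheme are visible rather than silently dropped.
def index_pods_by_label(pods, label_key):
    index = {}
    for pod in pods:
        meta = pod.get("metadata", {})
        value = meta.get("labels", {}).get(label_key, "(unlabeled)")
        index.setdefault(value, []).append(meta.get("name"))
    return index

pods = [
    {"metadata": {"name": "web-1", "labels": {"app": "web"}}},
    {"metadata": {"name": "web-2", "labels": {"app": "web"}}},
    {"metadata": {"name": "job-1", "labels": {}}},
]
print(index_pods_by_label(pods, "app"))
# {'web': ['web-1', 'web-2'], '(unlabeled)': ['job-1']}
```

The "(unlabeled)" bucket is the security-relevant part: workloads outside the labeling convention are exactly the ones automated detection and tagging would otherwise miss.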
All of these dimensions of integration – data, analytics and workflow – demand continuous security intelligence applied across the SDLC. Securing containers and orchestration, and more broadly the entire modern application stack, cannot afford the planning and production delays that come from connecting dozens of fragmented analytics tools.
At a higher level, securing the modern application stack also can’t depend on slowly integrating data, analysis and conclusions across the functional owners of these many tools (security, IT operations, application teams, DevOps and so on). Continuous intelligence from an integrated analytics platform can break down these silos and be a critical element of securing containerized applications in a DevSecOps model.