by Josh Stella

7 ways to avoid a cloud misconfiguration attack

feature
Apr 19, 2022 | 13 mins
Application Security, Cloud Computing, Cloud Security

Cloud security is all about configuration. Here’s how to make sure the configurations of your cloud resources are correct and secure, and how to keep them that way.

Cloud engineering and security teams need to ask some important questions about the security of their cloud environments, and they must go well beyond whether or not environments are passing compliance audits.

Within minutes of your adding a new endpoint to the internet, a potential attacker has scanned it and assessed its exploitability. A single cloud misconfiguration could put a target on your organization’s back—and put your data at risk.

Assume for a moment that an attacker finds one of these vulnerabilities and gains an initial foothold in your environment. What is the blast radius of this penetration? What kind of damage could they do?

How easy would it be for an attacker to discover knowledge about your environment and where you store sensitive data? Could they leverage cloud resource API keys and overly permissive IAM (identity and access management) settings to compromise your cloud control plane and gain access to additional resources and data? Might they be able to extract that data into their own cloud account without detection, such as with a storage bucket sync command?

Look deeper, and chances are you’re not going to like what you find. Take swift action to close these gaps in your cloud security before hackers can exploit them. And also recognize that cloud configuration “drift” happens all the time, even when automated CI/CD pipelines are used, so you need to stay vigilant. A cloud environment that’s free of misconfiguration today won’t likely stay that way for long.

Cloud security is configuration security

The cloud is essentially a giant programmable computer, and cloud operations are focused on the configuration of cloud resources, including security-sensitive resources such as IAM, security groups, and access policies for databases and object storage. You need to make sure the configurations of your cloud resources are correct and secure on day one and that they stay that way on day two.

Industry analysts call this cloud security posture management (CSPM). And this is what cloud customers tend to get wrong all the time, sometimes with devastating consequences. If you see a data breach involving Amazon Web Services, Microsoft Azure, or Google Cloud, it’s a solid assumption that the attack was made possible due to cloud customer mistakes.

We tend to focus a lot on avoiding misconfiguration for individual cloud resources such as object storage services (e.g., Amazon S3, Azure Blob) and virtual networks (e.g., AWS VPC, Azure VNet), and it’s absolutely critical to do so.

But it’s also important to recognize that cloud security hinges on identity. In the cloud, many services connect to each other via API calls, requiring IAM services for security rather than IP-based network rules, firewalls, etc.

For instance, a connection from an AWS Lambda function to an Amazon S3 bucket is accomplished using a policy attached to a role that the Lambda function takes on—its service identity. IAM and similar services are complex and feature rich, and it’s easy to be overly permissive just to get things to work, which means that overly permissive (and often dangerous) IAM configurations are the norm.
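As a rough illustration of what “overly permissive” looks like in practice, here is a minimal sketch that scans an IAM-style policy document for Allow statements with wildcard actions or resources. The function names, the bucket name, and the two sample policies are hypothetical, and a real review would also need to account for conditions, NotAction, and resource-based policies.

```python
from typing import Any


def _as_list(value: Any) -> list:
    """IAM policy JSON allows a single value or a list for these fields."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def find_overly_permissive_statements(policy: dict) -> list[dict]:
    """Return Allow statements that use a wildcard action or resource."""
    findings = []
    for statement in _as_list(policy.get("Statement")):
        if statement.get("Effect") != "Allow":
            continue
        actions = _as_list(statement.get("Action"))
        resources = _as_list(statement.get("Resource"))
        wildcard_action = any(a == "*" or a.endswith(":*") for a in actions)
        wildcard_resource = any(r == "*" for r in resources)
        if wildcard_action or wildcard_resource:
            findings.append(statement)
    return findings


# Hypothetical role policies for a Lambda function that reads one bucket.
too_broad = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": "s3:*", "Resource": "*"}],
}
scoped = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::example-reports-bucket/*",
        }
    ],
}

print(len(find_overly_permissive_statements(too_broad)))  # 1
print(len(find_overly_permissive_statements(scoped)))     # 0
```

The first policy “gets things to work” but hands the function every S3 action on every resource. The second grants only the read access the function needs.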

Cloud IAM is the new network, but because cloud IAM services are created and managed with configuration, cloud security is still all about configuration—and avoiding misconfiguration.

Cloud misconfigurations and security incidents

There are far more kinds of cloud infrastructure than there were in the data center, and all of those resources are completely configurable—and misconfigurable. Take into account all of the different types of cloud resources available, and the ways they can be combined together to support applications, and the configuration possibilities are effectively infinite.

In our 2021 survey, 36% of cloud professionals said their organization suffered a serious cloud security leak or breach in the past year. And there are a number of ways these incidents become possible.

[Figure: cloud misconfiguration 01. Source: The State of Cloud Security 2021 Report]

Keep in mind that the configurations of resources such as object storage and IAM services can get extremely complex in scaled-out environments, and every cloud breach we’re aware of has involved a chain of misconfiguration exploits. Rather than focusing solely on single resource misconfigurations, it’s essential to thoroughly understand your use case and think critically about how to secure these services in the full context of your environment.

For instance, you may believe your Amazon S3 bucket is configured securely because “Block Public Access” is enabled, when a malicious actor may be able to access its contents by leveraging over-privileged IAM resources in the same environment. Understanding your blast radius risk can be a hard problem to solve, but it’s a problem that can’t be ignored.
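One way to reason about blast radius is to model “who can reach what” as a graph and ask what is reachable from a single compromised resource. The sketch below does that with a breadth-first search. The resource names and the edges between them are invented for illustration, and building such a graph from real IAM and network data is the hard part that this example skips.

```python
from collections import deque


def blast_radius(access_graph: dict[str, set[str]], foothold: str) -> set[str]:
    """Return every node reachable from the foothold, excluding the foothold."""
    seen = {foothold}
    queue = deque([foothold])
    while queue:
        node = queue.popleft()
        for neighbor in access_graph.get(node, set()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen - {foothold}


# Hypothetical environment: an edge means "can use, assume, or read".
access_graph = {
    "vm:web-server": {"role:web-app"},
    "role:web-app": {"bucket:static-assets", "role:ops-automation"},
    "role:ops-automation": {"bucket:customer-data"},
    "vm:batch-worker": {"bucket:static-assets"},
}

print(sorted(blast_radius(access_graph, "vm:web-server")))
print(sorted(blast_radius(access_graph, "vm:batch-worker")))
```

In this made-up environment the customer data bucket is never public, yet a foothold on the web server reaches it through two chained role permissions.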

The scale of cloud misconfiguration

Cloud misconfiguration vulnerabilities are different from application and operating system vulnerabilities in that they keep popping up even after you’ve fixed them. You likely have controls in place in your development pipeline to make sure developers don’t deploy known application or operating system vulnerabilities to production. And once these deployments are secured, it’s generally a solved problem.

Cloud misconfiguration is different. It’s commonplace to see the same misconfiguration vulnerability appear over and over again. A security group rule allowing for unrestricted SSH access (e.g., 0.0.0.0/0 on port 22) is just one example of the kind of misconfigurations that occur on a daily basis, often outside of the approved deployment pipeline. We use this example because most engineers are familiar with it (and have likely committed this egregious act at some point in their career).
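A check for this particular mistake can be sketched in a few lines. The example below assumes security group data shaped like the response from the EC2 DescribeSecurityGroups API (IpPermissions, IpRanges, and so on). The sample groups are hypothetical, and the function inspects plain dictionaries rather than calling any cloud API.

```python
def allows_unrestricted_ssh(security_group: dict) -> bool:
    """True if any ingress rule opens TCP port 22 to the whole internet."""
    for permission in security_group.get("IpPermissions", []):
        protocol = permission.get("IpProtocol")
        if protocol == "-1":
            # "-1" means all protocols and all ports.
            covers_ssh = True
        elif protocol == "tcp":
            low = permission.get("FromPort", 0)
            high = permission.get("ToPort", 65535)
            covers_ssh = low <= 22 <= high
        else:
            covers_ssh = False
        if not covers_ssh:
            continue
        open_v4 = any(
            r.get("CidrIp") == "0.0.0.0/0" for r in permission.get("IpRanges", [])
        )
        open_v6 = any(
            r.get("CidrIpv6") == "::/0" for r in permission.get("Ipv6Ranges", [])
        )
        if open_v4 or open_v6:
            return True
    return False


# Hypothetical security groups for illustration.
open_group = {
    "GroupId": "sg-example-open",
    "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}
    ],
}
restricted_group = {
    "GroupId": "sg-example-restricted",
    "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "203.0.113.0/24"}]}
    ],
}

print(allows_unrestricted_ssh(open_group))        # True
print(allows_unrestricted_ssh(restricted_group))  # False
```

Because this misconfiguration keeps reappearing, a check like this is only useful if it runs continuously against the live environment, not once at deployment.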

Because cloud infrastructure is so flexible and we can change it at will using APIs, we tend to do that a lot. This is a good thing, because we’re constantly innovating and improving our applications and need to modify our infrastructure to support that innovation. But if you’re not guarding against misconfiguration along the way, expect a lot of misconfiguration to get introduced into your environment. Half of cloud engineering and security teams are dealing with 50 or more misconfiguration incidents per day.

[Figure: cloud misconfiguration 02. Source: The State of Cloud Security 2021 Report]

Why cloud misconfiguration happens

If we’re successfully using the cloud, the only constant with our cloud environments is change, because that means we’re innovating fast and continuously improving our applications.

But with every change comes risk.

According to Gartner, through 2023 at least 99% of cloud security failures will be the customer’s fault. That 1% seems like a hedge considering cloud misconfiguration is how cloud security failures happen, and misconfiguration is 100% the result of human error.

But why do cloud engineers make such critical mistakes so frequently?

Lack of awareness of cloud security and policies was one of the top causes of cloud misconfiguration reported in the past year. Compile all of your compliance rules and internal security policies together and you probably have a volume as thick as War and Peace. No human can memorize all of that, and we shouldn’t expect them to.

So, we need controls in place to guard against misconfiguration. But 31% say their organizations lack adequate controls and oversight to prevent cloud misconfiguration mistakes.

Part of the reason for that is there are too many APIs and cloud interfaces for teams to effectively govern. Using multiple cloud platforms (reported by 45% of respondents) only exacerbates the problem as each has its own resource types, configuration attributes, interfaces to govern, policies, and controls. Your team needs expertise that effectively addresses all cloud platforms in use.

The multicloud security challenge is compounded further if teams have adopted a cloud service provider’s native security tooling, which doesn’t work in multicloud environments.

[Figure: cloud misconfiguration 03. Source: The State of Cloud Security 2021 Report]

Seven strategic recommendations

Because cloud security is primarily concerned with the prevention, detection, and remediation of misconfiguration mistakes before they can be exploited by hackers, effective policy-based automation is needed at every stage of the development life cycle, from infrastructure as code (IaC) through CI/CD to the runtime.

Below I’ve listed seven recommendations from cloud professionals to accomplish this.

1. Establish visibility into your environment.

Cloud security is about knowledge of your cloud—and denying your adversaries access to that knowledge. If you’re unaware of the full state of your cloud environment, including every resource, configuration, and relationship, you’re inviting serious risk. Establish and maintain comprehensive visibility into your cloud environment across cloud platforms and continuously evaluate the security impact of every change, including potential blast radius risks.
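Evaluating every change starts with knowing what changed. As a minimal sketch, assume you can capture periodic snapshots of your environment as a mapping from resource ID to configuration. The snapshot format and the resource IDs below are hypothetical. Comparing two snapshots then yields the resources that were added, removed, or modified, which is also a simple way to surface drift.

```python
def diff_snapshots(before: dict[str, dict], after: dict[str, dict]) -> dict[str, list[str]]:
    """Compare two {resource_id: configuration} snapshots."""
    added = sorted(after.keys() - before.keys())
    removed = sorted(before.keys() - after.keys())
    changed = sorted(
        rid for rid in before.keys() & after.keys() if before[rid] != after[rid]
    )
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical snapshots taken a day apart.
monday = {
    "bucket:reports": {"public": False, "encrypted": True},
    "sg:web": {"ssh_open_to_world": False},
}
tuesday = {
    "bucket:reports": {"public": False, "encrypted": True},
    "sg:web": {"ssh_open_to_world": True},
    "vm:debug-box": {"tags": {}},
}

print(diff_snapshots(monday, tuesday))
```

Each entry in the result is a change that should be evaluated for its security impact. Here that means a security group that was opened up and an untracked virtual machine.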

You’ll not only achieve a better security posture, but you will enable your developers to move faster, and compliance professionals will thank you for the proactive audit evidence.

2. Use infrastructure as code everywhere possible.

With few exceptions, there is no reason you should be building and modifying any cloud infrastructure outside of infrastructure as code and automated CI/CD pipelines, particularly for anything net new. Using IaC not only brings efficiency, scale, and predictability to cloud operations, but also provides a mechanism for checking the security of cloud infrastructure pre-deployment. When developers are using IaC, you can provide them with the tools they need to check the security of their infrastructure before they deploy.

If you’re operating a multicloud environment, an open source IaC tool like Terraform that has widespread adoption is probably your best bet. IaC offerings from cloud service providers (e.g., AWS CloudFormation, Azure Resource Manager, and Google Cloud Deployment Manager) are free and deserve consideration if you aren’t going to need multicloud support.
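To make the pre-deployment check concrete, here is a minimal sketch that inspects a Terraform plan exported as JSON (for example with terraform show -json) and flags database instances that would be created or updated as publicly accessible. It assumes the plan’s resource_changes layout and the publicly_accessible attribute of the AWS provider’s aws_db_instance resource. The sample plan is a hand-written fragment, not real output.

```python
import json


def publicly_accessible_databases(plan: dict) -> list[str]:
    """Return addresses of aws_db_instance resources planned as public."""
    flagged = []
    for resource_change in plan.get("resource_changes", []):
        if resource_change.get("type") != "aws_db_instance":
            continue
        # "after" is null when the resource is being deleted.
        after = resource_change.get("change", {}).get("after") or {}
        if after.get("publicly_accessible") is True:
            flagged.append(resource_change.get("address", "<unknown>"))
    return flagged


# Hand-written plan fragment for illustration only.
SAMPLE_PLAN = json.loads("""
{
  "resource_changes": [
    {"address": "aws_db_instance.orders", "type": "aws_db_instance",
     "change": {"actions": ["create"], "after": {"publicly_accessible": true}}},
    {"address": "aws_db_instance.internal", "type": "aws_db_instance",
     "change": {"actions": ["create"], "after": {"publicly_accessible": false}}},
    {"address": "aws_db_instance.legacy", "type": "aws_db_instance",
     "change": {"actions": ["delete"], "after": null}}
  ]
}
""")

print(publicly_accessible_databases(SAMPLE_PLAN))  # ['aws_db_instance.orders']
```

Run in a CI/CD pipeline, a check like this can fail the build before the misconfigured resource ever exists.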

3. Use policy-based automation everywhere possible.

Anywhere you have cloud policies expressed in human language, you’re inviting differences in interpretation and implementation errors. Every cloud security and compliance policy that applies to your cloud environment should be expressed and enforced as executable code. With policy as code, cloud security becomes deterministic. That allows security to be managed and enforced efficiently—and helps developers get security right early in the development process.

Avoid proprietary vendor policy as code tools and choose an open source policy engine such as Open Policy Agent (OPA). OPA can be applied to anything that can produce a JSON or YAML output, which covers just about every cloud use case.

Prioritize solutions that don’t require different tools and policies for IaC and running cloud infrastructure.
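The idea behind policy as code can be sketched without any particular engine. OPA policies are written in its own language, Rego. The Python below is only a stand-in to show the principle: each policy is a small function that takes a resource described as JSON-like data and returns the violations it finds. The rule names and resource fields are hypothetical.

```python
from typing import Callable

# A rule inspects one resource and returns zero or more violation messages.
Rule = Callable[[dict], list[str]]


def deny_unencrypted_storage(resource: dict) -> list[str]:
    if resource.get("type") == "storage_bucket" and not resource.get("encrypted", False):
        return [f"{resource['id']}: storage must be encrypted at rest"]
    return []


def deny_public_storage(resource: dict) -> list[str]:
    if resource.get("type") == "storage_bucket" and resource.get("public", False):
        return [f"{resource['id']}: storage must not be publicly readable"]
    return []


def evaluate(resources: list[dict], rules: list[Rule]) -> list[str]:
    """Apply every rule to every resource and collect the violations."""
    violations = []
    for resource in resources:
        for rule in rules:
            violations.extend(rule(resource))
    return violations


RULES: list[Rule] = [deny_unencrypted_storage, deny_public_storage]

# The same input could come from an IaC template or from a running environment.
resources = [
    {"id": "bucket-a", "type": "storage_bucket", "encrypted": True, "public": False},
    {"id": "bucket-b", "type": "storage_bucket", "encrypted": False, "public": True},
]

for violation in evaluate(resources, RULES):
    print(violation)
```

The same input always produces the same verdict, which is what makes the approach deterministic. Because the rules only see data, the same set can be applied to IaC before deployment and to the running environment afterward.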

4. Empower developers to build securely.

With the cloud, security is a software engineering problem more than it is a data analysis one. Cloud security professionals need engineering skills and an understanding of how the entire software development life cycle (SDLC) works, from development through CI/CD and the runtime. And developers need tools to help them get security right early in the SDLC. Make security a forethought and close companion to development, not an afterthought focused only on post-deployment issues.

Training security teams on cloud engineering practices will not only better equip them with the skills needed to defend against modern cloud threats, but also give them valuable skills and experience to help advance their careers. You’ll improve team retention and better position your organization as a desirable place to work.

The Cloud Security Masterclass series is designed to help cloud and security engineers understand cloud risks and how to think critically about securing their unique use cases.

5. Lock down your access policies.

If you don’t already have a formal policy for accessing and managing your cloud environments, now’s the time to create one. Use virtual private networks (VPNs) to enforce secure communications to critical network spaces (e.g., Amazon Virtual Private Cloud or Azure Virtual Network). Make VPN access available or required so that the team can access company resources even if they are on a less trusted Wi-Fi network.

Engineers are prone to creating new security group rules or IP whitelists so that they can access shared team resources in the cloud. Frequent audits can certify that virtual machines or other cloud infrastructure haven’t been put at additional risk. Oversee the creation of bastion hosts, lock down source IP ranges, and monitor for unrestricted SSH access.

In AWS, Azure, GCP, and other public clouds, IAM acts as a pervasive network. Follow the principle of least privilege and use tools like the Fugue Best Practices Framework to identify vulnerabilities that compliance checks can miss. Make IAM changes a part of your change management process, and make use of privileged identity and session management tools.

Adopt the “deny by default” mentality.
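In code, “deny by default” means a request is refused unless a rule explicitly allows it, and an explicit deny always wins. The sketch below is a simplified evaluator in that spirit. The statement format is hypothetical, and it uses shell-style wildcard matching from Python’s fnmatch module as an approximation, so it is not a faithful model of any provider’s IAM evaluation.

```python
from fnmatch import fnmatchcase


def is_allowed(statements: list[dict], action: str, resource: str) -> bool:
    """Deny unless a statement allows the request and none denies it."""
    allowed = False
    for statement in statements:
        if not any(fnmatchcase(action, p) for p in statement["actions"]):
            continue
        if not any(fnmatchcase(resource, p) for p in statement["resources"]):
            continue
        if statement["effect"] == "Deny":
            return False  # an explicit deny overrides any allow
        if statement["effect"] == "Allow":
            allowed = True
    return allowed  # stays False when nothing matched


# Hypothetical statements for a reporting service.
statements = [
    {"effect": "Allow", "actions": ["s3:GetObject"],
     "resources": ["arn:aws:s3:::example-bucket/*"]},
    {"effect": "Deny", "actions": ["s3:*"],
     "resources": ["arn:aws:s3:::example-bucket/secrets/*"]},
]

print(is_allowed(statements, "s3:GetObject", "arn:aws:s3:::example-bucket/report.csv"))
print(is_allowed(statements, "s3:GetObject", "arn:aws:s3:::example-bucket/secrets/key.pem"))
print(is_allowed(statements, "s3:PutObject", "arn:aws:s3:::example-bucket/report.csv"))
```

Only the first request succeeds. The second hits the explicit deny, and the third is refused simply because nothing allows it.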

6. Tag all cloud resources.

Using tags is one of the best ways to track and manage your cloud resources, but only if you establish effective tagging conventions across your entire cloud footprint and enforce them. Use resource names that are human readable and include a point of contact, project name, and deployment date for each resource.

If a cloud resource isn’t properly tagged, it should be considered highly suspect and terminated.

Microsoft Azure has a good resource on cloud resource tagging conventions.
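Enforcing a tagging convention can be as simple as checking each resource against a list of required tag keys. In the sketch below, the tag keys (owner, project, deployed_on) are one hypothetical way to encode the point of contact, project name, and deployment date, and the inventory is made up.

```python
REQUIRED_TAGS = ("owner", "project", "deployed_on")


def missing_tags(resource: dict, required: tuple[str, ...] = REQUIRED_TAGS) -> list[str]:
    """Return the required tag keys that are absent or blank on a resource."""
    tags = resource.get("tags") or {}
    return [key for key in required if not str(tags.get(key, "")).strip()]


# Hypothetical inventory.
inventory = [
    {"id": "vm-billing-api",
     "tags": {"owner": "billing-team@example.com", "project": "billing",
              "deployed_on": "2022-04-01"}},
    {"id": "vm-unknown", "tags": {"project": " "}},
    {"id": "bucket-scratch"},
]

for resource in inventory:
    gaps = missing_tags(resource)
    if gaps:
        print(f"{resource['id']} is missing tags: {', '.join(gaps)}")
```

Anything this check reports is a candidate for the “highly suspect” treatment described above.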

7. Establish your mean time to remediation (MTTR).

Measuring your risk and the effectiveness of your cloud security is how you establish where you stand and where you want to go. The most important measurement to start with is your mean time to remediation. You probably don’t know what your current MTTR is for security-critical cloud resource misconfigurations (few cloud customers do), and you should change that. Set a target MTTR measured in minutes. If automated remediation isn’t a viable option for your team and environment, tune your processes to ensure that your team can detect and remediate critical vulnerabilities before hackers can find them.
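Computing MTTR is straightforward once you record when each misconfiguration was detected and when it was fixed. The sketch below averages the remediation time over a set of incidents. The timestamps are invented, and it counts only incidents that have already been remediated.

```python
from datetime import datetime, timedelta


def mean_time_to_remediation(
    incidents: list[tuple[datetime, datetime]],
) -> timedelta:
    """Average of (remediated - detected) over remediated incidents."""
    if not incidents:
        raise ValueError("no remediated incidents to measure")
    total = sum((fixed - detected for detected, fixed in incidents), timedelta())
    return total / len(incidents)


# Hypothetical (detected, remediated) pairs for one day.
incidents = [
    (datetime(2022, 4, 19, 9, 0), datetime(2022, 4, 19, 9, 10)),
    (datetime(2022, 4, 19, 11, 30), datetime(2022, 4, 19, 11, 50)),
    (datetime(2022, 4, 19, 15, 0), datetime(2022, 4, 19, 16, 0)),
]

print(mean_time_to_remediation(incidents))  # 0:30:00
```

Tracking this number over time shows whether process changes or automation are actually shrinking the window an attacker has to work with.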

Looking ahead

As your cloud use grows, the complexity of your environment—and the challenges in keeping it secure—will grow. Hackers are using automation to identify and exploit cloud misconfiguration in minutes, so it’s critical to avoid configuration mistakes in the first place. By equipping your developers with automated tools based on policy as code, you’ll be able to scale your security efforts to meet these new challenges—without having to scale security headcount. And you’ll be able to move faster in the cloud than you ever could in the data center.

This guide was excerpted from The Engineer’s Primer on Cloud and Infrastructure as Code Security 2021.

Josh Stella is chief architect at Snyk and a technical authority on cloud security. Josh brings 25 years of IT and security expertise as founding chief technology officer at Fugue (part of Snyk), principal solutions architect at Amazon Web Services, and advisor to the U.S. intelligence community. Josh’s personal mission is to help organizations understand how cloud configuration is the new attack surface and how companies need to move from a defensive to a preventive posture to secure their cloud infrastructure. He wrote the first book on Immutable Infrastructure (published by O’Reilly), holds numerous cloud security technology patents, and hosts an educational Cloud Security Masterclass series. Connect with Josh on LinkedIn and via Fugue at www.fugue.co.

New Tech Forum provides a venue to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to newtechforum@infoworld.com.