Gaining Control Over Cloud IAM Chaos

Flying Blind Through Cloud IAM Complexity

The ephemeral and dynamic nature of cloud resources, combined with new features enabled by software-defined networking in the cloud, makes traditional security perimeters insufficient for successful risk management. Therefore, leading cloud adopters have turned their security focus to a new perimeter—identity. This is substantially different and more complex than the firewall perimeters of traditional data centers. The complexity of the cloud service provider identity and access management (IAM) tools makes it exceptionally challenging to determine who—or what—has access to a cloud resource. For example, Amazon Web Services (AWS) offers extensive policy evaluation logic, starting with a request context and then deriving all applicable policies from that. Within a single AWS account, this evaluation logic engages up to five overlapping policy layers to permit or deny access. Combine these policy layers with on-premises identity, and the result is a jumble of overlapping and often conflicting IAM privileges. 

When it comes to cloud IAM, security and operations teams are flying almost blind. Visibility drops to zero as cloud deployments grow and cloud IAM complexity increases with scale. The resulting tangle of IAM policies and rules means organizations lose the ability to assign and manage cloud least-privileged access (LPA), let alone understand the permissiveness of their cloud access. Even more important, organizations that are not fully in control of cloud IAM governance are highly vulnerable. If they experience a security incident, the lack of cloud IAM visibility makes determining the potential blast radius difficult, if not impossible.

To increase cloud identity visibility and reduce risk, security teams need to find a way to distill clarity from cloud IAM complexity. They must be able to:

  • Gain visibility of the full cloud IAM picture to assess, prioritize, and remediate improper permission combinations that grant unintended or overly permissive access;
  • Explore effective access by principal user, resource, or application;
  • Understand true access to complex IAM combinations;
  • Establish and maintain least privilege; and
  • Limit and understand the cloud security blast radius.

To paraphrase Albert Einstein, we cannot achieve clarity from cloud IAM complexity with the same level of thinking that created it. This change in thinking starts by understanding what it takes to grant access to a cloud resource.

The Current Complexity and Chaos of Cloud IAM

Whether their resources are in the cloud or on premises, most organizations have three primary IAM goals:

  1. Assessing and limiting the blast radius of a potential IAM failure;
  2. Effectively responding to IAM failures in the event of an exploit; and
  3. Establishing and maintaining control over LPA.

Even before the advent of cloud, these goals were challenging to achieve. Identity compromise and privilege escalation have been, are now, and will continue to be primary attack vectors.

Everything Has an Identity

An already tricky practice becomes nearly impossible in the face of the scale, scope, and ephemeral nature of cloud services. Every service and asset in the cloud has its own identity with multiple permission layers.

In public cloud environments like AWS, Microsoft Azure, and Google Cloud Platform, each component (e.g., VM, storage bucket, infrastructure service, serverless function) is associated with roles and permissions. Even small cloud footprints require hundreds of identity permission rules, each built by one or more cloud service provider IAM controls. These controls also overlap significantly. For example, a group policy permission may cancel, augment, or reduce an individual policy permission.

Once the security team turns all the IAM dials to set IAM policies and rules, what is left are the effective permissions. These make up the net permission set for the cloud asset or principal. Effective permissions are a powerful cloud IAM construct, but as described below, simply determining what they are is not enough for cloud IAM success. The good news is that there are ways to enrich and augment effective permissions with the right data, insights, risk context, and perspective to drive successful cloud IAM.

AWS as a Cloud IAM Reality Check

AWS offers a robust IAM feature set. As shown in the figure below, the AWS policy evaluation logic includes five policy steps plus an explicit deny. Essentially, an IAM action starts with a denial and then flows through the policy steps to make a final permit/deny decision. This five-gate model is a classic threat-based approach where every permit/deny step reduces the potential of an exploit by a malicious threat actor (e.g., outsider, insider, user, or app).

  • Explicit Deny – First, AWS evaluates all applicable IAM policies to determine if there is an explicit deny for this request. For example, there may be a policy that explicitly denies any API call from North Korea. Without an explicit deny, the first policy gate activates.
  • AWS Organizations SCPs – An AWS Organizations service control policy (SCP) limits “permissions that identity-based policies or resource-based policies grant to entities.” In other words, does the organizational unit have an applicable SCP? If there is a permit, the next gate activates. If there is no permit, then there is an implicit deny. However, if there is no SCP attached to the organizational unit, the next policy gate still activates.
  • Resource-Based Policies – This policy gate evaluates inline permissions to a resource (e.g., S3 bucket and IAM role trust policies). As with SCPs, if there is no resource-based policy, the next policy gate activates. Unlike SCPs, however, if there is a resource-based policy with a permit, access is granted without further hurdles.
  • IAM Permissions Boundaries – This policy defines the “maximum permissions that the identity-based policies can grant to an IAM entity.” A permissions boundary is a ceiling, not a grant: it caps what identity-based policies can allow but confers no access by itself. As with SCPs and resource-based policies, the next gate activates if there is no permission boundary set. If the IAM permission boundary is set and there is no permit, there is an implicit deny.
  • Session Policies – Session policies “limit the permissions that the role or user’s identity-based policies grant to the session. Session policies limit permissions for a created session, but do not grant permissions.” As with the above policy gates, if there is no session policy set, then the next gate activates. If there is a session policy and there is no permit, there is an implicit deny.
  • Identity-Based Policies – At this final gate, an implicit deny is given if there is no policy associated with the individual, or if there is a policy and no permit. Essentially, an explicit permit occurs only when there is an identity-based policy and a permit.
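
The gate sequence above can be sketched in a few lines of Python. This is a simplified illustration of the flow as described in this section, not AWS's implementation: the RequestPolicies class and evaluate function are hypothetical names, each policy layer is reduced to a plain set of allowed action strings, and wildcards, conditions, and cross-account or principal-type nuances are ignored.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class RequestPolicies:
    """Hypothetical, simplified view of the policies that apply to one request.

    Each optional field holds the set of actions that layer allows;
    None means that layer is not attached at all.
    """
    explicit_denies: Set[str]
    scp: Optional[Set[str]] = None
    resource_based: Optional[Set[str]] = None
    permissions_boundary: Optional[Set[str]] = None
    session: Optional[Set[str]] = None
    identity_based: Optional[Set[str]] = None


def evaluate(action: str, p: RequestPolicies) -> str:
    """Walk the gates in the order listed above and return the decision."""
    # An explicit deny in any applicable policy ends the evaluation.
    if action in p.explicit_denies:
        return "explicit deny"
    # SCP: if one is attached it must permit the action.
    if p.scp is not None and action not in p.scp:
        return "implicit deny (SCP)"
    # Resource-based policy: a permit grants access without further gates.
    if p.resource_based is not None and action in p.resource_based:
        return "allow (resource-based policy)"
    # Permissions boundary: if set, it must permit the action.
    if p.permissions_boundary is not None and action not in p.permissions_boundary:
        return "implicit deny (permissions boundary)"
    # Session policy: if set, it must permit the action.
    if p.session is not None and action not in p.session:
        return "implicit deny (session policy)"
    # Identity-based policy: the only remaining way to get a permit.
    if p.identity_based is not None and action in p.identity_based:
        return "allow (identity-based policy)"
    return "implicit deny (identity-based policy)"
```

Even in this reduced form, the decision depends on six inputs per request, which hints at why reasoning about the real thing by hand does not scale.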

The Cloud IAM Inferno

Working through this cloud IAM policy gate process leaves security teams with an overlapping policy stack consisting of multiple permit/deny decision points for every cloud asset. Determining the effective permissions requires conducting extensive Venn diagram analysis. Plus, this process is ongoing since every IAM setting change affects every overlapping policy permission. Given that a typical enterprise cloud footprint has hundreds of cloud assets and principals, organizations often deal with thousands of identity and access rules. Put all the Venn diagrams together, and cloud IAM quickly becomes a raging inferno of conflicting, overlapping, and continually changing policy rules.
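
One way to picture the Venn diagram analysis is as plain set arithmetic. The sketch below is an assumption-laden simplification: effective_permissions is a hypothetical helper, permissions are modeled as flat action strings, granting policies (for example, individual and group policies) are unioned, limiting layers (for example, SCPs, permissions boundaries, and session policies) are intersected, and explicit denies are subtracted last. Resource-based policies and policy conditions are left out.

```python
from typing import Iterable, Optional, Set


def effective_permissions(
    grants: Iterable[Set[str]],
    limits: Iterable[Optional[Set[str]]],
    denies: Set[str],
) -> Set[str]:
    """Net permission set for one principal under the simplified model."""
    # Union of everything any granting policy allows.
    allowed: Set[str] = set().union(*grants)
    # Each limiting layer that is actually set narrows the result.
    for limit in limits:
        if limit is not None:
            allowed &= limit
    # Explicit denies always win.
    return allowed - denies
```

Because every layer feeds the same result, changing any single input set can change the effective permissions of every principal it touches, which is why the analysis has to be repeated continuously.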

Effective Permissions Alone Are Insufficient

This cloud IAM inferno makes determining effective permissions exceedingly tricky. More importantly, even if the security team figures out how to quell the inferno and muscle through the Venn diagram analysis, effective permissions alone are insufficient to meet IAM goals. The reason is that effective permissions determine whether an actor (user or application) can access a cloud asset, not the potential impact or reach of that access.

Put another way, effective permissions are a threat-based concept, whereas a blast radius determination is a risk-based concept (i.e., involving understanding the impact of the threat). Specifically, effective permissions are missing two core components necessary to address risk:

  1. Risk context – Determining a blast radius or LPA requires an intimate understanding of an organization’s applications. While effective permissions define access to the organization’s cloud assets, they do not delineate access to its cloud applications: assets are not equivalent to applications. Applications consist of dozens or even hundreds of assets spread across multiple services. Knowing that an S3 bucket is accessible (i.e., via the effective permissions) is not enough to calculate the blast radius; the organization must also know which application contains the bucket and what business function that application serves.
  2. True identity – Cloud IAM requires a complete understanding of the actors accessing the application. Even though providers like AWS federate with enterprise directories (e.g., Active Directory, LDAP, Okta, Ping), only a subset of identity information transfers because it comes from an external system.

A New Cloud IAM View

Unfortunately, risk context and true identity are not bolt-on capabilities. Aligning risk context, true identity, and effective permissions requires reassembling an organization’s cloud IAM policy stack. Teams must first deconstruct the stack to its most basic elements (e.g., assets, permissions, rules, and accounts). Next, they match these elements to the enterprise IAM source of truth (i.e., Active Directory, LDAP, or third-party identity stores). Finally, they must match applications and their respective resources, business metadata, and historical context from a configuration management database (CMDB). The outcome shows what user/role is accessing a cloud asset (i.e., true identity) and the potential impact of that access (i.e., risk context).
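
As a rough sketch of this matching step, the Python below joins effective-permission pairs with two lookup tables. Everything here is hypothetical: the EnrichedAccess record, the enrich function, and the shape of the directory and CMDB mappings are illustrative assumptions, not the data model of any particular product.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class EnrichedAccess:
    person: str        # true identity, from the enterprise directory
    principal: str     # cloud principal (role, user, federated user)
    asset: str         # cloud asset the principal can reach
    application: str   # risk context, from the CMDB
    criticality: str   # business metadata, from the CMDB


def enrich(
    effective: Iterable[Tuple[str, str]],   # (principal, asset) pairs
    directory: Dict[str, str],              # principal -> person
    cmdb: Dict[str, Tuple[str, str]],       # asset -> (application, criticality)
) -> List[EnrichedAccess]:
    """Attach true identity and risk context to each effective permission."""
    rows = []
    for principal, asset in effective:
        # Gaps are kept visible rather than dropped, since an unmapped
        # principal or asset is itself a finding worth reviewing.
        person = directory.get(principal, "unknown")
        application, criticality = cmdb.get(asset, ("unmapped", "unknown"))
        rows.append(EnrichedAccess(person, principal, asset, application, criticality))
    return rows
```

The design choice worth noting is that unmatched entries survive the join with placeholder values, so missing directory or CMDB data shows up in the output instead of silently shrinking it.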

Here is an illustration of how this works. InsightCloudSec's IAM Governance Module deconstructs and reconstructs the cloud IAM policy stack by creating an IAM boundary view. The IAM boundary view consists of three lenses that operations staff, analysts, incident response (IR) staff, and auditors can use to analyze and simulate their cloud IAM environment. These lenses are:

  • Principals – The federated users, IAM roles, and IAM users that define identity and access to cloud resources.
  • Applications – Critical applications identified by aligning multiple cloud assets via tagging and naming schemes.
  • Resources – The underlying resources supporting applications that define the relationships among all the cloud assets – for example, discovering which principals can access a critical S3 bucket or SNS topic. This is extremely powerful, particularly when considered within the context of compromised or “hijacked” resource-based identities, a fast-emerging cause of data breaches.

Using these lenses, teams can quickly identify all the resources a federated user can access, why that access exists, and how the user obtained it. This perspective gives the team what it needs to net out all the different permission boundaries and establish critical areas of risk and noncompliance. For example, the InsightCloudSec IAM Governance Module provides immediate answers to the following primary questions relevant to establishing a baseline assessment in the event of a cloud IAM event:

  1. What applications and resources link to a principal? In other words, which resources or groups of resources can a given principal (user) access?
  2. What applications and principals link to a resource? Based on this analysis, it is easy to determine which roles have cross-account permissions.
  3. What principals and resources link to an application? It is possible to determine who has read (or write) permission access to the application by answering this question.
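
Conceptually, the three questions are three filters over the same set of access relationships. The sketch below shows that idea only; AccessEdge and the three lens_* functions are hypothetical names for illustration and do not represent the InsightCloudSec API.

```python
from typing import Iterable, NamedTuple, Set, Tuple


class AccessEdge(NamedTuple):
    """One access relationship: a principal can reach a resource in an application."""
    principal: str
    resource: str
    application: str


def lens_principal(edges: Iterable[AccessEdge], principal: str) -> Set[Tuple[str, str]]:
    """Question 1: which applications and resources link to this principal?"""
    return {(e.application, e.resource) for e in edges if e.principal == principal}


def lens_resource(edges: Iterable[AccessEdge], resource: str) -> Set[Tuple[str, str]]:
    """Question 2: which applications and principals link to this resource?"""
    return {(e.application, e.principal) for e in edges if e.resource == resource}


def lens_application(edges: Iterable[AccessEdge], application: str) -> Set[Tuple[str, str]]:
    """Question 3: which principals and resources link to this application?"""
    return {(e.principal, e.resource) for e in edges if e.application == application}
```

Each lens answers its question from the same underlying edges, so the hard part is building an accurate edge list, not querying it.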

The Cloud IAM Lifecycle

Adopting cloud IAM boundaries sets up an organization to manage and govern cloud IAM. Given the dynamism of cloud infrastructure and the need for continuous permissions updates to manage risk, implementing cloud IAM boundaries successfully means treating the work as a lifecycle.

Step 1: Assess Risk
Understanding risk underlies successful cloud IAM. Teams can use InsightCloudSec’s IAM Governance Module, Filters, and Scorecard to assess effective permissions. Teams can then use historical data to compare current efforts to previous actions. This comparison helps to address false permission alerts (i.e., non-effective permissions) and highlight anomalous activities that could represent IAM policy risks or indicate areas of noncompliance.

Step 2: Prioritize and Remediate
Using InsightCloudSec’s IAM Governance Module’s simulation capabilities, teams can perform what-if analysis to look for potential cloud IAM issues proactively. Simulation is essential for modeling the blast radius of a possible cloud IAM exploit and identifying excessive and unused permissions that indicate permission “bloat.”
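
To make the idea of permission bloat and what-if analysis concrete, here is a minimal sketch under stated assumptions. The unused_permissions and what_if_revoke helpers are hypothetical, permissions are flat action strings keyed by principal, and the "used" data is assumed to come from some activity log; none of this reflects how any specific product performs simulation.

```python
from typing import Dict, Set

Permissions = Dict[str, Set[str]]  # principal -> set of actions


def unused_permissions(granted: Permissions, used: Permissions) -> Permissions:
    """Permissions that are granted but never observed in use (bloat)."""
    bloat: Permissions = {}
    for principal, perms in granted.items():
        extra = perms - used.get(principal, set())
        if extra:
            bloat[principal] = extra
    return bloat


def what_if_revoke(granted: Permissions, principal: str, revoked: Set[str]) -> Permissions:
    """Return a simulated copy of the grants with some actions revoked.

    The input mapping is not modified, so the result can be inspected
    before any real change is made.
    """
    simulated = {p: set(perms) for p, perms in granted.items()}
    simulated[principal] = simulated.get(principal, set()) - revoked
    return simulated
```

A team could, for example, feed the output of unused_permissions into what_if_revoke and re-run its checks on the simulated grants before changing anything in the live environment.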

Step 3: Cloud LPA
After assessing risk, prioritizing cloud IAM misconfigurations, and remediating permission bloat, teams can establish and manage LPA by setting the minimum privilege possible to achieve the organization’s risk goals. LPA is a never-ending process, requiring ongoing assessment of privilege levels against organizational roles and permissions. Teams can use InsightCloudSec’s bot automation to remediate permissions that are too restrictive or not sufficiently restrictive.

Step 4: Automate for Scalability
Finally, to address the ongoing growth of their cloud footprint, organizations must implement automated remediation of common high-risk IAM issues, such as anomalous behaviors, permission bloat, and under- or over-provisioning of LPA. This automation is essential for saving time and continually improving the organization’s risk posture, while accelerating its response to change.

Creating Clarity and Context from Chaos

In the end, operations staff, analysts, IR personnel, and even DevOps teams need to answer this simple question: what is the risk associated with access to my cloud applications and data by different users and systems?

By focusing on threats and risks, adopting an approach like the cloud IAM boundary view, and following a cloud IAM lifecycle approach, teams can cut through the complexity of current cloud provider IAM controls. This approach gives teams the clarity and context they need to answer this question confidently and easily. With precise answers, organizations can quickly determine the blast radius of an IAM incident and establish and manage LPA at scale.

The result is establishing identity as the new security perimeter in the cloud, continually identifying and reducing cloud identity risk, and ultimately decreasing the chance of breaches and their resulting damage.
