At a high level, the fundamentals of securing cloud infrastructure are similar to the fundamentals of securing an on-premises network. For example, the NIST cybersecurity framework still applies: you are trying to identify what you need to secure, protect those assets, detect malicious activity, respond to security events, and recover afterward. However, the unique aspects of a cloud environment mean that the way in which you accomplish these functions can be quite different.
Perhaps the most obvious yet misunderstood difference is that in a cloud environment, the cloud service provider (CSP) has responsibility for some elements of securing that environment. The misunderstandings occur over exactly what the CSP is responsible for. CSPs like AWS have created shared responsibility models to help clarify things. This shared responsibility model says AWS is responsible for securing the underlying infrastructure of its cloud. That means it is responsible for things like maintaining and updating hardware, as well as providing physical security for that hardware. The customer using AWS infrastructure is responsible for securing anything they put in AWS. That means they are responsible for things like updating and patching operating systems, properly configuring the AWS services they use, and controlling who has access to those services.
Should someone in your organization incorrectly assume that your cloud provider is taking care of some aspect of security, it could result in improperly secured cloud assets. Therefore, it’s very important that before anyone is given the ability to modify a cloud environment, they are first taught what the shared responsibility model is and which portions of security your organization is on the hook for.
Another unique aspect of a cloud environment is the ease with which new assets can be created and deployed. In an on-premises network, the IT and security teams have control over all new infrastructure that gets added. In a cloud network, new infrastructure can be instantly added by any person or system with the right credentials. This makes it far easier to modify a cloud network, but unless the right guardrails and monitoring are in place, it also increases the chances that new infrastructure isn’t configured securely and is thus vulnerable.
At the same time, cloud environments change very quickly. Technologies like autoscaling and serverless computing mean that assets in a cloud network are constantly appearing and disappearing. Traditional security measures like vulnerability scanning are no longer enough because a vulnerable asset might only exist for a few minutes or hours, which means the asset won’t be picked up by a weekly or even daily scan. You might think that also means the asset doesn’t live for long enough to present a risk, but data from our Project Heisenberg global honeypot network shows that new assets are scanned by malicious sources within a few hours of being spun up.
The ease of deployment and high rate of change make it very difficult for security teams to maintain a complete picture of their cloud environment. This is made worse in hybrid environments (IT environments that include both on-premises and cloud networks) and multi-cloud environments (IT environments that include cloud networks from multiple cloud providers), where different information is stored in different systems, often protected by different security tools. In these situations, the security team has to bounce back and forth between various systems to manage their security efforts. The lack of unified data makes it difficult (if not impossible) to get an accurate sense of the organization’s overall security posture or track a malicious actor that is moving between cloud and on-premises networks.
Just like an on-premises network, AWS is as secure as you make it. Above, we discussed the shared responsibility model and how AWS is responsible for security of the underlying infrastructure. The fact of the matter is AWS devotes more resources to their portion of the shared responsibility model than the majority of organizations with on-premises networks. For these organizations, moving to AWS can enhance portions of their security posture.
However, moving to AWS or any other public cloud provider also introduces new risks. As we mentioned above, cloud environments have unique challenges. You can’t simply take your existing security tactics and apply them to the cloud. Having said that, if you understand the unique aspects of cloud security and apply best practices, AWS can be as secure as (or even more secure than) an on-premises network.
Strong AWS security is important for the same reasons why cybersecurity in general is important. You need to protect your organization and your customers from malicious actors. For many organizations, the importance of having strong AWS security is increasing as they move more valuable workloads and more sensitive data to the cloud.
Strong AWS security also matters because of the reputational damage that can result from an avoidable incident. Gartner predicts that through the end of 2020, 95% of cloud security failures will be the customer’s fault. Should customers learn that an organization was compromised due to an easily avoidable error, it could shake their confidence to the point that they move their business elsewhere.
One quick thing before we get started: As you can tell from their name, AWS loves acronyms. This can create some confusion on what various AWS services do, so here’s a quick breakdown:
We discuss additional AWS services in more depth below, but we wanted to make sure you were familiar with the basics. Now then, onto best practices:
Ideally, you should start thinking through how you will secure your AWS environment before you begin adopting it. If that ship has already sailed, no worries—it just might require a bit more effort to implement some best practices.
When approaching a cloud environment for the first time, some security teams try to make the cloud mimic the on-premises environments they are used to protecting by doing things like prohibiting developers from making infrastructure changes. In almost all cases, the end result is the team gets relieved of responsibility for cloud security or engineers find ways to bypass the restrictions (see best practice No. 9 for why this is bad).
Security teams need to recognize that potentially risky aspects of the cloud, such as the rapid rate of change and the ease of deployment, are also some of the biggest benefits to using cloud infrastructure. To be successful, security teams must endeavor to be seen as enablers of the cloud. They must find ways to keep cloud infrastructure secure without overly stifling those aspects that make the cloud beneficial to the organization. This starts by adopting an open mind and recognizing that successfully managing risk in a cloud environment will require new tactics and processes.
Your security and DevOps teams should work together to define what your AWS environment should look like from a security perspective. The baseline should clearly describe everything from how assets must be configured to an incident response plan. The teams should consider using resources like the AWS Well-Architected Framework and the CIS Benchmarks for AWS as starting points. They might also want to ask for assistance from an AWS Solutions Architect, who is a technical expert skilled in helping customers construct their AWS environment.
Make sure your baseline is applied to your production environment as well as any test and pre-production environments. Reevaluate your baseline at least every six months to incorporate things like new threats and changes in your environment.
Once your security and DevOps teams have defined what your AWS security baseline looks like, you need to enforce it. Make it easy for developers to adhere to your baseline by providing them with infrastructure templates that have already been properly configured. You can do this using AWS CloudFormation or an infrastructure-as-code tool like Terraform.
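As a rough illustration of a pre-configured template, the sketch below builds a CloudFormation template for an S3 bucket with default encryption, versioning, and public access blocked. The function name `secure_bucket_template` and the choice of settings are assumptions made for this example, not a recommended baseline; your own baseline should dictate what goes in the template.

```python
import json


def secure_bucket_template(logical_id: str = "BaselineBucket") -> dict:
    """Build a CloudFormation template (as a dict) for a hardened S3 bucket.

    The settings below are illustrative choices, not a complete baseline.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Pre-approved S3 bucket (illustrative baseline)",
        "Resources": {
            logical_id: {
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    # Encrypt objects at rest by default.
                    "BucketEncryption": {
                        "ServerSideEncryptionConfiguration": [
                            {"ServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}
                        ]
                    },
                    # Block every form of public access.
                    "PublicAccessBlockConfiguration": {
                        "BlockPublicAcls": True,
                        "BlockPublicPolicy": True,
                        "IgnorePublicAcls": True,
                        "RestrictPublicBuckets": True,
                    },
                    "VersioningConfiguration": {"Status": "Enabled"},
                },
            }
        },
    }


if __name__ == "__main__":
    # Emit JSON that developers could deploy as a starting point.
    print(json.dumps(secure_bucket_template(), indent=2))
```

Generating the template from code is only one option; a hand-written YAML or JSON template checked into version control serves the same purpose.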
You also need a monitoring solution in place to detect when something is out of compliance with the baseline (either because it was deployed with a misconfiguration or because a change was made after deployment). To do this, one option is to use AWS Security Hub, but several third-party vulnerability management solutions include built-in monitoring for cloud misconfigurations. There are two benefits to using a vulnerability management (VM) solution with built-in misconfiguration detection. First, it consolidates two types of risk monitoring (asset vulnerabilities and cloud infrastructure misconfigurations) into one tool. Second, with most vulnerability management solutions, all the misconfiguration rules and detections are managed for you by the vendor, whereas with AWS Security Hub you need to set up and manage the rules yourself.
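To make the idea of checking resources against a baseline concrete, here is a minimal sketch. The rule names, the resource fields (`encrypted`, `public`, `tags`), and the function `find_violations` are all hypothetical; a real tool would pull resource configurations from AWS and ship with a far larger rule set.

```python
from typing import Any, Callable, Dict, List

Resource = Dict[str, Any]
Rule = Callable[[Resource], bool]

# Hypothetical baseline: each rule returns True when the resource complies.
BASELINE_RULES: Dict[str, Rule] = {
    "encryption-enabled": lambda r: r.get("encrypted") is True,
    "no-public-access": lambda r: r.get("public") is False,
    "owner-tag-present": lambda r: "owner" in r.get("tags", {}),
}


def find_violations(resource: Resource, rules: Dict[str, Rule] = BASELINE_RULES) -> List[str]:
    """Return the names of every baseline rule the resource fails."""
    return [name for name, passes in rules.items() if not passes(resource)]


if __name__ == "__main__":
    volume = {"id": "vol-1", "encrypted": False, "public": False, "tags": {"owner": "data"}}
    print(find_violations(volume))
```

Note that a missing field counts as a violation here, which errs on the side of flagging resources whose configuration is unknown.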
Another option for enforcing your security baseline is a Cloud Security Posture Management (CSPM) solution. A quality CSPM will have the ability to monitor accounts from multiple cloud providers for misconfigurations. This is a big deal, as it allows your organization to set one security baseline for all your cloud providers, then enforce it using a single tool. Beyond being able to monitor cloud accounts for misconfigurations, you should look for a CSPM with the ability to automatically fix misconfigurations as soon as they are detected. This will greatly reduce the burden on your security team and ensure that nothing slips through the cracks.
Other capabilities to look for in a CSPM include the ability to flag issues in infrastructure as code before anything is deployed, IAM governance (see the next section for more on IAM), and compliance auditing. CSPMs tend to be a bit pricey, but for organizations that use multiple cloud providers or who have a large number of accounts with a single provider, a CSPM is the way to turn the chaos of managing all those accounts into order.
Few things are more important to creating a secure AWS environment than restricting access to just those users and systems that need it. This is accomplished using AWS Identity and Access Management (IAM). IAM consists of the following components:
Now that you have a basic understanding of the components that make up IAM, let’s talk about best practices. AWS has a list of IAM best practices that you should read through. Similar practices are mentioned in the CIS Benchmarks for AWS. All of these best practices are important, but in the interest of brevity, we’ll call out a few of the most vital (and commonly broken) guidelines:
A lot of people don’t realize that even in the cloud, unpatched vulnerabilities still present a threat. To detect vulnerabilities in EC2 instances, you can use AWS Inspector or a third-party vulnerability management solution. Using a vulnerability management solution allows you to better prioritize your work, improve your reporting capabilities, facilitate communication with infrastructure owners, and help everyone monitor progress toward reducing risk. In addition, security teams that are dealing with a hybrid or multi-cloud environment often prefer to use a third-party solution because it allows them to oversee vulnerability and risk management for all their environments in one place (more on that in best practice item No. 9).
Although vulnerability management should be familiar to most cybersecurity professionals, there are a few unique aspects of VM in a cloud environment like AWS that you should be aware of. As we mentioned earlier, a cloud environment can quickly change. Assets appear and disappear minute-by-minute. In such a dynamic world, weekly or even daily scans aren’t enough to get an accurate understanding of vulnerabilities and your risk exposure. It’s important to have some way to make sure you have a complete picture of which EC2 instances exist, as well as a way to continuously monitor the instances throughout their lifetime. To ensure you have a complete picture of your EC2 instances, invest in a vulnerability management solution with dynamic asset discovery, which automatically detects new instances as they are deployed. A similar capability can be achieved with AWS Inspector by using CloudWatch Events, although setup is a little more manual.
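The CloudWatch Events approach relies on an event pattern that fires whenever an EC2 instance enters the `running` state. The sketch below shows such a pattern along with a deliberately simplified local matcher, so you can see which events it would select. The helper names `matches` and `new_instance_id` are hypothetical, and the matcher only handles exact values and nested objects; it is not a reimplementation of the service's matching logic.

```python
from typing import Any, Dict, Optional

# Event pattern for EC2 instances that have just started running.
EC2_RUNNING_PATTERN: Dict[str, Any] = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["running"]},
}


def matches(pattern: Dict[str, Any], event: Dict[str, Any]) -> bool:
    """Simplified matcher: lists mean 'any of these values', dicts recurse."""
    for key, expected in pattern.items():
        actual = event.get(key)
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


def new_instance_id(event: Dict[str, Any]) -> Optional[str]:
    """Return the instance ID if the event announces a newly running instance."""
    if matches(EC2_RUNNING_PATTERN, event):
        return event["detail"]["instance-id"]
    return None


if __name__ == "__main__":
    sample = {
        "source": "aws.ec2",
        "detail-type": "EC2 Instance State-change Notification",
        "detail": {"instance-id": "i-0abc", "state": "running"},
    }
    print(new_instance_id(sample))
```

In practice the instance ID returned here would be handed to whatever triggers the assessment, which is the more manual part of the setup.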
When vulnerabilities are detected in an EC2 instance, they can be addressed in several ways. One option is to use the Patch Manager in AWS Systems Manager. This approach is the most similar to how you traditionally manage vulnerabilities in an on-premises network. However, many cloud environments are designed to be immutable. In other words, assets like EC2 instances should not be changed once they’re deployed. Instead, when a change needs to be made, the existing asset is terminated and replaced with a new one that incorporates the change.
So, in immutable environments, you don’t deploy patches, but rather deploy new instances that include the patches. One way to do this is to create and maintain a base AMI that gets regularly updated to run the most recent version of whatever operating systems you’re using. With this approach, when a vulnerability is detected, you can create a new baseline AMI that incorporates patches for the vulnerability. This will eliminate the vulnerability from any future EC2 instance you deploy, but you’ll need to make sure you also redeploy any currently running EC2 instances.
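Finding the instances that still need to be redeployed can be sketched as a simple filter. The example assumes instance records shaped like the entries returned by the EC2 DescribeInstances API (`InstanceId`, `ImageId`, `State.Name`); the function name `instances_needing_redeploy` and the sample IDs are hypothetical.

```python
from typing import Any, Dict, Iterable, List


def instances_needing_redeploy(
    instances: Iterable[Dict[str, Any]], current_ami_id: str
) -> List[str]:
    """Return IDs of running instances that were launched from an older AMI."""
    return [
        inst["InstanceId"]
        for inst in instances
        if inst["State"]["Name"] == "running" and inst["ImageId"] != current_ami_id
    ]


if __name__ == "__main__":
    fleet = [
        {"InstanceId": "i-001", "ImageId": "ami-old", "State": {"Name": "running"}},
        {"InstanceId": "i-002", "ImageId": "ami-new", "State": {"Name": "running"}},
        {"InstanceId": "i-003", "ImageId": "ami-old", "State": {"Name": "terminated"}},
    ]
    print(instances_needing_redeploy(fleet, "ami-new"))
```

Only running instances are reported, since terminated instances on an old AMI no longer carry the vulnerability.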
Another option is to use an infrastructure automation tool like Chef or Puppet to update and redeploy AMIs. This approach makes sense if you are already using one of these tools to maintain your EC2 instances.
Just like any other system, you should log all activity that has occurred in your AWS environment. Not only are logs important for monitoring and compliance, but thinking back to the NIST cybersecurity framework, they are a critical part of detecting malicious activity (especially when they are fed into a modern SIEM), responding to a security event, and recovering afterward.
In AWS, most logs are captured using CloudTrail. This service automatically captures and stores AWS API activity as what AWS calls Management Events in your AWS account for no charge (although you will need to pay the cost of storage). CloudTrail captures tens of thousands of events, including critical security information like logins and configuration changes to AWS services. For a fee, you can also create “trails” in CloudTrail, which allows you to do things like capture additional activity and send your logs to S3 for long-term storage and/or export. Here are some best practices for setting up CloudTrail in your AWS account:
Although most logs are collected in CloudTrail, there are a few other logs you should make sure you capture. VPC Flow Logs show data on the IP traffic going to and from the network interfaces in your virtual private cloud (VPC). They can help you identify intra-VPC port scanning, network traffic anomalies, and known malicious IP addresses. If you use AWS Route 53 as your DNS, you should also log DNS queries. You can use these logs to match against threat intelligence and identify known-bad or quickly spreading threats. Keep in mind that you will need to use AWS CloudWatch to view your DNS logs.
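As one example of what VPC Flow Logs make possible, the sketch below parses records in the default (version 2) flow log format and flags source/destination pairs where many distinct destination ports were rejected, a crude signal of port scanning. The threshold, the function names, and the heuristic itself are assumptions for illustration; real detection would also consider time windows and accepted traffic.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# Field order of the default VPC Flow Logs record format.
FIELDS = (
    "version account_id interface_id srcaddr dstaddr srcport dstport "
    "protocol packets bytes start end action log_status"
).split()


def parse_flow_record(line: str) -> Dict[str, str]:
    """Split one space-separated flow log record into named fields."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


def suspected_port_scanners(lines: Iterable[str], min_ports: int = 3) -> List[Tuple[str, str]]:
    """Return (source, destination) pairs with rejected traffic to many ports."""
    ports_by_pair: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for line in lines:
        record = parse_flow_record(line)
        if record["action"] == "REJECT":
            ports_by_pair[(record["srcaddr"], record["dstaddr"])].add(record["dstport"])
    return sorted(pair for pair, ports in ports_by_pair.items() if len(ports) >= min_ports)


if __name__ == "__main__":
    sample = [
        f"2 123456789010 eni-0 10.0.0.5 10.0.0.9 40000 {port} 6 1 40 1418530010 1418530070 REJECT OK"
        for port in (22, 23, 3389)
    ]
    print(suspected_port_scanners(sample))
```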
Now that you know how to use logs to obtain visibility into the activity in your AWS environment, the next question is how to leverage this visibility. One (very manual) option is to use AWS CloudWatch alarms. With this approach, you build alarms for various suspicious actions such as unauthorized API calls, VPC changes, etc. A list of recommended alarms is included in the CIS Benchmarks for AWS. The challenge with this approach is that each alarm must be manually built and maintained.
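To show what an alarm for unauthorized API calls is actually looking for, here is a sketch that applies the same kind of test locally to CloudTrail-style event records. It is modeled on the error-code matching commonly used for this alarm (codes ending in `UnauthorizedOperation` or starting with `AccessDenied`); the function names and the threshold are hypothetical, and in a real deployment the matching would be done by a CloudWatch metric filter rather than Python.

```python
from typing import Any, Dict, Iterable


def is_unauthorized_call(event: Dict[str, Any]) -> bool:
    """True if a CloudTrail-style record carries an authorization error code."""
    code = event.get("errorCode", "")
    return code.endswith("UnauthorizedOperation") or code.startswith("AccessDenied")


def should_alarm(events: Iterable[Dict[str, Any]], threshold: int = 1) -> bool:
    """True when the number of unauthorized calls reaches the threshold."""
    count = sum(1 for event in events if is_unauthorized_call(event))
    return count >= threshold


if __name__ == "__main__":
    batch = [
        {"eventName": "DescribeInstances"},
        {"eventName": "RunInstances", "errorCode": "Client.UnauthorizedOperation"},
    ]
    print(should_alarm(batch))
```

Each additional alarm needs its own matching logic and threshold, which is where the maintenance burden of this approach comes from.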
Another option is to use AWS GuardDuty. GuardDuty uses CloudTrail, VPC Flow Logs, and DNS logs to detect and alert on suspicious behavior. The nice thing about GuardDuty is that it is powered by an AWS-managed list of findings (aka potential security issues), as well as machine learning. That means no manual setup or maintenance is needed to receive alerts about suspicious activity. However, detecting suspicious activity is just the first step in responding to an incident. Your security team will need to pull relevant log files and other data to verify that an incident has occurred, then determine the best way to respond and recover. If the team needs to search multiple different data sources to find this information, it can dramatically lengthen the time needed to conduct an investigation. This challenge is exacerbated in a hybrid or multi-cloud environment.
Having all relevant data automatically centralized during an investigation is just one of the reasons why many security teams decide to use a modern SIEM and incident detection tool. A good SIEM solution will have a CloudTrail integration and let you store all logs from AWS alongside logs from on-prem networks and other cloud providers like Azure and Google Cloud Platform (GCP). This ability to centralize all data can be massively helpful in speeding up investigations, especially when you need to track a malicious actor who has moved across your environments.
A good SIEM will also provide a host of other features to enhance your security team’s ability to detect, confirm, and respond to an attack. For example, the most advanced SIEMs use multiple techniques to detect suspicious behavior. Other features to look for include the ability to create custom alerts, deception technology (things like pre-built honeypots and honey users that will trigger alerts when accessed) and File Integrity Monitoring (FIM). All these capabilities provide additional layers of detection. You should also look for a SIEM that provides visualization capabilities like customizable dashboards and investigation timelines, which make your centralized data more usable. In addition, make sure any SIEM you’re considering has built-in automation, as this can dramatically reduce reaction times when an incident occurs. Finally, many teams like to use both AWS and third-party tools to secure their AWS environment, in which case it’s important to find a SIEM that includes a GuardDuty integration.
One very common mistake is to approach AWS security in a silo, separate from efforts to secure existing IT infrastructure. This creates holes that can be exploited by a malicious actor. For example, we’ve seen situations where an organization’s on-premises and AWS security were designed to address different potential threats. The resulting gaps left both networks vulnerable.
Having a single team responsible for securing all IT infrastructure ensures that no assumptions are made about what the “other” security team is or isn’t doing. Instead, there is one team that knows it is accountable for all aspects of your organization’s cybersecurity posture. Unifying your security efforts under one team can also be extremely important during an incident. The team has immediate access to far more data. It’s also much easier to maintain clarity around each team member’s area of responsibility.
Not only is it important to unify responsibility for security under one team, but it’s also important to unify all your security data in one set of tools. The vast majority of organizations are not just using AWS. At a minimum they have on-premises networks and employee endpoints to secure. In many cases, organizations also utilize multiple cloud providers. If you use different security solutions for each environment, it increases the likelihood of there being blind spots. In addition, the more tools your security team uses, the higher their workload, as they are forced to constantly bounce between tools in order to manually piece together a complete picture of the organization’s current cybersecurity posture.
With so many best practices for securing AWS, it’s not reasonable to expect everyone to remember them all. Even if they did, mistakes happen. To ensure your AWS environment continuously adheres to your security baseline, you should turn to automation. For example, you can use a combination of CloudFormation and Lambda, a tool like Terraform, or one of the more advanced CSPMs to automate deployment of new AWS infrastructure and ensure that everything complies with the baseline you’ve established. You can also have these tools automatically flag or terminate infrastructure that is not in compliance.
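The "flag or terminate" decision can be sketched as a small dispatcher. Everything here is hypothetical: the finding format, the rule names, and the `enforce` function are invented for illustration, and the `terminate` callable stands in for whatever your automation actually invokes (for instance, a Lambda function calling the EC2 API). The dry-run default reflects a cautious assumption that automatic termination should be opted into deliberately.

```python
from typing import AbstractSet, Callable, Iterable, List, Tuple

Finding = Tuple[str, str]  # (resource_id, violated_rule_name)


def enforce(
    findings: Iterable[Finding],
    terminate: Callable[[str], None],
    auto_terminate_rules: AbstractSet[str] = frozenset(),
    dry_run: bool = True,
) -> List[Tuple[str, str]]:
    """Decide what to do with each finding and return the actions taken.

    Violations of rules in auto_terminate_rules lead to termination (or a
    'would-terminate' record in dry-run mode); everything else is flagged.
    """
    actions: List[Tuple[str, str]] = []
    for resource_id, rule in findings:
        if rule not in auto_terminate_rules:
            actions.append((resource_id, "flagged"))
        elif dry_run:
            actions.append((resource_id, "would-terminate"))
        else:
            terminate(resource_id)
            actions.append((resource_id, "terminated"))
    return actions


if __name__ == "__main__":
    findings = [("i-1", "no-public-access"), ("i-2", "owner-tag-present")]
    print(enforce(findings, terminate=print, auto_terminate_rules={"no-public-access"}))
```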