Customers often ask us, “What is the best way we can improve our overall security?” At Trek10, we have developed a security audit framework for reviewing customer environments and providing our expertise on where an AWS account may deviate from best practices. We first cover all of the simple and obvious scenarios, such as whitelisting VPN-only private IP addresses for SSH/RDP access, recertifying old and potentially rogue accounts, deleting unused access keys, and locking down access to S3 buckets.
For customers who have covered all of the basics, one of our top recommendations to better secure the critical IAM environment is to implement AWS roles for all access by all IDs and services. Leveraging roles everywhere has a number of tremendous security benefits. In this three-part post, we will cover common use cases for AWS roles and how they make your account more secure -- from leveraging them for your services (e.g., EC2 instances, Lambda functions, etc.), to using them for federated access to AWS from your identity provider (e.g., Active Directory), to using them for elevated console access… roles will keep your CISO AND your auditors happy!
Introduction to Roles
Generally speaking, you can think of an IAM role as something like an IAM user ID. You give a role the same types of permissions to access AWS resources as you would a user, and you attach the same JSON policies to a role as you would a user ID. The big difference is that you do not “log in” to a role as you would an ID in the console. Instead, a role can be assumed or leveraged by multiple different users and services. For example, a user can assume a role, or an EC2 instance can be launched with the permissions of a role (so that, for example, the EC2 instance has permissions to access an S3 bucket without requiring any access keys hard-coded into the application).
The best way to understand these use cases is to dive into some more specific examples. In Part 1 of this post, we will start with an example of how user IDs can assume different types of roles from the AWS management console. In Part 2, we will review how cross-account roles work both for MSPs and Consulting Partners. Finally, in Part 3, we will review SSO access to SAML providers, as well as discuss how AWS services (e.g., EC2, Lambda, ECS, etc.) leverage roles. All role use cases allow users/services to more simply and/or securely take on permissions for access to AWS APIs.
When a user ID “assumes” a role, it takes on all the permissions associated with said role. (Of course, the user needs permissions to assume the role in the first place, based on the IAM policy associated with the user ID -- more on this later). Let’s first review each scenario in which a user ID may assume a role, then we’ll dive into each use case.
Let’s say, for example, that you leverage OpsWorks for configuration management. You have several developers who all have access to deploy applications through OpsWorks in pre-production, but you only want to give a small number of people permissions to your production OpsWorks stack. You could simply create an IAM group -- e.g., “production-access-users” -- then assign that group the appropriate permissions without using roles. Alternatively, we prefer an approach which leverages role assumption for access to the production stack. In this scenario, the user will be required to switch to a specific role in order to access the production stack.
Using roles, as opposed to assigning permissions directly to a user, has a number of advantages:
While there is no simple checkbox to require a user to use MFA to log in to AWS, with roles, you can click a checkbox to require MFA in order for a role to be assumed.
Roles help avoid careless mistakes. Requiring your users to take a separate action (i.e., assume a role) before changing a production system helps prevent slips such as editing a production stack when the user meant to edit pre-production!
Leveraging roles can lead to more robust auditing capabilities. For example, Trek10 leverages a 3rd party service, CloudCheckr, which can generate emails whenever a role is assumed for elevated permissions. We also use SumoLogic for logging API calls from CloudTrail, which gives us the ability to generate support tickets for our customers whenever a particular role is assumed.
Lastly, troubleshooting or reviewing production changes is simplified. Instead of frantically searching through CloudTrail for production changes, you can filter your logs for only those API calls that were made under the assumed role.
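To make the last point concrete, here is a minimal sketch of that kind of filter in Python. It assumes the CloudTrail records have already been loaded as dictionaries (for example, from the JSON log files CloudTrail delivers to S3). The account number, role ARN, and the two sample records are made up for illustration; only the userIdentity fields follow the CloudTrail record format.

```python
from typing import Dict, Iterable, List

# Hypothetical role ARN, used only for illustration.
ROLE_ARN = "arn:aws:iam::123456789012:role/production-opsworks-access"


def calls_made_under_role(records: Iterable[Dict], role_arn: str) -> List[Dict]:
    """Return the CloudTrail records whose caller was a session of role_arn."""
    matches = []
    for record in records:
        identity = record.get("userIdentity", {})
        # Calls made with temporary role credentials carry this identity type.
        if identity.get("type") != "AssumedRole":
            continue
        issuer = identity.get("sessionContext", {}).get("sessionIssuer", {})
        if issuer.get("arn") == role_arn:
            matches.append(record)
    return matches


# Two hand-written sample records (not real log data).
sample_records = [
    {
        "eventSource": "opsworks.amazonaws.com",
        "eventName": "UpdateApp",
        "userIdentity": {
            "type": "AssumedRole",
            "sessionContext": {"sessionIssuer": {"type": "Role", "arn": ROLE_ARN}},
        },
    },
    {
        "eventSource": "opsworks.amazonaws.com",
        "eventName": "DescribeStacks",
        "userIdentity": {
            "type": "IAMUser",
            "arn": "arn:aws:iam::123456789012:user/test-user",
        },
    },
]

production_calls = calls_made_under_role(sample_records, ROLE_ARN)
print([record["eventName"] for record in production_calls])
```

The same idea carries over to a log tool's query language: match on the identity type and the ARN of the role that issued the session, and everything else drops out.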
Next we will demo how this works in the management console.
Step 1: Create Role
The first step is to create a role that your users will use to access production systems within the OpsWorks console:
Navigate to IAM > Roles > click Create New Role. We will name our role “production-opsworks-access”.
For Role Type, select Role for Cross-Account Access, then select Provide access between AWS accounts you own (technically, you will not be “switching” AWS accounts, but this is the correct option to select. Later, we will cover the use case in which one actually does switch AWS accounts).
Enter your AWS Account Number and check Require MFA (requiring MFA is a key advantage to using roles).
Skip through the Attach Policy section by clicking Next. Rather than using AWS Managed Policies, we’ll attach an inline policy to the role.
Verify your configurations, then click Create Role.
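Behind the scenes, these console choices end up as a trust policy on the role. The sketch below builds what that policy typically looks like when the trusted account is your own and MFA is required. The account number is a placeholder, and the exact document the wizard generates in your account may differ slightly, so treat this as an approximation rather than the console's literal output.

```python
import json

ACCOUNT_ID = "123456789012"  # placeholder account number


def build_trust_policy(account_id: str) -> dict:
    """Trust policy: principals in account_id may assume the role, but only with MFA."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # ":root" here means "the account as a whole", not the root user.
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": "sts:AssumeRole",
                # Corresponds to the Require MFA checkbox.
                "Condition": {"Bool": {"aws:MultiFactorAuthPresent": "true"}},
            }
        ],
    }


trust_policy = build_trust_policy(ACCOUNT_ID)
print(json.dumps(trust_policy, indent=2))
```

Note that trusting the account does not by itself let every user in it assume the role. Each user still needs an explicit sts:AssumeRole permission, which is covered in Step 3.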
Step 2: Assign Appropriate Permissions to Role
The next step is to assign permissions to that role so that any user who assumes it will have access to the production OpsWorks stack:
Navigate back to the Roles section of IAM and click on the role that you just created. Expand the Inline Policies section and click to create an inline policy. Select custom policy. In order to create the policy, you will need the Amazon Resource Name (ARN) of your OpsWorks stack.
To find the ARN, navigate to OpsWorks > Click Stack > Click Stack Settings and copy the ARN.
Create a policy that grants all OpsWorks permissions on the production stack, using the stack ARN as the policy's resource.
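As a rough illustration of such a policy, the following Python sketch builds a document that allows every OpsWorks action on a single stack. The stack ARN is a placeholder with a made-up region, account number, and stack ID, and the policy actually used in this walkthrough may have been scoped differently.

```python
import json

# Placeholder ARN: substitute the value copied from Stack Settings.
STACK_ARN = "arn:aws:opsworks:us-east-1:123456789012:stack/EXAMPLE-STACK-ID/"


def build_stack_policy(stack_arn: str) -> dict:
    """Inline policy allowing all OpsWorks actions on one stack only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "opsworks:*",
                "Resource": stack_arn,
            }
        ],
    }


stack_policy = build_stack_policy(STACK_ARN)
# Paste the printed JSON into the custom inline policy editor.
print(json.dumps(stack_policy, indent=2))
```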
You have now created your role and assigned the appropriate permissions to it.
Step 3: Grant Users Permissions to Assume Role
Now you have created the role, but the users whom you trust to access production stacks must have the appropriate permissions to assume this role. To review this process, I have created “test-user” in our sandbox with ReadOnlyAccess and all necessary permissions to edit the pre-production OpsWorks stacks.
Right now, test-user cannot access the OpsWorks production stack because s/he only has permissions associated with the pre-production-opsworks-access-group. In order to access the production stack, s/he must switch to the production-opsworks-access role.
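In order to switch roles, the user must be given the appropriate permissions. Let’s create another inline policy for this, attached to test-user, that allows the sts:AssumeRole action on the ARN of the production-opsworks-access role.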
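A minimal sketch of that user-side policy, again built in Python so it can be printed and pasted into the inline policy editor. The account number in the role ARN is a placeholder; the role name is the one created in Step 1.

```python
import json

# Placeholder account number; the role name comes from Step 1.
ROLE_ARN = "arn:aws:iam::123456789012:role/production-opsworks-access"


def build_assume_role_policy(role_arn: str) -> dict:
    """Inline policy letting the attached user assume exactly one role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sts:AssumeRole",
                "Resource": role_arn,
            }
        ],
    }


assume_role_policy = build_assume_role_policy(ROLE_ARN)
print(json.dumps(assume_role_policy, indent=2))
```

Listing the specific role ARN as the resource, rather than a wildcard, keeps the user from assuming any other role in the account.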
Now, test-user has the necessary permissions to assume the role that will provide access to the production OpsWorks stack.
Finally, let’s test out the role! Before assuming the role, you can see below that test-user has no access to the production OpsWorks stack.
Next, let’s assume the production-opsworks-access role. While signed in as test-user, click the account name in the navigation bar, then click Switch Role. Remember, MFA is required! If you try to switch roles without setting up MFA, access will be denied.
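Switch Role in the console has a programmatic counterpart in the STS AssumeRole API. The sketch below is one way to make the same switch from Python with boto3, assuming boto3 is installed and test-user’s access keys are configured. The account number, MFA device ARN, session name, and token code are all placeholders, and the helper function names are ours, not part of any library.

```python
def build_assume_role_request(role_arn: str, session_name: str,
                              mfa_serial: str, token_code: str) -> dict:
    """Assemble the keyword arguments for the STS assume_role call."""
    return {
        "RoleArn": role_arn,
        "RoleSessionName": session_name,  # shows up in CloudTrail records
        "SerialNumber": mfa_serial,       # ARN of the user's MFA device
        "TokenCode": token_code,          # current code from that device
    }


def assume_role_with_mfa(request: dict) -> dict:
    """Call STS and return temporary credentials (requires boto3 and AWS access)."""
    import boto3  # imported lazily so the sketch loads without boto3 installed

    response = boto3.client("sts").assume_role(**request)
    return response["Credentials"]


# Placeholder values for illustration only.
request = build_assume_role_request(
    role_arn="arn:aws:iam::123456789012:role/production-opsworks-access",
    session_name="test-user-production",
    mfa_serial="arn:aws:iam::123456789012:mfa/test-user",
    token_code="123456",
)
print(sorted(request))
```

Calling assume_role_with_mfa(request) with real values returns an access key ID, secret access key, and session token that expire after the session duration. If the MFA serial number and token code are left out, the MFA condition on the role is not satisfied and the call is denied, just like in the console.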
Now that we have successfully switched roles, let’s take a look at the OpsWorks production stack. As you can see, when switching to the production-opsworks-access role, test-user can now deploy an app.
Theoretically, different roles can be used for all types of access, which adds a layer of forced MFA security and auditing capabilities. For example, AWS IDs can be created that have no permissions at all, except for a policy that allows the IDs to assume different roles for access to different infrastructure.
OpsWorks does simplify the ability to segregate access to different environments, but many of our customers leverage other DevOps tools, such as Docker on EC2 Container Service. Outside of OpsWorks, defining access to different environments is not as straightforward. For these use cases, the same type of restrictions can be placed on infrastructure with environment tagging and resource-level permissions. Then, roles can be leveraged in a similar fashion as described in this blog post. For more information on resource-level permissions, check out this AWS blog post.
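As a hedged sketch of the tagging approach, the policy below allows a few EC2 instance actions only on instances carrying a particular tag. The tag key and value (“environment” and “production”) and the choice of actions are assumptions for illustration. Which actions support resource-level permissions and tag conditions varies by service, so check the documentation for the actions you need.

```python
import json


def build_tag_scoped_policy(tag_key: str, tag_value: str) -> dict:
    """Policy allowing instance start/stop/reboot only where the tag matches."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "ec2:StartInstances",
                    "ec2:StopInstances",
                    "ec2:RebootInstances",
                ],
                "Resource": "arn:aws:ec2:*:*:instance/*",
                # Only instances tagged tag_key=tag_value are covered.
                "Condition": {
                    "StringEquals": {f"ec2:ResourceTag/{tag_key}": tag_value}
                },
            }
        ],
    }


# Hypothetical tagging scheme: environment=production.
production_policy = build_tag_scoped_policy("environment", "production")
print(json.dumps(production_policy, indent=2))
```

Attached to a production role like the one built above, a policy of this shape gives the same pattern outside OpsWorks: day-to-day users work in pre-production, and production-tagged resources are reachable only after an MFA-protected role switch.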
Questions/comments? Looking for a security audit? Feel free to contact us at firstname.lastname@example.org.