Better Access Control in AWS

The benefits and drawbacks of standing versus just-in-time access control.

In a maturing startup, compliance programs like SOC 2 will require the engineering team to take a hard look at who can access what parts of the production systems. You’ll need to consider the risk and reward of having many engineers with access to many systems. I’m a strong proponent of building a culture of ownership and putting all your engineers on-call, and being on-call to fix production incidents does require access to production.

Your solution to this access control problem should facilitate this culture of ownership, avoid disempowering engineers, and also help manage the risk of too much access. A good access-control story will also help with SOC 2 compliance. Read on to learn about the relative benefits and drawbacks of just-in-time (JIT) access approvals and tightly scoped standing access. 

What is access control?

Access control is how an organization determines who has access to what parts of its data, applications, and resources. There are several common types of access control, such as role-based access control (RBAC), where a job role determines the access. Discretionary access control (DAC) is another common form, where the resource owner can decide who has access. Microsoft has a great explainer here.

Just in Time access

Just in Time access, or JIT access, is a security practice where access is granted to applications or systems on request. Generally, there is some kind of system or portal where engineers can request access to things like AWS infrastructure. The access request goes to someone who approves or denies the request. I was this approver at Segment when we implemented JIT access control.

Depending on the implementation and company policy, access is either granted for a limited period and expires automatically unless renewed, or it stands until someone revokes it.

In the first case, time-limited access helps avoid long-running access that is no longer needed. It also helps ensure people don’t have more access than their jobs require, because access has to be renewed, and that usually happens only when it’s still needed.

💡
Substrate: The Right Way to AWS
Substrate is a CLI tool that helps teams build and operate secure, compliant, isolated AWS infrastructure. From developers who have been there.

A time-bound access system also removes the need to revoke access when people move to different teams or leave the company, since the access simply expires.
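To make the expiry behavior concrete, here is a minimal sketch of a time-bound grant. The `Grant` record, the `approve` function, and the 8-hour default are all assumptions made for illustration, not part of any particular JIT product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Grant:
    """A hypothetical record of one approved access request."""
    user: str
    resource: str
    expires_at: datetime

    def is_active(self, now: datetime) -> bool:
        # No revoke step: the grant simply stops being valid at expires_at.
        return now < self.expires_at


def approve(user: str, resource: str, now: datetime,
            ttl: timedelta = timedelta(hours=8)) -> Grant:
    """Approve a request for a limited time (the 8-hour default is an assumption)."""
    return Grant(user=user, resource=resource, expires_at=now + ttl)


start = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
grant = approve("alice", "prod-db", start)
active_after_1h = grant.is_active(start + timedelta(hours=1))  # True
active_after_9h = grant.is_active(start + timedelta(hours=9))  # False
```

Renewal in this model is just another call to `approve`, which is why access that nobody asks to renew goes away without any extra work.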

In the second case, access stays granted until it’s revoked. Requesting, renewing, and approving access is less work this way, but you also need revoke and audit steps.

The problem with Just in Time access

The first issue is that you need to set up an approval process, and someone needs to be on the hook, and possibly on-call, to approve access requests. That can quickly feel like busywork.

The approver will need to ensure a few things. Does this person really need the access they are asking for? When would you deny an access request? In a well-functioning engineering organization, you want a bias towards enabling people to do their work, without putting up too many roadblocks. There is almost never a valid reason to deny access to an on-call engineer when they need to fix a problem.

You also need to verify that the person who requested the access is who they say they are. An attacker could have gotten into an employee’s computer or stolen their credentials. In the early days of Square, most of the team worked in one big office, so they could, and would, literally walk over and ask the person if they really did make that access request. Slack had Yubikeys wired into Slack bots for acknowledging sensitive activities out of band.

However, many companies are much more distributed now, and an attacker could even modify the employee directory to route verification calls to themselves. I could continue down this rabbit hole by mentioning a recent deepfake scammer who stole $25M from a multinational company. The point is that the verification step can also be attacked, and it’s easy to just skip it. It’s best to do some form of verification to make it more expensive for any attacker to escalate their access, but it’s never perfect.

But what if there’s an emergency where an on-call engineer needs access to a system that they don’t normally use, and the approver is not available? For such instances, plan for a system where the engineer can grant themselves access, and then the approval can be retroactive for compliance.
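One way to picture this emergency path is a log of self-granted access that someone else must review afterwards. The sketch below assumes a simple in-memory log; `BreakGlassLog`, its method names, and the rule that engineers cannot review their own events are illustrative choices, not a prescribed design.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class BreakGlassEvent:
    """A hypothetical record of one emergency self-grant."""
    user: str
    resource: str
    reason: str
    granted_at: datetime
    reviewed_by: Optional[str] = None  # filled in retroactively


@dataclass
class BreakGlassLog:
    events: List[BreakGlassEvent] = field(default_factory=list)

    def self_grant(self, user: str, resource: str, reason: str,
                   now: datetime) -> BreakGlassEvent:
        # Access is never blocked on an approver, but a reason is mandatory.
        if not reason.strip():
            raise ValueError("a reason is required for emergency access")
        event = BreakGlassEvent(user, resource, reason, now)
        self.events.append(event)
        return event

    def pending_review(self) -> List[BreakGlassEvent]:
        # Everything here still needs a retroactive approval for compliance.
        return [e for e in self.events if e.reviewed_by is None]

    def review(self, event: BreakGlassEvent, reviewer: str) -> None:
        if reviewer == event.user:
            raise ValueError("engineers cannot review their own emergency access")
        event.reviewed_by = reviewer


log = BreakGlassLog()
now = datetime(2024, 1, 1, 3, 0, tzinfo=timezone.utc)
event = log.self_grant("alice", "billing-production", "paged for billing outage", now)
pending_before = len(log.pending_review())  # 1
log.review(event, "bob")
pending_after = len(log.pending_review())   # 0
```

The engineer is unblocked immediately, and the list returned by `pending_review` becomes the approver’s follow-up queue.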

This type of Just in Time access often emerges in organizations that don’t have proper isolation in their production environment. They may have one big AWS account, and every engineer in the company needs to have access to it so they can do on-call and deploy changes to production. The lack of isolation is the true problem.

Standing access

Tightly scoped standing access, with temporary local credentials, is a viable and often better alternative to JIT access. You need to scope access down to specific environments and services and log all access. You can even alert on unexpected access and unexpected access escalations.
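As a rough illustration of alerting on unexpected access, the sketch below compares access log entries against a set of expected combinations. The `AccessEvent` shape, the account and role names, and the idea of keeping an explicit expected set are assumptions; a real setup would read from your own audit logs.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple


class AccessEvent(NamedTuple):
    """One entry from a hypothetical access log."""
    user: str
    account: str
    role: str


def unexpected_access(
    events: Iterable[AccessEvent],
    expected: Set[Tuple[str, str, str]],
) -> List[AccessEvent]:
    """Return the events that are not in the expected (user, account, role) set."""
    return [e for e in events if tuple(e) not in expected]


expected = {
    ("alice", "payments-production", "developer"),
    ("bob", "search-production", "developer"),
}
events = [
    AccessEvent("alice", "payments-production", "developer"),
    AccessEvent("alice", "payments-production", "administrator"),  # escalation
    AccessEvent("bob", "payments-production", "developer"),        # wrong account
]
alerts = unexpected_access(events, expected)
```

Here `alerts` holds the last two events: one escalation to a role the engineer does not normally use, and one access to an account owned by another team.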

As an engineering team and its application grow, there will be multiple services, and each should be isolated. If you’re on AWS, you should have lots of accounts and isolate services into their own accounts. Then an approver can grant standing access to the teams that own those services, scoped only to the accounts needed for their application. Their access is managed through the identity provider (IdP), so team changes or company departures will revoke access as necessary.
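The scoping described here amounts to a mapping from IdP groups to the accounts each team owns. The following is a minimal sketch under that assumption; the group names, account names, and the `GROUP_ACCOUNTS` table are hypothetical.

```python
from typing import Dict, Set

# Hypothetical mapping from IdP group to the AWS accounts that team owns.
GROUP_ACCOUNTS: Dict[str, Set[str]] = {
    "team-payments": {"payments-staging", "payments-production"},
    "team-search": {"search-staging", "search-production"},
}


def accounts_for(user_groups: Set[str]) -> Set[str]:
    """Union of the accounts granted by each of the user's IdP groups."""
    allowed: Set[str] = set()
    for group in user_groups:
        # Unknown groups grant nothing.
        allowed |= GROUP_ACCOUNTS.get(group, set())
    return allowed


def can_access(user_groups: Set[str], account: str) -> bool:
    return account in accounts_for(user_groups)


payments_ok = can_access({"team-payments"}, "payments-production")  # True
search_ok = can_access({"team-payments"}, "search-production")      # False
after_departure = accounts_for(set())                               # set()
```

Because access derives only from group membership, removing someone from a group in the IdP removes the corresponding accounts with no separate revoke step.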

Temporary local credentials are the final requirement. Engineering laptops and similar devices shouldn’t have long-lived credentials stored on them, so that an attacker has a harder time finding and using credentials for unauthorized access. Ideally, engineers start their day by authenticating with the IdP, which then generates cached credentials for their work. Those credentials should expire by the end of the day.

With this setup, engineers have access to the services they are responsible for and nothing more than they need. They are enabled to manage, support, and improve their systems.

With temporary credentials, you’ve also made the attacker’s job harder. Now a compromised laptop doesn’t immediately provide credentials to access your systems. The attacker has to generate credentials, which depends on an active session with the IdP. The IdP should have a reasonable session expiration, so any locally cached credentials have an expiration date quite similar to any JIT access. You can also sever all access in a single place, at the IdP.
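The relationship between the IdP session and cached credentials can be sketched as follows. This is a simplified model, not how any specific tool or AWS API behaves: `IdpSession`, `CachedCredentials`, the 12-hour cap, and the rule that credentials never outlive the session are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class IdpSession:
    """A hypothetical authenticated session with the identity provider."""
    user: str
    expires_at: datetime


@dataclass(frozen=True)
class CachedCredentials:
    """Hypothetical short-lived credentials cached on a laptop."""
    user: str
    expires_at: datetime


def mint_credentials(session: IdpSession, now: datetime,
                     max_ttl: timedelta = timedelta(hours=12)) -> CachedCredentials:
    # Without an active IdP session, nothing can be generated.
    if now >= session.expires_at:
        raise PermissionError("IdP session expired; re-authenticate first")
    # Assumption of this sketch: credentials never outlive the session.
    return CachedCredentials(session.user, min(now + max_ttl, session.expires_at))


def usable(creds: CachedCredentials, now: datetime) -> bool:
    return now < creds.expires_at


morning = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
session = IdpSession("alice", expires_at=morning + timedelta(hours=8))
creds = mint_credentials(session, morning)
usable_in_afternoon = usable(creds, morning + timedelta(hours=7))  # True
usable_at_night = usable(creds, morning + timedelta(hours=9))      # False
```

A stolen laptop found after hours holds only expired credentials in this model, and minting new ones fails until someone authenticates with the IdP again.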

No one has to be on-call to approve access requests. Access is granted as part of a person’s role. If the role requires additional access, it can be requested. Better yet, you can work with the team that owns the systems and is most familiar with them.

How Substrate can help with access control

Good security and reliability start with isolation. When you can tightly scope access to only the services an engineer needs for their job, you also limit the access of any attacker that compromises that engineer's credentials. We believe that standing access with isolation is better than JIT access if you have the following:

  • Isolation between environments and services, perhaps via lots of AWS accounts
  • Alerting on unexpected access and access escalation
  • Role-based access control managed by your IdP
  • Temporary credentials

Substrate integrates with an IdP to grant temporary credentials for access to AWS. It allows easy isolation of applications and environments, so an organization can create tightly scoped access to accounts.

Substrate also makes it easier to create new AWS IAM roles, as well as new AWS accounts for new systems and teams. Each engineer can start their day by authenticating and generating temporary credentials.

With Substrate and the standing access playbook in hand, engineers are empowered to fix problems and unhindered by superfluous processes. Most important, a startup’s security is built on a strong foundation of isolation.
