SOC 2 compliance for startups and first-timers (part 2)

Reading the criteria, writing controls, and preparing for your first SOC 2 audit

No matter how deftly a startup delays SOC 2 compliance (as discussed in part one of this series), they will eventually need to tackle it. When you’ve filled out enough infosec questionnaires to last a lifetime, lined up enough SOC 2-dependent revenue to cover the cost, or matured enough to need formal risk management, it’s time for your startup to start seeking SOC 2 compliance. Otherwise, you may run out of customers who will accept SOC 2 stalling and substitutes.

Before you start that first audit and receive your first SOC 2 report, you need to define controls that meet the SOC 2 criteria. Defining controls that make sense for your company will set you up for a successful first audit and a compliance program that streamlines your business instead of slowing it down.

What is a control?

Controls are formal statements that declare how a business ensures that the actions of all its employees and the operation of its various technologies achieve its objectives. Best case scenario, they’re formal statements of what you’re already doing.

Grammatically, they should use the active voice. More importantly, they should express a continuous state of being rather than a point-in-time test. Controls that aren’t enforced by computers would do well to identify who’s responsible for them and who takes the actions they specify.

In all cases, you must be prepared to show evidence that the controls are operating effectively. You should think ahead of time about how you’re going to provide that evidence.

Controls aren’t static. They must evolve as a company and its customers change. Almost every control matrix (what auditors call a list of controls and the mapping from those controls to the various compliance criteria they address) includes controls that state various policies and controls that are reviewed on a particular schedule.

How are controls different from policies?

Information security policies can be a very deep rabbit hole. Their most important task, though, is to align your whole organization’s security posture. Absent policy, employees will approach security with their own reasonable but probably highly varied values. Policies align everyone to the weighty task of being a good shepherd of your customers’ data. They set the tone and influence your security culture.

As companies grow larger and larger, policies inevitably take on a level of abstraction that makes them difficult for employees outside of policy-oriented roles to read. While you’re small, though, your policies can be useful primary sources that employees can refer to for guidance.

When your policies change, it’s very likely that your controls will need to change, too. For example, if your company grows and you change your access policy to limit who has access to customer data, then you probably need to control how access is granted, modified, and reviewed. In contrast, a change from manual to automated access provisioning changes how you control access without any changes to your access policy.

What’s in scope for a SOC 2 audit?

SOC 2 covers a lot of ground in its effort to assure readers of a SOC 2 report (your customers) that the subject of the report (that’s you) is going to be a good shepherd of their data. While you may be able to scope some truly ancillary things out, you should expect people and technologies that interact with your customers’ data, even indirectly, to be in-scope. Ditto for people and technologies that monitor the security and availability of the system and respond to incidents.

And because people are in scope, that means HR processes like performance management are in scope, too. You’ll be asked about employment standards, background checks, onboarding, and offboarding.

You own your controls

SOC 2 is not like PCI. PCI DSS has requirements and, no matter how you feel about firewalls, you will do them their way, or you will not be PCI-compliant. Thankfully, SOC 2 brings with it criteria, not requirements. How you meet the criteria and how you prove it is mostly up to you. Your relationship with your auditor isn’t exactly a negotiation, but it’s not far from one, and you’re not adversaries. Your job is to demonstrate how effectively you’re controlling risk. Their job is to gather sufficient evidence to prove you’re meeting the criteria.

You own your controls. Your auditor doesn’t tell you what your controls are. It’s up to you to tell your auditor what the controls are. Then they confirm that your controls address all the criteria, and they may have clarifying questions or suggestions for improvements to your control statements.

Once satisfied that your controls meet the criteria, an auditor gathers evidence that your controls are operating effectively, which is to say you’re doing what your controls say you do. You can provide any evidence you like so long as it convinces an auditor that the controls are operating effectively.

The SOC 2 Common Criteria

The AICPA’s Trust Services Criteria are the basis for all SOC 2 audits. The authors regret their inability to link to the SOC 2 criteria but the AICPA have put them behind a paywall. The criteria are divided into several sections, the largest and most important of which are the Common Criteria.

Every SOC 2 audit will include the Common Criteria, which take a rather expansive view of security, evaluating not just correct use of cryptography but also an organization’s design, management, communication, ethical posture, employee performance management, monitoring, access control, and change management. It’s no wonder that the Common Criteria are sometimes also identified as the Security criteria.

The Common Criteria treat risk management as a discrete process. They want your management to enumerate the risks your business faces and either mitigate or accept each one. The formality on display here matches the formality seen throughout the rest of the compliance process.

Additional SOC 2 criteria for Availability, Confidentiality, Privacy, and Processing Integrity

Depending on your business, you may want to add one or more additional sections from the Trust Services Criteria to the scope of your SOC 2 compliance program. Each of these sections adds a few criteria and allows you to tailor your compliance program to best address your customers’ concerns about your company.

The Availability criteria mean exactly what you think. If your customers have a bad day when you have an outage, you probably want to add Availability to your SOC 2 compliance program. It is very, very common for SaaS companies to add the Availability criteria to their SOC 2 compliance programs.

In the Trust Services Criteria, Confidentiality is concerned with data protection and retention. In other words, it’s how you ensure that only authorized parties may access the data you’re storing on your customers’ behalf. It is almost table-stakes for SaaS companies.

Privacy, in contrast to Confidentiality, is concerned with your customers’ choices as to how their private and personally identifying information is handled. It’s in the same ballpark as GDPR and CCPA. Most SaaS companies do not add Privacy to their SOC 2 compliance program but this may change as customer preferences and demands change.

The Processing Integrity criteria will be unsurprising to most SaaS companies as they’re well-aligned with providing a high-quality Internet service. Failures in Processing Integrity are highly visible and thus possibly less in need of a third-party audit but, nonetheless, these criteria could eventually become table-stakes for SaaS companies, too.

What are control objectives? 

Your controls must meet your company’s objectives as well as the SOC 2 criteria. By themselves, the criteria, written in the most general and abstract possible form, can be an intimidatingly empty canvas. You’ll notice when you read the Trust Services Criteria that every criterion is parameterized by your “objectives.” You own your controls. You own your objectives. And the criteria dictate that your controls serve your objectives.

You’ve probably already set these objectives without even knowing it. Your control objectives are simply what you promise your customers you do. You promise your customers that your service performs some task, stores some data, etc. You also promise your customers you don’t lose their data or accidentally show it to the wrong people. You may promise your customers a certain quality or availability of service. Whatever those promises, those are your objectives — to deliver on your promises.

Each of your controls operates in service of one or more of these objectives. The controls are here to help your business be its best. They’re not here to distract you and slow you down (though poorly designed controls can distract you and slow you down).

How to read the SOC 2 criteria

The SOC 2 criteria are very formal, very dense, and very abstract. They apply to every company. There is good security advice hiding in there, if only you know how to read them. Here are some tips for finding it.

Have a shorthand for your objectives that you can mentally substitute. Every criterion ends with something like “to meet its objectives.” Make that concrete for your company. When you see “to meet its objectives,” read it as “to take over the world,” instead. Make it real for yourself.

Some of the criteria are written like a factored polynomial, such as (x + 2)(x + 3), and it pays to distribute the polynomial as you’re reading the criteria.

When you see this:

The entity authorizes, designs, develops or acquires, configures, documents, tests, approves, and implements changes to infrastructure, data, software, and procedures to meet its objectives.

Read it like this:

The entity authorizes changes to infrastructure to take over the world.
The entity designs changes to infrastructure to take over the world.
The entity develops or acquires changes to infrastructure to take over the world.
The entity configures changes to infrastructure to take over the world.
The entity documents changes to infrastructure to take over the world.
The entity tests changes to infrastructure to take over the world.
The entity approves changes to infrastructure to take over the world.
The entity implements changes to infrastructure to take over the world.
The entity authorizes changes to data to take over the world.
The entity designs changes to data to take over the world.
The entity develops or acquires changes to data to take over the world.
The entity configures changes to data to take over the world.
The entity documents changes to data to take over the world.
The entity tests changes to data to take over the world.
The entity approves changes to data to take over the world.
The entity implements changes to data to take over the world.
The entity authorizes changes to software to take over the world.
The entity designs changes to software to take over the world.
The entity develops or acquires changes to software to take over the world.
The entity configures changes to software to take over the world.
The entity documents changes to software to take over the world.
The entity tests changes to software to take over the world.
The entity approves changes to software to take over the world.
The entity implements changes to software to take over the world.
The entity authorizes changes to procedures to take over the world.
The entity designs changes to procedures to take over the world.
The entity develops or acquires changes to procedures to take over the world.
The entity configures changes to procedures to take over the world.
The entity documents changes to procedures to take over the world.
The entity tests changes to procedures to take over the world.
The entity approves changes to procedures to take over the world.
The entity implements changes to procedures to take over the world.

Remember that “implement” to an auditor is a synonym for “deploy” not “develop.” All of these words have different meanings in the context of your compliance program.

Some of these simplified criteria might not make sense or be applicable; for example, it probably doesn’t make sense to test changes to data. Many of these may be addressed by the same control; for example, a code review and approval control may address all of these verbs for both infrastructure and software. Thinking about them individually brings clarity and ensures you don’t miss anything.
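The expansion above is a Cartesian product of the criterion's verbs and objects, and it can be generated mechanically. The following is a minimal sketch, assuming the verbs and objects are split by hand from the criterion quoted above; the function name `expand_criterion` and the `objective` parameter are illustrative, not part of any SOC 2 tooling.

```python
from itertools import product

# Verbs and objects copied by hand from the criterion quoted above.
VERBS = [
    "authorizes", "designs", "develops or acquires", "configures",
    "documents", "tests", "approves", "implements",
]
OBJECTS = ["infrastructure", "data", "software", "procedures"]


def expand_criterion(verbs, objects, objective="meet its objectives"):
    """Return one simplified statement per (object, verb) pair.

    Objects vary slowest, so all verbs for "infrastructure" come first.
    """
    return [
        f"The entity {verb} changes to {obj} to {objective}."
        for obj, verb in product(objects, verbs)
    ]


statements = expand_criterion(VERBS, OBJECTS, objective="take over the world")
for statement in statements:
    print(statement)
```

Eight verbs times four objects gives 32 statements. Printing them into a checklist is one way to note which ones don't apply to you and which control addresses each of the rest.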

Sometimes, restating criteria in plain(er) English helps. For example, instead of this:

The entity selects and develops control activities that contribute to the mitigation of risks to the achievement of objectives to acceptable levels.

Consider this:

The company tries to reduce risk to acceptable levels.

This might be enough to get you unstuck.

Similarly, try restating criteria in very concrete terms. For example, instead of this:

The entity specifies objectives with sufficient clarity to enable the identification and assessment of risks relating to objectives.

Consider this:

The company declares what data it's trying to protect from whom and what capabilities it grants to its customers.

When the criteria are just too abstract, ask your auditor. They’re incredibly knowledgeable! They’re often the best translators of the abstract criteria into something concrete that applies directly to your company.

What makes a good SOC 2 control

A good control is auditable, which is a higher standard than merely being testable. It requires the control to be specific enough to hold in your head. More importantly, it requires the control to operate continuously. That means a good control describes how a process or technology works and not how to test whether a process or technology is working.

In many cases, humans are involved in the operation of a control and, in those cases, it’s good to identify those humans (by their role) in the control itself.

As we’ll see in the following sections, the specific control language you choose implies something about how it will be audited. You can use that to your advantage.

An example of a bad control

Only correct changes may be deployed to production.

This example is too aspirational to be an effective control. It doesn’t describe any process or technology specifically. By leaving responsibility unstated, it provides no clue as to whether it depends on human activity to operate. Worst of all, the condition it states is impossible to deliver, in general, and will thus be impossible to audit. In other words, correctness is not a process.

An example of a good control

Code changes must pass all test suites before being deployed to production.

This is a better attempt at a control that partially addresses CC8.1, the criterion covering change management. It still reads as a fairly broad and abstract statement. There are some important differences between this and the previous version, though, which make it a good control. It’s unambiguous, using words like “must,” “all,” “before,” and “production.” It doesn’t explicitly assign responsibility but, in context, implies automated enforcement. It states an invariant that should always hold rather than describing how to test it at one point in time.
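To make "implies automated enforcement" concrete, here is a minimal sketch of a deployment gate that enforces this invariant. Everything in it is an assumption for illustration: the `SuiteResult` record, the `may_deploy` function, and the suite names are hypothetical and don't correspond to any particular CI system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuiteResult:
    """Outcome of one test suite run for one commit (hypothetical shape)."""
    suite: str
    commit: str
    passed: bool


def may_deploy(commit, required_suites, results):
    """Return True only if every required suite passed for this exact commit.

    A suite with no recorded result counts as not passed, so a missing
    test run blocks the deploy just like a failing one.
    """
    passed = {r.suite for r in results if r.commit == commit and r.passed}
    return set(required_suites) <= passed


REQUIRED = ["unit", "integration"]  # hypothetical suite names
results = [
    SuiteResult("unit", "abc123", True),
    SuiteResult("integration", "abc123", False),
    SuiteResult("unit", "def456", True),
    SuiteResult("integration", "def456", True),
]

print(may_deploy("abc123", REQUIRED, results))  # False
print(may_deploy("def456", REQUIRED, results))  # True
```

A gate like this also suggests where the evidence comes from: the recorded suite results for each deployed commit are what you would show an auditor.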

Bridging the gap between a startup’s practices and the SOC 2 criteria

Every company’s different and so the criteria that trip each company up may very well be different. However, time and again, these criteria seem to be unmet or perhaps just lacking in evidence at the very smallest companies. This isn’t at all to suggest these companies are doing anything wrong — they’re just less mature and the SOC 2 criteria are written for large, mature companies. Nonetheless, establishing a SOC 2 compliance program means addressing these criteria.

CC1.1, CC1.4, CC1.5: Employee performance management

It’s OK to formalize performance management during your first audit period but make absolutely sure that you retain some evidence of your performance management process. There should be policies, career ladders, etc. There should be evidence of feedback given to employees.

CC2.1, CC4.1: Annual review of all policies

Your meta-policy should be to review all your policies and controls annually. It’s easiest to get in the habit of reviewing your policies a few weeks ahead of every audit. Don’t forget to preserve some affirmative and timestamped evidence of the review.

CC2.3: Population of releases to sample for effective communication

When releases impact your customers’ security, they need to be told as much. Note well that this impact could be positive or negative (though it’s hard to imagine a release that purposely negatively impacts security); for example, if you add support for two-factor authentication, this is a release that positively impacts your customers’ security but they only benefit if they know about it, so they must be told. This is why you get emails from AWS, Fastly, Slack, etc. alerting you that e.g. they’re deprecating old versions of TLS.

CC6.3: Access matrix and quarterly access reviews

Giving everyone access to everything has a very short shelf life. Deciding one-off what each employee may access isn’t scalable or auditable. So you need to create some buckets that determine the access granted to employees in each bucket. This “access matrix” is a matter of policy and should be reviewed annually.

The actual access individuals have, though, will inevitably drift due to in-the-moment changes, role changes, etc. You should have a control that requires you to review access to all systems quarterly to ensure only those who should have access actually do. Preserve evidence of these reviews and what, if any, corrections you make. It’s good if your quarterly access reviews find nothing but it’s also good if they find and correct errors. The only bad outcome is finding an error that’s more than a quarter old during an audit.
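A quarterly access review amounts to comparing the access people actually have against what the access matrix allows for their bucket. The sketch below assumes you can export each person's role and current grants from your systems. The roles, system names, and the `review_access` function are all hypothetical.

```python
# Access matrix: which systems each role may access (a matter of policy).
# Roles and system names here are made up for illustration.
ACCESS_MATRIX = {
    "engineer": {"github", "aws-staging"},
    "sre": {"github", "aws-staging", "aws-production"},
    "support": {"helpdesk"},
}


def review_access(roles, grants, matrix):
    """Return {person: systems} for access that the matrix does not allow.

    roles maps each person to a role; grants maps each person to the set of
    systems they can actually access today. People with no known role are
    allowed nothing, so all of their access is flagged.
    """
    findings = {}
    for person, systems in grants.items():
        allowed = matrix.get(roles.get(person), set())
        excess = set(systems) - allowed
        if excess:
            findings[person] = excess
    return findings


roles = {"alice": "sre", "bob": "engineer"}
grants = {
    "alice": {"github", "aws-production"},
    "bob": {"github", "aws-production"},  # drifted beyond the engineer bucket
    "carol": {"helpdesk"},                # has access but no recorded role
}

findings = review_access(roles, grants, ACCESS_MATRIX)
print(findings)
```

Saving the dated output of each review, along with any corrections made, is one way to preserve the evidence described above.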

CC6.6: Firewall configuration

Auditors will always care about firewall configuration and changes to that configuration. This is very hard to control if you make changes through e.g. the AWS Console, so make every effort to put firewall changes behind a code review and approval process.

CC7.1: Vulnerability management

Vulnerability scanning should be happening regularly, in production. However vulnerabilities are discovered, you’ll need to have a story for patching vulnerable software, whether your own or a dependency. GitHub and Snyk have a lot to offer you here.

CC7.2: Security monitoring and alerting

Monitoring is not just about whether the service is up and fast; it’s also about whether all the access that’s happening is authorized. AWS CloudTrail or its equivalent is table-stakes. If you run significantly complex Linux infrastructure yourself, consider Slack’s go-audit. What you alert on is the trickiest decision and one that security teams are reticent to talk about publicly. Amazon GuardDuty provides good defaults for most organizations. In your future, red team penetration tests will help you discover more things to alert on.

CC7.3, CC7.4, CC7.5: Population of incidents to sample for resolution

A list of every time service is impacted by some incident is surprisingly hard to come by without a ton of human effort. You’re going to have to formalize your incident response process at least far enough to demand that every incident have an incident record in PagerDuty, a Slack channel, or similar. Then you can give your auditors a list of all incidents and they can ask you questions about some of them.

A1.1: Capacity planning

You’re probably doing an OK job of this already if your service is usually up but you’re likely not preserving any evidence that you’re doing so.

A1.2, A1.3: Disaster recovery planning and practice

Big picture: You need to prove you can recover from the irrecoverable loss of e.g. an AWS region. This means replicating your backups to another region or cloud provider, parameterizing your infrastructure, etc. Ideally, you can state an RPO (Recovery Point Objective, how fresh the data will be when you’re recovered relative to when disaster struck) and RTO (Recovery Time Objective, how long it takes you to recover after the disaster).
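The arithmetic behind RPO and RTO is simple enough to check after every recovery drill. This sketch assumes you record three timestamps per drill: the last backup that was successfully replicated, the moment of the simulated disaster, and the moment service was restored. The function names, the timestamps, and the one-hour and four-hour objectives are illustrative only.

```python
from datetime import datetime, timedelta


def recovery_metrics(last_backup, disaster, recovered):
    """Return (data loss window, downtime) for one disaster or drill."""
    return disaster - last_backup, recovered - disaster


def meets_objectives(last_backup, disaster, recovered, rpo, rto):
    """True if the data loss window is within RPO and the downtime within RTO."""
    data_loss, downtime = recovery_metrics(last_backup, disaster, recovered)
    return data_loss <= rpo and downtime <= rto


# Hypothetical drill: last replicated backup at 10:00, region "lost" at 10:40,
# service restored elsewhere at 13:10.
last_backup = datetime(2024, 3, 1, 10, 0)
disaster = datetime(2024, 3, 1, 10, 40)
recovered = datetime(2024, 3, 1, 13, 10)

data_loss, downtime = recovery_metrics(last_backup, disaster, recovered)
print(data_loss)  # 0:40:00
print(downtime)   # 2:30:00

ok = meets_objectives(last_backup, disaster, recovered,
                      rpo=timedelta(hours=1), rto=timedelta(hours=4))
print(ok)  # True
```

A dated record of these numbers from each drill is the kind of evidence that shows you practice recovery rather than only plan for it.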

Spreadsheet tips for SOC 2

SOC 2 audits are driven by spreadsheets. You provide your auditors your control matrix — the list of all your controls mapped to the criteria they meet — in a spreadsheet. They respond with their DRL — document request list — another spreadsheet letting you know the evidence they want to see and the people they want to interview. You may propose edits to the DRL if you have a better way to provide evidence.

You should give each control a stable synthetic identifier (probably just a number) that won’t change if your spreadsheet is reordered, rows are added, rows are deleted, etc. Your auditors will use these identifiers when talking about your controls and even when referencing previous audits. Stability is very useful here. The synthetic identifiers you assign your controls will be used throughout your SOC 2 report as your auditors demonstrate that your controls together cover all of the relevant criteria.
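A control matrix with stable identifiers is easy to represent as data, which also lets you check coverage before your auditor does. This is a minimal sketch under stated assumptions: the identifiers, the second and third control statements, and the mapping of controls to criteria are illustrative, and `coverage` and `uncovered` are hypothetical helper names.

```python
# Each control keeps its identifier permanently, even if rows are reordered
# or other controls are added or deleted. Note the gap between 2 and 7:
# identifiers are never renumbered to close gaps.
CONTROLS = [
    {"id": 1,
     "statement": "Code changes must pass all test suites before being "
                  "deployed to production.",
     "criteria": ["CC8.1"]},
    {"id": 2,
     "statement": "The security lead reviews access to all systems quarterly.",
     "criteria": ["CC6.3"]},
    {"id": 7,
     "statement": "Firewall changes are made through reviewed and approved "
                  "code changes.",
     "criteria": ["CC6.6", "CC8.1"]},
]


def coverage(controls):
    """Map each criterion to the sorted IDs of the controls that address it."""
    covered = {}
    for control in controls:
        for criterion in control["criteria"]:
            covered.setdefault(criterion, []).append(control["id"])
    return {criterion: sorted(ids) for criterion, ids in covered.items()}


def uncovered(controls, in_scope):
    """Return in-scope criteria that no control addresses, in the given order."""
    covered = coverage(controls)
    return [criterion for criterion in in_scope if criterion not in covered]


in_scope = ["CC6.3", "CC6.6", "CC7.1", "CC8.1"]  # a small illustrative subset
matrix = coverage(CONTROLS)
gaps = uncovered(CONTROLS, in_scope)
print(matrix)
print(gaps)
```

Because lookups go through the `id` field rather than row position, reordering or filtering the list doesn't change how a control is referenced.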

Auditors tend to work methodically, filtering spreadsheets to work through controls in the most efficient order, finding the controls owned by the person they’re meeting with, and hiding controls for which they’ve already collected evidence. By the end of the audit, they’re down to nothing.

* * *

This was part two in the SOC 2 compliance for startups and first-timers series. Part one taught you how to delay SOC 2 compliance without sabotaging your business. Part three will take you through your first audit. And part four will take you to your second audit and beyond.

  1. How to delay an audit without sabotaging your business
  2. Reading the criteria, writing controls, and preparing for your first SOC 2 audit (you are here)
  3. What to know going into your first SOC 2 audit
  4. How to operate and improve your compliance program after a first SOC 2 audit

Looking for a leg up in meeting the SOC 2 criteria covering access to production, network segmentation, or change management? Substrate is the right way to use AWS, designed with SOC 2 compliance in mind.