Tuesday, June 28, 2016

Reflections on trusting CSP

TL;DR: new changes in CSP sweep away a huge number of vulns, yet they enable new bypasses. The Internet lives on, ignoring CSP.


Let’s talk about CSP today:
Content Security Policy (CSP) - a tool which developers can use to lock down their applications in various ways, mitigating the risk of content injection vulnerabilities such as cross-site scripting, and reducing the privilege with which their applications execute.

Assume XSS, mkay?

CSP is a defense-in-depth web feature that aims to mitigate XSS (for simplicity, let’s ignore other types of vulnerabilities CSP struggles with). Therefore, in this discussion we can safely assume that the application has an XSS flaw in the first place (otherwise, CSP is a no-op).

With this assumption, an effective CSP is one that stops XSS attacks while allowing execution of legitimate JS code. For any complex website, the legitimate code is either:
  • application-specific code, implementing its business logic, or
  • code of its dependencies (frameworks, libraries, 3rd party widgets)
Let’s see how CSP deals with both of them separately.

Just let me run my code!

When CSP was created, XSS was mostly caused by reflecting user input in the response (reflected XSS). The first attempt at solving this with CSP was to move every JS snippet from HTML to a separate resource and disable inline script execution (as inline scripts are most likely XSS payloads reflected by the vulnerable application). In addition to providing security benefits, this also encouraged refactoring applications (e.g. transitioning from spaghetti code to MVC paradigms, or separating the behavior from the view; this was hip at that time).

Of course, this put a burden on application developers, as inline styles and scripts were prevalent back then - in the end almost no one used CSP. And so the ‘unsafe-inline’ source expression was created. It allowed developers to use CSP without the security it provides (as the attacker could again inject code through reflected XSS).

Introducing insecurity into a security feature to ease adoption is an interesting approach, but at least it’s documented:
In either case, developers SHOULD NOT include either 'unsafe-inline', or data: as valid sources in their policies. Both enable XSS attacks by allowing code to be included directly in the document itself; they are best avoided completely.
That’s why this new expression used the unsafe- prefix. If you’re opting out of security benefits, at least you’re aware of it. Later on this was made secure again when nonces were introduced. From then on, script-src 'unsafe-inline' 'nonce-12345' would only allow inline scripts if the nonce attribute had a given value.

Unless the attacker knew the nonce (or the reflection was in the body of a nonced script), their XSS payload would be stopped. Developers then had a way to use inline scripts in a safe fashion. 
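
To make the nonce mechanism concrete, here is a minimal server-side sketch (Node.js). The helper names generateNonce, buildNoncePolicy and renderScriptTag are made up for illustration, and the 16-byte nonce length is my assumption; the key point is that the nonce has to be unpredictable and regenerated for every response.

```javascript
// Sketch only: the helper names below are hypothetical, not a library API.
const crypto = require('crypto');

// Generate a fresh, unpredictable nonce for every HTTP response.
function generateNonce() {
  return crypto.randomBytes(16).toString('base64');
}

// Build the script-src directive. 'unsafe-inline' stays only as a fallback
// for browsers without nonce support; nonce-aware browsers ignore it.
function buildNoncePolicy(nonce) {
  return `script-src 'unsafe-inline' 'nonce-${nonce}'`;
}

// Emit an inline script that carries the nonce of the current response.
function renderScriptTag(nonce, code) {
  return `<script nonce="${nonce}">${code}</script>`;
}

const nonce = generateNonce();
console.log(buildNoncePolicy(nonce));
console.log(renderScriptTag(nonce, 'console.log("legitimate code")'));
```

A reflected payload that lands outside of a nonced script has no way to guess the value, so the browser refuses to run it.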

But what about my dependencies?

Most of the application dependencies were hosted on CDNs, and had to continue working after CSP was enabled in the application. CSP allowed developers to specify allowed URLs / paths and origins, from which the scripts could be loaded. Anything not on the whitelist would be stopped.

By using the whitelist in CSP you effectively declared that you trusted whatever scripts were on it. You might not have control over them (e.g. they’re not served from your servers), but you trust them - and they had to be listed explicitly.

Obviously, there were a bunch of bypasses (remember, we assume the XSS flaw is present!) - for one, a lot of CDNs are hosting libraries that execute JS from the page markup, rendering CSP useless as long as certain CDNs are whitelisted. There were also different bypasses related to path & redirections (CSP Oddities presentation summarizes those bypasses in a great way).

In short, it was a mess - also for maintenance. For example, this is a script-src from Gmail:
script-src https://clients4.google.com/insights/consumersurveys/ 'self' 'unsafe-inline' 'unsafe-eval' https://mail.google.com/_/scs/mail-static/ https://hangouts.google.com/ https://talkgadget.google.com/ https://*.talkgadget.google.com/ https://www.googleapis.com/appsmarket/v2/installedApps/ https://www-gm-opensocial.googleusercontent.com/gadgets/js/ https://docs.google.com/static/doclist/client/js/ https://www.google.com/tools/feedback/ https://s.ytimg.com/yts/jsbin/ https://www.youtube.com/iframe_api https://ssl.google-analytics.com/ https://apis.google.com/_/scs/abc-static/ https://apis.google.com/js/ https://clients1.google.com/complete/ https://apis.google.com/_/scs/apps-static/_/js/ https://ssl.gstatic.com/inputtools/js/ https://ssl.gstatic.com/cloudsearch/static/o/js/ https://www.gstatic.com/feedback/js/ https://www.gstatic.com/common_sharing/static/client/js/ https://www.gstatic.com/og/_/js/

Why so many sources? Well, if a certain script from, say, https://talkgadget.google.com/ loads a new script, the new script’s source needs to be whitelisted too.
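
A rough sketch of how such a whitelist is matched may help here. This is a deliberately simplified model, not the full CSP matching algorithm: it ignores ports, redirects, scheme upgrades and keyword sources, and the function names are mine. The whitelist entries are a subset of the Gmail policy above; cdn.example.net is a hypothetical third-party host.

```javascript
// Simplified sketch of host-source matching; NOT the full CSP algorithm.
function matchesSource(source, url) {
  const target = new URL(url);
  // Split "scheme://host/path" into its three parts.
  const m = /^([a-z][a-z0-9+.-]*):\/\/([^/]+)(\/.*)?$/i.exec(source);
  if (!m) return false;
  const [, scheme, hostPattern, path = ''] = m;

  if (target.protocol !== scheme.toLowerCase() + ':') return false;

  const host = target.hostname;
  const pattern = hostPattern.toLowerCase();
  if (pattern.startsWith('*.')) {
    // "*.example.com" matches subdomains only, not "example.com" itself.
    if (!host.endsWith(pattern.slice(1))) return false;
  } else if (host !== pattern) {
    return false;
  }

  if (path === '' || path === '/') return true;
  // A path ending in "/" is a prefix match; otherwise it must match exactly.
  return path.endsWith('/')
    ? target.pathname.startsWith(path)
    : target.pathname === path;
}

function isAllowed(whitelist, url) {
  return whitelist.some((source) => matchesSource(source, url));
}

// A few entries taken from the Gmail policy above.
const whitelist = [
  'https://talkgadget.google.com/',
  'https://*.talkgadget.google.com/',
  'https://www.youtube.com/iframe_api',
  'https://apis.google.com/js/',
];

console.log(isAllowed(whitelist, 'https://talkgadget.google.com/any/script.js')); // true
console.log(isAllowed(whitelist, 'https://apis.google.com/js/api.js'));           // true
console.log(isAllowed(whitelist, 'https://www.youtube.com/other.js'));            // false
// A script loaded at runtime from a host that is not listed stays blocked
// until the policy author adds that host explicitly.
console.log(isAllowed(whitelist, 'https://cdn.example.net/widget.js'));           // false
```

Every URL is checked against the list on its own, no matter which script requested it, which is why the list keeps growing.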

There are various quirks and bypasses with this type of CSP, but at least it’s explicit. If a feature is unsafe, it’s marked as such, together with the trust I put into all other origins - and that trust is not transitive. Obviously, at the same time it’s very hard to adopt and maintain such a CSP for a given application, especially if your dependencies change their code.

A solution for that problem was recently proposed in the form of the strict-dynamic source expression.

Let's be strict!

What’s strict-dynamic? Let’s look at the CSP spec itself:
The "'strict-dynamic'" source expression aims to make Content Security Policy simpler to deploy for existing applications who have a high degree of confidence in the scripts they load directly, but low confidence in their ability to provide a reasonably secure whitelist. 
If present in a script-src or default-src directive, it has two main effects: 
1. host-source and scheme-source expressions, as well as the "'unsafe-inline'" and "'self'" keyword-sources will be ignored when loading script.
2. hash-source and nonce-source expressions will be honored. Script requests which are triggered by non-parser-inserted script elements are allowed.

The first change allows you to deploy "'strict-dynamic'" in a backwards compatible way, without requiring user-agent sniffing: the policy 'unsafe-inline' https: 'nonce-abcdefg' 'strict-dynamic' will act like 'unsafe-inline' https: in browsers that support CSP1, https: 'nonce-abcdefg' in browsers that support CSP2, and 'nonce-abcdefg' 'strict-dynamic' in browsers that support CSP3. 
The second allows scripts which are given access to the page via nonces or hashes to bring in their dependencies without adding them explicitly to the page’s policy.
While it might not be obvious on first read, this introduces a transitive trust concept into CSP. Strict-dynamic (in supporting browsers) turns off the whitelists completely. Now whenever already allowed code (e.g. a script that carried a nonce) creates a new script element and injects it into the DOM, its execution is not stopped, regardless of its properties (e.g. the src attribute value or the lack of a nonce). That script could in turn create additional scripts. It’s like a tooth fairy handing out nonces to every new script element.
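
Here is a toy model of that decision for external scripts (script elements with a src). The object shape { nonce, parserInserted, src } and the function name are assumptions made for illustration; a real browser performs this check internally and handles many more cases, inline scripts among them.

```javascript
// Toy model of the script request check under 'strict-dynamic'.
// The { nonce, parserInserted, src } shape is an illustrative assumption.
function allowsScriptRequest(script, policyNonce) {
  // A matching nonce always allows the script.
  if (script.nonce === policyNonce) return true;
  // Without a nonce, only scripts added by already running code
  // (i.e. not created by the HTML parser) are allowed, whatever their src.
  return script.parserInserted === false;
}

// Nonced script present in the page markup: allowed.
console.log(allowsScriptRequest(
  { nonce: 'abcdefg', parserInserted: true, src: '/app.js' }, 'abcdefg')); // true

// Markup injected through reflected XSS, no nonce: blocked.
console.log(allowsScriptRequest(
  { nonce: null, parserInserted: true, src: 'https://evil.example/x.js' }, 'abcdefg')); // false

// Element created by trusted code (e.g. via document.createElement): allowed,
// even though its src points to a host nobody ever listed.
console.log(allowsScriptRequest(
  { nonce: null, parserInserted: false, src: 'https://evil.example/x.js' }, 'abcdefg')); // true
```

Note that the src value never enters the decision, and that is exactly the transitive part.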


As a consequence, you’re now not only trusting your direct dependencies, but also implicitly assuming that anything they load at runtime is fair game for your application too. By using it you’re effectively trading control for ease of maintenance.
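
The backwards-compatibility claim from the spec excerpt above can be checked with a small sketch. It only models the rules mentioned in that excerpt (unknown expressions are dropped, a nonce or hash disables 'unsafe-inline', strict-dynamic disables host, scheme, 'self' and 'unsafe-inline' sources); the function name and the numeric cspLevel parameter are my own simplifications.

```javascript
// Sketch: which source expressions a browser effectively enforces,
// depending on the CSP level it supports (1, 2 or 3).
function effectiveSources(sources, cspLevel) {
  const isNonceOrHash = (s) => /^'(nonce|sha256|sha384|sha512)-/.test(s);
  const isStrictDynamic = (s) => s === "'strict-dynamic'";

  // Drop the expressions this browser does not understand.
  let known = sources.filter((s) => {
    if (isNonceOrHash(s)) return cspLevel >= 2;
    if (isStrictDynamic(s)) return cspLevel >= 3;
    return true;
  });

  // CSP2 and later: 'unsafe-inline' is ignored once a nonce or hash is present.
  if (known.some(isNonceOrHash)) {
    known = known.filter((s) => s !== "'unsafe-inline'");
  }

  // CSP3: 'strict-dynamic' also disables host and scheme sources
  // (the unquoted ones) as well as 'self'.
  if (known.some(isStrictDynamic)) {
    known = known.filter(
      (s) => s.startsWith("'") && s !== "'self'" && s !== "'unsafe-inline'"
    );
  }
  return known.join(' ');
}

const policy = ["'unsafe-inline'", 'https:', "'nonce-abcdefg'", "'strict-dynamic'"];
console.log(effectiveSources(policy, 1)); // 'unsafe-inline' https:
console.log(effectiveSources(policy, 2)); // https: 'nonce-abcdefg'
console.log(effectiveSources(policy, 3)); // 'nonce-abcdefg' 'strict-dynamic'
```

The same header degrades gracefully in older browsers, but only the CSP3 variant drops the whitelist entirely.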

POC||GTFO

The usefulness of CSP can be determined by its ability to stop an attack, assuming there is an XSS flaw in the application. By dropping the whitelists, strict-dynamic obviously increases the attack surface, even more so by introducing transitive trust.

On the other hand, it facilitates adopting a CSP that uses nonces. Such a policy mitigates a large chunk of reflected XSS flaws, as the injection point would have to be inside a nonced script element for the payload to execute (which is, I think, a rare occurrence). Unfortunately, DOM XSS flaws are much more common nowadays.

For DOM XSS flaws, a strict-dynamic CSP is worse than a “legacy” CSP any time legitimate code can be tricked into creating an attacker-controlled script (the old CSP would block it, but the new one does not). Unfortunately, such a flaw is likely present in a large number of applications. For example, here’s an exploit for applications using jQuery <3.0.0, based on the jQuery issue linked in the comments:
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <meta http-equiv="Content-Security-Policy" content="script-src 'unsafe-inline' 
      'nonce-booboo' 'strict-dynamic'"> 
  <script nonce='booboo' src="https://code.jquery.com/jquery-2.2.4.js" ></script>
</head> 
<body>
<script nonce=booboo> 
// Patched in jQuery 3.0.0
// See https://github.com/jquery/jquery/issues/2432
// https://www.w3.org/TR/CSP3/#strict-dynamic-usage 
$(function() { 
   // URL control in $.get / $.post is a CSP bypass 
   // for CSP policies using strict-dynamic. 
   $.get('data:text/javascript,"use strict"%0d%0aalert(document.domain)');
});
</script>
</body>
</html>

Try it out (requires Chrome 52): https://plnkr.co/edit/x9d0ClcWOrl3tUd33oZR?p=preview

jQuery has a market share of over 96%, so it’s not hard to imagine a large number of applications calling $.get / $.post with attacker-controlled URLs. For those, strict-dynamic policies are trivially bypassable.
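
For applications stuck on an affected jQuery version, one possible stopgap is to validate URLs before they reach $.get / $.post. The sketch below is a hypothetical guard (isSafeRequestUrl is not part of jQuery), it assumes the application only ever needs same-origin requests, and it does not make strict-dynamic safe in general; it merely rejects the data: and cross-origin URLs that the PoC above relies on.

```javascript
// Hypothetical guard, not part of jQuery: accept only same-origin http(s) URLs.
// pageOrigin must be a serialized origin, e.g. "https://app.example".
function isSafeRequestUrl(candidate, pageOrigin) {
  let url;
  try {
    // Relative URLs are resolved against the page origin.
    url = new URL(candidate, pageOrigin);
  } catch (e) {
    return false; // unparsable input is rejected
  }
  const isHttp = url.protocol === 'http:' || url.protocol === 'https:';
  return isHttp && url.origin === pageOrigin;
}

console.log(isSafeRequestUrl('/api/items', 'https://app.example'));                    // true
console.log(isSafeRequestUrl('data:text/javascript,alert(1)', 'https://app.example')); // false
console.log(isSafeRequestUrl('https://evil.example/x.js', 'https://app.example'));     // false
```

In a browser, the second argument would typically be location.origin.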

Summary

It was already demonstrated we can’t effectively understand and control what’s available in the CDNs we trust in our CSPs (so we can’t even maintain the whitelists we trust). How come losing control and making the trust transitive is a solution here?

At the very least, these tradeoffs should be expressed by using an unsafe- prefix. In fact, it used to be called unsafe-dynamic, but that was dropped recently. So now we have a strict-* expression that likely enables bypasses that were not present with old-school CSP. All that to ease adoption of a security mitigation. Sigh :/