Defending against the BREACH attack

When Juliano and Thai disclosed the CRIME attack last year, it was clear that the same attack technique could be applied to any other compressed data, and compressed response bodies (via HTTP compression) in particular. But it was also clear that, given our exploit-driven culture, browser vendors were not going to do anything about it.

Progress will be made now that there is an exploit to worry about: this year at Black Hat, a group of researchers presented BREACH, a variant of CRIME that works exactly where it hurts the most, on HTTP response bodies. If you’re not already familiar with the attack, I suggest that you go to the researchers’ web site, where they have a very nice paper and a set of slides.

If you don’t want to read the paper right this moment, I will remind you that the CRIME attack works by having the attacker guess some secret text. The trick is to include the guess in the same context as the actual secret (for example, the same response body). When the guess is completely wrong, the size of the response increases by the size of the guess. But, when the guess is correct (fully or partially, meaning there is some overlap between the guess and the secret), compression kicks in and the response body shrinks slightly. With enough guesses and enough time, you can recover anything on the page.
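To make the mechanics concrete, here is a minimal Python sketch of the compression side channel. The page template, secret value, and guesses are made up for illustration; only the relative compressed sizes matter, and for very short bodies the difference can occasionally round away, which is one reason the real attack repeats its measurements many times.

import zlib

# Hypothetical secret embedded in the response body.
SECRET = "csrf_token=d8e8fca2dc0f"
PAGE = "<html><body>{reflected}<form>{secret}</form></body></html>"

def compressed_size(reflected: str) -> int:
    # Compress a page that reflects attacker-controlled input
    # next to the secret, and return the size the attacker observes.
    body = PAGE.format(reflected=reflected, secret=SECRET)
    return len(zlib.compress(body.encode()))

# A guess that overlaps the secret compresses better (smaller output)
# than one that does not, leaking roughly one character per round.
print(compressed_size("csrf_token=z"))  # wrong next character
print(compressed_size("csrf_token=d"))  # correct next character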

TLS does not defend against this attack because, when the protocol was originally designed, it was impossible for MITM attackers to submit arbitrary plaintext via victims’ browsers. Since then, the threat model evolved, but the protocols remained the same. (Interestingly, there is a draft proposal to deal with this sort of thing at the TLS level: Length Hiding Padding for the Transport Layer Security Protocol.)

Mitigation
Clearly, one option is to address the problem at the TLS level; but that will take some time.

Outside TLS, the paper itself includes a nice list of mitigation options, and, with a few exceptions, I am not going to repeat them here. In general, it’s not going to be easy. When dealing with CRIME, we were fortunate because few people knew TLS compression existed, and only a small number of browsers/users actually supported it. This time, the flaw is exploited in a feature that’s not only very widely used, but one that many sites cannot exist without. Just try convincing a large site to turn off compression, at a large financial and performance cost.

Although most discussions about BREACH will no doubt focus on its threat against CSRF tokens, we should understand that the impact is potentially wider. Any sensitive data contained in response bodies is under threat. The good news is that session tokens don’t appear to be exposed, because they typically travel in cookie headers rather than in the compressed response body. However, a well-placed forged CSRF request can do a lot of damage.

CSRF token defence
For CSRF tokens there is a simple and effective defence, which is to randomize the token by masking it with a different (random) value on every response. The masking does not hide the token (whoever has the token can easily reverse the masking), but it does defeat the attack technique. Guessing is impossible when the secret is changing all the time. Thus, we can expect that most frameworks will adopt this technique. Those who rely on frameworks will only need to upgrade to take advantage of the defence. Those who don’t will have to fix their code.
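As a rough illustration of the masking idea (a sketch of the general technique, not any particular framework’s implementation), the following Python snippet XORs a per-session token with a fresh random mask on every response, so the bytes on the wire never repeat even though the underlying secret stays the same.

import os

def mask_token(real_token: bytes) -> bytes:
    # Return mask || (mask XOR token); a fresh mask makes every
    # response carry a different value for the same token.
    mask = os.urandom(len(real_token))
    masked = bytes(m ^ t for m, t in zip(mask, real_token))
    return mask + masked

def unmask_token(wire_value: bytes) -> bytes:
    # Reverse the masking on the server before validating the token.
    half = len(wire_value) // 2
    mask, masked = wire_value[:half], wire_value[half:]
    return bytes(m ^ c for m, c in zip(mask, masked))

session_token = os.urandom(16)
sent = mask_token(session_token)            # embed (hex/base64) in the page
assert unmask_token(sent) == session_token  # server still validates the same secret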

HTTP chunked encoding mitigation
The award for least-intrusive and entirely painless mitigation proposal goes to Paul Querna who, on the httpd-dev mailing list, proposed to use the HTTP chunked encoding to randomize response length. Chunked encoding is an HTTP feature that is typically used when the size of the response body is not known in advance; only the size of the next chunk is known. Because chunks carry some additional information, they affect the size of the response, but not the content. By forcing more chunks than necessary, for example, you can increase the length of the response. To the attacker, who can see only the size of the response body, but not anything else, the chunks are invisible. (Assuming they’re not sent in individual TCP packets or TLS records, of course.)
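To illustrate the idea (my own sketch, not Paul Querna’s actual patch; the chunk-size bounds are arbitrary), the following Python snippet frames a response body as HTTP/1.1 chunks of random size. The payload is unchanged, but the framing overhead, and therefore the total number of bytes on the wire, varies from response to response.

import os, random

def chunked(body: bytes, min_chunk: int = 64, max_chunk: int = 512) -> bytes:
    # Split a response body into randomly sized HTTP/1.1 chunks.
    out, i = bytearray(), 0
    while i < len(body):
        size = random.randint(min_chunk, max_chunk)
        chunk = body[i:i + size]
        out += b"%x\r\n" % len(chunk) + chunk + b"\r\n"
        i += len(chunk)
    out += b"0\r\n\r\n"  # terminating zero-length chunk
    return bytes(out)

body = os.urandom(4096)  # stand-in for a compressed response body
print(len(chunked(body)), len(chunked(body)))  # same payload, different totals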

This mitigation technique is very easy to implement at the web server level, which makes it the least expensive option. The only question is its effectiveness. No one has done the maths yet, but most seem to agree that response length randomization slows down the attacker without preventing the attack entirely. Still, if the attack can be slowed down significantly, perhaps it will be as good as prevented.

Referer check mitigation
A quick, dirty, tricky, and potentially unreliable mitigation approach you can apply today is to perform Referer header checks on all incoming requests. Because the attacker cannot inject requests from the web site itself (unless he gains access via XSS, in which case he owns the browser and has no need for further attacks), he must do so from some other web site (a malicious web site, or an innocent site hijacked from a MITM location). In that case, the referrer information will show the request originating from that other web site, and we can easily detect that.

Now, you can’t just drop such requests (because then no links to your web site would work any more), but you can drop all the cookies before they reach the application. Without the cookies, your application will not resume the victim’s session, and won’t place anything secret in the response. Attack mitigated.

There is a catch, however (as @hubert3 pointed out to me): if your web site relies on being framed by arbitrary 3rd party web sites, or if it exposes public services to others (for browser consumption), then you can’t use this defence. I am assuming your services need the cookies. If they do not, you’re back in the game. If you do decide to try it, please test your major use cases in a staging environment first.

You can implement this defence with only two Apache directives:

SetEnvIfNoCase Referer ^https://www\.example\.com/ keep_cookies

RequestHeader unset Cookie env=!keep_cookies

The cookies are kept only for requests arriving from the same site. There’s a potential problem with users who follow links from other sites or bookmarks and expect to be logged in straight away. For such users you might need to have a welcome page, where you will ask them to click on a link to enter the web site. The cookies will be sent again on the next request.

Just to be clear, there is a long history of attacks focusing on spoofing the Referer header. For example, there was one such problem in Chrome just recently. However, such attacks are addressed quickly after discovery.

To conclude, I can’t really say that I like this approach, but its huge advantage is that you can deploy it very quickly at the web server or reverse proxy level, without having to make any changes to the application code. Even with all the constraints, I imagine there will be a large number of applications for which the trade-offs will be acceptable.

We should really fix browsers
I would really like to see this problem addressed where it belongs: at the browser level. At the moment, a MITM attacker can intercept your non-encrypted requests, mess with them, and trick your browser into sending requests with arbitrary content to the sites you care about. It’s this interaction that makes several very interesting attacks possible: BEAST, CRIME, RC4, and now BREACH.

A carefully designed opt-in security measure could do the trick, but I suppose it would take a lot of talking and politics to get it implemented. The idea is that a web site could control which other web sites (cross-origins) may initiate requests to it, even via script and img tags.

Incidentally, just a couple of days ago, Mike Shema and Vaagn Toukharian (fellow Qualys employees) proposed a new cookie control (there is now a follow-up post from Mike) that would restrict when cookies are sent. Their intention was to deal with CSRF, but the measure would work against BREACH, too. If a 3rd party web site initiates a request to your web site against your wishes, being able to tell the browser to drop the cookies would mitigate the potential attacks.

Update: The first version of this article included a referrer check defence that allowed empty referrers, with the idea of supporting auto-login for users following bookmarks. But then Krzysztof Kotowicz pointed out that the Referer header can be removed by attackers. I have now modified the example to drop cookies on all requests not originating from the web site.
