Wednesday, August 24, 2011

Death to the filters - how to validate JSON correctly

JSON is a pretty popular data-interchange format. It's supported across programming languages, fits naturally into Javascript environment, it's human-readable and it's much lighter than XML. No wonder the format is widely adopted in many web applications.
Disclaimer: This post will probably be nothing new to security experts. It's meant for developers, as I've seen vulnerable applications using the bad practices described here. 

Is JSON secure?

Well, it has some quirks (it had even more of them in the past) but mostly it's good enough. Provided two things:
  • you don't use eval!
  • you are sure that's JSON (not everything that's a valid Javascript is JSON, it's only a subset)
Eval() is, as usual, asking yourself for trouble. Unfortunately, it's common - it's even mentioned in JSON's own documentation as a recommended practice:
To convert a JSON text into an object, you can use the eval() function. eval() invokes the JavaScript compiler. Since JSON is a proper subset of JavaScript, the compiler will correctly parse the text and produce an object structure. The text must be wrapped in parens to avoid tripping on an ambiguity in JavaScript's syntax.

var myObject = eval('(' + myJSONtext + ')');
The eval function is very fast. However, it can compile and execute any JavaScript program, so there can be security issues. The use of eval is indicated when the source is trusted and competent. It is much safer to use a JSON parser. In web applications over XMLHttpRequest, communication is permitted only to the same origin* that provide that page, so it is trusted. But it might not be competent. If the server is not rigorous in its JSON encoding, or if it does not scrupulously validate all of its inputs, then it could deliver invalid JSON text that could be carrying dangerous script. The eval function would execute the script, unleashing its malice.
* docs are outdated, you can do cross-origin request now (CORS)

Example - no filters

So, when using eval() it's crucial that the input is in fact valid JSON that has not been tampered with. Consider the following situation:

Our application uses JSON to store various user preferences (e.g. background color settings). Server does not care about the JSON, it's only relevant to client-side scripts. The data flow looks like this:
  1. User fills out the 'change preferences' form
  2. Javascript code creates JSON out of this and POSTs it to the server
  3. Server stores JSON in the database and retrieves it in future requests. The resulting code looks like this:  preferences_json = '{"background":"red"}';
  4. Javascript needs to get preferences, so it uses:
var prefs = eval("(" + preferences_json + ")");
document.body.style.backgroundColor = prefs.background;
...
XSSing this is simple - escape from single quotes:
//  payload:    '; alert(1); //
// this will execute in the preferences_json line like this:
preferences_json = ''; alert(1); //';
So the server needs to either deny ' in input or escape the output (' becomes \'). Are we secure now? No!

Filtering

Even with proper escaping, we do not have to escape the string at all for XSS. We can do the code execution in eval() call:
// payload:   alert(1),{"background":"red"}
// This will become:
eval("("+'alert(1),{"background":"red"}'+")");
// and alert will execute. 
One very bad practice I saw is disallowing certain characters in JSON input when storing it. For example, you could disallow "(" and ")" so that function can't be executed. But there are ways to execute functions without "()" - see for example this wonderful vector for IE. In fact, it's even possible to execute JS in this context if you're only allowing a-z A-Z 0-9 { } - : , . " (most of these characters are required for JSON to work anyway). Another example - allow \ and attacker can use obfuscation (e.g. \u0020 ) and embed pretty much any character.

If you validate JSON by looking at characters only and disallowing / allowing them - you're toasted. You've just opened a bag of tricks for attackers to use (and new techniques requiring fewer characters are still in development). Don't!

Try for yourself!

Think you can execute arbitrary code in this victim page? Yes, you can!

Source code for examples is, as always, on GitHub. And remember:
  • If you accept JSON input, make sure it's a legit JSON input (JSON is a subset of Javascript!). In PHP you can use json_decode()to validate server-side.
  • Validating JSON using character whitelist is a dead end. Don't do it. There are really tricky vectors around.
  • Don't use eval! There's JSON.parse() built into newer browsers, for older - use this.

Monday, August 8, 2011

How not to implement CAPTCHAs (MotionCAPTCHA rant)

CAPTCHA ("Completely Automated Public Turing test to tell Computers and Humans Apart") are security controls used to prevent automated attacks against a website. You can see them in Gmail account register pages, blog comment forms etc. Sometimes they are used as a CSRF attack countermeasure.

How to do it?

Designing CAPTCHA challenge is hard - you need to fight OCRs (if the CAPTCHA is based on image recognition) and cheap labour (attackers pay $12 for solving 500 CAPTCHA, according to OWASP). On the other hand, it needs to be solvable by your users - so this one, while certainly being a strong challenge, is probably not a good idea ;)


But no matter what the challenge, the whole CAPTCHA architecture must be secure. For example:

  • it must be protected against replay attacks (CAPTCHA ID generated once cannot be reused)
  • CAPTCHA given for one session should not be valid in other session. There must be a relation between CAPTCHA and a session, so that the attacker won't solve the CAPTCHA (e.g. manually) in his own session and submit the solved value in the other session.
  • CAPTCHA solution must not be stored on the client
  • it must be protected against brute-force attack (e.g.require another CAPTCHA after a few invalid responses)
  • The amount of possible CAPTCHAs must be A Big Number (bruteforce protection).
I really recommend all the developers to read Strong CAPTCHA guidelines as many design issues are tricky. 

MotionCAPTCHA - The fail of the day

What pushed me into writing this blog post is the MotionCAPTCHA project I encountered. It's a HTML5 (yay!) canvas based project where the challenge is to repeat drawing a given shape to prove being a human. I've seen a few broken CAPTCHAs in my life, but this one is just over the top. Just two examples:

These are the available shapes (all 16 of them):



And this is how you implement it (cited from the readme):
  • You manually disable your form, by emptying the form's action attribute, and placing its value in a hidden <input> with a specific ID. You should also put disabled="disabled" on the submit button, for added points.
  • You add a few HTML lines to your form to initialise the MotionCAPTCHA canvas, and add the plugin's scripts to your page.
  • The user draws the shape and, if it checks out, the plugin inserts the form's action into the <form> tag. The user can submit the form, happy days.
I mean, WTF? So the form looks like this:
<form action="#" method="post" id="mc-form"> 
				<p><input class="placeholder" type="text" placeholder="your name &hellip;"></p> 
				<p><input class="placeholder" type="email" placeholder="your email address &hellip;"></p> 
				<p><textarea class="placeholder" placeholder="your message &hellip;" cols="" rows="3"></textarea></p> 
 
				<div id="mc"> 
					<p>Please draw the shape in the box to submit the form: (<a onclick="window.location.reload()" href="#" title="Click for a new shape">new shape</a>)</p> 
					<canvas id="mc-canvas"> 
						Your browser doesn't support the canvas element - please visit in a modern browser.
					</canvas> 
					<input type="hidden" id="mc-action" value="http://josscrowcroft.com/projects/motioncaptcha-jquery-plugin/" /> 
				</div> 
 
				<p><input disabled="disabled" autocomplete="false" type="submit" value="Submit"></p> 
				
				<p><a href="http://twitter.com/share" class="twitter-share-button" data-count="horizontal" data-via="josscrowcroft">Tweet</a><script type="text/javascript" src="http://platform.twitter.com/widgets.js"></script></p> 
 
			</form>
and when I solve the challenge, the 'mc-action' will become the form action and the form submits? And the server does not care about the challenge cause it's client side only? How about bot POSTing the form in its entirety to the mc-action URL and bypassing the CAPTCHA completely? #security #fail! Client side CAPTCHA will never be secure, it just can't!