2.16.2011

Client-Side Contextual Encoding for jQuery

As everyone is probably aware by now, jQuery, the awesome brainchild of John Resig, is everywhere! This opened an opportunity for me in my crusade against DOM-based XSS: a plugin that lets developers contextually encode untrusted data on the client side (more and more important with widgets and Ajax all over the place).

So what is this new hotness, this awesome plugin? It's called jquery-encoder (yeah, I was feeling very creative when I came up with that name) and it is super-simple to use!

Here is a quick snippet showing the power of jQuery with jquery-encoder:
$.post('http://untrusted.com/webservice', function(data) {
   $('#result').encode('html', data);
});

Under the hood, this runs the untrusted data returned from the untrusted.com web service through an HTML entity encoding algorithm before setting it using the jQuery .html() function.
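
To give a feel for what HTML entity encoding does to a string, here is a minimal sketch. The encodeForHtml name and the choice of which characters to leave untouched are my assumptions for illustration; this is not the plugin's actual implementation.

```javascript
// Hypothetical helper for illustration only (not the plugin's code).
// Every character except ASCII letters, digits, space and a few harmless
// punctuation marks is replaced by a hexadecimal numeric entity.
function encodeForHtml(input) {
  // The "u" flag makes the regex match whole code points, so characters
  // outside the Basic Multilingual Plane are encoded as one entity.
  return String(input).replace(/[^A-Za-z0-9 .,_-]/gu, function (ch) {
    return '&#x' + ch.codePointAt(0).toString(16) + ';';
  });
}

var encoded = encodeForHtml('<script>alert(1)</script>');
// encoded is:
// &#x3c;script&#x3e;alert&#x28;1&#x29;&#x3c;&#x2f;script&#x3e;
```

A string encoded this way could then be handed to the jQuery .html() function, and the browser would render the markup characters as text instead of interpreting them.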

You can also encode for HTML attributes or CSS:
$.post('http://untrusted.com/user-theme-color', function(data) {
   $('body').encode('css', 'background-color', data);
});

$.post('http://untrusted.com/unique-id-generator', function(data) {
   $('#result').encode('attr', 'id', data);
});
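
The attribute and CSS contexts each have their own escaping rules. The sketch below shows one common approach for each at the string level; the function names encodeForCss and encodeForHtmlAttribute are hypothetical, and the plugin may well make different choices about which characters to escape.

```javascript
// Hypothetical helpers for illustration only (not the plugin's code).

// CSS context: escape everything except ASCII letters and digits as a
// backslash, the hex code point, and a terminating space.
function encodeForCss(input) {
  return String(input).replace(/[^A-Za-z0-9]/gu, function (ch) {
    return '\\' + ch.codePointAt(0).toString(16) + ' ';
  });
}

// HTML attribute context: escape everything except ASCII letters and
// digits as a hexadecimal numeric entity.
function encodeForHtmlAttribute(input) {
  return String(input).replace(/[^A-Za-z0-9]/gu, function (ch) {
    return '&#x' + ch.codePointAt(0).toString(16) + ';';
  });
}

var cssValue = encodeForCss('red;}');            // red\3b \7d  (with a trailing space)
var attrValue = encodeForHtmlAttribute('a"b');   // a&#x22;b
```

The point in both cases is the same: characters that could end the current value and start something new (a semicolon or brace in CSS, a quote in an attribute) come out as inert escapes.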


As soon as this matures and gets some testing, a full-blown technical description and user's guide will be available. For now, what I am really looking for is people to try it out! I don't recommend dropping this into your production code just yet; this is just a first attempt at getting this right.

The other big thing that I did was bring the awesome ESAPI canonicalization functionality to the jQuery world. This is *huge* for client-side validation and for detecting bad data (multiple/mixed encodings).

The canonicalize function is a static method on the jQuery object and can be used as illustrated below:
$.canonicalize('&lt;script&gt;'); // <script>
$.canonicalize('%3cscript%3e'); // <script>
$.canonicalize('%253cscript%253e'); // Raises exception (double)
$.canonicalize('&#x26;lt&#59'); // Raises exception (multi-double)
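
The idea behind detecting multiple and mixed encodings can be sketched without jQuery: keep decoding until the string stops changing, and complain if that took more than one pass or more than one scheme. The code below is a simplified illustration under my own assumptions, not the ported ESAPI code. It handles only percent-encoding and a handful of HTML entities, the names decodePercent, decodeHtmlEntities and canonicalizeSketch are hypothetical, and it skips CSS and JavaScript escapes entirely.

```javascript
// Simplified illustration only; the real canonicalizer covers more schemes.

// Decode %XX sequences to single characters (no UTF-8 handling here).
function decodePercent(s) {
  return s.replace(/%([0-9a-fA-F]{2})/g, function (match, hex) {
    return String.fromCharCode(parseInt(hex, 16));
  });
}

// Small table of named entities, just enough for the examples.
var NAMED_ENTITIES = { lt: '<', gt: '>', amp: '&', quot: '"' };

// Decode numeric and a few named HTML entities; the trailing ";" is optional.
function decodeHtmlEntities(s) {
  var pattern = /&(?:#[xX]([0-9a-fA-F]{1,6})|#([0-9]{1,7})|([A-Za-z]+));?/g;
  return s.replace(pattern, function (match, hex, dec, name) {
    if (name !== undefined) {
      return Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, name)
        ? NAMED_ENTITIES[name] : match;
    }
    var code = hex !== undefined ? parseInt(hex, 16) : parseInt(dec, 10);
    // Leave out-of-range code points untouched instead of throwing.
    return code <= 0x10FFFF ? String.fromCodePoint(code) : match;
  });
}

function canonicalizeSketch(input) {
  var decoders = [['percent', decodePercent], ['html', decodeHtmlEntities]];
  var current = String(input);
  var passes = 0;          // number of passes that changed the string
  var schemesSeen = {};    // which decoders changed something
  var changed = true;
  while (changed) {
    changed = false;
    for (var i = 0; i < decoders.length; i++) {
      var decoded = decoders[i][1](current);
      if (decoded !== current) {
        schemesSeen[decoders[i][0]] = true;
        current = decoded;
        changed = true;
      }
    }
    if (changed) passes++;
  }
  if (passes > 1) throw new Error('Multiple encoding detected');
  if (Object.keys(schemesSeen).length > 1) throw new Error('Mixed encoding detected');
  return current;
}

var once = canonicalizeSketch('%3cscript%3e'); // '<script>'
```

With this sketch, '%253cscript%253e' needs two passes to settle and is rejected, while '%26lt;' settles in one pass but uses two schemes and is rejected as mixed.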

IMHO, this is one of the most powerful utility functions available in the entire ESAPI, and I am super-stoked that I was able to port it to JavaScript for jQuery. However, it needs to be poked at, prodded, and broken before it is rock solid. I currently have a suite of about 70 test cases that I am throwing against it, but I am sure there are at least twice as many cases still worth covering. It decodes input according to HTML, CSS, and JavaScript escaping rules.

Dependencies
  • jQuery ( >=1.4.3 )
  • Class.extend function (prototype or John Resig's)

Links

Source: https://github.com/chrisisbeef/jquery-encoder/blob/master/src/main/javascript/org/owasp/esapi/jquery/encoder.js 

Minified: https://github.com/chrisisbeef/jquery-encoder/blob/master/jquery-encoder-0.1.0.js

Final Thoughts

Please share any questions or feedback in the comments, and feel free to reach me through GitHub as well.


Now, go forth and break it!

10 comments:

  1. I have no words for this great post such a awe-some information i got gathered. Thanks to Author.

  2. I came here from the OWASP DOM based XSS Prevention Cheat Sheet. There, it's said that "It has been well noted by the group that any kind of reliance on a JavaScript library for encoding would be problematic as the JavaScript library could be subverted by attackers."

    I can't understand it... how can it be subverted, if it's inside the HTML code?

  3. Because JavaScript is dynamic, i.e. code and data are treated as the same thing (it's really a Lisp-like language in many ways), if an attacker gets access to the DOM script tag containing the library, they can subvert your encoding library.

    For example, suppose the following encoding function is found in a library:
    Encoding_Library.prototype.supersecure_encoding_function = function(input) {
        // lots of sanitization stuff here
        return ultrasecureencoding(input); // whatever happens, it doesn't matter
    };

    If the attacker is able to get into the JavaScript execution context
    and gets a reference to your Encoding_Library class (this could be done by finding the id of the script tag via Firebug; if it had an id, they could just use getElementById, or DOM traversal in the worst case; remember, scripts are part of the DOM just like anything else),

    they can then do something like this:

    Encoding_Library.prototype.supersecure_encoding_function = function(input) {
        return input;
    };

    This will not break anything in the code on your site, and now your super secure encoding function
    is not so super secure. In fact, it's a no-op (i.e. it does nothing).

    It's late and I'm really tired, so I may have left something out, but
    that is basically the gist of it. JavaScript is really a minefield;
    the complexity of the various HTML execution contexts is really
    one of its poorest aspects.

    And to the author of the lib: thanks for the code, it was quite interesting to read over. There is a lot of misinformation about encoding out there, so stuff like this really helps people out!

  4. I still can't understand... how could an attacker get access to the DOM script tag?

  5. If an attacker is able to submit a string that gets rendered in the HTML without any encoding, it is possible for the attacker to break out of what we refer to as the "Data context" and into the "Command context", allowing them to execute arbitrary code in the victim's browser. In DOM-Based XSS, content is rendered without sending the data to the server, so it is up to the client to apply the correct encoding to the untrusted data to ensure that it is rendered as data only.

    You can find more information on what DOM-Based XSS is at the OWASP page here: https://www.owasp.org/index.php/DOM_Based_XSS

  6. Thank you for your reply!

    I understand what you're telling me; I've read OWASP's DOM-Based XSS page and I understand how it works.

    My question was this: in the OWASP prevention cheat sheet, it's said that "[...] any kind of reliance on a JavaScript library for encoding would be problematic as the JavaScript library could be subverted by attackers". But how can an attacker subvert the JS function that is encoding on the client side?

    Thank you very much!

  7. This comment has been removed by the author.

  8. Let me explain it a little more...

    Suppose I have a web page where all the strings coming from an untrusted source (another user, for example) were properly encoded (HTML-encoded or JS-encoded, whichever is appropriate for each string) on the server side, following the OWASP XSS and OWASP DOM-Based XSS prevention cheat sheets.

    OK, suppose now I have a JavaScript function on the client side. That function does encoding on the client side, to avoid DOM-Based XSS when the local user makes changes to some HTML in the page (because, although the HTML was initially encoded to avoid XSS, it gets decoded when the browser first reads it, and needs to be re-encoded when the user changes the HTML and we need to re-render it).

    The question is, how can that DOM-Based XSS prevention function be subverted by an attacker, if I'm preventing all kinds of non-DOM-Based XSS on the server side?

  9. If everything is encoded correctly, it would be extremely difficult for an attacker to compromise the page against another user without the addition of a browser flaw (of which there are many) that would allow them to subvert the JavaScript. That being said, the attacker can still subvert the JS protection easily in a controlled environment, which could allow them to discover and/or exploit additional flaws in the application.

    The key here is defense-in-depth, and your described approach is exactly the right approach to minimize the risk as much as possible. There isn't much you can do about browser bugs, but you can secure the application as much as possible from your end. Then, using additional defenses such as CSRF protection, validation, session management and AuthN/AuthZ controls, you can further minimize the risk to your application and your users.

    Hopefully this answers your question - and it is a great question!

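
The subversion discussed in the comments above can be reproduced in a few lines. This is a minimal sketch with a made-up EncodingLibrary class standing in for any client-side encoder; it assumes the attacker's code is already running in the same JavaScript context as the library.

```javascript
// Hypothetical stand-in for a client-side encoding library.
function EncodingLibrary() {}

EncodingLibrary.prototype.encode = function (input) {
  return String(input).replace(/</g, '&lt;').replace(/>/g, '&gt;');
};

var lib = new EncodingLibrary();
var before = lib.encode('<script>'); // '&lt;script&gt;'

// Code running later in the same context can replace the method.
// Existing instances pick up the change through the prototype chain.
EncodingLibrary.prototype.encode = function (input) {
  return input; // no-op: the "encoder" now returns its input unchanged
};

var after = lib.encode('<script>'); // '<script>'
```

Nothing on the page breaks when this happens, which is why the replacement is hard to notice, and why client-side encoding works best as one layer alongside server-side controls.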