HTML5 Security in a Nutshell

Lots of people have been asking us for opinions on HTML5 security lately. Chris and I discussed the potential attack vectors with the Veracode research team, most notably Brandon Creighton and Isaac Dawson. Here’s some of what we came up with. Keep in mind that the HTML5 spec and implementations are still evolving, particularly with respect to security concerns, so we shouldn’t assume any of this is set in stone.

Don’t Forget Origin Checks on Cross-Document Messaging

Applications that use cross-document messaging could be unsafe if origin checking is done incorrectly (or not at all) in the message receivers. Developers writing apps that rely on postMessage() must carefully check that messages originate from their own sites; otherwise, malicious code from other sites could send spoofed messages. The functionality itself isn’t inherently insecure, though; developers have used various DOM/browser capabilities to emulate cross-domain messaging for some time now. The window.name attribute has been abused, as has JavaScript-driven injection of HTML and URL rewriting. There’s even a cross-platform JavaScript library called easyXDM that provides a friendly interface to these hacks.
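To make the origin check concrete, here is a minimal sketch of a flawed check next to a stricter one. The origin values, the function names, and the `handle` callback mentioned in the comments are assumptions made up for illustration; they are not taken from any real application.

```typescript
// Hypothetical allowlist; the origin value is an assumption for illustration.
const TRUSTED_ORIGINS: readonly string[] = ["https://app.example.com"];

// Flawed: a substring search also matches attacker-controlled hosts such as
// "https://app.example.com.evil.test".
function isTrustedLoose(origin: string): boolean {
  return origin.indexOf("app.example.com") !== -1;
}

// Stricter: compare the whole origin string (scheme, host, and port) exactly.
function isTrustedStrict(origin: string): boolean {
  return TRUSTED_ORIGINS.indexOf(origin) !== -1;
}

// In a browser, the receiver would consult event.origin before touching
// event.data (handle is a hypothetical application function):
//   window.addEventListener("message", (event) => {
//     if (!isTrustedStrict(event.origin)) return; // drop untrusted senders
//     handle(event.data); // the message syntax still needs validating
//   });
```

The check is kept as a pure function so it can be exercised outside a browser. Passing the origin check says nothing about the message contents, which should still be validated.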

One bright spot with regard to cross-document messaging is that older apps won’t be threatened by these issues, only new apps that are intentionally written to rely on the feature.

Local Storage Isn’t as Problematic as You Think

Local storage doesn’t appear to present major security risks, despite a lot of FUD circulating on the topic. Besides cookies, there have always been numerous ways for web apps to store data client-side through the use of plugins (Java, JWS, Flash, Silverlight, Google Gears, etc.) or browser extensions. WebKit/Safari/Chrome supported local storage before it was even part of HTML5.

Developers should also be aware that as currently implemented, the HTML5 sessionStorage attribute can be vulnerable to manipulation from foreign sites under certain circumstances. A remote site can get a handle to a window containing a site for which a browser has data in sessionStorage. Then, the remote site can navigate to arbitrary URLs in that window, while the window will still contain its sessionStorage. Hopefully this implementation bug will be fixed by the time the standard is final.

New Tags Increase Attack Surface

HTML5 will also support new data formats and tags such as the <canvas> and <video> tags. In-browser support for video means browser developers now have to parse historically bug-ridden file formats. This increases the attack surface of HTML5 browsers but otherwise doesn’t affect the typical web app developer. The <canvas> tag is a complex set of functionality mixing JavaScript and imaging-related functions, and image parsers have historically been rife with vulnerabilities.

Developers Should Be Wary of Cross-Origin JavaScript Requests

Another new feature set that’s not directly part of HTML5, but has recently been introduced, is limited support for cross-origin JavaScript requests. Historically, it’s been forbidden for JavaScript code to request pages from any origin other than that of the page running the script; this is part of the same-origin policy. However, the W3C’s current draft for Cross-Origin Resource Sharing provides a way to relax the same-origin policy using a mechanism similar to the crossdomain.xml file in Flash (i.e. the server decides which domains are allowed to access its resources).

Firefox, Safari, and Chrome currently allow cross-domain requests to be sent using XMLHttpRequest. Before the entire request is allowed to proceed, the browser first sends a probe request using the OPTIONS method (instead of, for example, GET or POST). If the server responds to this probe with an “Access-Control-Allow-Origin” header that gives the source host permission to make the request, the browser will then send the full request with the requested HTTP method. This is consistent with the current working draft for W3C Cross-Origin Resource Sharing.
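The browser-side decision described above can be sketched roughly as follows. This is a simplified model, not a browser’s actual implementation: it looks only at the “Access-Control-Allow-Origin” header and ignores the method, header, and credential checks in the draft. The `HeaderMap` type and the function name are assumptions introduced for the example.

```typescript
// Response headers from the OPTIONS probe, keyed by lowercase header name.
// HeaderMap is a hypothetical helper type for this sketch.
type HeaderMap = { [name: string]: string };

// Returns true when the probe response grants the requesting origin access,
// i.e. when the browser would go on to send the full request.
function preflightAllows(requestOrigin: string, probeResponse: HeaderMap): boolean {
  const allowed = probeResponse["access-control-allow-origin"];
  if (allowed === undefined) return false; // no header: the request is blocked
  return allowed === "*" || allowed === requestOrigin;
}
```

For instance, `preflightAllows("https://a.example", { "access-control-allow-origin": "https://a.example" })` evaluates to `true`, while an empty header map yields `false`.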

However, IE works differently. Instead of relaxing permissions on XMLHttpRequest, it uses a new object type called XDomainRequest. Also, instead of sending a probe that replaces the normal HTTP method with OPTIONS, its probe includes the original HTTP method as well as the request body (in the other browsers, the request body is omitted).

The cross-domain-request features are actually fairly troublesome, from a security point of view. Malicious code on any site can cause probe requests to be sent to any other site, in every major browser, today. Developers need to be aware of both probe types and ensure that their applications won’t be fooled by probes. Fortunately, cookies aren’t passed in any browser’s probe request. Adding to the confusion, some of the official documentation on the topic contains reference code that is blatantly insecure. For example, in an MSDN page on XDomainRequest, ASP code is provided for setting the “Access-Control-Allow-Origin” header field to “*”. This would allow any remote site to make unauthenticated requests against that page from JavaScript, which is not advisable for most applications. Developers need to be sure they understand the dangers of creating an overly permissive access control list.
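As a contrast to answering every request with “*”, a server could return the header only for origins it explicitly trusts. The sketch below assumes a hypothetical allowlist and function name, and shows only the header decision rather than a complete server.

```typescript
// Hypothetical list of origins this server is willing to share responses with.
const ALLOWED_ORIGINS: readonly string[] = ["https://partner.example.com"];

// Returns the CORS response headers for a request's Origin header value.
// Unknown or missing origins get no Access-Control-Allow-Origin header at all.
function corsHeadersFor(requestOrigin: string | undefined): { [name: string]: string } {
  if (requestOrigin !== undefined && ALLOWED_ORIGINS.indexOf(requestOrigin) !== -1) {
    return {
      "Access-Control-Allow-Origin": requestOrigin,
      // The response differs per origin, so caches should key on Origin.
      "Vary": "Origin",
    };
  }
  return {};
}
```

Echoing back only allowlisted origins keeps the access control list explicit, at the cost of having to maintain it.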

Sandbox Attribute Could Make Security Easier

One thing that may help, depending on how the standard is eventually defined and implemented, is the support for a sandbox attribute on IFRAMEs. This attribute will allow a developer to choose how data should be interpreted. Unfortunately, this design, like much of HTML, has a pretty high chance of being misunderstood by developers and may easily be disabled for the sake of convenience. If done properly, it could help protect against malicious third-party ads, or any other untrusted content that a site accepts and redisplays.
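As a rough illustration, a page embedding third-party content might generate its frame markup like this. The sketch assumes the sandbox tokens named in the current HTML5 draft (for example “allow-scripts” and “allow-forms”), which may still change. The helper names and the ad URL are hypothetical.

```typescript
// Escape characters that could break out of a double-quoted attribute value.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build iframe markup. An empty permission list produces sandbox="",
// which is the most restrictive setting; each token relaxes one restriction.
function sandboxedIframe(src: string, permissions: string[] = []): string {
  const tokens = escapeAttr(permissions.join(" "));
  return `<iframe sandbox="${tokens}" src="${escapeAttr(src)}"></iframe>`;
}
```

Calling `sandboxedIframe("https://ads.example/banner.html")` yields a frame with an empty sandbox attribute. Passing `["allow-scripts"]` re-enables scripting only. This is the kind of convenience switch that is easy to over-apply.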

Always Remember Input Validation

The most important thing that developers can do is to remember basic security tenets, for example, the idea that all user input should be considered untrusted. They should learn how the new HTML5 features actually work in order to understand where they’d be tempted to make erroneous assumptions.


Marc Ruef | May 18, 2010 3:34 am

Hello,

Nice summary. It would be nice if you are able to present some simple example of potentially flawed code and/or proof-of-concepts of exploits.

Regards,

Marc

Dr.Ali Jahangiri | May 18, 2010 7:19 am

It is a great article. I will use it as a reference in my lectures or publications.

Andre Gironda | May 18, 2010 10:50 am

What do you think of html5security.googlecode.com?

brandon creighton | May 19, 2010 11:43 am

hey! it’s good to see other people working on this.

Cross-domain messaging: yes, of course you have to do it incorrectly to be insecure. But it’s not as implicit as you make it sound; in the postMessage() handler, it’s up to you to check the event object’s origin. Nothing’s making you do that, and certainly nothing’s preventing you from messing it up (say, by doing a substring search for “trustedhost.com”). There’s some better discussion about it on the MDC page for postMessage(): https://developer.mozilla.org/En/DOM/Window.postMessage

brandon creighton | May 19, 2010 12:14 pm

Also, I’m kind of with you on the codec situation. In a typical user’s browser, media files are already being parsed by a host of third-party plugins; it’s tough to argue that the attack surface is greatly increased by the audio/video tags — or even other new parsing code being introduced into browsers (WOFF, web socket frames, etc.). Browser authors now have a new minefield of parsing/playing code they’ll need to be careful with; but users have been running that kind of code already for years.

On the other hand, I gotta disagree that CORS/XDR are entirely immaterial. Certainly the inability to read the responses from preflights limits certain attacks. That doesn’t mean exploitation scenarios won’t exist. The preflights don’t send Referer, following XHR’s tradition; that’s not true of DOM-manipulation-based POSTs. (Referers are, sadly, not reliably present enough to be used as infallible XSRF protection; but that doesn’t mean they’re not worth checking when they *are* present). If XDR also sent cookies in the preflight, you’d be able to XSRF non-nonce-protected requests like file uploads (you can send arbitrary bodies!) and other weird stuff. If you can header-inject, or if the target server has messed up the Access-Control-Allow-Origin:, you can actually do that. That’s not true now. That’s why it’s sad to see stuff like the MSDN article’s “just let * through” code-snippet recommendation; I suspect we’ll all be seeing that mistake being made in apps shortly.

It is unfair to single out CORS/XDR; it’s probably more constructive to weave them into the “XSRF is a serious problem” narrative. I think it’s time to put the onus on browser developers to clearly make available the provenance of request origins (including referers and request source (XHR/link click/JS navigation)) in a standard, reliable way.

Isaac Dawson | May 19, 2010 3:43 pm

@mario
Some pretty good stuff on the html5security.googlecode.com site. In regards to his review of our article, let me quickly state that this was a very informal ‘what do you feel some of the risks pose for the new html/web technologies?’ and we figured some good information came out of it so we decided to share it. There are obviously people investing a lot of time into researching specific attacks and defenses. Anyways there are a few things I would like to point out from his review:
>• Cite: Don’t Forget Origin Checks on Cross-Document Messaging
>o Those checks are implicit – as a developer you have to do it wrong on
> purpose. You really have to want it. Especially when handling the message
> – why not mentioning native JSON here as an awesome alternative to eval()
> or __defineSetter__() and friends? And… code please! Don’t tell to not mess
> up – show how to mess up and how not to!

Having to check the origin of a message is not implicit. In this case not checking the event’s origin property will allow any message through to the receiver. Mozilla sums it up nicely in their post from https://developer.mozilla.org/en/DOM/window.postMessage.
“If you do expect to receive messages from other sites, always verify the sender’s identity using the origin and possibly source properties. Any window (including, for example, http://evil.example.com) can send a message to any other window, and you have no guarantees that an unknown sender will not send malicious messages. Having verified identity, however, you still should always verify the syntax of the received message. Otherwise, a security hole in the site you trusted to send only trusted messages could then open a cross-site scripting hole in your site.”

>• Cite: New Tags Increase Attack Surface
>o No they don’t. Testing with enumeration and fuzzing showed that not the
>tags are the problem – but the attributes. And what browsers do with it.
> and don’t cause too much harm from a web security perspective -
>there’s nothing they do hasn’t done before and that much worse. Not
> to even mention . Think about the attributes – “autofocus”, “poster”,
> “formaction”, and all the stuff that has been there before – the tags were never
> the real problem.

Probably a poor title heading choice on our part, but we figured it would be assumed that we were primarily touching on the new formats that are being brought in for the new video / audio tags. Not the tags themselves. Also Brandon spoke on this subject already in the comments.

>• Cite: Developers Should Be Wary of Cross-Origin JavaScript Requests
>o Yes – and? It’s easy to make cross domain requests – use images, style
>sheets, scripted forms whatever. What’s hard is reading the response – and
>that’s what’s relevant for most attack patterns in this direction either. Don’t
>blame XDR for what HTTP did wrong.

We pretty much agree and we thought we made that somewhat clear in our post. That doesn’t mean there are not concerns, especially when some developers will almost certainly be opening up all access to their servers whether or not they understand the ramifications of doing so.

>• Cite: Sandbox Attribute Could Make Security Easier
>o Nope – it doesn’t. We have the SOP for this and it (usually) works for cross
>domain restrictions. One problem the “sandbox” attribute can actually solve if
>implemented correctly are frame busters via “allow-scripting”. Since no user
>agent has actually implemented the sandbox so far it’s still a lot of hypothesis
>powder involved in saying it makes security easier. What will the “webmaster”
>do with sand-boxed tags and let’s say his precious money making ad
>banners and skyscrapers? No JavaScript for them? Or better some flash? Or no
>at all? Sand-boxed frames raise a lot of new questions instead of
>solving them. Again you managed to invert HTML5 goodies to be problematic
>and vice versa. And what about “seamless” iframes?

As we stated, the sandbox is a moving target and we are not entirely sure how much it will help. It really depends on how it will be implemented. We feel that in certain circumstances, this type of sandboxing can be a benefit to sites that wish to include some third party content in their site without putting their users at risk in the event that the content is malicious. Also, it should be noted that the sandbox attribute has already been implemented in Chrome Dev channels and Chromium and is currently in WebKit trunk.

>• Cite: Always Remember Input Validation
>o Ah – right! Input validation. I totally forgot in my new project… no
>comments on that educational gem. So – what makes input validation in the
>HTML5 era special? What should we be taking care of? What are critical things
>to keep in mind? Input validation to prevent client side SQL injection maybe? New
> attributes that need to be black-listed? Or less attributes to be white-listed?

I think most people understand input validation is a fickle… beast. It really depends on what you are trying to protect against and you can not apply a one size fits all solution to input.

Later on the author goes on to state some things we should have added.

>“Attributes, undoManager, inline SVG, , the “autofocus” related
>security implications. What about “srcdoc”? What about focus stealing
>attacks? What about toStaticHtml()? What about DoS via client side regex
>validation? Where’s actual advice on how to do it better? And not to forget about
>the things HTML5 actually fixes – like the weird SHORTTAG syntax – compare
>Firefox 3.6+ in HTML4 and HTML5 mode with this example <phref=”javascript:alert(1)”>ab…”

Which is awesome, some of these I had not been aware of. Here is my quick assessment of them.

UndoManager – This object is bound to the window so SOP would apply. The usual XSS risks exist here. If you have XSS you can read the entire DOM anyways, so besides reading typos not sure what the additional risk is.

Inline SVG – This somewhat falls under the new data format support, increasing the attack surface. Somewhat unrelated, but I found a bug in Chromium’s SVG implementation (http://code.google.com/p/chromium/issues/detail?id=21338) a while back.

Meta charset – Good point (http://code.google.com/p/doctype/wiki/MetaCharsetAttribute). Forcing the character set of a document so that browsers do not ‘sniff the charset’ is definitely something developers should be aware of. However, this is not really different from the HTML 4 charset declaration, which, if not supplied in HTML 4 documents, could lead to the same issues.

Autofocus – Also a good point, Chrome will even autofocus into an iframe (overriding the primary document’s autofocus’d element), which could be problematic and could potentially be (ab)used in DOM redressing style attacks. Not to mention the fact that just having the ‘autofocus’ attribute and an onfocus event inside a tag leads to instant execution of script.

srcdoc attribute – srcdoc allows developers to include 3rd party content in an attribute. This can be protected by the sandbox attribute but will definitely have its own interesting security challenges. Doing proper validation, specifically encoding quotes to protect against attribute breakout, will be key to maintaining control over the data included in the srcdoc attribute.
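A minimal sketch of that quote encoding, assuming the markup is emitted inside a double-quoted srcdoc attribute; the function names are made up for the example, and a real application would likely need more than this:

```typescript
// Encode the two characters that matter inside a double-quoted attribute.
// "&" goes first so the entities added afterwards are not double-encoded.
function escapeForSrcdoc(html: string): string {
  return html.replace(/&/g, "&amp;").replace(/"/g, "&quot;");
}

// Wrap untrusted markup in a sandboxed frame so that, even if it contains
// script, the sandbox restrictions still apply to it.
function srcdocIframe(untrustedHtml: string): string {
  return `<iframe sandbox="" srcdoc="${escapeForSrcdoc(untrustedHtml)}"></iframe>`;
}
```

Without the encoding step, content beginning with `">` would close the attribute and the tag, and whatever followed would be parsed as part of the embedding page.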

Focus stealing attacks – somewhat summed up in the autofocus attribute but same sort of risks with DOM redressing.

toStaticHtml – This is only implemented in IE8, and as far as I know nothing in HTML 5 really compares to it, unlike IE’s XDR, which is a similar implementation of CORS.

DoS via client side regex validation – I am assuming this is in response to input fields allowing regular expressions in the “pattern” attribute (http://www.whatwg.org/specs/web-apps/current-work/#the-pattern-attribute). It is assumed here that the browser’s built-in ECMAScript engine will be parsing these regexes. Invalid regexes could put a user at risk if one can inject values into a victim’s form.

SHORTTAG – totally agree, it will help significantly once, of course, the backwards compatibility is removed!!!

lavakumar | June 20, 2010 5:00 pm

@Chris
This is probably the very first article specifically on HTML5 security, great effort.

I have put up a couple of articles on the HTML5 Security project on Google code covering Cross Origin Request security and Web SQL Database security.(URLs below)
http://code.google.com/p/html5security/wiki/WebSQLDatabaseSecurity &
http://code.google.com/p/html5security/wiki/CrossOriginRequestSecurity

If you feel there is something missing based on your research then please do drop in a comment.

@Marc
I have put up an HTML5 Quick Reference Guide along with live demos at http://www.andlabs.org/html5.html

Security a Concern as HTML5 Gains Traction | September 19, 2010 11:34 pm

[...] has been around for a long time with technologies like Flash. However, engineers at Veracode in May raised warnings about an implementation issue with the sessionStorage feature that could make it vulnerable to manipulation from untrusted Web sites. The new [...]

Tom | April 18, 2011 1:03 am

We really need to establish a formal set of standards before we can really progress in web development.

Bob Novell | June 18, 2011 3:28 pm

Sandboxed frames are one of the biggest security problems I’ve seen in a long time.

Look at all the work that people invest in trying to prevent their pages being “framed” and to break out of those frames.

Along comes a feature which will allow anyone to frame any page on the web – regardless of the wishes of the owner of the page.

Oh, you say X-Frame response headers will prevent this?

Well, not really.

Currently IE8 displays a message saying that the material cannot be displayed and FF simply displays about:blank but there is no way to break out of the frame and go to the actual page on your web site – using the X-Frame header alone.

I am implementing frame busters on all of my sites and testing the different methods.

One thing that burns me is Google’s image search. Hey, could they make it any easier to steal images? I don’t see how!

Click on a thumbnail on a search results page and you get the page on which the image exists in a frame underneath a copy of the image on top of the frame where you simply right click and “save image as” to add the image to your collection of stolen images.

If someone wants to see my images, which, by the way, are copyrighted, they can go to my page.

If I use a frame breaker, they wind up on my page, coming from a Google image search (or anyone’s image search which frames my pages).

But, and it is a BIG BUT, if I use X-Frame response header to deny the framing of the page, the nice box containing my image still is displayed, the only difference is that the framed page is not displayed underneath it. That doesn’t accomplish what I want to do – I want to break out of the frame and display my page without anyone’s frames.

Sandboxed frames were designed by clickjackers – right?

I can’t see how any sane person would conceive of such an idea and then actually contemplate implementing it.

It’s not enough that there are ways to void most (if not all) frame breaking methods, now we give the crooks another way — sandboxed frames.

Wow! The inmates are most definitely running the asylum.

Bob Novell

