Currently, the mixed-content restrictions in Firefox and the WebSocket API spec [WS-API] prevent use of insecure ws:// URLs in webapps served over HTTPS. This is a problem because the protocol carried over the WebSocket connection may itself provide confidentiality and integrity for the data, so the restriction is tighter than security actually requires.
This document justifies loosening the restriction in browsers, for suitable WebSocket subprotocols.
Providing a VNC or SSH client in the browser really needs a way to selectively allow ws:// connections from an HTTPS-served page.
Browsers have a notion of active and passive content: content which can perform an action that affects multiple elements across a page, as opposed to content with a purely passive role. A browser executes the contents of resources loaded by <script> tags, so these are clearly active, but it safely parses and displays a resource specified by an <img> tag. Classifying WebSockets and XHR is difficult because, strictly speaking, they’re neither active nor passive, but can be used as either.
For safety, browsers classify XHR as active, because it is overwhelmingly common for a document to be returned and then inserted into the page. Historically, Firefox has seen WebSockets as active, but Chrome has not.
The recommendation of this document is to sidestep this debate by simply refining the notion of ‘mixed-content’ for WebSocket connections to take into account the WebSocket subprotocol. If a connection is deemed not to be mixed, then it doesn’t matter whether it’s active or passive.
The WebSocket API currently includes this requirement:
If secure is false but the origin of the entry script has a scheme component that is itself a secure protocol, e.g. HTTPS, then throw a SecurityError exception and abort these steps.
HTML, 10.3.2 The WebSocket interface
I suggest moving step 4 (normalising the protocols argument) above step 2, and altering step 2 as follows:
If secure is false and protocols is an empty array or includes an insecure protocol, but the origin of the entry script has a scheme component that is itself a secure protocol, e.g. HTTPS, then throw a SecurityError exception and abort these steps.
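The altered step 2 can be read as a single boolean test. The following is only a sketch of that reading, not browser code: the function name, its parameters and the isInsecure predicate are all hypothetical, and it assumes protocols has already been normalised to an array (which is why step 4 has to run first).

```typescript
// Hypothetical predicate: decides whether a subprotocol counts as "insecure".
// How it is defined is left open here.
type InsecureProtocolTest = (subprotocol: string) => boolean;

// Sketch of the proposed step 2. Returns true when the WebSocket
// constructor should throw a SecurityError and abort.
function mustThrowSecurityError(
  secure: boolean,          // true for wss://, false for ws://
  protocols: string[],      // subprotocol list, already normalised to an array
  originIsSecure: boolean,  // e.g. the entry script's origin is HTTPS
  isInsecure: InsecureProtocolTest,
): boolean {
  if (secure || !originIsSecure) {
    return false; // not a mixed-content situation at all
  }
  // ws:// from a secure origin: permitted only if at least one subprotocol
  // was requested and none of the requested ones is insecure.
  return protocols.length === 0 || protocols.some(isInsecure);
}
```

With no subprotocols the check behaves exactly like the current spec text, so existing pages that open a bare ws:// connection from HTTPS would still be blocked.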
The notion of ‘insecure protocol’ must be defined. Protocols which include their own encryption and integrity control need not be encapsulated by TLS to match the security of the containing page. Browsers could take one of several approaches:
- Extend the WebSocket API to allow a webapp to declare that an unknown subprotocol is ‘secure’. For example, the elements in the protocol array could be taken to be either strings, or an object with
- Simply regard any explicit WebSocket subprotocol as secure. Use of subprotocols is not at all widespread at the moment, so this may be an acceptable starting point. (The mixed-content checks are ultimately advisory: they are a second layer of protection which doesn’t confer security by itself, and they just break sites that are doing something dodgy. If site admins could be trusted to be virtuous, mixed-content protection wouldn’t be needed.)
- Include a configuration parameter with some sensible initial values. Getting support for new protocols would be a big pain though and an annoying barrier to entry.
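Two of the options above are simple enough to sketch as predicates answering "is this subprotocol insecure?". This is an illustration under assumptions, not a proposal for concrete API: every name is hypothetical, the subprotocol tokens in the allowlist are placeholders rather than registered names, and the first option is left out because its object shape is not specified here.

```typescript
// Hypothetical predicate type: true means the subprotocol is "insecure".
type InsecureProtocolTest = (subprotocol: string) => boolean;

// "Any explicit subprotocol is secure": nothing that was named explicitly
// is ever classified as insecure.
const anyExplicitSubprotocolIsSecure: InsecureProtocolTest = () => false;

// "Configuration parameter": only subprotocols on a configured list are
// regarded as secure; everything else is insecure.
function allowlistPolicy(secureSubprotocols: string[]): InsecureProtocolTest {
  const allowed = secureSubprotocols.slice(); // copy so later edits don't leak in
  return (subprotocol) => allowed.indexOf(subprotocol) === -1;
}

// Placeholder initial values (assumed for illustration only).
const defaultPolicy = allowlistPolicy(["ssh", "example-encrypted"]);
```

The allowlist version makes the barrier to entry visible: a new protocol is blocked until someone adds its token to the list.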
Finally, the section 10.3.3 "Feedback from the protocol" needs to be updated:
- Change the protocol attribute's value to the subprotocol in use, if it is not the null value. If it is the null value and secure is false but the origin’s scheme is a secure protocol, then:
  - Change the readyState attribute’s value to CLOSED (3).
  - Close the WebSocket connection.
  - Fire a simple event named error at the WebSocket object.
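The point of this second check is that the server may not select any subprotocol, in which case the connection really is mixed. A minimal sketch of the updated steps, using a plain object as a stand-in for the WebSocket (the SocketState interface and applyFeedback function are hypothetical, and the other steps of the section are not modelled):

```typescript
// Minimal stand-in for the parts of a WebSocket object these steps touch.
interface SocketState {
  protocol: string;
  readyState: number;
  connectionClosed: boolean;
  firedEvents: string[];
}

const CLOSED = 3;

// Sketch of the updated handling once the connection is established.
function applyFeedback(
  socket: SocketState,
  subprotocolInUse: string | null, // null when the server selected none
  secure: boolean,                 // true for wss://
  originIsSecure: boolean,         // e.g. the page was served over HTTPS
): void {
  if (subprotocolInUse !== null) {
    socket.protocol = subprotocolInUse;
    return;
  }
  if (!secure && originIsSecure) {
    // No subprotocol was negotiated, so the connection is mixed after all.
    socket.readyState = CLOSED;
    socket.connectionClosed = true;
    socket.firedEvents.push("error");
  }
}
```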
The same argument applies to XHR and Server-Sent Events, but these are not as important as getting WebSockets to work.
Such use-cases are unknown (or extremely rare) in the wild. WebSockets is the most general of the three, in that anything done with XHR requests or Server-Sent Events could also be done with WebSockets, so a more general mechanism that also covers XHR and Events is not a high priority.
I was initially looking for a generic solution that would also work for XHR (and Server-Sent Events), and I wrote a few Firefox patches along different lines. I didn’t post them, though, because I was not at all confident they’d be accepted. The idea of a generic solution is to provide a way for a resource to say that it isn't actually mixed. Suppose a non-TLS WebSocket or XHR connection is made. The browser could send an additional header (
Access-Control-Request-Protocol) and get back an additional header (
(Remember: mixed content detection is a heuristic! It’s there to make it hard for site admins to write bad sites by breaking sites when they try and do silly things, like mixed content. The blocker doesn’t make a site secure though — that's achieved through actually using TLS everywhere, and the admin can do that without any mixed-content blocking from the browser. It’s just there to create a secure ecosystem.)
I had a draft of a spec and an implementation too that looked a bit like CSP (per-origin protection) rather than CORS (per-resource).
Ultimately, those proposals just look too ugly when specced out to be convincing. I think it’s OK to ditch XHR and Events, because they’re really just a subcase of WebSockets, so using WebSocket subprotocols to get the right behaviour is sufficient.