In a previous post, I covered the basic security principal that Chromium uses for its security model. The goal of this post is to outline a few details that are vital to understanding the limitations imposed on the process model. It will look at some fairly obvious parts of the web platform, framed in HTML spec terminology.
When a browser is navigated to a URL, it makes a network request to the server named in the URL for the document that URL identifies. The response is a document, which is then parsed and rendered in a window. Those should be familiar, since they correspond to the identically named objects in JavaScript. This holds true for iframes as well: each has its own window object, which hosts the respective document. The HTML spec uses a different name for window - “browsing context” - while it keeps document as the same concept. The standard defines a few types:
- top-level browsing context - the main window for a page
- nested browsing context - a window embedded in another window, for example through an <iframe> tag
- auxiliary browsing context - a top-level browsing context “related” to another browsing context, or put more simply - any window created through the window.open() API, or through a link with a target attribute.
I will use frame to refer generically to any browsing context - be it a page or an iframe, as they are basically the same concept with two different names based on the role they play.
There are two concepts the HTML spec defines that are important to understand. The first one is “reachable browsing context”. This is somewhat intuitive, as all frames that are part of a web page are reachable to each other. In JavaScript this is exposed through the window.parent and window.frames properties. In addition, related browsing contexts are reachable too, by using the return value of window.open() and the window.opener property. For example, if we have a page with two iframes, which opens a new window with an iframe, then all of the frames are reachable.
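To make reachability concrete, here is a minimal sketch in Python that models the example above as a graph and walks it. The Frame class, its method names and the frame names are all hypothetical and exist only for illustration; this is not how Chromium or the HTML spec represents browsing contexts. It also assumes that the opener keeps the return value of window.open(), so the link can be followed in both directions.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # eq=False keeps identity-based hashing, so frames can go in a set
class Frame:
    """Hypothetical model of a browsing context: a page or an iframe."""
    name: str
    parent: Optional["Frame"] = None                       # like window.parent
    opener: Optional["Frame"] = None                       # like window.opener
    children: List["Frame"] = field(default_factory=list)  # like window.frames
    opened: List["Frame"] = field(default_factory=list)    # results of window.open()

    def add_iframe(self, name: str) -> "Frame":
        child = Frame(name, parent=self)
        self.children.append(child)
        return child

    def open_window(self, name: str) -> "Frame":
        window = Frame(name, opener=self)
        self.opened.append(window)
        return window


def reachable_from(start: Frame) -> set:
    """Breadth-first walk over parent, child, opener and opened links."""
    seen = {start}
    queue = deque([start])
    while queue:
        frame = queue.popleft()
        neighbours = [frame.parent, frame.opener, *frame.children, *frame.opened]
        for other in neighbours:
            if other is not None and other not in seen:
                seen.add(other)
                queue.append(other)
    return {frame.name for frame in seen}


# A page with two iframes opens a new window, which has an iframe of its own.
page = Frame("page")
iframe_1 = page.add_iframe("iframe-1")
iframe_2 = page.add_iframe("iframe-2")
popup = page.open_window("popup")
popup_iframe = popup.add_iframe("popup-iframe")

all_names = {"page", "iframe-1", "iframe-2", "popup", "popup-iframe"}
```

Starting the walk from any of the five frames, including the iframe inside the popup, yields the same set of five names, which is what "all of the frames are reachable" means here.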
The set of reachable frames - all of them in the above case - forms the other concept the standard defines - “unit of related browsing contexts”. It is important because documents that want to communicate with other documents are allowed to do so only if they are part of the same unit of related browsing contexts. Internally, the Chromium source code uses the BrowsingInstance class to represent this concept. For the sake of brevity, I’ll use this name from here on.
When two documents want to communicate with each other, they need to have a reference to the window object of the target document. Any frame in a BrowsingInstance can get a reference to any other frame in the same BrowsingInstance since they are all reachable by definition.
How documents can interact with each other is governed by the same origin policy. When documents are from the same origin, or can relax their origin to a common one, they are allowed to access each other directly. Cross-origin documents, on the other hand, are not allowed such access. So the frames in a BrowsingInstance can be split into sets, grouped by the origin of the documents they host. But recall that we can’t easily use the origin as a security principal in Chromium. This is why we use the concept of SiteInstance - the set of frames in a BrowsingInstance which host documents from the same Site. It is vital to remember that the Chromium browser process makes all of its process model and isolation decisions based on SiteInstance and not based on origins.
The HTML spec requires all same-origin documents that are part of the same unit of related browsing contexts to run on the same event loop - or in other words, the same thread of execution within a process. This means that all frames which are part of the same SiteInstance must execute on the same thread, while different SiteInstances can run on different ones. In the example above, the two pages are in the same BrowsingInstance because they are related through the window.open() call. The different SiteInstances should be for a.com, b.com, c.com, d.com.
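The grouping can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the frame names, the URLs assigned to them, and above all the site_of() helper, which approximates a Site as the scheme plus the last two labels of the host. A real browser consults the Public Suffix List instead, and Chromium's actual SiteInstance logic is considerably more involved.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def site_of(url: str) -> str:
    """Simplified Site: scheme plus the last two host labels.

    Assumption for this sketch only; it gives wrong answers for hosts
    under multi-label public suffixes such as co.uk.
    """
    parts = urlsplit(url)
    labels = (parts.hostname or "").split(".")
    return parts.scheme + "://" + ".".join(labels[-2:])


def group_into_site_instances(frames: dict) -> dict:
    """Group the frames of one BrowsingInstance by the Site of their document."""
    groups = defaultdict(list)
    for name, url in frames.items():
        groups[site_of(url)].append(name)
    return dict(groups)


# Hypothetical documents hosted by five frames of a single BrowsingInstance.
frames = {
    "page": "https://a.com/",
    "iframe-1": "https://b.com/widget",
    "iframe-2": "https://mail.a.com/inbox",
    "popup": "https://c.com/",
    "popup-iframe": "https://d.com/ad",
}

site_instances = group_into_site_instances(frames)

# Frames in one SiteInstance must share an event loop. Giving every
# SiteInstance its own loop is allowed, though not required.
event_loop_of = {
    name: loop_id
    for loop_id, members in enumerate(site_instances.values())
    for name in members
}
```

In this made-up layout, "page" and "iframe-2" host documents from different origins (a.com and mail.a.com) but the same Site, so they land in one SiteInstance and share an event loop, while the other three frames each get a SiteInstance of their own.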
Overall it all boils down to the following rules that Chromium needs to abide by:
- All frames within a BrowsingInstance can reference each other.
- All frames within a SiteInstance can access each other directly and must run on the same event loop.
- Frames from different SiteInstances can run on separate event loops.
Phew! Now there is enough background to start delving into the details of Chromium’s implementation of these concepts from the HTML spec and its process allocation model.