There is a lot of information to cover to understand this code base, so we’ll begin by looking at some of the major components so that the terminology in future posts and other articles will make sense. I am going to rely on a lot of references to introduce this material because plenty of high-level descriptions of V8’s components already do a much better job than I could in such a short post.
V8 itself is a relatively small piece within the Chromium browser. In fact, it is a completely separate code base that can be embedded within other projects (like Node.js!). It is important to understand the layout here because we will talk a lot about V8 exploitation, but V8 usually runs inside one or more sandboxed processes. V8 bugs must be chained with additional exploits to reach full-blown code execution on most systems. For our purposes, we will be happy to gain unrestricted code execution within V8’s process. For example, this is a high-level look at Chrome that shows how V8 is situated within multiple layers of components.
The cool thing about V8 being so modular is that we do not have to dig deep into any other code bases to understand it. While looking through Chromium source code could paint a better picture for certain design choices, it is largely unnecessary.
Now we’ll explore some of the most important parts of V8. You can find all of the code here or at the GitHub mirror. In my next post I’ll go in depth on how these components are written in C++, but for now let’s just get an understanding of what they do.
Ignition is V8’s interpreter. Many exploits focus on JIT code and the mistakes made during the compilation process. However, the compiled code relies on the bytecode and type feedback produced by the interpreter! While fewer security-related bugs have been found here, there are still some, as pointed out by Dimitri Fourny and Moritz Jodeit. To understand this component, I recommend briefly looking over this in-depth document from Google and reading this good, quick explanation of how V8 generates bytecode by Franziska Hinkelmann.
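To make the idea of bytecode a little more concrete, here is a minimal sketch of a function small enough that its bytecode listing is easy to read. The function name `add` and the file name `add.js` are just placeholders I picked for illustration.

```javascript
// A tiny function whose Ignition bytecode is short enough to read by hand.
// The name `add` is only an illustrative placeholder.
function add(a, b) {
  return a + b;
}

// Ignition compiles functions lazily, so the function has to be called
// at least once before any bytecode is generated for it.
console.log(add(1, 2));
```

Assuming the file is saved as `add.js`, running `node --print-bytecode --print-bytecode-filter=add add.js` should dump the bytecode for `add`. Expect only a handful of instructions (typically register loads, an `Add`, and a `Return`), though the exact listing varies between V8 versions.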
Turbofan is V8’s optimizing compiler. Most V8 exploits focus on this component, and as a consequence we will too. Some aspects to understand here will be the optimization pipeline, variable typing, and memory safety checks. There’s a great presentation (slides) for basic understanding. This article by Jeremy Fetiveau is a really awesome introduction from an exploitation perspective (note that it goes much more in-depth than we have so far).
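As a rough illustration of why variable typing matters to the optimizer, the sketch below calls one function many times with numbers and then once with a string. The names `sum` and `total` and the loop count are arbitrary choices of mine; whether and when V8 actually optimizes the function depends on its internal heuristics and version.

```javascript
// `sum` is a hypothetical example function, not something taken from V8.
function sum(a, b) {
  return a + b;
}

// Call the function many times with small integers. The type feedback
// collected while it runs in the interpreter suggests "numbers only",
// which the optimizing compiler can specialize for once the code is hot.
let total = 0;
for (let i = 0; i < 100000; i++) {
  total = sum(total, 1);
}

// Now break that assumption with a string. If the function was optimized
// for numbers, the engine has to fall back to unoptimized code here.
const mixed = sum("1", 2);

console.log(total, mixed);
```

Running this with `node --trace-opt --trace-deopt` should print log lines whenever a function is optimized or deoptimized. The result of the program is the same either way, which is the point: the checks Turbofan inserts are what keep speculative code correct, and exploits tend to target cases where such a check is missing or wrong.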
Liftoff is the component that creates machine code from WebAssembly. It is able to compile WebAssembly very quickly; however, it does not produce optimized code. It actually passes its output to Turbofan immediately for optimization (as opposed to Ignition which waits for code to be run a certain number of times first). This component is updated quite frequently, and since it is relatively new, there are more and more bugs being found in it. As WebAssembly becomes more popular, there may be more opportunities for security research into this piece of V8.
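For a feel of what the WebAssembly side looks like from JavaScript, here is a minimal sketch that instantiates a hand-assembled module exporting a single `add` function. The byte array is a small module I wrote out by hand for illustration, and which compiler tier ends up handling it is up to the engine.

```javascript
// A hand-assembled WebAssembly module that exports add(i32, i32) -> i32.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic "\0asm", version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // function section: one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export section: "add" = function 0
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section: one body, no locals
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,                   // local.get 0, local.get 1, i32.add, end
]);

// Synchronous compilation and instantiation is fine for a module this small.
const wasmModule = new WebAssembly.Module(wasmBytes);
const wasmInstance = new WebAssembly.Instance(wasmModule);
const wasmAdd = wasmInstance.exports.add;

console.log(wasmAdd(2, 3));
```

Everything from the bytes above to executable machine code happens inside V8, which is why the WebAssembly compilers are an interesting attack surface in their own right.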
To get even better performance, V8 comes with pre-compiled code for built-in functions (the functions defined by the ECMAScript standard). These used to be written with the CodeStubAssembler (CSA), which was introduced in 2017. However, hand-writing these assembly-level functions led to several bugs, prompting the introduction of Torque a year later. Essentially, Torque makes it easier to write efficient code for built-in functions across the various architectures supported by V8.
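One quick way to see the difference between a built-in and ordinary JavaScript is to ask each function for its source text. The helper `userPush` below is a hypothetical stand-in I wrote for comparison; it is not how V8 implements `Array.prototype.push`.

```javascript
// Built-ins ship pre-compiled with the engine, so no JavaScript source
// is available for them.
const builtinSource = Array.prototype.push.toString();

// `userPush` is a hypothetical user-land stand-in, written only for comparison.
function userPush(arr, value) {
  arr[arr.length] = value;
  return arr.length;
}
const userSource = userPush.toString();

console.log(builtinSource); // placeholder text instead of real source
console.log(userSource);    // the function's actual JavaScript source
```

The built-in reports a `[native code]` placeholder because its body lives in the engine rather than in a script, while the user-defined function prints its own source.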
Another great post was actually published while I was writing this section. It does a fantastic job summarizing not just an exploit involving Torque, but many other concepts I have talked about already.
Now we have covered some of the major components listed in the V8 docs with a high-level summary of each. From here, we will mostly focus on Turbofan, taking a slower approach than most of the existing vulnerability research in an attempt to get a very low-level understanding. Much later we will look back at Ignition, Liftoff, and Torque.