A Deep Dive Into V8

Most front-end developers run into this buzzword all the time: V8. Much of its popularity comes from the fact that it took JavaScript to a new level of performance.

Yes, V8 is very fast. But how does it perform its magic, and why is it so responsive?

The official docs state that “V8 is Google’s open source high-performance JavaScript and WebAssembly engine, written in C++. It is used in Chrome and Node.js, among others”.

In other words, V8 is software written in C++ that translates JavaScript into executable code, i.e. machine code.

In this epiphanic moment, we start seeing things more clearly. Both Google Chrome and Node.js are just bridges that carry the JavaScript code to its final destination: machine code running on that specific machine.

Another important contributor to V8's performance is its generational, highly accurate garbage collector. It is optimized to reclaim the objects that JavaScript no longer needs while keeping memory overhead low.

Besides that, V8 relies on a set of other tools and features to improve some inherent JavaScript characteristics that have historically made the language slow (like its dynamic nature, for example).

In this article, we'll explore those tools (Ignition and TurboFan) and features in more detail. More than that, we'll cover the basics of V8's internal functioning, compilation and garbage collection processes, single-threaded nature, and more.

Let's go!

Now that you're diving into V8, you might be interested in diving into Application Monitoring for Node.js with AppSignal as well.

Starting With the Basics

How does machine code work? Machine code, in short, is a set of very low-level instructions that the CPU executes directly, loaded into specific parts of the machine’s memory.

The process of generating it, using C++ as a reference, looks roughly like this: source code → compiler → object files → linker → executable file → loader → memory.

Before going any further, it’s important to point out that this is a compilation process, which is different from the JavaScript interpretation process. While the compiler generates a whole program at the end of the process, the interpreter is itself a program that does the job by reading the instructions (usually as scripts, like JavaScript scripts) and translating them into executable commands.

The interpreting process can happen either on the fly (the interpreter parses and runs only the current command) or fully parsed (the interpreter first translates the entire script before proceeding with the respective machine instructions).

Back to the compilation process: it usually starts with the source code, as you know. You write the code, save it and run it. The running process, in turn, starts at the compiler. The compiler is a program, like any other, running on your machine. It goes through all the code and generates object files. Those files contain machine code, optimized to run on that specific machine, which is why you need a different compiler (or compiler target) when you move from one OS or architecture to another.

But you can’t execute separate object files. You need to combine them into a single executable file (the well-known .exe file on Windows). That’s the linker’s job.

Finally, the loader is the agent responsible for transferring the code inside that executable file to the virtual memory of your OS. It is basically a transporter. And here, you have your program finally up and running.

Sounds like a lengthy process, doesn't it?

Most of the time (unless you’re a developer working with Assembly in a bank’s mainframe) you’ll spend your time programming in high-level languages: Java, C#, Ruby, JavaScript, etc.

As a rule of thumb, the higher-level the language, the slower it tends to run. That’s why C and C++ are usually so much faster: they compile straight to machine code and sit very close to assembly language.

One of the main benefits of V8, apart from performance, is the possibility of going beyond the ECMAScript standard and working with, for example, C++ as well. Because V8 is embedded in a C++ host application, that host can expose C++ functionality to JavaScript code.

JavaScript is restricted to ECMAScript. And V8, in order to exist, must be compliant with ECMAScript but is not restricted to it.

Having the ability to incorporate C++ features into V8 is great. Since C++ has evolved to be very good at OS-level specifics, like file manipulation and memory/thread handling, having all this power in JavaScript's hands is very useful.

If you think about it, Node.js itself was born in a similar way. It builds on V8 and adds server and networking capabilities on top.

Single-Threaded

If you’re a Node developer, you’ll be familiar with V8’s single-threaded nature. Each JavaScript execution context maps to exactly one thread.

Of course, V8 manages the OS threading mechanism behind the scenes. It works with more than one thread because it’s a complex piece of software that does many things at the same time.

We have the main thread that executes the code, another one to compile the code (yes, we can’t stop the execution every time new code has to be compiled), some others to deal with garbage collection, and so on.

However, V8 creates a single-threaded environment for each JavaScript execution context. The rest is kept under its control.
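One way to see this from the outside is Node.js's `worker_threads` module. The sketch below assumes it runs under Node.js as a CommonJS script; `worker_threads` is a Node.js API rather than part of V8 itself, and `workerSource` is just an illustrative name. Each worker gets its own JavaScript execution thread, separate from the main one.

```javascript
// Node.js API (not V8 itself): every Worker is an independent JavaScript
// execution thread with its own threadId.
const { Worker, isMainThread, threadId } = require("worker_threads");

console.log(`main thread? ${isMainThread}, threadId: ${threadId}`);

// The worker's code is passed as a string so the example fits in one file.
const workerSource = `
  const { parentPort, threadId } = require("worker_threads");
  parentPort.postMessage("hello from worker thread " + threadId);
`;

// `eval: true` tells Node.js to treat the first argument as source code.
const worker = new Worker(workerSource, { eval: true });
worker.on("message", (msg) => console.log(msg));
worker.on("error", (err) => console.error(err));
```

The main script and the worker never share a call stack; they only exchange messages.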

Imagine the stack of function calls your JavaScript code is supposed to make. JavaScript works by stacking one function call on top of another, in the order in which each one is called. Before reaching a function’s body, we can’t know whether it calls other functions. If and when that happens, the called functions are placed on top of the caller in the stack.

Asynchronous callbacks, for example, are not placed on the stack right away: they wait their turn and run only after the calls already on the stack have finished.
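Here is a minimal sketch of that ordering. The names `outer`, `inner` and `order` are made up for illustration, and `setTimeout` comes from the host (the browser or Node.js), not from V8 itself.

```javascript
// Records the order in which things actually run.
const order = [];

function inner() {
  order.push("inner");
}

function outer() {
  order.push("outer:start");
  inner(); // placed on top of outer in the stack, so it finishes first
  order.push("outer:end");
}

// The callback is not run now: it waits until the current stack is empty.
setTimeout(() => {
  order.push("callback");
  console.log(order.join(" -> "));
  // prints "outer:start -> inner -> outer:end -> script:end -> callback"
}, 0);

outer();
order.push("script:end");
```

Even with a delay of 0, the callback runs last, because everything already on the stack has to finish first.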

The management of this stack organization and the memory the process will need is one of the main tasks of V8.

Ignition and TurboFan

Since version 5.9, released in May 2017, V8 comes with a new JavaScript execution pipeline that was built on top of Ignition, V8’s interpreter. It also includes a newer and better optimizing compiler, TurboFan.

These changes were totally focused on the overall performance and the difficulties Google developers were facing when adapting the engine to all the quick and considerable changes that the JavaScript universe brought up.

From the very beginning of the project, V8 maintainers were always worried about finding a good way to improve V8’s performance at the same pace JavaScript was evolving.

Now we can see huge improvements when running the new engine against the biggest benchmarks:

[Figure: TurboFan and Ignition performance metrics. Source: https://v8.dev/blog/launching-ignition-and-turbofan]

You can read more about Ignition and TurboFan here and here.

Hidden Classes

This is another one of V8's magic tricks. JavaScript is a dynamic language. That means that properties can be added, replaced and removed at execution time. This is not possible in languages like Java, for example, in which an object's structure (its class, fields and methods) must be defined before program execution and can’t be changed dynamically after the app starts.
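A short illustration of that dynamism, using a made-up `user` object:

```javascript
// A plain object whose structure changes while the program runs.
const user = { name: "John May" };

user.phone = "+1 (555) 555-1234"; // add a property
user.name = "John A. May";        // replace a value
delete user.phone;                // remove a property
user.address = "123 3rd Ave";     // add another one

console.log(Object.keys(user).join(", ")); // prints "name, address"
```

Nothing about `user` had to be declared up front, which is exactly what makes its memory layout hard to predict.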

Because of this dynamic nature, JavaScript interpreters usually perform a dictionary lookup based on a hash function to find out exactly where a given property of an object is stored in memory.

This adds significant overhead to every access. In other languages, the layout of an object is known when it is created, so the position of each field is fixed. This way, we know exactly where everything is placed in memory and how much space to allocate.

With JavaScript, that’s impossible, since we can’t map what doesn't exist yet. That’s where hidden classes come in.

Hidden classes work almost like classes in Java: static, fixed layouts with a unique address to locate them. However, rather than defining them before program execution, V8 creates them at runtime, every time there is a “dynamic change” in the object’s structure.

Let’s look at an example to clarify things. Consider the following code snippet:

JavaScript

function User(name, phone, address) {
  this.name = name;
  this.phone = phone;
  this.address = address;
}

Within JavaScript’s prototype-based nature, every time we instantiate a new User object, let’s say:

JavaScript

var user = new User("John May", "+1 (555) 555-1234", "123 3rd Ave");

Then V8 creates a new hidden class. Let’s call it _User0.

Each object has a reference to its class representation in memory. It’s the class pointer. At this point, since we just instantiated a new object, only a hidden class was created in memory. It is empty for now.

When you execute the first line of code in this function, a new hidden class is going to be created based on the previous one, this time _User1.

_User1 basically describes a User that has only a name property. Our example never uses users with just a name, but whenever such an object exists, this is the hidden class V8 will use as its reference.

The name property is stored at offset 0 of the object's memory buffer, which means it will be considered the first attribute in the final order.

V8 will also add a transition value to the _User0 hidden class. This tells the engine that every time a name property is added to a User object, the transition from _User0 to _User1 must be followed.

When the second line in the function runs, the same process happens again and a new hidden class, _User2, is created. The third line then produces _User3.

You can see that the hidden classes keep track of the order in which the properties were added. One hidden class leads to another in a chain maintained by the transition values.
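To make the chain more concrete, here is a toy model of hidden classes and transitions. It is only a sketch under simplifying assumptions, not V8's real data structures: `HiddenClass`, `addProperty` and `shapeAfter` are hypothetical names, and each property simply gets the next free offset.

```javascript
let nextId = 0;

// Toy hidden class: a fixed layout plus links to the classes that follow it.
class HiddenClass {
  constructor(offsets = {}) {
    this.name = `_User${nextId++}`; // _User0, _User1, ...
    this.offsets = offsets;         // property name -> offset
    this.transitions = new Map();   // property name -> next HiddenClass
  }

  // Follow an existing transition, or create the next class in the chain.
  addProperty(prop) {
    if (!this.transitions.has(prop)) {
      const offset = Object.keys(this.offsets).length;
      this.transitions.set(prop, new HiddenClass({ ...this.offsets, [prop]: offset }));
    }
    return this.transitions.get(prop);
  }
}

const user0 = new HiddenClass(); // the empty starting class, _User0

// Simulates a constructor body: one transition per property assignment.
function shapeAfter(props) {
  return props.reduce((hiddenClass, prop) => hiddenClass.addProperty(prop), user0);
}

const first = shapeAfter(["name", "phone", "address"]);
const second = shapeAfter(["name", "phone", "address"]);
const reordered = shapeAfter(["phone", "name", "address"]);

console.log(first.name, JSON.stringify(first.offsets));
// prints: _User3 {"name":0,"phone":1,"address":2}
console.log(first === second);    // true: the existing chain is reused
console.log(first === reordered); // false: a different order builds a new chain
```

In this model, two objects built by the same assignments end up on the same hidden class, while a different assignment order forks a new chain from _User0.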

The order in which the properties are added determines how many hidden classes V8 is going to create. If you change the order of the lines in the code snippet we’ve created, different hidden classes will be created as well. That’s why some developers try to keep the assignment order consistent, to reuse hidden classes and, therefore, reduce the overhead.
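In practice, this usually means creating objects in one place and always assigning the same properties in the same order. The sketch below uses hypothetical factory functions; it cannot show hidden classes directly, so it uses `Object.keys`, which reports properties in insertion order, as a visible stand-in for the layout.

```javascript
// Same properties, same order, on every call.
function makeUser(name, phone = null) {
  return { name, phone };
}

// Same properties, but assigned in a different order.
function makeUserReordered(name, phone = null) {
  return { phone, name };
}

// Adds `phone` only sometimes, so the resulting objects differ in structure.
function makeUserLoose(name, phone) {
  const user = { name };
  if (phone) user.phone = phone;
  return user;
}

const a = makeUser("John May", "+1 (555) 555-1234");
const b = makeUser("Jane Doe");
const c = makeUserReordered("Sam Lee", "+1 (555) 555-0000");
const d = makeUserLoose("Ann Ray");

console.log(Object.keys(a).join(","), "|", Object.keys(b).join(",")); // name,phone | name,phone
console.log(Object.keys(c).join(","));                                // phone,name
console.log(Object.keys(d).join(","));                                // name
```

Only `a` and `b` are built the same way, so they are the ones that can be expected to share a hidden class.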

Inline Caching

This is a very common term in the world of JIT (Just In Time) compilers. And it connects directly with the concept of hidden classes.

Every time you call a function passing an object as a parameter, for example, V8 will take a look at this action and think: “Hmm, objects with this same hidden class have been passed to this function twice or more… why not cache where their properties live for future calls rather than perform the whole time-consuming hidden class lookup again?”

Let’s recap our last example:

JavaScript

function User(name, phone, address) {
  // Hidden class _User0
  this.name = name; // Hidden class _User1
  this.phone = phone; // Hidden class _User2
  this.address = address; // Hidden class _User3
}

After a function has been called twice with User objects (instantiated with any values), V8 will skip the hidden class lookup and go directly to the properties at their known offsets. This is much faster.

However, remember that if you change the order of any attribute assignment in the function, it will result in different hidden classes, so V8 won’t be able to make use of the inline caching feature.
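The idea can be sketched with a toy cache for a single property access. This is a simplified model, not V8's implementation: objects are represented as `{ shape, slots }`, where `shape` stands in for the hidden class, and `makeGetter` is a hypothetical helper.

```javascript
// Toy inline cache for one property-access site.
function makeGetter(prop) {
  let cachedShape = null;
  let cachedOffset = -1;
  const stats = { hits: 0, misses: 0 };

  function get(obj) {
    if (obj.shape === cachedShape) {
      stats.hits++; // fast path: same shape as last time, reuse the offset
      return obj.slots[cachedOffset];
    }
    stats.misses++; // slow path: look the property up, then remember it
    cachedShape = obj.shape;
    cachedOffset = obj.shape.indexOf(prop);
    return obj.slots[cachedOffset];
  }

  return { get, stats };
}

const userShape = ["name", "phone", "address"];      // shared by john and jane
const reorderedShape = ["phone", "name", "address"]; // same properties, other order

const john = { shape: userShape, slots: ["John May", "+1 (555) 555-1234", "123 3rd Ave"] };
const jane = { shape: userShape, slots: ["Jane Doe", "+1 (555) 555-9876", "456 4th St"] };
const sam = { shape: reorderedShape, slots: ["+1 (555) 555-0000", "Sam Lee", "789 5th Rd"] };

const nameSite = makeGetter("name");
const results = [
  nameSite.get(john), // miss: first time this shape is seen
  nameSite.get(jane), // hit: same shape as john
  nameSite.get(sam),  // miss: different shape, the cached offset can't be reused
];

console.log(results.join(", "));                       // prints "John May, Jane Doe, Sam Lee"
console.log(nameSite.stats.hits, nameSite.stats.misses); // prints "1 2"
```

The answers are correct in every case; what changes is how much work it takes to find them.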

This is a great example to show that developers shouldn’t avoid getting to know the engine more intimately. On the contrary, having such knowledge will help your code perform better.

Garbage Collecting

Remember that we mentioned V8 collects garbage on separate threads? This helps a lot, since it keeps most of that work from interrupting our program’s execution.

V8 uses the well-known “mark-and-sweep” strategy to collect dead and old objects in memory. In this strategy, the GC scans the objects in memory and “marks” the ones that are still in use; everything left unmarked is then swept away. The marking phase is a bit slow, because it pauses execution in order to do its job.

However, V8 does it incrementally, i.e., for each GC stop, V8 tries to mark as many objects as possible. It makes everything faster because there’s no need to stop the entire execution until the collection finishes. In large applications, the performance improvement makes a lot of difference.
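Below is a toy version of incremental mark-and-sweep over a hand-made object graph. It is a sketch of the general technique, not V8's collector: the `heap` map, the `markStep` budget and the object ids are all invented for illustration.

```javascript
// Toy heap: each object lists the ids of the objects it references.
const heap = new Map([
  ["root", { refs: ["a", "b"], marked: false }],
  ["a", { refs: ["c"], marked: false }],
  ["b", { refs: [], marked: false }],
  ["c", { refs: ["a"], marked: false }], // cycle with a, but still reachable
  ["d", { refs: ["e"], marked: false }], // unreachable
  ["e", { refs: ["d"], marked: false }], // unreachable cycle
]);

// Objects that have been discovered but not scanned yet.
const pending = ["root"];

// Marks at most `budget` objects, then returns so the program could resume.
// Returns true once there is nothing left to scan.
function markStep(budget) {
  while (budget > 0 && pending.length > 0) {
    const obj = heap.get(pending.pop());
    if (obj.marked) continue; // already visited
    obj.marked = true;
    pending.push(...obj.refs);
    budget--;
  }
  return pending.length === 0;
}

// Frees every unmarked object and clears the marks for the next cycle.
function sweep() {
  const freed = [];
  for (const [id, obj] of heap) {
    if (obj.marked) {
      obj.marked = false;
    } else {
      heap.delete(id);
      freed.push(id);
    }
  }
  return freed;
}

let slices = 0;
let done = false;
while (!done) {
  done = markStep(2); // a small slice of marking per pause
  slices++;
  // ...application code would get to run here between slices...
}

const freed = sweep();
console.log(`marking slices: ${slices}, freed: ${freed.join(", ")}`);
// prints "marking slices: 3, freed: d, e"
```

A real incremental collector also has to cope with the program changing references between slices, which this sketch leaves out.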

If you are monitoring your Node.js app with AppSignal, the Magic Dashboard for Garbage Collection stats in our Node.js integration can help you discover and fix garbage collection problems. No setup is required: this dashboard will ~~magically~~ automatically appear among the rest of your dashboards.

A Continuous Ride

I hope you enjoyed this piece. The goal was to shed some light on V8's structural details, which are often misunderstood or ignored.

The whole thing is naturally complex. And it’s constantly evolving. But these are pretty much the core concepts.

At the time of writing this article, the GitHub repo counts 15.3k stars and 2.9k forks. You too can fork the open source code and change the V8 engine as you please. Add your own custom C++ instructions, change the way hidden classes are dealt with, or simply use your imagination.

But first, don’t forget to give the official docs a good read. Happy reading!

P.S. If you liked this post, subscribe to our new JavaScript Sorcery list for a monthly deep dive into more magical JavaScript tips and tricks.

P.P.S. If you are now interested in seeing how to monitor your Node.js app and zoom in on Garbage Collection, you might want to check out AppSignal application monitoring for Node.js.