# Laymen’s JS spec

While reading this, you are strongly advised to have a JS console open, so you can try some stuff out for yourself.

This document won't explain the syntactical oddities of JS, just the runtime semantics, for someone looking to implement a JS runtime.

## 1. Value types

- Undefined and null - basically the same
- Booleans (true, false)
- Numbers (64 bit floats)
- Symbols - unique and special "marker" GC-able values that contain nothing
- BigInts - integers with no size restrictions
- Strings - characters are of a 16-bit width, and may contain \0
- Objects - a string to value dictionary, with a prototype (later on)
- Functions - an object that can be called, basically an object with a func ptr appended to it
- Arrays - Objects that have special behaviors when accessing number members, as well as "length"

## 2. Objects

### 2.1. Object basics

In layman's terms, a JS object is just a string to value hashmap. When we do a member lookup, we just have to convert the lookup key to a string and search for it in the hashmap (of course there's more to it, but let's get to it later).

NOTE: Keys are stored in insertion order, so if you set "a", "b" and "c" in order, when we iterate the object's keys, we will get them in the same order - a, b and c. You can try it out in the console, by writing out the object literal `({ a: 10, b: 20, c: 30 })`. You can shuffle the properties around, and they will keep the order you wrote them in. The one exception is integer-like keys (such as "1" or "42"), which are always enumerated first, in ascending numeric order.

Objects define a few base operations for members:

- Member lookup (get the member's value)
- Member setting
- Member query (check if we have the member)
- Member removal
- Member definition (for now, it might look the same as member set, but we'll talk about it later)
- Member enumeration (in order)

### 2.2. Object prototypes

Prototypes are at the base of JS's OOP model. In short, a prototype is an object that the lookup will "fall back" to, if we don't find the member in our object.
Every JS object can have a prototype (some objects have what's called a `null` prototype, aka no prototype at all). Since every object can have a prototype, we can have object A that has prototype B, and B can have prototype C. In this case, C can define "foo", B can define "bar" and A can define "baz". When we look up "foo" in A, the lookup will fall back on B, which will fall back on C, and then the member will be found.

```js
var A = { baz: 1 };
var B = { bar: 2 };
var C = { foo: 3 };

Object.setPrototypeOf(A, B);
Object.setPrototypeOf(B, C);

A.foo // 3
B.foo // 3
A.bar // 2

C.foo = 10;

A.foo // 10
B.foo // 10
```

Prototypes, however, make it ambiguous which object an operation is actually acting upon. This is why JS defines two types of member operations - normal ones, and those restricted to the current object (aka they don't work with the prototype). So with this consideration, our actual list of member operations is the following:

- Own member lookup
- Own member set
- Own member query
- Own member enumeration
- Own member removal
- Own member definition (again, we will get to the difference from `set`)
- Normal member lookup
- Normal member set
- Normal member query
- Normal member enumeration

Note how we can only define and remove members from the specific object. In short, this is because we almost never want to delete a member from the prototype if we don't find it in the object itself. A prototype may be used by thousands of objects, and if one of them accidentally deletes a property of the prototype, it will break the rest. The same logic follows for member definition.

Additionally, we can define three more operations with the prototype:

- Prototype get
- Prototype set
- Prototype remove

### 2.3. Properties and fields

Until now, we've looked at just members, which were just JS values. However, the JS model is far more flexible. First of all, we have two types of members: fields and properties.
Fields are the simpler member - they are just a value, with the following additional options:

- writable - if false, sets to this member will not have an effect
- configurable - we'll get to it
- enumerable - if false, the enumeration will skip over this member by default

Properties are the more complex counterpart of fields - when we try to get them, a user-defined JS function will be called, and its return value will be returned as the member's value. Similarly, a user-defined JS function with one argument is called when setting the value of the member. Those two functions - getters and setters - are optionally defined (aka a user may only define the getter, making the property effectively readonly).

Properties have the same flags as fields, except for writable - as was said above, the setter can just be omitted.

### 2.4. Member configuration (redefinition) and deletion

Since members have the above mentioned flags, we can "tweak" them under certain circumstances. This is called a configuration. The act of redefining or deleting a member, to be precise, is a configuration.

As you might've spotted, we have a "configurable" flag. This flag is used to "lock down" the member, preventing people from redefining it. After setting "configurable" to false, we can no longer delete or redefine the member. In fact, just one kind of redefinition is allowed for non-configurable members - if the member is a field, and it is still writable, we are allowed to redefine it with a field with the same enumerable and configurable flags, but with a new value and/or with "writable" set to false. Other than that, if we try to redefine a non-configurable member with the exact same parameters, the operation will succeed, and nothing will change.
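
To make the lock-down rules above a bit more tangible, here is a small sketch you can paste in the console. The `tryRedefine` helper and the `locked` object are made up for illustration, only `Object.defineProperty` and `Reflect.deleteProperty` are real built-ins.

```js
const locked = {};

// A writable field that is locked down (non-configurable).
Object.defineProperty(locked, "x", {
    value: 1, writable: true, enumerable: true, configurable: false,
});

// Hypothetical helper: reports whether a redefinition is accepted.
function tryRedefine(target, key, descriptor) {
    try {
        Object.defineProperty(target, key, descriptor);
        return true;
    } catch (err) {
        return false; // Object.defineProperty throws a TypeError on rejection
    }
}

const redefinitions = {
    // The permitted transition: writable true -> false.
    dropWritable: tryRedefine(locked, "x", { value: 1, writable: false, enumerable: true, configurable: false }),
    // Exact same parameters: succeeds, nothing changes.
    sameParams: tryRedefine(locked, "x", { value: 1, writable: false, enumerable: true, configurable: false }),
    // Everything below is rejected, the field is now locked and readonly.
    changeValue: tryRedefine(locked, "x", { value: 2 }),
    restoreWritable: tryRedefine(locked, "x", { writable: true }),
    flipEnumerable: tryRedefine(locked, "x", { enumerable: false }),
    unlock: tryRedefine(locked, "x", { configurable: true }),
};

// Deleting a non-configurable member fails as well.
const deleted = Reflect.deleteProperty(locked, "x");

console.log(redefinitions, deleted);
```

Only the first two redefinitions should report `true`; the rest, as well as the deletion, should report `false`.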
Another caveat is that, when a configurable property is redefined with another property, instead of straight up overwriting it, if the new definition omits the getter or the setter, the respective function is going to be taken from the old property:

```js
var obj = {};

Object.defineProperty(obj, "test", {
    get: () => 10,
    enumerable: true,
    configurable: true
});
Object.getOwnPropertyDescriptor(obj, "test");
/*
{
    get: [Function],
    set: undefined,
    configurable: true,
    enumerable: true,
}
*/

Object.defineProperty(obj, "test", {
    set: v => console.log(v),
    enumerable: false,
    configurable: true
});
Object.getOwnPropertyDescriptor(obj, "test");
/*
{
    get: [Function],
    set: [Function],
    configurable: true,
    enumerable: false,
}
*/
```

The pseudocode for redefining a member is as follows (it returns the member that should be stored, or `null` if the redefinition is rejected):

```js
// `old` is the stored member, `desc` is the requested one. Both always carry
// the "enumerable" and "configurable" flags; fields carry "value" and
// "writable", properties may carry "get" and/or "set".
const isProperty = m => "get" in m || "set" in m;

function redefine(old, desc) {
    if (old.configurable) {
        if (isProperty(old) && isProperty(desc)) {
            // An omitted getter/setter is taken from the old property
            return {
                ...desc,
                get: "get" in desc ? desc.get : old.get,
                set: "set" in desc ? desc.set : old.set,
            };
        }
        return desc;
    }

    if (old.enumerable !== desc.enumerable) return null;
    if (desc.configurable) return null;
    // A locked member can't switch between field and property
    if (isProperty(old) !== isProperty(desc)) return null;

    if (isProperty(old)) {
        if (old.get !== desc.get || old.set !== desc.set) return null;
        return old;
    }

    if (!old.writable) {
        if (!Object.is(old.value, desc.value)) return null;
        if (desc.writable) return null;
        return old;
    }
    // Still writable: the value may change and "writable" may be dropped
    return desc;
}
```

Deletion is much more straightforward - we only need to check if the member exists and is configurable. Note that a deletion will succeed if the member doesn't exist - it will fail only if the member exists and isn't configurable.

### 2.5. Normal, non-extensible, sealed and frozen objects

These are the four states of a JS object, which determine to what extent we can modify it.

1. Normal mode - all the semantics explained so far apply as-is
2. Non-extensible mode - we can't define new members, but we are allowed to redefine and delete the existing ones. The prototype of the object is also locked as-is
3. Sealed - Same as non-extensible, but all the members are non-configurable, too (can't delete them and can't redefine them, except for the mentioned mechanics)
4. Frozen - Same as non-extensible, but all members are non-configurable and all fields are readonly. Properties retain their setters

Note that a runtime may implement only an extensible flag on the object, and the rest may be done from JS instead.

## 3. Arrays

In short, arrays are nothing more than special-case objects. All arrays have the "length" member, as well as members with numeric keys, ranging from 0 to length - 1. These keys are arbitrary, and it's just a "contract" of sorts that they will represent elements from an array.

JS arrays are sparse as well. This means that a JS array may define elements from 0 to 5, and then from 8 to 10. In this case, elements 6 to 7 are not members of the array (Try it out by writing the `[1,,2]` literal in your console. You will see a hole in the middle, for example `[1, empty, 2]`, depending on the console, and when you try to call `Object.getOwnPropertyDescriptor([1,,2], "1")`, you will get `undefined`, aka no such field).

Another caveat is that when you define the member "5" of an array of length 3, the array's length will automagically update to `6` (because index `5` is the 6th element). Setting the "length" member of the array can have two effects: if the new length is bigger, the array will just grow, without gaining any new members (the new slots stay empty), and if the new length is smaller, the array will automagically shrink to that length, deleting the members that are outside the bounds of the array.

Most sane engines of course define a special case object for arrays, which is usually backed by a linear buffer. When you try to grow the array by setting members out of its bounds, it will grow its buffer to accommodate its new members. When deleting a member (which will make the array sparse), engines usually take one of two routes: they either enter panic mode and convert the whole array to a normal object, or just set a special "flag" value in place of the empty value.
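
The "length" contract described above can be observed directly from JS. This is just a minimal sketch of the observable behavior, not of how an engine stores the array internally; the variable names are made up for illustration.

```js
const arr = [1, 2, 3];

// Defining member "5" on an array of length 3 bumps the length to 6.
arr[5] = "x";
const lengthAfterSet = arr.length;            // 6
const hasIndex3 = 3 in arr;                   // false, index 3 is a hole

// Growing via "length" adds no members, only empty slots.
arr.length = 10;
const keysAfterGrow = Object.keys(arr);       // ["0", "1", "2", "5"]

// Shrinking via "length" deletes every member that is out of bounds.
arr.length = 2;
const keysAfterShrink = Object.keys(arr);     // ["0", "1"]
const hasIndex5 = 5 in arr;                   // false

console.log(lengthAfterSet, hasIndex3, keysAfterGrow, keysAfterShrink, hasIndex5);
```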
Other nasty things the user can do are: defining a normal member with a non-index key (for example "pe6o" or "0.5") in the array, or defining a property (getter-setter member) with the name "0" in an array. Non-index keys like the first two are handled by just having an underlying object on speed dial. The last example however is more interesting - in such cases, most engines will either, again, enter panic mode and revert to object mode, or just overlay the array with a backing object.

## 4. Functions

Functions are at the core of what makes the JS clock tick. As in any language, they are the storage of JavaScript code that gets executed when the function value gets invoked. But JS functions are objects as well. This means that you can work with a function as you would with any other regular object - define properties on it, freeze it, list its keys, etc.

However, a function wouldn't be a function without its ability to be called. In JS, functions can be called in two distinct ways:

- apply - a normal call, aka `my_func(a, b, c)`
- construct - a call with new, aka `new my_func(a, b, c)`

### Applying functions

When applying a function, you will pass all the arguments you put in the parens, but an implicit `this` argument will get passed, too. In most cases, `this` will be passed as `undefined` (non-strict functions will then see the global object instead). After that, the function will consume the arguments, execute its body and return the result. Then the result will become the evaluated value of the call expression.

Now, the JS syntax allows for one special way of calling - a member call. This is achieved by calling a member expression: `a.b(c)`. In this case, the value of `a` will be passed as the implicit `this` argument of the call. This leads to a lot of JS gotchas, mostly when trying to pass a method of an object as a value to somewhere else, when you get these `this is undefined` exceptions.
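
A minimal sketch of that gotcha, with a made-up `user` object: the very same function value receives a different `this`, depending only on how the call expression is written.

```js
const user = {
    name: "pe6o",
    getThis() {
        "use strict"; // so that a missing `this` really shows up as undefined
        return this;
    },
};

// Member call: the value of `user` is passed as the implicit `this`.
const viaMemberCall = user.getThis();   // user

// The method is first taken out as a plain value, then called normally,
// so no `this` is passed at all.
const detached = user.getThis;
const viaPlainCall = detached();        // undefined

console.log(viaMemberCall === user, viaPlainCall);
```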
This is easily resolved by calling `a.member.bind(a)`, which will produce a new function that, when invoked, will replace the implicit `this` argument with the passed value instead.

### Constructing functions

In JS, we achieve OOP by "simulating" a class with a function. When we call a function with the `new` prefix, we effectively call it in the special "construct" mode. In this mode, a special object that the function will modify is passed in as the `this` argument. After the function evaluates, if its return value is a primitive, that special object is returned. Otherwise, the return value of the function becomes the evaluated value of the expression. In pseudocode, this is how a high level "construct" would look:

```js
// Everything that isn't an object or a function counts as a primitive
function is_primitive(val) {
    return val === null || (typeof val !== "object" && typeof val !== "function");
}

function construct(func, ...args) {
    const obj = {}; // the prototype of obj is covered in the next section
    const res = func.apply(obj, args);

    if (is_primitive(res)) return obj;
    else return res;
}
```

### The Function.prototype member

This member is a special member of each function. It contains the object that will become the prototype (`__proto__`) of the newly created `this` for the constructed function. In it we define the instance methods, getters and setters of the class.

### The post ES6 construct method

ES6 introduced classes and inheritance, which led to the necessity of reworking the function model a little bit. In essence, the following was changed:

- In a constructor of a derived class, the `this` "variable" remains uninitialized until the `super` constructor has been called. When it is called, its return value becomes the value of `this`.
- The function is given the function that is being instantiated (aka `new.target`), instead of the `this` object. This is because the function may be the super class of another, and it needs to make the prototype of `this` equal to the `prototype` field of the derived class, instead of its own.

Implementing this change in the runtime however is fairly simple.

## 5. Variables

Variables in JS are simple: each function has a single repository of variables that are accessible from the inception of the function. When you declare a variable, it is visible not from the point of declaration, but from the start of the function. This can lead to some confusion, but also makes the runtime implementation much easier.

### Capturing variables

Since in JS a function may be defined inside another, we need a mechanism via which the variables of the parent function can be made accessible to the child function. This happens via variable capturing. Basically, during compilation, if it's determined that a function tries to access a variable from its parent function, that variable is marked as "capturable", aka a variable that is kept as a pointer to a value, instead of just a value, so that other functions can share that pointer. The child function, in turn, keeps track of which variables it has captured and which parent variables they correspond to.

The runtime however, as far as it's concerned, only needs to keep track of the captured, capturable and regular variables, and provide a mechanism for constructing a function by supplying the raw function body (aka the instructions, name and other metadata) and a list of the captured variable instances.

### ES6 variables

Since the addition of the new `let` and `const` variables, the runtime model has been complicated a little bit - a function still has one repository of variables, but the runtime now needs to keep track of whether or not the variable has been initialized yet. This is because we can have code like this:

```js
const a = () => console.log(b);
a(); // throws a ReferenceError, b is not initialized yet
const b = 10;
```

For code like this, static analysis can't prove in general whether the variable is initialized by the time it is read. Of course, the compiler can feel free to omit these checks whenever it determines that the variable is definitely assigned, but the defined-ness of the variable must be guaranteed for it to be accessible.
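
The initialization check can be observed from JS itself. The sketch below wraps the same pattern in a made-up `run` function, so that the error can be caught and the variable can be read again after it has been initialized.

```js
function run() {
    // `b` is captured here, before it has been initialized
    const readB = () => b;

    let early;
    try {
        readB();
        early = "no error";
    } catch (err) {
        // The runtime's "is it initialized yet" check fires
        early = err instanceof ReferenceError ? "ReferenceError" : "other error";
    }

    const b = 10; // from this point on, `b` is initialized

    return { early, late: readB() };
}

const tdzResult = run();
console.log(tdzResult); // { early: 'ReferenceError', late: 10 }
```

Note that both reads go through the exact same function and the exact same variable slot, which is why the check has to happen at runtime.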
Another consequence of the new variable model is that we can have code like this:

```js
const funcs = [];

for (let i = 0; i < 100; i++) {
    funcs[i] = () => i;
}

console.log(funcs[69]()); // prints 69
```

In this case, for each iteration of the loop a new capturable instance needs to be created. This requires the runtime to have a mechanism for dynamically creating (and destroying) capturables.

### The global scope

When the compiler tries to resolve a variable, but doesn't find it anywhere along the scope chain, the variable is converted to a global variable access. In essence, this is just a fancier way to get a field from an implicit "globals" object. However, unlike the traditional `my_obj.test`, where `undefined` is returned if `test` is not a member of `my_obj`, here an error is thrown, reporting that no such variable exists. The same mechanic follows for trying to assign to a nonexistent member of the global scope (in strict mode, that is; in sloppy mode the assignment silently creates the member instead).

The usual way of defining a new member of the global scope is by using `var name = value` at the top level of a script. This statement gets converted more or less 1 to 1 to `globalThis.name = value` (`globalThis`, for the confused, is just a reference to the global object. Depending on the environment, alternatives are `self`, `window` and just `global`).
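
The difference between a member lookup and a global variable access can be sketched like this. The names `surely_not_defined_123` and `surely_not_defined_456` are made up, and are assumed not to exist in your environment.

```js
// A regular member lookup on the global object: a missing member is just undefined.
const missingMember = globalThis.surely_not_defined_123;

// The same name as a global variable access: the runtime throws instead.
let missingGlobalRead;
try {
    surely_not_defined_123;
    missingGlobalRead = "no error";
} catch (err) {
    missingGlobalRead = err.name;
}

// Assigning to a nonexistent global, in strict mode, throws as well.
function assignMissingGlobal() {
    "use strict";
    try {
        surely_not_defined_456 = 1;
        return "no error";
    } catch (err) {
        return err.name;
    }
}
const missingGlobalWrite = assignMissingGlobal();

console.log(missingMember, missingGlobalRead, missingGlobalWrite);
// undefined ReferenceError ReferenceError
```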