Core JavaScript Semantics
Posted on January 17, 2015
This post is a summary of the core JavaScript semantics, based on the papers The Essence of JavaScript and A Tested Semantics for Getters, Setters, and Eval in JavaScript from the semantics of JavaScript project at Brown University. Note that throughout this post “reference” has a very precise meaning as a pointer to a memory location; think ML-style references (or, if you must, C++-style references).
Objects
The { "propName": expression, ... } syntax allocates a new object in memory and returns a reference to that object, which is dereferenced whenever you attempt to access (get or set) a property of the object:
var a = { "someProp": 1, "someOtherProp": 2 };
var b = a;
b.someProp = 3;
console.log(a.someProp); // prints 3
After the first assignment, a is a reference to some object. When we assign a to b, it is this reference, not the object itself, that gets copied; hence when we dereference b to update its someProp, we are modifying the same object that a references. The same goes for passing objects as function arguments:
function f(x) {
  x.someOtherProp = 4;
}
f(b);
console.log(a.someOtherProp); // prints 4
Accessing a property which isn’t present in the object yields undefined, and setting a non-existing property will add it:
console.log(a.thirdProp); // prints undefined
b.thirdProp = 5;
console.log(a.thirdProp); // prints 5
Fields can however also be deleted again:
delete b.thirdProp;
console.log(a.thirdProp); // prints undefined
Finally, property names can be computed:
console.log(a["some" + "Prop"]); // prints 3
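To make the difference between copying a reference and copying an object concrete, here is a small sketch. The helper shallowCopy is hypothetical (it is not part of the language or of the papers); it allocates a new object and copies the own properties across one by one, which also shows that nested objects are still shared by reference.

```javascript
// Hypothetical helper: builds a new object with the same own properties.
function shallowCopy(obj) {
  var copy = {};
  for (var key in obj) {
    if (Object.prototype.hasOwnProperty.call(obj, key)) {
      copy[key] = obj[key]; // copies values; object-valued properties copy the reference
    }
  }
  return copy;
}

var original = { "someProp": 1, "nested": { "inner": 1 } };
var alias = original;             // copies the reference
var copy = shallowCopy(original); // allocates a new object

alias.someProp = 2;    // visible through original
copy.someProp = 3;     // not visible through original
copy.nested.inner = 4; // nested is shared by original and copy

console.log(original.someProp, copy.someProp);      // prints 2 3
console.log(original.nested.inner);                 // prints 4
console.log(alias === original, copy === original); // prints true false
```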
Prototypes
Objects can have a __proto__ property containing a reference to another object, which is dereferenced when getting a non-existent property.
var c = { "someProp": 1 };
var d = { "__proto__": c, "someOtherProp": 2 };
console.log(d.someProp, d.someOtherProp); // prints 1 2
Since __proto__ is a reference, modifying c will affect d:
c.someProp = 3;
console.log(d.someProp); // prints 3
However, setting a non-existent property does not affect the linked object but adds the property directly (though see the sections on accessors and metadata):
d.someProp = 4;
console.log(c.someProp, d.someProp); // prints 3 4
Once this property has been added, updating c.someProp no longer affects d:
c.someProp = 5;
console.log(c.someProp, d.someProp); // prints 5 4
until the property is deleted again:
delete d.someProp;
console.log(c.someProp, d.someProp); // prints 5 5
After someProp is deleted from d, getting d.someProp once again returns c.someProp.
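The asymmetry between getting and setting can be sketched with a hand-written lookup. The function lookup below is a hypothetical helper that mimics a property get for plain data properties (it ignores accessors); it uses the standard ES5 functions Object.create and Object.getPrototypeOf rather than __proto__, which is an assumption of this sketch and not something the examples above rely on.

```javascript
// Hypothetical helper: walks the prototype chain the way a property get does.
function lookup(obj, name) {
  while (obj !== null && obj !== undefined) {
    if (Object.prototype.hasOwnProperty.call(obj, name)) {
      return obj[name]; // found as an own property at this level
    }
    obj = Object.getPrototypeOf(obj); // follow the prototype reference
  }
  return undefined; // fell off the end of the chain
}

var base = { "someProp": 1 };
var derived = Object.create(base); // derived's prototype is base

var inherited = lookup(derived, "someProp"); // found in base
derived.someProp = 4;                        // a set adds an own property to derived
var shadowed = lookup(derived, "someProp");  // found in derived itself
delete derived.someProp;
var restored = lookup(derived, "someProp");  // found in base again

console.log(inherited, shadowed, restored, base.someProp); // prints 1 4 1 1
console.log(lookup(derived, "missing"));                   // prints undefined
```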
this
When you call a function on an object it is passed the (non-dereferenced) object as an implicit argument called this:
c.setSomeProp = function(someProp) {
  this.someProp = someProp;
}
d.setSomeProp(6);
console.log(c.someProp, d.someProp); // prints 5 6
When we call d.setSomeProp we attempt to look up setSomeProp in the object that d references; since we don’t find it, we look it up in the object referenced by d.__proto__ and find the function we previously assigned to c.setSomeProp. When we call this function, we pass d (the reference) as the implicit this argument.
When we call a function without specifying an object the “global” object is passed as the implicit argument:
var foo = c.setSomeProp;
foo(7);
console.log(global.someProp); // prints 7
(What exactly the global object is depends on the environment; in node.js it is global; in a browser it is window.)
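Since this is just an implicit argument, it can also be supplied explicitly. The sketch below uses the standard Function.prototype.call and Function.prototype.bind; the object and function names are made up for illustration.

```javascript
// A free-standing function that relies on its implicit this argument.
function setSomeProp(value) {
  this.someProp = value;
}

var target = { "someProp": 0 };
var other = { "someProp": 0 };

// call passes the first argument as this for a single invocation.
setSomeProp.call(target, 1);

// bind returns a new function whose this is fixed to the given object.
var boundToOther = setSomeProp.bind(other);
boundToOther(2);

console.log(target.someProp, other.someProp); // prints 1 2
```

Binding is a common way to avoid accidentally passing the global object when a method is detached from its object, as happens with foo above.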
new
Functions are objects themselves and therefore can have arbitrary properties:
function someFun() {
  return 1;
}
someFun.someProp = 2;
console.log(someFun(), someFun.someProp); // prints 1 2
When we call new someConstr() on some function someConstr, a new object is created whose only property is __proto__, which is initialized to be equal to someConstr.prototype. This new object is passed as the implicit this parameter:
var e = { "someProp": 3 };
function someConstr() {
  this.someOtherProp = this.__proto__;
}
someConstr.prototype = e;
var f = new someConstr();
console.log(f.someOtherProp.someProp); // prints 3
As before, when we assign e to someConstr.prototype we are copying a reference; when we call new, it is this reference that we use to initialize the __proto__ property of the new object, and hence when we assign that to this.someOtherProp we are again copying a reference. Hence, after this, changing e affects f:
e.someProp = 4;
console.log(f.someOtherProp.someProp); // prints 4
Apart from the fact that new someConstr() expects the someConstr object to have a property called prototype, there is nothing special about such “constructor” functions. This is however the “official” way to set an object’s __proto__ property; not all JavaScript implementations allow you to set __proto__ directly (node.js, which is based on V8, the JavaScript engine in Chrome, does).
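To underline that there is little magic in new, here is a rough sketch of what it does, written as an ordinary function. The name construct is hypothetical, and the sketch is a simplification: it assumes constr.prototype is an object, and it uses the standard ES5 Object.create to allocate an object with a given prototype.

```javascript
// Hypothetical helper approximating `new constr(...args)`.
function construct(constr, args) {
  // 1. Allocate an object whose prototype is constr.prototype.
  var obj = Object.create(constr.prototype);
  // 2. Call the function with the new object as the implicit this.
  var result = constr.apply(obj, args);
  // 3. If the function returned an object of its own, that wins; otherwise use obj.
  var returnedObject =
    (typeof result === "object" && result !== null) || typeof result === "function";
  return returnedObject ? result : obj;
}

function Point(x, y) {
  this.x = x;
  this.y = y;
}
Point.prototype.norm1 = function () {
  return Math.abs(this.x) + Math.abs(this.y);
};

var p = construct(Point, [3, -4]);
console.log(p.norm1());                                    // prints 7
console.log(Object.getPrototypeOf(p) === Point.prototype); // prints true
console.log(p instanceof Point);                           // prints true
```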
instanceof
The instanceof operator compares the __proto__ property of an object (following the prototype chain if necessary) with the prototype property of a (function) object:
var g = {};
var h = { "__proto__": g }
function someFun() {}
someFun.prototype = g;
console.log(h instanceof someFun); // prints true
Typically of course this is used with the new keyword which, as we have seen, copies the prototype property of the function (recall that this is a reference) to the __proto__ property of the new object:
function Person(name) {
  this.name = name;
}
Person.prototype = {
  "sayHi": function() {
    console.log("Hi, my name is " + this.name);
  }
}
var somePerson = new Person("John");
somePerson.sayHi(); // prints "Hi, my name is John"
console.log(somePerson instanceof Person); // prints true
but there is nothing going on here that we have not already seen, and indeed we can get unexpected results if we subsequently modify a function’s prototype property.
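The kind of unexpected result alluded to here can be sketched as follows. The function isInstance is a hypothetical re-implementation of instanceof as a walk over the prototype chain (using the standard Object.getPrototypeOf); the constructor name Animal is made up for the example.

```javascript
// Hypothetical helper: what instanceof does, written out by hand.
function isInstance(obj, constr) {
  var proto = Object.getPrototypeOf(obj);
  while (proto !== null) {
    if (proto === constr.prototype) {
      return true; // same reference as the function's current prototype
    }
    proto = Object.getPrototypeOf(proto);
  }
  return false;
}

function Animal(name) {
  this.name = name;
}

var rex = new Animal("Rex");
var before = [rex instanceof Animal, isInstance(rex, Animal)];

// Replace the prototype property: rex still points at the old object.
Animal.prototype = {};
var after = [rex instanceof Animal, isInstance(rex, Animal)];

// Objects created from now on use the new prototype.
var tom = new Animal("Tom");

console.log(before.join(" "));      // prints true true
console.log(after.join(" "));       // prints false false
console.log(tom instanceof Animal); // prints true
```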
Scope
Scope in JavaScript is an object like any other. “Global” variables are properties of the global object (see above for a note on global):
someProp = 1;
someOtherProp = 2;
console.log(global.someProp, global.someOtherProp); // prints 1 2
There is circularity here of course: if someProp means “lookup someProp in the global object”, then global must mean “lookup global in the global object”; indeed, global contains a reference to itself:
console.log(global === global.global); // prints true
Inside a function we introduce a new scope object, but it “chains” to the outer scope.
function f() {
  var someProp;
  someProp = 3;
  someOtherProp = 4;
  console.log(someProp, someOtherProp); // prints 3 4
}
f();
console.log(someProp, someOtherProp); // prints 1 4
This chaining of scope is similar to the chaining of prototypes, but not the same: a write to a variable that is not explicitly declared to be local writes to the enclosing scope rather than adding the variable to the local scope; by contrast, a write to a property of an object never follows the prototype chain.
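The contrast between the two kinds of chaining can be put side by side in a small sketch. All names here are made up for illustration, and Object.create (standard ES5) is used to set up the prototype link.

```javascript
// Scope chain: the write inside inner goes to the enclosing scope.
function outer() {
  var count = 0;
  function inner() {
    count = count + 1; // count is not declared in inner: updates outer's variable
  }
  inner();
  inner();
  return count;
}

// Prototype chain: the read follows the chain, the write does not.
var proto = { "count": 0 };
var child = Object.create(proto);
child.count = child.count + 1; // reads proto.count, then adds an own property to child

var scopeResult = outer();
console.log(scopeResult);              // prints 2
console.log(proto.count, child.count); // prints 0 1
```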
All local variable declarations are hoisted to the top of the function:
function g() {
  someProp += 1;
  var someProp;
  someProp = 5;
  console.log(someProp); // prints 5
}
g();
console.log(someProp); // prints 1
Note that the increment of someProp in g had no effect on the global variable. It gets even weirder when we have an initial value:
function h() {
  someProp += 1;
  var someProp = 6;
  console.log(someProp); // prints 6
}
h();
console.log(someProp); // prints 1
Here the var someProp is hoisted to the top level of the function definition, but the assignment of 6 to someProp is left where it is. This hoisting of var even happens across block scope:
function h2() {
  someProp += 1;
  if (true) {
    var someProp = 7;
  }
  console.log(someProp); // prints 7
}
h2();
console.log(someProp); // prints 1
(JavaScript 1.7 introduces a let keyword with more sane scoping rules, but this JavaScript version is not widely available yet; node.js supports it if you pass the --harmony flag.) This also means you cannot initialize a local variable to be equal to a variable with the same name in the enclosing scope:
function bar(x) {
  return function() {
    var x = x;
    return x;
  }
}
console.log(bar(200)()); // prints undefined
The assignment of x to x in the local function inside bar just assigns that variable to itself.
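It may help to write out how the function is effectively interpreted once the declaration has been hoisted, next to one possible workaround. Both function names below are hypothetical; the workaround simply picks a different local name so that the read refers to the enclosing scope.

```javascript
// How bar is effectively interpreted after hoisting.
function barAsInterpreted(x) {
  return function () {
    var x; // hoisted declaration: shadows the outer x, initially undefined
    x = x; // assigns the (undefined) local variable to itself
    return x;
  };
}

// One possible fix: use a distinct local name.
function barFixed(x) {
  return function () {
    var y = x; // x now refers to the parameter of barFixed
    return y;
  };
}

var hoistedResult = barAsInterpreted(200)();
var fixedResult = barFixed(200)();
console.log(hoistedResult); // prints undefined
console.log(fixedResult);   // prints 200
```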
A function closure captures its scope, which provides a way to encapsulate state (private attributes):
function Counter() {
  var x = 0;
  function increment() {
    x++;
  }
  function show() {
    console.log(x);
  }
  return { "increment": increment, "show": show }
}
var counter = Counter();
counter.increment();
counter.increment();
counter.show(); // prints 2
Be aware however that if you use this pattern, each “instance” of Counter will have explicit entries for each of the methods increment and show, rather than defining these methods just once in the prototype and having all instances refer back to that; hence there are some efficiency concerns.
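For comparison, a prototype-based variant might look like the sketch below. The name ProtoCounter is made up for this example. Note the trade-off: the methods are now shared, but the state x is an ordinary property and no longer private.

```javascript
// Each instance only stores its own state ...
function ProtoCounter() {
  this.x = 0;
}
// ... while the methods live once, in the shared prototype.
ProtoCounter.prototype.increment = function () {
  this.x++;
};
ProtoCounter.prototype.show = function () {
  console.log(this.x);
};

var c1 = new ProtoCounter();
var c2 = new ProtoCounter();
c1.increment();
c1.increment();
c1.show(); // prints 2
c2.show(); // prints 0

console.log(c1.increment === c2.increment);  // prints true (one shared function)
console.log(c1.hasOwnProperty("increment")); // prints false (found via the prototype)
console.log(c1.x);                           // prints 2 (state is publicly visible)
```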
The combination of closures with the weird behaviour of var can have some insidious consequences (example adapted from the Future of JavaScript talk):
var counters = [Counter(), Counter(), Counter()];
function addOnClick() {
  for (i = 0; i < counters.length; i++) {
    var x = counters[i];
    x.onclick = function() { x.increment(); }
  }
}
addOnClick();
for (i = 0; i < counters.length; i++) {
  counters[i].onclick();
  counters[i].show(); // prints 0 0 3
}
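The output 0 0 3 arises because var x is hoisted to the top of addOnClick, so all three closures share a single x, which ends up referring to the last counter. One common workaround, sketched below, is to create the closure inside a separate function so that every handler gets its own scope. The helper attach and the value method are hypothetical additions for this sketch, and Counter is redefined here only to keep the example self-contained.

```javascript
function Counter() {
  var x = 0;
  return {
    "increment": function () { x++; },
    "value": function () { return x; } // hypothetical accessor, used instead of show
  };
}

// Each call to attach introduces a fresh scope holding its own counter.
function attach(counter) {
  counter.onclick = function () { counter.increment(); };
}

var counters = [Counter(), Counter(), Counter()];
for (var i = 0; i < counters.length; i++) {
  attach(counters[i]);
}
for (var j = 0; j < counters.length; j++) {
  counters[j].onclick();
}

var values = counters.map(function (c) { return c.value(); });
console.log(values.join(" ")); // prints 1 1 1
```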
Finally, you can introduce arbitrary other objects into the scope chain using with:
var x = { "someProp": 3 };
function f(x) {
  var someOtherProp = 4;
  with(x) {
    console.log(someProp, someOtherProp); // prints 3 4
    someProp = 5;
    someOtherProp = 6;
  }
  console.log(someProp, someOtherProp); // prints 1 6
  someProp = 7;
  someOtherProp = 8;
}
someProp = 1;
someOtherProp = 2;
f(x);
console.log(someProp, someOtherProp); // prints 7 2
console.log(x.someProp); // prints 5
Accessors
Starting with JavaScript version ES5 (the language currently implemented by V8, and hence Chrome and node.js) there are actually two kinds of properties: data properties defined by their value, of the form that we have already seen, and accessor properties defined by a getter and a setter:
var x = {
  set someProp(v) {
    this._someProp = v * 2;
  },
  get someProp() {
    return this._someProp + 1;
  }
}
x.someProp = 5;
console.log(x._someProp, x.someProp); // prints 10 11
When we assign to x.someProp, we actually call the setter function. The setter function is passed the implicit this argument as usual; in this example, it sets this._someProp, a value property (we could instead use a closure to hide this underlying property completely). When we get x.someProp, we actually call the getter.
Setters and getters are looked up through the prototype chain, as usual:
var y = { "__proto__": x };
y.someProp = 6;
console.log(y._someProp, y.someProp); // prints 12 13
console.log(x._someProp, x.someProp); // prints 10 11
Note what is going on here: when we assign to y.someProp, the setter is looked up through the prototype chain, but when this setter writes to the value property _someProp, as usual this is set in y, not in the prototype x. This is entirely consistent: we traverse the prototype chain when we get properties (including when we look for functions to execute), but we ignore the prototype chain when we set value properties (this is not entirely true; see the next section on metadata).
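The closure-based alternative mentioned in passing above might be sketched as follows, using the standard ES5 Object.defineProperty. The function makeHidden is a made-up name. One consequence worth noticing is that the hidden state now lives in the closure rather than in this, so an object inheriting the accessor shares that state with its prototype, unlike y and x above.

```javascript
// Hypothetical factory: the backing value lives in a closure, not in a property.
function makeHidden() {
  var hidden;
  var obj = {};
  Object.defineProperty(obj, "someProp", {
    get: function () { return hidden + 1; },
    set: function (v) { hidden = v * 2; },
    enumerable: true
  });
  return obj;
}

var z = makeHidden();
z.someProp = 5;
var firstRead = z.someProp;
console.log(firstRead);         // prints 11
console.log("_someProp" in z);  // prints false (no backing property to poke at)

// An inheriting object finds the same setter, which writes to the same closure.
var w = Object.create(z);
w.someProp = 6;
console.log(z.someProp, w.someProp); // prints 13 13
```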
Metadata
Also starting from ES5, objects and properties have associated metadata. The details are beyond the scope of this blog post; I merely mention them here:
- Objects can be marked as extensible or not, which must be true in order to be allowed to add new properties to the object.
- Data properties can be marked as writable or not, which must be true in order to be allowed to write to the property. Note that if x is the prototype of y, setting property someProp on y will add that field to y, as we have seen, but this write is disallowed if (the prototype) x also has that property and it is not writable.
- Both data properties and accessor properties can be marked as enum and config or not; enum determines if this field will show up in for .. in enumeration, while config determines if any of the metadata on this property can be changed.
Many of these properties can only change in certain ways: for example, once config is set to false, it cannot subsequently be set to true anymore. See States and transitions of the attributes of an EcmaScript 5 property for details.
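A brief sketch of how this metadata can be set and observed, using the standard ES5 functions Object.defineProperty, Object.preventExtensions and Object.getOwnPropertyDescriptor. In the actual API the attributes called enum and config above are spelled enumerable and configurable. The helper tryWrite is hypothetical; it swallows the TypeError that a disallowed write raises in strict mode (outside strict mode the write is silently ignored), so the sketch behaves the same either way.

```javascript
// Hypothetical helper: attempts a write and reports whether an own property resulted.
function tryWrite(obj, name, value) {
  try {
    obj[name] = value;
  } catch (e) {
    // strict mode: disallowed writes throw a TypeError
  }
  return Object.prototype.hasOwnProperty.call(obj, name);
}

var proto = {};
Object.defineProperty(proto, "someProp", {
  value: 1, writable: false, enumerable: true, configurable: false
});
Object.defineProperty(proto, "hiddenProp", { value: 2, enumerable: false });

var child = Object.create(proto);
var wroteShadow = tryWrite(child, "someProp", 2); // blocked by the non-writable prototype property

var nonExtensible = Object.preventExtensions({ "existing": 1 });
var wroteNew = tryWrite(nonExtensible, "added", 2); // blocked: object is not extensible
nonExtensible.existing = 3;                         // existing properties stay writable

var seen = [];
for (var key in child) {
  seen.push(key); // only enumerable properties show up
}

console.log(wroteShadow, child.someProp);    // prints false 1
console.log(wroteNew, nonExtensible.existing); // prints false 3
console.log(seen.join(" "));                 // prints someProp
console.log(Object.getOwnPropertyDescriptor(proto, "someProp").writable); // prints false
```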
eval
The eval primitive evaluates a string as if it was a JavaScript program. However, the precise behaviour of eval, in particular with respect to scope, depends on two things. If we call eval directly then the snippet gets executed in the current scope; if we call eval indirectly it gets executed in the global scope:
var indirectEval = eval;
x = 10;

function f() {
  var x = 20;
  eval("x = 30");
  console.log(x, global.x); // prints 30 10
  indirectEval("x = 40");
  console.log(x, global.x); // prints 30 40
}
f();
The behaviour of eval is also affected by whether we execute the snippet in “strict mode” or not. The most (only?) reliable way to use strict mode is to specify this in the argument to eval itself (see section 5.3 of the Tested Semantics for.. paper). Without strict mode, any local variables introduced by the snippet are added to its scope (be it the current scope for direct calls or the global scope for indirect calls):
function g() {
  eval("var x; x = 60");
  console.log(x, global.x); // prints 60 40
  x = 70;
  console.log(x, global.x); // prints 70 40
}
g();
Note that the direct assignment to x in g affects a function local variable, not the global variable; execution of the snippet affected the scope of the function. In strict mode however any local variables introduced by the snippet are local to the snippet:
function h() {
  eval("'use strict'; var x; x = 80");
  console.log(x, global.x); // prints 40 40
  x = 90;
  console.log(x, global.x); // prints 90 90
}
h();
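The two distinctions (direct versus indirect, strict versus non-strict snippet) can also be observed without touching any global variables, as in the sketch below. The function and variable names are made up, and the sketch assumes that no global variables called evalDemoLocal or snippetLocal exist.

```javascript
var indirectEval = eval;

function evalDemo() {
  var evalDemoLocal = 20;

  // Direct call: the snippet sees the local scope of evalDemo.
  var direct = eval("evalDemoLocal + 1");

  // Indirect call: the snippet runs in the global scope, where the local is invisible.
  var indirect = indirectEval("typeof evalDemoLocal");

  // Strict snippet: its var declaration stays local to the snippet.
  eval("'use strict'; var snippetLocal = 80;");
  var afterStrict = typeof snippetLocal;

  return [direct, indirect, afterStrict];
}

var evalResults = evalDemo();
console.log(evalResults.join(" ")); // prints 21 undefined undefined
```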
Minor remarks
Evaluation order
Evaluation order is strictly left to right.
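A small sketch that makes the order observable: the helper note (a made-up name) records when each operand is evaluated. Operator precedence decides how the results are combined, but not the order in which the operands are evaluated.

```javascript
var log = [];

// Hypothetical helper: records the label, then yields the value.
function note(label, value) {
  log.push(label);
  return value;
}

// Multiplication binds tighter, yet the operands are still evaluated a, b, c.
var result = note("a", 1) + note("b", 2) * note("c", 3);

console.log(result);        // prints 7
console.log(log.join(" ")); // prints a b c
```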
Arrays
Arrays are objects like any other, and hence we can remove an element from the middle of an array, leaving a hole behind:
function sum(arr) {
  var r = 0;
  for (var i = 0; i < arr.length; i++) {
    r += arr[i];
  }
  return r;
}
console.log(sum([1,2,3])); // prints 6
var a = [1,2,3,4];
delete a["2"];
console.log(sum(a)); // prints NaN (1 + 2 + undefined + 4)
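The NaN arises because delete removes the property "2" but leaves the length untouched. If the intention is to actually shorten the array, the standard Array.prototype.splice is the usual tool; the sketch below contrasts the two (variable names are made up).

```javascript
var withHole = [1, 2, 3, 4];
delete withHole[2]; // removes the property, keeps the length
console.log(withHole.length, 2 in withHole); // prints 4 false

var spliced = [1, 2, 3, 4];
spliced.splice(2, 1); // removes one element at index 2 and shifts the rest down
console.log(spliced.length, spliced.join(",")); // prints 3 1,2,4
```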
for .. in
We haven’t talked much about iteration here (except to mention the enum attribute), but be aware that this always iterates over keys, not values, even for arrays:
for (i in [3, 4, 5]) {
console.log(i); // prints 0, 1, 2
}
The next version of JavaScript will have a for .. of loop that is more flexible.
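As a sketch of the difference, assuming an engine that already supports for .. of (ES6): the keys produced by for .. in are property names, and hence strings, whereas for .. of yields the array elements themselves.

```javascript
var arr = [3, 4, 5];
var keys = [];
var values = [];

for (var k in arr) {
  keys.push(typeof k + ":" + k); // keys are property names, i.e. strings
}
for (var v of arr) {
  values.push(v); // values are the array elements
}

console.log(keys.join(" "));   // prints string:0 string:1 string:2
console.log(values.join(" ")); // prints 3 4 5
```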
References
- Papers The Essence of JavaScript and A Tested Semantics for Getters, Setters, and Eval in JavaScript
- JavaScript standard ES5 and the annotated version.
- Presentation by Dave Herman on The Future of JavaScript
- We have not discussed the semantics of specific operations in this blog post at all, but it’s good to be aware that equality in JavaScript behaves weirdly. See the JavaScript Equality Table. Gary Bernhardt’s talk Wat on the quirks of Ruby and JavaScript is also a must-see (and very entertaining), as is wtfjs.com, a collection of weird irregularities in the JavaScript libraries.
- Douglas Crockford’s site javascript.crockford.com contains lots of useful articles.