Core JavaScript Semantics

Posted on January 17, 2015

This post is a summary of the core JavaScript semantics, based on the papers The Essence of JavaScript and A Tested Semantics for Getters, Setters, and Eval in JavaScript from the semantics of JavaScript project at Brown University. Note that throughout this post “reference” has a very precise meaning as a pointer to a memory location; think ML-style references (or, if you must, C++-style references).

Objects

The { "propName": expression, ... } syntax allocates a new object in memory and returns a reference to that object, which is dereferenced whenever you attempt to access (get or set) a property of the object:

var a = { "someProp": 1, "someOtherProp": 2 };
var b = a;
b.someProp = 3;
console.log(a.someProp); // prints 3

After the assignment, a is a reference to some object. When we assign a to b it is this reference, not the object itself, which gets copied, and hence when we dereference b to update its someProp this is modifying the same object that a references. The same goes for passing objects as function arguments:

function f(x) {
  x.someOtherProp = 4;
}
f(b);
console.log(a.someOtherProp); // prints 4

Accessing a property which isn’t present in the object yields undefined, and setting a non-existing property will add it:

console.log(a.thirdProp); // prints undefined
b.thirdProp = 5; 
console.log(a.thirdProp); // prints 5

Properties can also be deleted again:

delete b.thirdProp;
console.log(a.thirdProp); // prints undefined

Finally, property names can be computed:

console.log(a["some" + "Prop"]); // prints 3

Prototypes

Objects can have a __proto__ property containing a reference to another object, which is dereferenced when getting a non-existent property.

var c = { "someProp": 1 };
var d = { "__proto__": c, "someOtherProp": 2 };
console.log(d.someProp, d.someOtherProp); // prints 1 2

Since __proto__ is a reference, modifying c will affect d:

c.someProp = 3;
console.log(d.someProp); // prints 3

However, setting a non-existent property does not affect the linked object but adds the property directly (though see the sections on accessors and metadata):

d.someProp = 4;
console.log(c.someProp, d.someProp); // prints 3 4

Once this property has been added, updating c.someProp no longer affects d:

c.someProp = 5;
console.log(c.someProp, d.someProp); // prints 5 4

until the property is deleted again:

delete d.someProp;
console.log(c.someProp, d.someProp); // prints 5 5

After someProp is deleted from d, getting d.someProp once again returns c.someProp.
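The shadowing described above can be observed directly with the standard ES5 helpers Object.getPrototypeOf and hasOwnProperty. The following is a minimal sketch; the names base and derived are illustrative and not part of the examples above, and Object.create is used instead of a __proto__ entry in the object literal:

```javascript
// `base` and `derived` are illustrative names. Object.create(base) builds an
// object whose prototype link points at `base`.
var base = { someProp: 1 };
var derived = Object.create(base);

var linked = Object.getPrototypeOf(derived) === base;  // true
var ownBefore = derived.hasOwnProperty("someProp");    // false: found via the chain

derived.someProp = 4;                                  // adds an own property
var ownAfter = derived.hasOwnProperty("someProp");     // true: now shadows base.someProp

delete derived.someProp;                               // removes only the own property
var afterDelete = derived.someProp;                    // read from base again

console.log(linked, ownBefore, ownAfter, afterDelete); // prints true false true 1
```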

this

When you call a function on an object it is passed the (non-dereferenced) object as an implicit argument called this:

c.setSomeProp = function(someProp) { 
  this.someProp = someProp;
}
d.setSomeProp(6);
console.log(c.someProp, d.someProp); // prints 5 6

When we call d.setSomeProp we attempt to look up setSomeProp in the object that d references; since we don’t find it, we look it up in the object referenced by d.__proto__ and find the function we previously assigned to c.setSomeProp. When we call this function, we pass d (the reference) as the implicit this argument.

When we call a function without specifying an object the “global” object is passed as the implicit argument:

var foo = c.setSomeProp;
foo(7);
console.log(global.someProp); // prints 7

(What exactly the global object is depends; in node.js it is global; in a browser it is window.)
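Since this is just an implicit argument, it can also be supplied explicitly. The sketch below uses the standard Function.prototype.call and Function.prototype.bind; the names setter, target and other are illustrative:

```javascript
// `setter`, `target` and `other` are illustrative names.
function setter(value) {
  this.someProp = value;
}

var target = {};
var other = {};

setter.call(target, 1);          // `this` is target for this one call
var bound = setter.bind(other);  // returns a function whose `this` is fixed to other
bound(2);

console.log(target.someProp, other.someProp); // prints 1 2
```

Note that in strict mode code a call without a receiver passes undefined as this rather than the global object, so the foo(7) example above would throw a TypeError there.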

new

Functions are objects themselves and therefore can have arbitrary properties:

function someFun() {
  return 1;
}
someFun.someProp = 2;
console.log(someFun(), someFun.someProp); // prints 1 2

When we call new someConstr() on some function someConstr, a new object is created whose only property is __proto__, which is initialized to be equal to someConstr.prototype. This new object is passed as the implicit this parameter:

var e = { "someProp": 3 };
function someConstr() {
  this.someOtherProp = this.__proto__;
}
someConstr.prototype = e;
var f = new someConstr();
console.log(f.someOtherProp.someProp); // prints 3

As before, when we assign e to someConstr.prototype we are copying a reference; when we call new, it is this reference that we use to initialize the __proto__ property of the new object, and hence when we assign that to this.someOtherProp we again are copying a reference. Hence, after this, changing e affects f:

e.someProp = 4;
console.log(f.someOtherProp.someProp); // prints 4

Apart from the fact that new someConstr() expects the someConstr object to have a property called prototype there is nothing special about such “constructor” functions. This is however the “official” way to set an object’s __proto__ property; not all JavaScript implementations allow you to set __proto__ directly (node.js, which is based on V8, the JavaScript engine in Chrome, does).
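To make the description of new concrete, here is a rough approximation written as an ordinary function. The helper myNew is hypothetical (it is not a built-in), and it deliberately ignores corner cases such as constructors that return an object themselves; Point is likewise just an illustrative constructor:

```javascript
// `myNew` is a hypothetical helper that approximates `new constr(args...)`.
function myNew(constr) {
  var args = Array.prototype.slice.call(arguments, 1);
  var obj = Object.create(constr.prototype); // new object with __proto__ = constr.prototype
  constr.apply(obj, args);                   // obj is passed as the implicit `this`
  return obj;
}

function Point(x, y) {
  this.x = x;
  this.y = y;
}
Point.prototype = {
  norm1: function () { return Math.abs(this.x) + Math.abs(this.y); }
};

var p = myNew(Point, 3, -4);
console.log(p.norm1(), Object.getPrototypeOf(p) === Point.prototype); // prints 7 true
```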

instanceof

The instanceof operator compares the __proto__ property of an object with the prototype property of a (function) object:

var g = {};
var h = { "__proto__": g };

function someFun() {}
someFun.prototype = g;

console.log(h instanceof someFun); // prints true

Typically of course this is used with the new keyword which, as we have seen, copies the prototype property of the function (recall that this is a reference) to the __proto__ property of the new object:

function Person(name) {
  this.name = name;
}
Person.prototype = { 
  "sayHi": function() {
    console.log("Hi, my name is " + this.name);
  }
}
var somePerson = new Person("John");
somePerson.sayHi(); // prints "Hi, my name is John"
console.log(somePerson instanceof Person); // prints true

but there is nothing going on here that we have not already seen, and indeed we can get unexpected results if we subsequently modify a function’s prototype property.
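A small sketch of such an unexpected result, using an illustrative constructor named Animal: replacing the prototype property after an object has been created leaves the old object's __proto__ pointing at the previous prototype, so instanceof no longer recognizes it.

```javascript
// `Animal`, `first` and `second` are illustrative names.
function Animal() {}
var first = new Animal();

var before = first instanceof Animal;  // true

Animal.prototype = {};                 // replace the prototype object
var second = new Animal();

var after = first instanceof Animal;   // false: first.__proto__ is still the old object
var fresh = second instanceof Animal;  // true

console.log(before, after, fresh);     // prints true false true
```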

Scope

Scope in JavaScript is an object like any other. “Global” variables are properties of the global object (see above for a note on global):

someProp      = 1;
someOtherProp = 2;
console.log(global.someProp, global.someOtherProp); // prints 1 2

There is circularity here of course: if someProp means “look up someProp in the global object”, then global must mean “look up global in the global object”; indeed, global contains a reference to itself:

console.log(global === global.global); // prints true

Inside a function we introduce a new scope object, but it “chains” to the outer scope.

function f() {
  var someProp;
  someProp      = 3;
  someOtherProp = 4;
  console.log(someProp, someOtherProp); // prints 3 4
}
f();
console.log(someProp, someOtherProp); // prints 1 4

This chaining of scope is similar to the chaining of prototypes, but not the same: a write to a variable that is not explicitly declared local goes to the enclosing scope rather than adding the variable to the local scope; by contrast, a write to a property of an object never follows the prototype chain.

All local variable declarations are hoisted to the top of the function:

function g() {
  someProp += 1;
  var someProp;
  someProp = 5;
  console.log(someProp); // prints 5
}
g();
console.log(someProp); // prints 1

Note that the increment of someProp in g had no effect on the global variable. It gets even weirder when we have an initial value:

function h() {
  someProp += 1;
  var someProp = 6;
  console.log(someProp); // prints 6
}
h();
console.log(someProp); // prints 1

Here the var someProp is hoisted to the top-level of the function definition, but the assignment of 6 to someProp is left where it is. This hoisting of var even happens across block scope:

function h2() {
  someProp += 1;
  if (true) {
    var someProp = 7;
  }
  console.log(someProp); // prints 7
}
h2();
console.log(someProp); // prints 1

(JavaScript 1.7 introduces a let keyword with more sane scoping rules, but this JavaScript version is not widely available yet; node.js supports it if you pass the --harmony flag.) This also means you cannot initialize a local variable to be equal to a variable with the same name in the enclosing scope:

function bar(x) {
  return function() {
    var x = x;
    return x;
  }
}
console.log(bar(200)()); // prints undefined

The assignment of x to x in the local function inside bar just assigns that variable to itself.
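Writing the hoisting out by hand makes this easier to see. In the sketch below, barHoisted is roughly how the inner function of bar is treated, and barFixed shows one possible workaround (using a different local name); both names are illustrative:

```javascript
// `barHoisted` and `barFixed` are illustrative names.
function barHoisted(x) {
  return function () {
    var x;      // hoisted declaration shadows the parameter; its value is undefined
    x = x;      // assigns the local variable to itself
    return x;
  };
}

function barFixed(x) {
  return function () {
    var y = x;  // a different name leaves the outer x visible
    return y;
  };
}

console.log(barHoisted(200)(), barFixed(200)()); // prints undefined 200
```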

A function closure captures its scope, which provides a way to encapsulate state (private attributes):

function Counter() {
  var x = 0;

  function increment() {
    x++;
  }

  function show() {
    console.log(x);
  }

  return { "increment": increment, "show": show }
}
var counter = Counter();
counter.increment();
counter.increment();
counter.show(); // prints 2

Be aware however that if you use this pattern, each “instance” of Counter will have explicit entries for each of the methods increment and show, rather than defining these methods just once in the prototype and having all instances refer back to that; hence there are some efficiency concerns.
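The difference can be checked by comparing the method references of two instances. This is a sketch with illustrative names (ClosureCounter, ProtoCounter); the closure version returns its count through a value method so that it can be inspected:

```javascript
// Closure-based version: state is private, but every instance gets its own functions.
function ClosureCounter() {
  var x = 0;
  return {
    increment: function () { x++; },
    value: function () { return x; }
  };
}

// Prototype-based version: methods are shared, but the state is a public property.
function ProtoCounter() {
  this.x = 0;
}
ProtoCounter.prototype.increment = function () { this.x++; };
ProtoCounter.prototype.value = function () { return this.x; };

var c1 = ClosureCounter(), c2 = ClosureCounter();
var p1 = new ProtoCounter(), p2 = new ProtoCounter();

console.log(c1.increment === c2.increment); // prints false: one function object per instance
console.log(p1.increment === p2.increment); // prints true: both resolve to the prototype's function
```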

The combination of closures with the weird behaviour of var can have some insidious consequences (example adapted from the Future of JavaScript talk):

var counters = [Counter(), Counter(), Counter()];

function addOnClick() {
  for (i = 0; i < counters.length; i++) {
    var x = counters[i];
    x.onclick = function() { x.increment(); }
  }
}

addOnClick();
for (i = 0; i < counters.length; i++) {
  counters[i].onclick();
  counters[i].show(); // prints 0 0 3
}
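The output is 0 0 3 because var x is hoisted to the top of addOnClick: all three handlers close over the same variable x, which after the loop refers to the last counter. A common ES5 workaround is to create a fresh scope per iteration with an immediately invoked function. The sketch below assumes a hypothetical makeCounter helper, which stands in for Counter but exposes its value so the result can be collected:

```javascript
// `makeCounter` is a hypothetical stand-in for Counter that exposes its value.
function makeCounter() {
  var n = 0;
  return {
    increment: function () { n++; },
    value: function () { return n; }
  };
}

var items = [makeCounter(), makeCounter(), makeCounter()];

function addOnClickFixed() {
  for (var i = 0; i < items.length; i++) {
    // The function call creates a new scope, so each handler gets its own `item`.
    (function (item) {
      item.onclick = function () { item.increment(); };
    })(items[i]);
  }
}

addOnClickFixed();
var values = [];
for (var j = 0; j < items.length; j++) {
  items[j].onclick();
  values.push(items[j].value());
}
console.log(values.join(" ")); // prints 1 1 1
```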

Finally, you can introduce arbitrary other objects into the scope chain using with:

var x = { "someProp": 3 };

function f(x) {
  var someOtherProp = 4;

  with(x) {
    console.log(someProp, someOtherProp); // prints 3 4
    someProp      = 5;
    someOtherProp = 6;
  }

  console.log(someProp, someOtherProp); // prints 1 6
  someProp      = 7;
  someOtherProp = 8;
}

someProp      = 1;
someOtherProp = 2;
f(x);
console.log(someProp, someOtherProp); // prints 7 2
console.log(x.someProp); // prints 5

Accessors

Starting with JavaScript version ES5—the language currently implemented by V8, and hence Chrome and node.js—there are actually two kinds of properties: data properties defined by their value, of the form that we have already seen, and accessor properties defined by a getter and a setter:

var x = { 
  set someProp(v) { 
    this._someProp = v * 2;
  },
  get someProp() {
    return this._someProp + 1;
  }
}

x.someProp = 5;
console.log(x._someProp, x.someProp); // prints 10 11

When we assign to x.someProp, we actually call the setter function. The setter function is passed the implicit this argument as usual; in this example, it sets this._someProp, a data property (we could instead use a closure to hide this underlying property completely). When we get x.someProp, we actually call the getter.

Setters and getters are looked up through the prototype chain, as usual:

var y = { "__proto__": x };

y.someProp = 6;
console.log(y._someProp, y.someProp); // prints 12 13
console.log(x._someProp, x.someProp); // prints 10 11

Note what is going on here: when we assign to y.someProp, the setter is looked up through the prototype chain, but when this setter writes to the data property _someProp, as usual this is set in y, not in the prototype x. This is entirely consistent: we traverse the prototype chain when we get properties (including when we look for functions to execute), but we ignore the prototype chain when we set data properties (this is not entirely true; see next section on metadata).
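As remarked earlier, a closure can hide the underlying storage completely. One way to sketch this is with the standard ES5 function Object.defineProperty; the helper name makeHidden is illustrative:

```javascript
// `makeHidden` is an illustrative name.
function makeHidden() {
  var stored;                 // lives in the closure, not on the object
  var obj = {};
  Object.defineProperty(obj, "someProp", {
    get: function () { return stored + 1; },
    set: function (v) { stored = v * 2; },
    enumerable: true
  });
  return obj;
}

var hidden = makeHidden();
hidden.someProp = 5;
console.log(hidden.someProp, Object.keys(hidden).join(",")); // prints 11 someProp
```

The trade-off is that the state now belongs to the closure rather than to this, so objects that inherit from hidden share it instead of each getting their own _someProp.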

Metadata

Also starting from ES5, objects and properties have associated metadata. The details are beyond the scope of this blog post; I merely mention them here:

Many of these properties can only change in certain ways: for example, once config is set to false, it cannot subsequently be set to true anymore. See States and transitions of the attributes of an EcmaScript 5 property for details.
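In the standard ES5 API these attributes surface as property descriptors; the attribute called config here presumably corresponds to the configurable field of a descriptor. The sketch below (proto and child are illustrative names) shows both a one-way transition and the case, alluded to in the previous section, where metadata on a prototype affects a write to an object further down the chain:

```javascript
// `proto` and `child` are illustrative names.
var proto = {};
Object.defineProperty(proto, "fixed", {
  value: 1,
  writable: false,
  enumerable: false,
  configurable: false
});

var child = Object.create(proto);

// A read-only property found on the prototype chain blocks the write: the
// assignment fails silently in sloppy mode and throws a TypeError in strict mode.
try { child.fixed = 2; } catch (e) { /* strict mode */ }

var desc = Object.getOwnPropertyDescriptor(proto, "fixed");
console.log(child.fixed, child.hasOwnProperty("fixed"), desc.configurable); // prints 1 false false

// Once configurable is false it cannot be turned back on.
var threw = false;
try {
  Object.defineProperty(proto, "fixed", { configurable: true });
} catch (e) {
  threw = e instanceof TypeError;
}
console.log(threw); // prints true
```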

eval

The eval primitive evaluates a string as if it were a JavaScript program. However, the precise behaviour of eval, in particular with respect to scope, depends on two things: whether eval is called directly or indirectly, and whether the snippet runs in strict mode. If we call eval directly then the snippet gets executed in the current scope; if we call eval indirectly it gets executed in the global scope:

var indirectEval = eval;

x = 10;

function f() {
  var x = 20; 
  eval("x = 30");
  console.log(x, global.x); // prints 30 10 

  indirectEval("x = 40");
  console.log(x, global.x);  // prints 30 40 
}
f();

The behaviour of eval is also affected by whether we execute the snippet in “strict mode” or not. The most (only?) reliable way to use strict mode is to specify this in the argument to eval itself (see section 5.3 of the Tested Semantics for.. paper). Without strict mode, any local variables introduced by the snippet are added to its scope (be it the current scope for direct calls or the global scope for indirect calls):

function g() {
  eval("var x; x = 60");
  console.log(x, global.x); // prints 60 40

  x = 70;
  console.log(x, global.x); // prints 70 40
}
g();

Note that the direct assignment to x in g affects a function local variable, not the global variable; execution of the snippet affected the scope of the function. In strict mode however any local variables introduced by the snippet are local to the snippet:

function h() {
  eval("'use strict'; var x; x = 80");
  console.log(x, global.x); // prints 40 40

  x = 90;
  console.log(x, global.x); // prints 90 90
}
h();
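The direct/indirect distinction can also be observed without writing to any variable, which keeps the example independent of strict mode. This sketch assumes there is no global variable called localOnly; the names indirect, probe and localOnly are illustrative:

```javascript
var indirect = eval;  // a call through any other name is an indirect call

function probe() {
  var localOnly = "visible";
  return [
    eval("typeof localOnly"),     // direct: evaluated in probe's scope
    indirect("typeof localOnly")  // indirect: evaluated in the global scope
  ];
}

var seen = probe();
console.log(seen.join(" ")); // prints string undefined
```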

Minor remarks

Evaluation order

Evaluation order is strictly left to right.
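A quick way to observe this is to log each subexpression as it is evaluated. In this sketch, note and order are illustrative names:

```javascript
// `note` records its label and then returns the value unchanged.
var order = [];
function note(label, value) {
  order.push(label);
  return value;
}

// Operands are evaluated left to right, regardless of operator precedence.
var total = note("a", 1) + note("b", 2) * note("c", 3);

// In an assignment, the property key on the left is evaluated before the right-hand side.
var obj = {};
obj[note("key", "k")] = note("value", 4);

console.log(total, order.join(" ")); // prints 7 a b c key value
```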

Arrays

Arrays are objects like any other, and hence we can remove an element from the middle of an array:

function sum(arr) {
  var r = 0;
  for (var i = 0; i < arr.length; i++) {
    r += arr[i];
  }
  return r;
}

console.log(sum([1,2,3])); // prints 6
var a = [1,2,3,4];
delete a["2"];
console.log(sum(a)); // prints NaN (1 + 2 + undefined + 4)
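Note that delete leaves a hole and does not change the array's length. If the intent is to remove the element and close the gap, the standard Array.prototype.splice does that. A short sketch with illustrative names arr and arr2:

```javascript
var arr = [1, 2, 3, 4];
delete arr[2];
console.log(arr.length, 2 in arr);        // prints 4 false: a hole, length unchanged

var arr2 = [1, 2, 3, 4];
arr2.splice(2, 1);                        // remove 1 element at index 2 and shift the rest
console.log(arr2.length, arr2.join(",")); // prints 3 1,2,4
```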

for .. in

We haven’t talked much about iteration here (except to mention the enum attribute), but be aware that this always iterates over keys, not values, even for arrays:

for (i in [3, 4, 5]) {
  console.log(i); // prints 0, 1, 2
}

The next version of JavaScript will have a for .. of loop that is more flexible.
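For comparison, here is a sketch of both loops side by side. It assumes an engine that already implements for .. of (ES2015); the names keys and values are illustrative:

```javascript
var keys = [];
for (var k in [3, 4, 5]) {
  keys.push(typeof k + ":" + k);  // the keys are strings, not numbers
}

var values = [];
for (var v of [3, 4, 5]) {        // requires an engine with ES2015 for .. of
  values.push(v);
}

console.log(keys.join(" "));   // prints string:0 string:1 string:2
console.log(values.join(" ")); // prints 3 4 5
```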
