Saturday, March 28, 2009

Namespaces, Subjects and Verbs in JavaScript

It has been interesting to follow Dion Almaer's comments about using Dojo in Bespin. This post mentioning call conventions highlighted something I have wanted to talk more about.

The basic issue is namespacing and how you like to call functions. Here are some choices:
  1. subject.verb()
  2. namespace.verb(subject)
  3. namespace(subject).verb(), which can lead to namespace(subject).verb().verb().verb()
#1 is the Prototype/Ruby way. You define a verb on the object's class, or in JavaScript, on its prototype.

In this model the namespace is really the subject's namespace. There is a possibility of collision with other code that wants to use the same verb. This can be mitigated by keeping your project small and focused, and managing your dependencies. This also works well when your code is a leaf node, or one step from the leaf node -- your code is not consumed by other code, except maybe a top level web application.

The benefit is a nice call structure, one that I think fits better with normal English: the subject is identified, and then a verb/action is performed with that subject.
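
A minimal sketch of #1, to make the call shape concrete. The verb name `shout` is made up for illustration and is not part of any toolkit mentioned here:

```javascript
// Style #1: define the verb on the subject's prototype.
// `shout` is a hypothetical verb used only for this example.
String.prototype.shout = function () {
  return this.toUpperCase() + "!";
};

// The subject comes first, then the verb.
var greeting = "hello".shout();
```

Any other script on the page that also defines `String.prototype.shout` would silently replace this one, which is the collision risk described above.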

Unfortunately, in JavaScript, there can be problems adding things to basic prototypes, like Array, String (and shudder, Object). In Dojo we have had to put in some protections in our code in case the page also uses code that modifies the built-in prototypes.

I believe the situation will improve in future JavaScript releases if added properties/verbs can be marked as not enumerable. That will help a bit, but namespace collision is still an issue.
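
A sketch of the enumeration problem, and of what a non-enumerable verb could look like. It assumes an engine that supports ECMAScript 5's `Object.defineProperty`; the verbs `first` and `last` and the helper `forInKeys` are made up for illustration:

```javascript
// Helper: collect every key a for-in loop sees on a value.
function forInKeys(value) {
  var keys = [];
  for (var key in value) {
    keys.push(key);
  }
  return keys;
}

// Plain assignment: the new verb leaks into every for-in loop over an array.
Array.prototype.first = function () { return this[0]; };
var leaky = forInKeys(["a", "b"]);   // includes "first" alongside the indexes

delete Array.prototype.first;

// Non-enumerable definition: for-in no longer sees the verb.
Object.defineProperty(Array.prototype, "last", {
  value: function () { return this[this.length - 1]; },
  enumerable: false,
  writable: true,
  configurable: true
});
var clean = forInKeys(["a", "b"]);   // does not include "last"
var lastItem = ["a", "b"].last();
```

Hiding `last` from for-in loops does nothing about two libraries both wanting to define `last`, so the collision problem remains.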

Some browsers now have native String.prototype.trim implementations and Dojo now delegates to them for Dojo's trim method. Recently, there was a bug filed for Dojo where the core issue was some other code adding a String.prototype.trim() method that did not strip out beginning whitespace.
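
A rough sketch of the delegation idea, and of how a broken third-party trim leaks through it. This is not Dojo's actual source; `myLib` is a hypothetical namespace and the regular-expression fallback is just one possible implementation:

```javascript
var myLib = {};

// Use the native String.prototype.trim when it exists, else a regexp fallback.
myLib.trim = String.prototype.trim
  ? function (str) { return str.trim(); }
  : function (str) { return str.replace(/^\s+/, "").replace(/\s+$/, ""); };

var before = myLib.trim("  padded  ");   // "padded"

// Some other script replaces the native method with one that only
// strips trailing whitespace.
var nativeTrim = String.prototype.trim;
String.prototype.trim = function () { return this.replace(/\s+$/, ""); };

var after = myLib.trim("  padded  ");    // "  padded": the bug shows up in myLib

// Restore the native method so the rest of the page is unaffected.
String.prototype.trim = nativeTrim;
```

The library's own function did nothing wrong, but because it trusts whatever sits on the prototype, the other script's bug is reported against the library.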

#2 takes the position that verb collision is bad and can cause hard-to-trace bugs, so always use your own namespace. It also feels more "functional" or procedural: use small functions that do not maintain interior state but operate on their arguments.

This has the benefit of being safer, but it can be more verbose than #1, and therefore more of a constant tax on the developer.
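
A minimal sketch of #2. The namespace `myLib` and its two verbs are hypothetical:

```javascript
// Style #2: verbs live in the library's own namespace and take the
// subject as an argument. Nothing is added to String.prototype.
var myLib = {
  trim: function (str) {
    return str.replace(/^\s+|\s+$/g, "");
  },
  capitalize: function (str) {
    return str.charAt(0).toUpperCase() + str.slice(1);
  }
};

// Safe from collisions, but nested calls read inside-out and
// repeat the namespace each time.
var title = myLib.capitalize(myLib.trim("  dojo  "));
```

With #1 the same operation would read left to right as a trim followed by a capitalize; here the namespace prefix on every call is the "tax".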

This is mitigated somewhat in Dojo, where you can assign Dojo to another namespace. So you could map "dojo" to "$" to cut down on the typing.

#3 is a nice compromise: Define a function that wraps the real subject in an object and provides verbs to act on that subject, without directly modifying the object/class structure of the subject. jQuery took this to new heights by allowing chaining of the verbs. Similarly, dojo.query() returns a dojo.NodeList object that has chainable verbs on it.

An explicit namespace is involved, but the idea is to make it as small as possible, "$". So the overhead for having the namespace is "$()". Coupled with the chainable verbs, it can lead to short code and less of a tax on the developer.
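
A small sketch of the wrapping idea in #3. The `$` function and the `Wrapped` type here are hypothetical stand-ins written for this example; they are not jQuery's or Dojo's implementation:

```javascript
// The wrapper object holds the real subject and carries the verbs.
function Wrapped(items) {
  this.items = items;
}

// Each verb returns a new Wrapped, which is what makes chaining possible.
Wrapped.prototype.filter = function (test) {
  var kept = [];
  for (var i = 0; i < this.items.length; i++) {
    if (test(this.items[i])) { kept.push(this.items[i]); }
  }
  return new Wrapped(kept);
};

Wrapped.prototype.map = function (fn) {
  var out = [];
  for (var i = 0; i < this.items.length; i++) {
    out.push(fn(this.items[i]));
  }
  return new Wrapped(out);
};

// Unwrap to get the plain array back.
Wrapped.prototype.value = function () {
  return this.items;
};

// The whole namespace cost is "$()".
function $(items) {
  return new Wrapped(items);
}

var result = $([1, 2, 3, 4])
  .filter(function (n) { return n % 2 === 0; })
  .map(function (n) { return n * 2; })
  .value();   // [4, 8]
```

The verbs live on `Wrapped.prototype`, so the subject's own prototype (here `Array.prototype`) is never modified.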

For Dojo, I think #3 is the right approach in general: it gives nice namespace protection, but still gives short call structures.

I want to explore a dojo() function that does this verb mapping in general for all of Dojo. Dijit might be able to use a dijit() to get something similar for the verbs it exposes in its namespace.

So for instance: dojo({foo: "bar"}).clone() would act the same as dojo.clone({foo: "bar"}). Scopemap dojo to $ and you get $({foo: "bar"}).clone().
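
One way this mapping might be sketched is to generate a wrapper method for every function in a namespace, passing the wrapped subject as the first argument. Everything below is an assumption made for illustration: `lib` is a hypothetical stand-in for a toolkit namespace, and `wrap`, `Wrapper`, `adapt` and `makeVerb` are invented names, not existing Dojo APIs:

```javascript
// Hypothetical namespace written in the namespace.verb(subject) style.
var lib = {
  clone: function (obj) {
    var copy = {};
    for (var key in obj) {
      if (Object.prototype.hasOwnProperty.call(obj, key)) {
        copy[key] = obj[key];
      }
    }
    return copy;
  },
  mixin: function (target, source) {
    for (var key in source) {
      if (Object.prototype.hasOwnProperty.call(source, key)) {
        target[key] = source[key];
      }
    }
    return target;
  }
};

function Wrapper(subject) {
  this.subject = subject;
}

// Turn namespace.verb(subject, a, b) into wrapper.verb(a, b).
function makeVerb(namespace, fn) {
  return function () {
    var rest = Array.prototype.slice.call(arguments);
    return fn.apply(namespace, [this.subject].concat(rest));
  };
}

// Copy every function in the namespace onto the wrapper prototype.
function adapt(namespace, proto) {
  for (var name in namespace) {
    if (Object.prototype.hasOwnProperty.call(namespace, name) &&
        typeof namespace[name] === "function") {
      proto[name] = makeVerb(namespace, namespace[name]);
    }
  }
}

adapt(lib, Wrapper.prototype);

function wrap(subject) {
  return new Wrapper(subject);
}

var viaWrapper = wrap({foo: "bar"}).clone();
var viaNamespace = lib.clone({foo: "bar"});
```

In this sketch each verb returns the raw result of the namespace function, so the two calls behave the same. Making the verbs chainable would mean re-wrapping the return value, at the cost of needing an explicit unwrap step.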

The difficult part is how to deal with verbs/methods that deal with Strings. I want dojo("div") to actually call dojo.query("div"). But how to allow dojo(" some string ").trim()?

Maybe not map dojo("string") to dojo.query("string"), but instead, allow scope mapping of "d" (or even "_"?) to dojo and another mapping of "$" to dojo.query. That would probably match best with the expectations of what $ does today in other toolkits.

I will have to do more exploration. Eugene Lazutkin's work on providing adaptAs* functions for mixing in dojo methods into an object prototype as chainable methods might point the way to do this.


Alex Russell said...

This is a great analysis. It really clarifies a lot for me.

It really does seem clear that the language-intrinsic bits of Dojo are one namespace and we try to unify to that (e.g., connect()), while the DOM/NodeList bits are a different set with different normalized behaviors. Related but different, and giving them different top-level entry points might be a nice way to end the war.

One of the things that's still left lingering in my mind is the difference (if there should be one) between singles and multiples. For dojo.NodeList(), we deal in multiple items, whereas other APIs which are lower level all deal (explicitly) with single elements. Dojo hasn't had an Element class for various reasons, and it seems that Element is a 3rd namespace, or at least a flattened version of the NodeList "thing". Getting to a common view between Elements and NodeLists strikes me as the missing link which would allow Dijit to have an abstraction that springs directly from Core.

Dunno. What do you think?

James Burke said...

Alex, my first thought is that we do not need a separate Element API.

What I would like to see, for all the getter methods on NodeList: if the NodeList has only one element, return the getter value directly instead of wrapping it in an array.

I think this gets to your developer productivity desires, and similar conversations we have had about dojo.create: it is OK if the return type might be different things in different contexts, the context will be obvious for the developer.

One reason to do an Element API may be performance. However, Element should have the same API as NodeList, and the Element type should be interoperable with NodeLists; being able to add them to each other would be nice.

But the performance gain would have to be awesome and the Element implementation size would have to be tiny for it to be worth it.

I would rather see us optimize the NodeList operations as much as possible. Maybe we can fast-path the NodeList operations if there is only one node in the list. If we can do that magic in the adaptAs* functions Eugene added, that might be enough of a speed benefit/implementation size tradeoff to avoid the Element construct.