RequireJS 0.14.1 Released
RequireJS 0.14.1 is now available.
I just pushed a small update that fixes three issues, mostly in the new shortened anonymous module syntax used to wrap traditional CommonJS modules, and in the converter tool that adds the anonymous wrapper.
If you were using the regular RequireJS module format, with the dependencies specified outside the definition function, then you likely do not need to upgrade right away.
In fact, you may want to wait a couple days to see if there are any other updates. Given the newness of the anonymous module code and more people trying it with traditional CommonJS modules, I want to push out quicker releases to give those new users the best experience. I will be sure to mention if the update is recommended for all users.
There is one fix in this release for a type of deeply cyclic/circular dependency issue, but I believe it to be a rare issue for most current users. If 0.14.0 is working for you, then no need to try the latest version right away.
Monday, September 27, 2010
Sunday, September 26, 2010
RequireJS 0.14.0 Released
RequireJS 0.14.0 is now available. The big changes:
- Anonymous modules support, CommonJS Asynchronous Module proposal supported.
- Loading modules from CommonJS packages.
- Bug fixes (see commits from 2010-09-15 through 2010-09-26).
The async module format/async require that is now supported by RequireJS really feels like it is the best of both worlds: something that is close enough to traditional CommonJS modules to allow those environments to support the format, while still having something that performs well and is easy to debug in the browser. I really hope the format can be natively supported in existing CommonJS engines. Until then, RequireJS works in Rhino and has an adapter for Node.
I put up a CommonJS Notes page for people coming from a CommonJS background. Also, the API docs are updated to reflect the simpler anonymous module format, and it includes a new section about loading modules from CommonJS packages.
What is next
Working with CommonJS packages has a few manual steps: finding the package, downloading it, configuring its location. I want to work on a command line package tool that makes this easy. Hopefully it will be able to talk to a server-side package registry too, to allow simpler package lookups by name (something like npm, but something that can house modules in a format used by RequireJS). Kris Zyp has already done some work in this area, and I hope to just use it outright for RequireJS, or leverage the code.
Once that lands, then it feels like it will be time for a RequireJS 1.0 release. The code has been very usable for a few releases now, but I have kept the release numbers below 1.0 to indicate that the final mix of features were being worked out. With the changes in this release, it feels like the major format changes have landed. For those of you who have used previous RequireJS releases, your code should still work fine, and it should work as-is in future releases too.
Monday, September 20, 2010
Anonymous Module Support in RequireJS
Thanks to the clever research and design feedback from Kris Zyp, I just finished committing some preliminary support for anonymous modules in RequireJS.
What are anonymous modules?
They are modules that do not declare their name as part of their definition. So instead of defining a module like so in RequireJS:
require.def('foo', ['bar'], function(bar) {});
You can now do this:
require.def(['bar'], function (bar) {});
When using RequireJS in the browser, the name of the module will be inferred by the script tag that loads it. For Rhino/Node, the module name is known at the time of the require.def call by the require() code, so those environments have an easier way to associate the module definition with the name.
Why is this important?
This allows your modules to be more portable -- if you change the directory structure of where a module is, there are fewer things that you need to change. You still may want to check module dependencies, but RequireJS now fully supports relative module names, like "./bar" and "../bar", so using those helps make your modules more portable.
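Relative ids like "./bar" resolve against the id of the module that references them. A rough sketch of that resolution follows; this is not RequireJS's actual implementation, and real loaders handle more edge cases (plugin prefixes, extra ".." segments, and so on):

```javascript
// Resolve a possibly-relative module id against the id of the
// referring module. Simplified sketch for illustration only.
function resolveId(id, refId) {
  if (id.charAt(0) !== ".") {
    return id; // already a top-level id like "some/module"
  }
  // Start from the directory part of the referring module's id.
  var parts = refId.split("/").slice(0, -1);
  id.split("/").forEach(function (segment) {
    if (segment === "..") {
      parts.pop();
    } else if (segment !== ".") {
      parts.push(segment);
    }
  });
  return parts.join("/");
}

console.log(resolveId("./bar", "pkg/main"));      // "pkg/bar"
console.log(resolveId("../bar", "pkg/sub/main")); // "pkg/bar"
```

So a module can refer to a sibling as "./bar" and keep working no matter which directory the whole package is moved into.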
Requiring a module name in the module definition was also a notable objection that the CommonJS group had to the module format that RequireJS supports natively. By removing this objection, it gets easier to talk about unifying module formats across the groups.
To that end, there has been talk in the CommonJS group of an Asynchronous Module definition, something that allows the modules to work well in the browser without needing any server or client transforms. Some of the participants do not like the module format mentioned above, and prefer something that looks more like the existing CommonJS modules.
Tom Robinson put forward this suggestion:
require.def(function (require, exports, module) {
    var foo = require("foo"),
        bar = require("bar");

    exports.someProp = "value";
});

and use Function.prototype.toString() to pull out the require calls and be sure to load them before executing the module's definition function. After doing some research, it seems like this approach could work for modules in development, and optimizations could be done for deployment that would add the module name and the dependencies outside the function.
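The toString() scanning idea can be sketched like so. This is a naive regex-based version: it ignores comments and computed ids, so real tooling would want a proper parser. The `scanDependencies` name is made up for the sketch:

```javascript
// Read the factory function's source via toString() and collect the
// ids passed to require("..."). Naive regex; illustration only.
function scanDependencies(factory) {
  var source = factory.toString();
  var re = /require\s*\(\s*["']([^"']+)["']\s*\)/g;
  var deps = [];
  var match;
  while ((match = re.exec(source)) !== null) {
    deps.push(match[1]);
  }
  return deps;
}

var deps = scanDependencies(function (require, exports, module) {
  var foo = require("foo"),
      bar = require("bar");
  exports.someProp = "value";
});

console.log(deps); // ["foo", "bar"]
```

The loader can then fetch those dependencies asynchronously and only call the factory once they are all available.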
So I also put support for the above syntax into RequireJS, in an attempt to get to an async module proposal in CommonJS that works for the people that like the old, browser-unfriendly syntax and for people like me that prefer a browser-friendly format I can code in source.
We still need to hash out the proposal more, but I am hopeful we can find a good middle ground. I also hope the above syntax makes it easier to support setting the module export value via "return" instead of having to use "module.exports =" or "module.setExports()".
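The "set the export via return" idea can be sketched with a tiny stand-in define: if the factory returns a value, that value becomes the module; otherwise the exports object is used. `tinyDefine` is a hypothetical helper for illustration, not a real RequireJS API:

```javascript
// Tiny stand-in define: use the factory's return value as the module
// export when one is returned, else fall back to the exports object.
function tinyDefine(factory) {
  var module = { exports: {} };
  var result = factory(null, module.exports, module);
  return result !== undefined ? result : module.exports;
}

// Export a function directly by returning it...
var jQueryish = tinyDefine(function (require, exports, module) {
  return function select(query) { return "selected:" + query; };
});

// ...so callers get the function itself, not a property on an object.
console.log(jQueryish("div")); // "selected:div"
```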
I still plan to support the syntax that RequireJS has supported in the past -- any of this new syntax will hopefully be additive.
What is the fine print?
Only one anonymous module can be in a file. This should not be a problem, since you are encouraged to only put one module in a file for your source code.
The RequireJS optimization tool can group modules together into an optimized file, and it has the smarts to also inject the module name at that time, so you get less typing and a more robust module source form, but still get the optimization benefits for deployment.
In addition to adding the module name, the RequireJS optimization tool will also pull out the dependencies that are specified using the CommonJS Asynchronous Module proposal mentioned above, and add those to the require.def call to make that form more efficient.
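The build-time rewrite described above can be sketched as a string transform: given the source of an anonymous require.def call and the module name inferred from its file path, inject both the name and the scanned dependencies. This is regex-based purely for illustration ("app/main" is a made-up module name); a real optimizer parses the code:

```javascript
// Rewrite an anonymous require.def(function ...) call into the named,
// dependency-listing form. Naive and regex-based; illustration only.
function nameAnonymousModule(moduleName, source) {
  var re = /require\s*\(\s*["']([^"']+)["']\s*\)/g;
  var deps = [];
  var match;
  while ((match = re.exec(source)) !== null) {
    deps.push('"' + match[1] + '"');
  }
  return source.replace(
    /require\.def\s*\(\s*function/,
    'require.def("' + moduleName + '", [' + deps.join(", ") + '], function'
  );
}

var optimized = nameAnonymousModule(
  "app/main",
  'require.def(function (require, exports, module) {\n' +
  '    var foo = require("foo");\n' +
  '});'
);
console.log(optimized);
// require.def("app/main", ["foo"], function (require, exports, module) { ...
```

After the transform, the loader no longer needs to infer the name from a script tag or scan the function at runtime.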
When will it be available?
Right now the code is in the master branch. Feel free to pull it and try it. There may be some loose ends to clean up, but there are unit tests for it, and the old unit tests pass.
This code will likely be part of a 0.14 release. I want to get in loading modules from CommonJS-formatted packages before I do the 0.14 release, so it still is probably a few weeks away. But please feel free to try out the latest code in master to get an early preview.
Again, many thanks to Kris Zyp for seeing patterns I overlooked, doing some great IE research, and for pushing for these changes.
Saturday, September 11, 2010
RequireJS 0.13.0 Released
RequireJS 0.13.0 is available! The changes:
- module.setExports and module.exports are now supported for converted CommonJS modules.
- Bug fixes (see commits from 2010-07-05 through 2010-09-10), in particular a fix to throw an error when a script load times out. That fix should make debugging issues much easier.
Tuesday, July 13, 2010
Simple Modules Feedback
Executive summary, with apologies to Jay-Z: "I got 99 problems but lack of lexical scoping ain't one".
While at the Mozilla Summit, I saw Dave Herman's presentation on a Simple Modules proposal for JavaScript/ECMAScript. Dave posted the slides, and be sure to read his follow-up post. I suggest you read his slides and blog posts first for some background.
[Sidenote: I'm going to use JavaScript instead of ECMAScript in this post -- JavaScript and I go way back, before it got its colonial, skin disease-inspired name.]
Simple Modules is a strawman proposal at the moment; it is still a work in progress. Some of the more interesting parts for me, the dynamic loading, are still very rough and in a separate proposal. It sounded like Dave wants to focus on prototyping the lexical scoping and static loading bits first before proceeding further on the dynamic bits. Actual prototyping is a great idea, and it sounds like they will leverage Narcissus for it.
So some of my feedback may be a bit premature, but some of it gets to why there are modules and what should be allowed as a module, so hopefully that might be useful even at this early stage.
First, some perspective on where my feedback comes from: I am a front-end developer, I do web apps in the browser. I love JavaScript, I want to use it everywhere, and I believe that it is the only language that has the potential to be used effectively anywhere.
However, that is only because JavaScript is available and works well in the browser. Any new solutions for modules should *work well* in the browser to be considered a solution. The browser environment should be treated as a first class citizen, keeping in mind the browser performance implications on any approach. This is one of my main criticisms of CommonJS modules, and the reason I write RequireJS, a module loader that works well in the browser. I also maintain Dojo's module loader and build system.
Why
Why have modules? What are they? Modules are smaller units of code that help build up larger code structures. They make programming in the large easier. They usually have specific scope, and avoid dumping properties into the global scope. Otherwise the likelihood of a name collision between two modules is very high and errors occur. So a module system has syntax to avoid polluting the global space.
There also needs to be a way for modules to reference other modules.
Simple Modules
The Simple Modules proposal outlines a Module {} block to define what looks like a JavaScript object as far as inspection (for .. in notation, dot property referencing), but is something more nuanced underneath.
Anything inside the Module {} block is not allowed to use a global object, and you cannot add/change a Module after its definition. Here is a sample module, called M, that demonstrates some of the syntax and scoped variable implications:
module M {
    //In normal code this would define a global,
    //but not inside the module declaration. This
    //is likely to be an error(?) in Simple Modules.
    foo = "bar";

    //color is only visible within module M's block
    var color = "blue";

    //Creates a publicly visible property called
    //"name" on the module.
    export name = "Module M";

    //setColor is only visible within module M's block
    function setColor() {}

    //Creates a publicly visible property called
    //"reverseName" on the module whose value
    //is a function
    export function reverseName() {}
}

You can reference/statically load other modules via load (syntax is just a placeholder, not set in stone):

module jQuery = load "jquery.js";

The goals with this approach:
- Stronger lexical scoping: no eval or with allowed in the modules, and a loaded module shares some lexical scope with the module that loaded it.
- No access to a global object by default.
- Hopefully better syntax for declaring modules over the existing function-based module pattern.
Lexical scoping/Global access
In addition to using the function-based module pattern to avoid leaking globals, I use JSLint to avoid accessing globals and the use of eval/with. For programming in the large, JSLint helps even more because it enforces a code style that produces much more uniform code. It is built into many editors and easy to run as part of build processes.
JSLint is not perfect (I would like to see JavaScript 1.7/1.8 idioms supported, like let and for each), and you may not like some of the style choices. However, reproducible, consistent style that can be checked automatically is more important than bikeshed-based style choices. It warns of global usage, eval and with, and even helps you find unused local variables.
What is even nicer is that you can opt out of some of the JSLint choices, you can use some globals if you need to. There is some flexibility in the choices.
Functions as Modules
For #3, better syntax for module definitions, I do not see it as a net win over the function(){} module pattern, particularly how it is used for modules in RequireJS where it encourages not defining global objects.
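For comparison, here is the function-based module pattern referenced above, shaped to mirror the `module M` sample: private state stays in the closure, and only the returned object is visible to callers. This is a generic sketch of the pattern, not RequireJS-specific code:

```javascript
var M = (function () {
  // Only visible inside the module function, like `var color` in module M.
  var color = "blue";

  // Private helper, like setColor in module M.
  function setColor(value) { color = value; }

  return {
    // Publicly visible, like `export name`.
    name: "Module M",
    // Publicly visible function, like `export function reverseName`.
    reverseName: function () {
      return this.name.split("").reverse().join("");
    }
  };
}());

console.log(M.name);         // "Module M"
console.log(typeof M.color); // "undefined" -- private state stays hidden
```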
The Simple Modules syntax does not allow exporting a function as the module definition. This is a big wart to me. Functions are first class entities in JavaScript, one of its strongest features. It is really ugly to me that I have to create a property on a module object to export a constructor function, or some module that lends itself to being a function:
module jQuery {
    export jQuery = function () {};
}

/*** In some other file ***/
module jQuery = load "jquery.js";

//ugly
var selection = jQuery.jQuery();

//less ugly, but more typing, so still ugly
var $ = jQuery.jQuery;
var selection = $();

Again, ugly. It should be possible to set the module value to be a function. I know this makes some circular dependency cases harder to deal with, but as I outlined in the CommonJS trade-offs post, it is possible to still have circular dependencies. Even in CommonJS environments now, it is seen as useful. Node supports setting the exported value to a function via module.exports, and there is a more general CommonJS proposal for module.setExports.

It means the developer that codes a circular dependency case needs to take some care, but it works. Coding a circular dependency is a much rarer event than wanting to use a module that exports just a constructor function or other function. The majority use case should not be punished to make a minority use case a little easier, particularly since you can still trigger errors in the minority use case. Coding a circular dependency will always require special care.
This particular point makes it hard for me to get on board with Simple Modules even in its basic lexical scoping/static loading form. I strongly urge any module proposal to make sure functions can be treated as the exported value. We have that capability today with existing module implementations, and it fits with JavaScript and the importance it places on functions.
Given the extra typing that would be needed to access functions that are exported as modules, I do not see the Simple Modules syntax a net win over the function-based module pattern, particularly as used in RequireJS.
Beyond Lexical Scoping
For programming in the large, more capabilities are needed than what has been outlined so far for Simple Modules. Making modules genuinely useful requires attention in the following areas. This is the "99 problems" part:
Dynamic loading
Dynamic loading is harder to work out than static loading. If there is dynamic loading, it is unclear that I would need static loading at all. The goal of modules is to allow programming in the large, and even for a smaller project, why do I need to learn two ways to load modules (static vs. dynamic) when one (dynamic) will do? Dynamic loading is also necessary to enable all the performance options we have today to load scripts in the browser.
There is a module loader strawman proposal that would tie into Simple Modules, but I understand it will not be nailed down more until the basic Simple Modules with static loading is worked out/prototyped.
Referring to other modules
It is unclear how a Module Resource Locator (MRL) is translated to a path to find a module. In CommonJS/RequireJS, an MRL looks like "some/module", and that MRL is used in require() calls to refer to other modules. require("some/module") translates the MRL string "some/module" to some path, "a/directory/that/has/some/module.js". That path is used to find and load the referenced module.
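That translation can be sketched as a small lookup: a baseUrl plus a paths map turns a symbolic id into a URL. The config property names here mirror RequireJS's configuration, but the function itself is hypothetical, written only to illustrate the idea:

```javascript
// Translate a module id like "some/module" into a URL using a baseUrl
// and a paths map. Simplified sketch; real loaders do more.
function idToUrl(id, config) {
  var prefix = id.split("/")[0];
  var mapped = config.paths && config.paths[prefix];
  if (mapped) {
    id = mapped + id.substring(prefix.length);
  }
  return config.baseUrl + id + ".js";
}

var config = {
  baseUrl: "scripts/",
  paths: { some: "vendor/some-1.0" }
};

console.log(idToUrl("some/module", config));
// "scripts/vendor/some-1.0/module.js"
console.log(idToUrl("app/main", config));
// "scripts/app/main.js"
```

Because the mapping lives in config rather than in the module source, moving code to a different directory (or a CDN) only requires a config change, not edits to every require() call.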
Looking at the Simple Modules examples, it looks like just plain URLs are used as the MRL, and those do not scale well for programming in the large. You will want to use a symbolic name for the MRL, and allow some environment config to map those symbolic names to paths. Otherwise it places too many constraints on how the code is stored. It may not even be a disk -- apparently CouchDB uses design docs to store modules.
I have seen some comments about using more symbolic names for MRLs in some of the notes around the proposals, so maybe it is planned.
In RequireJS, the symbolic name is also used in the module definition. However, since symbolic names can be mapped, they do not have to be the reverse DNS symbolic names, like "org/mozilla/foo". In fact it is encouraged to not use long names.
Distributing and sharing modules/module groups (packages)
This issue can be treated separately from a module spec, but it could affect how MRLs are mapped via a module loader. And this issue really is important for programming in the large. The solution may just be "use packages as outlined by CommonJS". While there are still some gray areas in the package-related specs for CommonJS, that could be a fine answer to the problem.
Performance in the browser
This is getting even further away from the basic Simple Modules spec, but a solution to this issue should be considered for any module solution. Many modules need to be delivered to the browser at once in an efficient way. I have heard that Alexander Limi's Resource Packages proposal may be a way to solve this that could work with the Simple Modules approach.
A common loading pattern for web apps will be to load some base scripts from a Content Delivery Network (CDN), then have some domain-specific scripts to load. As long as this still works well with the bundling solution, that is great. We already have tools today to help with bundling, minifying, and gzipping scripts. Any solution will have to be better than what we can do today. Resource Packages could be, since it allows other things, like images, to be effectively bundled.
Summary
I do not feel that Simple Modules are an improvement over what can be done today. In particular, I feel RequireJS used alongside JSLint is a compelling existing solution that works well, and fast, in the browser.
For the more immediate goals of Simple Modules:
- the expanded, stricter lexical scoping is nice, but for a web developer, it is a slight incremental benefit if JSLint is already in use.
- not being able to set a function as the module value means the syntax is not a net win over the function-based module pattern.
I do not want to contribute stop energy around the proposals, I am just hoping to provide feedback to indicate what problems need to be solved better from my web developer viewpoint. I appreciate I could be wrong on some things too. I may be missing something grander or larger, but hopefully if that is the case, this feedback can indicate how to explain the proposals better.
Sunday, July 04, 2010
RequireJS 0.12.0 Released
RequireJS 0.12.0 is available! This release has the following enhancements:
- A new plugin: order -- it ensures that scripts are fetched asynchronously and in parallel, but executed in the order specified in the call to require(). Ideal for traditional browser scripts that do not participate in modules defined via calls to require.def().
- Web Worker support. RequireJS can be used in a web worker.
- Multiple module names can now be mapped via the paths config option to the same URL, and that URL will only be fetched once.
- Added Firefox 2 to supported browsers. Safari 3.2 also works with require().
- Bug fixes.
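The paths deduplication can be sketched like this (a simplified, hypothetical model of the idea; `load`, `urlFor`, and the counter are illustrations, not RequireJS internals): two module names mapped to the same URL should trigger only a single script request.

```javascript
// Two symbolic names mapped to one URL via a paths-style config.
var paths = { "jquery": "libs/jquery-1.4.2", "$": "libs/jquery-1.4.2" };
var fetched = {};
var fetchCount = 0;

function urlFor(name) {
  return (paths[name] || name) + ".js";
}

function load(name) {
  var url = urlFor(name);
  // Only fetch a URL the first time it is seen.
  if (!fetched[url]) {
    fetched[url] = true;
    fetchCount += 1; // stand-in for appending a script tag
  }
}

load("jquery");
load("$");
console.log(fetchCount); // -> 1
```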
Wednesday, May 19, 2010
Using RequireJS syntax in Jetpack Reboot
I have a clone of the Jetpack SDK that has support for the require() and require.def() syntax supported by RequireJS.
Right now the syntax support is very basic. It does not support these features of RequireJS:
- configuring require() by passing in a config object to it
- plugins
- require.modify
- require.nameToUrl
- require.ready (does not make sense)
It does support loading dependencies via:
require(["dependency"], function (dependency) {});
and defining modules via:
require.def("moduleName", ["dependency"], function (dependency) {});
It should also support CommonJS modules that were converted to RequireJS syntax via the conversion tool in RequireJS, but I have not tested it extensively.
The changes are just in one file in the sdk, securable-module.js. So you could just grab that file if you wanted to play with it. There is a sample app in the source if you want to see it in action. Also viewing the changeset shows the diff on the securable-module.js file as well as the example app source.
The full cloned repo is available via:
hg clone http://hg.mozilla.org/users/jrburke_gmail.com/jetpack-sdk-requirejs
Why do this? Because sharing code between the browser and other environments is hard with the regular CommonJS syntax. It does not work well in the browser. The browser-based CommonJS loaders that use eval() have a worse debugging experience. Starting with the RequireJS syntax makes it easy to transfer the modules for use in the web browser, and the RequireJS code works in Node and Rhino.
I would like to add support for RequireJS plugins in Jetpack. I can see the i18n plugin and text file plugin being useful for Jetpacks. That will likely take more work though. I want to see if the basic syntax support is useful first.
I ended up not using that much RequireJS code, just some argument conversions and supporting "setting the exported value". It relies on the existing Jetpack code for paths and package support.
Sunday, May 16, 2010
RequireJS 0.11.0 Released
RequireJS 0.11.0 is available to download! This release has the following enhancements:
- There is a new priority config option to indicate priority, parallel download of build layers.
- A new JSONP plugin allows you to treat any JSONP service as a dependency.
- require.js should be Caja-compliant. The plugins may not be, but the main require.js file passed cajoling on http://caja.appspot.com/.
- Instructions and optimization support for renaming require().
- There is a new RequireJS+Transport D download option that supports the CommonJS Transport D proposal. This can be useful in conjunction with the server-side Transporter project.
The priority config option is the parallel download support I mentioned in the "A require() for jQuery" post. I now believe RequireJS meets all the requirements outlined in that post.
Some icing on the cake I want to pursue: a server-based service that can create optimization layers on the fly. I have all the pieces in place in the optimization tool to allow this, and I previously built a server build option for Dojo. With that, you could conceivably use the priority config support with a server that did the optimization layers on the fly:
require({
priority: [
"http://your.domain.com/opt?include=event,object,widget,Dialog&exclude=jquery",
"http://your.domain.com/opt?include=page1,Tabs&exclude=jquery,event,object,widget,Dialog&exclude=jquery"
]
}, ["page1"]);
Or something like that. The fun part -- this server endpoint would use server-side JavaScript, since the optimization tool in RequireJS is built in JavaScript. I could use Node or something Rhino-based. It is likely to be Rhino-based, since that allows the minifier, Closure Compiler, to work on the fly (Closure Compiler is written in Java).
That server-based service will likely take more design work and thought, but if you feel it is something necessary for your project, please let me know. Better yet, if you want to contribute to the project in this area, leave a note on the mailing list.
Thursday, April 29, 2010
A require() for jQuery
I had a fun time at the Bay Area jQuery Conference. Great people, and I learned some neat things.
In the conference wrap-up, John Resig mentioned some requirements he has for a jQuery script loader:
1) script loading must be async
2) script loading should do as much in parallel as possible. This means in particular, that it should be possible to avoid dynamic nested dependency loading.
3) it looks like a script wrapper is needed to allow #1 and #2 to work effectively, particularly for cross-domain loading. It is unfortunate, but a necessity for script loading in browsers.
I believe these requirements mesh very well with RequireJS. I will talk about how they mesh, and some other things that should be considered for any require() that might become part of jQuery.
Async Loading
As explained in the RequireJS Why page, I believe the best-performing, native browser option for async loading is dynamically created script tags. RequireJS only uses this type of script loading, no XHR.
The text plugin uses XHR in dev mode, but the optimization tool inlines the text content to avoid XHR for deployment. Also, the plugin capability in RequireJS is optional, it is possible to build RequireJS without it. That is what I do for the integrated jQuery+RequireJS build.
Parallel Loading
John mentioned that dynamic nested dependency resolution was slower and potentially a hazard for end users. Slow, because it means you need to fetch the module, wait for it to be received, then fetch its dependencies. So the module gets loaded serially relative to its dependencies. Potentially hazardous because a user may not know the loading pattern.
The optimization tool in RequireJS avoids the serial loading of nested dependencies by inlining the modules together. The optimization tool can also build files into "layers" that can be loaded in parallel.
For each build layer, there is an exclude option in which you can list the module or modules you want to exclude; exclude also removes their nested dependencies from the build layer.
There is an excludeShallow option if you want to exclude just specific modules while still including their nested dependencies in the build layer. This is a great option for making your development process fast: just excludeShallow the current module you are debugging/developing.
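A rough model of the exclude vs. excludeShallow distinction, using a hypothetical dependency map (this illustrates the semantics only, not the optimizer's actual code):

```javascript
// deps maps each module to its direct dependencies (hypothetical app).
var deps = {
  "app/page1": ["app/common", "app/page1/view"],
  "app/common": ["app/common/helper"],
  "app/common/helper": [],
  "app/page1/view": []
};

// Collect a module and all of its nested dependencies.
function collect(name, out) {
  if (!out[name]) {
    out[name] = true;
    deps[name].forEach(function (d) { collect(d, out); });
  }
  return out;
}

function buildLayer(root, exclude, excludeShallow) {
  var excluded = {};
  // exclude: drop the module AND everything it pulls in.
  (exclude || []).forEach(function (m) { collect(m, excluded); });
  // excludeShallow: drop only the named module itself.
  (excludeShallow || []).forEach(function (m) { excluded[m] = true; });
  var all = collect(root, {});
  return Object.keys(all).filter(function (m) { return !excluded[m]; });
}

console.log(buildLayer("app/page1", ["app/common"], []));
// app/common and app/common/helper are both left out of the layer
console.log(buildLayer("app/page1", [], ["app/common"]));
// only app/common itself is left out; its helper stays in the layer
```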
While dynamically loading nested dependencies can be slower than a full parallel load, each module still needs to list its dependencies individually. There needs to be a way to know what an individual file needs in order to function if the file is to be portable in any fashion. So the question is how to specify those dependencies for a given file/module.
There are schemes that list the dependencies in a separate companion file and schemes that list them in the module file itself. Using a separate file means the module is less portable -- more "things" need to travel with the module, which makes copy/pasting or distributing a single module more onerous.
So I prefer listing the dependencies in the file. Should the dependencies be listed in a comment or as some sort of script structure?
Comments can be nice since they can be stripped from the built/optimized layer. However, it means modules essentially need to communicate with each other through the global variable space. This ultimately does not scale -- at some point you will want to load two different versions of a module, or two modules that want to use the same global name, and you will be stuck. For that reason, I favor the way RequireJS does it:
require.def("my/module", ["dependency1"], function (dependency1) {
    //dependency1 is the module definition for "dependency1"
    //Return a value to define "my/module"
    return {
        limit: 500,
        action: function () {}
    };
});
With this model, dependency1 does not need to be global, and it allows a very terse way to reference the module. It also minifies nicely. By using string names to reference the modules and using a return value from the function, it is then possible to load two versions of a module in a page. See the Multiversion Support page for RequireJS for more info, and the unit tests for a working example.
This model also frees the jQuery object from namespace collisions by allowing a terse way to reference modules without needing them to hang off of the jQuery object. There are many utility functions that do not need to be on the jQuery object to be useful, and today the jQuery object itself is starting to become a global of sorts that can have name collisions.
Script Wrapper
Because async script tags are used to load modules, each script needs to be wrapped in a function wrapper, to prevent its execution before its dependencies are ready. CommonJS recognizes this concern (one of the reasons for their Transport proposals) and so does YUI3. xdomain builds for Dojo also use a script wrapper.
While it is unfortunate -- many people are not used to it -- it ends up being an advantage. Functions are JavaScript's natural module construct, and it encourages well scoped code that does not mess with the global space. For RequireJS, that wrapper is called require.def, as shown above.
Here are some other things that should be considered for a require implementation:
require as a global
I believe it makes more sense to keep require as a global, not something that is a function hanging off of the jQuery object. require can be used to load jQuery itself, and as mentioned above, it would be possible to load more than one version of jQuery if it was constructed like this.
CommonJS awareness
The CommonJS module format was not constructed for the browser, but having an awareness of their design goals and a way to support their modules in the browser will allow more code reuse. RequireJS has an adapter for the CommonJS Transport/D proposal, and it has a conversion script to change CommonJS modules into RequireJS modules.
In addition, RequireJS was constructed with many of the same design goals as CommonJS: allow modules to be enclosed/do not pollute the global space, use the "path/to/module" module identifiers, have the ability to support the module and exports variables used in CommonJS.
Browsers need more than a require API
They also need an optimization/build tool that can combine modules together. RequireJS has such a system today: a server-independent command line tool. It builds the layers as static files that can be served from anywhere.
I am more than happy to look at a runtime system that uses the optimization tool on the server. RequireJS works in Node and in Rhino. The optimization tool is written in JavaScript and uses require.js itself to build the optimization layers.
I can see using either Node or Rhino to build a run-time server tool to allow combo-loading on the fly. Using Rhino via the Java VM has an advantage because Closure Compiler or YUI Compressor could be used to minify the response, but I am open to some other minification scheme that is implemented in plain JavaScript.
Loader plugins
I have found the text plugin for RequireJS to be very useful -- it allows you to reference HTML templates on disk and edit HTML in an HTML editor vs. dealing with HTML in a string. The optimization tool is smart enough to inline that HTML during a build, so the extra network cost goes away for deployment.
In addition, Sean Vaughan and I have been talking about support for JSONP-based services and scripts that need extra setup besides just being ready on the script onload event. I can see those as easy plugins to add that open up loading Google Ajax API services on the fly.
For these reasons I have found loader plugins to be useful. They are not needed in the basic case, but they can make overall dependency management better.
script.onload
Right now RequireJS has support for knowing when a script is loaded by waiting for the script.onload event. This could be avoided by mandating that anything loaded via require() register via require.def to indicate when it is loaded.
However, using script.onload allows some existing scripts to be loaded without modification today, giving people time to migrate to the require.def pattern. I am open to doing a build without the script.onload support; however, the minified file savings would not be that great.
Explicit .js suffix
RequireJS allows two different types of strings for dependencies. Here is an example:
require(["some/module", "http://some.site.com/path/to/script.js"]);
"some/module" is transformed to "some/base/path/some/module.js", while the second dependency is used as-is.
The transform rules for a dependency name are as follows: if the name contains a colon before a forward slash (i.e., has a protocol), starts with a forward slash, or ends in .js, do not transform the name. Otherwise, transform the name to "some/base/path/some/module.js".
I believe that gives a decent compromise between short, remappable module names (by changing the baseUrl or setting a specific path via a require config call) and loading scripts that do not participate in the require.def call pattern. There is also a regexp property on require that can be changed to allow more exceptions to the rules.
However, if this was found insufficient, I am open to other rules or a different way to list dependencies. The "some/module" format was chosen to be compatible with CommonJS module names, but probably some algorithm or approach could be used to satisfy both desires.
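The rule above can be sketched as a small function (a hypothetical simplification for illustration, not the loader's actual code):

```javascript
// Apply the dependency-name rule: names with a protocol, a leading
// slash, or a .js suffix are used as-is; everything else is treated
// as a module name relative to the base path.
function nameToUrl(name, baseUrl) {
  var colon = name.indexOf(":");
  var slash = name.indexOf("/");
  var hasProtocol = colon !== -1 && (slash === -1 || colon < slash);
  if (hasProtocol || name.charAt(0) === "/" || /\.js$/.test(name)) {
    return name; // use as-is
  }
  return baseUrl + "/" + name + ".js";
}

console.log(nameToUrl("some/module", "some/base/path"));
// -> "some/base/path/some/module.js"
console.log(nameToUrl("http://some.site.com/path/to/script.js", "some/base/path"));
// -> unchanged, since it has a protocol and ends in .js
```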
File Size/Implementation
Right now the stock RequireJS is around 3.7KB minified and gzipped. However, there are build options that get the size down to 2.6KB minified and gzipped by removing some features:
- plugin support
- require.modify
- multiversion support (the "context" switching in RequireJS)
- DOM Ready support
I am open to getting that file size smaller based on the feature set that needs to be supported.
3 layer loading
John mentioned a typical loading scenario that might involve three sections:
1) loading core libraries from a CDN (like jQuery and maybe a require implementation)
2) loading a layer of your common app scripts
3) loading a page-specific layer
RequireJS can support this scenario like so today:
<script src="http://some.cdn.com/jquery/1.5/require-jquery.js"></script>
<script>
require({
baseUrl: "./scripts"
},
["app/common", "app/page1"]
);
</script>
Then the optimization tool instructions would look like so:
{
modules: [
{
//inside app/common.js there is a require call that
//loads all the common modules.
name: "app/common",
exclude: ["jquery"]
},
{
//app/page1 references jquery and app/common as dependencies,
//as well as page-specific modules
name: "app/page1",
//jquery, app/common and all their dependencies will be excluded
exclude: ["jquery", "app/common"]
},
... other pages go here following same pattern ...
]
}
This would result in app/common and app/page1 being loaded async and in parallel. If require.js were a separate file from jquery.js, the following HTML could be used to load jQuery, app/common, and app/page1 async and in parallel (the optimization instructions stay the same):
<script src="http://some.cdn.com/jquery/1.5/require.js"></script>
<script>
require({
baseUrl: "./scripts",
paths: {
"jquery": "http://some.cdn.com/jquery/1.5/jquery"
}
},
["jquery", "app/common", "app/page1"]
);
</script>
Those configurations work today. However, it is not quite flexible enough -- typically modules that are part of app/page1 will not want to refer to the complete "app/common" as their only dependency, but to specify finer-grained dependencies, like "app/common/helper". So the above could result in a separate request for "app/common/helper" from the "app/page1" script, depending on how fast "app/common" is loaded.
So I would build in support for the following:
Notice the new "layers" config option; now the only required module for the page is "app/page1". The "layers" config option would tell RequireJS to load all of those layers first, and find out what is in them, before trying to fetch any other dependencies.
<script src="http://some.cdn.com/jquery/1.5/require.js"></script>
<script>
require({
baseUrl: "./scripts",
paths: {
"jquery": "http://some.cdn.com/jquery/1.5/jquery"
},
layers: ["jquery", "app/common", "app/page1"]
},
["app/page1"]
);
</script>
This would give the most flexibility in coding individual modules, while still giving a very clear optimization path to a configurable number of script layers loaded async and in parallel. I will be working on this feature for the next RequireJS release.
Summary
Hopefully I have demonstrated how RequireJS could be the require implementation for jQuery. I am very open to doing code changes to support jQuery's desires, and even if jQuery or John feel like they want to write their own implementation, hopefully we can at least agree on the same API, and maybe even still use the optimization tool in RequireJS. I am happy to help with an alternative implementation too.
I know John and the jQuery team are busy, focusing mostly on mobile and templating concerns, but hopefully they can take the above into consideration when they get to script loading.
In the meantime, I will work on the layers config option support, improving RequireJS, and keeping my jQuery fork up to date with the changes. You can try out RequireJS+jQuery today if you want to give it a spin yourself.
Sunday, April 25, 2010
RequireJS+jQuery Talk
I gave a talk about RequireJS with jQuery at the jQuery Conference today. Here are the slides:
Thanks to the folks that came to the talk! I had a great time at the conference.
If you went to the talk, please feel free to rate the talk so I can improve for the next time.
Friday, April 23, 2010
RequireJS 0.10.0 Released, Node integration
RequireJS 0.10.0 is now available.
The big feature in this release is integration with Node. Now you can use the same module format for both browser and server-side modules. The RequireJS-Node adapter translates existing CommonJS modules on the fly, as they are loaded by the adapter, so you can continue to use server modules written in the CommonJS format for your Node projects.
The RequireJS-Node adapter is freshly baked, so there could be some rough edges with it, but it is exciting to see it work. See the docs for all the details.
0.10.0 also includes support for an excludeShallow option in the optimization tool. It allows you to do an optimization build during development, but still exclude just the specific module you want to develop/debug in the browser -- only that module, not its dependencies. So you can get great debug support in the browser for that one module, but still load the rest of your JS super-fast. No need for special server transforms.
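A sketch of what that might look like in a build profile (the module names here are made up for illustration; check the docs for the exact option shape):

```javascript
//Hypothetical build profile: optimize the whole app, but keep
//"app/editor" out of the built file -- and only app/editor itself,
//not its dependencies -- so that one module stays easy to debug.
({
    name: "app/main",
    excludeShallow: ["app/editor"]
})
```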
I will be at the jQuery conference this weekend in Mountain View, CA. I will be speaking on Sunday about jQuery+RequireJS. Stop by and say hi if you are at the conference!
Tuesday, April 13, 2010
JavaScript object inheritance with parents
There are different ways to inherit functionality in JavaScript, including using mixins (mixing in all the properties of one object into another object) and the use of prototypes.
In Dojo, there is dojo.mixin for doing mixins, and dojo.delegate for inheriting properties via prototypes. dojo.delegate is like ECMAScript 5/Crockford's Object.create(), but with a dojo.mixin convenience call.
I really like the dojo.delegate or a Object.create+dojo.mixin combination for inheriting, but it makes it hard to call methods you override from your parent. I see this problem show up frequently with widgets, which typically inherit from each other:
function MyWidget() {}
MyWidget.prototype = Object.create(BaseWidget.prototype);
//BaseWidget also defines a postCreate method,
//but we want our widget to do work too.
MyWidget.prototype.postCreate = function () {
    //Call BaseWidget's implementation
    BaseWidget.prototype.postCreate.apply(this, arguments);
    //Do MyWidget's postCreate work here.
};
Not too bad, but the BaseWidget.prototype.postCreate.apply junk is a bit much to type, and it gets a bit trickier when there are mixins that also contribute to the functionality.
In Dojo, there is dojo.declare() that helps with this by defining an "inherited" method that can be used to find the BaseWidget's postCreate:
var MyWidget = dojo.declare(BaseWidget, {
    postCreate: function () {
        //Call BaseWidget's implementation
        this.inherited("postCreate", arguments);
        //Do MyWidget's postCreate work here.
    }
});
This is an improvement as far as typing goes, but the implementation of dojo.declare has always scared me. My JavaScript Fu is not strong enough to follow it, and I am concerned it is actually a bit too complicated.
So here is an experiment on something simpler:
var MyWidget = object("BaseWidget", null, function (parent) {
    return {
        postCreate: function () {
            //Call BaseWidget's implementation
            parent(this, "postCreate", arguments);
            //Do MyWidget's postCreate work here.
        }
    };
});
Here is the implementation of that object function, and here are some tests. That implementation is wrapped in a RequireJS module, but it can be extracted as a standalone script.
The second argument to the object() function allows for specifying mixins.
With two mixins, mixin1 and mixin2, the parent for MyWidget would be an object that inherits from BaseWidget with mixin1 and mixin2's properties mixed in:
var MyWidget = object("BaseWidget", [mixin1, mixin2], function (parent) {
    return {
        postCreate: function () {
            //Call BaseWidget's postCreate, but if it
            //does not have a postCreate method, mixin1's
            //postCreate function will be used. If mixin1
            //does not have an implementation, then mixin2's
            //postCreate function will be used. If mixin2 does
            //not have an implementation an error is thrown.
            parent(this, "postCreate", arguments);
            //Do MyWidget's postCreate work here.
        }
    };
});
dojo.declare has the concept of calling a method called "constructor" if it is defined on the declared object, whenever a new object of the MyWidget type is created. I preserved that ability in object(), but the property name for that function is "init" in the object() implementation.
The object() implementation is simpler than dojo.declare, but still gives easy access for calling a parent implementation of a function. It is not as powerful as dojo.declare -- dojo.declare has the concept of a postscript and a preamble and even auto-chaining calls. However, I feel the simplified approach is better. It is clearer to follow the code, and to predict how it will behave. I also expect it to perform better.
I like the object() method because it uses closures and a function that accepts the parent function as an argument. Feels very JavaScripty. The prototype chain is a bit longer with the extra object.create() calls creating some intermediate objects, but I expect prototype walking is fast in JavaScript, particularly when you go to measure it in comparison to any DOM operation.
Are there ways in which the object() function is broken or insufficient? Is there a better way to do this? Or even a different way, something that does not rely on a parent reference?
There is traits.js, for using traits. Alex Russell experimented with a trait implementation inside dojo.delegate. Kris Zyp pointed out that Alex's implementation does not have conflict detection or method require support.
I like the idea of mixing in just part of a mixin or remapping a method to fit some other API's expectations, so I can see adding support for the remapping features, similar to what Alex does in the dojo.delegate experiment. However, I am not sure how valuable conflict detection or method require support is.
I can see in large systems it would help with detecting errors sooner, but then maybe the bigger problem is the complexity of the large system. And there is a balance to forcing strictness up front over ease of use. The trait.js syntax looks fairly wordy to me, and the extra benefit of the strictness may not be realized for most web apps.
Also, I do not see an easy way to get the parent reference. It seems like you need to remap each overridden parent function you want to call to a new property name. It seems wordy, with more properties hanging off an object. And do you need to make sure you do not pick a name that is already in use by an ancestor? Seems like it could lead to a bunch of goofy names on an object.
Reusing code effectively is an interesting topic. The traits approach is newer to me, and I keep wondering if there is a better way to do it. It has been fun to experiment with alternatives.
Tuesday, March 30, 2010
CommonJS Module Trade-offs
First of all: why should you care about module formats?
If you use JavaScript, particularly in the browser, more is being expected of you each day. Every site or webapp that you build will want to do more things over time, and browser engines are getting faster, making more complex, web-native experiences possible. Having modular code makes it much easier to build these experiences.
One wrinkle though, there is no standard module format for the browser. There is the very useful Module Pattern, that helps encapsulate code to define a module, but there is no standard way to indicate your module's dependencies.
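As a concrete reminder, here is a minimal Module Pattern sketch (the counter module is a made-up example): it encapsulates private state nicely, but nothing in it declares what other scripts it depends on.

```javascript
//The Module Pattern: an immediately-invoked function expression
//creates a private scope, and the returned object is the module's
//public API. Note there is no way to declare here that this module
//needs, say, a "logger" script loaded before it runs.
var counter = (function () {
    var count = 0; //private, invisible outside the function

    return {
        increment: function () {
            count += 1;
            return count;
        }
    };
}());
```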
I have been following some of the threads in the CommonJS mailing list about trying to come up with a require.async/ensure spec and a Transport spec. The reason those two specs are needed in addition to the basic module spec is because the CommonJS module spec decided to make some tradeoffs that were not browser-friendly.
This is my attempt to explain the trade-offs the CommonJS module spec has made, and why I believe they are not the right trade-offs. The trade-offs end up creating a bunch of extra work and gear that is needed in the browser case -- to me, the most important case to get right.
I do not expect this to influence or change the CommonJS spec -- the developers that make up most of the list seem to generally like the module format as written. At least they agreed on something. It is incredibly hard to get a group of people to code in a certain direction, and I believe they are doing it because they love coding and want to make it easier.
I want to point out the trade-offs made though, and suggest my own set of trade-offs. Hopefully by explicitly listing them out, other developers can make informed choices on what they want to use for their project.
Most importantly, just because "CommonJS" is used for the module spec, it should not be assumed that it is an optimal module spec for the browser, or that it should be the default choice for a module spec.
Disclosure: I have a horse in this race, RequireJS, and much of its design comes from a different set of tradeoffs that I will list further down. I am sure someone who prefers the CommonJS spec might have a different take on the trade-offs.
To the trade-offs:
1) No function for encapsulating a module.
A function around a module can seem like more boilerplate. Instead each module in the CommonJS spec is just a file. This means only one module per file. This is fine on the server or local disk, but not great in the browser if you want performance.
2) Referencing and loading dependencies synchronously is easier than asynchronous
In general, sync programming is easier to do. That does not work so well in the browser though.
3) exports
How do you define the module value that other modules can use? If a function was used around the module, a return value from that function could be used as the module definition. However, in the effort to avoid a function wrapper, it complicates setting up a return value. The CommonJS spec instead uses a free variable called "exports".
The value of exports is different for each module file, and it means that you can only attach properties to the exports object. Your module cannot assign a new value to exports.
It means you cannot use a function as the module value. Some frameworks use constructor functions as their module values -- these are not possible in CommonJS modules. Instead you will need to define a property on the exports object that holds the function. More typing for users of your module.
Using an exports object has an advantage: you can pass it to circular dependencies, and it reduces the probability of an error in a circular dependency case. However, it does not completely avoid circular dependency problems.
Instead, I favor these trade-offs:
1) Use a function to encapsulate the module.
This is basically the core of the previously-mentioned Module Pattern. It is in use today, it is an understood practice, and functions are at the core of JavaScript's built-in modularity.
While it is an extra function(){} to type, it is fairly standard to do this in JavaScript. It also means you can put more than one module in a file.
While you should avoid multiple modules in a file while developing, being able to concatenate a bunch of modules together for better performance in the browser is very desirable.
2) Assume async dependencies
Async performs better overall. While it may not help performance much in the server case, making sure a format performs well out of the box in the browser is very important.
This means module dependencies must be listed outside the function that defines the module, so they can be loaded before the module function is called.
3) Use return to define modules
Once a function is used to encapsulate the module, the function can return a value to define the module. No need for exports.
This fits more naturally with basic JavaScript syntax, and it allows returning functions as the module definition. Hooray!
There is a slightly higher chance of problems in circular dependency cases, but circular dependencies are rare, and usually a sign of bad design. There are valid cases for having circular dependencies, but the cases where a return value might be a problem in a circular dependency are very rare, and they can be worked around.
If getting function return values means a slightly higher probability of a circular dependency error (which has a mitigation), then that is a good trade-off.
This avoids the need for the "exports" variable. This is fairly important to me, because exports has always looked odd to me, like it did not belong. It requires extra discovery to know its purpose.
Return values are more understandable, and allowing your module to return a function value, like a constructor function, seems like a basic requirement. It fits better with basic JavaScript.
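Sketching what that looks like with the require.def() format used later in this post -- the def() helper here is a tiny synchronous stand-in for illustration only, and "widgets/Button" is a made-up module name:

```javascript
//Tiny synchronous stand-in for a module registry, just enough to show
//a module whose value is a constructor function. Real RequireJS loads
//dependencies asynchronously; this shim only demonstrates the shape.
var modules = {};
function def(name, deps, factory) {
    modules[name] = factory.apply(null, deps.map(function (d) {
        return modules[d];
    }));
}

def("widgets/Button", [], function () {
    //The return value IS the module: a constructor function --
    //something the exports-object approach cannot express directly.
    function Button(label) {
        this.label = label;
    }
    Button.prototype.press = function () {
        return this.label + " pressed";
    };
    return Button;
});

var Button = modules["widgets/Button"];
var okButton = new Button("OK");
```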
4) Pass in dependencies to the module's function wrapper
This is done to decrease the amount of boilerplate needed with function-wrapped modules. If this is not done, you end up typing the dependency name twice (an opportunity for error), and it does not minify as well.
An example: let's define a module called "foo", which needs the "logger" module to work:
require.def("foo", ["logger"], function () {
    //require("logger") can be a synchronous call here, since
    //logger was specified in the dependency array outside
    //the module function
    require("logger").debug("starting foo's definition");

    //Define the foo object
    return {
        name: "foo"
    };
});
Compare with a version that passes in "logger" to the function:
require.def("foo", ["logger"], function (logger) {
    //Once the "logger" module is loaded it is passed
    //to this function as the logger function arg
    logger.debug("starting foo's definition");

    //Define the foo object
    return {
        name: "foo"
    };
});
Passing in the module has some circular dependency hazards -- logger may not be defined yet if it was part of a circular dependency. So the first style, using require() inside the function wrapper, should still be allowed. For instance, require("logger") inside a method that is created on the foo object could be used to avoid the circular dependency problem.
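As a rough sketch of that mitigation (plain objects stand in for modules here; there is no real loader involved): the dependency is looked up lazily inside a method, by which time both sides of the cycle are fully defined.

```javascript
//Stand-in registry: imagine "foo" and "logger" were circularly
//dependent modules. Instead of capturing logger as a function arg
//(which could be undefined mid-cycle), foo looks it up when needed.
var registry = {};

registry["logger"] = {
    debug: function (msg) {
        return "DEBUG: " + msg;
    }
};

registry["foo"] = {
    name: "foo",
    report: function () {
        //Lazy lookup, analogous to calling require("logger") inside
        //a method instead of at module definition time.
        return registry["logger"].debug("foo is reporting");
    }
};
```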
So again, I am making a trade-off where the more common useful case is easier to code vs increasing the probability of circular dependency issues. Circular dependencies are rare, and the above has a mitigation via the use of require("modulename").
There is another hazard that can happen with naming args in the function for each dependency. You can get an off-by-one problem:
require.def("foo", ["one", "two", "three"], function (one, three) {
    //In here, three is actually pointing to the "two" module
});
However, this is a standard coding hazard: failing to match input args to a function. And there is a mitigation: you could use require("three") inside the module if you wanted.
The convenience and less typing of having the argument be the module is useful. It also fits well with JSLint -- it can help catch spelling errors using the argument name inside the function.
5) Code the module name inside the module
To define the foo module, the name "foo" needs to be part of the module definition:
require.def("foo", ["logger"], function () {});
This is needed because we want the ability to combine multiple module definitions into one file for optimization. In addition, there is no good way to match a module definition to its name in the browser without it.
If script.onload fired exactly after the script is executed, not having the module name in the module definition might work, but this is not the case across browsers. And we still need to allow the name to be there for optimization case, where more than one module is in a file.
There is a legitimate concern that encoding the module name in the module definition makes it hard to move around code -- if you want to change the directory where the module is stored, it means touching the module source to change the names.
While that can be an issue, in Dojo we have found it is not a problem. I have not heard complaints about that specific issue. I am sure it happens, but the fix cost is not that onerous. This is not Java. And YUI 3 does something similar to Dojo, encoding a name with the module definition.
I think this issue occurs rarely and fixing it is a one-time cost, versus forcing every browser developer to take on extra, ongoing costs from using the CommonJS module format in the browser.
Conclusion
Those are the CommonJS trade-offs and my trade-offs. Some of them are not "more right" but just preferences, just like any language design. However, the lack of browser support in the basic module spec is very concerning to me.
In my eyes, the trade-offs CommonJS has made put more work on browser developers, who have to navigate more specs and need more gear to get it all to work. Adding more specs that allow modules to be expressed in more than one way is not a good solution for me.
I see it as the CommonJS module spec making a specific bet: treating the browser as a second class module citizen will pay off in the long run and allow it to get a foothold in other environments where Ruby or Python might live.
Historically, and more importantly for the future, treating the browser as second class is a bad bet to make.
All that said, I wish the CommonJS group success, and there are lots of smart people on the list. I will try to support what I can of their specs in RequireJS, but I do feel the trade-offs in the basic module spec are not so great for browser developers.
RequireJS 0.9.0 Released
I just pushed a new release of RequireJS, 0.9.0.
The optimization tool has seen the most change in this release. It sports some CSS optimizations now and it is much more robust. It also includes command line options for optimizing just one JS file or one CSS file.
The other new feature is the support for relative module names for require.def() dependencies. So this kind of call works now:
require.def("my/project/module", ["./dependency1"], function(){});
It will load my/project/dependency1.js. This should help cut down the amount of typing for larger projects that have deep directories of modules.
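As a sketch of the resolution rule (simplified and hypothetical; the real loader also handles "../" segments and path configuration):

```javascript
//Resolve a dependency name relative to the module that declares it.
//"./dependency1" declared inside "my/project/module" resolves to
//"my/project/dependency1"; top-level names pass through unchanged.
function resolveRelative(moduleName, depName) {
    if (depName.indexOf("./") === 0) {
        var parts = moduleName.split("/");
        parts.pop(); //drop the module's own name, keeping its directory
        return parts.concat([depName.slice(2)]).join("/");
    }
    return depName;
}
```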
This release has some backwards-incompatible changes. That was the reason for the bump to 0.9.0. The project is still not at 1.0, so backwards-incompatible changes may still be considered. I do not have any more changes like that planned, but I will be sure to give more notice in the RequireJS list before doing so in the future.
All the details are on the download page.
Raindrop has been updated to the latest RequireJS release, and it is working great. Give the new RequireJS build a spin!
Sunday, March 14, 2010
RequireJS, kicking some AST
RequireJS has an optimization tool that can combine and minify your scripts. It uses Google's Closure Compiler to do the minification. Recently, but after the RequireJS 0.8.0 release, I ported over the CSS optimizations from the Dojo build system, so the optimization tool now inlines @import calls and removes comments from CSS files.
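The @import inlining idea can be sketched roughly like this (a simplified, hypothetical version; the real tool also adjusts relative url() paths inside the imported file):

```javascript
//Replace each @import statement with the contents of the file it
//references. readFile is whatever function fetches a file's text.
function inlineImports(cssText, readFile) {
    return cssText.replace(
        /@import\s+(?:url\(\s*)?["']?([^"')\s]+)["']?\s*\)?\s*;/g,
        function (match, fileName) {
            return readFile(fileName);
        }
    );
}
```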
The script combining still has some rough edges though, mainly because I was trying to use suboptimal regexps to find require() and require.def() calls in the files, so that a script's dependencies could be traced.
So I finally took the dive into Abstract Syntax Trees (ASTs) to do the work. What is an AST? An analogy that works for me: an AST is to JavaScript source as the DOM API is to HTML source. The AST has methods for walking through the nodes in the JS code structure, and you can get properties on a node.
Figuring out how to generate an AST from scratch can be a bit of work, but since I was already using Closure Compiler, I just used an AST it can generate.
Since the optimization tool for RequireJS is written in JavaScript, which makes calls into Java-land to do file access and minification calls, I wanted the same approach for working with the AST -- do my work in JavaScript, but call the Java methods for the AST walking and source transform.
My task was fairly simple -- I just wanted to find require() or require.def() calls that used strings for module names and dependencies, pull those calls out of the file, then just execute those calls to work out the dependencies.
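That "execute those calls" step can be sketched like this (hypothetical and much simpler than the real parse.js):

```javascript
//Execute extracted require()/require.def() call text against a fake
//require whose only job is to record the dependency arrays.
var dependencies = [];

function collect(deps) {
    if (deps instanceof Array) {
        dependencies = dependencies.concat(deps);
    }
}

function trace(extractedSource) {
    var fakeRequire = function (deps) { collect(deps); };
    fakeRequire.def = function (name, deps) { collect(deps); };
    //Run the extracted call with "require" bound to the fake.
    new Function("require", extractedSource)(fakeRequire);
}

//Pretend this string was pulled out of a source file via the AST:
trace('require.def("my/module", ["my/dep", "logger"], function () {});');
```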
The end result was this file:
http://github.com/jrburke/requirejs/blob/master/build/jslib/parse.js
The basic idea of the script:
//Set up a shortcut to the long Java package name,
//and create a Compiler instance.
var jscomp = Packages.com.google.javascript.jscomp,
    compiler = new jscomp.Compiler(),

    //The parse method returns an AST.
    //astRoot is a kind of Node for the AST.
    //Comments are not present as nodes in the AST.
    astRoot = compiler.parse(jsSourceFile),
    node = astRoot.getChildAtIndex(0);

//Use Node methods to get child nodes, and their types.
if (node.getChildAtIndex(1).getFirstChild().getType() === CALL) {
    //Convert this call node and its children to JS source.
    //This generated source does not have comments and
    //may not be space-formatted exactly the same as the input source.
    var codeBuilder = new jscomp.Compiler.CodeBuilder();
    compiler.toSource(codeBuilder, 1, node);

    //Return the JavaScript source.
    //Need to use String() to convert the Java String
    //to a JavaScript String.
    return String(codeBuilder.toString());
}
Thanks to the Closure Compiler team for doing the hard work and open sourcing the code. It looks like Closure Compiler deals with two AST formats -- one is perhaps an older one generated by Rhino, while the other is a more custom one? It seems like I was getting back the Rhino-based Nodes for the methods I called.
I was tempted to go direct and just use Rhino for the AST, but decompiling the AST back into source looked harder to do, and from what I recall, Rhino has a newer AST API in its trunk code. I believe the one in Closure Compiler is the older one? All that added up to me being wary of that path.
Most of the time was spent figuring out the Java invocations to get the code parsed, understanding the tree structure, dealing with Java-to-JavaScript translation issues, and then figuring out the Java invocations to convert a subtree back into source.
I am glad I finally stepped into working with a real AST. While some of the AST calls are a bit awkward (at least for me as a JavaScript person), it is a lot better than trying to use regexps. I still need to do more testing, but I feel more confident in the robustness of the solution now.
If you see how I can do it better, point me in the right direction!
Thursday, February 18, 2010
RequireJS 0.8.0 Released
RequireJS, the next generation in script loading, now has an official release and a new web site: http://requirejs.org.
The 0.8.0 release is a formal release of the code, and it includes built versions of jQuery 1.4.2 with RequireJS already integrated.
I also updated my jQuery fork to include the latest changes -- jQuery's page load callbacks will not fire unless all scripts loaded with RequireJS have also finished loading.
I plan to do integrations with other browser toolkits, MooTools and Prototype being next on my list. I also hope the jQuery community will want to pull the changes I have in my jQuery fork into their master at some point.
If you are a team member for one of these toolkits, please let me know what I can do in RequireJS to provide the best code loading and module format for browser-based toolkits. It would be great if we can reach consensus on code loading. I am happy to make changes in RequireJS if it moves us all closer to that.
While the release version is 0.8, this code has been battle-tested in Raindrop, a sizable JavaScript-centric messaging web app. Raindrop uses a version of Dojo 1.4 that has been converted to the RequireJS module format, and all Raindrop modules are written as RequireJS modules.
Some other notes about the release:
- RequireJS 0.8.0 can run in Rhino, but its main design target is the browser.
- It comes with an optimization tool that can combine and minify your JavaScript.
Tuesday, February 09, 2010
RunJS is now RequireJS
As mentioned before, I considered renaming RunJS to RequireJS. I did the transition, and RequireJS is on GitHub. There is a conversion script that will convert CommonJS modules to the Transport/C proposal that works with RequireJS.
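To give an idea of the kind of wrapper such a conversion produces, here is a rough sketch (the names and exact argument shape are illustrative, not the converter's actual output):

```javascript
//Wrap a traditional CommonJS module body in a require.def() call so
//it can be loaded via a script tag. The CommonJS free variables
//("require", "exports", "module") lead the dependency list so the
//factory can use them the way the original module did.
function wrapCommonJs(moduleName, depNames, source) {
    var deps = ["require", "exports", "module"].concat(depNames);
    return 'require.def("' + moduleName + '", ' +
        JSON.stringify(deps) +
        ', function (require, exports, module) {\n' +
        source +
        '\n});';
}

var wrapped = wrapCommonJs("my/mod", ["my/dep"],
    'exports.answer = 42;');
```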
I have converted Raindrop to use a modified Dojo 1.4 that uses RequireJS instead of the normal Dojo loader, and all Raindrop modules are written in the Transport/C module format that RequireJS understands. Raindrop works, so the RequireJS code has been proven in a real project that has many modules with nested dependencies. The RequireJS code is already battle-tested.
I have opened a thread on the jQuery forum about using RequireJS for jQuery's require() needs. I can do a build with Dojo 1.4 that uses RequireJS, and any Dojo 2.0 effort is likely to use RequireJS as the module loader.
I believe RequireJS is the loader browser-based toolkits should use. At the very least, the module format and API supported by RequireJS should be used by browser-based toolkits, even if they want to build their own loader.
Next plans for RequireJS:
1) Contact the MooTools and Prototype folks to see if they want to use it. It allows loading code that does not export a module value, and has access to the global environment. They can use it to load code that augments native prototypes. RequireJS can load existing, plain JS files that do not define a module too.
2) Put up a web site with builds. While you can use RequireJS just by grabbing it from GitHub, it would be nice to have builds of RequireJS with its different levels of functionality already built and easy to download.
3) Do a fork of Node that uses RequireJS on the server. I believe the async module format used by RequireJS is a great fit for Node and its async goals.
4) See if I can do a fork of Narwhal to do the same thing.
I believe I can get RequireJS to work on the server and still support the existing CommonJS format when the server supports synchronous loading. By having native support in some server-based systems for RequireJS, it will be easier to share code with the browser.
To recap, my three main issues with using the existing CommonJS module spec, and why RequireJS exists:
1) So far the CommonJS group does not think the browser is common enough to qualify as a first class citizen in the module spec. The group is mainly concerned with environments outside the browser. As a result, the CommonJS module spec does not work well natively in the browser -- it requires either an XHR-based loader, which we have found to have problems in Dojo, or a server-side transform process. A server-side transform process should not be required to do web development in the browser.
RequireJS uses a function wrapper around the module to avoid these problems and allow loading modules via script tags. Just save the file and hit reload in the browser.
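As a rough sketch of why the wrapper helps (hypothetical names, not the actual RequireJS internals): a loader can register the module definition when the script executes, then run the factory function later, once its dependencies are available:

```javascript
//Hypothetical, simplified registry; the real loader also handles
//async script loading, cycles, and configuration.
var registry = {};
var defined = {};

function def(name, deps, factory) {
    //Called when the module's script tag executes: just record it.
    registry[name] = { deps: deps, factory: factory };
}

function load(name) {
    if (!(name in defined)) {
        var entry = registry[name];
        //Resolve dependencies first (no cycle handling in this sketch).
        defined[name] = entry.factory.apply(null, entry.deps.map(load));
    }
    return defined[name];
}

def("logger", [], function () {
    return { log: function (msg) { return "LOG: " + msg; } };
});
def("app", ["logger"], function (logger) {
    return { start: function () { return logger.log("app started"); } };
});
```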
2) There is a free variable, called "exports". It is an object. You cannot set the value of exports inside your module code, you can only add properties to it. This means, for instance, your module cannot be a function. In Dijit, Dojo's widget system, all widgets are constructor functions. The "exports" restriction makes your APIs awkward if you want to export functions as module values. The claim is that this exports restriction helps with circular dependencies, but it only helps a little bit. To me, it is not worth slightly improving an edge case when it sacrifices greater usefulness and simplicity in users' modules.
RequireJS and its format can handle circular dependencies just fine. In the format supported by RequireJS, you define a module as a function, although you can still use exports as CommonJS does if you so desire.
3) The require.main property seems like a hack. It is normally used so that a module can say if (require.main === module.id) or if (require.main === module) then do some "main" work. The module format should just define an exports.main convention for indicating "main" functionality. It is less typing, and more robust, since different code entry points have a different idea of "main". For instance, an HTTP request handler likely has specific requirements on what it considers to be the "main". The top level entry point should decide what code to execute as "main", not logic inside the module.
RequireJS does not support the require.main idiom.
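A sketch of the exports.main convention (illustrative names only):

```javascript
//The module exports a main function; it does not inspect
//require.main or any global state to decide whether to run.
var tool = {
    main: function (args) {
        return "processed " + args.length + " arguments";
    },
    helper: function () { return "other API"; }
};

//The top-level entry point decides what counts as "main":
var result = tool.main(["--verbose", "input.txt"]);
```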
So I believe the path used by RequireJS is more robust overall, works better/scales better in the browser. However, I still want to provide enough support for the existing CommonJS modules in the meantime to allow more code sharing.
In the long run though, the CommonJS format as it exists today should be replaced with something better. It is troublesome that the CommonJS group is not really targeting the browser, but over time, the broader JS community will expect browser toolkits to support CommonJS specs. It does not seem right to end up with non-optimal solutions in the browser when the browser is the most common JS platform.
Wednesday, January 27, 2010
RunJS to RequireJS?
There was a thread that started on the CommonJS list about a transport format, something that works well in the browser via script injection. I sketched out a proposal, Transport/C, that builds on the Transport/B and Transport/A specs.
Transport/C is very similar to some basic mechanics of RunJS but uses require() as the top level function, and supports the special "module" and "exports" free variables used in the normal CommonJS module spec.
In order to prove the concept for Transport/C, I made a branch of the RunJS code, calling it RequireJS, that implements Transport/C.
It seems like it fits with the existing CommonJS module spec, but is something that works well in the browser. I also made a simple conversion script that converts traditional CommonJS modules to this format.
I am tempted to convert from RunJS to this RequireJS branch, and to start evangelizing that approach for browser toolkits. It would be great if Transport/C would also be approved as the transport format for CommonJS too.
Kris Kowal has some concerns about the *very* long-term effects of the approach. I read his comments as possibly pointing out some things that would be done differently if the primordials and e-maker type of modules were ever accepted as part of an ECMAScript standard.
As I read the primordials and e-maker strawman proposals, I think the only difference is that Transport/C functions are only expected to be called once, while the e-maker style would favor calling the function on every require() call. As I say in my response, I believe e-maker support would affect regular CommonJS modules in the same way as the transport format, and that is assuming the strawmans make it into the spec at some point, as they are specified now.
I also believe the way I coded Transport/C in the RequireJS branch is what a normal developer would expect, and I think it fits better with existing browser/script behavior and the assumptions that go along with coding CommonJS modules today.
So I am tempted to rename the RunJS project to RequireJS and proceed with that. If you have any feedback to the contrary, please let me know. Otherwise, I will likely do the change early this week.
Thursday, January 21, 2010
Script async, Raindrop and Firefox 3.6
In honor of the Firefox 3.6 release, I upgraded Raindrop to use the new async attribute for script tags.
Why is async neat? It does not block the rest of the page, and will just evaluate the script once it is retrieved. More information is in the HTML5 spec. Note that the script you add async to should NOT use document.write(), as doc.write will likely destroy your page.
Also, be aware that async is a boolean attribute, but that does not mean you should use async="true" to turn it on. The HTML5 spec on boolean attributes says that only an empty string or a string that matches the attribute name should be used. To turn async off, just do not include the attribute. For Raindrop, I used async="async" since that looks better to me than an empty string.
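So, as a sketch of the valid forms (file name is illustrative):

```html
<!-- Valid ways to turn async on (it is a boolean attribute): -->
<script src="main.js" async="async"></script>
<script src="main.js" async=""></script>
<!-- To turn it off, omit the attribute entirely; browsers treat
     the mere presence of the attribute as "on". -->
```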
Raindrop uses RunJS for the module loader, and RunJS uses dynamically added script tags via head.appendChild(), so the modules loaded by RunJS already behave in an async manner.
However, apps that want to use the Raindrop front end libraries normally include a script called rdconfig.js as their own script tag, and that config file does a document.write() to write out the Dojo+RunJS and jQuery script tags. Those Dojo+RunJS and jQuery tags now use the async attribute.
Saturday, January 09, 2010
RunJS Dependency API
In my last post, I talked about a suggestion from David Ascher to try to improve the syntax of specifying dependencies when declaring a module. Here is an example of that suggestion, using run.def(), a possibly new API dedicated to just defining a module called "rdw/Message":
While that does make it clear what each module name's variable will be inside the function that defines the module, it has the following drawbacks:
1) It hurts minification -- having properties off the R object passed to the module function means that minification tools will not be able to minify those references as easy.
2) All the modules dependencies must be referenced via a prefix "R.". For example R.api. It is a small bit of typing and an extra property lookup. That might be seen as an advantage too -- it is clear that a symbol is a dependency because it is off the R. function.
3) Makes it hard to find typos for properties on the R. function. With the current runjs format, JSLint can actually find typos for a dependency's variable name.
4) This syntax is more verbose for JS files that do not care about scope encapsulation. For JavaScript libraries like jQuery, MooTools and Prototype, they pretty much operate in the same global scope. jQuery does have a noConflict(), but that just helps if there is only one other thing called $ in the page. MooTools and Prototype add things to global prototypes.
So for these libraries, specifying dependencies is more like just specifying script tags, and they do not need a local-scoped variable in the function module to get things defined. They would just need to do the following:
So for these libraries, it would be onerous to be forced to create variable names for each of those dependencies.
This last point seems to be enough to tip the scales back to using the current model used by run. However I am aware that it is possible for the developer to not get the order or number correct, matching the dependency string with the correct variable for the function.
I am hoping using a coding standard like the following will help:
I purposely trimmed the list of dependencies, so I can get this example to show up in this blog. That is the down-side with this sort of code style: it can have a very long line for the dependency names.
For Raindrop, I want to try to keep local scope encapsulation, particularly since I expect some slicker extensions to it that may introduce other code into the page that may conflict. However, if I did not care to do that, I could shorten up the above example quite a bit.
So at this point, I am favoring making it easy to use RunJS for other toolkits that do not care about local scope encapsulation and detecting bad references to variables inside the module definition over avoiding errors with a mismatched function variable name to a dependency name.
It is a hard choice to make. Neither path is perfect. If you have an opinion, feel free to share it.
run.def("rdw/Message", {
rd: "rd",
dojo: "dojo",
Base: "rdw/_Base",
friendly: "rd/friendly",
hyperlink: "rd/hyperlink",
api: "rd/api",
template: "text!rdw/templates/Message!html"
}, function(R) {
//Module definition function.
//Use things like R.api and R.Base in here that map to the modules up above.
...
});
While that does make it clear what each module name's variable will be inside the function that defines the module, it has the following drawbacks:
1) It hurts minification -- having properties off the R object passed to the module function means that minification tools will not be able to minify those references as easy.
2) All the modules dependencies must be referenced via a prefix "R.". For example R.api. It is a small bit of typing and an extra property lookup. That might be seen as an advantage too -- it is clear that a symbol is a dependency because it is off the R. function.
3) Makes it hard to find typos for properties on the R. function. With the current runjs format, JSLint can actually find typos for a dependency's variable name.
4) This syntax is more verbose for JS files that do not care about scope encapsulation. For JavaScript libraries like jQuery, MooTools and Prototype, they pretty much operate in the same global scope. jQuery does have a noConflict(), but that just helps if there is only one other thing called $ in the page. MooTools and Prototype add things to global prototypes.
So for these libraries, specifying dependencies is more like just specifying script tags, and they do not need a locally scoped variable in the module function to get things defined. They would just need to do the following:
run.def("rdw/Message",
    ["some/module", "something/else", ...],
    function () {
        //This function does not need locally scoped variables, and most
        //likely the script dependencies above may not call run.def() to define
        //an object; they just add things to the global space.
    }
);
So for these libraries, it would be onerous to be forced to create variable names for each of those dependencies.
This last point seems to be enough to tip the scales back to the current model used by run. However, I am aware that the developer can get the order or number of function arguments wrong, so that a variable no longer matches up with its dependency string.
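To make that hazard concrete, here is a hypothetical sketch (using a stand-in for run.def, not the real RunJS API) of what a swapped argument list does: nothing errors, the wrong module just flows in silently.

```javascript
// Hypothetical stand-in for run.def: resolves each dependency name to a
// stub module object and passes them to the factory in array order.
function runDef(name, deps, factory) {
    var modules = deps.map(function (d) { return { name: d }; });
    return factory.apply(null, modules);
}

// The parameter names are swapped relative to the dependency array, so
// "dojo" silently receives the "rd" module and vice versa. No error is
// thrown; the mismatch only shows up later as baffling behavior.
var result = runDef("rdw/Message", ["rd", "dojo"], function (dojo, rd) {
    return dojo.name; // actually the "rd" module
});
```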
I am hoping using a coding standard like the following will help:
run.def("rdw/Message",
    ["rd", "dojo", "rdw/Base"], function (
      rd,    dojo,   Base) {
    //Define the module for rdw/Message and return it.
});
Basically, make sure all dependencies are on one line, along with the function keyword, then align the variable that matches each dependency directly under the dependency name (the example above may not be correctly aligned depending on the font or format you are viewing this message in). I purposely trimmed the list of dependencies so I could get this example to show up in this blog. That is the downside with this sort of code style: the line of dependency names can get very long.
For Raindrop, I want to keep local scope encapsulation, particularly since I expect some slicker extensions that may introduce conflicting code into the page. However, if I did not care about that, I could shorten the above example quite a bit.
So at this point, I am favoring two things -- making it easy to use RunJS with toolkits that do not care about local scope encapsulation, and being able to detect bad variable references inside the module definition -- over avoiding errors from a function variable name that is mismatched with its dependency name.
It is a hard choice to make. Neither path is perfect. If you have an opinion, feel free to share it.