Commit ff39d41

isaacs authored and ry committed
Document module loading
1 parent 35e3222 commit ff39d41

1 file changed

+260
-53
lines changed

doc/api/modules.markdown

@@ -21,13 +21,13 @@ one-to-one correspondence. As an example, `foo.js` loads the module
2121

2222
The contents of `foo.js`:
2323

24-
var circle = require('./circle');
24+
var circle = require('./circle.js');
2525
console.log( 'The area of a circle of radius 4 is '
2626
+ circle.area(4));
2727

2828
The contents of `circle.js`:
2929

30-
var PI = 3.14;
30+
var PI = Math.PI;
3131

3232
exports.area = function (r) {
3333
return PI * r * r;
@@ -39,78 +39,285 @@ The contents of `circle.js`:
3939

4040
The module `circle.js` has exported the functions `area()` and
4141
`circumference()`. To export an object, add to the special `exports`
42-
object. (Alternatively, one can use `this` instead of `exports`.) Variables
42+
object.
43+
44+
Variables
4345
local to the module will be private. In this example the variable `PI` is
44-
private to `circle.js`. The function `puts()` comes from the module `'util'`,
45-
which is a built-in module. Modules which are not prefixed by `'./'` are
46-
built-in modules--more about this later.
46+
private to `circle.js`.
47+
48+
### Core Modules
49+
50+
Node has several modules compiled into the binary. These modules are
51+
described in greater detail elsewhere in this documentation.
52+
53+
The core modules are defined in node's source in the `lib/` folder.
54+
55+
Core modules are always preferentially loaded if their identifier is
56+
passed to `require()`. For instance, `require('http')` will always
57+
return the built-in HTTP module, even if there is a file by that name.
4758

48-
### Module Resolving
59+
### File Modules
60+
61+
If the exact filename is not found, then node will attempt to load the
62+
required filename with the added extension of `.js`, and then `.node`.
63+
64+
`.js` files are interpreted as JavaScript text files, and `.node` files
65+
are interpreted as compiled addon modules loaded with `dlopen`.
66+
67+
A module prefixed with `'/'` is an absolute path to the file. For
68+
example, `require('/home/marco/foo.js')` will load the file at
69+
`/home/marco/foo.js`.
4970

5071
A module prefixed with `'./'` is relative to the file calling `require()`.
5172
That is, `circle.js` must be in the same directory as `foo.js` for
5273
`require('./circle')` to find it.
5374

54-
Without the leading `'./'`, like `require('assert')` the module is searched
55-
for in the `require.paths` array. `require.paths` on my system looks like
56-
this:
75+
Without a leading `'/'` or `'./'` to indicate a file, the module is either a
76+
"core module" or is loaded from a `node_modules` folder.
77+
78+
### Loading from `node_modules` Folders
79+
80+
If the module identifier passed to `require()` is not a native module,
81+
and does not begin with `'/'`, `'../'`, or `'./'`, then node starts at the
82+
parent directory of the current module, and adds `/node_modules`, and
83+
attempts to load the module from that location.
5784

58-
`[ '/home/ryan/.node_modules' ]`
85+
If it is not found there, then it moves to the parent directory, and so
86+
on, until either the module is found, or the root of the tree is
87+
reached.
5988

60-
That is, when `require('foo')` is called Node looks for:
89+
For example, if the file at `'/home/ry/projects/foo.js'` called
90+
`require('bar.js')`, then node would look in the following locations, in
91+
this order:
6192

62-
* 1: `/home/ryan/.node_modules/foo`
63-
* 2: `/home/ryan/.node_modules/foo.js`
64-
* 3: `/home/ryan/.node_modules/foo.node`
65-
* 4: `/home/ryan/.node_modules/foo/index.js`
66-
* 5: `/home/ryan/.node_modules/foo/index.node`
93+
* `/home/ry/projects/node_modules/bar.js`
94+
* `/home/ry/node_modules/bar.js`
95+
* `/home/node_modules/bar.js`
96+
* `/node_modules/bar.js`
6797

68-
interrupting once a file is found. Files ending in `'.node'` are binary Addon
69-
Modules; see 'Addons' below. `'index.js'` allows one to package a module as
70-
a directory.
98+
This allows programs to localize their dependencies, so that they do not
99+
clash.
71100

72-
Additionally, a `package.json` file may be used to treat a folder as a
73-
module, if it specifies a `'main'` field. For example, if the file at
74-
`./foo/bar/package.json` contained this data:
101+
#### Optimizations to the `node_modules` Lookup Process
75102

76-
{ "name" : "bar",
77-
"version" : "1.2.3",
78-
"main" : "./lib/bar.js" }
103+
When there are many levels of nested dependencies, it is possible for
104+
these file trees to get fairly long. The following optimizations are thus
105+
made to the process.
79106

80-
then `require('./foo/bar')` would load the file at
81-
`'./foo/bar/lib/bar.js'`. This allows package authors to specify an
82-
entry point to their module, while structuring their package how it
83-
suits them.
107+
First, `/node_modules` is never appended to a folder already ending in
108+
`/node_modules`.
84109

85-
Any folders named `"node_modules"` that exist in the current module path
86-
will also be appended to the effective require path. This allows for
87-
bundling libraries and other dependencies in a 'node_modules' folder at
88-
the root of a program.
110+
Second, if the file calling `require()` is already inside a `node_modules`
111+
hierarchy, then the top-most `node_modules` folder is treated as the
112+
root of the search tree.
89113

90-
To avoid overly long lookup paths in the case of nested packages,
91-
the following 2 optimizations are made:
114+
For example, if the file at
115+
`'/home/ry/projects/foo/node_modules/bar/node_modules/baz/quux.js'`
116+
called `require('asdf.js')`, then node would search the following
117+
locations:
92118

93-
1. If the module calling `require()` is already within a `node_modules`
94-
folder, then the lookup will not go above the top-most `node_modules`
95-
directory.
96-
2. Node will not append `node_modules` to a path already ending in
97-
`node_modules`.
119+
* `/home/ry/projects/foo/node_modules/bar/node_modules/baz/node_modules/asdf.js`
120+
* `/home/ry/projects/foo/node_modules/bar/node_modules/asdf.js`
121+
* `/home/ry/projects/foo/node_modules/asdf.js`
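The lookup order described in this section, including both optimizations, can be sketched as a small function. This is an illustration rather than node's actual loader: `nodeModulesPaths` is a hypothetical name, and the sketch assumes POSIX-style absolute paths.

```javascript
var posix = require('path').posix;

// Hypothetical helper: list the node_modules folders searched for a
// require() call made from `file`, nearest folder first.
function nodeModulesPaths(file) {
  var dir = posix.dirname(file);
  // '/a/b' -> ['', 'a', 'b']; the root itself is just ['']
  var parts = dir === '/' ? [''] : dir.split('/');
  var first = parts.indexOf('node_modules');
  // Stop at the parent of the top-most node_modules folder, or at the
  // filesystem root when the caller is not inside node_modules at all.
  var stop = first === -1 ? 0 : first - 1;
  var dirs = [];
  for (var i = parts.length - 1; i >= stop; i--) {
    // Never append node_modules to a folder already named node_modules.
    if (parts[i] === 'node_modules') continue;
    dirs.push(parts.slice(0, i + 1).concat('node_modules').join('/'));
  }
  return dirs;
}

console.log(nodeModulesPaths('/home/ry/projects/foo.js'));
console.log(nodeModulesPaths(
  '/home/ry/projects/foo/node_modules/bar/node_modules/baz/quux.js'));
```

The first call lists the four folders from the `bar.js` example, and the second lists the three folders from the `asdf.js` example, in the same order as the text.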
98122

99-
So, for example, if the file at
100-
`/usr/lib/node_modules/foo/node_modules/bar.js` were to do
101-
`require('baz')`, then the following places would be searched for a
102-
`baz` module, in this order:
123+
### Folders as Modules
103124

104-
* 1: `/usr/lib/node_modules/foo/node_modules`
105-
* 2: `/usr/lib/node_modules`
125+
It is convenient to organize programs and libraries into self-contained
126+
directories, and then provide a single entry point to that library.
127+
There are three ways in which a folder may be passed to `require()` as
128+
an argument.
106129

107-
`require.paths` can be modified at runtime by simply unshifting new
108-
paths onto it, or at startup with the `NODE_PATH` environmental
109-
variable (which should be a list of paths, colon separated).
130+
The first is to create a `package.json` file in the root of the folder,
131+
which specifies a `main` module. An example package.json file might
132+
look like this:
110133

111-
The second time `require('foo')` is called, it is not loaded again from
112-
disk. It looks in the `require.cache` object to see if it has been loaded
113-
before.
134+
{ "name" : "some-library",
135+
"main" : "./lib/some-library.js" }
136+
137+
If this was in a folder at `./some-library`, then
138+
`require('./some-library')` would attempt to load
139+
`./some-library/lib/some-library.js`.
140+
141+
This is the extent of Node's awareness of package.json files.
142+
143+
If there is no package.json file present in the directory, then node
144+
will attempt to load an `index.js` or `index.node` file out of that
145+
directory. For example, if there was no package.json file in the above
146+
example, then `require('./some-library')` would attempt to load:
147+
148+
* `./some-library/index.js`
149+
* `./some-library/index.node`
150+
151+
### Caching
152+
153+
Modules are cached after the first time they are loaded. This means
154+
(among other things) that every call to `require('foo')` will get
155+
exactly the same object returned, if it would resolve to the same file.
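A small way to observe the cache, assuming a throwaway module written to a temporary directory (the `counter.js` file and its `hits` property are invented for the example):

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Write a throwaway module so the example is self-contained.
var dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cache-demo-'));
var file = path.join(dir, 'counter.js');
fs.writeFileSync(file, 'exports.hits = 0;\n');

var first = require(file);
first.hits += 1;            // mutate the exports object

var second = require(file); // served from the cache, the file is not re-run
var sameObject = first === second;
console.log(sameObject, second.hits);
```

Because both calls return the same object, the change made through `first` is visible through `second`.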
156+
157+
### All Together...
114158

115159
To get the exact filename that will be loaded when `require()` is called, use
116160
the `require.resolve()` function.
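For instance, the following sketch writes a temporary `circle.js` and resolves it without its extension. The temporary directory is an assumption of the example, not something the documentation prescribes.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

var dir = fs.mkdtempSync(path.join(os.tmpdir(), 'resolve-demo-'));
fs.writeFileSync(path.join(dir, 'circle.js'),
  'exports.area = function (r) { return Math.PI * r * r; };\n');

// The request omits the extension; the resolved filename includes it.
var resolved = require.resolve(path.join(dir, 'circle'));
console.log(path.basename(resolved));

// A core module resolves to its bare identifier rather than a file path.
var core = require.resolve('http');
console.log(core);
```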
161+
162+
Putting together all of the above, here is the high-level algorithm
163+
in pseudocode of what `require.resolve` does:
164+
165+
require(X)
166+
1. If X is a core module,
167+
a. return the core module
168+
b. STOP
169+
2. If X begins with `./` or `/`,
170+
a. LOAD_AS_FILE(Y + X)
171+
b. LOAD_AS_DIRECTORY(Y + X)
172+
3. LOAD_NODE_MODULES(X, dirname(Y))
173+
4. THROW "not found"
174+
175+
LOAD_AS_FILE(X)
176+
1. If X is a file, load X as JavaScript text. STOP
177+
2. If X.js is a file, load X.js as JavaScript text. STOP
178+
3. If X.node is a file, load X.node as binary addon. STOP
179+
180+
LOAD_AS_DIRECTORY(X)
181+
1. If X/package.json is a file,
182+
a. Parse X/package.json, and look for "main" field.
183+
b. let M = X + (json main field)
184+
c. LOAD_AS_FILE(M)
185+
2. LOAD_AS_FILE(X/index)
186+
187+
LOAD_NODE_MODULES(X, START)
188+
1. let DIRS=NODE_MODULES_PATHS(START)
189+
2. for each DIR in DIRS:
190+
a. LOAD_AS_FILE(DIR/X)
191+
b. LOAD_AS_DIRECTORY(DIR/X)
192+
193+
NODE_MODULES_PATHS(START)
194+
1. let PARTS = path split(START)
195+
2. let ROOT = index of first instance of "node_modules" in PARTS, or 0
196+
3. let I = count of PARTS - 1
197+
4. let DIRS = []
198+
5. while I > ROOT,
199+
a. if PARTS[I] = "node_modules", let I = I - 1 and CONTINUE
200+
b. DIR = path join(PARTS[0 .. I] + "node_modules")
201+
c. DIRS = DIRS + DIR, then let I = I - 1
202+
6. return DIRS
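The `LOAD_AS_FILE` and `LOAD_AS_DIRECTORY` steps in the pseudocode above can be approximated in a few lines. This is a sketch, not node's loader: the function names are hypothetical, and it returns the filename that would be loaded (or `null`) instead of loading anything.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

function isFile(p) {
  try { return fs.statSync(p).isFile(); } catch (e) { return false; }
}

// LOAD_AS_FILE: the exact name, then with .js, then with .node.
function resolveAsFile(x) {
  var candidates = [x, x + '.js', x + '.node'];
  for (var i = 0; i < candidates.length; i++) {
    if (isFile(candidates[i])) return candidates[i];
  }
  return null;
}

// LOAD_AS_DIRECTORY: the package.json "main" field, then index.
function resolveAsDirectory(x) {
  var pkg = path.join(x, 'package.json');
  if (isFile(pkg)) {
    var main = JSON.parse(fs.readFileSync(pkg, 'utf8')).main;
    var found = main ? resolveAsFile(path.join(x, main)) : null;
    if (found) return found;
  }
  return resolveAsFile(path.join(x, 'index'));
}

function resolvePath(x) {
  return resolveAsFile(x) || resolveAsDirectory(x);
}

// Throwaway fixtures: a plain file, a folder with package.json, and a
// folder with only index.js.
var root = fs.mkdtempSync(path.join(os.tmpdir(), 'resolve-sketch-'));
fs.writeFileSync(path.join(root, 'a.js'), '');
fs.mkdirSync(path.join(root, 'pkg', 'lib'), { recursive: true });
fs.writeFileSync(path.join(root, 'pkg', 'package.json'),
  JSON.stringify({ main: './lib/entry.js' }));
fs.writeFileSync(path.join(root, 'pkg', 'lib', 'entry.js'), '');
fs.mkdirSync(path.join(root, 'dir'));
fs.writeFileSync(path.join(root, 'dir', 'index.js'), '');

console.log(resolvePath(path.join(root, 'a')));
console.log(resolvePath(path.join(root, 'pkg')));
console.log(resolvePath(path.join(root, 'dir')));
```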
203+
204+
### Loading from the `require.paths` Folders
205+
206+
In node, `require.paths` is an array of strings that represent paths to
207+
be searched for modules when they are not prefixed with `'/'`, `'./'`, or
208+
`'../'`. For example, if require.paths were set to:
209+
210+
[ '/home/micheil/.node_modules',
211+
'/usr/local/lib/node_modules' ]
212+
213+
Then calling `require('bar/baz.js')` would search the following
214+
locations:
215+
216+
* 1: `'/home/micheil/.node_modules/bar/baz.js'`
217+
* 2: `'/usr/local/lib/node_modules/bar/baz.js'`
218+
219+
The `require.paths` array can be mutated at run time to alter this
220+
behavior.
221+
222+
It is set initially from the `NODE_PATH` environment variable, which is
223+
a colon-delimited list of absolute paths. In the previous example,
224+
the `NODE_PATH` environment variable might have been set to:
225+
226+
/home/micheil/.node_modules:/usr/local/lib/node_modules
227+
228+
#### **Note:** Please Avoid Modifying `require.paths`
229+
230+
For compatibility reasons, `require.paths` is still given first priority
231+
in the module lookup process. However, it may disappear in a future
232+
release.
233+
234+
While it seemed like a good idea at the time, and enabled a lot of
235+
useful experimentation, in practice a mutable `require.paths` list is
236+
often a troublesome source of confusion and headaches.
237+
238+
##### Setting `require.paths` to some other value does nothing.
239+
240+
This does not do what one might expect:
241+
242+
require.paths = [ '/usr/lib/node' ];
243+
244+
All that does is lose the reference to the *actual* node module lookup
245+
paths, and create a new reference to some other thing that isn't used
246+
for anything.
247+
248+
##### Putting relative paths in `require.paths` is... weird.
249+
250+
If you do this:
251+
252+
require.paths.push('./lib');
253+
254+
then it does *not* add the full resolved path to where `./lib`
255+
is on the filesystem. Instead, it literally adds `'./lib'`,
256+
meaning that if you do `require('y.js')` in `/a/b/x.js`, then it'll look
257+
in `/a/b/lib/y.js`. If you then did `require('y.js')` in
258+
`/l/m/n/o/p.js`, then it'd look in `/l/m/n/o/lib/y.js`.
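The effect can be modelled without touching `require.paths` at all. The sketch below assumes that a relative entry is resolved against the directory of the calling file, as the `/a/b/x.js` case suggests. `lookupFor` is a hypothetical helper and `/c/d/z.js` is a made-up second caller.

```javascript
var posix = require('path').posix;

// Hypothetical model: where a relative search-path entry points for a
// given calling file.
function lookupFor(callerFile, entry, request) {
  return posix.resolve(posix.dirname(callerFile), entry, request);
}

console.log(lookupFor('/a/b/x.js', './lib', 'y.js'));
console.log(lookupFor('/c/d/z.js', './lib', 'y.js'));
```

The same entry yields a different location for each caller, which is why relying on it is brittle.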
259+
260+
In practice, people have used this as an ad hoc way to bundle
261+
dependencies, but this technique is brittle.
262+
263+
##### Zero Isolation
264+
265+
There is (by regrettable design) only one `require.paths` array used by
266+
all modules.
267+
268+
As a result, if one node program comes to rely on this behavior, it may
269+
permanently and subtly alter the behavior of all other node programs in
270+
the same process. As the application stack grows, we tend to assemble
271+
functionality, and it is a problem when those parts interact in ways
272+
that are difficult to predict.
273+
274+
## Addenda: Package Manager Tips
275+
276+
If you were to build a package manager, the tools above provide you with
277+
all you need to very elegantly set up modules in a folder structure such
278+
that they get the required dependencies and do not conflict with one
279+
another.
280+
281+
Let's say that we wanted to have the folder at
282+
`/usr/lib/<some-program>/<some-version>` hold the contents of a specific
283+
version of a package.
284+
285+
Packages can depend on one another. So, in order to install
286+
package `foo`, you may have to install a specific version of package `bar`.
287+
The `bar` package may itself have dependencies, and in some cases, these
288+
dependencies may even collide or form cycles.
289+
290+
Since Node looks up the `realpath` of any modules it loads, and then
291+
looks for their dependencies in the `node_modules` folders as described
292+
above, this situation is very simple to resolve with the following
293+
architecture:
294+
295+
* `/usr/lib/foo/1.2.3/` - Contents of the `foo` package, version 1.2.3.
296+
* `/usr/lib/bar/4.3.2/` - Contents of the `bar` package that `foo`
297+
depends on.
298+
* `/usr/lib/foo/1.2.3/node_modules/bar` - Symbolic link to
299+
`/usr/lib/bar/4.3.2/`.
300+
* `/usr/lib/bar/4.3.2/node_modules/*` - Symbolic links to the packages
301+
that `bar` depends on.
302+
303+
Thus, even if a cycle is encountered, or if there are dependency
304+
conflicts, every module will be able to get a version of its dependency
305+
that it can use.
306+
307+
When the code in the `foo` package does `require('bar')`, it will get
308+
the version that is symlinked into
309+
`/usr/lib/foo/1.2.3/node_modules/bar`. Then, when the code in the `bar`
310+
package calls `require('quux')`, it'll get the version that is symlinked
311+
into `/usr/lib/bar/4.3.2/node_modules/quux`.
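The layout can be tried out in a temporary directory instead of `/usr/lib`. The sketch below is only an illustration under several assumptions: the `install` and `link` helpers are hypothetical, the `quux` version number is made up, and the platform must allow creating directory symlinks.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

var root = fs.mkdtempSync(path.join(os.tmpdir(), 'pkg-layout-'));

// Hypothetical helper: write one package version containing an index.js.
function install(name, version, source) {
  var dir = path.join(root, name, version);
  fs.mkdirSync(path.join(dir, 'node_modules'), { recursive: true });
  fs.writeFileSync(path.join(dir, 'index.js'), source);
  return dir;
}

// Hypothetical helper: expose `target` as a dependency of the package in `dir`.
function link(dir, depName, target) {
  fs.symlinkSync(target, path.join(dir, 'node_modules', depName), 'dir');
}

var quux = install('quux', '0.1.0', "exports.name = 'quux@0.1.0';\n");
var bar = install('bar', '4.3.2',
  "exports.name = 'bar@4.3.2';\nexports.dep = require('quux').name;\n");
var foo = install('foo', '1.2.3',
  "exports.name = 'foo@1.2.3';\nexports.dep = require('bar');\n");

link(bar, 'quux', quux); // <root>/bar/4.3.2/node_modules/quux
link(foo, 'bar', bar);   // <root>/foo/1.2.3/node_modules/bar

var loaded = require(foo);
console.log(loaded.name, loaded.dep.name, loaded.dep.dep);
```

Each package finds its dependency through its own `node_modules` folder, so `foo` never needs to know where `quux` lives.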
312+
313+
Furthermore, to make the module lookup process even more optimal, rather
314+
than putting packages directly in `/usr/lib`, we could put them in
315+
`/usr/lib/node_modules/<name>/<version>`. Then node will not bother
316+
looking for missing dependencies in `/usr/node_modules` or
317+
`/node_modules`.
318+
319+
In order to make modules available to the node repl, it might be useful
320+
to also add the `/usr/lib/node_modules` folder to the `NODE_PATH`
321+
environment variable. Since the module lookups using `node_modules`
322+
folders are all relative, and based on the real path of the files
323+
making the calls to `require()`, the packages themselves can be anywhere.
