Commit 6909ebb

RFC: reduce lifecycle script environment size
1 parent 2d2f004 commit 6909ebb

# Reduce the environment variables provided to lifecycle scripts

In versions of npm up to v6, the following items are all placed into the
environment of scripts run for various lifecycle events (`install`, `prepare`,
etc., as well as explicit scripts such as `test` and `start`):

- `PATH`: configured to include `.../node_modules/.bin` for the current and all
  parent `node_modules` directories.
- `npm_package_*`: all `package.json` values in the current package for
  which the lifecycle event is running.
- `npm_config_*`: all npm configuration values that do not start with a
  `_` character.
- `npm_lifecycle_event`: the current lifecycle event.
- `npm_lifecycle_script`: the command being run.
- `npm_node_execpath`: the path to the Node.js executable npm is using.
- `npm_execpath`: the path to the npm executable being run.
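
As a rough illustration of what a script sees under npm v6, the sketch below groups the npm-provided variables found in an environment object. The helper name `summarizeLifecycleEnv` is made up for this example and is not part of npm.

```javascript
// Hypothetical helper (not part of npm): summarize the variables that npm v6
// places into a lifecycle script's environment.
function summarizeLifecycleEnv(env = process.env) {
  const keys = Object.keys(env);
  return {
    event: env.npm_lifecycle_event,
    script: env.npm_lifecycle_script,
    nodeExecPath: env.npm_node_execpath,
    npmExecPath: env.npm_execpath,
    // Counts show how much of the environment comes from config and package data.
    configCount: keys.filter((k) => k.startsWith('npm_config_')).length,
    packageCount: keys.filter((k) => k.startsWith('npm_package_')).length,
  };
}

// Inside a script started with `npm run <name>`, this reports what npm provided.
console.log(summarizeLifecycleEnv());
```

Run outside of npm, most fields will simply be `undefined` and both counts will usually be zero.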

This RFC proposes removing (or vastly reducing) the `npm_config_*` and
`npm_package_*` environment variables from the context of lifecycle scripts,
and potentially adding new fields that may be useful to more users.

## Motivation

Lifecycle scripts are run in many different contexts throughout the npm
codebase.

- Explicit scripts are run directly from the `lib/run-script.js`
  command implementation.
- Build scripts are run from the context of the tree building logic, which
  is moving to a new implementation with `@npmcli/arborist` in v7.
- Prepare scripts are run by `pacote` when it creates a tarball for
  publication or when it installs `git` dependencies.

Supporting all of these contexts requires passing a single configuration
object around, which has several problems:

1. It is tedious and error-prone, and has led to a more complicated
   codebase.
2. While we have not had security issues with it in the past, it runs the
   risk of exposing something sensitive in a context where it should not be
   exposed.
3. It invites users to fork package behavior based on npm configuration,
   which should be a contract between the user and npm, and not between the
   user, npm, and the publisher.
4. While the `package.json` data does not have as many of these problems, it
   is also largely unnecessary (and not widely used). The `package.json`
   file is readily available and easily parsed, and most scripts that would
   depend on package data simply read it directly.
5. The environment is created anew for every script that's run. This could
   be optimized further, but as it currently stands, it's pretty
   inefficient.
6. Lastly, exposing the full configuration and `package.json` makes the
   environment significantly larger, and can lead to problems on
   memory-constrained systems.

The advantage of including `npm_config_*` values in the lifecycle
environment is that npm commands run from within lifecycle events will have
the same config values as the process that spawned them, since `env` values
will override any other values except explicit command line flags.

For example, a script named `release` may run tests, update the changelog,
and then publish the package. Running `npm run release --otp=123456` will
put the two-factor auth one-time password into the `npm_config_otp`
environment variable, so that the subsequent `npm publish` command will
have the one-time password provided in the config.
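
The inheritance mechanism can be sketched roughly as follows. This is a simplified illustration and not npm's actual config loader: `configFromEnv` and `resolveConfig` are hypothetical names, and the underscore-to-dash mapping and precedence shown here are only an approximation of what the real loader does.

```javascript
// Simplified sketch (not npm's real loader): derive config values from
// `npm_config_*` environment variables.
function configFromEnv(env) {
  const config = {};
  for (const [key, value] of Object.entries(env)) {
    const match = /^npm_config_(.+)$/i.exec(key);
    if (match) {
      // npm_config_otp -> otp; underscores are treated as dashes here.
      config[match[1].toLowerCase().replace(/_/g, '-')] = value;
    }
  }
  return config;
}

// Precedence as described above: CLI flags > environment > everything else.
function resolveConfig({ cli = {}, env = {}, rest = {} }) {
  return { ...rest, ...configFromEnv(env), ...cli };
}

// `npm run release --otp=123456` exports npm_config_otp to the script, so a
// nested `npm publish` sees the OTP without it being passed again.
const nested = resolveConfig({
  env: { npm_config_otp: '123456' },
  rest: { otp: null, registry: 'https://registry.npmjs.org/' },
});
console.log(nested.otp); // prints 123456
```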

## Detailed Explanation

1. Remove the existing `npm_package_*` values from the script lifecycle
   environment.
2. Provide a new field, `npm_package_json`, with the path to the
   `package.json` file.
3. Remove all `npm_config_*` values from the script lifecycle environment
   _except_:
    1. `npm_config_userconfig`
    2. `npm_config_globalconfig`
    3. Environment variables corresponding to any non-default config
       values.
4. Add `npm_package_from`, `npm_package_resolved`, and
   `npm_package_integrity` for the package whose lifecycle event is
   running, if it's part of an install. (This addresses the needs of build
   tools, as discussed in
   [#38](https://github.com/npm/rfcs/pull/38#issuecomment-529182151).)
5. `PATH` will continue to be provided as it currently is, so that scripts
   find their dependencies' executables first.

This makes it easier to find and rely on `package.json` data, while ensuring
that config defaults are maintained, without blowing up the size of the
environment for lifecycle processes.
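
A script that needs package data could then read the file directly. The sketch below assumes the proposed `npm_package_json` variable is set; the helper name `readPackageJson` and the fallback to the current working directory are assumptions made for this example.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Sketch of a lifecycle script reading package data directly.
// `npm_package_json` is the variable proposed in this RFC; falling back to
// `<cwd>/package.json` is an assumption of this example, not part of the RFC.
function readPackageJson(env = process.env, cwd = process.cwd()) {
  const file = env.npm_package_json || path.join(cwd, 'package.json');
  return JSON.parse(fs.readFileSync(file, 'utf8'));
}

// Typical use inside a script:
//   const { name, version } = readPackageJson();
```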

## Rationale and Alternatives

Possible alternatives:

### Just go ahead and pass around the whole config object like we do today

This is not ideal for the reasons mentioned above. It also virtually
assures that Arborist remains tightly coupled to the npm CLI.
While _some_ degree of coupling is unavoidable, having to provide a valid
npm config object would make this coupling much tighter than necessary.

### Inversion of Control on the npm-lifecycle environment creation

Rather than provide a config object matching a given interface, provide
`npm-lifecycle` with a method that can build up and return the environment
object.

This approach would address the tight coupling between the CLI and Arborist,
but it doesn't address the other problems with having a giant config object
dumped into the environment.
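
A minimal sketch of this alternative, under stated assumptions: `prepareScript` stands in for `npm-lifecycle` and is not its real API, and `makeEnv` is a hypothetical callback name.

```javascript
// Hypothetical sketch of the inversion-of-control alternative: the caller
// supplies `makeEnv`, and the runner never sees a config object.
function prepareScript({ pkg, event, makeEnv }) {
  const env = makeEnv({ pkg, event });
  return { command: pkg.scripts[event], env };
}

// The CLI decides what goes into the environment. Nothing stops it from
// dumping the whole config here, which is the remaining problem.
const config = { registry: 'https://registry.npmjs.org/', loglevel: 'notice' };
const prepared = prepareScript({
  pkg: { name: 'demo', scripts: { test: 'node test.js' } },
  event: 'test',
  makeEnv: ({ event }) => ({
    npm_lifecycle_event: event,
    ...Object.fromEntries(
      Object.entries(config).map(([k, v]) => [`npm_config_${k}`, String(v)])
    ),
  }),
});
console.log(Object.keys(prepared.env).length); // prints 3
```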

## Implementation

The npm CLI will set the `userconfig`, `globalconfig`, and all non-default
config flags in the environment so that scripts and sub-scripts will have
them set in their configs by default at the env level.

Instead of building the environment up from the config and package data,
`npm-lifecycle` will only set `npm_package_json` to the path to the
`package.json` file for the package being run, and take an object to define
additional environment variables.

Because the npm CLI sets the relevant config fields, they'll be inherited
by child processes automatically. Arborist will use the environment
option to pass in the `npm_package_from`, `npm_package_resolved`, and
`npm_package_integrity` values.
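
The split of responsibilities might look roughly like the sketch below. None of these function names (`cliConfigEnv`, `lifecycleEnv`) are real npm, `npm-lifecycle`, or Arborist APIs, and the paths and package values are placeholders chosen for illustration.

```javascript
// npm CLI side (hypothetical): export userconfig, globalconfig, and any
// value that differs from its default. Comparison is by identity, so this
// sketch only handles primitive config values.
function cliConfigEnv(config, defaults) {
  const env = {};
  for (const [key, value] of Object.entries(config)) {
    if (key.startsWith('_')) continue; // `_`-prefixed values are not exported
    const always = key === 'userconfig' || key === 'globalconfig';
    if (always || value !== defaults[key]) {
      env[`npm_config_${key.replace(/-/g, '_')}`] = String(value);
    }
  }
  return env;
}

// npm-lifecycle side (hypothetical): inherit the parent environment, set
// npm_package_json, and merge any caller-supplied variables.
function lifecycleEnv({ parentEnv, pkgPath, extraEnv = {} }) {
  return { ...parentEnv, npm_package_json: pkgPath, ...extraEnv };
}

const defaults = { userconfig: '/home/me/.npmrc', globalconfig: '/etc/npmrc', loglevel: 'notice', 'save-exact': false };
const parentEnv = cliConfigEnv({ ...defaults, loglevel: 'silly' }, defaults);

// Arborist side: pass install metadata through the extra-environment option.
const env = lifecycleEnv({
  parentEnv,
  pkgPath: '/project/node_modules/dep/package.json',
  extraEnv: {
    npm_package_resolved: 'https://registry.example/dep/-/dep-1.0.0.tgz', // placeholder
    npm_package_integrity: 'sha512-PLACEHOLDER', // placeholder
  },
});
console.log(Object.keys(env).sort());
```

Only `loglevel` differs from its default in this example, so it is the only config value exported besides the two config file paths.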

## Prior Art

npm v6 and Yarn both do roughly the same thing, though they have different
config values.
