|
| 1 | +# Reduce the environment variables provided to lifecycle scripts |
| 2 | + |
| 3 | +In versions of npm up to v6, the following items are all placed into the |
| 4 | +environment of scripts run for various lifecycle events (install, prepare, |
| 5 | +etc., as well as explicit scripts such as test and start). |
| 6 | + |
| 7 | +- `PATH` Configured to include `.../node_modules/.bin` for current and all |
| 8 | + parent `node_modules` directories. |
| 9 | +- `npm_package_*` for all `package.json` values in the current package for |
| 10 | + which the lifecycle event is running. |
| 11 | +- `npm_config_*` for all npm configuration values that do not start with a |
| 12 | + `_` character. |
| 13 | +- `npm_lifecycle_event` the current lifecycle event. |
| 14 | +- `npm_lifecycle_script` the command being run. |
| 15 | +- `npm_node_execpath` the path to the Node.js executable npm is using. |
| 16 | +- `npm_execpath` the path to the npm executable being run. |
| 17 | + |
| 18 | +The suggestion presented here is to remove (or vastly reduce) the |
| 19 | +`npm_config_*` and `npm_package_*` environment variables from the context |
| 20 | +of lifecycle scripts, and potentially also add new fields that may be more |
| 21 | +useful to more users. |
| 22 | + |
| 23 | +## Motivation |
| 24 | + |
| 25 | +Lifecycle scripts are run in many different contexts throughout the npm |
| 26 | +codebase. |
| 27 | + |
| 28 | +- Explicit scripts are run directly from the `lib/run-script.js` |
| 29 | + command implementation. |
| 30 | +- Build scripts are run from the context of the tree building logic, which |
| 31 | + is moving to a new implementation with `@npmcli/arborist` in v7. |
| 32 | +- Prepare scripts are run by `pacote` when it creates a tarball for |
| 33 | + publication or when it installs `git` dependencies. |
| 34 | + |
| 35 | +All of this necessitates passing around a single configuration object, |
| 36 | +which has some problems. |
| 37 | + |
| 38 | +1. It is tedious and error-prone, and has led to a more complicated |
| 39 | + codebase |
| 40 | +2. While we have not had security issues with it in the past, it runs the |
| 41 | + risk of exposing something sensitive in a context where it should not be |
| 42 | + exposed. |
| 43 | +3. It invites users to fork package behavior based on npm configuration, |
| 44 | + which should be a contract between the user and npm, and not between the |
| 45 | + user, npm, and the publisher. |
| 46 | +4. While the package.json data does not have as many of these problems, it |
| 47 | + is also largely unnecessary (and not widely used). The `package.json` |
| 48 | + file is readily available and easily parsed, and most scripts that would |
| 49 | + depend on package data simply read it directly. |
| 50 | +5. The environment is created anew for every script that's run. This could |
| 51 | + be optimized further, but as it currently stands, it's pretty |
| 52 | + inefficient. |
| 53 | +6. Lastly, exposing the full configuration and package.json makes the |
| 54 | + environment significantly larger, and can lead to problems on |
| 55 | + memory-constrained systems. |
| 56 | + |
| 57 | +The advantage of including `npm_config_*` values in the lifecycle |
| 58 | +environment is that npm commands run from within lifecycle events will have |
| 59 | +the same config values as the process that spawned them, since `env` values |
| 60 | +will override any other values except explicit command line flags. |
| 61 | + |
| 62 | +For example, a script named `release` may run tests, update the changelog, |
| 63 | +and then publishe the package. Running `npm run release --otp=123456` will |
| 64 | +put the two-factor auth one-time password into the `npm_config_otp` |
| 65 | +environment variable, so that the subsequent `npm publish` command will |
| 66 | +have the one-time password provided in the config. |
| 67 | + |
| 68 | +## Detailed Explanation |
| 69 | + |
| 70 | +1. Remove `npm_package*` values from the script lifecycle environment. |
| 71 | +2. Provide a new field, `npm_package_json`, with the path to the |
| 72 | + `package.json` file. |
| 73 | +3. Provide a new field, `npm_command`, with the canonical name of the |
| 74 | + command being run. |
| 75 | +3. Remove all `npm_config_*` values from the script lifecycle environment |
| 76 | + _except_: |
| 77 | + 1. `npm_config_userconfig` |
| 78 | + 2. `npm_config_globalconfig` |
| 79 | + 3. Environment variables corresponding to any non-default config |
| 80 | + values. |
| 81 | +4. Add `npm_package_from`, `npm_package_resolved`, and |
| 82 | + `npm_package_integrity` for the package whose lifecycle event is |
| 83 | + running, if it's part of an install. (This addresses the needs of build |
| 84 | + tools, as discussed in |
| 85 | + [#38](https://github.com/npm/rfcs/pull/38#issuecomment-529182151).) |
| 86 | +5. `PATH` will continue to be provided as it currently is, so that scripts |
| 87 | + find their dependencies' executables first. |
| 88 | + |
| 89 | +This makes it easier to find and rely on package.json data, while ensuring |
| 90 | +that config defaults are maintained, without blowing up the size of the |
| 91 | +environment for lifecycle processes, or requiring access to the npm config |
| 92 | +subsystem in every npm CLI dependency. |
| 93 | + |
| 94 | +## Rationale and Alternatives |
| 95 | + |
| 96 | +Possible alternatives: |
| 97 | + |
| 98 | +### Just go ahead and pass around the whole config object like we do today |
| 99 | + |
| 100 | +This is not ideal for the reasons mentioned above, but also, it makes it |
| 101 | +virtually assured that Arborist remains tightly coupled to the npm cli. |
| 102 | +While _some_ degree of coupling is unavoidable, having to provide a valid |
| 103 | +npm config object would make this coupling much tighter than necessary. |
| 104 | + |
| 105 | +### Inversion of Control on the npm-lifecycle environment creation |
| 106 | + |
| 107 | +Rather than provide a config object matching a given interface, provide |
| 108 | +`npm-lifecycle` with a method that can build up and return the environment |
| 109 | +object. |
| 110 | + |
| 111 | +This approach would address the tight coupling between cli and arborist, |
| 112 | +but it doesn't address the other problems with having a giant config object |
| 113 | +dumped into the environment. |
| 114 | + |
| 115 | +## Implementation |
| 116 | + |
| 117 | +The npm CLI will set all non-default config flags in the environment so |
| 118 | +that scripts and sub-scripts will have them set in their configs by default |
| 119 | +at the env level. |
| 120 | + |
| 121 | +Replace `npm-lifecycle` with a lighter-weight approach, published as |
| 122 | +`@npmcli/run-script`. |
| 123 | + |
| 124 | +Instead of building the environment up from the config and package data, |
| 125 | +`@npmcli/run-script` will only set `npm_package_json` to the path to the |
| 126 | +package.json file for the package being run, take an object to define |
| 127 | +additional environment variables, and always inherit the environment in the |
| 128 | +child process being run. |
| 129 | + |
| 130 | +Because the npm CLI sets the relevant config fields, they'll be inherited |
| 131 | +to the child processes automatically. Arborist will use the environment |
| 132 | +option to pass in the `npm_package_from`, `npm_package_resolved`, and |
| 133 | +`npm_package_integrity` values. |
| 134 | + |
| 135 | +## Prior Art |
| 136 | + |
| 137 | +npm v6 and yarn both do roughly the same thing, though they have different |
| 138 | +config values. |
| 139 | + |
| 140 | +## Downsides |
| 141 | + |
| 142 | +Some modules today use `npm_config_argv`. These will have to be updated to |
| 143 | +use other means to get this information. |
| 144 | + |
| 145 | +Where the argv is being parsed in order to determine the command being run, |
| 146 | +the `npm_command` environ provides a safer approach. |
0 commit comments