Skip to content

Commit 132e35a

Browse files
committed
[docs/rfc] RFC about environment variables
1 parent 5f9a7d7 commit 132e35a

File tree

1 file changed

+216
-0
lines changed

1 file changed

+216
-0
lines changed

docs/rfcs/env-vars.md

+216
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,216 @@
1+
# A more sensible Collector environment variable resolution
2+
3+
## Overview
4+
5+
The OpenTelemetry Collector supports three different syntaxes for
6+
environment variable resolution which differ in their syntax, semantics
7+
and allowed variable names. Before we stabilize confmap, we need to
8+
address several issues related to environment variables. This document
9+
describes: the current (as of v0.97.0) behavior of the Collector, the
10+
goals that an environment variable resolution should aim for, existing
11+
deviations from these goals and the desired behavior after making some
12+
changes.
13+
14+
CLI environment variable resolution has a single syntax (`--config env:ENV`)
15+
and it is considered out of scope for this document, focusing
16+
instead on expansion within the Collector configuration.
17+
18+
How to get from the current to desired behavior is also considered out
19+
of scope and will be discussed on individual PRs. It will likely involve
20+
one or multiple feature gates, warnings and transition periods.
21+
22+
## Goals of an expansion system
23+
24+
The following are considered goals of the expansion system:
25+
26+
1. ***Expansion should happen only when the user expects it***. We
27+
should aim to expand when the user expects it and keep the original
28+
value when we don't (e.g. because the syntax is used for something
29+
different).
30+
2. ***Expansion should have predictable behavior***.
31+
3. ***Multiple expansion methods, if present, should have similar behavior.***
32+
Switching from `${env:ENV}` to `${ENV}` or vice versa
33+
should not lead to (too many?) surprises.
34+
4. ***When the syntax overlaps, expansion should be aligned with***
35+
[***the expansion defined by the Configuration Working Group***](https://github.com/open-telemetry/opentelemetry-specification/blob/032213cedde54a2171dfbd234a371501a3537919/specification/configuration/file-configuration.md#environment-variable-substitution).
36+
37+
## Current behavior
38+
39+
The Collector supports three different syntaxes for environment variable
40+
resolution:
41+
42+
1. The *naked syntax*, `$ENV`.
43+
2. The *braces syntax*, `${ENV}`.
44+
3. The *env provider syntax*, `${env:ENV}`.
45+
46+
These differ in the character set allowed for environment variable names
47+
as well as the type of parsing they return. Escaping is supported in all
48+
syntaxes by using two dollar signs.
49+
50+
### Type casting rules
51+
52+
A provider or converter takes a string and returns some sort of value
53+
after potentially doing some parsing. This gets stored in a
54+
confmap.Conf. When unmarshalling, we use mapstructure with
55+
WeaklyTypedInput enabled, which does a lot of implicit casting. The
56+
details of this type casting are complex and are outlined on issue
57+
[#9532](https://github.com/open-telemetry/opentelemetry-collector/issues/9532).
58+
59+
When using this notation in inline mode (e.g.
60+
`http://endpoint/${env:PATH}`) we also do manual implicit type
61+
casting with a similar approach to mapstructure. These are outlined
62+
[here](https://github.com/open-telemetry/opentelemetry-collector/blob/fc4c13d3c2822bec39fa9d9658836d1a020c6844/confmap/expand.go#L124-L139).
63+
64+
### Naked syntax
65+
66+
The naked syntax is supported via the expand converter. It is
67+
implemented using the [`os.Expand`](https://pkg.go.dev/os#Expand) stdlib
68+
function. This syntax supports identifiers made up of:
69+
70+
1. ASCII alphanumerics and the `_` character
71+
2. Certain special characters if they appear alone typically used in
72+
Bash: `*`, `#`, `$`, `@`, `!`, `?` and `-`.
73+
74+
You can see supported identifiers in this example:
75+
[`go.dev/play/p/YfxLtYbsL6j`](https://go.dev/play/p/YfxLtYbsL6j).
76+
77+
The environment variable value is taken as-is and the type is always
78+
string.
79+
80+
### Braces syntax
81+
82+
The braces syntax is supported via the expand converter. It is also
83+
implemented using the os.Expand stdlib function. This syntax supports
84+
any identifiers that don't contain `}`. Again, refer to the os.Expand
85+
example to see how it works in practice:
86+
[`go.dev/play/p/YfxLtYbsL6j`](https://go.dev/play/p/YfxLtYbsL6j).
87+
88+
The environment variable value is taken as-is and the type is always
89+
string.
90+
91+
### `env` provider
92+
93+
The `env` provider syntax is supported via (you guessed it) the `env`
94+
provider. Its implementation is custom made. This syntax supports any
95+
identifier that does not have a $ in it, to support recursive
96+
resolution (e.g. `${env:${http://example.com}}` would get the
97+
environment variable whose name is stored in the URL
98+
`http://example.com`).
99+
100+
The environment variable value is parsed by the yaml.v3 parser to an
101+
any-typed variable. The yaml.v3 parser mostly follows the YAML v1.2
102+
specification with [*some exceptions*](https://github.com/go-yaml/yaml#compatibility).
103+
You can see
104+
how it works for some edge cases in this example:
105+
[`go.dev/play/p/RtPmH8aZA1X`](https://go.dev/play/p/RtPmH8aZA1X).
106+
107+
### Issues of current behavior
108+
109+
#### Unintuitive behavior on unset environment variables
110+
111+
When an environment variable is empty, all syntaxes return an empty
112+
string with no warning given; this is frequently unexpected but can also
113+
be used intentionally. This is especially unintuitive when the user did
114+
not expect expansion to happen. Two examples where this is unexpected
115+
are the following:
116+
117+
1. **Opaque values such as passwords that contain `$`** (issue
118+
[#8215](https://github.com/open-telemetry/opentelemetry-collector/issues/8215)).
119+
If the $ is followed by an alphanumeric character or one of the
120+
special characters, it's going to lead to false positives.
121+
2. **Prometheus relabel config** (issue
122+
[`contrib#9984`](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/9984)).
123+
Prometheus uses `${1}` in some of its configuration values. We
124+
resolve this to the value of the environment variable with name
125+
'`1`'.
126+
3. **Other uses of $** (issue
127+
[`contrib#11846`](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/11846)).
128+
If a product requires the use of `$` in some field, we would most
129+
likely interpret it as an environment variable. This is not
130+
intuitive for users.
131+
132+
#### Unexpected type casting
133+
134+
When using the env syntax we parse its value as YAML. Even if you are
135+
familiar with YAML, because of the implicit type casting rules and the
136+
way we store intermediate values, we can get unintuitive results.
137+
138+
The most clear example of this is issue
139+
[*#8565*](https://github.com/open-telemetry/opentelemetry-collector/issues/8565):
140+
When setting a variable to value `0123` and using it in a string-typed
141+
field, it will end up as the string `"83"` (where as the user would
142+
expect the string to be `0123`).
143+
144+
#### We are less restrictive than the Configuration WG
145+
146+
The Configuration WG defines an [*environment variable expansion feature
147+
for SDK
148+
configurations*](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/configuration/file-configuration.md#environment-variable-substitution).
149+
This accepts only non empty alphanumeric + underscore identifiers
150+
starting with alphabetic or underscore. If the Configuration WG were to
151+
expand this in the future (e.g. to include other features present in
152+
Bash-like syntax), we would not be able to expand our braces syntax to
153+
support new features without breaking users.
154+
155+
## Desired behavior
156+
157+
*This section is written as if the changes were already implemented.*
158+
159+
The Collector supports **two** different syntaxes for environment
160+
variable resolution:
161+
162+
1. The *braces syntax*, `${ENV}`.
163+
2. The *env provider syntax*, `${env:ENV}`.
164+
165+
These both have** the same character set and behavior**. They both use
166+
the env provider under the hood.
167+
168+
The naked syntax supported in BASH is not supported in the Collector.
169+
Escaping is supported by using two dollar signs. Escaping is also
170+
honored for unsupported identifiers like `${1}`.
171+
172+
### Type casting rules
173+
174+
The environment variable value is parsed by the yaml.v3 parser to an
175+
any-typed variable and the original representation as a string is stored
176+
for numeric types. The `yaml.v3` parser mostly follows the YAML v1.2
177+
specification with [*some
178+
exceptions*](https://github.com/go-yaml/yaml#compatibility). You can see
179+
how it works for some edge cases in this example:
180+
[*https://go.dev/play/p/RtPmH8aZA1X*](https://go.dev/play/p/RtPmH8aZA1X).
181+
182+
When unmarshalling, we use mapstructure with WeaklyTypedInput
183+
**disabled**. We check via a hook an AsString method from confmap.Conf
184+
and use its return value when it is valid and we are mapping to a string
185+
field. This method has default casting rules for unambiguous scalar
186+
types but may return the original representation depending on the
187+
construction of confmap.Conf
188+
189+
For using this notation in inline mode (e.g.`http://endpoint/${env:PATH}`), we
190+
use the AsString method from confmap.Conf.
191+
192+
### Character set
193+
194+
An environment variable identifier must be a nonempty ASCII alphanumeric
195+
or underscore starting with an alphabetic or underscore character. Its
196+
maximum length is 200 characters. Both syntaxes support recursive
197+
resolution.
198+
199+
When an invalid identifier is found, a warning log is emitted, and the
200+
string is kept as-is.
201+
202+
### Comparison table with current behavior
203+
204+
This is a comparison between the current and desired behavior for
205+
loading a field with the braces syntax, `env` syntax.
206+
207+
| Raw value | Field type | Current behavior, `${ENV}`, single field | Current behavior, `${env:ENV}` , single field | Desired behavior, entire field | Desired behavior, inline string field |
208+
|-------------- |------------ |------------------------------------------ |------------------------------------------------ |-------------------------------- |--------------------------------------- |
209+
| `123` | integer | 123 | 123 | 123 | n/a |
210+
| `0123` | integer | 83 | 83 | 83 | n/a |
211+
| `0123` | string | 0123 | 83 | 0123 (83?) | 0123 |
212+
| `0xdeadbeef` | string | 0xdeadbeef | 3735928559 | 0xdeadbeef (3735928559?) | 0xdeadbeef |
213+
| `"0123"` | string | "0123" | 0123 | 0123 | 0123 |
214+
| `!!str 0123` | string | !!str 0123 | 0123 | 0123 | 0123 (!!str 0123) |
215+
| `t` | boolean | true | true | Error: mapping string to bool | n/a |
216+
| `23` | boolean | true | true | Error: mapping integer to bool | n/a |

0 commit comments

Comments
 (0)