Commit fe582df

Authored by mx-psi, codeboten, and evan-bradley
[docs/rfc] RFC about environment variables (#9854)
**Description:** Adds an RFC about how environment variable resolution should work

**Link to tracking Issue:** Fixes #9515, relates to:

- #8215
- #8565
- #9162
- #9531
- #9532

---------

Co-authored-by: Alex Boten <223565+codeboten@users.noreply.github.com>
Co-authored-by: Evan Bradley <11745660+evan-bradley@users.noreply.github.com>
1 parent 7fd529f commit fe582df

1 file changed

+222
-0
lines changed

docs/rfcs/env-vars.md

@@ -0,0 +1,222 @@
1+
# Stabilizing environment variable resolution
2+
3+
## Overview
4+
The OpenTelemetry Collector supports three different syntaxes for
environment variable resolution, which differ in their semantics
and in the variable names they allow. Before we stabilize confmap, we need to
address several issues related to environment variables. This document
describes:
10+
- the current (as of v0.97.0) behavior of the Collector
- the goals that an environment variable resolution system should aim for
- existing deviations from these goals
- the desired behavior after making some changes
15+
16+
### Out of scope
17+
CLI environment variable resolution has a single syntax (`--config env:ENV`)
and is considered out of scope for this document, which focuses
instead on expansion within the Collector configuration.

How to get from the current to the desired behavior is also considered out
of scope and will be discussed on individual PRs. It will likely involve
one or more feature gates, warnings and transition periods.
25+
26+
## Goals of an expansion system
27+
28+
The following are considered goals of the expansion system:
29+
1. ***Expansion should happen only when the user expects it***. We
   should aim to expand when the user expects it and keep the original
   value when they don't (e.g. because the syntax is used for something
   different).
2. ***Expansion must have predictable behavior***.
3. ***Multiple expansion methods, if present, should have similar behavior.***
   Switching from `${env:ENV}` to `${ENV}` or vice versa
   should not lead to any surprises.
4. ***When the syntax overlaps, expansion should be aligned with***
   [***the expansion defined by the Configuration Working Group***](https://github.com/open-telemetry/opentelemetry-specification/blob/032213cedde54a2171dfbd234a371501a3537919/specification/configuration/file-configuration.md#environment-variable-substitution). See [opentelemetry-specification/issues/3963](https://github.com/open-telemetry/opentelemetry-specification/issues/3963) for the counterpart to this line of work in the SDK File spec.
40+
41+
## Current behavior
42+
43+
The Collector supports three different syntaxes for environment variable
44+
resolution:
45+
46+
1. The *naked syntax*, `$ENV`.
47+
2. The *braces syntax*, `${ENV}`.
48+
3. The *env provider syntax*, `${env:ENV}`.
49+
These differ in the character set allowed for environment variable names
as well as in how the resolved value is parsed and typed. Escaping is supported in all
syntaxes by using two dollar signs.
53+
54+
### Type casting rules
55+
A provider or converter takes a string and returns some sort of value
after potentially doing some parsing. This gets stored in a
`confmap.Conf`. When unmarshalling, we use [mapstructure](https://github.com/mitchellh/mapstructure) with
`WeaklyTypedInput` enabled, which performs a lot of implicit casting. The
details of this type casting are complex and are outlined in issue
[#9532](https://github.com/open-telemetry/opentelemetry-collector/issues/9532).

When an expansion is used inline within a larger string (e.g.
`http://endpoint/${env:PATH}`), we also do manual implicit type
casting with an approach similar to mapstructure's. The rules are outlined in
[`confmap/expand.go`](https://github.com/open-telemetry/opentelemetry-collector/blob/fc4c13d3c2822bec39fa9d9658836d1a020c6844/confmap/expand.go#L124-L139).
67+
68+
### Naked syntax
69+
70+
The naked syntax is supported via the expand converter. It is
71+
implemented using the [`os.Expand`](https://pkg.go.dev/os#Expand) stdlib
72+
function. This syntax supports identifiers made up of:
73+
1. ASCII alphanumerics and the `_` character
2. Certain special characters typically used in Bash, if they appear
   alone: `*`, `#`, `$`, `@`, `!`, `?` and `-`.
77+
78+
You can see supported identifiers in this example:
79+
[`go.dev/play/p/YfxLtYbsL6j`](https://go.dev/play/p/YfxLtYbsL6j).
80+
81+
The environment variable value is taken as-is and the type is always
82+
string.
83+
84+
### Braces syntax
85+
The braces syntax is supported via the expand converter. It is also
implemented using the [`os.Expand`](https://pkg.go.dev/os#Expand) stdlib function. This syntax supports
any identifier that doesn't contain `}`. Again, refer to the `os.Expand`
example to see how it works in practice:
[`go.dev/play/p/YfxLtYbsL6j`](https://go.dev/play/p/YfxLtYbsL6j).
91+
92+
The environment variable value is taken as-is and the type is always
93+
string.
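
The difference between the naked and braces syntaxes can be seen by calling `os.Expand` directly. The following is a minimal sketch, not the expand converter itself: the helper `expandWith` and the variable names are made up for illustration, and a map stands in for the process environment.

```go
package main

import (
	"fmt"
	"os"
)

// expandWith resolves $NAME and ${NAME} references in s using vars.
// Names missing from vars expand to the empty string.
// (Hypothetical helper; a map replaces the real environment.)
func expandWith(s string, vars map[string]string) string {
	return os.Expand(s, func(name string) string { return vars[name] })
}

func main() {
	vars := map[string]string{"HOST": "localhost", "PORT": "4317", "my.var": "x"}

	// Naked and braces syntax in the same string.
	fmt.Println(expandWith("$HOST:${PORT}", vars))
	// The braces syntax accepts any name that does not contain '}'.
	fmt.Println(expandWith("${my.var}", vars))
	// The naked syntax stops at the first character that is not
	// alphanumeric or '_', so only "my" is looked up here.
	fmt.Println(expandWith("$my.var", vars))
}
```

In all three cases the result is a plain string, which matches the "taken as-is" behavior described for both syntaxes.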
94+
95+
### `env` provider
96+
97+
The `env` provider syntax is supported via the `env`
98+
provider. It is a custom implementation with a syntax that supports any
99+
identifier that does not contain a `$`. This is done to support recursive
100+
resolution (e.g. `${env:${http://example.com}}` would get the
101+
environment variable whose name is stored in the URL
102+
`http://example.com`).
103+
104+
The environment variable value is parsed by the yaml.v3 parser to an
105+
any-typed variable. The yaml.v3 parser mostly follows the YAML v1.2
106+
specification with [*some exceptions*](https://github.com/go-yaml/yaml#compatibility).
107+
You can see
108+
how it works for some edge cases in this example:
109+
[`go.dev/play/p/RtPmH8aZA1X`](https://go.dev/play/p/RtPmH8aZA1X).
110+
111+
### Issues of current behavior
112+
113+
#### Unintuitive behavior on unset environment variables
114+
When an environment variable is unset, all syntaxes return an empty
string with no warning given; this is frequently unexpected but can also
be used intentionally. This is especially unintuitive when the user did
not expect expansion to happen. The following are three examples where
this is unexpected:
120+
1. **Opaque values such as passwords that contain `$`** (issue
   [#8215](https://github.com/open-telemetry/opentelemetry-collector/issues/8215)).
   If the `$` is followed by an alphanumeric character or one of the
   special characters, the text is treated as a variable reference,
   which leads to false positives.
2. **Prometheus relabel config** (issue
   [`contrib#9984`](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/9984)).
   Prometheus uses `${1}` in some of its configuration values. We
   resolve this to the value of the environment variable named `1`.
3. **Other uses of `$`** (issue
   [`contrib#11846`](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/11846)).
   If a product requires the use of `$` in some field, we would most
   likely interpret it as an environment variable. This is not
   intuitive for users.
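
The first two cases above can be reproduced with `os.Expand` alone. This is a sketch under the assumption that none of the referenced variables are set; the helper `expandUnset` and the sample values are hypothetical, and the Collector's own escaping of `$$` is not modelled.

```go
package main

import (
	"fmt"
	"os"
)

// expandUnset mimics expansion in an environment where none of the
// referenced variables are set: every lookup yields the empty string.
// (Hypothetical helper for illustration.)
func expandUnset(s string) string {
	return os.Expand(s, func(string) string { return "" })
}

func main() {
	// A password containing '$': "$t123" is read as a variable name.
	fmt.Printf("%q\n", expandUnset("s3cr$t123")) // prints "s3cr"
	// A Prometheus-style capture group: "${1}" is read as variable "1".
	fmt.Printf("%q\n", expandUnset("replacement: ${1}")) // prints "replacement: "
}
```

In both cases part of the value silently disappears, with no warning that an expansion took place.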
135+
136+
#### Unexpected type casting
137+
When using the `env` syntax, we parse the variable's value as YAML. Even if you are
familiar with YAML, the implicit type casting rules and the
way we store intermediate values can produce unintuitive results.

The clearest example of this is issue
[*#8565*](https://github.com/open-telemetry/opentelemetry-collector/issues/8565):
when a variable is set to the value `0123` and used in a string-typed
field, it ends up as the string `"83"` (whereas the user would
expect the string to be `0123`).
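
The mechanism behind this result is that the value is first parsed as a number and only then turned back into a string. The sketch below imitates that round trip with `strconv.ParseInt` in base 0, which also reads a leading `0` as octal. It is a stand-in chosen for illustration, not the YAML parser the Collector uses, and `parseThenFormat` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseThenFormat parses raw as an integer and formats the result in
// decimal, losing the original spelling along the way.
// Base 0 lets the prefix pick the base: "0" is octal, "0x" is hex.
// (strconv is an assumption standing in for the YAML parser.)
func parseThenFormat(raw string) (string, error) {
	n, err := strconv.ParseInt(raw, 0, 64)
	if err != nil {
		return "", err
	}
	return strconv.FormatInt(n, 10), nil
}

func main() {
	for _, raw := range []string{"123", "0123", "0xdeadbeef"} {
		s, err := parseThenFormat(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %q\n", raw, s)
	}
}
```

Once the value has been stored as the integer 83, the original text `0123` can no longer be recovered, which is why the string field receives `"83"`.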
147+
148+
#### We are less restrictive than the Configuration WG
149+
The Configuration WG defines an [*environment variable expansion feature
for SDK
configurations*](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/configuration/file-configuration.md#environment-variable-substitution).
This accepts only nonempty identifiers made up of alphanumeric characters
and underscores, starting with an alphabetic character or an underscore. If the Configuration WG were to
extend this syntax in the future (e.g. to include other features present in
Bash-like syntax as in [opentelemetry-specification/pull/3948](https://github.com/open-telemetry/opentelemetry-specification/pull/3948)), we would not be able to extend our braces syntax to
support the new features without breaking users, because our braces syntax
already accepts any identifier that doesn't contain `}`.
158+
159+
## Desired behavior
160+
161+
*This section is written as if the changes were already implemented.*
162+
163+
The Collector supports **two** different syntaxes for environment
164+
variable resolution:
165+
166+
1. The *braces syntax*, `${ENV}`.
167+
2. The *env provider syntax*, `${env:ENV}`.
168+
169+
These both have **the same character set and behavior**. They both use
170+
the env provider under the hood. This means we support the exact same
171+
syntax as the Configuration WG.
172+
173+
The naked syntax supported in Bash is not supported in the Collector.
174+
Escaping is supported by using two dollar signs. Escaping is also
175+
honored for unsupported identifiers like `${1}` (i.e. anything that
176+
matches `\${[^$}]+}`).
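
As a rough illustration of which strings the pattern above covers, the following sketch compiles it with Go's `regexp` package. The braces are escaped explicitly and the pattern is anchored so that it tests whole strings; both choices, and the name `looksLikeExpansion`, are assumptions made for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// expansionLike is the pattern from the text with the braces escaped
// and anchors added so that it must match the entire string.
var expansionLike = regexp.MustCompile(`^\$\{[^$}]+\}$`)

// looksLikeExpansion reports whether s has the shape ${...} with a
// nonempty body that contains neither '$' nor '}'.
// (Hypothetical helper for illustration.)
func looksLikeExpansion(s string) bool {
	return expansionLike.MatchString(s)
}

func main() {
	for _, s := range []string{"${1}", "${env:ENV}", "${}", "$ENV"} {
		fmt.Printf("%-12s %v\n", s, looksLikeExpansion(s))
	}
}
```

Strings such as `${1}` have this shape even though `1` is not a supported identifier, which is why escaping needs to be honored for them as well.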
177+
178+
### Type casting rules
179+
180+
The environment variable value is parsed by the yaml.v3 parser to an
181+
any-typed variable and the original representation as a string is stored
182+
for numeric types. The `yaml.v3` parser mostly follows the YAML v1.2
183+
specification with [*some
184+
exceptions*](https://github.com/go-yaml/yaml#compatibility). You can see
185+
how it works for some edge cases in this example:
186+
[*https://go.dev/play/p/RtPmH8aZA1X*](https://go.dev/play/p/RtPmH8aZA1X).
187+
When unmarshalling, we use mapstructure with `WeaklyTypedInput`
**disabled**. Through a hook, we call an `AsString` method on `confmap.Conf`
and use its return value when it is valid and the target is a string
field. This method has default casting rules for unambiguous scalar
types but may return the original representation depending on how
the `confmap.Conf` was constructed (see the comparison table below for details).

When an expansion is used inline within a larger string (e.g. `http://endpoint/${env:PATH}`), we
use the `AsString` method from `confmap.Conf` (see the comparison table below for details).
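
One way to picture the idea of keeping the original representation is a value that carries both the parsed result and the raw text. The sketch below is not the `confmap` implementation: the type `expandedValue`, the function `resolve` and the use of `strconv` in place of the YAML parser are all assumptions, and only the integer case is modelled (quoted strings and YAML tags are not).

```go
package main

import (
	"fmt"
	"strconv"
)

// expandedValue is a hypothetical container holding the parsed value
// together with the text it was parsed from.
type expandedValue struct {
	Value    any
	Original string
}

// resolve parses raw as an integer when possible and always keeps the
// raw text. strconv stands in for the YAML parser in this sketch.
func resolve(raw string) expandedValue {
	if n, err := strconv.ParseInt(raw, 0, 64); err == nil {
		return expandedValue{Value: n, Original: raw}
	}
	return expandedValue{Value: raw, Original: raw}
}

// AsString returns the original representation, so numeric-looking
// input is not rewritten when the target is a string.
func (v expandedValue) AsString() string {
	return v.Original
}

func main() {
	v := resolve("0123")
	fmt.Println(v.Value)      // what an integer field would receive
	fmt.Println(v.AsString()) // what a string field would receive

	// Inline use builds the string from the original representation.
	fmt.Println("http://endpoint/" + resolve("0xdeadbeef").AsString())
}
```

With this shape, an integer field still sees 83 for `0123`, while a string field or an inline expansion sees the text exactly as the user wrote it.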
197+
198+
### Character set
199+
An environment variable identifier must be a nonempty sequence of ASCII
alphanumeric characters and underscores that starts with an alphabetic
character or an underscore. Its
maximum length is 200 characters. Both syntaxes support recursive
resolution.

When an invalid identifier is found, an error is emitted. To use a string
that contains an invalid identifier as a literal value, the `$` must be escaped.
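
The identifier rules above translate into a short check. This is a minimal sketch assuming the rules exactly as stated in this section; the names `validIdentifier`, `identifierPattern` and `maxIdentifierLen` are invented for the example and are not taken from the Collector code base.

```go
package main

import (
	"fmt"
	"regexp"
)

// maxIdentifierLen is the maximum identifier length given in the text.
const maxIdentifierLen = 200

// identifierPattern accepts ASCII alphanumerics and underscores,
// starting with a letter or an underscore.
var identifierPattern = regexp.MustCompile(`^[A-Za-z_][A-Za-z0-9_]*$`)

// validIdentifier reports whether name satisfies the character set
// and length rules. (Hypothetical helper for illustration.)
func validIdentifier(name string) bool {
	return len(name) <= maxIdentifierLen && identifierPattern.MatchString(name)
}

func main() {
	for _, name := range []string{"ENV", "_private1", "1", "MY-VAR", ""} {
		fmt.Printf("%-10q valid=%v\n", name, validIdentifier(name))
	}
}
```

Under these rules `${1}` is rejected with an error rather than silently resolved, so it has to be escaped to be used literally.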
207+
208+
### Comparison table with current behavior
209+
This is a comparison between the current and desired behavior for
loading a field with the braces syntax and the `env` syntax.
212+
| Raw value    | Field type | Current behavior, `${ENV}`, single field | Current behavior, `${env:ENV}`, single field | Desired behavior, entire field | Desired behavior, inline string field |
|--------------|------------|------------------------------------------|----------------------------------------------|--------------------------------|---------------------------------------|
| `123`        | integer    | 123                                      | 123                                          | 123                            | n/a                                   |
| `0123`       | integer    | 83                                       | 83                                           | 83                             | n/a                                   |
| `0123`       | string     | 0123                                     | 83                                           | 0123                           | 0123                                  |
| `0xdeadbeef` | string     | 0xdeadbeef                               | 3735928559                                   | 0xdeadbeef                     | 0xdeadbeef                            |
| `"0123"`     | string     | "0123"                                   | 0123                                         | 0123                           | 0123                                  |
| `!!str 0123` | string     | !!str 0123                               | 0123                                         | 0123                           | 0123                                  |
| `t`          | boolean    | true                                     | true                                         | Error: mapping string to bool  | n/a                                   |
| `23`         | boolean    | true                                     | true                                         | Error: mapping integer to bool | n/a                                   |
