caddytls: Encrypted ClientHello (ECH) #6862


mholt
Member

@mholt mholt commented Feb 24, 2025

This is the initial implementation for Encrypted ClientHello (ECH).

I have verified during testing with Firefox + Wireshark that only the outer name(s) appear on the wire in plaintext; the actual server names do not. (Firefox requires DoH enabled.)

Current features:

  • Adds a DNS provider configurable for the entire TLS app that acts as a default when a DNS module is needed, since most people use a single DNS provider; this can reduce redundancy in configs.
  • Generates an ECH config given only a public name
  • Automatically publishes the ECH configs to DNS (given a DNS provider plugin and its settings in the config), which is the conventional/standard way of conveying ECH configs so that browsers will use them
  • Makes publication modular, so if you have a custom way of distributing ECH keys, Caddy can do it automatically regardless (just write a custom module)
  • Implements some of the RFC best practices, such as choosing the config_id by random rejection sampling.
  • Infers server names from applications. For example, having Host matchers in the HTTP app is sufficient for those names to be protected by ECH when enabled. That is, you do not need to redundantly specify hostnames in your config.
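
As an illustration of the config_id bullet above, here is a minimal Go sketch of rejection sampling: draw a random byte and retry while it collides with an ID already in use. The function name chooseConfigID and the "used" set are assumptions made for this example; they are not taken from the PR's code.

```go
package main

import (
	"crypto/rand"
	"errors"
	"fmt"
)

// chooseConfigID returns a random 8-bit config_id that is not in used.
// Hypothetical helper for illustration; not Caddy's actual implementation.
func chooseConfigID(used map[uint8]struct{}) (uint8, error) {
	if len(used) >= 256 {
		return 0, errors.New("all 256 config_id values are in use")
	}
	var b [1]byte
	for {
		// Draw one uniformly random byte.
		if _, err := rand.Read(b[:]); err != nil {
			return 0, err
		}
		// Reject the sample if it collides with an existing ID.
		if _, taken := used[b[0]]; !taken {
			return b[0], nil
		}
	}
}

func main() {
	used := map[uint8]struct{}{0x45: {}}
	id, err := chooseConfigID(used)
	if err != nil {
		panic(err)
	}
	fmt.Printf("chose config_id 0x%02x\n", id)
}
```

Because each draw is uniform and collisions are simply discarded, the result is uniform over the unused IDs.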

Still TODO:

  • Automatically rotate keys (blocked by proposal: crypto/tls: add GetEncryptedClientHelloKeys callback golang/go#71920 -- update: slated for Go 1.25, so deferring to a later PR)
  • Customizable TTL (related to rotating keys)
  • Caddyfile config
  • Note that on-demand TLS requires explicitly-configured names for ECH
  • Combine SvcParams with existing HTTPS records
  • Refactor publication to scale to many domains (async)
  • Only publish each config to each publisher at most once
  • TLS-app-scoped DNS provider should be applied to automation policies too (DNS challenges)

Here's a sample Caddyfile that should be the minimum required to get ECH to work:

{
	debug  # (plz use debug mode while testing)

	dns cloudflare {env.CLOUDFLARE_API_KEY}
	ech ech.example.net
}

example.com {
	respond "Hello there!"
}

(Be sure to replace with your actual values.)

This minimal, opinionated config tells Caddy to serve your site, example.com, as usual (with automatic HTTPS, a certificate, the whole bit), but to also manage a certificate for ech.example.net. This config is opinionated because it is so minimal, in that it enables ECH for all sites, protecting them behind the domain listed in the ech global option. (Each outer name corresponds to one ECH config.) It also publishes all ECH configs (one in this case) to all publishers (one DNS publisher, in this case).

(The new dns global option specifies a DNS provider to use if none other is specified but one is needed. It does not enable the ACME DNS challenge in the way the acme_dns global option does.)

Similarly, here's a sample JSON config with debug logs enabled; be sure to fill out your actual domain names (both inner and outer) and set your DNS provider accordingly. I have a Cloudflare one here since I used it for testing.

{
	"logging": {
		"logs": {
			"default": {
				"level": "DEBUG"
			}
		}
	},
	"apps": {
		"http": {
			"servers": {
				"srv0": {
					"listen": [
						":443"
					],
					"routes": [
						{
							"match": [
								{
									"host": [
										"example.com"
									]
								}
							],
							"handle": [
								{
									"handler": "subroute",
									"routes": [
										{
											"handle": [
												{
													"handler": "static_response",
													"body": "Hello there!"
												}
											]
										}
									]
								}
							],
							"terminal": true
						}
					]
				}
			}
		},
		"tls": {
			"dns": {
				"name": "cloudflare",
				"api_token": "{env.CLOUDFLARE_API_KEY}"
			},
			"encrypted_client_hello": {
				"configs": [
					{
						"outer_sni": "ech.example.net"
					}
				]
			}
		}
	}
}

This tells Caddy to serve a site (example.com) over HTTPS with automatically managed certificates, as usual. Because the TLS app has ECH enabled, it will use the global DNS module (cloudflare) to publish the config for the outer name (ech.example.net).

To test this, open Firefox and ensure DNS-over-HTTPS is enabled. Then go to about:networking#dns and "Clear DNS cache" to ensure your test has a clean slate.

When you run this config, wait 1-2 seconds (or until a certificate has been provisioned). Then, assuming an empty DNS cache, load your site (example.com) in Firefox; it will find the HTTPS-type DNS record and use it to employ ECH for the connection. Verify with Wireshark.

You can compile with this patch and a DNS plugin like so:

$ xcaddy build ech --with github.com/caddy-dns/cloudflare

All config APIs are subject to change. And no security guarantees at this time. Please try it out!! Thank you!

@mholt mholt linked an issue Feb 24, 2025 that may be closed by this pull request
@mholt mholt marked this pull request as draft February 24, 2025 21:56
@mholt
Member Author

mholt commented Feb 27, 2025

I've added Caddyfile support, for those hoping for an easier way to try it out. (See edited post above with an example.)

@gucci-on-fleek

(Sorry if these comments belong in a new issue/forum post instead)

I haven't tested this out yet, but it looks like the current implementation only sets the ech key in the HTTPS record and doesn't allow setting any additional keys/values:

_, err = dnsPub.provider.SetRecords(ctx, zone, []libdns.Record{
	{
		Type:     "HTTPS",
		Name:     libdns.RelativeName(domain+".", zone),
		Priority: 1, // allows a manual override with priority 0
		Target:   ".",
		Value:    echSvcParam(configListBin),
		TTL:      1 * time.Minute, // TODO: for testing only
	},
})

I'm currently using multiple other keys in my HTTPS records, so it would be nice if I could continue to do so while also supporting ECH.

I'm also currently using the caddy-l4 module to terminate the TLS for a DNS Server, so it would also be nice if SVCB records were supported as well. Other than the name and semantics for a resolving client, SVCB and HTTPS records are identical, so hopefully this shouldn't be too hard to add, although I'm not sure if it's in-scope here.

Also, 1 is the lowest valid non-alias priority for an HTTPS record, so the following comment isn't correct:

Priority: 1, // allows a manual override with priority 0

And this one is maybe better for a new issue, but if Caddy is setting HTTPS records, it might be a good idea for it to set the alpn and tls-supported-groups keys since it will always know the correct values for those.

@mholt
Member Author

mholt commented Feb 27, 2025

@gucci-on-fleek Thanks for the great feedback.

I haven't tested this out yet, but it looks like the current implementation only sets the ech key in the HTTPS record and doesn't allow setting any additional keys/values:
...
I'm currently using multiple other keys in my HTTPS records, so it would be nice if I could continue to do so while also supporting ECH.

That's good to know. Correct; I guess I first have to GetRecords() and then augment the value of the HTTPS RR.

I'm also currently using the caddy-l4 module to terminate the TLS for a DNS Server, so it would also be nice if SVCB records were supported as well. Other than the name and semantics for a resolving client, SVCB and HTTPS records are identical, so hopefully this shouldn't be too hard to add, although I'm not sure if it's in-scope here.

We can do that. When are SVCB RRs to be used versus HTTPS RRs? (Should this be user-configurable or should we just set both?)

Also, 1 is the lowest valid non-alias priority for an HTTPS record, so the following comment isn't correct:

Oops, thanks for catching that!

And this one is maybe better for a new issue, but if Caddy is setting HTTPS records, it might be a good idea for it to set the alpn and tls-supported-groups keys since it will always know the correct values for those.

I will look into it... that might be a little trickier.

@gucci-on-fleek

@mholt

(FYI, I'm just a hobbyist, so some of the things that I'm saying here might be wildly incorrect)

I haven't tested this out yet, but it looks like the current implementation only sets the ech key in the HTTPS record and doesn't allow setting any additional keys/values:
...
I'm currently using multiple other keys in my HTTPS records, so it would be nice if I could continue to do so while also supporting ECH.

That's good to know. Correct; I guess I first have to GetRecords() and then augment the value of the HTTPS RR.

The problem with just adding to an existing record is that automation tools like DNSControl, octoDNS, and Terraform DNS generally assume that they “own” any RRs that they set, so if you use them to set an HTTPS RR, they'll keep reverting Caddy's ech= addition. Or you can set them to ignore the HTTPS RR, but then you'd have no way to change any of its values in the future.

An alternative solution would be to support setting arbitrary key–value pairs from Caddy, but this doesn't quite seem right, since that would be both fairly complicated and completely out-of-scope for a web server. Or another option would be to just manually copy the ech= value into the automation tool, but that's essentially just key-pinning, which isn't really a good idea.

I actually can't think of any good solutions to this problem, but this is probably a fairly niche use case, so your solution of “GetRecords() and then augment the value” is probably good enough for most users.

I'm also currently using the caddy-l4 module to terminate the TLS for a DNS Server, so it would also be nice if SVCB records were supported as well. Other than the name and semantics for a resolving client, SVCB and HTTPS records are identical, so hopefully this shouldn't be too hard to add, although I'm not sure if it's in-scope here.

We can do that. When are SVCB RRs to be used versus HTTPS RRs? (Should this be user-configurable or should we just set both?)

It's not allowed to set a SVCB RR for HTTP/HTTPS, so SVCB RRs are really only relevant if you're using the l4 module. SVCB RRs are essentially a replacement for SRV RRs, so they always use the underscored “name prefixes” similar to how SRV RRs do. So something like the following should be valid:

;; HTTP
_http._tcp.www.example.com.  SRV    0  1  80  www.example.com.
; _http.www.example.com.     SVCB   1         www.example.com. ( port="80" )

;; HTTPS
_https._tcp.www.example.com.  SRV    0  1  443  www.example.com.
; _https.www.example.com.     SVCB   1          www.example.com. ( alpn="http/1.1" port="443" )
www.example.com.              HTTPS  1          www.example.com.

;; HTTP/3
_https._udp.www.example.com.  SRV    0  1  443  www.example.com.
; _https.www.example.com.     SVCB   1          www.example.com. ( alpn="h3" port="443" )
www.example.com.              HTTPS  1          www.example.com. ( alpn="h3" )

;; HTTPS, alternate port
_https._tcp.alt.example.com.   SRV    0  1  8443  alt.example.com.
; _https.alt.example.com.      SVCB   1           alt.example.com. ( alpn="http/1.1" port="8443" )
_8443._https.alt.example.com.  HTTPS  1           alt.example.com. ( port="8443" )

;; DNS
_dns._udp.ns.example.com.  SRV  0  1  53  ns.example.com.

;; DNS-over-TLS
_domain-s._tcp.ns.example.com.  SRV   0  1  853  ns.example.com.
_dns.ns.example.com.            SVCB  1          ns.example.com. ( alpn="dot" port="853" )

(The commented-out SVCB RRs would behave the same as the HTTPS RRs if they were valid.)

Anyways, I'd suggest Caddyfile syntax something like the following

# Long form
ech {
    # The hostname to use in the ECH wrapper connection.
    hostname "ech.example.com" # String, mandatory

    # The protocol name to use for the HTTPS/SVCB RR
    protocol "https"           # String, optional (defaults to "https")

    # Port is only needed if you're using a non-default port for that
    # specific protocol, and shouldn't be set otherwise.
    port     1234              # Integer, optional (defaults to null)
}

# Short form
ech ech.example.com # Maps to `ech { hostname "ech.example.com" }`

that behaves something like

ech_hostname = <ech.hostname Caddyfile value>
current_hostname = <the hostname for the domain where we're providing ECH>

if protocol == "https" or protocol == null then
    rr_type = "HTTPS"
    protocol = "https"
elseif protocol == "http" then
    -- There seems to be no valid way to set a SVCB record for HTTP, but
    -- even if there were, it wouldn't make sense for ECH
    throw "Invalid protocol"
else
    rr_type = "SVCB"
end

if protocol == "https" and (port == 443 or port == null) then
    -- target_hostname is where we'll create the DNS record
    target_hostname = current_hostname
elseif port == null then
    target_hostname = "_" + protocol + "." + current_hostname
else
    target_hostname = "_" + port + "._" + protocol + "." + current_hostname
end
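
For readers who prefer something compilable, the pseudocode above could be rendered in Go roughly as follows. This is only a sketch of the proposal as written: the function name recordPlacement is made up for the example, and a port of 0 stands in for "null" (not set).

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// recordPlacement returns the RR type and the owner name at which the
// record would be created. Hypothetical helper mirroring the pseudocode;
// port == 0 means "not set".
func recordPlacement(protocol string, port int, currentHostname string) (rrType, owner string, err error) {
	if protocol == "" {
		protocol = "https" // default protocol
	}
	switch protocol {
	case "https":
		rrType = "HTTPS"
	case "http":
		// No valid SVCB form for plain HTTP, and ECH would not apply anyway.
		return "", "", errors.New("invalid protocol")
	default:
		rrType = "SVCB"
	}

	switch {
	case protocol == "https" && (port == 0 || port == 443):
		// Default HTTPS port: the record lives at the hostname itself.
		owner = currentHostname
	case port == 0:
		owner = "_" + protocol + "." + currentHostname
	default:
		owner = "_" + strconv.Itoa(port) + "._" + protocol + "." + currentHostname
	}
	return rrType, owner, nil
}

func main() {
	for _, c := range []struct {
		protocol string
		port     int
		host     string
	}{
		{"", 0, "www.example.com"},
		{"https", 8443, "alt.example.com"},
		{"dns", 0, "ns.example.com"},
	} {
		rrType, owner, err := recordPlacement(c.protocol, c.port, c.host)
		fmt.Println(rrType, owner, err)
	}
}
```

The three cases in main correspond to the plain HTTPS, alternate-port HTTPS, and DNS-over-TLS rows of the zone example earlier in this comment.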

But since this only matters if you're using the l4 module, this shouldn't hold up the rest of the ECH support and can easily wait for later.

And this one is maybe better for a new issue, but if Caddy is setting HTTPS records, it might be a good idea for it to set the alpn and tls-supported-groups keys since it will always know the correct values for those.

I will look into it... that might be a little trickier.

👍, thanks.


This doesn't affect me, but another problem that I thought of is what would happen if you're running two Caddy servers? Right now, you can just set two different A/AAAA RRs and get fairly basic round-robin load balancing. Regular TLS certificates are fine since both servers can independently request certificates for the same domain, but for ECH, you'd need to either set two HTTPS RRs (which would need to point at two different domains, each having only 1 A/AAAA record) and then somehow tell Caddy which RR to update, or you'd need to coordinate sharing the ECH private key between them.

Also, thanks for the quick reply to my messages, and thanks for developing Caddy!

@mholt
Member Author

mholt commented Feb 28, 2025

Thanks for the great reply! There's a lot of really helpful information in there.

The problem with just adding to an existing record is that automation tools like DNSControl, octoDNS, and Terraform DNS generally assume that they “own” any RRs that they set, so if you use them to set an HTTPS RR, they'll keep reverting Caddy's ech= addition. Or you can set them to ignore the HTTPS RR, but then you'd have no way to change any of its values in the future.

Yep... I'm aware of those projects (having taken some inspiration from them when making libdns) - and it could be quite a problem. The "ownership" model is in conflict with this default approach.

However, this is the default approach, since most people are probably not using DNS ownership tools.

I actually can't think of any good solutions to this problem

I designed publishing to be modular, so there could be a third-party publisher module written for DNSControl, for example, which instructs DNSControl to publish the record, rather than Caddy publishing it directly. This would delegate control back to the "owner" software and should avoid problems.

Anyways, I'd suggest Caddyfile syntax something like the following

I like that a lot.

Since this PR is already going to be a lot of work, I might defer some of those advanced customization features for later, as you suggested; but I am almost done with the SvcParams parser that will at least help us avoid trampling existing HTTPS records.

@mholt
Member Author

mholt commented Feb 28, 2025

@gucci-on-fleek Okay, I've pushed a commit that I've tested that will augment existing HTTPS records and only overwrite the ech SvcParamKey, and will leave others intact. I'm not entirely confident the parser (and serializer) is 100% correct but it's pretty good I think.
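
To make the "only overwrite the ech SvcParamKey" idea concrete, here is a rough Go sketch of that kind of merge. It is not the parser from this PR: the type and function names are invented for the example, and it is deliberately simplified (it handles double-quoted values but none of the backslash escapes that the SVCB presentation format allows).

```go
package main

import (
	"fmt"
	"strings"
)

// svcParam is one key[=value] pair. Hypothetical type for illustration.
type svcParam struct {
	key, value string
	hasValue   bool
}

// parseSvcParams splits a presentation-format SvcParams string.
// Simplification: quotes are understood, backslash escapes are not.
func parseSvcParams(s string) []svcParam {
	var params []svcParam
	var tok strings.Builder
	inQuotes := false
	flush := func() {
		if tok.Len() == 0 {
			return
		}
		key, value, hasValue := strings.Cut(tok.String(), "=")
		params = append(params, svcParam{key, strings.Trim(value, `"`), hasValue})
		tok.Reset()
	}
	for _, r := range s {
		switch {
		case r == '"':
			inQuotes = !inQuotes
			tok.WriteRune(r)
		case (r == ' ' || r == '\t') && !inQuotes:
			flush() // whitespace outside quotes ends a pair
		default:
			tok.WriteRune(r)
		}
	}
	flush()
	return params
}

// setECH overwrites an existing ech value or appends one, leaving all
// other keys untouched.
func setECH(params []svcParam, echBase64 string) []svcParam {
	for i := range params {
		if params[i].key == "ech" {
			params[i].value, params[i].hasValue = echBase64, true
			return params
		}
	}
	return append(params, svcParam{"ech", echBase64, true})
}

// formatSvcParams serializes the pairs, quoting only values that
// contain whitespace.
func formatSvcParams(params []svcParam) string {
	parts := make([]string, 0, len(params))
	for _, p := range params {
		switch {
		case !p.hasValue:
			parts = append(parts, p.key)
		case strings.ContainsAny(p.value, " \t"):
			parts = append(parts, p.key+`="`+p.value+`"`)
		default:
			parts = append(parts, p.key+"="+p.value)
		}
	}
	return strings.Join(parts, " ")
}

func main() {
	existing := `alpn="h3,h2" ipv4hint=152.53.36.213`
	fmt.Println(formatSvcParams(setECH(parseSvcParams(existing), "AAAA")))
}
```

Here "AAAA" is a placeholder, not a real ECH config list. Keeping the pairs in a slice rather than a map preserves the order the operator originally wrote.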

@gucci-on-fleek

@mholt

Okay, I've pushed a commit that I've tested that will augment existing HTTPS records and only overwrite the ech SvcParamKey, and will leave others intact. I'm not entirely confident the parser (and serializer) is 100% correct but it's pretty good I think.

Alright, I've deployed the latest commit, and I've confirmed with Wireshark that ECH is working as expected. But it only works half of the time since instead of replacing the old HTTPS RRs, Caddy is duplicating each HTTPS RR and then adding the ech= value to only one of them:

$ dig +nostats +nocmd  @ns.maxchernoff.ca. www.maxchernoff.ca HTTPS
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 4086
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;www.maxchernoff.ca.		IN	HTTPS

;; ANSWER SECTION:
www.maxchernoff.ca.	29	IN	HTTPS	1 . alpn="h3,h2" ipv4hint=152.53.36.213 ech=AE3+DQBJRQAgACBUJfUrC9bM8kOoM2P22kiTSBcSSllK6HJDlnF/X0AoewAMAAEAAQABAAIAAQADIhJlY2gubWF4Y2hlcm5vZmYuY2EAAA== ipv6hint=2a0a:4cc0:2000:172::1
www.maxchernoff.ca.	29	IN	HTTPS	1 . alpn="h3,h2" ipv4hint=152.53.36.213 ipv6hint=2a0a:4cc0:2000:172::1
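
As a way to sanity-check an ech= value like the one in the first answer above, the following Go sketch base64-decodes it and reads the leading fields of the first ECHConfig. The field layout assumed here is the one from the TLS Encrypted Client Hello specification (version 0xfe0d); the type and function names are made up for the example, and the sketch stops after the public name rather than parsing extensions or further configs.

```go
package main

import (
	"encoding/base64"
	"encoding/binary"
	"errors"
	"fmt"
)

var errTruncated = errors.New("truncated ECHConfigList")

// reader consumes bytes from the front of buf and remembers the first error.
type reader struct {
	buf []byte
	err error
}

func (r *reader) take(n int) []byte {
	if r.err != nil || len(r.buf) < n {
		r.err = errTruncated
		return make([]byte, n) // zero filler so callers can keep going
	}
	b := r.buf[:n]
	r.buf = r.buf[n:]
	return b
}

func (r *reader) u8() uint8   { return r.take(1)[0] }
func (r *reader) u16() uint16 { return binary.BigEndian.Uint16(r.take(2)) }

// echSummary holds a few fields of the first ECHConfig (hypothetical type).
type echSummary struct {
	Version    uint16
	ConfigID   uint8
	KEMID      uint16
	PublicName string
}

// parseFirstECHConfig decodes the base64 ech= value and summarizes the
// first config in the list.
func parseFirstECHConfig(b64 string) (echSummary, error) {
	var out echSummary
	raw, err := base64.StdEncoding.DecodeString(b64)
	if err != nil {
		return out, err
	}
	r := &reader{buf: raw}
	r.u16()               // ECHConfigList length
	out.Version = r.u16() // ECHConfig version
	r.u16()               // ECHConfig length
	if r.err != nil {
		return out, r.err
	}
	if out.Version != 0xfe0d {
		return out, fmt.Errorf("unsupported ECHConfig version 0x%04x", out.Version)
	}
	out.ConfigID = r.u8()
	out.KEMID = r.u16()
	r.take(int(r.u16())) // HPKE public key
	r.take(int(r.u16())) // cipher suites
	r.u8()               // maximum_name_length
	out.PublicName = string(r.take(int(r.u8())))
	return out, r.err
}

func main() {
	const ech = "AE3+DQBJRQAgACBUJfUrC9bM8kOoM2P22kiTSBcSSllK6HJDlnF/X0AoewAMAAEAAQABAAIAAQADIhJlY2gubWF4Y2hlcm5vZmYuY2EAAA=="
	s, err := parseFirstECHConfig(ech)
	if err != nil {
		panic(err)
	}
	fmt.Printf("version=0x%04x config_id=0x%02x kem_id=0x%04x public_name=%s\n",
		s.Version, s.ConfigID, s.KEMID, s.PublicName)
}
```

For the value shown in the dig output, this reports config_id 0x45 and the public (outer) name ech.maxchernoff.ca.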

And even weirder, it only seems to have set the ech= key for some of the subdomains, pretty much chosen at random. I'm also seeing lots of log lines like

2025/03/01 11:13:20.512	ERROR	tls	unable to publish ECH data to HTTPS DNS record	{"domain": "mta-sts.maxchernoff.ca", "error": "invalid record mta-sts: dns: bad SVCB priority: \"ipv4hint=152.53.36.213\" at line: 1:43"}

which probably has something to do with what's happening. I've attached the Caddy debug logs and a dump of the entire DNS zone, but let me know if you need anything else.

@mholt
Member Author

mholt commented Mar 1, 2025

@gucci-on-fleek Hm, that may be a bug in the libdns package you're using (RFC2136 in this case) -- I had to patch the libdns/cloudflare package to properly support HTTPS records, since their Value field is a conglomerate of other fields like Priority and Target.

A couple years ago I also had to extend the libdns package to support SRV records for similar reasons.

In your case it looks like SetRecords is doing more like what AppendRecords is supposed to do.

Most libdns packages were made for ACME transactions which only Append and then Delete records; whereas ECH uses the other two methods, Get and Set. All packages are ideally supposed to implement all 4 interfaces (and from what I've observed, many do.) The cloudflare package properly did implement all 4, but still had to be patched slightly due to HTTPS records' structured Value field.
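
The difference between the two behaviors can be shown with a small in-memory model. This Go sketch does not use the real libdns interfaces; the record and zone types are invented for the example, and "set" is modeled as "replace everything with the same name and type", which is the behavior the duplicated HTTPS records above suggest was expected.

```go
package main

import "fmt"

// record and zone are hypothetical stand-ins, not libdns types.
type record struct{ Name, Type, Value string }

type zone struct{ records []record }

// appendRecords adds records without touching what is already there.
func (z *zone) appendRecords(recs ...record) {
	z.records = append(z.records, recs...)
}

// setRecords drops every existing record that shares a (Name, Type)
// pair with one of the inputs, then adds the inputs.
func (z *zone) setRecords(recs ...record) {
	replacing := make(map[[2]string]bool)
	for _, r := range recs {
		replacing[[2]string{r.Name, r.Type}] = true
	}
	var kept []record
	for _, r := range z.records {
		if !replacing[[2]string{r.Name, r.Type}] {
			kept = append(kept, r)
		}
	}
	z.records = append(kept, recs...)
}

func main() {
	old := record{"www", "HTTPS", `1 . alpn="h3,h2"`}
	updated := record{"www", "HTTPS", `1 . alpn="h3,h2" ech=AAAA`}

	appended := &zone{records: []record{old}}
	appended.appendRecords(updated)
	fmt.Println("append:", len(appended.records), "HTTPS records") // old and new coexist

	set := &zone{records: []record{old}}
	set.setRecords(updated)
	fmt.Println("set:", len(set.records), "HTTPS record") // only the new one remains
}
```

With append semantics the resolver sees both records and clients pick one at random, which matches the "only works half of the time" symptom reported above.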

Anyway, that might be worth opening a bug report for in libdns/rfc2136.

I'd be curious if you have any domains at Cloudflare, whether it works for you.

@gucci-on-fleek

gucci-on-fleek commented Mar 4, 2025

@mholt

Hm, that may be a bug in the libdns package you're using (RFC2136 in this case)

[...]

In your case it looks like SetRecords is doing more like what AppendRecords is supposed to do.

Good catch, that definitely seems to be what's happening:

https://github.com/libdns/rfc2136/blob/5ee7f48743922d811ad6daf336345ea4bb059eaf/provider.go#L69-L71

I'll try and submit a PR to libdns/rfc2136 to fix that later this week when I have time.

Successfully merging this pull request may close these issues.

Support Encrypted Client Hello (formerly known as ESNI)