-
Notifications
You must be signed in to change notification settings - Fork 917
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add internal-frontend role #3706
Conversation
common/config/config.go
Outdated
@@ -561,6 +577,12 @@ func (c *Config) Validate() error { | |||
return err | |||
} | |||
|
|||
switch c.PublicClient.ForceTLSConfig { | |||
case ForceTLSConfigDefault, ForceTLSConfigInternode, ForceTLSConfigFrontend: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is default (empty string) mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It means use "internode" if there is an internal-frontend section, otherwise use "frontend". Maybe I should name it ForceTLSConfigAuto?
I really don't think anyone should ever have to use this, it's just for completeness.
if hasIFE && cfg.PublicClient.HostPort == "" { | ||
forceTLS = config.ForceTLSConfigInternode | ||
} else { | ||
forceTLS = config.ForceTLSConfigFrontend |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if hasIFE && cfg.PublicClient.HostPort != "", then it will use public frontend config?
It means if publicClient.hostPort is configured it will be highest priority, and it will use public frontend, the internal-frontend will be ignored. Shall we error out if internal-frontend is configured with publicClient.hostPort?
We need clear documentation about this behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. I considered making that combination fail validation but thought maybe we should trust the user. But maybe failing on that would avoid some user error.
I'm trying to think of any reason you might want to have an internal-frontend config section present but also have an explicit hostPort... maybe someone doesn't like the membership-based resolver and wants to use k8s dns or something like that?
I think I added && cfg.PublicClient.HostPort == ""
because I considered the case where someone explicitly put membership://frontend
as HostPort. But on second thought that config doesn't really make sense either. Yeah, I'll simplify this.
// Frontend controls SDK Client to Frontend communication TLS settings. | ||
// Frontend controls frontend server TLS settings. To control system worker -> frontend | ||
// TLS, use the SystemWorker field. (Frontend.Client is accepted for backwards | ||
// compatibility.) | ||
Frontend GroupTLS `yaml:"frontend"` | ||
// SystemWorker controls TLS setting for System Workers connecting to Frontend. | ||
SystemWorker WorkerTLS `yaml:"systemWorker"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we have inner node frontend, do we still need the system worker?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made a bunch more commits on a branch where I ripped out SystemWorker
, GetFrontendClientConfig
, and everything that touches those. Total diff stats: 11 files changed, 57 insertions(+), 365 deletions(-)
. And we could clean up even more. I'd like to push that through some day. However, in the interest of a smooth 1.20 release, I agree with Yimin that we should be backwards compatible for now, i.e. any existing config should not change behavior, we should only add new options.
I will note that this addition of SystemWorker (as a more flexible alternative to Frontend.Client) never made it to the docs, or docker/config_template.yaml. So probably only a very small number of users are using it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 I think we should wait, and deprecate-remove SystemWorker at a later point.
// we should remove "temporal-system" from here. Handling of call with | ||
// no namespace will need to be performed at the API level, so that data would | ||
// be filtered based of caller's permissions to namespaces and system. | ||
if target.Namespace == "temporal-system" || target.Namespace == "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question for reviewers: I'm confident that this handles the issues around temporal-system
. But this default authorizer also allowed all empty-namespace calls through. Now, those will require claims.System >= RoleReader
. This makes logical sense: System
is the field for stuff that applies outside of namespaces. The question is might this break anyone's expectations (in a way that's hard to fix)?
// Frontend controls SDK Client to Frontend communication TLS settings. | ||
// Frontend controls frontend server TLS settings. To control system worker -> frontend | ||
// TLS, use the SystemWorker field. (Frontend.Client is accepted for backwards | ||
// compatibility.) | ||
Frontend GroupTLS `yaml:"frontend"` | ||
// SystemWorker controls TLS setting for System Workers connecting to Frontend. | ||
SystemWorker WorkerTLS `yaml:"systemWorker"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made a bunch more commits on a branch where I ripped out SystemWorker
, GetFrontendClientConfig
, and everything that touches those. Total diff stats: 11 files changed, 57 insertions(+), 365 deletions(-)
. And we could clean up even more. I'd like to push that through some day. However, in the interest of a smooth 1.20 release, I agree with Yimin that we should be backwards compatible for now, i.e. any existing config should not change behavior, we should only add new options.
I will note that this addition of SystemWorker (as a more flexible alternative to Frontend.Client) never made it to the docs, or docker/config_template.yaml. So probably only a very small number of users are using it.
common/config/config.go
Outdated
@@ -561,6 +577,12 @@ func (c *Config) Validate() error { | |||
return err | |||
} | |||
|
|||
switch c.PublicClient.ForceTLSConfig { | |||
case ForceTLSConfigDefault, ForceTLSConfigInternode, ForceTLSConfigFrontend: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It means use "internode" if there is an internal-frontend section, otherwise use "frontend". Maybe I should name it ForceTLSConfigAuto?
I really don't think anyone should ever have to use this, it's just for completeness.
if hasIFE && cfg.PublicClient.HostPort == "" { | ||
forceTLS = config.ForceTLSConfigInternode | ||
} else { | ||
forceTLS = config.ForceTLSConfigFrontend |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. I considered making that combination fail validation but thought maybe we should trust the user. But maybe failing on that would avoid some user error.
I'm trying to think of any reason you might want to have an internal-frontend config section present but also have an explicit hostPort... maybe someone doesn't like the membership-based resolver and wants to use k8s dns or something like that?
I think I added && cfg.PublicClient.HostPort == ""
because I considered the case where someone explicitly put membership://frontend
as HostPort. But on second thought that config doesn't really make sense either. Yeah, I'll simplify this.
), nil | ||
} | ||
|
||
func getFrontendConnectionDetails( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note for reviewers: I'm pretty unhappy about the existence of this function: tlsConfigProvider already has a whole bunch of switching logic, and now I'm adding another layer of switching on top of that. It would seem better to integrate this logic into tlsConfigProvider.
But the problem is that this needs the whole config (for PublicClient and Services) and tlsConfigProvider only gets the TLS section. Also, this deals with grpc endpoints and TLS config in a linked way, where tlsConfigProvider deals only with TLS. So it makes some sense to keep it separate.
@@ -357,10 +366,14 @@ namespaceDefaults: | |||
state: "disabled" | |||
URI: "file:///tmp/temporal_vis_archival/development" | |||
|
|||
{{- if or (.Env.USE_INTERNAL_FRONTEND) (and (not .Env.TEMPORAL_AUTH_AUTHORIZER) (not .Env.TEMPORAL_AUTH_CLAIM_MAPPER)) }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question for reviewers: is this logic sufficient? The idea is that if you're using internal-frontend, then you just use membership to find that and you're good. If using external frontend only, but you have no authorizer and claim mapper (no security or maybe using an external validating proxy or something), then you can use membership to find frontend and it'll work (even with TLS).
Otherwise you're on your own and you can set PUBLIC_FRONTEND_ADDRESS
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic looks good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
// Membership resolver (new in 1.18): Leave this empty, and other nodes will use the | ||
// membership service resolver to find the frontend. | ||
// TODO: remove this and always use membership resolver | ||
// PublicClient is the config for internal nodes (history/matching/worker) connecting to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, does history/matching ever need to connect to frontend?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe, not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. But it would be good to let @sergeybykov take a look.
// Membership resolver (new in 1.18): Leave this empty, and other nodes will use the | ||
// membership service resolver to find the frontend. | ||
// TODO: remove this and always use membership resolver | ||
// PublicClient is the config for internal nodes (history/matching/worker) connecting to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe, not.
@@ -357,10 +366,14 @@ namespaceDefaults: | |||
state: "disabled" | |||
URI: "file:///tmp/temporal_vis_archival/development" | |||
|
|||
{{- if or (.Env.USE_INTERNAL_FRONTEND) (and (not .Env.TEMPORAL_AUTH_AUTHORIZER) (not .Env.TEMPORAL_AUTH_CLAIM_MAPPER)) }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic looks good to me.
What changed?
internal-frontend
internal-frontend
(using internode TLS) if present, elsefrontend
(using frontend TLS)temporal-system
special case in the default authorizerWhy?
The server worker needs to make connections to the frontend with effective "admin" access (to all namespaces). With pluggable claim mappers and authorizers, we can't control the logic in them, and requiring all users to set up mTLS with special claims is a large burden. Adding internal frontends as an alternative is easier and more reliable.
How did you test it?
Updated unit tests, lots of manual testing with https://github.com/temporalio/samples-server/tree/main/tls/tls-full
Potential risks
Users who are currently using the default authorizer and relying on the
temporal-system
special case will need to switch their setup to use internal frontends (but this is good, since the special case could be a security hole). Other configurations that I haven't thought of could also break.Overall, this creates more configuration options, which increases the support burden. Eventually I'd like to eliminate most of them and require internal frontends (with internode TLS if used), to reduce the number of options and make things even simpler than they were before this change, but that can come later.
Is hotfix candidate?