Commit 71f9fea

committed
docs: auth blog post
1 parent 02788af commit 71f9fea

10 files changed

+368
-20
lines changed

TODO.md

+6-13
@@ -1,20 +1,13 @@
11
# TODO
22

33
- Panic handling needs some help. The server doesn't shut down, which is good -
4-
but it also doesn't disconnect, or tell the user anything - which is bad.
5-
There's also zero info output on panic using the dev dashboard.
4+
but it doesn't tell the user anything - which is bad. There's also zero info
5+
output on panic using the dev dashboard.
66

77
- Implement multi-cluster.
88

99
## Documentation
1010

11-
- Getting started needs help, in particular:
12-
- Granting your user should probably go before the install instructions.
13-
- Say something about the error when you don't have authorization.
14-
- Make the getting started on a real cluster instructions more clear. In
15-
particular, it seems like it is a little difficult to see the install commands
16-
and realize that's what you need to use.
17-
1811
## Authorization
1912

2013
- Groups are probably what most users are going to want to use to configure all
@@ -53,6 +46,9 @@
5346
- Does it make sense to do the `nsenter` trick for some use cases? This
5447
requires privileged mode to work.
5548

49+
- This is waiting on the next release of russh as `handle.open_channel_agent`
50+
just landed.
51+
5652
- There's some kind of lag happening when scrolling aggressively (aka, holding
5753
down a cursor). It goes fine for ~10 items and then has a hitch in the
5854
rendering.
@@ -77,16 +73,13 @@
7773
- Dashboard as a struct doesn't really make sense anymore, it should likely be
7874
converted over to a simple function.
7975

80-
- The initial coalesce in `Apex` is a little weird because of the initial
81-
loading screen - feels like it is jumping a couple frames.
82-
8376
- Move YAML over to viewport. Should viewport be doing syntax highlighting by
8477
default? How do we do a viewport over a set of lines that require history to
8578
do highlighting?
8679

8780
- There's a bug somewhere in `log_stream`. My k3d cluster restarted and while I
8881
could get all the logs, the stream wouldn't keep running - it'd terminate
89-
immediately. `stern` seemed to be working fine. Recreating the cluster caused
82+
immediately. `stern` seemed to be working fine. Recreating the cluster causedx
9083
everything to work again.
9184

9285
- Move over to something like

docs/components/blog.tsx

+65
@@ -0,0 +1,65 @@
1+
import Link from 'next/link'
2+
import { getPagesUnderRoute } from 'nextra/context'
3+
import { useRouter } from 'next/router'
4+
import clsx from 'clsx'
5+
import { FrontMatter, MdxFile } from 'nextra'
6+
7+
const Blog = () => {
8+
const { asPath } = useRouter()
9+
10+
const linkClasses = [
11+
'border',
12+
'border-zinc-200',
13+
'dark:border-[#414141]',
14+
'p-8',
15+
'lg:p-12',
16+
'bg-white',
17+
'dark:bg-neutral-800',
18+
'rounded-none',
19+
'hover:!border-primary',
20+
'hover:dark:bg-neutral-700/50',
21+
'hover:border-violet-300',
22+
'hover:shadow-2xl',
23+
'hover:shadow-primary/10',
24+
'dark:shadow-none',
25+
'transition-colors',
26+
'flex',
27+
'flex-col',
28+
]
29+
30+
const items = getPagesUnderRoute('/blog').map(
31+
({ route, frontMatter }: MdxFile) => {
32+
const { title, byline, date } = frontMatter as FrontMatter
33+
34+
return (
35+
<Link key={route} href={route} className={clsx(linkClasses)}>
36+
<div className="font-extrabold text-xl md:text-3xl text-balance">
37+
{title}
38+
</div>
39+
<div className="opacity-50 text-sm my-7 flex gap-2">
40+
<time dateTime={date.toISOString()}>
41+
{date.toLocaleDateString('en', {
42+
month: 'long',
43+
day: 'numeric',
44+
year: 'numeric',
45+
})}
46+
</time>
47+
<span className="border-r border-gray-500" />
48+
<span>by {byline}</span>
49+
</div>
50+
<span className="text-primary block font-bold mt-auto">
51+
Read more →
52+
</span>
53+
</Link>
54+
)
55+
},
56+
)
57+
58+
return (
59+
<div className="container grid md:grid-cols-2 gap-7 pb-10 pt-10">
60+
{items}
61+
</div>
62+
)
63+
}
64+
65+
export default Blog

docs/components/index.tsx

Whitespace-only changes.

docs/package.json

+12-4
@@ -25,17 +25,25 @@
2525
"@iconify-json/material-symbols": "^1.2.1",
2626
"@iconify-json/mdi": "^1.2.0",
2727
"@theguild/remark-mermaid": "^0.1.2",
28-
"next": "^14.2.9",
28+
"clsx": "^2.1.1",
29+
"next": "^14.2.12",
2930
"next-themes": "^0.3.0",
3031
"nextra": "alpha",
3132
"nextra-theme-docs": "alpha",
32-
"posthog-js": "^1.161.3",
33+
"posthog-js": "^1.162.0",
3334
"react": "^18.3.1",
34-
"react-dom": "^18.3.1"
35+
"react-dom": "^18.3.1",
36+
"remark-frontmatter": "^5.0.0"
3537
},
3638
"devDependencies": {
37-
"@types/node": "^22.5.4",
39+
"@types/mdast": "^4.0.4",
40+
"@types/mdx": "^2.0.13",
41+
"@types/node": "^22.5.5",
42+
"@types/react": "^18.3.8",
3843
"autoprefixer": "^10.4.20",
44+
"eslint-plugin-mdx": "^3.1.5",
45+
"eslint-plugin-prettier": "^5.2.1",
46+
"eslint-plugin-tailwindcss": "^3.17.4",
3947
"postcss": "^8.4.45",
4048
"tailwindcss": "^3.4.11",
4149
"typescript": "^5.6.2"

docs/pages/_meta.js

+11
@@ -4,6 +4,17 @@ export default {
44
breadcrumb: false,
55
},
66
},
7+
blog: {
8+
type: 'page',
9+
title: 'Blog',
10+
theme: {
11+
layout: 'raw',
12+
typesetting: 'article',
13+
timestamp: false,
14+
breadcrumb: true,
15+
pagination: false,
16+
},
17+
},
718
index: {
819
title: 'Overview',
920
display: 'hidden',

docs/pages/blog.mdx

+3
@@ -0,0 +1,3 @@
1+
import Blog from 'components/blog'
2+
3+
<Blog />
@@ -0,0 +1,225 @@
1+
---
2+
title: 'Stop Making Kubernetes Auth Hard'
3+
date: 2024-09-19
4+
byline: Thomas Rampelberg
5+
---
6+
7+
I've spent most of my time working with Kubernetes being afraid of auth. I
8+
understood how RBAC works and I knew that `.kube/config` is what's required to
9+
talk to an API server, but that's pretty much where my understanding stopped.
10+
Configuring the API server to use an auth plugin, getting tokens or certificates
11+
and setting up plugins made me think that it was all a monumental task. Just
12+
getting the environment set up correctly for myself _was_ a monumental task.
13+
Well, as part of implementing kty's OAuth support, I've been forced to figure
14+
out how it all actually works. And, as it turns out, it doesn't need to be nearly
15+
as complex as I thought it was.
16+
17+
## TL;DR
18+
19+
Use OpenID and grant groups or users the correct permissions in your cluster.
20+
Your organization already has an OpenID provider in place. Google, GitHub, Okta
21+
(and many more) can all be used. That's it, that's all you need. Don't bother
22+
with IAM, service accounts or any of that other stuff. Those are all reasonable
23+
for machines - not for users.
24+
25+
Take it from the folks over at Robinhood. Karen Tu and Sujith Katakam took their
26+
existing complexity and simplified it down to OpenID. The result was a system
27+
that is easier to maintain and keep secure. They've got a [KubeCon
28+
talk][robinhood-talk] that walks you through their journey and is well worth
29+
watching.
30+
31+
If you'd like to know how to do this yourself, jump down to
32+
[the instructions](#how-do-i-do-this-myself). However, I'd recommend reading
33+
through the rest of this post and demystifying what's going on behind the
34+
scenes. It is a good way to contextualize how all the pieces fit together.
35+
36+
[robinhood-talk]: https://youtu.be/aBUGtu-venk?si=KvF3H8PANOxeFwzl
37+
38+
## Authentication
39+
40+
Let's start out by splitting "auth" into two parts: [authentication][authn] and
41+
[authorization][authz]. Authentication is how you prove who you are. The result
42+
of the authentication process is an identity that can be used to see what you
43+
are, or aren't, authorized to do. If we didn't actually care about verifying your
44+
identity, authentication could be nothing more than sending the username in
45+
cleartext to the API server. Obviously, we'd like a solution that is a little
46+
bit more secure than that.
47+
48+
[authn]: https://en.wikipedia.org/wiki/Authentication
49+
[authz]: https://en.wikipedia.org/wiki/Authorization
50+
51+
Kubernetes has a [whole bunch][auth-plugins] of ways to authenticate. Because it
52+
is the easiest to understand, let's start with the static token file. This is
53+
equivalent to having a password. You put the token (aka "password") into the
54+
file and then associate it with a username. If this sounds like `/etc/passwd`,
55+
that's because it is! Each request sent to the API server contains your token as
56+
a header. The API server looks up the token in its file and maps that to a user
57+
or set of groups. Very similar to sending the username to the API server, but
58+
now we've got a piece of shared data, the token, that verifies the identity.
59+
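To make the lookup concrete, here is a minimal sketch of what the static token check boils down to. It assumes the documented token file layout (`token,user,uid,"group1,group2"`) and a `Bearer` authorization header; the names `parseTokenLine`, `authenticate` and the sample token are made up for illustration and are not the API server's actual code.

```typescript
type TokenEntry = { token: string; user: string; uid: string; groups: string[] };

// Parse one line of the form: token,user,uid,"group1,group2"
// (the quoted group column is optional).
function parseTokenLine(line: string): TokenEntry {
  const match = /^([^,]+),([^,]+),([^,]+)(?:,"?([^"]*)"?)?$/.exec(line.trim());
  if (!match) throw new Error("malformed token line: " + line);
  const [, token = "", user = "", uid = "", groupList = ""] = match;
  return { token, user, uid, groups: groupList ? groupList.split(",") : [] };
}

// Map an `Authorization: Bearer <token>` header value to an identity, or
// undefined when the token is unknown.
function authenticate(entries: TokenEntry[], authorization: string): TokenEntry | undefined {
  const prefix = "Bearer ";
  if (authorization.indexOf(prefix) !== 0) return undefined;
  const presented = authorization.slice(prefix.length);
  return entries.filter((entry) => entry.token === presented)[0];
}

// Hypothetical token file contents with a single user.
const tokenFile: TokenEntry[] = [
  's3cr3t-token,me@my-domain.com,1001,"my-team"',
].map((line) => parseTokenLine(line));

console.log(authenticate(tokenFile, "Bearer s3cr3t-token")?.user);
```

Anyone who learns the token can impersonate the user, which is exactly the weakness of a pre-shared secret.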
60+
OpenID Connect (OIDC) gets rid of the pre-shared secret and instead uses some
61+
[cryptography magic][pki] to do the same thing. This allows for identity to be
62+
created in a central location (a provider) and subsequently verified by anyone.
63+
When you authenticate with an OIDC provider, the end result of the process is an
64+
[ID token][oidc-id-token].
65+
66+
[pki]: https://en.wikipedia.org/wiki/Public-key_cryptography
67+
68+
The ID token is a [JSON web token][jwt] (JWT) that contains a bunch of
69+
information about your identity. The information in this token is effectively
70+
key/value pairs that are called "claims". Each claim is a piece of data that the
71+
provider has verified.
72+
73+
The token is signed using the private key of the provider and can be verified by
74+
anyone with the public key. Most importantly, OIDC providers publish their
75+
configuration so that anyone can verify the token. If you're interested in
76+
what's in that configuration, check it out for the [default
77+
provider][oidc-config] in kty.
78+
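As a rough illustration of the sign-with-private, verify-with-public idea, the sketch below signs and checks an RS256 token using Node's built-in `crypto` module. The key pair is generated locally as a stand-in for the provider's keys (a real provider publishes its public keys at the `jwks_uri` listed in its configuration), and `signToken` and `verifyToken` are hypothetical helper names. A real verifier would also check claims like `iss`, `aud` and `exp`.

```typescript
import { generateKeyPairSync, sign, verify } from "node:crypto";

// Stand-in for the provider's key pair. Only the provider holds privateKey;
// publicKey is what everyone else would fetch from the provider.
const { publicKey, privateKey } = generateKeyPairSync("rsa", { modulusLength: 2048 });

const encode = (value: object): string =>
  Buffer.from(JSON.stringify(value)).toString("base64url");

// What the provider does: sign "header.payload" with its private key.
function signToken(claims: object): string {
  const signingInput = encode({ alg: "RS256", typ: "JWT" }) + "." + encode(claims);
  const signature = sign("sha256", Buffer.from(signingInput), privateKey);
  return signingInput + "." + signature.toString("base64url");
}

// What anyone with the public key can do: check that the signature matches.
function verifyToken(token: string): boolean {
  const [header, payload, signature] = token.split(".");
  if (!header || !payload || !signature) return false;
  return verify(
    "sha256",
    Buffer.from(header + "." + payload),
    publicKey,
    Buffer.from(signature, "base64url"),
  );
}

const idToken = signToken({ sub: "github|123456", email: "me@my-domain.com" });
console.log(verifyToken(idToken));
```

Swapping the payload for different claims while keeping the old signature makes `verifyToken` return `false`, which is why the API server can trust the claims without ever sharing a secret with the provider.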
79+
[jwt]: https://jwt.io/introduction
80+
81+
With an ID token and the way to verify it in hand, the API server can extract an
82+
identity from the token and use that as part of RBAC to understand what you're
83+
allowed to do. The association between the token and either groups or users
84+
happens as part of a claim. If you've got a JWT, you can see the claims in your
85+
token by going to [jwt.io](https://jwt.io) and pasting it in. Here's a token
86+
that I've gotten for kty:
87+
88+
```json
89+
{
90+
"iss": "https://kty.us.auth0.com/",
91+
"aud": "P3g7SKU42Wi4Z86FnNDqfiRtQRYgWsqx",
92+
"iat": 1726784050,
93+
"exp": 1726820050,
94+
"sub": "github|123456",
95+
"email": "me@my-domain.com"
96+
}
97+
```
98+
99+
For this token, we could configure the API server to map the `email` claim to a
100+
user. This is just like the token file from above! Instead of using the
101+
pre-shared secret as the mapping, we've used the public key from the OIDC
102+
provider.
103+
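Here is a small sketch of that mapping step: decode the payload of a JWT and pick the username out of a configurable claim, much like the API server's `--oidc-username-claim` flag does. The helper names `decodeClaims` and `usernameFrom` are hypothetical, the demo token is unsigned (`sig` is a placeholder), and the decoder assumes an ASCII payload. Decoding is not verifying, so only do this after the signature has been checked.

```typescript
type Claims = Record<string, unknown>;

// Decode (but do NOT verify) the payload segment of a JWT.
function decodeClaims(jwt: string): Claims {
  const segments = jwt.split(".");
  if (segments.length !== 3) throw new Error("expected header.payload.signature");
  // Convert base64url to base64, then restore the padding that JWTs strip.
  let b64 = (segments[1] || "").replace(/-/g, "+").replace(/_/g, "/");
  while (b64.length % 4 !== 0) b64 += "=";
  return JSON.parse(atob(b64));
}

// Pick the identity out of the configured claim, defaulting to `email`.
function usernameFrom(claims: Claims, usernameClaim: string = "email"): string {
  const value = claims[usernameClaim];
  if (typeof value !== "string" || value === "") {
    throw new Error("token has no usable '" + usernameClaim + "' claim");
  }
  return value;
}

// Unsigned demo token carrying two of the claims from the example above.
const demoPayload = btoa(JSON.stringify({ sub: "github|123456", email: "me@my-domain.com" }));
const demoToken =
  "e30." + demoPayload.replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "") + ".sig";

console.log(usernameFrom(decodeClaims(demoToken))); // prints "me@my-domain.com"
```

The string that comes out is what has to match the `name` of a `kind: User` subject in a role binding.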
104+
[auth-plugins]:
105+
https://kubernetes.io/docs/reference/access-authn-authz/authentication/
106+
[oidc-id-token]: https://auth0.com/docs/secure/tokens/id-tokens
107+
[oidc-config]: https://kty.us.auth0.com/.well-known/openid-configuration
108+
[rbac]: https://kubernetes.io/docs/reference/access-authn-authz/rbac/
109+
110+
## Authorization
111+
112+
Here's where it gets interesting. Now that we have a verified identity,
113+
authorization can take place. We'll check a list of rules (or roles) and test
114+
whether the identity can do the action requested. Kubernetes' [role based access
115+
control system][rbac] doesn't care about how you authenticated. If the API server says
116+
you're a user - then you are that user. All it cares about is your identity and
117+
what roles that identity is bound to. Let's look at a simple role:
118+
119+
```yaml
120+
apiVersion: rbac.authorization.k8s.io/v1
121+
kind: ClusterRole
122+
metadata:
123+
name: view
124+
rules:
125+
- apiGroups:
126+
- ''
127+
resources:
128+
- pods
129+
verbs:
130+
- get
131+
- list
132+
- watch
133+
```
134+
135+
Any identity that is bound to this role can get, list or watch pods in any
136+
namespace. How does an identity get associated with this role? That's where the
137+
`ClusterRoleBinding` comes into play.
138+
139+
```yaml
140+
apiVersion: rbac.authorization.k8s.io/v1
141+
kind: ClusterRoleBinding
142+
metadata:
143+
name: view
144+
roleRef:
145+
apiGroup: rbac.authorization.k8s.io
146+
kind: ClusterRole
147+
name: view
148+
subjects:
149+
- apiGroup: rbac.authorization.k8s.io
150+
kind: User
151+
name: me@my-domain.com
152+
```
153+
154+
Assuming that we're still talking about the token from above, this role binding
155+
associates all the permissions in the `view` role with the user
156+
`me@my-domain.com`. That's it! We've authenticated the identity and then
157+
verified that it can do some actions on the cluster. As RBAC is opt-in, you
158+
start off with no permissions and need to be granted them to do anything. There
159+
are some policies that come by default. In fact the `view` cluster role is one
160+
that comes out of the box (but simplified in this example). To see what can be
161+
granted, make sure to check out the [documentation][rbac].
162+
163+
For extra credit, you can also bind roles to groups. We can configure a claim
164+
from the JWT to be a group in addition to the email address. Imagine granting
165+
permissions on a cluster based on which teams a user is a part of. In fact, you
166+
can map almost anything from someone's GitHub profile directly over to a group.
167+
This way, you can set up permissions once and manage membership entirely through
168+
your OIDC provider. When using groups, the role binding ends up looking a little
169+
different:
170+
171+
```yaml
172+
apiVersion: rbac.authorization.k8s.io/v1
173+
kind: ClusterRoleBinding
174+
metadata:
175+
name: view
176+
roleRef:
177+
apiGroup: rbac.authorization.k8s.io
178+
kind: ClusterRole
179+
name: view
180+
subjects:
181+
- apiGroup: rbac.authorization.k8s.io
182+
kind: Group
183+
name: my-team
184+
```
185+
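If it helps to see the check as code, here is a deliberately simplified sketch of how a role and its bindings turn into an allow or deny decision. It only models cluster-wide roles, `User` and `Group` subjects and the `*` wildcard; the type and function names are hypothetical and this is not how the real authorizer is implemented.

```typescript
type Subject = { kind: "User" | "Group"; name: string };
type PolicyRule = { apiGroups: string[]; resources: string[]; verbs: string[] };
type ClusterRole = { name: string; rules: PolicyRule[] };
type ClusterRoleBinding = { roleRef: string; subjects: Subject[] };
type Identity = { user: string; groups: string[] };
type AccessRequest = { apiGroup: string; resource: string; verb: string };

// A rule field matches when it lists the wanted value or the "*" wildcard.
const matchesAny = (values: string[], wanted: string): boolean =>
  values.indexOf("*") !== -1 || values.indexOf(wanted) !== -1;

// Allowed when some binding names the identity (directly or through a group)
// and the bound role has a rule covering the request. Everything else is denied.
function isAllowed(
  identity: Identity,
  request: AccessRequest,
  roles: ClusterRole[],
  bindings: ClusterRoleBinding[],
): boolean {
  return bindings.some((binding) => {
    const bound = binding.subjects.some((subject) =>
      subject.kind === "User"
        ? subject.name === identity.user
        : identity.groups.indexOf(subject.name) !== -1,
    );
    if (!bound) return false;
    return roles
      .filter((role) => role.name === binding.roleRef)
      .some((role) =>
        role.rules.some(
          (rule) =>
            matchesAny(rule.apiGroups, request.apiGroup) &&
            matchesAny(rule.resources, request.resource) &&
            matchesAny(rule.verbs, request.verb),
        ),
      );
  });
}

// The `view` role and the group binding from the YAML above.
const clusterRoles: ClusterRole[] = [
  { name: "view", rules: [{ apiGroups: [""], resources: ["pods"], verbs: ["get", "list", "watch"] }] },
];
const roleBindings: ClusterRoleBinding[] = [
  { roleRef: "view", subjects: [{ kind: "Group", name: "my-team" }] },
];

const teamMember: Identity = { user: "me@my-domain.com", groups: ["my-team"] };
console.log(isAllowed(teamMember, { apiGroup: "", resource: "pods", verb: "list" }, clusterRoles, roleBindings));
```

Because nothing is allowed unless a rule says so, asking for `delete` on pods with the same identity comes back `false`.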
186+
[rbac]: https://kubernetes.io/docs/reference/access-authn-authz/rbac/
187+
188+
## How do I do this myself?
189+
190+
We've implemented OIDC support directly into kty. This means that you can `ssh`
191+
into your cluster without managing any SSH keys. You're presented with a login
192+
screen that goes through OIDC to verify your identity. That identity uses the
193+
`email` claim by default to map an external identity onto a `kind: User` defined
194+
in your role bindings. Check out the [getting started guide](/getting-started)
195+
and see how simple OIDC can make accessing your cluster.
196+
197+
To get OIDC working directly with `kubectl`, you'll want to check out
198+
[kubelogin](https://github.com/int128/kubelogin?tab=readme-ov-file). It is a
199+
plugin that will do the OIDC dance for you. Add the plugin and your cluster's
200+
connection information to `~/.kube/config` and you're good to go. Note that if
201+
you can't make the modifications required for the API server, you'll want to use
202+
an [oidc-proxy](https://github.com/TremoloSecurity/kube-oidc-proxy). Luckily,
203+
most Kubernetes solutions (like [EKS][eks-oidc] or [GKE][gke-oidc]) support OIDC
204+
out of the box.
205+
206+
[eks-oidc]:
207+
https://docs.aws.amazon.com/eks/latest/userguide/authenticate-oidc-identity-provider.html
208+
[gke-oidc]: https://cloud.google.com/kubernetes-engine/docs/how-to/oidc
209+
210+
## Bringing it Together
211+
212+
So, what does this all mean? Well, it means that we've now got a central
213+
location to manage access to our cluster. If you're using groups, membership
214+
when the token is granted is mapped to a role binding that grants exactly what
215+
someone needs to work with your cluster. The IDs can be user friendly, so you
216+
can read through the `RoleBinding` YAML to understand what's allowed or not. If
217+
you're using `kty`, you don't even need any plugins or configuration! Your users
218+
can use `ssh` and immediately get access to the cluster.
219+
220+
Please don't be afraid of auth! Don't continue to use incredibly complex systems
221+
consisting of multiple plugins, webhooks, tokens and certificates. They're all
222+
hard to set up and/or easy to break. After all, security everyone can follow is
223+
the best security. Say no to services that require blanket permissions like the
224+
Kubernetes dashboard. Use OIDC and make sure that users have exactly the
225+
permissions they need.
