Update Executor interface for Run and Exec #1563

Merged: 4 commits, Jul 13, 2020

Conversation

coryb (Collaborator) commented Jul 9, 2020

First step for #749: refactor the Executor interface to allow for Run (pid1) vs Exec'ing into running containers. Implements the proposal outlined in #749 (comment).

Signed-off-by: Cory Bennett <cbennett@netflix.com>
@coryb force-pushed the executor-exec-interface branch from 2122ce9 to 6d58121 on July 9, 2020 23:40
		Meta: executor.Meta{
			Args: []string{"ps", "-o", "pid,comm"},
		},
		Stdout: &nopCloser{stdout},
Member:

Can we have a test that also does stdin?
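
For reference, a minimal sketch of what such a test might look like, reusing the ProcessInfo and nopCloser wiring from this diff (the surrounding setup and the testExecStdin name are illustrative, not from the PR):

func testExecStdin(t *testing.T, ctx context.Context, w executor.Executor, id string) {
	stdin := bytes.NewBufferString("hello exec\n")
	stdout := &bytes.Buffer{}

	// echo stdin back via cat and assert it round-trips unchanged
	err := w.Exec(ctx, id, executor.ProcessInfo{
		Meta: executor.Meta{
			Args: []string{"cat"},
			Cwd:  "/",
		},
		Stdin:  ioutil.NopCloser(stdin),
		Stdout: &nopCloser{stdout},
	})
	require.NoError(t, err)
	require.Equal(t, "hello exec\n", stdout.String())
}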

	require.NoError(t, err, fmt.Sprintf("stdout=%q, stderr=%q", stdout.String(), stderr.String()))
	require.Equal(t, string(selfCmdline), stdout.String())
}

func TestRuncWorkerExec(t *testing.T) {
Member:

Could we reuse the same tests for runc/containerd? E.g. a shared package with a test function that takes an executor interface instance.

coryb (Collaborator Author):

Okay, will look into sharing some testing code to dedupe.

Member:

Just the test(s) themselves. Don't try to dedupe the setup logic.
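
For example, something shaped like this could live in a shared package (package and function names here are illustrative, not from the PR):

// Package tests holds executor-agnostic test bodies. Each worker's test
// file keeps its own setup and simply passes in the executor instance.
package tests

import (
	"context"
	"testing"

	"github.com/moby/buildkit/executor"
)

// RunCommonTests is an illustrative entry point: each worker test calls it
// with a factory that builds a fully set-up executor for that worker.
func RunCommonTests(t *testing.T, ctx context.Context, newExecutor func(t *testing.T) executor.Executor) {
	t.Run("run", func(t *testing.T) { testRun(t, ctx, newExecutor(t)) })
	t.Run("exec", func(t *testing.T) { testExec(t, ctx, newExecutor(t)) })
}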


	err = eg.Wait()
	// we expect pid1 to get canceled after we test the exec
	require.EqualError(t, errors.Cause(err), "context canceled")
Member:

nit: context.Canceled.Error()
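
i.e. presumably:

	require.EqualError(t, errors.Cause(err), context.Canceled.Error())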

type ProcessInfo struct {
	Meta           Meta
	Stdin          io.ReadCloser
	Stdout, Stderr io.WriteCloser
Member:

I guess the TODO in here is Resize <-chan Size. Can be in the next PR.
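
For reference, a sketch of where that field would land (the Size type here is an assumption, not part of this PR):

// hypothetical follow-up shape for ProcessInfo
type Size struct {
	Width, Height int
}

type ProcessInfo struct {
	Meta           Meta
	Stdin          io.ReadCloser
	Stdout, Stderr io.WriteCloser
	Resize         <-chan Size // terminal resize events for tty sessions
}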

@@ -310,6 +315,62 @@ func (w *runcExecutor) Exec(ctx context.Context, meta executor.Meta, root cache.
return nil
}

func (w *runcExecutor) Exec(ctx context.Context, id string, process executor.ProcessInfo) error {
	// first verify the container is running, if we get an error assume the container
Member:

Could we do some sync with a Go mutex/cond/channel in here first? E.g. detect that Run() has been called for the ID and has not returned yet. I guess there is still a race while the runc binary is being invoked that needs to be checked with retries, but the first check, and reacting to pid1 returning, would be fast then. Also, maybe Run() should take a channel that is closed to indicate a started event? Ideally, when pid1 returns while this method is running, all the errors returned in here would be about pid1 closing, and not, for example, that an open fails because the state file has been deleted.

coryb (Collaborator Author):

Yeah, I struggled a bit with race conditions between pid1 and exec. I think a channel passed to Run seems reasonable. Since the runc.Run call is blocking, I can set up a short goroutine to poll for the "running" state before sending on the channel, to avoid the ~200ms delay between runc.Run being called and the container actually running. I will try to rework it a bit to avoid the Exec retry loop.
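
A rough sketch of that polling goroutine (hypothetical; go-runc's State call exists, but the exact wiring here is an assumption):

// poll runc state until pid1 reports "running", then signal the caller;
// this runs alongside the blocking runc.Run call.
go func() {
	for {
		select {
		case <-ctx.Done():
			return
		case <-time.After(10 * time.Millisecond):
		}
		state, err := w.runc.State(ctx, id)
		if err != nil {
			continue // container may not be created yet
		}
		if state.Status == "running" {
			close(started)
			return
		}
	}
}()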

Member:

Something like

Run(id string, started chan<- struct{}) (err error) {
  done := make(chan error, 1)

  e.mu.Lock()
  e.m[id] = done
  e.mu.Unlock()

  defer func() {
    e.mu.Lock()
    delete(e.m, id)
    e.mu.Unlock()
    done <- err
  }()

  close(started)
}

Exec(id string) (err error) {
  e.mu.Lock()
  done, ok := e.m[id]
  e.mu.Unlock()
  if !ok {
    return error404
  }

  defer func() {
    select {
    case err1 := <-done:
      err = errors.Wrapf(err1, "pid1 returned with error")
    default:
    }
  }()
}

?

I still think a retry is needed as well, because of the way we call runc without getting the create event.

coryb (Collaborator Author):

Makes sense, I will update.

	}
	if len(process.Meta.Env) > 0 {
		// merge exec env with pid1 env
		spec.Process.Env = append(spec.Process.Env, process.Meta.Env...)
Member:

I don't think we want to merge.

coryb (Collaborator Author):

Just to confirm: if PATH is set on pid1, and on a subsequent Exec we set just TERM, we expect the Exec to run without any PATH set?

Member:

PATH is kind of special; e.g. we have https://github.com/moby/buildkit/blob/master/solver/llbsolver/ops/exec.go#L728 and should probably do the same for exec. Do you see a case where exec would not know pid1's env when it wants to reuse some of it and can't just resend it?
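
i.e., on the exec path, something like this hedged sketch (system.DefaultPathEnv is buildkit's util/system default; applying it here is the suggestion, not existing code):

// if the exec's env does not define PATH, fall back to the default
// instead of inheriting pid1's env wholesale
hasPath := false
for _, e := range process.Meta.Env {
	if strings.HasPrefix(e, "PATH=") {
		hasPath = true
		break
	}
}
if !hasPath {
	process.Meta.Env = append(process.Meta.Env, "PATH="+system.DefaultPathEnv)
}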

coryb (Collaborator Author):

No, that seems fine to me. I was just not sure if we expected the clients to resend the entire env on exec.

Member:

It seems safer this way: the client has full control when they don't want the same env.

update run/exec tests for stdin and expected failures
move common tests for runc and containerd to shared tests package

Signed-off-by: Cory Bennett <cbennett@netflix.com>
coryb (Collaborator Author) commented Jul 10, 2020

@tonistiigi thanks for the help with the review. I think I have addressed all your issues; hopefully I captured what you wanted.

Looks like one test failed:
https://travis-ci.org/github/moby/buildkit/jobs/707044698#L2818

Not sure if this could be related to my PR or if the tests are sometimes flaky.


 	resolvConf, err := oci.GetResolvConf(ctx, w.root, nil, w.dnsConfig)
 	if err != nil {
-		return err
+		return sendErr(done, err)
Member:

This is better if wrapped in a single defer so there is no risk that wrapping was forgotten.

coryb (Collaborator Author):

Sure, I can do that. I tend to avoid that pattern myself since I have always found the unintentional redefinition of err to be problematic and harder to find, i.e.:

func Exec(...) (err error) {
  defer func() {
    done <- err
  }()
  resolvConf, err := oci.GetResolvConf(ctx, w.root, nil, w.dnsConfig)
  if err != nil {
    return err
  }
}

would send nil on done if GetResolvConf fails.

Member:

> would send nil on done if GetResolvConf fails.

No, it does not :)

coryb (Collaborator Author):

Hah, whoa, TIL. I tried it on the playground just to sanity-check; I must have misinterpreted the results, so much for sanity :(
I redid the test and of course you are correct, it works fine: https://play.golang.org/p/TvbDMw6q5Z2. I would have sworn I ran into this issue a while ago; maybe older Go versions had this issue? Dunno. Anyway, yeah, that will make the change much easier :)

tonistiigi (Member), Jul 11, 2020:

maybe this is what you were thinking https://play.golang.org/p/WJKt7EDGJYD

coryb (Collaborator Author):

I have no idea, maybe I ran into something like you suggest. I tried on an old go1.2 runtime and it was fine there also. So I guess I have just been wrong for years. Oh well, glad you set me straight :)
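
For anyone following along, the distinction the playground links demonstrate, as one standalone example: := at the top level of a function body reuses a named return err, while := inside a nested block shadows it.

package main

import (
	"errors"
	"fmt"
)

func fail() (int, error) { return 0, errors.New("boom") }

func topLevel() (err error) {
	defer func() { fmt.Println("deferred sees:", err) }() // prints "boom"
	v, err := fail() // same scope as the named return: err is reused
	_ = v
	return err
}

func nested() (err error) {
	defer func() { fmt.Println("deferred sees:", err) }() // prints <nil>
	if true {
		v, err := fail() // new err shadows the named return
		_, _ = v, err
	}
	return err
}

func main() {
	topLevel()
	nested()
}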

@coryb force-pushed the executor-exec-interface branch from 4f2fe1a to 36636d3 on July 11, 2020 00:04
Signed-off-by: Cory Bennett <cbennett@netflix.com>
@coryb force-pushed the executor-exec-interface branch from 36636d3 to 5909d16 on July 11, 2020 01:14
coryb (Collaborator Author) commented Jul 11, 2020

I think I addressed all the issues for round 2; let me know if you find more.

tonistiigi (Member) left a comment:

LGTM

@hinshun Do you want to verify this as well?

hinshun (Collaborator) commented Jul 11, 2020

@tonistiigi Yes, I'm out on a trip at the moment, but I'll give this a review tomorrow morning.

hinshun (Collaborator) left a comment:

Looks good overall, small nits.


	proc := spec.Process

	// TODO how do we get rootfsPath for oci.GetUser in case user passed in username rather than uid:gid?
Collaborator:

Do we need to fix this now? It looks like there is some existing code that deals with errors returned from oci.ParseUIDGID(meta.User) here:

uid, gid, err := oci.ParseUIDGID(meta.User)
if err != nil {
	lm := snapshot.LocalMounterWithMounts(rootMounts)
	rootfsPath, err := lm.Mount()
	if err != nil {
		return err
	}
	uid, gid, sgids, err = oci.GetUser(ctx, rootfsPath, meta.User)

coryb (Collaborator Author):

Yeah, the problem is that we need the rootfsPath to call oci.GetUser, and I could not figure out how to access it via the containerd APIs. It was available when creating the container, but I could not determine it after creation. We could track it in the same map where we track the done channel, and then I could use it later at Exec time?
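
Something like this, perhaps (illustrative shape only, not code from the PR):

// track what Run() set up so Exec() can find it later
type runningContainer struct {
	done       chan error
	rootfsPath string
}

// in Run(), guarded by w.mu:
//   w.running[id] = &runningContainer{done: done, rootfsPath: rootfsPath}
// in Exec(), look up w.running[id] to get both the done channel and rootfsPath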

 type Executor interface {
-	// TODO: add stdout/err
-	Exec(ctx context.Context, meta Meta, rootfs cache.Mountable, mounts []Mount, stdin io.ReadCloser, stdout, stderr io.WriteCloser) error
+	Run(ctx context.Context, id string, rootfs cache.Mountable, mounts []Mount, process ProcessInfo, started chan<- struct{}) error
Collaborator:

Can we add some godocs and details about the usage of started?
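
e.g. something along these lines (the wording is a suggestion, not the merged doc):

type Executor interface {
	// Run starts a container for the given process as pid1, with the given
	// rootfs and mounts, and blocks until it exits. If started is non-nil it
	// is closed once the container has actually started, which gates Exec.
	Run(ctx context.Context, id string, rootfs cache.Mountable, mounts []Mount, process ProcessInfo, started chan<- struct{}) error
	// Exec starts an additional process inside the container id, which must
	// already be running via Run, and blocks until that process exits.
	Exec(ctx context.Context, id string, process ProcessInfo) error
}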

Signed-off-by: Cory Bennett <cbennett@netflix.com>
hinshun (Collaborator) commented Jul 13, 2020

The only change is godocs, so I'll go ahead and merge this now.
