[Bug]: Ryuk reaper container not properly re-initialized after it got terminated #764

Closed
mdonkers opened this issue Jan 17, 2023 · 2 comments · Fixed by #782
Labels
bug An issue with the library

Comments

@mdonkers
Contributor

Testcontainers version

0.15.0

Using the latest Testcontainers version?

Yes

Host OS

Linux

Host arch

x86

Go version

1.19

Docker version

$ docker version
Client: Docker Engine - Community
 Version:           23.0.0-rc.1
 API version:       1.42
 Go version:        go1.19.4
 Git commit:        139e924
 Built:             Thu Dec 22 23:37:13 2022
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          23.0.0-rc.1
  API version:      1.42 (minimum version 1.12)
  Go version:       go1.19.4
  Git commit:       cba986b
  Built:            Thu Dec 22 23:37:13 2022
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.14
  GitCommit:        9ba4b250366a5ddde94bb7c9d1def331423aa323
 runc:
  Version:          1.1.4
  GitCommit:        v1.1.4-0-g5fd4c4d
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

Docker info

$ docker info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.9.1
    Path:     /home/miel/.docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.14.2
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 2
  Running: 0
  Paused: 0
  Stopped: 2
 Images: 25
 Server Version: 23.0.0-rc.1
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 9ba4b250366a5ddde94bb7c9d1def331423aa323
 runc version: v1.1.4-0-g5fd4c4d
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 6.0.0-6-amd64
 Operating System: Debian GNU/Linux bookworm/sid
 OSType: linux
 Architecture: x86_64
 CPUs: 16
 Total Memory: 31.09GiB
 Name: housepaper
 ID: TKCM:VE5M:466R:AMPV:XPP3:CZ45:BZRN:N6WX:YR4E:ZBR5:VYDD:5SWD
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Username: miel
 Registry: https://index.docker.io/v1/
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

What happened?

The test fails with the following error:

failed to create container: connecting to reaper failed: Connecting to Ryuk on localhost:32787 failed: dial tcp [::1]:32787: connect: connection refused

I noticed that the failing tests always ran after one specific other test, which itself succeeded. The only difference was that this preceding test did not use a Testcontainers container.

Digging deeper, I noticed that the reaper container is normally reused once it has been initialized:

	// If reaper already exists re-use it
	if reaper != nil {
		return reaper, nil
	}

The reaper was reused for my failing test as well, but the subsequent attempt to connect to it failed:

conn, err := net.DialTimeout("tcp", r.Endpoint, 10*time.Second)

While moby-ryuk runs, it stays active only as long as it has at least one open connection. Once all connections are lost, it shuts down after a fixed timeout of 10 seconds, which cannot be configured:
https://github.com/testcontainers/moby-ryuk/blob/8f512d37699cb6e5a3b28f2433c2b3b894a78e5d/main.go#L152

So this is what happens in my situation:

  • Several integration tests run, each creating its own container but reusing the Ryuk reaper container (they run in close succession, within the 10s timeout)
  • The integration test without a test container runs. It takes about 20-30 seconds, so the reaper container shuts down in the meantime
  • The next integration test tries to create a container again, but aborts because it cannot connect to the 'existing' reaper container.

The code should detect whether the Ryuk container is actually still running and, if it is not, create a new one instead of trying to reuse a container that no longer exists.

Relevant log output

No response

Additional information

No response

@mdonkers mdonkers added the bug An issue with the library label Jan 17, 2023
@mdelapenya
Member

Hi @mdonkers, thanks for opening this issue, which I think is related to #258. In that PR, the code initialises the singleton properly. I'm going to prioritise that PR in order to fix the reaper initialisation. Besides that, I think checking whether the Ryuk container is present could be part of the checks. Do you feel comfortable sending a PR for that?

@mdonkers
Contributor Author

Hi @mdelapenya,
It's indeed somewhat related. However, in my case the initial initialisation succeeded, but the container had become invalid later on.
I can try creating a PR this week, including #258. I will link it here once I have something.
