Skip to content
This repository was archived by the owner on Dec 7, 2019. It is now read-only.

Tests fail in CI with -race flag #44

Closed
raulk opened this issue Mar 1, 2019 · 2 comments
Closed

Tests fail in CI with -race flag #44

raulk opened this issue Mar 1, 2019 · 2 comments
Assignees

Comments

@raulk
Copy link
Member

raulk commented Mar 1, 2019

Even though there's a race, the race detector somehow provokes a test failure by panic:

*** go test -v -race -coverprofile=coverage.eaafb277f32e515d.txt -covermode=atomic github.com/libp2p/go-tcp-transport
go: downloading github.com/whyrusleeping/go-smux-multiplex v3.0.16+incompatible
go: downloading github.com/libp2p/go-mplex v0.0.1
go: downloading github.com/libp2p/go-buffer-pool v0.0.1
=== RUN   TestTcpTransport
=== RUN   TestTcpTransport/github.com/libp2p/go-libp2p-transport/test.SubtestProtocols
=== RUN   TestTcpTransport/github.com/libp2p/go-libp2p-transport/test.SubtestBasic
=== RUN   TestTcpTransport/github.com/libp2p/go-libp2p-transport/test.SubtestCancel
=== RUN   TestTcpTransport/github.com/libp2p/go-libp2p-transport/test.SubtestPingPong
=== RUN   TestTcpTransport/github.com/libp2p/go-libp2p-transport/test.SubtestStress1Conn1Stream1Msg
=== RUN   TestTcpTransport/github.com/libp2p/go-libp2p-transport/test.SubtestStress1Conn1Stream100Msg
=== RUN   TestTcpTransport/github.com/libp2p/go-libp2p-transport/test.SubtestStress1Conn100Stream100Msg
=== RUN   TestTcpTransport/github.com/libp2p/go-libp2p-transport/test.SubtestStress50Conn10Stream50Msg
=== RUN   TestTcpTransport/github.com/libp2p/go-libp2p-transport/test.SubtestStress1Conn1000Stream10Msg
=== RUN   TestTcpTransport/github.com/libp2p/go-libp2p-transport/test.SubtestStress1Conn100Stream100Msg10MB
=== RUN   TestTcpTransport/github.com/libp2p/go-libp2p-transport/test.SubtestStreamOpenStress
==================
WARNING: DATA RACE
Write at 0x00c0000eeb43 by goroutine 854:
  testing.tRunner.func1()
      /home/travis/.gimme/versions/go1.11.5.linux.amd64/src/testing/testing.go:818 +0x355
  runtime.call32()
      /home/travis/.gimme/versions/go1.11.5.linux.amd64/src/runtime/asm_amd64.s:522 +0x3a
  testing.(*common).Fatal()
      /home/travis/.gimme/versions/go1.11.5.linux.amd64/src/testing/testing.go:628 +0x7c
  github.com/libp2p/go-libp2p-transport/test.SubtestStreamOpenStress()
      /home/travis/gopath/pkg/mod/github.com/libp2p/go-libp2p-transport@v0.0.1/test/stream.go:355 +0x5aa
  github.com/libp2p/go-libp2p-transport/test.SubtestTransport.func1()
      /home/travis/gopath/pkg/mod/github.com/libp2p/go-libp2p-transport@v0.0.1/test/utils.go:41 +0x11e
  testing.tRunner()
      /home/travis/.gimme/versions/go1.11.5.linux.amd64/src/testing/testing.go:827 +0x162
Previous read at 0x00c0000eeb43 by goroutine 789:
  testing.(*common).Fail()
      /home/travis/.gimme/versions/go1.11.5.linux.amd64/src/testing/testing.go:545 +0x9d
  testing.(*common).Error()
      /home/travis/.gimme/versions/go1.11.5.linux.amd64/src/testing/testing.go:616 +0x7c
  github.com/libp2p/go-libp2p-transport/test.fullClose()
      /home/travis/gopath/pkg/mod/github.com/libp2p/go-libp2p-transport@v0.0.1/test/stream.go:52 +0x1b5
  github.com/libp2p/go-libp2p-transport/test.SubtestStreamOpenStress.func3.1.1()
      /home/travis/gopath/pkg/mod/github.com/libp2p/go-libp2p-transport@v0.0.1/test/stream.go:322 +0x76
Goroutine 854 (running) created at:
  testing.(*T).Run()
      /home/travis/.gimme/versions/go1.11.5.linux.amd64/src/testing/testing.go:878 +0x659
  github.com/libp2p/go-libp2p-transport/test.SubtestTransport()
      /home/travis/gopath/pkg/mod/github.com/libp2p/go-libp2p-transport@v0.0.1/test/utils.go:40 +0x14c
  github.com/libp2p/go-tcp-transport.TestTcpTransport()
      /home/travis/gopath/src/github.com/libp2p/go-tcp-transport/tcp_test.go:25 +0xa4
  testing.tRunner()
      /home/travis/.gimme/versions/go1.11.5.linux.amd64/src/testing/testing.go:827 +0x162
Goroutine 789 (running) created at:
  github.com/libp2p/go-libp2p-transport/test.SubtestStreamOpenStress.func3.1()
      /home/travis/gopath/pkg/mod/github.com/libp2p/go-libp2p-transport@v0.0.1/test/stream.go:320 +0xc0
==================
=== RUN   TestTcpTransport/github.com/libp2p/go-libp2p-transport/test.SubtestStreamReset
panic: Fail in goroutine after TestTcpTransport/github.com/libp2p/go-libp2p-transport/test.SubtestStreamOpenStress has completed
goroutine 32895 [running]:
testing.(*common).Fail(0xc0000eeb00)
	/home/travis/.gimme/versions/go1.11.5.linux.amd64/src/testing/testing.go:546 +0x1c6
testing.(*common).Error(0xc0000eeb00, 0xc000a41f58, 0x1, 0x1)
	/home/travis/.gimme/versions/go1.11.5.linux.amd64/src/testing/testing.go:616 +0x7d
github.com/libp2p/go-libp2p-transport/test.fullClose(0xc0000eeb00, 0xc7e4a0, 0xc0009abc80)
	/home/travis/gopath/pkg/mod/github.com/libp2p/go-libp2p-transport@v0.0.1/test/stream.go:52 +0x1b6
github.com/libp2p/go-libp2p-transport/test.SubtestStreamOpenStress.func3.1.1(0xc000026dc0, 0xc0000eeb00, 0xc7e4a0, 0xc0009abc80)
	/home/travis/gopath/pkg/mod/github.com/libp2p/go-libp2p-transport@v0.0.1/test/stream.go:322 +0x77
created by github.com/libp2p/go-libp2p-transport/test.SubtestStreamOpenStress.func3.1
	/home/travis/gopath/pkg/mod/github.com/libp2p/go-libp2p-transport@v0.0.1/test/stream.go:320 +0xc1
FAIL	github.com/libp2p/go-tcp-transport	27.014s

(race flag was enabled in transports using the test suite herein: libp2p/go-tcp-transport#34 libp2p/go-ws-transport#40)

@raulk
Copy link
Member Author

raulk commented Mar 1, 2019

Ok, so the RACE is an artefact of the test (it's not in the code under test itself).

These lines both invoke the t *testing.T struct at the same time, leading to a data race.

go-libp2p-transport@v0.0.1/test/stream.go:52
  ^ error during fullClose()
  ^ invoked in post-test cleanup (defer)
go-libp2p-transport@v0.0.1/test/stream.go:355
  ^ since fullClose never ends, the waitgroup is never done, and a timeout fires,
    which also attempts to fail the test

@raulk
Copy link
Member Author

raulk commented Mar 1, 2019

Ok, found the issue. The affected test (SubtestStreamOpenStress) has a 10 second timeout. The race checker adds considerable overhead, sending the runtime of these tests way above 10 seconds. As a result, the test times out and begins cleanup while the workers are still running on separate goroutines. Increasing the timeout to 30 seconds was enough. I'll increase to 1 minute to be on the safe side, though. We know these tests succeed, so the tradeoff of waiting a minute for a real failure seems fair to me.

@raulk raulk transferred this issue from libp2p/go-tcp-transport Mar 1, 2019
@raulk raulk closed this as completed in #45 Mar 1, 2019
@ghost ghost assigned raulk Mar 1, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant