Tests fail in CI with -race flag #44

raulk · 2019-03-01T17:50:07Z

Even though there's a race, the race detector somehow provokes a test failure by panic:

*** go test -v -race -coverprofile=coverage.eaafb277f32e515d.txt -covermode=atomic github.com/libp2p/go-tcp-transport
go: downloading github.com/whyrusleeping/go-smux-multiplex v3.0.16+incompatible
go: downloading github.com/libp2p/go-mplex v0.0.1
go: downloading github.com/libp2p/go-buffer-pool v0.0.1
=== RUN   TestTcpTransport
=== RUN   TestTcpTransport/github.com/libp2p/go-libp2p-transport/test.SubtestProtocols
=== RUN   TestTcpTransport/github.com/libp2p/go-libp2p-transport/test.SubtestBasic
=== RUN   TestTcpTransport/github.com/libp2p/go-libp2p-transport/test.SubtestCancel
=== RUN   TestTcpTransport/github.com/libp2p/go-libp2p-transport/test.SubtestPingPong
=== RUN   TestTcpTransport/github.com/libp2p/go-libp2p-transport/test.SubtestStress1Conn1Stream1Msg
=== RUN   TestTcpTransport/github.com/libp2p/go-libp2p-transport/test.SubtestStress1Conn1Stream100Msg
=== RUN   TestTcpTransport/github.com/libp2p/go-libp2p-transport/test.SubtestStress1Conn100Stream100Msg
=== RUN   TestTcpTransport/github.com/libp2p/go-libp2p-transport/test.SubtestStress50Conn10Stream50Msg
=== RUN   TestTcpTransport/github.com/libp2p/go-libp2p-transport/test.SubtestStress1Conn1000Stream10Msg
=== RUN   TestTcpTransport/github.com/libp2p/go-libp2p-transport/test.SubtestStress1Conn100Stream100Msg10MB
=== RUN   TestTcpTransport/github.com/libp2p/go-libp2p-transport/test.SubtestStreamOpenStress
==================
WARNING: DATA RACE
Write at 0x00c0000eeb43 by goroutine 854:
  testing.tRunner.func1()
      /home/travis/.gimme/versions/go1.11.5.linux.amd64/src/testing/testing.go:818 +0x355
  runtime.call32()
      /home/travis/.gimme/versions/go1.11.5.linux.amd64/src/runtime/asm_amd64.s:522 +0x3a
  testing.(*common).Fatal()
      /home/travis/.gimme/versions/go1.11.5.linux.amd64/src/testing/testing.go:628 +0x7c
  github.com/libp2p/go-libp2p-transport/test.SubtestStreamOpenStress()
      /home/travis/gopath/pkg/mod/github.com/libp2p/go-libp2p-transport@v0.0.1/test/stream.go:355 +0x5aa
  github.com/libp2p/go-libp2p-transport/test.SubtestTransport.func1()
      /home/travis/gopath/pkg/mod/github.com/libp2p/go-libp2p-transport@v0.0.1/test/utils.go:41 +0x11e
  testing.tRunner()
      /home/travis/.gimme/versions/go1.11.5.linux.amd64/src/testing/testing.go:827 +0x162
Previous read at 0x00c0000eeb43 by goroutine 789:
  testing.(*common).Fail()
      /home/travis/.gimme/versions/go1.11.5.linux.amd64/src/testing/testing.go:545 +0x9d
  testing.(*common).Error()
      /home/travis/.gimme/versions/go1.11.5.linux.amd64/src/testing/testing.go:616 +0x7c
  github.com/libp2p/go-libp2p-transport/test.fullClose()
      /home/travis/gopath/pkg/mod/github.com/libp2p/go-libp2p-transport@v0.0.1/test/stream.go:52 +0x1b5
  github.com/libp2p/go-libp2p-transport/test.SubtestStreamOpenStress.func3.1.1()
      /home/travis/gopath/pkg/mod/github.com/libp2p/go-libp2p-transport@v0.0.1/test/stream.go:322 +0x76
Goroutine 854 (running) created at:
  testing.(*T).Run()
      /home/travis/.gimme/versions/go1.11.5.linux.amd64/src/testing/testing.go:878 +0x659
  github.com/libp2p/go-libp2p-transport/test.SubtestTransport()
      /home/travis/gopath/pkg/mod/github.com/libp2p/go-libp2p-transport@v0.0.1/test/utils.go:40 +0x14c
  github.com/libp2p/go-tcp-transport.TestTcpTransport()
      /home/travis/gopath/src/github.com/libp2p/go-tcp-transport/tcp_test.go:25 +0xa4
  testing.tRunner()
      /home/travis/.gimme/versions/go1.11.5.linux.amd64/src/testing/testing.go:827 +0x162
Goroutine 789 (running) created at:
  github.com/libp2p/go-libp2p-transport/test.SubtestStreamOpenStress.func3.1()
      /home/travis/gopath/pkg/mod/github.com/libp2p/go-libp2p-transport@v0.0.1/test/stream.go:320 +0xc0
==================
=== RUN   TestTcpTransport/github.com/libp2p/go-libp2p-transport/test.SubtestStreamReset
panic: Fail in goroutine after TestTcpTransport/github.com/libp2p/go-libp2p-transport/test.SubtestStreamOpenStress has completed
goroutine 32895 [running]:
testing.(*common).Fail(0xc0000eeb00)
	/home/travis/.gimme/versions/go1.11.5.linux.amd64/src/testing/testing.go:546 +0x1c6
testing.(*common).Error(0xc0000eeb00, 0xc000a41f58, 0x1, 0x1)
	/home/travis/.gimme/versions/go1.11.5.linux.amd64/src/testing/testing.go:616 +0x7d
github.com/libp2p/go-libp2p-transport/test.fullClose(0xc0000eeb00, 0xc7e4a0, 0xc0009abc80)
	/home/travis/gopath/pkg/mod/github.com/libp2p/go-libp2p-transport@v0.0.1/test/stream.go:52 +0x1b6
github.com/libp2p/go-libp2p-transport/test.SubtestStreamOpenStress.func3.1.1(0xc000026dc0, 0xc0000eeb00, 0xc7e4a0, 0xc0009abc80)
	/home/travis/gopath/pkg/mod/github.com/libp2p/go-libp2p-transport@v0.0.1/test/stream.go:322 +0x77
created by github.com/libp2p/go-libp2p-transport/test.SubtestStreamOpenStress.func3.1
	/home/travis/gopath/pkg/mod/github.com/libp2p/go-libp2p-transport@v0.0.1/test/stream.go:320 +0xc1
FAIL	github.com/libp2p/go-tcp-transport	27.014s

(race flag was enabled in transports using the test suite herein: libp2p/go-tcp-transport#34 libp2p/go-ws-transport#40)

The text was updated successfully, but these errors were encountered:

raulk · 2019-03-01T18:01:00Z

Ok, so the RACE is an artefact of the test (it's not in the code under test itself).

These lines both invoke the t *testing.T struct at the same time, leading to a data race.

go-libp2p-transport@v0.0.1/test/stream.go:52
  ^ error during fullClose()
  ^ invoked in post-test cleanup (defer)
go-libp2p-transport@v0.0.1/test/stream.go:355
  ^ since fullClose never ends, the waitgroup is never done, and a timeout fires,
    which also attempts to fail the test

raulk · 2019-03-01T18:39:20Z

Ok, found the issue. The affected test (SubtestStreamOpenStress) has a 10 second timeout. The race checker adds considerable overhead, sending the runtime of these tests way above 10 seconds. As a result, the test times out and begins cleanup while the workers are still running on separate goroutines. Increasing the timeout to 30 seconds was enough. I'll increase to 1 minute to be on the safe side, though. We know these tests succeed, so the tradeoff of waiting a minute for a real failure seems fair to me.

raulk transferred this issue from libp2p/go-tcp-transport Mar 1, 2019

raulk mentioned this issue Mar 1, 2019

increase test timeout. #45

Merged

raulk closed this as completed in #45 Mar 1, 2019

ghost assigned raulk Mar 1, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Tests fail in CI with -race flag #44

Tests fail in CI with -race flag #44

raulk commented Mar 1, 2019 •

edited

Loading

raulk commented Mar 1, 2019

raulk commented Mar 1, 2019

Tests fail in CI with -race flag #44

Tests fail in CI with -race flag #44

Comments

raulk commented Mar 1, 2019 • edited Loading

raulk commented Mar 1, 2019

raulk commented Mar 1, 2019

raulk commented Mar 1, 2019 •

edited

Loading