
2018-07-18-windows-native-test-runner.md

created: 2018-07-18
last updated: 2018-08-03
status: To be reviewed
reviewers: ulfjack (lead), lberki, dslomov
title: Test execution on Windows without Bash
tracking issues: #5508, #4691, #4319
author: laszlocsomor

Abstract

Let's change how Bazel runs tests on Windows to no longer require Bash.

Background

For every test, Bazel requires a complex Bash script to set up the environment and to run the test. Therefore Windows users need to install MSYS2 Bash to run tests. This is undesirable (see issue #4319).

Bazel should run tests without requiring Bash (see issue #5508). This document explains how.

Current design

Test execution

Bazel runs tests by executing TestRunnerActions. Test actions are similar to build actions: they take a set of input files, execute a command, and are expected to produce a set of output files.

The inputs of the test action are the test binary and its dependencies, plus some tools such as the test wrapper script (@bazel_tools//tools/test/test-setup.sh) and optionally the coverage collector and LCOV merger tools.

The command of the test action is the test wrapper script, plus additional user-specified arguments. This script initializes the environment for the actual test binary, then runs the test. See Test wrapper control flow.

The outputs of the test action are the XML test log and the "undeclared outputs" file. The XML test log is an XML file that records two things: metadata about the test (such as the test target name and whether the test passed or failed), and the textual test log (that is, the output to stdout and stderr). The undeclared outputs file is a zip archive of files that the test created and that are potentially interesting to the user.

Test termination

Tests may terminate:

  • cleanly, if the test process terminates by itself (regardless of whether the test passed or failed)

  • abruptly, if the test process is killed:

    • by the user, interrupting test execution using Ctrl+C

    • by Bazel (when a test times out, or when another test fails and --test_keep_going is disabled).

In every case the test action produces an XML test log.

Bazel on Windows terminates tests abruptly in the following locations:

Test wrapper control flow

  1. Absolutizes and exports path-storing environment variables.

    Why: At the time of creating the TestRunnerAction (along with its environment), Bazel doesn't yet know the execution root the test will run under. test-setup.sh absolutizes the envvars by making them relative to $PWD and exports them for child processes.

  2. Creates some directories, e.g. for the undeclared outputs, the shard status file, the XML test log, and the test temp directory.

  3. Exports some environment variables.

    Why: Tests and test runners require envvars such as $TEST_TMPDIR and $TEST_SHARD_INDEX.

  4. Defines rlocation() to look up paths of data-dependencies.

    Why:

    • test-setup.sh itself looks up the test executable's path.

    • For the sake of shell tests (in case the actual test is a sh_test). This use-case does not apply to the subset of shell tests that use the Bash runfiles library in @bazel_tools//tools/bash/runfiles.

  5. Defines encode_output_file().

    When $EXPERIMENTAL_SPLIT_XML_GENERATION is set to "1", this function is not used.

    Otherwise, this function runs perl and sed to sanitize the textual test log for the XML test log's CDATA section. write_xml_output_file() calls this function.

  6. Defines write_xml_output_file().

    When $EXPERIMENTAL_SPLIT_XML_GENERATION is set to "1", this function is not used.

    Otherwise, this function:

    • Creates the test XML file, with the help of encode_output_file().

    • Removes ${XML_OUTPUT_FILE}.log.

      Why: ${XML_OUTPUT_FILE}.log is a temporary file containing the test's raw output.

  7. Changes the current directory to the test's runfiles directory (only when coverage collection is disabled).

    Why: Actions run in the execroot by default. Changing the directory prevents a locally executed, non-sandboxed test from accessing undeclared input files.

  8. Adds . to $PATH.

    Why: To run the test executable without having to add ./ if the binary is in the current directory.

    In fact this step is unnecessary, because the test executable's path is always absolute.

  9. Sets $TEST_PATH to the absolute path of the test executable.

    If $TEST_SHORT_EXEC_PATH is defined, it sets an alternative $TEST_PATH.

    Why: To avoid overly long paths on Windows with remote execution.

  10. Traps all signals to be handled by write_xml_output_file().

    Why: If the test is abruptly terminated (e.g. the user interrupts test execution or the test times out), Bash executes the signal handler and write_xml_output_file() writes an output file, which records the fact that the test terminated abruptly.

  11. Runs the test:

    If it can, runs the test as a subprocess and redirects the test's output to ${XML_OUTPUT_FILE}.log while also streaming the output to stdout via tail; otherwise runs the test directly and tees the test's output to ${XML_OUTPUT_FILE}.log.

    In both cases, runs the test via the tools/test/collect_coverage.sh if requested, which:

    1. Absolutizes and exports some path-storing envvars (e.g. $COVERAGE_MANIFEST, $COVERAGE_DIR).

    2. Changes the current directory to the test's workspace, runs the test, stores the exit code.

    3. exec()s the $LCOV_MERGER

    collect_coverage.sh runs in "legacy mode" if $LCOV_MERGER is undefined. This mode triggers Google-specific code paths that rely on /usr/bin/lcov. This use-case is unsupported on Windows.

  12. Resets all signal handlers.

    Why: The test terminated normally so it's safe to reset the default signal handlers.

  13. If $EXPERIMENTAL_SPLIT_XML_GENERATION is not set to "1", calls write_xml_output_file().

  14. Writes the manifest- and annotation files for the undeclared outputs.

    Why: Tests may produce valuable output files in the $TEST_UNDECLARED_OUTPUTS_DIR directory. These outputs are undeclared: they are not part of the test action's signature, so Bazel is unaware of them. Bazel archives the entire directory to retrieve these files from the sandbox or remote machine.

  15. Creates a zip file of the undeclared outputs.
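
As an illustration of step 1, absolutizing a path-storing envvar against the current directory could look like the sketch below. This is not taken from test-setup.sh; IsAbsolute() and Absolutize() are hypothetical helpers, and the rules for what counts as absolute (a leading "/" or a drive-letter prefix) are a simplifying assumption.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Returns true if `path` starts with "/" (Unix style) or with a drive
// letter followed by ":\" or ":/" (Windows style). Hypothetical helper;
// UNC paths and other Windows path forms are deliberately ignored.
bool IsAbsolute(const std::string& path) {
  if (!path.empty() && path[0] == '/') return true;
  return path.size() >= 3 &&
         std::isalpha(static_cast<unsigned char>(path[0])) &&
         path[1] == ':' && (path[2] == '/' || path[2] == '\\');
}

// Resolves `value` against `cwd`, the directory the wrapper was started in.
// Empty and already-absolute values are returned unchanged.
std::string Absolutize(const std::string& cwd, const std::string& value) {
  assert(IsAbsolute(cwd));
  if (value.empty() || IsAbsolute(value)) return value;
  const char last = cwd.back();
  const bool ends_with_separator = (last == '/' || last == '\\');
  return ends_with_separator ? cwd + value : cwd + "/" + value;
}
```

The wrapper would apply such a helper to each path-storing envvar before exporting it to the child process.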

Split XML generation

When --experimental_split_xml_generation is enabled, Bazel sets $EXPERIMENTAL_SPLIT_XML_GENERATION to "1", and runs tools/test/generate-xml.sh in a separate action after the TestRunnerAction. This script implements the same logic as encode_output_file and write_xml_output_file do in steps 5 and 6 above.

When --experimental_split_xml_generation is disabled, the test wrapper writes the XML.

Requirements of the solution

No extra software

Running tests with Bazel must require no extra software on a fresh Windows 7 desktop installation other than what the tested language requires (compiler, runtime, etc.).

We require Windows 7 compatibility because that is the oldest Windows version Bazel supports.

Drop-in replacement

The new solution must be a drop-in replacement for the old test execution mechanism, meaning all features that the old design provides — such as writing an XML test log even upon abrupt test termination, or running under a coverage collector — must either keep working under the new design, or be dropped for a good reason.

Guarded by a flag

The new solution must be guarded by a flag. This allows us to roll out the feature in multiple stages, and users to easily revert to the old behavior in case the new one is buggy.

Works with remote execution

The new solution must work with remote execution. Bazel currently executes test-setup.sh on the remote machine. The new test wrapper should be a drop-in replacement for test-setup.sh in this scenario too.

Non-requirements of the solution

It is beyond the scope of this design to discuss whether Bazel attempts to fetch any outputs of the test action from the remote machine, in case the test timed out or got interrupted. This question falls into the domain of general remote action execution.

Design

Constraints

Windows doesn't support signal handlers and processes cannot run custom cleanup routines upon termination. To interrupt a test, Bazel currently forcefully terminates the test process (see Test termination), leaving no chance for cleanup. Therefore in order to capture the textual test log and convert it to XML even when the test is interrupted, the test and the log capturer cannot run in the same process.

Windows doesn't support process replacement (exec(3) on Unixes), therefore in order to set up the test's environment, the test setup process must create the test process.

Processes

There will be two or three processes per test:

  • a parent process, which runs the test wrapper

  • a child process, which runs either the test binary or the coverage collector

  • in case coverage collection is requested and the coverage collector succeeded: a second child process that runs the $LCOV_MERGER

To ensure that terminating the test wrapper, in case it fails to shut down fast enough (see Abrupt test termination), also terminates the child processes, we assign the parent and child processes to the same Job Object.

To avoid terminating all test wrappers and tests, we create a new Job Object for each TestRunnerAction.

Timeout

For the sake of a more self-contained test wrapper that can run both locally and on remote machines, the test wrapper will monitor the elapsed time since its start and initiate the shutdown protocol if the time exceeds the test's timeout.

This ensures that the test wrapper exits both when it runs locally and when it runs remotely.

However Bazel will also monitor the elapsed time, so that:

  • with local execution, it can kill the test wrapper in case the test wrapper hung beyond the timeout and failed to shut down

  • with remote execution, Bazel can terminate the connection and report with certainty that the test timed out.
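
The wrapper-side check could be as small as the sketch below. TimeoutMonitor is a hypothetical name, and how the wrapper learns the test's timeout is left open here; the sketch simply takes it as a constructor argument. The current time is passed in explicitly so the logic can be exercised without waiting.

```cpp
#include <cassert>
#include <chrono>

// Tracks the deadline of a single test run. Hypothetical helper class.
class TimeoutMonitor {
 public:
  using Clock = std::chrono::steady_clock;

  TimeoutMonitor(Clock::time_point start, std::chrono::seconds timeout)
      : deadline_(start + timeout) {
    assert(timeout.count() > 0);
  }

  // True once `now` has reached the deadline. The wrapper would then
  // initiate its shutdown protocol.
  bool Expired(Clock::time_point now) const { return now >= deadline_; }

  // Time left until the deadline, clamped at zero. Usable as the wait
  // duration when polling the child process.
  Clock::duration Remaining(Clock::time_point now) const {
    return now >= deadline_ ? Clock::duration::zero() : deadline_ - now;
  }

 private:
  Clock::time_point deadline_;
};
```

A monotonic clock is used so that adjustments to the wall clock during the test do not shorten or extend the timeout.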

Whether Bazel attempts to retrieve the XML test log from the remote machine in case the test wrapper successfully wrote it, or how to attempt this, is beyond the scope of this design.

Interruption

Local test execution

In order to let the test wrapper finish writing the XML file even if the user or Bazel interrupts it, we change Bazel not to forcefully terminate the process upon interruption, but instead:

  1. notify the process about the interruption by sending it a control message on some channel

  2. wait some time for the process to complete its shutdown protocol

  3. forcefully terminate the process only when it's still running after a timeout, to avoid hanging because of a stuck test wrapper process.
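
The three steps above amount to a notify, wait, then kill escalation. The sketch below shows only that control flow; it is not Bazel's implementation. ProcessHooks and InterruptGracefully() are hypothetical, and the actual process operations are injected as callbacks so that nothing platform-specific is assumed.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical hooks onto a running test wrapper process.
struct ProcessHooks {
  // Step 1: sends the interruption request to the process.
  std::function<void()> notify;
  // Step 2: waits up to the given time; returns true if the process exited.
  std::function<bool(std::chrono::milliseconds)> wait_for_exit;
  // Step 3: forcefully terminates the process.
  std::function<void()> force_kill;
};

// Returns true if the process shut down on its own within `grace`,
// false if it had to be forcefully terminated.
bool InterruptGracefully(const ProcessHooks& process,
                         std::chrono::milliseconds grace) {
  assert(process.notify && process.wait_for_exit && process.force_kill);
  process.notify();
  if (process.wait_for_exit(grace)) return true;
  process.force_kill();
  return false;
}
```

The return value lets the caller tell a completed shutdown protocol apart from a forced termination, which matters for deciding whether the XML test log can be trusted to be complete.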

For local test execution, the communication channel between Bazel and the test wrapper process will be the test wrapper's stdin.

Using stdin is simple and convenient: the only supported control message is the request for interruption. For now a single byte will suffice as this message. This communication protocol is easily extensible if necessary.

Using stdin is also safe: no other process has a handle to the test wrapper's stdin, so no other process will inadvertently send the interruption request.
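
On the wrapper's side, listening for the request could look like the following sketch. WaitForInterruptRequest() is a hypothetical name, and treating end-of-stream as "no request" is an assumption, since this design does not say what a closed stdin should mean. The stream is a parameter so that the same code works with std::cin and with an in-memory stream.

```cpp
#include <atomic>
#include <cassert>
#include <istream>
#include <sstream>  // only needed for in-memory usage such as the example below

// Blocks until one byte arrives on `control` or the stream ends.
// Sets `*interrupted` and returns true only if a byte was received.
// The value of the byte is ignored: any byte is the interruption request.
bool WaitForInterruptRequest(std::istream& control,
                             std::atomic<bool>* interrupted) {
  assert(interrupted != nullptr);
  char message = 0;
  if (control.get(message)) {
    interrupted->store(true);
    return true;
  }
  // End of stream or read error: nobody sent a request.
  return false;
}
```

In a real wrapper this call would presumably run on a dedicated thread with std::cin as the stream, while the main thread polls the flag between converting chunks of the test log. For a quick check, passing a std::istringstream holding one byte sets the flag, and an empty one leaves it unset.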

Remote test execution

When Bazel or the user interrupts remotely running tests, Bazel will signal the fact of the interruption (provided the remote execution service supports such signalling), then Bazel will close the connection, and report that the test's status is unknown.

Shutdown protocol

When requested to interrupt and shut down, the test wrapper should exit as soon as possible.

The primary output of test execution is the XML test log: it carries crucial information for the user. The XML file records the test's status (passed or failed) and the test's textual output.

Unless $EXPERIMENTAL_SPLIT_XML_GENERATION is set to "1", the test wrapper should ensure it always writes the XML test log. (When it is set to "1", a separate action writes this file; see Split XML generation.)

Textual test outputs are typically around a few MBs, though at their extreme reach sizes of several GBs. To avoid having to write the whole XML file as part of the shutdown protocol, the test wrapper will continuously convert the log as the test is running. When requested to interrupt, the test wrapper only has to convert the tail of the test log, append the end of the XML file, and exit.
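
Continuous conversion means the encoder must carry state between chunks, because a "]]>" sequence, which would end the CDATA section early, may be split across two reads. A minimal sketch of such an encoder follows. CdataEncoder is a hypothetical class, not the planned implementation; replacing disallowed control characters with '?' is an assumption about the desired sanitization, and bytes of 0x80 and above are passed through without validating them as UTF-8.

```cpp
#include <cassert>
#include <string>

// Incrementally encodes raw test output for an XML CDATA section.
class CdataEncoder {
 public:
  // Encodes one chunk and returns the text to append to the XML file.
  std::string Append(const std::string& chunk) {
    assert(!finished_);
    std::string out;
    if (!open_) {
      out += "<![CDATA[";
      open_ = true;
    }
    for (const char c : chunk) {
      if (c == '>' && brackets_ >= 2) {
        // "]]" was already emitted: close the section and reopen it, so
        // that the '>' starts a new section instead of ending this one.
        out += "]]><![CDATA[";
      }
      // Count consecutive ']' characters, saturating at 2.
      brackets_ = (c == ']') ? (brackets_ < 2 ? brackets_ + 1 : 2) : 0;
      out += IsAllowed(c) ? c : '?';
    }
    return out;
  }

  // Returns the text that closes the section. Call exactly once.
  std::string Finish() {
    assert(!finished_);
    finished_ = true;
    return open_ ? "]]>" : "<![CDATA[]]>";
  }

 private:
  // Control characters other than tab, LF, and CR are not allowed in XML 1.0.
  static bool IsAllowed(char c) {
    const unsigned char u = static_cast<unsigned char>(c);
    return u >= 0x20 || u == '\t' || u == '\n' || u == '\r';
  }

  bool open_ = false;
  bool finished_ = false;
  int brackets_ = 0;
};
```

With this shape, the shutdown protocol only needs one last Append() for the unread tail of the log, followed by Finish() and the closing XML elements.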

To make sure that the test wrapper has read access to the child process's output, the test wrapper:

  1. creates a temporary file under $TEST_TMPDIR, opens it for writing and read sharing

  2. creates the child processes such that stdout and stderr are redirected to the temporary file.
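
Reading the temporary file while the child is still writing to it comes down to remembering how far the wrapper has read. The sketch below uses standard C++ streams for portability and leaves out the Windows-specific part (opening the file with the right sharing mode); LogTailer is a hypothetical name.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>
#include <utility>

// Returns whatever was appended to a log file since the previous call.
class LogTailer {
 public:
  explicit LogTailer(std::string path) : path_(std::move(path)) {
    assert(!path_.empty());
  }

  // Reads from the remembered offset to the current end of the file.
  // Returns an empty string if the file cannot be opened or has not grown.
  std::string ReadNew() {
    std::ifstream in(path_, std::ios::binary);
    if (!in) return "";
    in.seekg(offset_, std::ios::beg);
    std::string fresh((std::istreambuf_iterator<char>(in)),
                      std::istreambuf_iterator<char>());
    offset_ += static_cast<std::streamoff>(fresh.size());
    return fresh;
  }

 private:
  std::string path_;
  std::streamoff offset_ = 0;
};
```

Each ReadNew() result can be echoed to the wrapper's own stdout and fed to the XML conversion, so both consumers see the same bytes in the same order.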

After the test wrapper has finished writing the XML test log, it starts archiving the undeclared outputs. This operation may not finish within the forceful termination timeout, but that's fine: the most important output is the XML file. If the test was interrupted, retrieving the undeclared outputs is done on a best-effort basis.

Shutdown timeout

The shutdown timeout should be long enough for the test wrapper to finish writing the XML file, terminate the active child process, and exit. We established that the test wrapper will continuously convert the test log, so we expect that finishing up the XML file is faster than writing the entire multi-GB test log.

A timeout of 1 second seems to suffice.

rlocation() support for sh_test rules

The new test execution mechanism will not define rlocation() for the benefit of sh_test rules. See Backward compatibility.

Implementation language

We'll implement the test wrapper in C++, compile it as an x86_64 Windows binary, and bundle it with Bazel as @bazel_tools//tools/test/windows:test-wrapper.exe.

Rationale:

  • we have experience with C++

  • Bazel already contains C++ code, so we introduce no new language to the code base

  • the binary only has to run on Windows on x86_64 CPUs: even if remote execution supported mixing Windows host with Linux executor or the other way around, there's no interpreted or JIT'ed language that runs on all platforms without requiring any third-party runtime

  • a statically linked binary requires no runtime and runs on a fresh Windows 7 installation without additional software

  • it's easy to interface with the Windows API

The alternative would be to build a .NET application: according to a Microsoft blog post all versions of Windows 7 and of Windows Server 2008 R2 (the equivalent server version) include the .NET framework 3.5.

Design adequacy

Features

Addressing every step in the current test wrapper design:

  • steps 1, 2, 3: We'll use Windows API functions or custom logic for these. We pass environment variables to CreateProcessW to export them for the child processes.

  • step 4: The new test wrapper will not do this, see Backward compatibility. To look up the test binary's path, we'll use the C++ runfiles library in @bazel_tools//tools/cpp/runfiles.

  • step 5: Same as steps 1, 2, 3.

  • step 6: We'll implement the encoding as custom logic.

  • step 7: We'll use SetCurrentDirectoryW. If the path is too long, we create a junction under $TEST_TMPDIR pointing at the path, and change the current directory to the junction.

  • step 8: This step is unnecessary. The new test wrapper will look up the test binary's path using the C++ runfiles library; see step 4.

  • step 9: If the path is too long, we'll create a junction like in step 7.

  • step 10: This step is unnecessary by design, see Interruption.

  • step 11: The test wrapper opens the file that the test's textual output is redirected to with read sharing, so it can stream the output to its own stdout while the test is running.

    To run under the coverage collector, the test wrapper creates the child process to run the coverage collector, then a second child process to run the $LCOV_MERGER.

  • step 12: This step is unnecessary on Windows, see Constraints.

  • step 13: No special handling necessary. The shutdown protocol will respect the value of $EXPERIMENTAL_SPLIT_XML_GENERATION.

  • step 14: We'll use Windows API functions to list the directory (FindFirstFileW) and to write the files. It's enough to open these files with read sharing and without deletion sharing, because in case the test wrapper is forcefully terminated, the OS closes its file handles.

  • step 15: We'll use the zip compressor in //third_party/ijar.
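
For steps 1 to 3, passing envvars to CreateProcessW means building an environment block: a sequence of null-terminated "NAME=VALUE" strings followed by one extra null character, passed together with the CREATE_UNICODE_ENVIRONMENT flag when the block holds wide characters. The sketch below builds only the block and makes no Windows API calls; BuildEnvironmentBlock() is a hypothetical helper.

```cpp
#include <cassert>
#include <map>
#include <string>

// Builds a wide-character environment block from name/value pairs.
// Entries appear in std::map order, that is, sorted by code unit and
// case-sensitively.
std::wstring BuildEnvironmentBlock(
    const std::map<std::wstring, std::wstring>& vars) {
  std::wstring block;
  for (const auto& entry : vars) {
    // This sketch rejects empty names and names containing '='.
    assert(!entry.first.empty());
    assert(entry.first.find(L'=') == std::wstring::npos);
    block += entry.first;
    block += L'=';
    block += entry.second;
    block += L'\0';
  }
  // Keep the double-null ending even when there are no variables.
  if (vars.empty()) block += L'\0';
  block += L'\0';
  return block;
}
```

Microsoft's documentation asks for the block to be sorted by name without regard to case, which std::map's default ordering does not guarantee for mixed-case names; a real implementation would need a case-insensitive comparator.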

Addressing split XML generation:

  • The test wrapper already implements the complete logic for @bazel_tools//tools/test:generate-xml.sh. We'll create a separate program (@bazel_tools//tools/test/windows:generate-xml.exe) that wraps the test-wrapper's XML-writing logic.

Compatibility with remote execution

The test wrapper works remotely as well as locally. The difference from local execution is that Bazel and the test wrapper run on different machines, so nothing connects to the test wrapper's stdin and nothing asks it to shut down when the user or Bazel interrupts tests.

If the remotely running test wrapper notices that the test timed out, it will shut down the same way it would do locally, in case Bazel attempts to fetch the test XML from the remote machine.

Backward-compatibility

The new solution will not define rlocation() as a Bash function, potentially breaking existing sh_test rules that do not use the Bash runfiles library in @bazel_tools//tools/bash/runfiles. We anticipate no breakages though because in a survey conducted in March-April 2018, no Windows user reported using Bash. (Bazel's own shell tests are also unaffected, because the ones that run on Windows already use the Bash runfiles library.)

On the off chance that someone is affected and is unable to migrate their tests to the Bash runfiles library, we'll update the Bash launcher in @bazel_tools//tools/launcher to load a Bash script before the main test script, which will define rlocation() with the same body as test-setup.sh does today.

In every other respect the new system will be a drop-in replacement for the old test execution mechanism. We will roll it out in several stages (see rollout plan) so users will have time to test it, report bugs, and revert to the old mechanism in case they discover bugs.

Rollout plan

We will roll out this feature over several Bazel minor versions:

  1. version 0.N.*: contains both the new and old test execution mechanisms and supports the --[no]windows_native_test_wrapper flag. By default the flag is disabled and Bazel uses the old (non-native, Bash-based) test execution. We ask users to enable the flag and report bugs. We move on to the next stage when all known bugs are fixed.

  2. version 0.N+k.* (k > 0): the flag is enabled by default. We ask users to file bugs whenever they find a use-case to disable the flag. We move on to the next stage if no new bugs are reported for a version.

  3. version 0.N+m.* (m > k): the flag is a no-op and Bazel no longer contains the code for the old test execution. We ask users to remove the flag from their .bazelrc files. We move on to the next stage unconditionally.

  4. version 0.N+m+1.*: the flag is no longer supported.