Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

windows,tests: design Bash-less test execution #16

Closed
wants to merge 2 commits into from

Conversation

laszlocsomor
Copy link
Contributor

@laszlocsomor laszlocsomor commented Jul 24, 2018

Tracking bug: bazelbuild/bazel#5508

@laszlocsomor
Copy link
Contributor Author

Lead reviewer: @ulfjack
Optional reviewers: @lberki , @dslomov

@laszlocsomor
Copy link
Contributor Author

Thanks! To be clear, are you approving of the design itself, or just approving that the doc contains all important points and it's ready for review?

@dslomov
Copy link
Contributor

dslomov commented Jul 24, 2018

Ready for review, as per https://bazel.build/designs/index.html#create-a-pull-request


If `$TEST_SHORT_EXEC_PATH` is defined, it sets an alternative `$TEST_PATH`.

**Why**: To avoid too long paths on Windows with remote execution.
Copy link

@jasharpe jasharpe Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: Microsoft has fixed the bug that required this so we should be able to drop this once the next Windows Server release is out (a few months from now likely).

I'm not sure if this means we can drop this from the new wrapper entirely. Depends on when it is usable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! What exactly did they fix: is it now possible to set the CWD to a path longer than MAX_PATH?
Btw I think using a junction ("Design adequacy" > "step 9") will beat the purpose of $TEST_SHORT_EXEC_PATH.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wasn't a MAX_PATH issue, but an instability in Docker when passing commands above a certain length (around 107 characters or so).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. But it seems, according to MSDN, SetCurrentDirectoryW now also supports long paths.


# Non-requirements of the solution

The solution does not have to cover remote test execution, because Bazel does

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, this isn't true. We run test-setup.sh on remote Windows machines now, and would likely run whatever wrapper binary replaces test-setup.sh on remote Windows machines as well.

That said, I don't see a reason why your proposal (which if I understand correctly is basically replacing test-setup.sh with a native program) wouldn't work with remote execution.

# Implementation language

We'll implement the test wrapper in C++, compile it as a x86\_64 Windows binary,
and bundle it with Bazel as `@bazel_tools//tools/test:test-wrapper.exe`.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, why bundle the binary rather than build it on the fly like the launchers are?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we want to require no C++ compiler. The launchers are actually pre-built: Bazel just appends data at the end of a pre-built binary.


**Why**: To avoid too long paths on Windows with remote execution.

1. Traps all signals to be handled by `write_xml_output_file()`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of the steps will move out of the test wrapper script.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which ones?

# Non-requirements of the solution

The solution does not have to cover remote test execution, because Bazel does
not manage remote processes. To run a test remotely, Bazel sends a request to a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bazel does upload the test-setup.sh script to the remote machine right now.

Copy link
Contributor Author

@laszlocsomor laszlocsomor Jul 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, @jasharpe already mentioned that. I'll rework this section and ping @jasharpe 's comment thread.

* a child process, which runs either the test binary or the coverage collector

* optionally a second child process after the first one finished, which runs
the `$LCOV_MERGER` when coverage collection is requested
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only run LCOV_MERGER if the test exits successfully.


1. wait some time for the process to complete its shutdown protocol

1. forcefully terminate the process only if it's still running after a timeout.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add explanation why you want to change that.

message. This communication protocol is easily exendable if necessary.

Using `stdin` is also safe: no other process has a handle to the test wrapper's
`stdin`, so no other process will inadvertently send the interruption request.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to sync with any remote execution implementations on this.


The primary output of test execution is the XML test log: it carries the most
useful information for the user. The XML file records the test's status (passed
or failed) and the test's textual output. The test wrapper should ensure it
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in the process of making Bazel ensure that a test.xml exists, rather than relying on the test wrapper.

We will roll out this feature over several Bazel minor versions:

1. version `0.N.*`: contains both the new and old test execution mechanisms and
supports the `--[no]windows_bashless_test` flag. By default the flag is
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double negation is not ideal.


There will be two or three processes per test (same numbers as today):

* a parent process, which runs the test wrapper
Copy link
Contributor

@lberki lberki Jul 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered implementing this not as a separate executable, but as JNI code within the Bazel binary? This would bring a few advantages:

  • One less process to create (process creation is expensive on Windows)
  • No need to come up with a protocol between the Bazel process and the test wrapper
  • One less moving part (we already have JNI code for process management)

The downside is that a badly-written test wrapper could bring down the whole Bazel process, but I assume the code wouldn't be that complicated for that to be a great risk.


### Interruption request

The communication channel between Bazel and the test wrapper process will be the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the standard way to communicate with a child process on Windows?

order to set up the test's environment, the test setup process must create the
test process.

## Processes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these processes going to be wrapped in a Job object (like all the current subprocesses on Windows)?

@dslomov
Copy link
Contributor

dslomov commented Jul 25, 2018

Meta-points:

Markdown format is very bad at supporting the discussion, Google Docs are just so much better
I think the right way would be to file github issues for anything that needs clarification

@laszlocsomor
Copy link
Contributor Author

laszlocsomor commented Jul 25, 2018

Re: #16 (comment)

I hear you, but I think the process is broken. The reality is, for everything else we do on GitHub (PRs, issues) we are used to commenting here. Expecting that people will behave differently just for proposal PRs is futile.

  • review discussion here is quite difficult to follow :(

I agree with that. That's a problem with GitHub's code review tool, IMO.

Perhaps MarkDown isn't the right format for design discussions docs and we should stick to GDocs, and maybe only use MarkDown for archival purposes. WDYT?

@laszlocsomor
Copy link
Contributor Author

Thanks everyone for reviews!

Let's continue the review comments on this email thread: https://groups.google.com/d/msg/bazel-dev/Vnugr-wDTqU/0tKcYkkyBwAJ

That email also explains the reasons. I'll copy all comment threads from this PR to that email thread.

@laszlocsomor
Copy link
Contributor Author

Actually, instead of trying to adjust this PR, let me create a new one just for the changes to README.md.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants