Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vsock: TCP test + fix a bug leading to an error in TCP connect call in guest #267

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

mtjhrc
Copy link
Collaborator

@mtjhrc mtjhrc commented Feb 17, 2025

This PR adds a simple tests that attempts to connect to a TCP socket on host from the guest. However this would often fail, due to a race condition:

As far as I understand, this is what happens:
libkrun connects to the TCP socket on the host (this action is done upon the TSI_CONNECT request), but the guest is not connected to stream vsock socket corresponding to the actual transport for that connection. (We have not called confirm_connect). The problem is that since we assume the guest is already connected we attempt to recv here:

if evset.contains(EventSet::IN) {
debug!("process_event: IN");
if self.status == ProxyStatus::Connected {
let (signal_queue, wait_credit) = self.recv_pkt();
update.signal_queue = signal_queue;
if wait_credit && self.status != ProxyStatus::WaitingCreditUpdate {
self.status = ProxyStatus::WaitingCreditUpdate;
let rx = MuxerRx::CreditRequest {
local_port: self.local_port,
peer_port: self.peer_port,
fwd_cnt: self.tx_cnt.0,
};
update.push_credit_req = Some(rx);
}

This returns wait_credit = true, which causes us to send VSOCK_OP_CREDIT_REQUEST before sending the VSOCK_OP_RESPONSE to the VSOCK_OP_REQUEST which the guest kernel sends.
The kernel is not expecting this and returns EPROTO from connect call in the guest userspace. (specifically EPROTO is returned from the virtio_transport_recv_connecting in net/vmw_vsock/virtio_transport_common.c)

The fix here introduces a new state ConnectedUnconfirmed which it uses for TCP Proxy code in vsock. This state is used as a transitional state between the host socket being connected to libkrun and the confirm_connect being called, meaning the connection is actually established to the guest and we can actually send/recv.

A similar issue cannot happen with UDP sockets, because VSOCK_OP_REQUEST/VSOCK_OP_RESPONSE is only used for stream sockets. This also cannot currently happen with UNIX stream sockets, because currently these are only ever connected on the host and confirm_connect-ed at the same time here:

let update = unix.connect(pkt, tsi);
unix.confirm_connect(pkt);

Though, should the unix stream sockets also use the ConnectedUnconfirmed state too? For consistency of the semantics of the Proxy trait methods it would make sense...

Depends on: #258

@mtjhrc mtjhrc force-pushed the tsi-tests-and-fixes branch 3 times, most recently from 07ef524 to a5b8eed Compare February 18, 2025 18:05
mtjhrc added 3 commits March 4, 2025 11:48
Add a tests directory containing an E2E testing framework.
The tests are E2E in the sense that they use the public API of libkrun to
start a VM and run a test program in the guest.
Note that currently this is very limited, because there are no other userspace
executables or libraries (not even libc) in the guest apart from `guest-agent`.

`guest-agent` is a statically linked Rust executable, that executes the guest
part of each test.

In the future this can be extended to run tests with specific images which would
allow the use of system libraries by the test program.

This also introduces 2 tests:
test_vm_config - asserts the VM is constructed with the correct number of CPUs
                 and amount of memory. This also serves as an example how the
		 tests can be parameterized.
test_vsock_guest_connect - This tests the guest connecting to a vsock port
                           created by the `krun_add_vsock_port` API.

Signed-off-by: Matej Hrica <mhrica@redhat.com>
The install target should depend on generation of the libkrun.pc file,
because it installs it.
The libkrun.pc target should be .PHONY because libkrun.pc file might
already exists and be generated, but it could be made using a different PREFIX,
which would lead to the installed libkrun.pc pointing to the wrong prefix.

Signed-off-by: Matej Hrica <mhrica@redhat.com>
Intruduce a test target in Makefile, that runs the e2e tests agains the built binary.

Signed-off-by: Matej Hrica <mhrica@redhat.com>
@mtjhrc mtjhrc force-pushed the tsi-tests-and-fixes branch from a5b8eed to 78b3f5a Compare March 7, 2025 09:27
@mtjhrc mtjhrc marked this pull request as ready for review March 7, 2025 09:42
mtjhrc added 4 commits March 7, 2025 11:20
Attempt to run the tests in a new network namespace to allow the tests to use
fixed TCP ports.

Signed-off-by: Matej Hrica <mhrica@redhat.com>
Signed-off-by: Matej Hrica <mhrica@redhat.com>
Add another state to the Proxy trait called ConnectedUnconfirmed. This is used
when when the socket is connected (on the host), but confirm_connect hasn't
been called yet. Make the TCP implementation use this new transitional
state fixing the described bug:

`libkrun` connects to the TCP socket on the host (this action is done upon the
TSI_CONNECT request), but the guest is not yet connected to stream vsock socket
corresponding to the actual transport for that connection. (We have not called
`confirm_connect`). The problem is that since we assume the guest is already
connected we attempt to recv in process_event. Which can cause the
VSOCK_OP_CREDIT_REQUEST to be sent the guest, but the guest is expecting a
VSOCK_OP_RESPONSE as a response to a connection request. This makes the guest
kernel return EPROTO from the connect syscall.
(specifically EPROTO is returned from the `virtio_transport_recv_connecting`
 in `net/vmw_vsock/virtio_transport_common.c`)

Signed-off-by: Matej Hrica <mhrica@redhat.com>
…(vsock)

Signed-off-by: Matej Hrica <mhrica@redhat.com>
@mtjhrc mtjhrc force-pushed the tsi-tests-and-fixes branch from 78b3f5a to 5988a3d Compare March 7, 2025 10:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant