Skip to content

Commit 1260c5c

Browse files
author
Sotiris Nanopoulos
authored
Enable Windows workers (#17555)
Fixing an issue where Envoy workers are not picking up connections on Windows. The root cause of the issue is in the Windows kernel. There are two issues that we need to consider: If we want to listen from a duplicated socket then we need to duplicate it after we call listen on the original socket. If duplicated sockets try to accept at the same time, then one of the accept calls might block. Even if the sockets are non-blocking. The best way to work around that issue is to only listen/accept connections from the first worker thread and then use ExactConnectionBalance to dispatch the connection to another worker thread. This is a temporary solution until the underlying issue is fixed on Windows. Signed-off-by: Sotiris Nanopoulos <sonanopo@microsoft.com>
1 parent 917be3c commit 1260c5c

File tree

5 files changed

+83
-13
lines changed

5 files changed

+83
-13
lines changed

docs/root/intro/arch_overview/intro/threading_model.rst

+4
Original file line numberDiff line numberDiff line change
@@ -24,3 +24,7 @@ to have Envoy forcibly balance connections between worker threads. To support th
2424
Envoy allows for different types of :ref:`connection balancing
2525
<envoy_v3_api_field_config.listener.v3.Listener.connection_balance_config>` to be configured on each :ref:`listener
2626
<arch_overview_listeners>`.
27+
28+
On Windows the kernel is not able to balance the connections properly with the async IO model that Envoy is using.
29+
Until this is fixed by the platfrom, Envoy will enforce listener connection balancing on Windows. This allows us to
30+
balance connections between different worker threads. This behavior comes with a performance penalty.

docs/root/version_history/current.rst

+1
Original file line numberDiff line numberDiff line change
@@ -42,6 +42,7 @@ Bug Fixes
4242
* access log: fix ``%UPSTREAM_CLUSTER%`` when used in http upstream access logs. Previously, it was always logging as an unset value.
4343
* aws request signer: fix the AWS Request Signer extension to correctly normalize the path and query string to be signed according to AWS' guidelines, so that the hash on the server side matches. See `AWS SigV4 documentaion <https://docs.aws.amazon.com/general/latest/gr/sigv4-create-canonical-request.html>`_.
4444
* cluster: delete pools when they're idle to fix unbounded memory use when using PROXY protocol upstream with tcp_proxy. This behavior can be temporarily reverted by setting the ``envoy.reloadable_features.conn_pool_delete_when_idle`` runtime guard to false.
45+
* listener: fixed an issue on Windows where connections are not handled by all worker threads.
4546
* xray: fix the AWS X-Ray tracer bug where span's error, fault and throttle information was not reported properly as per the `AWS X-Ray documentation <https://docs.aws.amazon.com/xray/latest/devguide/xray-api-segmentdocuments.html>`_. Before this fix, server error was reported under 'annotations' section of the segment data.
4647

4748
Removed Config or Runtime

source/server/listener_impl.cc

+31-9
Original file line numberDiff line numberDiff line change
@@ -168,25 +168,33 @@ void ListenSocketFactoryImpl::doFinalPreWorkerInit() {
168168
return;
169169
}
170170

171-
for (auto& socket : sockets_) {
172-
const auto rc = socket->ioHandle().listen(tcp_backlog_size_);
173-
#ifndef WIN32
171+
ASSERT(!sockets_.empty());
172+
auto listen_and_apply_options = [](Envoy::Network::SocketSharedPtr socket, int tcp_backlog_size) {
173+
const auto rc = socket->ioHandle().listen(tcp_backlog_size);
174174
if (rc.return_value_ != 0) {
175175
throw EnvoyException(fmt::format("cannot listen() errno={}", rc.errno_));
176176
}
177-
#else
178-
// TODO(davinci26): listen() error handling and generally listening on multiple workers
179-
// is broken right now. This needs follow up to do something better on Windows.
180-
UNREFERENCED_PARAMETER(rc);
181-
#endif
182-
183177
if (!Network::Socket::applyOptions(socket->options(), *socket,
184178
envoy::config::core::v3::SocketOption::STATE_LISTENING)) {
185179
throw Network::SocketOptionException(
186180
fmt::format("cannot set post-listen socket option on socket: {}",
187181
socket->addressProvider().localAddress()->asString()));
188182
}
183+
};
184+
// On all platforms we should listen on the first socket.
185+
auto iterator = sockets_.begin();
186+
listen_and_apply_options(*iterator, tcp_backlog_size_);
187+
++iterator;
188+
#ifndef WIN32
189+
// With this implementation on Windows we only accept
190+
// connections on Worker 1 and then we use the `ExactConnectionBalancer`
191+
// to balance these connections to all workers.
192+
// TODO(davinci26): We should update the behavior when socket duplication
193+
// does not cause accepts to hang in the OS.
194+
for (; iterator != sockets_.end(); ++iterator) {
195+
listen_and_apply_options(*iterator, tcp_backlog_size_);
189196
}
197+
#endif
190198
}
191199

192200
ListenerFactoryContextBaseImpl::ListenerFactoryContextBaseImpl(
@@ -545,6 +553,19 @@ void ListenerImpl::buildFilterChains() {
545553
void ListenerImpl::buildSocketOptions() {
546554
// TCP specific setup.
547555
if (connection_balancer_ == nullptr) {
556+
#ifdef WIN32
557+
// On Windows we use the exact connection balancer to dispatch connections
558+
// from worker 1 to all workers. This is a perf hit but it is the only way
559+
// to make all the workers do work.
560+
// TODO(davinci26): We can be faster here if we create a balancer implementation
561+
// that dispatches the connection to a random thread.
562+
ENVOY_LOG(warn,
563+
"ExactBalance was forced enabled for TCP listener '{}' because "
564+
"Envoy is running on Windows."
565+
"ExactBalance is used to load balance connections between workers on Windows.",
566+
config_.name());
567+
connection_balancer_ = std::make_shared<Network::ExactConnectionBalancerImpl>();
568+
#else
548569
// Not in place listener update.
549570
if (config_.has_connection_balance_config()) {
550571
// Currently exact balance is the only supported type and there are no options.
@@ -553,6 +574,7 @@ void ListenerImpl::buildSocketOptions() {
553574
} else {
554575
connection_balancer_ = std::make_shared<Network::NopConnectionBalancerImpl>();
555576
}
577+
#endif
556578
}
557579

558580
if (config_.has_tcp_fast_open_queue_length()) {

test/integration/integration_test.cc

+47
Original file line numberDiff line numberDiff line change
@@ -157,6 +157,53 @@ TEST_P(IntegrationTest, PerWorkerStatsAndBalancing) {
157157
check_listener_stats(0, 1);
158158
}
159159

160+
// Make sure all workers pick up connections
161+
TEST_P(IntegrationTest, AllWorkersAreHandlingLoad) {
162+
concurrency_ = 2;
163+
initialize();
164+
165+
std::string worker0_stat_name, worker1_stat_name;
166+
if (GetParam() == Network::Address::IpVersion::v4) {
167+
worker0_stat_name = "listener.127.0.0.1_0.worker_0.downstream_cx_total";
168+
worker1_stat_name = "listener.127.0.0.1_0.worker_1.downstream_cx_total";
169+
} else {
170+
worker0_stat_name = "listener.[__1]_0.worker_0.downstream_cx_total";
171+
worker1_stat_name = "listener.[__1]_0.worker_1.downstream_cx_total";
172+
}
173+
174+
test_server_->waitForCounterEq(worker0_stat_name, 0);
175+
test_server_->waitForCounterEq(worker1_stat_name, 0);
176+
177+
// We set the counters for the two workers to see how many connections each handles.
178+
uint64_t w0_ctr = 0;
179+
uint64_t w1_ctr = 0;
180+
constexpr int loops = 5;
181+
for (int i = 0; i < loops; i++) {
182+
constexpr int requests_per_loop = 4;
183+
std::array<IntegrationCodecClientPtr, requests_per_loop> connections;
184+
for (int j = 0; j < requests_per_loop; j++) {
185+
connections[j] = makeHttpConnection(lookupPort("http"));
186+
}
187+
188+
auto worker0_ctr = test_server_->counter(worker0_stat_name);
189+
auto worker1_ctr = test_server_->counter(worker1_stat_name);
190+
auto target = w0_ctr + w1_ctr + requests_per_loop;
191+
while (test_server_->counter(worker0_stat_name)->value() +
192+
test_server_->counter(worker1_stat_name)->value() <
193+
target) {
194+
timeSystem().advanceTimeWait(std::chrono::milliseconds(10));
195+
}
196+
w0_ctr = test_server_->counter(worker0_stat_name)->value();
197+
w1_ctr = test_server_->counter(worker1_stat_name)->value();
198+
for (int j = 0; j < requests_per_loop; j++) {
199+
connections[j]->close();
200+
}
201+
}
202+
203+
EXPECT_TRUE(w0_ctr > 1);
204+
EXPECT_TRUE(w1_ctr > 1);
205+
}
206+
160207
TEST_P(IntegrationTest, RouterDirectResponseWithBody) {
161208
const std::string body = "Response body";
162209
const std::string file_path = TestEnvironment::writeStringToFileForTest("test_envoy", body);

test/server/listener_manager_impl_test.cc

-4
Original file line numberDiff line numberDiff line change
@@ -1837,9 +1837,6 @@ TEST_F(ListenerManagerImplTest, NotSupportedDatagramUds) {
18371837
"socket type SocketType::Datagram not supported for pipes");
18381838
}
18391839

1840-
// TODO(davinci26): See ListenSocketFactoryImpl::doFinalPreWorkerInit() for why this test is
1841-
// not run on Windows.
1842-
#ifndef WIN32
18431840
TEST_F(ListenerManagerImplTest, CantListen) {
18441841
InSequence s;
18451842

@@ -1870,7 +1867,6 @@ name: foo
18701867
1UL,
18711868
server_.stats_store_.counterFromString("listener_manager.listener_create_failure").value());
18721869
}
1873-
#endif
18741870

18751871
TEST_F(ListenerManagerImplTest, CantBindSocket) {
18761872
time_system_.setSystemTime(std::chrono::milliseconds(1001001001001));

0 commit comments

Comments
 (0)