Vector source/sink not propagating non-retryable failures #17895
Labels
sink: vector
Anything `vector` sink related
source: vector
Anything `vector` source related
type: bug
A code related bug.
Discussed in #17873
Originally posted by sbalmos July 5, 2023
I'm still trying to quantify exactly what the bug is, but it seems that the vector source/sink does not propagate non-retryable delivery failures from the end sink back upstream. In my setup, one Vector instance acts as a sort of post office delivery multiplexer (the "Mux"): it reads from Kafka and distributes to one or more exporter Vector instances, connected via the vector sink.
In exporters where some failures are non-retryable (e.g. an HTTP sink receiving a non-retryable error like 400), the event appears to be dropped appropriately at the exporter level. However, this drop apparently is not communicated back to the Mux instance through the vector protocol as a hard-failure acknowledgement or similar signal. The vector sink on the Mux apparently sees a delivery failure (or rather, a lack of acknowledgement) and keeps retrying delivery of the message to the exporter. This continues ad nauseam until the retry count is exceeded or (more likely) the Mux's buffer for that exporter's vector sink fills up. That in turn stalls the whole Mux, stopping delivery of all messages to all destination exporters.
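To make the suspected failure mode concrete, here is a toy Python model of it. This is only a sketch of the behavior as I understand it, not Vector's actual implementation: `Ack`, `exporter_handle`, `run_mux`, and the `propagate_rejection` flag are all hypothetical names, the buffer is a simple bounded queue, and the retry count limit is not modeled.

```python
from collections import deque
from enum import Enum


class Ack(Enum):
    DELIVERED = "delivered"  # end sink accepted the event
    REJECTED = "rejected"    # end sink failed permanently (e.g. HTTP 400)


def exporter_handle(event, propagate_rejection):
    """Hypothetical exporter: returns an Ack, or None for 'no acknowledgement'."""
    if event["ok"]:
        return Ack.DELIVERED
    # The exporter drops the event either way; the only question is
    # whether the upstream mux is told about the permanent failure.
    return Ack.REJECTED if propagate_rejection else None


def run_mux(events, propagate_rejection, buffer_size=3, max_ticks=20):
    """Toy mux loop: one send attempt per tick, head-of-line retry on a missing ack."""
    pending = deque(events)  # stands in for the upstream source (Kafka in my setup)
    buffer = deque()         # bounded buffer in front of the mux's vector sink
    delivered, dropped = [], []
    for _ in range(max_ticks):
        # Refill the bounded buffer from the upstream source.
        while pending and len(buffer) < buffer_size:
            buffer.append(pending.popleft())
        if not buffer:
            break
        ack = exporter_handle(buffer[0], propagate_rejection)
        if ack is Ack.DELIVERED:
            delivered.append(buffer.popleft()["id"])
        elif ack is Ack.REJECTED:
            dropped.append(buffer.popleft()["id"])
        # ack is None: the head stays in the buffer and is retried next tick.
    stuck = len(buffer) + len(pending)
    return delivered, dropped, stuck


events = [{"id": i, "ok": i != 2} for i in range(1, 6)]  # event 2 always gets a 400
print(run_mux(events, propagate_rejection=False))  # ([1], [], 4)
print(run_mux(events, propagate_rejection=True))   # ([1, 3, 4, 5], [2], 0)
```

With no rejection signal, event 2 sits at the head of the buffer and everything behind it is stuck, which matches the stall I'm seeing. If the exporter could send back a hard-failure acknowledgement, the mux could drop event 2 and keep going.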