-
Notifications
You must be signed in to change notification settings - Fork 917
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove internal bulk processor retries #3739
Conversation
} | ||
|
||
switch httpStatusCode { | ||
case http.StatusBadRequest, http.StatusUnauthorized, http.StatusForbidden, http.StatusNotFound, http.StatusConflict: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
httpStatus := client.HttpStatus(err) | ||
isRetryable := client.IsRetryableStatus(httpStatus) | ||
var httpStatus int | ||
if err, isElasticErr := err.(*elastic.Error); isElasticErr { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Use errors.As
. It's safer because it works even when the errors are wrapped
var httpStatusCode int | ||
if err, isElasticErr := err.(*elastic.Error); isElasticErr { | ||
httpStatusCode = err.Status | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is safer because it works with wrapped errors, and we return early. Leaving the status code variable as the zero value works because of the default branch, but I think it's better to just return early so that it's clear that this value has no meaning when it's not an ES error.
var esErr *elastic.Error
if !errors.As(err, &esErr) {
return true
}
httpStatusCode := esErr.Status
case http.StatusBadRequest, http.StatusUnauthorized, http.StatusForbidden, http.StatusNotFound, http.StatusConflict: | ||
return false | ||
default: | ||
return true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This means we retry all non-ES errors. Is that what we want?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean "non-ES"? This error came from Elasticsearch and Elasticsearch uses http status codes to indicate error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I understand what you meant. Yes, non-ES errors are most likely some network failures and should be retryable. Also they might indicate some code bug (like bad formed url or missed required parameter). In this case it would be probably better not to retry but I don't know how to differentiate them. Generally, the idea is not to retry something that we know for sure is non-retryable and retry all the rest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly LGTM, just a few comments
c670b47
to
81067a4
Compare
What changed?
Remove internal bulk processor retries.
Why?
To simplify retry logic because visibility task processor has its own retry logic and it is better rely on it.
How did you test it?
Local runs with different failures from Elasticsearch.
Potential risks
No risks.
Is hotfix candidate?
Yes.