"EOF" error on kinesis PutRecord, only if request repeated every 6-7s #206
Neither of these PRs affects it. On a hunch I tried putting the client creation into the for loop so that it's recreated every time, but the result is the same. Always something like this:
@mateusz if you can provide detailed logging by setting the LogLevel config option to 1, that would help.
Here you go.
I was thinking that perhaps it has something to do with the subsequent request coinciding with a keep-alive connection being closed by the server on the other side, and the timing being just right?
That looks like an error coming from Go's socket library code. It could indeed be a connection issue with the service. Are you using a proxy, by any chance?
I've tested buffering requests and sending them no more frequently than every 10s, and that did not result in any errors after 24 hours of requests. I'm not using any cache; it's pretty much a basic Debian wheezy box (and the aws cli tool doesn't seem to have that problem, so that would rule out a proxy problem). I've also retested this bit of Go code on darwin/amd64 on my local machine to rule out EC2-related issues - same problem. Here is the bash script I used from that EC2 machine to do the same thing with the aws cli, which did not show this problem - tried with both sleep 6 and sleep 7:

```bash
#!/bin/bash
for i in {1..20}; do
  (aws kinesis put-record --stream-name playpen-metrics --data "a" --partition-key "1" --region ap-southeast-2) &
  sleep 6
done
wait
```

I guess we'd be looking for that magic timeout number of "about 5 or 6" somewhere in the aws sdk or Go socket code? Do you know how to disable keep-alives on these aws sdk requests? Maybe force connection closure after every request, just to see if that would fix it?
You can pass in your own http.Client. Just to clarify, are you using a proxy?
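For illustration, here is a minimal sketch of wiring a custom http.Client with keep-alives disabled into the Kinesis client. It is written against the current aws-sdk-go v1, session-based API; the SDK version in this thread predates sessions, so the constructor details will differ, but the idea (set DisableKeepAlives on the Transport and hand the client to the config's HTTPClient field) is the same:

```go
package main

import (
	"net/http"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/kinesis"
)

// newKinesisClient builds a Kinesis client whose HTTP transport closes the
// connection after every request, so no idle keep-alive connection can be
// torn down by the server between PutRecord calls.
func newKinesisClient() *kinesis.Kinesis {
	httpClient := &http.Client{
		Transport: &http.Transport{
			DisableKeepAlives: true,
		},
	}

	sess := session.Must(session.NewSession(&aws.Config{
		Region:     aws.String("ap-southeast-2"),
		HTTPClient: httpClient,
	}))
	return kinesis.New(sess)
}
```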
Sorry, that was a typo. Instead of proxy I said cache - no, I'm not using any proxy.
@mateusz have you looked into disabling KeepAlive? Did this help at all?
Heh... disabling keep-alive seems to fix it - no failures after 30 sends. Here is the test code. Removing line 21 from this gist reintroduces the issue - it fails within the first few sends again. I might be completely misled, but here is another guess: according to this, it seems one needs to close the response Body when done with it. I have added some debug output to all occurrences of "Body.Close" in this SDK, and none of them seems to have been hit. Do you think it could possibly be a missing Body.Close?
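For reference, the usual net/http pattern (a generic sketch, not code from this SDK) is to read the response and always close its Body, which lets the Transport return the connection to its keep-alive pool or tear it down cleanly:

```go
package main

import (
	"io/ioutil"
	"net/http"
)

// fetch shows the canonical pattern: read the response and always close its
// Body once done with it, so the underlying connection is not leaked.
func fetch(url string) ([]byte, error) {
	resp, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	return ioutil.ReadAll(resp.Body)
}
```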
Taking a look at the code, it looks like Kinesis uses the jsonrpc Unmarshal for reading the response body from the service and serializing it into the output structure, but does not close the response body reader. Though I don't think this is the cause of the problem you are seeing: locally I added the Body.Close to the jsonrpc code used by Kinesis, but was still able to reproduce the issue with your sample code after the second request. Changing the delay between sends to 3s or 9s, I no longer received any failures. We need to do more investigation, but I think you might be correct in that this is a situation where the connection is being terminated by the server or client at the perfectly wrong time.
👍 Thanks for confirming the issue - happy to hear it's not my specific set-up going crazy.
I verified this is a Go net/http Client or Transport issue. Gist: readTimeoutExample.go reproduces the issue outside of the SDK. The Client will not open a new connection when it doesn't know the server has closed its pooled connection before Transport.RoundTrip() writes the request to that pooled connection. I'll submit this issue upstream to the Go repo. In the meantime I suggest disabling keep-alive, or alternatively adding retry logic for requests that fail with the EOF error.

```go
// Minimal Client options to disable keep-alive.
client := &http.Client{
	Transport: &http.Transport{
		DisableKeepAlives: true,
	},
}
```
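As a sketch of the retry alternative (the helper name and usage below are illustrative, not part of the SDK): run the request, and if the error looks like the EOF described in this issue - the closed pooled connection surfacing as an error - retry it once on what will be a fresh connection.

```go
package main

import (
	"io"
	"strings"
)

// retryOnEOF runs the given request function and retries it once if the
// returned error looks like the connection was closed underneath us
// (the "EOF" case described in this issue).
func retryOnEOF(do func() error) error {
	err := do()
	if err == nil {
		return nil
	}
	if err == io.EOF || strings.Contains(err.Error(), "EOF") {
		// The pooled connection was likely closed by the server just as the
		// request was written; a single retry opens a new connection.
		return do()
	}
	return err
}
```

With a Kinesis client svc and a prepared input, usage would look like: err := retryOnEOF(func() error { _, e := svc.PutRecord(input); return e }).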
Cool, looks like the issue is already reported as golang/go#8946 and tagged with the Go1.5Maybe milestone. Though the current proposed change is limited to HEAD/GET methods at the moment because they are idempotent.
Thanks, we'll wait until it's fixed upstream then. |
Maybe fixed by #227? |
You're correct @dpiddy, that issue did allow retries to work correctly. Going to go ahead and close this issue. Thanks for the heads up.
I'm getting "EOF" errors (url.Error, Op=Post, URL=https://kinesis.ap-southeast-2.amazonaws.com/, Err="EOF") when doing repeated kinesis PutRecord requests. The data is not coming through.
This seems only to happen when I repeat the put requests every 6 or 7 seconds (with 7 seconds being more severe). I'm fairly certain it's not a throttling error (I'm the sole user of that stream, and I'm putting a single byte every 6-7 seconds), and the problem does not occur if I reduce the delay to 5s or less.
Also, when I started handling this error and retrying these EOF requests, I never had to retry more than once.
I have distilled the code on my side to reliably replicate this problem from an EC2 instance in us-west-1, putting into a stream in ap-southeast-2 (I have not experimented with other regions). The authentication is done through ~/.aws/credentials, but I have also used STS and temporary creds, and the behaviour does not change.
This error does not seem to happen when I make put-record requests using the aws cli tool from a bash script from the same EC2 box. I'm out of ideas :-) Help?
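For illustration, a minimal sketch of the kind of repro loop described above, written against the current aws-sdk-go v1 API (the SDK in this thread is older, so its constructor differs); the stream name, region, data, and interval are the ones mentioned in the thread:

```go
package main

import (
	"log"
	"time"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/kinesis"
)

func main() {
	// Credentials come from the usual chain (~/.aws/credentials, env, etc.).
	sess := session.Must(session.NewSession(&aws.Config{
		Region: aws.String("ap-southeast-2"),
	}))
	svc := kinesis.New(sess)

	// Put a single byte every 7 seconds; in this issue the EOF error shows
	// up within the first few iterations at a 6-7s interval.
	for i := 0; i < 20; i++ {
		_, err := svc.PutRecord(&kinesis.PutRecordInput{
			StreamName:   aws.String("playpen-metrics"),
			Data:         []byte("a"),
			PartitionKey: aws.String("1"),
		})
		if err != nil {
			log.Printf("put %d failed: %v", i, err)
		}
		time.Sleep(7 * time.Second)
	}
}
```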