Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

XXX characters being prepended to metrics when sent via UDP #241

Closed
petecheslock opened this issue Dec 19, 2016 · 5 comments
Closed

XXX characters being prepended to metrics when sent via UDP #241

petecheslock opened this issue Dec 19, 2016 · 5 comments

Comments

@petecheslock
Copy link

I've been seeing an odd issue when running the v2.3 version of carbon c relay.
If you send a metric to carbon-c-relay via UDP like:

echo "derp.herp 1 `date +%s`" | nc -u -w2 -q0  <carbon-c-host> <port>

the metric will land on the carbon cache as:

XXXXderp.herp 1 1482181602

If you send the same metric but as TCP:

echo "derp.herp 1 `date +%s`" | nc -w2  stats.dev.threatstack.net 2003 

Then the metric lands correctly

derp.herp 1 1482181752

I've done a packet capture on the carbon-c-relay port as well as the carbon cache port to see that when the metric hits the relay its fine, but when it hits the cache its prepended with XXX's What's interesting is some hosts send only a single X, others will send multiples like:

drwxr-xr-x  35 graphite graphite  4096 Dec 19 18:07 XXderp
drwxr-xr-x  15 graphite graphite  4096 Dec 19 20:18 XXXderp
drwxr-xr-x  75 graphite graphite  4096 Dec 19 19:52 XXXXderp
drwxr-xr-x   8 graphite graphite  4096 Dec 19 01:43 XXXXXderp
drwxr-xr-x  20 graphite graphite  4096 Dec 19 17:52 XXXXXXderp
drwxr-xr-x  10 graphite graphite  4096 Dec 19 18:07 XXXXXXXderp
drwxr-xr-x   5 graphite graphite  4096 Dec 19 01:43 XXXXXXXXderp
drwxr-xr-x   4 graphite graphite  4096 Dec 19 17:52 XXXXXXXXXderp
drwxr-xr-x  10 graphite graphite  4096 Dec 19 17:52 XXXXXXXXXXderp
drwxr-xr-x   7 graphite graphite  4096 Dec 19 17:52 XXXXXXXXXXXderp
drwxr-xr-x  11 graphite graphite  4096 Dec 19 20:24 XXXXXXXXXXXXderp
drwxr-xr-x  25 graphite graphite  4096 Dec 19 20:18 XXXXXXXXXXXXXderp
drwxr-xr-x   4 graphite graphite  4096 Dec 19 01:21 XXXXXXXXXXXXXXderp
drwxr-xr-x   4 graphite graphite  4096 Dec 19 00:51 XXXXXXXXXXXXXXXXderp
drwxr-xr-x   3 graphite graphite  4096 Dec 16 22:00 XXXXXXXXXXXXXXXXXXXXXXXXXXXXderp

Running a strace on the process we see the following.

[pid  5550] read(5, "X", 8188)          = 1
[pid  5550] read(6, 0x7ffbd9bc9475, 8191) = -1 EAGAIN (Resource temporarily unavailable)
[pid  5550] read(5, "derpderpderpderpderp.herp 1 1482"..., 8187) = 39
[pid  5550] read(6, 0x7ffbd9bc9475, 8191) = -1 EAGAIN (Resource temporarily unavailable)
[pid  5550] read(5, 0x7ffbd9bc502d, 8191) = -1 EAGAIN (Resource temporarily unavailable)
[pid  5550] read(6, 0x7ffbd9bc9475, 8191) = -1 EAGAIN (Resource temporarily unavailable)
[pid  5554] socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 8
[pid  5554] fcntl(8, F_GETFL)           = 0x2 (flags O_RDWR)
[pid  5554] fcntl(8, F_SETFL, O_RDWR|O_NONBLOCK) = 0
[pid  5554] connect(8, {sa_family=AF_INET, sin_port=htons(2023), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
[pid  5554] poll([{fd=8, events=POLLIN|POLLOUT}], 1, 675) = 1 ([{fd=8, revents=POLLOUT}])
[pid  5554] fcntl(8, F_SETFL, O_RDWR)   = 0
[pid  5554] setsockopt(8, SOL_TCP, TCP_NODELAY, [1], 4) = 0
[pid  5554] setsockopt(8, SOL_SOCKET, SO_SNDTIMEO, "\n\0\0\0\0\0\0\0\260\255\1\0\0\0\0\0", 16) = 0
[pid  5554] write(8, "XXXXderpderpderpderpderp.herp 1 "..., 43) = 43
@petecheslock
Copy link
Author

We've done some more investigating and it appears that netcat might be sending X's as a part of the stream, but only when you run nc -v which you have to do otherwise the nc timeout for UDP is not respected if the upstream server is not responding.

Anyway, I'll close this. Its clearly a bug in bsd nc on linux

@grobian
Copy link
Owner

grobian commented Dec 20, 2016

Ah. netcat has too many implementations that aren't finished, and too many incarnations by distributions to finish those implementations. I feel your pain. Let me know if you suspect the relay might play a role in this.

@petecheslock
Copy link
Author

Interestingly enough - it seems that carbon-relay (python version) might ignore or strip these X's, at least we didn't see them come in with them until we switched over to carbon-c-relay. We did find this glorious bit of code that shows netcat will test the UDP port first before sending data. https://github.com/openbsd/src/blob/master/usr.bin/nc/netcat.c#L1385-L1402

We ended up changing and using the traditional nc and NOT the openbsd variant as this was the 2nd UDP related bug we found with it.

@grobian
Copy link
Owner

grobian commented Dec 20, 2016

That's really odd behaviour! I'm not sure I understand how carbon-relay could skip/ignore those. Perhaps they are supposed to be delivered as single package each, and hence discarded for being invalid. I wonder in that case why carbon-c-relay sees it as continued messages. Perhaps there's a bug in the handling (and proper termination) of UDP packages. I'll have a look.

grobian added a commit that referenced this issue Dec 25, 2016
As reported in issue #241, the relay was concatenating UDP packets until
it found a newline to terminate a line.  Since UDP packets don't
necessarily arrive (or in order), one can't rely on sending partial
fragments.  So instead, make sure we treat each received UDP packet as
a full message (and all we're going to get).
grobian added a commit that referenced this issue Dec 25, 2016
@grobian
Copy link
Owner

grobian commented Dec 25, 2016

Just for the record, I've fixed the X-probings to prepended to the metrics.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants