[BUG]: runtime.node.event_loop.delay.avg metric disappeared #5389

Open
EQuincerot opened this issue Mar 11, 2025 · 8 comments
Labels
bug Something isn't working

Comments

@EQuincerot

EQuincerot commented Mar 11, 2025

Tracer Version(s)

5.41.1

Node.js Version(s)

v22.14.0

Bug Report

Since v5.41.1, the metric runtime.node.event_loop.delay.avg has disappeared from our dashboard.

The metric comes back when we switch back to v5.41.0.

Reproduction Code

  • monitor a node process with dd-trace-js
  • add runtime.node.event_loop.delay.avg metric to a dashboard
  • with dd-trace-js v5.41.0, the metric is visible in the dashboard
  • upgrade dd-trace-js to 5.41.1, the metric is no longer sent
  • downgrade dd-trace-js to 5.41.0, the metric is sent again.
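
To rule out the process itself when following the steps above, event loop delay can be sampled locally with Node's built-in `perf_hooks.monitorEventLoopDelay`. This is only a minimal sketch: the helper names `nsToMs` and `sampleEventLoopDelay` are hypothetical, and nothing in this thread confirms that dd-trace-js computes runtime.node.event_loop.delay.avg this way.

```javascript
const { monitorEventLoopDelay } = require('node:perf_hooks');

// The histogram reports nanoseconds; convert to milliseconds for readability.
function nsToMs(ns) {
  return ns / 1e6;
}

// Hypothetical helper: sample event loop delay for `durationMs` and resolve
// with the mean and max delay observed, both in milliseconds.
function sampleEventLoopDelay(durationMs = 1000) {
  const histogram = monitorEventLoopDelay({ resolution: 10 });
  histogram.enable();
  return new Promise((resolve) => {
    setTimeout(() => {
      histogram.disable();
      resolve({
        avgMs: nsToMs(histogram.mean),
        maxMs: nsToMs(histogram.max),
      });
    }, durationMs);
  });
}
```

Calling `sampleEventLoopDelay(1000).then(console.log)` in the monitored process prints a local reading that can be compared with what the dashboard shows on 5.41.0.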

Error Logs

No response

Tracer Config

No response

Operating System

No response

Bundling

Unsure

EQuincerot added the bug label Mar 11, 2025

@Grmiade

Grmiade commented Mar 11, 2025

We have the same issue on our side.
We encounter some errors on the agent side when we try to send runtime metrics, like:

2025-03-11 13:49:51 UTC | CORE | ERROR | (comp/dogstatsd/server/server.go:623 in errLog) | Dogstatsd: error parsing metric message '"runtime.node.event_loop.delay.total:0[object Object]|c|#service:<service_name>,version:<version>"': could not parse dogstatsd metric values: strconv.ParseFloat: parsing "0[object Object]": invalid syntax

This also creates a memory leak in our services. Probably because the metrics are temporarily stored in memory for as long as they are not sent? With DD_RUNTIME_METRICS_ENABLED disabled, the memory leak is gone.

Could it be related to this recent change #5347?
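
For context on the agent error quoted above: the value `0[object Object]` is what JavaScript produces when a number and a plain object are combined with `+`. The sketch below is an illustration only, not dd-trace-js code, and it does not claim to show the actual cause in the tracer. `parseDogstatsdLine` is a hypothetical helper that roughly mirrors the agent's numeric check on the `name:value|type|#tags` line format.

```javascript
// Illustration only: this is NOT dd-trace-js code.
// `+` with a number and a plain object coerces both operands to strings.
const total = 0 + {}; // "0[object Object]"

// Hypothetical helper: split a DogStatsD line and reject non-numeric values,
// similar in spirit to the strconv.ParseFloat failure in the agent log.
function parseDogstatsdLine(line) {
  const colon = line.indexOf(':');
  const pipe = line.indexOf('|', colon + 1);
  if (colon < 1 || pipe < 0) {
    throw new Error(`malformed line: ${line}`);
  }
  const name = line.slice(0, colon);
  const rawValue = line.slice(colon + 1, pipe);
  const value = Number(rawValue);
  if (rawValue.trim() === '' || !Number.isFinite(value)) {
    throw new Error(`invalid metric value "${rawValue}" for ${name}`);
  }
  // Remaining fields: metric type first, then optional "#tag1,tag2" field.
  const [type, ...rest] = line.slice(pipe + 1).split('|');
  const tagField = rest.find((field) => field.startsWith('#'));
  const tags = tagField ? tagField.slice(1).split(',') : [];
  return { name, value, type, tags };
}
```

Feeding the line from the log into `parseDogstatsdLine` throws on the value field, while a line with a plain numeric value parses normally.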

@MMShep97

MMShep97 commented Mar 11, 2025

I filed a ticket with their support staff; glad to see you might be experiencing the same thing @Grmiade (mem leak). I will link this thread there as well.

Don't think people can view this, but maybe for the devs: https://help.datadoghq.com/hc/requests/2068787

@EQuincerot
Author

We also had a memory leak recently. I suppose it's similar to #5360, but in my case I don't have evidence that the memory leak comes from Datadog.

@yo-shun-kou

Tracer Version(s)
5.41.1

Node.js Version(s)
18.20.7

We also have a memory leak and are getting a lot of error messages related to Datadog.

@Joosakur

Joosakur commented Mar 12, 2025

We also have the same error and the memory leak with 5.41.1. We had to revert to 5.40.0.

@EQuincerot
Author

For us, it seems the memory leak arrived with 5.41.0 and the metric issue with 5.41.1.
Can you confirm?

@danopia

danopia commented Mar 12, 2025

I'm seeing that 5.41.0 has some (possibly tolerable) memory impact, while 5.41.1 introduces excessive memory impact and also CPU impact requiring a rollback. Seems like there are two different issues at play.

@tommoor
Contributor

tommoor commented Mar 13, 2025

Image

Memory usage after deploying 5.41.0 on our servers… honestly this version should be removed as it's a liability. Just glad I was able to find this thread

7 participants