Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add span events as a top level field for v0.4 encoding #5229

Merged
merged 38 commits into from
Mar 18, 2025

Conversation

wconti27
Copy link
Contributor

@wconti27 wconti27 commented Feb 7, 2025

What does this PR do?

Formatting Changes in this PR:

  • Changes variable naming in format.js to be less confusing

Actual Changes:

  • Updates format.js to format span events as a top-level field all the time.
  • Updates encoder classes to format span events given the encoder type. v0.5 encodes the events within the tags. V0.4 encodes the events within the tags if DD_TRACE_NATIVE_SPAN_EVENTS=true, else it encodes the span events within tags.
  • Adds DD_TRACE_NATIVE_SPAN_EVENTS env var that defaults to false

Motivation

Matches Otel Implementation. Being done in parallel across languages.

Plugin Checklist

Additional Notes

@wconti27 wconti27 requested a review from a team as a code owner February 7, 2025 18:54
@wconti27 wconti27 marked this pull request as draft February 7, 2025 18:54
@wconti27 wconti27 self-assigned this Feb 7, 2025
Copy link

github-actions bot commented Feb 7, 2025

Overall package size

Self size: 8.98 MB
Deduped: 101.51 MB
No deduping: 102.02 MB

Dependency sizes | name | version | self size | total size | |------|---------|-----------|------------| | @datadog/libdatadog | 0.5.0 | 29.83 MB | 29.83 MB | | @datadog/native-appsec | 8.5.0 | 19.26 MB | 19.26 MB | | @datadog/native-iast-taint-tracking | 3.3.0 | 13.77 MB | 13.78 MB | | @datadog/pprof | 5.5.1 | 9.79 MB | 10.17 MB | | @opentelemetry/core | 1.30.1 | 908.66 kB | 7.16 MB | | protobufjs | 7.4.0 | 2.77 MB | 5.42 MB | | @datadog/native-iast-rewriter | 2.8.0 | 2.6 MB | 2.74 MB | | @datadog/native-metrics | 3.1.0 | 1.06 MB | 1.46 MB | | @opentelemetry/api | 1.8.0 | 1.21 MB | 1.21 MB | | import-in-the-middle | 1.13.1 | 117.64 kB | 839.26 kB | | source-map | 0.7.4 | 226 kB | 226 kB | | opentracing | 0.14.7 | 194.81 kB | 194.81 kB | | lru-cache | 7.18.3 | 133.92 kB | 133.92 kB | | pprof-format | 2.1.0 | 111.69 kB | 111.69 kB | | @datadog/sketches-js | 2.1.1 | 109.9 kB | 109.9 kB | | lodash.sortby | 4.7.0 | 75.76 kB | 75.76 kB | | ignore | 5.3.2 | 53.63 kB | 53.63 kB | | istanbul-lib-coverage | 3.2.0 | 29.34 kB | 29.34 kB | | rfdc | 1.4.1 | 27.15 kB | 27.15 kB | | @isaacs/ttlcache | 1.4.1 | 25.2 kB | 25.2 kB | | tlhunter-sorted-set | 0.1.0 | 24.94 kB | 24.94 kB | | dc-polyfill | 0.1.6 | 24.56 kB | 24.56 kB | | shell-quote | 1.8.2 | 23.54 kB | 23.54 kB | | limiter | 1.1.5 | 23.17 kB | 23.17 kB | | retry | 0.13.1 | 18.85 kB | 18.85 kB | | semifies | 1.0.0 | 15.84 kB | 15.84 kB | | jest-docblock | 29.7.0 | 8.99 kB | 12.76 kB | | crypto-randomuuid | 1.0.0 | 11.18 kB | 11.18 kB | | ttl-set | 1.0.0 | 4.61 kB | 9.69 kB | | path-to-regexp | 0.1.12 | 6.6 kB | 6.6 kB | | koalas | 1.0.2 | 6.47 kB | 6.47 kB | | module-details-from-path | 1.0.3 | 4.47 kB | 4.47 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe

Copy link

codecov bot commented Feb 7, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 79.52%. Comparing base (28eaa41) to head (d4e9fbe).
Report is 2 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5229      +/-   ##
==========================================
+ Coverage   79.47%   79.52%   +0.05%     
==========================================
  Files         509      510       +1     
  Lines       22811    22867      +56     
==========================================
+ Hits        18128    18184      +56     
  Misses       4683     4683              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@datadog-datadog-prod-us1
Copy link

datadog-datadog-prod-us1 bot commented Feb 7, 2025

Datadog Report

Branch report: conti/serialize_span_events_as_top_level_field
Commit report: c66ae3c
Test service: dd-trace-js-integration-tests

✅ 0 Failed, 803 Passed, 0 Skipped, 17m 4.86s Total Time

Copy link
Member

@rochdev rochdev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking the /info endpoint doesn't work because it needs to be done on a per-connection basis since each request could otherwise end up on a different agent with a different version and different capabilities, and by that point it's too late to make the decision in most cases because the data has already been generated/formatted. Because of that, the only way this can be implemented for 0.4 is with a tag, and if we want a more efficient data model it needs to be on a newer protocol version that is configured manually by the user.

@marcotc marcotc requested a review from tlhunter February 7, 2025 21:26
@tlhunter
Copy link
Member

@rochdev do you mean that a customer might setup a round robin proxy in front of multiple agents of different versions?

Do we officially support swapping out agent versions on this fly like this?

@pr-commenter
Copy link

pr-commenter bot commented Feb 10, 2025

Benchmarks

Benchmark execution time: 2025-03-18 14:14:27

Comparing candidate commit fce4d92 in PR branch conti/serialize_span_events_as_top_level_field with baseline commit 4f0481e in branch master.

Found 0 performance improvements and 2 performance regressions! Performance is the same for 912 metrics, 19 unstable metrics.

scenario:startup-with-tracer-22

  • 🟥 cpu_user_time [+12.698ms; +19.566ms] or [+7.496%; +11.551%]
  • 🟥 execution_time [+18.232ms; +19.690ms] or [+8.866%; +9.575%]

@rochdev
Copy link
Member

rochdev commented Feb 10, 2025

do you mean that a customer might setup a round robin proxy in front of multiple agents of different versions?

That's one of the potential issues, but there are other problematic scenarios as well. It's not as rare as you'd think, and when we tried implementing the /info endpoint years ago we literally hit the issue in less than 24h.

Do we officially support swapping out agent versions on this fly like this?

We do, depending on your definition of "officially".

@tlhunter
Copy link
Member

This seems like such a weird edge case that we shouldn't worry about. It would be akin to a user leaving an application running that communicates with Postgres 14, then downgrading that database to Postgres 13, and hoping that the connection re-establishes and the app works without error. As a software developer I would never assume that scenario would work. If anything it's fantastic that we allow the in-place upgrade.

@rochdev
Copy link
Member

rochdev commented Feb 12, 2025

It's not a weird edge case, and as mentioned above, we hit an issue within 24h of trying to implement the /info endpoint. Doing this right requires a lot of complexity that doesn't even handle every use case properly, whereas alternatives are much simpler and will work 100% of the time, so I don't see why we should go with the subpar option.

@wconti27
Copy link
Contributor Author

wconti27 commented Mar 3, 2025

Checking the /info endpoint doesn't work because it needs to be done on a per-connection basis since each request could otherwise end up on a different agent with a different version and different capabilities, and by that point it's too late to make the decision in most cases because the data has already been generated/formatted. Because of that, the only way this can be implemented for 0.4 is with a tag, and if we want a more efficient data model it needs to be on a newer protocol version that is configured manually by the user.

@rochdev AFAIK this was settled with @marcotc, and the decision was that as long as there's a config option allowing us to turn this feature off, all is well. Can you re-review, I added the configuration so now this feature can be turned off if desired.

@wconti27 wconti27 requested review from tlhunter and rochdev March 5, 2025 20:40
@wconti27 wconti27 requested a review from rochdev March 17, 2025 18:03
Copy link
Member

@rochdev rochdev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor things but otherwise LGTM.

@wconti27 wconti27 requested a review from rochdev March 18, 2025 14:26
@wconti27 wconti27 merged commit 0d3e671 into master Mar 18, 2025
422 checks passed
@wconti27 wconti27 deleted the conti/serialize_span_events_as_top_level_field branch March 18, 2025 15:49
juan-fernandez pushed a commit that referenced this pull request Mar 19, 2025
* adds span events at top level for v0.4 encoding

* only serializes span events as top level if DD_TRACE_NATIVE_SPAN_EVENTS=true

Co-authored-by: Thomas Hunter II <tlhunter@datadog.com>
@juan-fernandez juan-fernandez mentioned this pull request Mar 19, 2025
juan-fernandez pushed a commit that referenced this pull request Mar 19, 2025
* adds span events at top level for v0.4 encoding

* only serializes span events as top level if DD_TRACE_NATIVE_SPAN_EVENTS=true

Co-authored-by: Thomas Hunter II <tlhunter@datadog.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants