Memory leak in ApolloServerPluginUsageReporting plugin #7639

Open
gabotechs opened this issue Jul 12, 2023 · 11 comments

@gabotechs

Issue Description

We've been running Apollo Server for a while in a couple of APIs, and we have consistently noticed a memory leak in both, which appears to be linearly proportional to the number of requests handled by each API.

While investigating the memory leak, V8 heap snapshots were taken from the running servers at two timestamps 6 hours apart. The later heap snapshot was compared against the earlier one to track which new objects were in the JS heap that were not there 6 hours before: there are thousands of newly retained Request-like objects that reference the "usage-reporting.api.apollographql.com" host, and hundreds of new TLSSocket objects that reference the same host.
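(For reference, one way to capture heap snapshots like these from a live Node process is the built-in v8 module; the following is only a sketch of that approach, not necessarily the exact tooling used for this report.)

```ts
import { writeHeapSnapshot } from 'node:v8';

// Sketch: dump a heap snapshot on demand (here on SIGUSR2). Taking two
// snapshots a few hours apart and diffing them in Chrome DevTools shows
// which objects were retained in between.
process.on('SIGUSR2', () => {
  const file = writeHeapSnapshot(); // writes a .heapsnapshot file to the CWD
  console.log(`heap snapshot written to ${file}`);
});
```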

Some objects that are leaking in the JS memory:

Request-like object
body::Object@13534193
cache::"default"@729
client::Object@13537293
credentials::"same-origin"@54437
cryptoGraphicsNonceMetadata::""@77
destination::""@77
done::system / Oddball@73
headersList::HeadersList@13537317
historyNavigation::system / Oddball@75
initiator::""@77
integrity::""@77
keepalive::system / Oddball@75
localURLsOnly::system / Oddball@75
map::system / Map@130579
method::"POST"@49427
mode::"cors"@84517
origin::system / Oddball@67
parserMetadata::""@77
policyContainer::Object@13537295
preventNoCacheCacheControlHeaderModification::system / Oddball@75
priority::system / Oddball@71
properties::system / PropertyArray@13537319
redirect::"follow"@53093
referrer::"no-referrer"@85507
referrerPolicy::system / Oddball@67
reloadNavigation::system / Oddball@75
replacesClientId::""@77
reservedClient::system / Oddball@71
responseTainting::"basic"@102749
serviceWorkers::"none"@519
taintedOrigin::system / Oddball@75
timingAllowFailed::system / Oddball@75
unsafeRequest::system / Oddball@75
url::URL@13537301
<symbol context>::URLContext@13538143
fragment::system / Oddball@71
host::"usage-reporting.api.apollographql.com"@13538145
map::system / Map@135759
path::Array@13538147
port::system / Oddball@71
query::system / Oddball@71
scheme::"https:"@6945
username::""@77
__proto__::Object@135757
<symbol query>::URLSearchParams@13538149
map::system / Map@135741
__proto__::URL@135739
urlList::Array@13537299
useCORSPreflightFlag::system / Oddball@75
useCredentials::system / Oddball@75
userActivation::system / Oddball@75
window::"no-window"@87117
__proto__
TLSSocket object
<symbol blocking>::system / Oddball@75
<symbol client>::Client@131765
<symbol connect-options>::Object@13536139
<symbol error>::InformationalError@13536143
<symbol kBuffer>::system / Oddball@71
<symbol kBufferCb>::system / Oddball@71
<symbol kBufferGen>::system / Oddball@71
<symbol kCapture>::system / Oddball@75
<symbol kHandle>::system / Oddball@71
<symbol kSetKeepAlive>::system / Oddball@75
<symbol kSetNoDelay>::system / Oddball@73
<symbol maxRequestsPerClient>::system / Oddball@67
<symbol no ref>::system / Oddball@73
<symbol parser>::system / Oddball@71
<symbol pendingSession>::system / Oddball@71
<symbol res>::system / Oddball@71
<symbol reset>::system / Oddball@75
<symbol timeout>::system / Oddball@71
<symbol verified>::system / Oddball@73
<symbol writing>::system / Oddball@75
_SNICallback::system / Oddball@71
_closeAfterHandlingError::system / Oddball@75
_controlReleased::system / Oddball@73
_events::Object@13536133
_hadError::system / Oddball@75
_host::"usage-reporting.api.apollographql.com"@131813
_maxListeners::system / Oddball@67
_newSessionPending::system / Oddball@75
_parent::system / Oddball@71
_peername::Object@13536141
_pendingData::system / Oddball@71
_pendingEncoding::""@77
_readableState::ReadableState@13536135
_rejectUnauthorized::system / Oddball@73
_requestCert::system / Oddball@73
_secureEstablished::system / Oddball@73
_securePending::system / Oddball@75
_server::system / Oddball@71
_sockname::system / Oddball@71
_tlsOptions::Object@13536129
_writableState::WritableState@13536137
allowHalfOpen::system / Oddball@75
alpnProtocol::system / Oddball@75
authorizationError::system / Oddball@71
authorized::system / Oddball@73
connecting::system / Oddball@75
domain::system / Oddball@71
encrypted::system / Oddball@73
map::system / Map@130053
properties::system / PropertyArray@13536145
secureConnecting::system / Oddball@75
server::system / Oddball@67
servername::"usage-reporting.api.apollographql.com"@13536131
ssl::system / Oddball@71
__proto__::Socket@147607

Here is a chart showing the memory usage of the last two days for one of the APIs:

[Screenshot: memory usage chart, 2023-07-12 09:17:21]

During the first half of the chart (the first day), the Apollo server was running with ApolloServerPluginUsageReporting enabled, and memory kept increasing linearly. During the second half (the second day), exactly the same code was running, but with ApolloServerPluginUsageReportingDisabled passed in the plugins list so that usage reporting was disabled. In that case no memory was leaked.

We are using @apollo/server version 4.3.0.
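For reference, disabling the reporting as described above can be done by passing the disabled plugin explicitly; this is a sketch (the schema below is a placeholder), not the exact code from the affected APIs:

```ts
import { ApolloServer } from '@apollo/server';
import { ApolloServerPluginUsageReportingDisabled } from '@apollo/server/plugin/disabled';

// Placeholder schema so the sketch is self-contained.
const typeDefs = `#graphql
  type Query { hello: String }
`;
const resolvers = { Query: { hello: () => 'world' } };

// Same server code, but with usage reporting explicitly disabled even if
// APOLLO_KEY / APOLLO_GRAPH_REF are set in the environment.
const server = new ApolloServer({
  typeDefs,
  resolvers,
  plugins: [ApolloServerPluginUsageReportingDisabled()],
});
```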

Link to Reproduction

https://github.com/GabrielMusatMestre/apollo-server-memory-leak-repro

Reproduction Steps

Steps are described in the README.md of the reproduction repo.

This is not a fully reliable reproduction: the memory leak may only become noticeable after running the server under heavy load for hours or days, and it needs a properly configured APOLLO_KEY and APOLLO_GRAPH_REF that will actually publish usage reports to Apollo.
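For anyone trying the repro, usage reporting is driven by the APOLLO_KEY and APOLLO_GRAPH_REF environment variables; the sketch below shows roughly how the reporting path is exercised (the repo's actual setup may differ, and the schema here is a placeholder):

```ts
import { ApolloServer } from '@apollo/server';
import { startStandaloneServer } from '@apollo/server/standalone';
import { ApolloServerPluginUsageReporting } from '@apollo/server/plugin/usageReporting';

// Placeholder schema; the reproduction repo's real schema will differ.
const typeDefs = `#graphql
  type Query { ping: String }
`;
const resolvers = { Query: { ping: () => 'pong' } };

// Usage reporting is enabled automatically when APOLLO_KEY and APOLLO_GRAPH_REF
// are set; configuring the plugin explicitly makes that visible. Sending
// reports immediately exercises the reporting HTTP path on every request.
const server = new ApolloServer({
  typeDefs,
  resolvers,
  plugins: [ApolloServerPluginUsageReporting({ sendReportsImmediately: true })],
});

const { url } = await startStandaloneServer(server, { listen: { port: 4000 } });
console.log(`server ready at ${url}`);
```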

@trevor-scheer trevor-scheer self-assigned this Jul 21, 2023
@trevor-scheer
Member

Thanks for the great report, sorry it's taken me a week to get to this. I'll have a closer look tomorrow!

@trevor-scheer
Member

@gabotechs have you had a chance to look at my PR and try out the built packages?

@juancarlosjr97
Contributor

Any progress on this issue? We have a memory leak too and are trying to identify where it is coming from, but we have disabled the reporting and still see a memory spike.

@trevor-scheer
Member

@juancarlosjr97 are you able to create a minimal reproduction? It's certainly interesting that you see the issue with reporting disabled; the two may be unrelated.

@juancarlosjr97
Contributor

juancarlosjr97 commented Oct 3, 2023

I will create one @trevor-scheer and share it. The issue has been narrowed down to many unique operations for the same query.

query GetBestSellers($category1: ProductCategory) {
  bestSellers(category: $category1) {
    title
  }
}
query GetBestSellers($category2: ProductCategory) {
  bestSellers(category: $category2) {
    title
  }
}

The $category1 and $category2 variables could have the same value, but the two documents are identified as two different operations. We reverted yesterday to using the same operation and the memory increase has stopped. However, the memory that was already consumed has not been released (nearly 12 hours after the change), which might point to another garbage-collection issue.
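As a quick illustration (using plain graphql-js normalization rather than Apollo's exact signature algorithm), renaming the variable changes the normalized document text, which is why the two queries end up counted as distinct operations:

```ts
import { parse, print } from 'graphql';

// Two documents that differ only in the variable name.
const a = `query GetBestSellers($category1: ProductCategory) {
  bestSellers(category: $category1) { title }
}`;
const b = `query GetBestSellers($category2: ProductCategory) {
  bestSellers(category: $category2) { title }
}`;

// Normalizing formatting does not make them equal: variable names are part
// of the document, so they are keyed as different operations.
console.log(print(parse(a)) === print(parse(b))); // false
```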

We have tested this behaviour on Apollo Server v3 and v4, with all plugins disabled, and see the same behaviour.

Also, profiling revealed that what is taking the most memory and continuously growing is the graphql library, specifically node_modules/graphql/language.

This might be a separate issue, and if it is, I will raise one for it if it has not already been raised internally by Apollo by then.

We have raised this internally with Apollo and the reference number is 9623.

@trevor-scheer
Member

@juancarlosjr97 given what you've narrowed it down to so far this does seem unrelated, so a separate issue would be preferred.

How sure are you that "for the same query" is relevant? Have you ruled out just "many unique operations" by itself? The "same query excluding variables" part would be a surprising twist to the issue.

The graphql/language lead points us to parsing and printing in graphql-js, so it would be interesting to run the set of queries that make up your reproduction against just the graphql-js parser directly and see if you still see the same GC / memory-usage growth over time.
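A sketch of that kind of direct parser test (the generated queries below are only a stand-in for the real reproduction set):

```ts
import { parse } from 'graphql';

// Parse many unique operations and watch heap usage grow (or not) over time.
function uniqueQuery(i: number): string {
  return `query GetBestSellers($category${i}: ProductCategory) {
    bestSellers(category: $category${i}) { title }
  }`;
}

for (let i = 0; i < 1_000_000; i++) {
  parse(uniqueQuery(i));
  if (i % 100_000 === 0) {
    const mb = (process.memoryUsage().heapUsed / 1024 / 1024).toFixed(1);
    console.log(`parsed ${i} documents, heapUsed ~ ${mb} MB`);
  }
}
```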

@glasser
Member

glasser commented Oct 3, 2023

@trevor-scheer If they're saying that they change the name of the variable ($category1 vs $category2) then I think most of our systems (including usage reporting) will consider them as distinct operations.

@trevor-scheer
Member

@glasser right, I'm asking for clarification on exactly that. Like I don't think that it matters that the operation is entirely the same minus the variable name. It sounds more like a "many different operations" problem, more generally. I would be surprised if the issue was limited to that specific nuance.

@glasser
Member

glasser commented Oct 3, 2023

Right, I agree that the problem is likely "many different operations" — just that it might not be obvious that we treat those operations as distinct.

@juancarlosjr97
Contributor

Thank you for the replies @trevor-scheer and @glasser. We can confirm at this point that memory has been steady since the consumer changed the queries so that they are identified as the same operation.

I will work on reproducing the issue, and once I have done so I will raise another issue with all the details and a repository to clone to replicate the bug, since the memory not being released is the core problem.

@juancarlosjr97
Contributor

@trevor-scheer and @glasser I created a project with instructions that demonstrates the memory leak issue: https://github.com/juancarlosjr97/apollo-graphql-federation-memory-leak. I will raise another issue with all the details.
