-
Notifications
You must be signed in to change notification settings - Fork 979
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix: fix potential memory leak in getDirectQueryConnections with node 20 #9575
Fix: fix potential memory leak in getDirectQueryConnections with node 20 #9575
Conversation
❌ Empty Changelog SectionThe Changelog section in your PR description is empty. Please add a valid changelog entry or entries. If you did add a changelog entry, check to make sure that it was not accidentally included inside the comment block in the Changelog section. |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #9575 +/- ##
==========================================
- Coverage 61.82% 61.81% -0.02%
==========================================
Files 3825 3825
Lines 92058 92058
Branches 14602 14602
==========================================
- Hits 56916 56902 -14
- Misses 31469 31482 +13
- Partials 3673 3674 +1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
src/plugins/data_source_management/server/routes/data_connections_router.ts
Outdated
Show resolved
Hide resolved
for await (const datasource of dataSources) { | ||
await getDirectQueryConnections(datasource.id, http!) | ||
.then((connections) => directQueryConnections.push(...connections)) | ||
.catch(() => directQueryConnections.push([])); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to swallow exceptions here?
|
||
for await (const datasource of dataSources) { | ||
await getDirectQueryConnections(datasource.id, http!) | ||
.then((connections) => directQueryConnections.push(...connections)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this removes the parallel execution, would it cause perf issues?
@@ -222,7 +222,9 @@ export function registerDataConnectionsRoute(router: IRouter, dataSourceEnabled: | |||
let dataConnectionsresponse; | |||
if (dataSourceEnabled && dataSourceMDSId) { | |||
const client = await context.dataSource.opensearch.legacy.getClient(dataSourceMDSId); | |||
dataConnectionsresponse = await client.callAPI('ppl.getDataConnections'); | |||
dataConnectionsresponse = await client.callAPI('ppl.getDataConnections', { | |||
requestTimeout: 2000, // Enforce timeout to avoid hanging requests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a normal opensearch call, do you know why it's timing out? is it when cluster is deleted or not reachable but the MDS reference still exists? also 2 seconds might be too small
const directQueryConnections = directQueryConnectionsResult.flat(); | ||
const directQueryConnections: DataSourceTableItem[] = []; | ||
|
||
for await (const datasource of dataSources) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
call out: are we not changing this from parallel requests to synchronously making requests? Does this impact the experience at all?
for await (const datasource of dataSources) { | ||
await getDirectQueryConnections(datasource.id, http!) | ||
.then((connections) => directQueryConnections.push(...connections)) | ||
.catch(() => directQueryConnections.push([])); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
before, empty arrays were flattened out - is this handled downstream?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually I just realized by applying correct timeout alone can fix the issue. If Promise timeout faster, it won't pile up memory with Promise.all and cause memory issue. I just reverted the change to keep original Promise.all() for sake of performance. Also, 5s timeout for the specific calls on DSM table page is reasonable, as it's just a call to verify if the endpoint is reachable, no need to wait for the default 30s timeout to call with a single datasource endpoint
cc: @joshuali925 @huyaboo
Signed-off-by: Zhongnan Su <szhongna@amazon.com>
705747f
to
dbf287e
Compare
@@ -41,7 +41,7 @@ export const configSchema = schema.object({ | |||
), | |||
}), | |||
clientPool: schema.object({ | |||
size: schema.number({ defaultValue: 5 }), | |||
size: schema.number({ defaultValue: 10 }), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this value means we keep 10 active connections to opensearch cluster?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
at most 10, it has mechanism to clean up stale ones
Description
fix potential memory leak in getDirectQueryConnections with node 20
5s
for the legacy client call onppl.dataSource
10
. Note that if single page is loading with 10+ datasources, this could still lead to server crash due to a deprecatedparseUrl
function inurl
is used in elasticsearch legacy library. However, to address this, we will need to deprecate usage of elasticsearch client, which is not currently in scope of 3.0 release.Issues Resolved
#9459
Screenshot
Testing the changes
Changelog
Check List
yarn test:jest
yarn test:jest_integration