Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ALS-4617] Performance improvements for GIC #71

Merged
merged 1 commit into from
Jul 24, 2023
Merged

Conversation

Luke-Sikina
Copy link
Member

  • Made QueryProcessor filter columns that DNE
  • Paralellized QueryProcessor::runQuery
  • Switch sout logging to standard slf4j logging
    • Faster
    • No blocking threads due to synchronized methods

@Luke-Sikina Luke-Sikina requested a review from ramari16 July 20, 2023 19:56
@@ -180,7 +184,7 @@ public String toString() {
break;
default:
//no logic here; all enum values should be present above
System.out.println("Formatting not supported for type " + expectedResultType);
log.warn("Formatting not supported for type {}", expectedResultType);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was showing up in the logs, so I switched it to a proper log statement.

@@ -627,7 +627,7 @@ public PhenoCube<?> load(String key) throws Exception {
inStream.close();
return ret;
}else {
System.out.println("ColumnMeta not found for : [" + key + "]");
log.warn("ColumnMeta not found for : [{}]", key);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This print statement was far more problematic. This was happening 1000s of times for some GIC queries. Because sout is synchronized, this would block all the worker threads in the fork join pool, grinding HPDS to a halt. This in turn caused timeouts for simpler requests, effectively breaking everything.

Comment on lines -67 to +74
int columnCount = paths.size() + 1;

List<ColumnMeta> columns = paths.stream()
List<ColumnMeta> columns = query.getFields().stream()
.map(abstractProcessor.getDictionary()::get)
.filter(Objects::nonNull)
.collect(Collectors.toList());
List<String> paths = columns.stream()
.map(ColumnMeta::getName)
.collect(Collectors.toList());
int columnCount = paths.size() + 1;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We weren't accounting for non-normalized concept paths, which GIC has in spades. This filters out concept paths not present in this institution prior to the query.

- Made QueryProcessor filter columns that DNE
- Paralellized QueryProcessor::runQuery
- Switch `sout` logging to standard slf4j logging
  - Faster
  - No blocking threads due to synchronized methods
@Luke-Sikina Luke-Sikina merged commit 28ab343 into master Jul 24, 2023
@Luke-Sikina Luke-Sikina added enhancement New feature or request breaking-change and removed breaking-change labels Jul 25, 2023
@Luke-Sikina Luke-Sikina deleted the perf-fixes-bch branch September 25, 2023 20:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants