Fix BWC for legacy detectors #69

Merged
merged 7 commits into from
Aug 24, 2021

@ohltyler ohltyler commented Aug 17, 2021

Signed-off-by: Tyler Ohlsen <ohltyler@amazon.com>

Description

This PR fixes backward compatibility for realtime and historical detectors created on version 1.0 or earlier.

The changes allow for the following use cases to occur:

  1. User creates realtime detector on 1.0 => upgrades to 1.1 => renders as a detector running a realtime job, with no associated historical analysis
  2. User creates historical detector on 1.0 => upgrades to 1.1 => renders as a detector with no realtime job, and a populated historical analysis

For 1.1, the backend detector data models & task data models have changed, but we maintain the same data models on the frontend. To handle this, the frontend needs to be able to properly parse data coming from either of the data models (either 1.1 or pre-1.1).

The solution consists of having separate logic for parsing 2 different kinds of detector data:

  1. Static info (detector ID / name / description / timestamp, etc). This is data that has remained in the same format across the different data models. It does not dynamically change based on a user starting or stopping a realtime or historical task. This type of data can be pulled and parsed in similar ways as before.
  2. Task-related & job-related info (curState / initProgress / taskId / taskProgress, etc.). Prior to 1.1, this data was pulled from different APIs (get detector API, profile API) and parsed according to the format each one returned. Because the returned format has changed, parsing this data from those responses becomes extremely complex once all of the scenarios are considered, especially if Dashboards is communicating with a mixed cluster (e.g., old detector on old node, new detector on old node, old detector on new node, new detector on new node). To simplify where this data comes from, we can instead use the search task API to pull the latest realtime & historical tasks per detector. This decouples the task-and-job-related detector info from the implementation-specific response formats.
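
As a rough illustration of point 2, the request body for such a task search could be built along these lines. This is a minimal sketch, not the code in this PR: the helper `buildLatestTasksQuery`, the field names `detector_id`, `task_type` and `is_latest`, and every task type name other than the legacy `HISTORICAL` are assumptions made for the example.

```typescript
// Hypothetical task type lists. Only the legacy HISTORICAL type is named in
// this PR description; the other names are assumptions for illustration.
const REALTIME_TASK_TYPES: string[] = [
  'REALTIME_SINGLE_ENTITY',
  'REALTIME_HC_DETECTOR',
];
const HISTORICAL_TASK_TYPES: string[] = [
  'HISTORICAL_SINGLE_ENTITY',
  'HISTORICAL_HC_DETECTOR',
  'HISTORICAL', // legacy (pre-1.1) historical tasks
];

interface TaskSearchBody {
  size: number;
  query: { bool: { filter: Array<Record<string, unknown>> } };
  sort: Array<Record<string, { order: 'asc' | 'desc' }>>;
}

// Builds a search body that asks for the latest tasks of the given types for
// a set of detectors, newest first. Field names are assumed, not confirmed.
function buildLatestTasksQuery(
  detectorIds: string[],
  taskTypes: string[],
  size: number = 1000
): TaskSearchBody {
  return {
    size,
    query: {
      bool: {
        filter: [
          { terms: { detector_id: detectorIds } },
          { terms: { task_type: taskTypes } },
          { term: { is_latest: true } },
        ],
      },
    },
    sort: [{ execution_start_time: { order: 'desc' } }],
  };
}
```

Calling the helper once with the realtime types and once with the historical types keeps task state independent of whichever detector response format a given node returns.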

The search task strategy described in point 2 falls short in one case: fetching info for old realtime detectors, which don't have any existing tasks. For consistency, the backend will perform a one-time backfill that creates realtime tasks for every old realtime detector once a node running the new 1.1 version has joined the cluster. Until the backfill is complete, however, the frontend has no way of knowing an old realtime detector's true state without making excess API calls. To handle this, the frontend estimates such a detector's state as Running if its associated job is enabled and Stopped if it is not. If no job is found, the state defaults to Stopped.
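
The estimation rule can be sketched as a small pure function. This is only an illustration under assumed names: `LegacyJob`, its `enabled` flag, and `estimateLegacyRealtimeState` are hypothetical and not taken from the plugin's code.

```typescript
type EstimatedState = 'Running' | 'Stopped';

// Hypothetical shape of the realtime job associated with a legacy detector.
interface LegacyJob {
  enabled?: boolean;
}

// Estimates the state of a legacy realtime detector that has no task yet:
// Running when a job exists and is enabled, Stopped otherwise (including
// when no job is found at all).
function estimateLegacyRealtimeState(job?: LegacyJob | null): EstimatedState {
  return job?.enabled === true ? 'Running' : 'Stopped';
}
```

Once the backfill has created a realtime task for the detector, the state can come from that task instead and this estimate is no longer needed.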

BWC is also supported here in the following ways:

  • searching over the legacy HISTORICAL task type to fetch legacy historical tasks
  • handling the possibility of multiple realtime tasks being returned, and selecting the most recent one based on the execution_start_time (this can happen during the legacy realtime task backfilling)
  • pulling detection_date_range from 2 different possible places, since the field was refactored in the 1.1 changes
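
The last two bullets could look roughly like the following. It is a sketch under assumptions: the `DetectorTask` shape, the helper names, and in particular the two locations checked for `detection_date_range` (top level of the task, or nested under the task's `detector` object) are guesses for illustration, since the description does not say which two places are used.

```typescript
interface DateRange {
  start_time: number;
  end_time: number;
}

// Hypothetical task shape; only execution_start_time and
// detection_date_range are named in the PR description.
interface DetectorTask {
  task_id: string;
  execution_start_time?: number;
  detection_date_range?: DateRange;
  detector?: { detection_date_range?: DateRange };
}

// Picks the task with the greatest execution_start_time. Tasks without a
// start time are treated as oldest. Returns undefined for an empty list.
function selectLatestTask(tasks: DetectorTask[]): DetectorTask | undefined {
  return tasks.reduce<DetectorTask | undefined>((latest, task) => {
    if (latest === undefined) return task;
    const taskStart = task.execution_start_time ?? 0;
    const latestStart = latest.execution_start_time ?? 0;
    return taskStart > latestStart ? task : latest;
  }, undefined);
}

// Reads the date range from the first location that has it (assumed
// locations: the task itself, then the detector nested in the task).
function getDetectionDateRange(task: DetectorTask): DateRange | undefined {
  return task.detection_date_range ?? task.detector?.detection_date_range;
}
```

Keeping both helpers free of side effects makes them easy to cover with unit tests for each old and new response shape.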

Testing done:

  • Created detectors on 1.0 => upgraded domain to 1.1-target-changes => made sure they render properly. Specifically, the following scenarios were tested:
    • Realtime detector (never ran)
    • Realtime detector (running)
    • Realtime high-cardinality detector (never ran)
    • Realtime high-cardinality detector (running)
    • Historical detector (never ran)
    • Historical detector (ran)
    • Detector with no features
    • Sample detector
  • After upgrading, each was tested by running an additional realtime job and historical analysis to make sure the stored tasks were updated to show the latest.
  • New detectors were created on the new domain for all of the same detector types as listed above (except for the no-features-detector). Additional realtime jobs and historical analyses were run for them as well, to make sure the stored tasks were updated to show the latest.
  • Added UT for the helper fns added in adHelpers

Check List

  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@ohltyler ohltyler marked this pull request as draft August 17, 2021 21:48
@ohltyler ohltyler marked this pull request as ready for review August 18, 2021 19:07
@ohltyler ohltyler requested review from ylwu-amzn and kaituo August 18, 2021 19:07

@ylwu-amzn ylwu-amzn left a comment

LGTM. Thanks for the change!

@ohltyler ohltyler merged commit eae1c19 into opensearch-project:main Aug 24, 2021
ohltyler added a commit to ohltyler/anomaly-detection-dashboards-plugin-1 that referenced this pull request Sep 1, 2021
Signed-off-by: Tyler Ohlsen <ohltyler@amazon.com>
ohltyler added a commit that referenced this pull request Sep 1, 2021
Signed-off-by: Tyler Ohlsen <ohltyler@amazon.com>
@ohltyler ohltyler added infra Changes to infrastructure, testing, CI/CD, pipelines, etc. enhancement New feature or request and removed infra Changes to infrastructure, testing, CI/CD, pipelines, etc. labels Sep 2, 2021
@ohltyler ohltyler deleted the bwc-detector-fix branch September 2, 2021 21:42
Labels
backwards-compatibility enhancement New feature or request v1.1.0 Version 1.1.0

3 participants