Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error prevention stage 1 #579

Merged

Conversation

Angie-Zhang
Copy link
Contributor

@Angie-Zhang Angie-Zhang commented Oct 27, 2022

Issue #, if available:
#432

Description of changes:

  • Added error prevention basic framework
  • Added action validation for:
  • Rollover
  • Force merge
  • Replica count
  • Open
  • Read only
  • Read write

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

jowg-amazon and others added 30 commits August 18, 2022 23:13
Signed-off-by: Joanne Wang <jowg@amazon.com>
…situations (opensearch-project#419)

Signed-off-by: Petar Dzepina <petar.dzepina@gmail.com>
Signed-off-by: Joanne Wang <jowg@amazon.com>
Signed-off-by: Joanne Wang <jowg@amazon.com>
…integration tests

Signed-off-by: Joanne Wang <jowg@amazon.com>
Signed-off-by: Joanne Wang <jowg@amazon.com>
Signed-off-by: Joanne Wang <jowg@amazon.com>
Signed-off-by: Joanne Wang <jowg@amazon.com>
Signed-off-by: Joanne Wang <jowg@amazon.com>
Signed-off-by: Joanne Wang <jowg@amazon.com>
Signed-off-by: Joanne Wang <jowg@amazon.com>
Signed-off-by: Joanne Wang <jowg@amazon.com>
Signed-off-by: Joanne Wang <jowg@amazon.com>
Signed-off-by: Joanne Wang <jowg@amazon.com>
Signed-off-by: Joanne Wang <jowg@amazon.com>
Signed-off-by: Joanne Wang <jowg@amazon.com>
Signed-off-by: Joanne Wang <jowg@amazon.com>
Signed-off-by: Joanne Wang <jowg@amazon.com>
Signed-off-by: Joanne Wang <jowg@amazon.com>
Signed-off-by: Joanne Wang <jowg@amazon.com>
Signed-off-by: Joanne Wang <jowg@amazon.com>
Signed-off-by: Joanne Wang <jowg@amazon.com>
Signed-off-by: Joanne Wang <jowg@amazon.com>
Signed-off-by: Joanne Wang <jowg@amazon.com>
Signed-off-by: Joanne Wang <jowg@amazon.com>
Signed-off-by: Joanne Wang <jowg@amazon.com>
Signed-off-by: Joanne Wang <jowg@amazon.com>
Signed-off-by: Joanne Wang <jowg@amazon.com>
Signed-off-by: Joanne Wang <jowg@amazon.com>
Signed-off-by: Joanne Wang <jowg@amazon.com>
Signed-off-by: Joanne Wang <jowg@amazon.com>
@deshsidd
Copy link
Contributor

@petardz Please review if you get the time. Thanks!

@deshsidd
Copy link
Contributor

Let's consider disabling the feature by default in the first iteration and enable it in the future releases as discussed. Thanks!

@@ -14,6 +14,7 @@ import java.util.function.Function
class ManagedIndexSettings {
companion object {
const val DEFAULT_ISM_ENABLED = true
const val DEFAULT_VALIDATION_SERVICE_ENABLED = true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bowenlan-amzn @khushbr
We are considering to set the validation service to disabled by default for 2.4. Please let us know your thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few questions here:

  1. Is the Validation operation blocking / non-blocking for ISM actions ? On high level, if there are bugs/false positives from validation, will it prevent the ISM action ?
  2. Is there a kill switch to disable the Validation framework ?

I would prefer going "enabled" by default; given we have enough confidence in the code/feature.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@khushbr

  1. It will block checked ISM actions if the validation fails.
  2. We use a parameter to enable/disable it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For 2.4 let's keep it disable by default. I don't see we are actually refactoring out the validation logic from the step execution but just duplicate the validation logic. This is the first step of this framework.
We can target 2.5 to enable by default

Signed-off-by: Angie Zhang <langelzh@amazon.com>
@@ -14,6 +14,7 @@ import java.util.function.Function
class ManagedIndexSettings {
companion object {
const val DEFAULT_ISM_ENABLED = true
const val DEFAULT_VALIDATION_SERVICE_ENABLED = true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For 2.4 let's keep it disable by default. I don't see we are actually refactoring out the validation logic from the step execution but just duplicate the validation logic. This is the first step of this framework.
We can target 2.5 to enable by default

private fun skipRollover(indexName: String): Boolean {
val skipRollover = clusterService.state().metadata.index(indexName).getRolloverSkip()
if (skipRollover) {
validationStatus = ValidationStatus.PASSED
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need another validationStatus for this kind of skipping logic. PASSED means ISM can continue to execute step, but here ISM actually should not execute step but directly skip executing.

val validationResult = validationService.validate(action.type, stepContext.metadata.index)
if (validationResult.validationStatus == Validate.ValidationStatus.RE_VALIDATING) {
logger.warn("Validation Status is: RE_VALIDATING")
publishErrorNotification(policy, managedIndexMetaData)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not gonna publish notification for validation message out of the box.
Need some modification to make it publish validationResult.validationMessage

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO in next PR or release

@@ -115,8 +119,10 @@ class TransportExplainAction @Inject constructor(
private val enabledState: MutableMap<IndexName, Boolean> = mutableMapOf()
private val indexPolicyIDs = mutableListOf<PolicyID?>()
private val indexMetadatas = mutableListOf<ManagedIndexMetaData?>()
private val validationResults = mutableListOf<ValidationResult?>()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer using a map here, managed index: validation result

@bowenlan-amzn
Copy link
Member

Provide examples of how user will see the validation results in the description

Signed-off-by: Angie Zhang <langelzh@amazon.com>
@codecov-commenter
Copy link

Codecov Report

Merging #579 (b455ce6) into main (63984b2) will decrease coverage by 0.56%.
The diff coverage is 51.43%.

@@             Coverage Diff              @@
##               main     #579      +/-   ##
============================================
- Coverage     75.84%   75.28%   -0.57%     
- Complexity     2482     2545      +63     
============================================
  Files           316      328      +12     
  Lines         14551    14940     +389     
  Branches       2250     2301      +51     
============================================
+ Hits          11036    11247     +211     
- Misses         2251     2381     +130     
- Partials       1264     1312      +48     
Impacted Files Coverage Δ
...ment/indexstatemanagement/util/RestHandlerUtils.kt 88.88% <ø> (ø)
...statemanagement/validation/ValidateReplicaCount.kt 0.00% <0.00%> (ø)
...agement/indexstatemanagement/ManagedIndexRunner.kt 45.55% <29.41%> (-0.64%) ⬇️
...nt/indexstatemanagement/validation/ValidateOpen.kt 29.62% <29.62%> (ø)
...ndexstatemanagement/validation/ValidateReadOnly.kt 44.00% <44.00%> (ø)
.../indexstatemanagement/validation/ValidateDelete.kt 47.05% <47.05%> (ø)
...dexstatemanagement/validation/ValidateReadWrite.kt 50.00% <50.00%> (ø)
...exstatemanagement/validation/ValidateForceMerge.kt 52.94% <52.94%> (ø)
...transport/action/explain/TransportExplainAction.kt 70.75% <58.62%> (-1.59%) ⬇️
...ndexstatemanagement/validation/ValidateRollover.kt 66.66% <66.66%> (ø)
... and 23 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

Signed-off-by: Angie Zhang <langelzh@amazon.com>
@bowenlan-amzn bowenlan-amzn merged commit 2438f48 into opensearch-project:main Nov 3, 2022
opensearch-trigger-bot bot pushed a commit that referenced this pull request Nov 3, 2022
* initial framework

Signed-off-by: Joanne Wang <jowg@amazon.com>

* Removed recursion from Explain Action to avoid stackoverflow in some situations (#419)

Signed-off-by: Petar Dzepina <petar.dzepina@gmail.com>
Signed-off-by: Joanne Wang <jowg@amazon.com>

* enabled by default integrated

Signed-off-by: Joanne Wang <jowg@amazon.com>

* cleaned up comments and logs, created unit test and updated previous integration tests

Signed-off-by: Joanne Wang <jowg@amazon.com>

* added delete validation logic

Signed-off-by: Joanne Wang <jowg@amazon.com>

* fixed rollover validation unit tests

Signed-off-by: Joanne Wang <jowg@amazon.com>

* added validation info field to ManagedIndexMetaData

Signed-off-by: Joanne Wang <jowg@amazon.com>

* removed step context as input

Signed-off-by: Joanne Wang <jowg@amazon.com>

* added validationmetadata class

Signed-off-by: Joanne Wang <jowg@amazon.com>

* restored old integration tests and changed validation service output

Signed-off-by: Joanne Wang <jowg@amazon.com>

* before integrated validation meta data into managed index meta data

Signed-off-by: Joanne Wang <jowg@amazon.com>

* integrated validation meta data

Signed-off-by: Joanne Wang <jowg@amazon.com>

* working version

Signed-off-by: Joanne Wang <jowg@amazon.com>

* added validation mapping

Signed-off-by: Joanne Wang <jowg@amazon.com>

* fixed integ tests

Signed-off-by: Joanne Wang <jowg@amazon.com>

* renamed some values

Signed-off-by: Joanne Wang <jowg@amazon.com>

* before removing from managed index meta data

Signed-off-by: Joanne Wang <jowg@amazon.com>

* created validation result object in explain

Signed-off-by: Joanne Wang <jowg@amazon.com>

* testing

Signed-off-by: Joanne Wang <jowg@amazon.com>

* run fails

Signed-off-by: Joanne Wang <jowg@amazon.com>

* integration test for delete + added framework for force merge

Signed-off-by: Joanne Wang <jowg@amazon.com>

* removed step validation metadata and still testing explain results

Signed-off-by: Joanne Wang <jowg@amazon.com>

* before removing from managed index runner

Signed-off-by: Joanne Wang <jowg@amazon.com>

* removed from managed index runner

Signed-off-by: Joanne Wang <jowg@amazon.com>

* clean up and tests

Signed-off-by: Joanne Wang <jowg@amazon.com>

* all validation tests pass

Signed-off-by: Joanne Wang <jowg@amazon.com>

* removed validation result from all managed index meta data

Signed-off-by: Joanne Wang <jowg@amazon.com>

* restored old IT tests

Signed-off-by: Joanne Wang <jowg@amazon.com>

* fixed it tests, set explain validation to false

Signed-off-by: Joanne Wang <jowg@amazon.com>

* clean up

Signed-off-by: Joanne Wang <jowg@amazon.com>

* Implemented replica_count, open, read_only, read_write

Signed-off-by: Angie Zhang <langelzh@amazon.com>

* Implemented replica_count, open, read_only, read_write

Signed-off-by: Angie Zhang <langelzh@amazon.com>

* Fix Test cases

Signed-off-by: Angie Zhang <langelzh@amazon.com>

* Fix Test cases

Signed-off-by: Angie Zhang <langelzh@amazon.com>

* Fix messages

Signed-off-by: Angie Zhang <langelzh@amazon.com>

* Fix comments; set validation disabled by default

Signed-off-by: Angie Zhang <langelzh@amazon.com>

* Rename validation_service to action_validation; Fix some detekt issues

Signed-off-by: Angie Zhang <langelzh@amazon.com>

Signed-off-by: Joanne Wang <jowg@amazon.com>
Signed-off-by: Petar Dzepina <petar.dzepina@gmail.com>
Signed-off-by: Angie Zhang <langelzh@amazon.com>
Co-authored-by: Joanne Wang <jowg@amazon.com>
Co-authored-by: Petar <petar.dzepina@gmail.com>
Co-authored-by: Joanne Wang <109310487+jowg-amazon@users.noreply.github.com>
(cherry picked from commit 2438f48)
opensearch-trigger-bot bot pushed a commit that referenced this pull request Nov 3, 2022
* initial framework

Signed-off-by: Joanne Wang <jowg@amazon.com>

* Removed recursion from Explain Action to avoid stackoverflow in some situations (#419)

Signed-off-by: Petar Dzepina <petar.dzepina@gmail.com>
Signed-off-by: Joanne Wang <jowg@amazon.com>

* enabled by default integrated

Signed-off-by: Joanne Wang <jowg@amazon.com>

* cleaned up comments and logs, created unit test and updated previous integration tests

Signed-off-by: Joanne Wang <jowg@amazon.com>

* added delete validation logic

Signed-off-by: Joanne Wang <jowg@amazon.com>

* fixed rollover validation unit tests

Signed-off-by: Joanne Wang <jowg@amazon.com>

* added validation info field to ManagedIndexMetaData

Signed-off-by: Joanne Wang <jowg@amazon.com>

* removed step context as input

Signed-off-by: Joanne Wang <jowg@amazon.com>

* added validationmetadata class

Signed-off-by: Joanne Wang <jowg@amazon.com>

* restored old integration tests and changed validation service output

Signed-off-by: Joanne Wang <jowg@amazon.com>

* before integrated validation meta data into managed index meta data

Signed-off-by: Joanne Wang <jowg@amazon.com>

* integrated validation meta data

Signed-off-by: Joanne Wang <jowg@amazon.com>

* working version

Signed-off-by: Joanne Wang <jowg@amazon.com>

* added validation mapping

Signed-off-by: Joanne Wang <jowg@amazon.com>

* fixed integ tests

Signed-off-by: Joanne Wang <jowg@amazon.com>

* renamed some values

Signed-off-by: Joanne Wang <jowg@amazon.com>

* before removing from managed index meta data

Signed-off-by: Joanne Wang <jowg@amazon.com>

* created validation result object in explain

Signed-off-by: Joanne Wang <jowg@amazon.com>

* testing

Signed-off-by: Joanne Wang <jowg@amazon.com>

* run fails

Signed-off-by: Joanne Wang <jowg@amazon.com>

* integration test for delete + added framework for force merge

Signed-off-by: Joanne Wang <jowg@amazon.com>

* removed step validation metadata and still testing explain results

Signed-off-by: Joanne Wang <jowg@amazon.com>

* before removing from managed index runner

Signed-off-by: Joanne Wang <jowg@amazon.com>

* removed from managed index runner

Signed-off-by: Joanne Wang <jowg@amazon.com>

* clean up and tests

Signed-off-by: Joanne Wang <jowg@amazon.com>

* all validation tests pass

Signed-off-by: Joanne Wang <jowg@amazon.com>

* removed validation result from all managed index meta data

Signed-off-by: Joanne Wang <jowg@amazon.com>

* restored old IT tests

Signed-off-by: Joanne Wang <jowg@amazon.com>

* fixed it tests, set explain validation to false

Signed-off-by: Joanne Wang <jowg@amazon.com>

* clean up

Signed-off-by: Joanne Wang <jowg@amazon.com>

* Implemented replica_count, open, read_only, read_write

Signed-off-by: Angie Zhang <langelzh@amazon.com>

* Implemented replica_count, open, read_only, read_write

Signed-off-by: Angie Zhang <langelzh@amazon.com>

* Fix Test cases

Signed-off-by: Angie Zhang <langelzh@amazon.com>

* Fix Test cases

Signed-off-by: Angie Zhang <langelzh@amazon.com>

* Fix messages

Signed-off-by: Angie Zhang <langelzh@amazon.com>

* Fix comments; set validation disabled by default

Signed-off-by: Angie Zhang <langelzh@amazon.com>

* Rename validation_service to action_validation; Fix some detekt issues

Signed-off-by: Angie Zhang <langelzh@amazon.com>

Signed-off-by: Joanne Wang <jowg@amazon.com>
Signed-off-by: Petar Dzepina <petar.dzepina@gmail.com>
Signed-off-by: Angie Zhang <langelzh@amazon.com>
Co-authored-by: Joanne Wang <jowg@amazon.com>
Co-authored-by: Petar <petar.dzepina@gmail.com>
Co-authored-by: Joanne Wang <109310487+jowg-amazon@users.noreply.github.com>
(cherry picked from commit 2438f48)
khushbr pushed a commit that referenced this pull request Nov 3, 2022
* Error prevention stage 1
* Removed recursion from Explain Action to avoid stackoverflow in some situations (#419)
* enabled by default integrated
* Fix Test cases
* Fix comments; set validation disabled by default
* Rename validation_service to action_validation; Fix some detekt issues

Signed-off-by: Joanne Wang <jowg@amazon.com>
Signed-off-by: Petar Dzepina <petar.dzepina@gmail.com>
Signed-off-by: Angie Zhang <langelzh@amazon.com>

Co-authored-by: Joanne Wang <jowg@amazon.com>
Co-authored-by: Petar <petar.dzepina@gmail.com>
Co-authored-by: Angie Zhang <langelzh@amazon.com>
khushbr pushed a commit that referenced this pull request Nov 3, 2022
* Removed recursion from Explain Action to avoid stackoverflow in some situations (#419)
* enabled by default integrated
* Fix Test cases
* Fix comments; set validation disabled by default
* Rename validation_service to action_validation; Fix some detekt issues

Signed-off-by: Joanne Wang <jowg@amazon.com>
Signed-off-by: Petar Dzepina <petar.dzepina@gmail.com>
Signed-off-by: Angie Zhang <langelzh@amazon.com>

Co-authored-by: Joanne Wang <jowg@amazon.com>
Co-authored-by: Petar <petar.dzepina@gmail.com>
Co-authored-by: Angie Zhang <langelzh@amazon.com>
wuychn pushed a commit to ochprince/index-management that referenced this pull request Mar 16, 2023
…#591)

* Error prevention stage 1
* Removed recursion from Explain Action to avoid stackoverflow in some situations (opensearch-project#419)
* enabled by default integrated
* Fix Test cases
* Fix comments; set validation disabled by default
* Rename validation_service to action_validation; Fix some detekt issues

Signed-off-by: Joanne Wang <jowg@amazon.com>
Signed-off-by: Petar Dzepina <petar.dzepina@gmail.com>
Signed-off-by: Angie Zhang <langelzh@amazon.com>

Co-authored-by: Joanne Wang <jowg@amazon.com>
Co-authored-by: Petar <petar.dzepina@gmail.com>
Co-authored-by: Angie Zhang <langelzh@amazon.com>
ronnaksaxena pushed a commit to ronnaksaxena/index-management that referenced this pull request Jul 19, 2023
…#591)

* Error prevention stage 1
* Removed recursion from Explain Action to avoid stackoverflow in some situations (opensearch-project#419)
* enabled by default integrated
* Fix Test cases
* Fix comments; set validation disabled by default
* Rename validation_service to action_validation; Fix some detekt issues

Signed-off-by: Joanne Wang <jowg@amazon.com>
Signed-off-by: Petar Dzepina <petar.dzepina@gmail.com>
Signed-off-by: Angie Zhang <langelzh@amazon.com>

Co-authored-by: Joanne Wang <jowg@amazon.com>
Co-authored-by: Petar <petar.dzepina@gmail.com>
Co-authored-by: Angie Zhang <langelzh@amazon.com>
Signed-off-by: Ronnak Saxena <ronsax@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants