Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Delete workflow execution from visibility store even if it was deleted from main store #3962

Merged

Conversation

alexshtin
Copy link
Member

@alexshtin alexshtin commented Feb 16, 2023

What changed?
Delete workflow execution from visibility store even if it was deleted from main store.

Why?
DeleteExecutions workflow uses visibility store to list workflow executions to delete and if workflow execution was manually deleted from main store but not for visibility store, it gets stuck. With this change, in case of NotFound error, DeleteExecutions workflow will try to delete execution directly from visibility. If even this operation fails, workflow will exit with error and let caller (ReclaimResources workflow) to retry according to retry policy.

How did you test it?
Added functional test.

Potential risks
No risks.

Is hotfix candidate?
No.

@alexshtin alexshtin requested a review from a team as a code owner February 16, 2023 01:38
if nextPageToken == nil {
// If nextPageToken is nil then there are no more workflow executions to delete.
// If SuccessCount == 0 then no progress were made with this run: exit here and let caller retry and eventually fail.
if nextPageToken == nil || result.SuccessCount == 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If success count is 0 and error count is not 0, we should continue?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If both counters are 0, it means nothing was deleted. And if nextPageToken is not nil it means there is still something in visibility.

Copy link
Member

@yycptt yycptt Feb 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think my confusion is what happens if nextPageToken != nil && SuccessCount == 0 && ErrorCount != 0, like deletions failed for all workflows in the page case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if nextPageToken is not nil (there is something else left) it won't fall inside this if, and will do "continue as new". It will fall inside this if (and exit) only if there is nothing else left (nextPageToken is nil) or no progress were made (SuccessCount is 0). And this is what I explicitly said there in a comment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After second thought I think it is unneeded pre-caution. I removed SuccessCount check for now.

@alexshtin alexshtin force-pushed the fix/delete-execution-inconsisten-db branch 3 times, most recently from 82b397b to f8473bb Compare February 17, 2023 00:04
@alexshtin alexshtin force-pushed the fix/delete-execution-inconsisten-db branch from f8473bb to c108f8c Compare February 17, 2023 00:32
@alexshtin alexshtin enabled auto-merge (squash) February 17, 2023 00:43
@alexshtin alexshtin merged commit b313b7f into temporalio:master Feb 17, 2023
@alexshtin alexshtin deleted the fix/delete-execution-inconsisten-db branch February 17, 2023 00:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants