KCL 1.X Fix for ShardEnd corruption and preventing lease table interference in multi-app JVM #776
…uding HashRange in serializing and deserializing to/from DDB
2. Fix for making LeaseCleanupManager non-singleton to avoid cross-table interference in multiple apps running in same JVM
3. Fixing updateMetaInfo method to not update other lease table fields
4. Preventing shard deletion in LeaseCleanupManager if a valid shard does not have child shards in lease table and in Kinesis Service
5. Adding childshards update support in updateMetaInfo
6. Fixing LeaseCleanupManager to call updateMetaInfo instead of update for childshard update in lease
7. Fixing unit tests to accommodate HashRange changes
5e51f67 to 9cb5020
- static final long LEASE_TABLE_CHECK_FREQUENCY_MILLIS = 3 * 1000L;
- static final long MIN_WAIT_TIME_FOR_LEASE_TABLE_CHECK_MILLIS = 1 * 1000L;
- static final long MAX_WAIT_TIME_FOR_LEASE_TABLE_CHECK_MILLIS = 30 * 1000L;
+ static long LEASE_TABLE_CHECK_FREQUENCY_MILLIS = 3 * 1000L;
+ static long MIN_WAIT_TIME_FOR_LEASE_TABLE_CHECK_MILLIS = 1 * 1000L;
+ static long MAX_WAIT_TIME_FOR_LEASE_TABLE_CHECK_MILLIS = 30 * 1000L;
Why are we marking these as non-final?
Never mind, I see it's for unit tests. Ideally we would use the builder pattern to avoid lengthy constructors. How bad is the build time if we don't override these values in tests?
Yes, we should refactor the long-argument constructors. For now, these fields are temporarily given package-level access so tests can override them; this saves more than 3 minutes of build time.
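For illustration, the builder-based alternative raised in this thread might look roughly like the sketch below. `LeaseTableCheckConfig` and its method names are hypothetical and not part of KCL; only the three default values are taken from the diff above. A test could then pass short waits explicitly instead of mutating static fields.

```java
// Hypothetical sketch: LeaseTableCheckConfig is NOT a KCL class.
// It only illustrates replacing overridable static fields with a builder.
final class LeaseTableCheckConfig {
    final long checkFrequencyMillis;
    final long minWaitMillis;
    final long maxWaitMillis;

    private LeaseTableCheckConfig(Builder b) {
        this.checkFrequencyMillis = b.checkFrequencyMillis;
        this.minWaitMillis = b.minWaitMillis;
        this.maxWaitMillis = b.maxWaitMillis;
    }

    static Builder builder() {
        return new Builder();
    }

    static final class Builder {
        // Defaults mirror the constant values shown in the diff.
        private long checkFrequencyMillis = 3 * 1000L;
        private long minWaitMillis = 1 * 1000L;
        private long maxWaitMillis = 30 * 1000L;

        Builder checkFrequencyMillis(long value) {
            this.checkFrequencyMillis = value;
            return this;
        }

        Builder minWaitMillis(long value) {
            this.minWaitMillis = value;
            return this;
        }

        Builder maxWaitMillis(long value) {
            this.maxWaitMillis = value;
            return this;
        }

        LeaseTableCheckConfig build() {
            // Reject an inverted wait window rather than failing later at runtime.
            if (minWaitMillis > maxWaitMillis) {
                throw new IllegalArgumentException("min wait must not exceed max wait");
            }
            return new LeaseTableCheckConfig(this);
        }
    }
}
```

With something like this, production code would call `LeaseTableCheckConfig.builder().build()` and keep the defaults, while a unit test would set millisecond-scale values on the builder.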
if (CollectionUtils.isNullOrEmpty(childShardKeys)) {
    LOG.error("No child shards returned from service for shard " + shardInfo.getShardId());
    // If no children shard is found in DDB and from service, then do not delete the lease
I don't think this adds additional safety, since this is just retried either way on the next deletion run. Are we adding this for unit testing?
This prevents the lease deletion and logs an error, because this closed and valid shard could not retrieve child shard info from either the lease table or the service. It is a final guard to protect against a bad server response.
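The guard described here can be sketched in isolation as follows. This is a simplified illustration, not the code in this PR: `ShardEndLeaseGuard` and `mayProceedWithCleanup` are hypothetical names, and the real cleanup path has further conditions that are not modeled.

```java
import java.util.Collection;

// Hypothetical sketch of the "final guard" only; not the KCL implementation.
final class ShardEndLeaseGuard {

    private ShardEndLeaseGuard() {
    }

    private static boolean isNullOrEmpty(Collection<?> c) {
        return c == null || c.isEmpty();
    }

    /**
     * Returns false when child shard info is missing from BOTH the lease
     * table and the service response, meaning the lease must be kept.
     * Returns true when at least one source supplied child shards, so the
     * remaining cleanup checks (not modeled here) may continue.
     */
    static boolean mayProceedWithCleanup(Collection<String> childShardsInLeaseTable,
                                         Collection<String> childShardsFromService) {
        boolean noChildInfoAnywhere =
                isNullOrEmpty(childShardsInLeaseTable) && isNullOrEmpty(childShardsFromService);
        return !noChildInfoAnywhere;
    }
}
```

Keeping the lease in the missing-info case means a later deletion run can retry once child shard information is available.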
LGTM
Issue #, if available:
Description of changes:
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.