Autodelete StatefulSet PVCs #2440
Conversation
@mattcary: GitHub didn't allow me to request PR reviews from the following users: kk-src. Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/cc @krmayankk
/cc @kow3ns
Hi @mattcary, 1.21 Enhancements Lead here. As part of this PR, please remember to create a […] Please make sure to assign someone from the prod-readiness-approval in both the prr yaml file and in the kep yaml file. Once you have the requirements, please reach out in the #prod-readiness slack channel to get PRR reviewed asap. Thank you!
Force-pushed dd9aace to 30e30a2
Thank you Anna! We started this KEP before the PRR process changed and I missed that step. I have added the file. Fortunately @janetkuo is already on this PR so I have assigned her for the PRR. I will not take offense if we rebalance :-)
Force-pushed 30e30a2 to be25416
Oh, sorry Janet isn't on the prr-reviewers list. I've added @wojtek-t as they've been auto-assigned anyway. I will confirm on slack.
Force-pushed be25416 to b4c3907
Couple small comments from me
scale-down. The user does not want to pay for volumes that are not in use, and so wants
them to be reclaimed as soon as possible, including during scale-down. On scale-up a new
volume will be provisioned and the new pod will have to re-initialize. However, for
short-lived interruptions like a rolling update or after pod rescheduling, the data on
nit: technically there isn't anything like "pod rescheduling" in k8s.
Once scheduled, pods remain on that node until end of life. The way we handle e.g. node disruptions is by creating another pod and killing the previous one.
Note that it doesn't change anything in the proposal - it's just a wording comment.
Done, I've clarified that this is for when pods are killed & recreated, and removed the fictitious rescheduling
* **How can this feature be enabled / disabled in a live cluster?**
  - [x] Feature gate (also fill in values in `kep.yaml`)
    - Feature gate name: StatefulSetAutoDeletePVC
    - Components depending on the feature gate: kube-controller-manager
technically, there will also be kube-apiserver involved in it
as you will need to handle the new field there (i.e. the dropDisabledFields pattern, e.g.:
https://github.com/kubernetes/kubernetes/blob/97d40890d00acf721ecabb8c9a6fec3b3234b74b/pkg/api/pod/util.go#L442
)
Added. I took the liberty of adding a breadcrumb to dropDisabledFields as well.
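For readers unfamiliar with the dropDisabledFields pattern being discussed, a minimal sketch is below. The struct and function names are simplified stand-ins for illustration, not the actual kubernetes/kubernetes code; the real version lives in the apps API strategy and checks the feature gate via the feature-gate utility.

```go
package main

import "fmt"

// StatefulSetSpec is a simplified stand-in for the real API type;
// only the field relevant to this KEP is shown.
type StatefulSetSpec struct {
	PersistentVolumeClaimDeletePolicy *string
}

// dropDisabledFields clears gated fields on create/update when the feature
// gate is off. featureEnabled stands in for a call like
// utilfeature.DefaultFeatureGate.Enabled(features.StatefulSetAutoDeletePVC).
func dropDisabledFields(spec, oldSpec *StatefulSetSpec, featureEnabled bool) {
	// Keep the field if the gate is on, or if the existing object already
	// uses it (so updates don't silently wipe data written while enabled).
	if featureEnabled || (oldSpec != nil && oldSpec.PersistentVolumeClaimDeletePolicy != nil) {
		return
	}
	spec.PersistentVolumeClaimDeletePolicy = nil
}

func main() {
	policy := "DeleteOnScaledown"
	spec := &StatefulSetSpec{PersistentVolumeClaimDeletePolicy: &policy}
	dropDisabledFields(spec, nil, false) // gate disabled, no prior use
	fmt.Println(spec.PersistentVolumeClaimDeletePolicy == nil)
}
```

The key design point is the second clause: a disabled gate must not strip the field from objects that already carry it, or a rollback followed by an unrelated update would lose user intent.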
gate is re-enabled, the new behavior will start working.
There are new corner cases here. For example, if a StatefulSet deletion is in process
during a StatefulSet deletion, the appropriate ownerRefs will not have been added and
typo:
"if a StatefulSet deletion is in process during a StatefulSet deletion"
Done
* **What happens if we reenable the feature if it was previously rolled back?**
  The reconcile loop which removes references on disablement will not come into
  action. Since the StatefulSet field would persist through the disablement we will have to
I think I don't understand the first sentence - can you please clarify?
Clarified.
ensure that the required references get set in the next set of reconcile loops.
* **Are there any tests for feature enablement/disablement?**
  Feature enablement and disablement tests will be added.
And transitions please :)
Done
Yes and no. This feature will result in additional resource deletion calls, which will
scale like the number of pods in the stateful set (i.e., one PVC per pod and possibly one
PV per PVC depending on the reclaim policy). There will not be additional watches,
because the existing pod watches will be used (note: this is to be confirmed in the
Yes - the same has to be reused.
Done
However, anyone who uses this feature would have made those resource deletions anyway:
those PVs cost money. So overall there shouldn't be much increase beyond the
second-order effect of this feature allowing more automation.
For the "OnScaleDown" policy, an additional patch to update ownerRef is called.
[Which is fine - I just want to have it mentioned explicitly.]
Done
* **Will enabling / using this feature result in increasing time taken by any
  operations covered by existing SLIs/SLOs?**
  No. (are there StatefulSet SLOs?)
not (yet)
Could the scale down policy impact pod startup latency in the scale down/up case, since we need to wait for the old PVC to be deleted and then recreated? This is definitely a niche scenario
:-) clarified
* `Retain` - this is the default policy and is considered in cases where no policy is
  specified. This would be the existing behaviour - when a StatefulSet is deleted, no
  action is taken with respect to the PVCs created by the StatefulSet.
* `DeleteOnScaledown` - When a pod is deleted on scale down, the corresponding PVC is deleted as well.
Can you explicitly state that this option also implies the next? Nit, maybe order this one last since it's the most pervasive of the options.
Done
### Notes/Constraints/Caveats (optional)
This feature applies to PVCs which are dynamically provisioned from the volumeClaimTemplate of |
How do we tell the difference between dynamically provisioned and pre-provisioned?
clarified
that the PVCs remain intact by default and only a conscious choice made by user will
involve any persistent data being deleted.
PVCs associated with the StatefulSet will be more durable than ephemeral volumes would be, |
Is this note intended to compare with the "generic ephemeral volumes" feature? If so, it may be good to reference it more specifically.
There was a request to clarify why this feature was needed, because it seems to duplicate ephemeral volumes (generally, including emptyDir). I've tried to make the context more clear.
the Pod ownership has `blockOwnerDeletion` set to `true`, pods will get deleted before the
StatefulSet is deleted. The `blockOwnerDeletion` for PVCs will be set to `false`, which
ensures that PVC deletion happens only after the StatefulSet is deleted. This chain of
ownership ensures that Pod deletion occurs before the PVCs are deleted.
Add a note that this is necessary because of the storage protection feature which doesn't allow pvc deletion until all pods referencing it are deleted.
Done
are the only PVCs modified with an ownerRef. Other PVCs referenced by the StatefulSet Pod
template are not affected by this behavior.

#### `DeleteOnScaledown`
Can you describe in this section how the StatefulSet deletion case is handled?
Done.
I think I'm correct that multiple ownerRefs will work as I expect?
#### `DeleteOnStatefulSetDeletion`

When `PersistentVolumeClaimDeletePolicy` is set to `DeleteOnStatefulSetDeletion` the owner
When does the ownerref get added to the PVC? Is it reconciling or only if the StatefulSet controller is the one that's creating the PVC?
Clarified to say when the StatefulSet controller creates the PVC, and what happens if there's already an existing one.
This feature adds a new field to the StatefulSet. The default value for the new field
maintains the existing behavior of StatefulSets.

On a downgrade, the PersistentVolumeClaimReclaimPolicy field will be removed from any
What about downgrade behavior for the StatefulSet Deletion case?
clarified
In the deletion case, is it ok if we don't try to actively remove the ownerRefs if the feature gets disabled? The user turned off the feature, but they still end up with the feature behavior since we added the ownerRef while the feature was enabled.
We had another feature (storage protection) that had an issue like this, but the ramification there was that PVC deletion would hang after disablement/rollback, since there's nothing to operate on the finalizer anymore. We went and added logic to remove the finalizer when the feature is disabled.
* **Will enabling / using this feature result in increasing size or count of
  the existing API objects?**
  - PVC, new ownerRef.
and StatefulSet, new field
done
## Implementation History

- KEP created.
can you add a milestone here to date this? (ie 1.21)
done
Force-pushed b4c3907 to adee194
Force-pushed f86caf4 to fa71f09
PVCs are identified by the static naming scheme used by StatefulSets. If a user
pre-provisions a PVC with the same name, it will be treated as an
auto-provisioned one and used accordingly. It is not clear what happens today if
a user has pre-provisioned a PVC matching one that the StatefulSet will try to
Pre-creating a PVC with the Statefulset pvc template naming syntax is a valid and supported operation. The StatefulSet controller won't create the PVC in that case since one already exists.
Ah, thank you.
I've clarified this. I think that the best thing to do is have clear behavior about what happens based on the state of the world, and to not try to infer semantics based on the history of how things got created. That seems to be the most k8s/reconciling way to do it IMHO at least.
field will represent the user indication on whether the associated PVCs can be
automatically deleted or not. The default policy would be `Retain`.

In the alpha release we intend to keep the `PersistentVolumeClaimDeletePolicy` immutable after
I wonder if we should allow it to be mutable. We have PV reclaimPolicy as mutable because folks may originally have decided on Delete, but sometime later need to do some sort of manual repair/rollback but still keep the data, and want to temporarily switch to Retain.
If we make it immutable + reconciling, then it will be very difficult to do this. Of course, if we do make it mutable, there is still going to be a time lag for the "onDelete" policy because the controller has to go and remove the ownerRef on all the PVCs. Maybe that's another argument to only add the ownerRef on demand?
As discussed offline, you raise a good point. Mutability is a good idea. Since we've also decided that the ownerRefs will be reconciled and not just made on PVC creation, this seems to not introduce too many complexities.
scale-up or scale-down occurs, the PVC is unchanged. PVCs previously in use
before scale-down will be used again when the scale-up occurs.

Note that the ownerRef will be set even when a VolumeClaimTemplate PVC
For an immutable policy, and setting ownerRef on creation, I don't think you could get into this state from a controller crash. You would only get here if the user manually created the PVC object.
I do think that it's better to be reconciling to support the manually created PVCs case.
Yes, you're correct, thanks for pointing that out. I've clarified how reconciliation and ownerRefs work.
Force-pushed fa71f09 to e36cce7
Force-pushed 2d511c5 to f67deec
I'm fine with this PR modulo the two super minor comments (not even blocking).
I will stamp it as soon as it is approved by the SIG.
as possible, although it will still occur. The former case can be mitigated by
manually deleting PVCs. The latter case will result in lost data, but only in
PVCs that were originally declared to have been deleted. Life does not always
have an undo button.
:D
scale-down is in process, remaining PVCs are given an ownerRef to their Pod (by
index, as above).

When mutating from the `Retain` policy to a deletion policy,
nit: misplaced :)
- Components depending on the feature gate
  - kube-controller-manager, which orchestrates the volume deletion.
  - kube-apiserver, to manage the new policy field in the StatefulSet
    resource (eg [dropDisabledFields](https://github.com/kubernetes/kubernetes/blob/97d40890d00acf721ecabb8c9a6fec3b3234b74b/pkg/api/pod/util.go#L433)).
nit: I would remove the link - it's misleading, because it is for a pod.
Just mentioning dropDisabledFields is enough
/hold cancel [we won't merge without SIG apps approval anyway; PRR looks fine, waiting with stamping for SIG approval]
/lgtm
wow - what is it that you're still around @janetkuo? :) /approve PRR
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: janetkuo, mattcary, msau42, wojtek-t The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing […]
Wow Janet thank you :-)
I will address the remaining comments in a follow-up PR.
Thank you everyone for your help in getting the KEP merged.
…On Tue, Feb 9, 2021 at 3:50 AM Maciej Szulik ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In keps/sig-apps/1847-autoremove-statefulset-pvcs/README.md
<#2440 (comment)>
:
> +existing behavior is retained and no PVC will be deleted. Only the StatefulSet resource
+will be affected.
+
+
+### Cluster role change for statefulset controller
+
+In order to update the PVC ownerReference, the `buildControllerRoles` will be updated with
+`patch` on PVC resource.
+
+### Test Plan
+
+1. Unit tests
+
+1. e2e tests
+ - DeleteOnScaleDown
+ 1. Create 2 pod statefulset, scale to 1 pod, confirm PVC deleted
Integration is fine for controllers, but here you'll be directly
interacting with pods created by the controller to verify its volumes. I'm
not sure if integration is a good place for them.
------------------------------
In keps/sig-apps/1847-autoremove-statefulset-pvcs/README.md
<#2440 (comment)>
:
> +`PersistentVolumeClaimDeletePolicy` and updated accordingly. This includes PVCs
+that have been manually provisioned. It will be most consistent and easy
+to reason about if all VolumeClaimTemplate PVCs are treated uniformly rather
+than trying to guess at their provenance.
+
+When the StatefulSet is deleted, these PVCs will also be deleted, but only after
+the Pod gets deleted. Since the Pod StatefulSet ownership has
+`blockOwnerDeletion` set to `true`, pods will get deleted before the StatefulSet
+is deleted. The `blockOwnerDeletion` for PVCs will be set to `false` which
+ensures that PVC deletion happens only after the StatefulSet is deleted. This is
+necessary because of PVC protection which does not allow PVC deletion until all
+pods referencing it are deleted.
+
+`Retain` `PersistentVolumeClaimDeletePolicy` will ensure the current behaviour: no PVC
+deletion is performed as part of StatefulSet controller.
+
For the pod case, you're gonna need to check both name and uid. The former is
stable for sts, the latter always changes. That's how the GC will know
exactly which pod owns the right PVC. It doesn't have to be explicitly
called out here, but it will be critical in the controller code.
If a PVC is created manually (with the name SS controller expects), will the deletion of PVC still work when […]? Thanks!
Hey sorry I'm a bit late to the party and I know this KEP is already merged. I would like to advocate for a use case this KEP could cover but currently doesn't. In short: I would like my PVCs to be automatically deleted on StatefulSet scaledown, but I would like them to not be deleted on StatefulSet deletion. Overall I think about this enhancement as having 2 dimensions, with 4 possible combinations:

|  | Retain PVCs on downscale | Delete PVCs on downscale |
|---|---|---|
| Retain PVCs on StatefulSet deletion | `Retain` | Missing from the KEP |
| Delete PVCs on StatefulSet deletion | `DeleteOnStatefulSetDeletion` | `DeleteOnScaledown` |

IIUC this KEP implements 3 out of the 4 possible options. The missing one is the one I'm the most interested in. @mattcary @msau42 do you think this would be worth considering?
I think that's reasonable from a feature point of view; while it adds a
little complexity to the UX it is, as you point out, very logical. Our
original design was trying to keep things as simple as possible and not
create features without a real use case.
I'm in the middle of doing the implementation. Since deleting on scaledown
is about making PVC ownerrefs on pods, whereas deleting with the
statefulset is about making PVC ownerrefs on the stateful set, implementing
your suggestion is probably trivial.
If that's the case I'd be in favor of adding it. I do want to see how the
implementation shakes out, in case there are more complications than we
expected around the planned dynamic ownerrefs.
Any other opinions?
…On Thu, Feb 25, 2021 at 3:05 AM Sebastien Guilloux ***@***.***> wrote:
Elasticsearch use case: when I'm scaling a StatefulSet down, I'm first
making sure data is migrated away from Pods that are removed. So I have no
benefit in keeping their PVCs around. However, I'd like the possibility to
bring the entire Elasticsearch cluster down temporarily by deleting the
StatefulSet, and bring it back up later by re-creating it, which involves
reusing retained PVCs with existing data.
Seems reasonable. In terms of API design, maybe the policy should be a list so you can specify any combination of them?
+1 - I agree this usecase makes sense to be supported.
I've got the implementation almost done, I don't think it will add much
complexity. Establishing the API as a list now will be better, as changing
it later for beta would be awkward.
Thanks for the suggestion @sebgl! I'll propose the change to the KEP.
…On Mon, Mar 1, 2021 at 4:57 AM Wojciech Tyczynski ***@***.***> wrote:
Though, if the implementation changes are significant, we may want to push
it to Beta (and here just ensure that API will be expressive enough to
avoid changing API for Beta).
I updated the proposal at #2545. Please comment!
@mattcary I have operators that manage StatefulSets. These operators set controller references to […]
It depends what your operators are setting references to. The statefulset controller will put references from the PVC to the StatefulSet (for OnSetDeletion) or from the PVC to the Pod (for OnScaledown, only for the condemned pods during a scaledown).

If your controller adds additional ownerrefs on the PVCs then that will inhibit deletion as usual, which may or may not be what you want. The statefulset controller won't touch those references, and will clean up its own references appropriately. Although, once a pod is deleted, if the PVC is not deleted, the stale reference will not be removed (the statefulset controller won't know about that PVC since there's no longer an active pod in the set). But I think that's okay.

At any rate, as you said, if the policy is not set or is set explicitly to Retain, then no behavior changes. I could imagine adding a validation webhook that prevents users from changing the statefulset policy when your controller is active, if doing so would muck things up.
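To illustrate the interaction described above, a PVC could carry both an operator's reference and the one added by the StatefulSet controller; garbage collection only deletes an object once all of its owners are gone, so the extra reference inhibits deletion as described. A hypothetical sketch (the operator kind and UIDs are made up for illustration):

```yaml
metadata:
  name: data-web-1
  ownerReferences:
  # Reference set by the user's operator (hypothetical kind); GC waits on all owners.
  - apiVersion: example.com/v1
    kind: MyAppOperator
    name: my-app
    uid: aaaaaaaa-0000-0000-0000-000000000000  # illustrative
  # Reference added by the StatefulSet controller for the OnScaledown policy.
  - apiVersion: v1
    kind: Pod
    name: web-1
    uid: bbbbbbbb-0000-0000-0000-000000000000  # illustrative
    blockOwnerDeletion: false
```

When pod `web-1` is deleted during scale-down, only the second reference becomes dangling; the PVC survives until the operator's reference is also removed.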
Thank you very much!
A KEP for autodeletion of StatefulSet PVCs.
Issue #1847
This is a clone of #1915 in order to make progress as the original PR author has gotten swamped with other work.
I have tried to address all outstanding comments on that PR. Please let me know what I missed!
/cc @msau42
/cc @janetkuo
/cc @kk-src
/cc @krmayankk