(custom-resources): StateNotFoundError: State functionActiveV2 not found #24358
Comments
related to #23862 (comment) |
Same error if I change the runtime to Python 3.9:
const onEventHandler = new lambda.Function(this, 'OnEventHandler', {
  runtime: lambda.Runtime.PYTHON_3_9,
  code: lambda.Code.fromInline(`def on_event(event, context): return {}`),
  handler: 'index.on_event',
}); |
Same error in eu-central-2 region, except that we are using EKS (deployed with CDK), but we have more than 10 custom resources. |
I am reaching out to the relevant team internally. |
It seems like this CDK commit is using the `functionActiveV2` state from the JS SDK. The change in CDK v2.60.0 then switched off `installLatestAwsSdk` by default for custom resources, and from the PR it seems the default SDK version packaged by Lambda is 2.1055.0. But the `functionActiveV2` state seems to have been introduced in SDK v2.1080.0, in this commit in the JS SDK repo. So the `installLatestAwsSdk` change appears to be the culprit for the issue we are experiencing. |
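To make the version mismatch above concrete, here is a minimal sketch of the comparison involved. The helpers `compareVersions` and `pickWaiterState` are hypothetical (they are not part of the CDK or the AWS SDK); the only facts taken from the thread are that `functionActiveV2` appears in SDK 2.1080.0 and that Lambda bundles 2.1055.0.

```typescript
// Hypothetical helpers for illustration only; not CDK or AWS SDK APIs.

// First JS SDK v2 release that ships the functionActiveV2 waiter state,
// according to the analysis in this thread.
const FUNCTION_ACTIVE_V2_MIN_VERSION = '2.1080.0';

// Compare two dotted numeric versions. Returns -1, 0, or 1.
function compareVersions(a: string, b: string): number {
  const pa = a.split('.').map(Number);
  const pb = b.split('.').map(Number);
  const len = Math.max(pa.length, pb.length);
  for (let i = 0; i < len; i++) {
    // Missing components are treated as 0 (so "2.1" equals "2.1.0").
    const diff = (pa[i] ?? 0) - (pb[i] ?? 0);
    if (diff !== 0) return diff < 0 ? -1 : 1;
  }
  return 0;
}

// Pick a waiter state name that the given SDK version is assumed to know.
function pickWaiterState(sdkVersion: string): 'functionActive' | 'functionActiveV2' {
  return compareVersions(sdkVersion, FUNCTION_ACTIVE_V2_MIN_VERSION) >= 0
    ? 'functionActiveV2'
    : 'functionActive';
}

// The SDK version that Lambda bundles by default, per the thread.
console.log(pickWaiterState('2.1055.0'));
// An SDK version that includes the newer waiter state.
console.log(pickWaiterState('2.1080.0'));
```

With the bundled 2.1055.0 the sketch falls back to `functionActive`, which is the state name the eventual fix in this thread switched to.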
I got around this for now by reverting to a CDK version older than 2.60.0 |
Is there a way forward? I cannot go back to an older CDK version and I have to use isolated subnets. |
Any updates on this? |
Thank you @robert-carbmee for the details. I have brought it up to the CDK core team. Will keep the update posted here whenever possible. |
This seems like the right root cause as far as I can tell. I guess the reason this shows in ap-south-2 and not other regions is for some reason deploying certain number of custom resources causes some backup in lambda function creation that causes the functions to be pending, where in other regions we actually just never call I guess the easiest fix would be changing to use |
FYI: same here in |
We're seeing this issue today in us-east-1. We're using cdk 2.63.1 with 3 custom resources built using the provided:al2 docker image |
Seems to be related to concurrency. As a workaround, if you introduce (artificial) dependencies (using addDependency, for example), it will reconcile successfully. Here's an example (resources should be listed in the desired creation order):
[
...oidcClusterRoleManifests,
clusterAutoscalerNamespace,
clusterAutoscalerServiceAccount,
clusterAutoscalerHelmChart,
fluentBitNamespace,
fluentBitServiceAccount,
awsEbsCsiDriverNamespace,
awsEbsCsiDriverServiceAccount,
pixieHelmChart,
externalDnsNamespace,
externalDnsServiceAccount,
externalDnsHelmChart,
pmmNamespace,
pmmServiceAccount,
argocdHelmChart,
deploymentRepoManifest,
clusterBootstrapManifest,
clusterAutoscalerAppManifest,
argocdAppManifest,
pixieAppManifest,
externalDnsAppManifest,
]
.reverse()
.forEach((resource, index, resources) => {
const nextResource = resources[index + 1];
nextResource && resource.node.addDependency(nextResource);
});
Obviously this is far from ideal as it makes the reconciliation slower (you could, however, increase the concurrency by removing just enough dependencies to stay below the threshold at which errors happen, but it's fiddly and fragile). It would be nice to have a fix within the CDK though... |
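The same workaround can be sketched as a small reusable helper. This is an illustration under assumptions: `chainDependencies` and `FakeResource` are hypothetical names, and the `Dependable` interface only assumes the `node.addDependency(...)` method used in the comment above, so that the sketch runs without the CDK installed.

```typescript
// Minimal structural type covering only what this sketch needs. CDK constructs
// are assumed to be compatible via their `node.addDependency(...)` method.
interface Dependable {
  node: { addDependency(...deps: Dependable[]): void };
}

// Hypothetical helper: force sequential provisioning by making every resource
// depend on the one listed before it (list is in desired creation order).
function chainDependencies<T extends Dependable>(ordered: T[]): void {
  let previous: T | undefined;
  for (const resource of ordered) {
    if (previous) {
      resource.node.addDependency(previous);
    }
    previous = resource;
  }
}

// Stand-in for a CDK construct so the sketch can run on its own.
class FakeResource implements Dependable {
  readonly name: string;
  readonly dependsOn: string[] = [];
  readonly node = {
    addDependency: (...deps: Dependable[]): void => {
      for (const dep of deps) {
        if (dep instanceof FakeResource) this.dependsOn.push(dep.name);
      }
    },
  };
  constructor(name: string) {
    this.name = name;
  }
}

const exampleNamespace = new FakeResource('namespace');
const exampleServiceAccount = new FakeResource('serviceAccount');
const exampleHelmChart = new FakeResource('helmChart');

// After this call: serviceAccount depends on namespace, helmChart on serviceAccount.
chainDependencies([exampleNamespace, exampleServiceAccount, exampleHelmChart]);
console.log(exampleServiceAccount.dependsOn, exampleHelmChart.dependsOn);
```

As with the original snippet, this trades deployment speed for ordering, so it is a stopgap rather than a fix.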
We are also facing this issue in |
For science (not as a proper solution), does the issue go away if you add `cr1.node.addDependency(cr2)` to your construct, just to force the CDK not to provision them concurrently? |
related to #24916 |
Replaces `functionActiveV2` with `functionActive`. `functionActiveV2` is not available in SDK versions < 2.1080.0, but the one that Lambda currently installs by default is 2.1055.0. The version that Lambda installs by default is the same that the CDK uses. Closes #24358 ---- *By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
|
…Cloud regions (#25215)
Reopening this PR because #25170 was closed by accident.
As ECR Public is not available in China regions and GovCloud, the `AmazonElasticContainerRegistryPublicReadOnly` IAM managed policy would not be available in those affected regions and should not be attached to the role.
This PR implements a CfnCondition to determine if ECR Public is available based on the `Aws.Partition` of the deploying region and conditionally attaches `AmazonElasticContainerRegistryPublicReadOnly` to the kubectl-provider handler role.
This PR has been tested in the following regions:
- [x] *cn-north-1
- [x] *cn-northwest-1
- [x] us-east-1

* I can confirm the role is created correctly in cn regions but due to
- #24358
- #24696

The cluster and nodegroup are still failing to create in CN.
Closes #24743 #24808 #25178
CLI notice for #24358. Inactive custom resource provider framework lambdas will cause errors due to Lambda's default SDK version (2.1055.0) not having `functionActiveV2` which only exists on 2.1080.0 and up.
Please add your +1 👍 to let us know you have encountered this
Status: RESOLVED
Overview:
Any customer using custom resources may encounter the error when the custom resource handler lambda becomes `INACTIVE`.
Root cause: Lambda installs the SDK at 2.1055.0, but `functionActiveV2` doesn't exist until 2.1080.0. It was reported in ap-south-2 but can happen in any region.
The error occurs anytime the custom resource provider framework fails to invoke the custom resource handler lambda. In that event, the framework will use `functionActiveV2` to wait for the lambda to become active again. However, the call to `functionActiveV2` will fail in the provider lambda because of the root cause described above.
Complete Error Message:
Workaround:
Solution:
Use `functionActive` instead. #25228
Related Issues:
#23862 (comment)
Original Issue
Describe the bug
When deploying 10+ custom resources in `ap-south-2`, it fails with `StateNotFoundError: State functionActiveV2 not found` error as below:
`us-east-1` or `ap-northeast-1`. Only `ap-south-2` fails in this case.
`ap-south-2` will be fine with no error.
Expected Behavior
The provided code above should deploy in `ap-south-2`.
Current Behavior
Reproduction Steps
Possible Solution
There might be some restrictions in `ap-south-2`.
Additional Information/Context
No response
CDK CLI Version
2.66.1 (build 539d036)
Framework Version
No response
Node.js Version
v16.17.0
OS
Linux
Language
Typescript
Language Version
No response
Other information
No response