Slow /.well-known/openid-configuration endpoints #17
Replies: 10 comments 10 replies
-
It indeed looks like something happened in that upgrade, yes. Looking at the example trace for the GET request indicates that the problem is outside of the IdentityServer pipeline.*

*) To be 100% correct, there are a few infrastructure-level things that are done outside of that block. The first is that the activity only fires if the path matches an IdentityServer endpoint => the endpoint resolution happens outside of the block. It's a simple for loop with only in-memory dependencies and I cannot imagine how that would take close to a second. Also, if you are using the dynamic providers feature, handling of those is outside of that block as well.

My overall feeling here, based on the diagnostics shared, is that it's something happening before or after the actual IdentityServer code is invoked. Did you do any code changes as part of the upgrade? Any changes to infrastructure?
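For reference, one quick way to see how much of a request actually falls inside those activities is to attach an `ActivityListener` and log the duration of the IdentityServer spans. A minimal sketch follows; the "Duende.IdentityServer" source-name prefix is an assumption and should be checked against the telemetry documentation for the version in use.

```csharp
using System;
using System.Diagnostics;

// Minimal sketch: print the duration of IdentityServer activities so they can be
// compared with the total request time reported by App Insights.
// NOTE: the "Duende.IdentityServer" source-name prefix is an assumption - verify
// it against the telemetry docs for the IdentityServer version in use.
var listener = new ActivityListener
{
    ShouldListenTo = source => source.Name.StartsWith("Duende.IdentityServer"),
    Sample = (ref ActivityCreationOptions<ActivityContext> _) =>
        ActivitySamplingResult.AllDataAndRecorded,
    ActivityStopped = activity =>
        Console.WriteLine($"{activity.OperationName}: {activity.Duration.TotalMilliseconds:F1} ms")
};
ActivitySource.AddActivityListener(listener);
```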
-
Thanks for your response. There is no pre/post processing there, but I'll try to get more diagnostics from the .NET Core pipeline. The strange thing is that it's only affecting one region in Azure - North Europe has about 20% of West Europe's traffic but no issues at all. I'll try to reproduce locally, as we see the problem in 3 different environments. I'll also further investigate transitive dependencies. Also note, all other protocol endpoints (/introspect, /token etc.) behave normally. It's just the well-known one.
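One low-tech way to narrow that down (purely a sketch, not code from this thread) is a stopwatch middleware registered in Program.cs immediately before UseIdentityServer(), so the time spent inside IdentityServer and everything downstream of it can be compared with the end-to-end time App Insights reports:

```csharp
// Hypothetical diagnostic middleware - register it immediately before
// UseIdentityServer() so the measured time covers IdentityServer and everything
// downstream of it, but not the proxy or earlier middleware.
app.Use(async (context, next) =>
{
    var sw = System.Diagnostics.Stopwatch.StartNew();
    await next();
    sw.Stop();

    if (context.Request.Path.StartsWithSegments("/.well-known"))
    {
        app.Logger.LogInformation(
            "Downstream pipeline for {Path} took {ElapsedMs} ms",
            context.Request.Path, sw.ElapsedMilliseconds);
    }
});

app.UseIdentityServer();
```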
-
Checking App Insights today, we've seen the following: the /.well-known endpoints have very high latency, while other Duende protocol endpoints (Introspect, Token) increase as well, just not as much. I'll investigate further within the ASP.NET Core pipeline, but could you check on your end as well, @AndersAbel, to see if there is a difference between the endpoints? We're using YARP as a proxy in front of IdentityServer; I'll check if there is a problem there.
-
Thanks for sharing those stats. The /.well-known endpoints are actually the ones I consider simplest in their implementation. There is less code to run and less storage/database access. The token endpoint in comparison is more complex, but also (as far as I remember) utilizes all the storage/config that the /.well-known endpoints do. I do not doubt that this is a problem, but to properly troubleshoot we would need full activity traces that show timing all the way from the client's request to how it is handled on the server side. The only tangible data point we have so far is the one I referenced above, and that one shows that the execution of the discovery endpoint class only takes up a fraction of the total time. Are you using the dynamic providers feature?
-
I've opened another issue for YARP and, after collecting metrics and checking with the team there, it seems that YARP does not cause the latency. Metrics suggest that the request is immediately forwarded to the network stack. We do not use dynamic providers. As I've mentioned above, we started seeing the increased latency on the /.well-known endpoints; in comparison, the other protocol endpoints are far less affected. I'll enable ASP.NET Core telemetry to see where the additional time comes from.
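For anyone wanting to do the same, a minimal sketch of the tracing wiring is below, assuming the OpenTelemetry.Extensions.Hosting, OpenTelemetry.Instrumentation.AspNetCore and OpenTelemetry.Exporter.Console packages; the console exporter is only a stand-in for whichever backend is actually used.

```csharp
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;

var builder = WebApplication.CreateBuilder(args);

// Sketch: emit a server-side span per request so the time inside ASP.NET Core
// can be compared with what the caller (and the proxy) observes.
builder.Services.AddOpenTelemetry()
    .ConfigureResource(resource => resource.AddService("identityserver"))
    .WithTracing(tracing => tracing
        .AddAspNetCoreInstrumentation()
        .AddConsoleExporter()); // replace with the exporter for your monitoring backend

var app = builder.Build();
// ...rest of the pipeline...
```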
-
I think that is the right next step. Right now we don't know where the time is spent, and for any performance issue, metrics are the only way to solve it. There are things in the discovery endpoint as well as in the IdentityServer endpoint selection/routing that could potentially cause timing issues (never say never in these cases until it is proven). The only thing I can say is that the numbers shared so far indicate that the issue is outside of the IdentityServer middleware. That doesn't mean IdentityServer is not to blame - we won't know until we have metrics that show where the issue is.
-
I'll set up metrics logging; that will require some code changes to use the new OpenTelemetry packages. We noticed that those long-running requests to /.well-known/openid-configuration come in pairs, within a range of 10 ms, from the same client: one of them finishes within the expected latency, the other takes ~1 sec, as if there is a lock / resource contention.
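For reference, in .NET 8 ASP.NET Core publishes its request-duration histogram on the Microsoft.AspNetCore.Hosting meter, so the metrics side could look roughly like the sketch below; the package names and exporter are assumptions, not the actual setup used here.

```csharp
using OpenTelemetry.Metrics;

var builder = WebApplication.CreateBuilder(args);

// Sketch: export the built-in http.server.request.duration histogram so the
// per-route latency distribution (including /.well-known) becomes visible.
builder.Services.AddOpenTelemetry()
    .WithMetrics(metrics => metrics
        .AddMeter("Microsoft.AspNetCore.Hosting")         // http.server.request.duration
        .AddMeter("Microsoft.AspNetCore.Server.Kestrel")  // connection/queue metrics
        .AddConsoleExporter());                           // placeholder exporter
```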
-
(note: we're moving this issue to our new community discussions)
-
IdentityServer 7.2 was just released with a preview feature that can cache the output of the discovery endpoint. Can you please try that and see if it makes a difference?
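For anyone trying the preview feature, it is enabled through the IdentityServer options; the property names below are assumptions based on the 7.2 release notes and should be verified against the current documentation.

```csharp
builder.Services.AddIdentityServer(options =>
{
    // Preview feature in IdentityServer 7.2: cache the discovery document output.
    // NOTE: the option names below are assumptions taken from the 7.2 release
    // notes - check the official docs for the exact names and defaults.
    options.Preview.EnableDiscoveryDocumentCache = true;
    options.Preview.DiscoveryDocumentCacheDuration = TimeSpan.FromMinutes(5);
});
// ...clients, resources and stores are registered as before...
```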
-
Which version of Duende IdentityServer are you using?
7.1.0
Which version of .NET are you using?
.NET 8
Describe the bug
The endpoints
/.well-known/openid-configuration
/.well-known/openid-configuration/jwks
have a 99th percentile latency of ~1 sec, with peaks of up to 20 secs around 5:45 AM GMT+1
App Insights Performance (screenshot)
Trace (screenshot)
During the ~20 sec peaks, other endpoints slow down considerably as well
To Reproduce
Deploy to Azure App Service
Expected behavior
99th percentile latency of ~20 ms.
Log output/exception with stacktrace
Additional context
Our infrastructure consists of Azure App Services in West and North Europe, load balanced through Azure Front Door. Our Azure SQL server is Business Critical Gen5 with 8 vCores (40 GB RAM). The PaaS resource usage is less than 5%.
We found DuendeArchive/Support#1361 but no improvement occurred after updating Azure.Core to 1.44.
The data protection is configured with EF Core and protected with Azure Key Vault.
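For context, a data-protection setup like the one described (key ring persisted via EF Core, protected with a Key Vault key) typically looks roughly like the sketch below; MyDataProtectionDbContext and the key URI are placeholders, and the assumed packages are Microsoft.AspNetCore.DataProtection.EntityFrameworkCore and Azure.Extensions.AspNetCore.DataProtection.Keys.

```csharp
using Azure.Identity;
using Microsoft.AspNetCore.DataProtection;

// Sketch, not the actual configuration from this issue: the key ring is persisted
// via an EF Core DbContext (MyDataProtectionDbContext is a placeholder that
// implements IDataProtectionKeyContext) and wrapped with an Azure Key Vault key.
builder.Services.AddDataProtection()
    .PersistKeysToDbContext<MyDataProtectionDbContext>()
    .ProtectKeysWithAzureKeyVault(
        new Uri("https://my-vault.vault.azure.net/keys/dataprotection"),
        new DefaultAzureCredential());
```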
NuGet Versions with ~20ms 99th percentile (Pre 8th January)
NuGet Versions with ~1sec 99th percentile (Post 8th January)
We have added aggressive caching with a custom DiscoveryResponseGenerator and are now seeing the following behavior: between 14:30 and 14:40 I ran a dummy app polling the discovery endpoint at an interval, and latency was perfectly fine there. After stopping the dummy app, the regular traffic calling the endpoint started seeing latencies of ~1 sec again.

The relevant caching code for CreateDiscoveryDocumentAsync is here; CreateJwkDocumentAsync is similarly implemented.
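Since the linked snippet is not reproduced here, the sketch below shows roughly what such a caching wrapper could look like. The IDiscoveryResponseGenerator method shapes follow recent Duende versions, and the cache duration, key names and registration are illustrative assumptions rather than the code referenced above.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Duende.IdentityServer.Models;
using Duende.IdentityServer.ResponseHandling;
using Microsoft.Extensions.Caching.Memory;

// Rough sketch of a caching decorator over the default generator - not the code
// referenced in this issue. Method signatures assume recent Duende versions.
public class CachingDiscoveryResponseGenerator : IDiscoveryResponseGenerator
{
    private static readonly TimeSpan CacheDuration = TimeSpan.FromMinutes(5);

    private readonly DiscoveryResponseGenerator _inner;
    private readonly IMemoryCache _cache;

    public CachingDiscoveryResponseGenerator(DiscoveryResponseGenerator inner, IMemoryCache cache)
    {
        _inner = inner;
        _cache = cache;
    }

    public async Task<Dictionary<string, object>> CreateDiscoveryDocumentAsync(string baseUrl, string issuerUri)
    {
        var document = await _cache.GetOrCreateAsync($"disco:{issuerUri}", entry =>
        {
            entry.AbsoluteExpirationRelativeToNow = CacheDuration;
            return _inner.CreateDiscoveryDocumentAsync(baseUrl, issuerUri);
        });
        return document!;
    }

    public async Task<IEnumerable<JsonWebKey>> CreateJwkDocumentAsync()
    {
        var keys = await _cache.GetOrCreateAsync("disco:jwks", entry =>
        {
            entry.AbsoluteExpirationRelativeToNow = CacheDuration;
            return _inner.CreateJwkDocumentAsync();
        });
        return keys!;
    }
}

// One possible registration (after AddIdentityServer has registered the defaults):
// builder.Services.AddMemoryCache();
// builder.Services.AddTransient<DiscoveryResponseGenerator>();
// builder.Services.AddTransient<IDiscoveryResponseGenerator, CachingDiscoveryResponseGenerator>();
```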