Resilience improvements for instance discovery#5811

Open
bgavrilMS wants to merge 9 commits into main from bogavril/5809

@bgavrilMS bgavrilMS commented Mar 5, 2026

Member

Fixes #5804 #5805

If instance discovery fails due to a 404 or 502, it should not be attempted again.
Instance discovery should have a reasonable timeout.

@bgavrilMS bgavrilMS requested a review from a team as a code owner March 5, 2026 14:46
Copilot AI review requested due to automatic review settings March 6, 2026 12:04

Copilot AI left a comment

Contributor

Pull request overview

Improves AAD instance discovery resilience and performance by avoiding repeated network instance discovery attempts when the endpoint is failing/unreachable, and by bounding discovery latency with a dedicated timeout.

Changes:

  • Cache a fallback instance discovery entry when network instance discovery fails (non-invalid_instance) to avoid retrying discovery on subsequent token requests.
  • Add a per-instance-discovery timeout (default 10s) by linking a timeout CancellationToken into the discovery request flow.
  • Add unit tests covering caching-on-failure and timeout fallback behavior; add a rules doc for cross-SDK reference.
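
As a rough illustration of the first two changes (not the code in this PR), the sketch below links a timeout token to the caller's token and caches a fallback entry when discovery fails. All names here (`DiscoverySketch`, `InvalidInstanceException`, the `fallback:` string) are hypothetical, and it assumes caller cancellation should propagate without caching, which the summary above does not state.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical exception standing in for an invalid_instance error.
public sealed class InvalidInstanceException : Exception
{
    public InvalidInstanceException(string message) : base(message) { }
}

// Hypothetical sketch; names and shapes are assumptions, not MSAL's implementation.
public sealed class DiscoverySketch
{
    private readonly ConcurrentDictionary<string, string> _cache =
        new ConcurrentDictionary<string, string>();
    private readonly Func<string, CancellationToken, Task<string>> _fetch;
    private readonly TimeSpan _timeout;

    public DiscoverySketch(Func<string, CancellationToken, Task<string>> fetch, TimeSpan timeout)
    {
        _fetch = fetch;
        _timeout = timeout;
    }

    public async Task<string> GetMetadataAsync(string host, CancellationToken userToken)
    {
        // Cached entries include fallbacks, so a failed discovery is not retried.
        if (_cache.TryGetValue(host, out string cached))
        {
            return cached;
        }

        // The linked source cancels when the caller cancels or the timeout elapses.
        using (var linked = CancellationTokenSource.CreateLinkedTokenSource(userToken))
        {
            linked.CancelAfter(_timeout);
            string metadata;
            try
            {
                metadata = await _fetch(host, linked.Token).ConfigureAwait(false);
            }
            catch (InvalidInstanceException)
            {
                throw; // invalid_instance is a real error: surface it, cache nothing
            }
            catch (OperationCanceledException) when (userToken.IsCancellationRequested)
            {
                throw; // assumption: caller cancellation propagates and is not cached
            }
            catch (Exception)
            {
                // Timeout or network/HTTP failure: use host-only fallback metadata.
                metadata = "fallback:" + host;
            }

            _cache[host] = metadata;
            return metadata;
        }
    }
}
```

With a fetch delegate that always fails, the first call returns the fallback and the second call is served from the cache without invoking the delegate again.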

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

| File | Description |
|---|---|
| tests/Microsoft.Identity.Test.Unit/PublicApiTests/InstanceDiscoveryTests.cs | Adds tests ensuring instance discovery failures/timeouts are cached and not retried. |
| src/client/Microsoft.Identity.Client/Internal/RequestContext.cs | Makes UserCancellationToken settable to support temporary override during instance discovery. |
| src/client/Microsoft.Identity.Client/Instance/Discovery/NetworkMetadataProvider.cs | Adds instance discovery timeout and links it into the outgoing request. |
| src/client/Microsoft.Identity.Client/Instance/Discovery/InstanceDiscoveryManager.cs | Caches fallback metadata on discovery failure; updates warning text. |
| docs/instance-discovery-rules.md | Adds a detailed description of instance discovery behavior and error-handling rules. |

Comment thread src/client/Microsoft.Identity.Client/Internal/RequestContext.cs
Copilot AI review requested due to automatic review settings March 24, 2026 12:12
@bgavrilMS bgavrilMS linked an issue Mar 24, 2026 that may be closed by this pull request

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 4 out of 5 changed files in this pull request and generated 2 comments.

Comment thread tests/Microsoft.Identity.Test.Unit/PublicApiTests/InstanceDiscoveryTests.cs Outdated
Copilot AI review requested due to automatic review settings March 25, 2026 13:46

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Copilot AI review requested due to automatic review settings March 31, 2026 16:44

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

@bgavrilMS bgavrilMS changed the title from "Bogavril/5809" to "Resilience improvements for instance discovery" Mar 31, 2026
Copilot AI review requested due to automatic review settings April 1, 2026 17:01

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Comment thread docs/instance-discovery-rules.md

gladjohn commented Apr 1, 2026

Contributor

Looks good; one follow-up test is needed.

Copilot AI review requested due to automatic review settings April 13, 2026 22:29

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

Comment on lines +231 to +235
httpManager.AddMockHandler(new MockHttpMessageHandler()
{
    ExpectedMethod = HttpMethod.Get,
    ExceptionToThrow = new TaskCanceledException("simulated timeout")
});

Copilot AI Apr 13, 2026

This test simulates an instance discovery timeout, but the mock GET handler does not set ExpectedUrl / expected query params. Adding them would make the test assert that the failing call is actually the instance discovery endpoint (and not some other GET) before verifying the fallback caching behavior.

Comment on lines +277 to +285
httpManager.AddMockHandler(new MockHttpMessageHandler()
{
    ExpectedMethod = HttpMethod.Get,
    ResponseMessage = new HttpResponseMessage(HttpStatusCode.OK)
    {
        Content = new StringContent("{}")
    },
    AdditionalRequestValidation = _ => callerCts.Cancel()
});

Copilot AI Apr 13, 2026

The instance discovery mock here cancels the caller token via AdditionalRequestValidation, but it does not constrain the request target. Setting ExpectedUrl (and expected query params) would ensure the cancellation is happening during the intended instance discovery call and not another GET in the flow.

Comment on lines +176 to +183
httpManager.AddMockHandler(new MockHttpMessageHandler()
{
    ExpectedMethod = HttpMethod.Get,
    ResponseMessage = new HttpResponseMessage(errorStatusCode)
    {
        Content = new StringContent("error")
    }
});

Copilot AI Apr 13, 2026

The instance-discovery mock handler is very permissive (only ExpectedMethod = GET), so any unexpected GET (e.g., to a different endpoint) could consume this handler and make the test less precise. Consider also setting ExpectedUrl (and, if feasible, the expected query params) for the instance discovery request so the test validates the correct endpoint is called before the failure is cached.
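
A minimal sketch of what a stricter check could look like, written against plain `HttpMessageHandler` rather than the repo's `MockHttpMessageHandler`. `StrictGetHandler` is a hypothetical name, and the discovery URL in the usage note is an assumption about what the tests would expect.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a test mock; not the repository's MockHttpMessageHandler.
public sealed class StrictGetHandler : HttpMessageHandler
{
    private readonly string _expectedUrl;
    private readonly HttpResponseMessage _response;

    public StrictGetHandler(string expectedUrl, HttpResponseMessage response)
    {
        _expectedUrl = expectedUrl;
        _response = response;
    }

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        // Compare scheme, host and path only; the query string is ignored here.
        string actual = request.RequestUri?.GetLeftPart(UriPartial.Path);

        if (request.Method != HttpMethod.Get ||
            !string.Equals(actual, _expectedUrl, StringComparison.OrdinalIgnoreCase))
        {
            // Fail loudly so an unrelated GET cannot consume this handler.
            throw new InvalidOperationException(
                $"Unexpected request: {request.Method} {actual}");
        }

        return Task.FromResult(_response);
    }
}
```

For example, `new StrictGetHandler("https://login.microsoftonline.com/common/discovery/instance", new HttpResponseMessage(HttpStatusCode.NotFound))` would return the 404 only for a GET to that path and throw for any other request.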

Copilot AI review requested due to automatic review settings May 22, 2026 00:07

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated no new comments.

bgavrilMS and others added 9 commits June 2, 2026 16:38
Set ExpectedUrl on all instance discovery mock handlers to ensure
the tests validate that the correct endpoint is called before
verifying fallback/caching behavior.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Development

Successfully merging this pull request may close these issues.

  • [Feature Request] Instance discovery should timeout
  • [Bug] If instance discovery fails due to 404 or 502, it should not be attempted again

5 participants