.NET: Surface x-ms-served-model header as ChatResponse.ModelId for Foundry agents #5979
Merged
rogerbarreto merged 5 commits on May 21, 2026
Pull request overview
Surfaces the Azure OpenAI Responses API x-ms-served-model response header (the real served snapshot model, e.g. gpt-5-nano-2025-08-07) as ChatResponse.ModelId / ChatResponseUpdate.ModelId for Foundry-based agents, so telemetry and callers see the actual model instead of the deployment alias from the JSON body.
Changes:
- Add an internal SCM PipelinePolicy (ServedModelPolicy) + AsyncLocal<StrongBox<string?>> carrier (ServedModelScope) to capture x-ms-served-model after the HTTP roundtrip.
- Add an internal DelegatingChatClient (ServedModelChatClient) that overwrites ModelId on responses/updates using the captured header.
- Wire this behavior into Foundry agent creation paths and add unit + integration coverage.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| dotnet/src/Microsoft.Agents.AI.Foundry/ServedModelScope.cs | AsyncLocal StrongBox carrier for propagating the served-model header across async boundaries. |
| dotnet/src/Microsoft.Agents.AI.Foundry/ServedModelPolicy.cs | SCM pipeline policy capturing x-ms-served-model into the scope after ProcessNext. |
| dotnet/src/Microsoft.Agents.AI.Foundry/ServedModelChatClient.cs | Delegating chat client that applies the captured served model onto ModelId for non-streaming and streaming. |
| dotnet/src/Microsoft.Agents.AI.Foundry/FoundryAgent.cs | Wires served-model policy/client into agent creation (responses + agent endpoint paths). |
| dotnet/src/Microsoft.Agents.AI.Foundry/AzureAIProjectChatClientExtensions.cs | Wires served-model policy/client into AIProjectClient.AsAIAgent extension creation paths. |
| dotnet/tests/Microsoft.Agents.AI.Foundry.UnitTests/ServedModelTests.cs | New unit tests covering scope/policy/client behavior and end-to-end behavior via real OpenAI SCM pipeline + mock HTTP. |
| dotnet/tests/Microsoft.Agents.AI.Foundry.UnitTests/Microsoft.Agents.AI.Foundry.UnitTests.csproj | Excludes ServedModelTests for legacy TFMs (net472). |
| dotnet/tests/Foundry.IntegrationTests/ResponsesAgentServedModelTests.cs | New live integration tests validating ChatResponse.ModelId reflects served-model header. |
Comments suppressed due to low confidence (1)
dotnet/src/Microsoft.Agents.AI.Foundry/ServedModelChatClient.cs:69
- GetStreamingResponseAsync sets ServedModelScope.Current but never restores it after enumeration completes or errors. Even if AsyncLocal sometimes restores across awaited boundaries, async iterators can complete synchronously or be disposed early, leaving the scope set unexpectedly. Wrap the await-foreach in try/finally (saving/restoring the previous ServedModelScope.Current) so the scope is always cleared/restored when the enumerator is disposed.
```csharp
{
    var box = new StrongBox<string?>(null);
    ServedModelScope.Current = box;
    await foreach (var update in base.GetStreamingResponseAsync(messages, options, cancellationToken).ConfigureAwait(false))
    {
        if (box.Value is { } servedModel)
        {
            update.ModelId = servedModel;
        }
        yield return update;
    }
}
```
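The save-and-restore shape that the suppressed comment asks for could look roughly like the sketch below. `Update`, `ScopeRestoreSketch`, `Current`, and `FakeInnerAsync` are hypothetical stand-ins invented for illustration, not types from the PR; the fake inner stream plays the role of the pipeline policy by writing a served-model value into the box it finds in scope.

```csharp
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a streaming update that carries a model id.
public sealed class Update
{
    public string? ModelId { get; set; }
}

public static class ScopeRestoreSketch
{
    // Hypothetical stand-in for ServedModelScope.Current.
    private static readonly AsyncLocal<StrongBox<string?>?> s_current = new();

    public static StrongBox<string?>? Current
    {
        get => s_current.Value;
        set => s_current.Value = value;
    }

    // Pushes a fresh box, enumerates the inner stream, and restores the
    // previous scope in finally so the box is not left behind on exit,
    // early disposal, or failure.
    public static async IAsyncEnumerable<Update> WithServedModelAsync(
        IAsyncEnumerable<Update> inner,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        var previous = Current;
        var box = new StrongBox<string?>(null);
        Current = box;
        try
        {
            await foreach (var update in inner.WithCancellation(cancellationToken).ConfigureAwait(false))
            {
                if (box.Value is { } servedModel)
                {
                    update.ModelId = servedModel;
                }
                yield return update;
            }
        }
        finally
        {
            Current = previous;
        }
    }

    // Simulates the inner client: the "policy" writes the served model into
    // the box in scope, then one update is yielded with the deployment alias.
    public static async IAsyncEnumerable<Update> FakeInnerAsync()
    {
        await Task.Yield();
        if (Current is { } box)
        {
            box.Value = "gpt-5-nano-2025-08-07";
        }
        yield return new Update { ModelId = "my-deployment" };
    }
}
```

Enumerating `ScopeRestoreSketch.WithServedModelAsync(ScopeRestoreSketch.FakeInnerAsync())` yields a single update whose `ModelId` has been overwritten with the value from the box. The local `box` variable, not the ambient scope, is what the loop reads, so the overwrite does not depend on the scope surviving across `yield return`.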
westey-m reviewed on May 21, 2026
westey-m approved these changes on May 21, 2026
SergeyMenshykh approved these changes on May 21, 2026
…undry agents

Mirrors Python PR microsoft#5910. Adds an internal SCM PipelinePolicy that reads the x-ms-served-model HTTP response header on Azure OpenAI Responses calls and writes it into an AsyncLocal box. A DelegatingChatClient sits between OpenTelemetry and the MEAI OpenAIResponsesChatClient and overwrites ChatResponse.ModelId with the served snapshot so OTel spans report the actual model rather than the deployment alias. Wired through all AsAIAgent paths in Microsoft.Agents.AI.Foundry.

- Restore previous ServedModelScope in finally to avoid AsyncLocal leak into caller execution context.
- Make served-model integration test assertion robust to deployment names that already match the snapshot pattern.
- Broaden UnitTests csproj comment to cover all conditional removals (net8.0+ requirement).

Split the combined ServedModelTests.cs into one test class per SUT:
- ServedModelScopeTests.cs (AsyncLocal carrier)
- ServedModelPolicyTests.cs (SCM pipeline policy)
- ServedModelChatClientTests.cs (delegating client, with regions for Non-streaming / Streaming / End-to-end)

Shared helpers and fake clients moved into ServedModelTestHelpers.cs. Csproj net8.0+ exclusion list updated accordingly.

Move x-ms-served-model header capture from the standalone ServedModelChatClient decorator directly into FoundryChatClient, eliminating a separate wrapper that had to be applied at every Foundry entry point via WireServedModel().
- Register ServedModelPolicy in FoundryChatClient constructors (alongside the existing AgentFrameworkUserAgentPolicy registration)
- Add StrongBox push/read logic to FoundryChatClient.GetResponseAsync and GetStreamingResponseAsync
- Delete ServedModelChatClient.cs and its unit tests
- Remove WireServedModel() from FoundryAgent and AIProjectClientExtensions
- Update ServedModelPolicy/Scope XML docs to reference FoundryChatClient
- Simplify ServedModelTestHelpers to use FoundryChatClient directly

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
lokitoth approved these changes on May 21, 2026
72e1d0c to 6c2f8e4
SergeyMenshykh approved these changes on May 21, 2026
.NET twin of Python PR #5910.

The Foundry Responses API sends back an x-ms-served-model header. The header carries the real snapshot (e.g. gpt-5-nano-2025-08-07), while the JSON body carries only the deployment alias. Without the fix, OTel spans and ChatResponse.ModelId show the wrong name.

Fix:
- ServedModelPolicy (SCM PipelinePolicy) reads the header after ProcessNext.
- AsyncLocal<StrongBox<string?>> carries the value across async boundaries (a box mutation propagates back to the caller; a plain AsyncLocal write does not).
- ServedModelChatClient (internal DelegatingChatClient) pushes the box before the inner call and overwrites ChatResponse.ModelId after it. It sits between OTel and the MEAI OpenAIResponsesChatClient.
- Wired through the AsAIAgent paths in Microsoft.Agents.AI.Foundry (Responses direct, hosted agent endpoint, versioned, AgentRecord, AgentReference).
- New types are internal. MEAI.Core is untouched (provider-agnostic).

Tests:
- ModelId differs from the deployment alias.

All green: 283 unit tests pass, 2 integration tests pass, the build is clean with --warnaserror, and the CI-parity dotnet format run is clean.
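
The reason for carrying a StrongBox inside the AsyncLocal, rather than writing the header value to the AsyncLocal directly, can be seen in a small standalone sketch. It uses only the .NET base class library; `AsyncLocalBoxDemo`, `InnerAsync`, and `RunAsync` are hypothetical names chosen for illustration, with `InnerAsync` standing in for code that runs deeper in the call chain, such as a pipeline policy.

```csharp
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncLocalBoxDemo
{
    private static readonly AsyncLocal<string?> s_plain = new();
    private static readonly AsyncLocal<StrongBox<string?>?> s_boxed = new();

    // Stand-in for the callee that learns the served model.
    private static async Task InnerAsync()
    {
        await Task.Yield();

        // A plain AsyncLocal write changes only the callee's execution
        // context, so the caller never observes it.
        s_plain.Value = "gpt-5-nano-2025-08-07";

        // Mutating the shared box is visible to everyone holding a
        // reference to it, including the caller.
        if (s_boxed.Value is { } box)
        {
            box.Value = "gpt-5-nano-2025-08-07";
        }
    }

    public static async Task<(string? Plain, string? Boxed)> RunAsync()
    {
        var box = new StrongBox<string?>(null);
        s_boxed.Value = box;   // flows down into InnerAsync

        await InnerAsync();

        // Plain is null here; Boxed holds the value written by the callee.
        return (s_plain.Value, box.Value);
    }
}
```

Calling `await AsyncLocalBoxDemo.RunAsync()` returns `(null, "gpt-5-nano-2025-08-07")`: the AsyncLocal reference flows down into the callee, and the mutation of the box it points to is the only part that comes back up.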