Split the provider into focused auth, onboarding, CLI, runtime, and shared modules so the Entra ID flow is easier to review and maintain. Add Foundry-specific tests, preserve Azure CLI error details, move token refresh off the synchronous request path, and dedupe concurrent Entra token refreshes so onboarding and GPT-5 runtime behavior stay reliable.
Only retry Azure login with an explicit tenant when the CLI failure actually points to tenant or subscription scope, keep HTTP 400 connection checks informative without treating them as a silent success, and move the model-selection hook onto the provider so manual Foundry setups can preserve GPT-5 family hints and resolve the right runtime endpoint.
Prefer the currently selected model hint during runtime auth refresh so switching Foundry deployments cannot reuse stale onboarding metadata and route requests to the wrong GPT-5 or non-GPT-5 endpoint.
Fix the renamed workspace path in pnpm-lock, make onboarding test and runtime routing respect the underlying model family for GPT-5 deployments, configure the API-key path with provider metadata, replace shell-built az commands with argument-based execution, and scope the Entra token cache by tenant/subscription so the provider behaves correctly across Foundry setups.
Handle Windows az device-code login, tenant fallback, and Azure resource/deployment discovery so onboard can reuse existing Microsoft Foundry setups without manual endpoint entry. Normalize GPT-5 deployments onto the Foundry responses base URL at selection and runtime auth time, and fall back to hard links for Windows plugin staging so local builds and chats work end-to-end.