Coach Guide — Advanced: Deploy as a Hosted Agent
Coach-only. Do not share with students. This guide holds the verified deployment path, the failure modes teams hit with
azd ai agent, and the facilitation arc.
⚠️ This challenge was rewritten away from Prompt Flow. The old version deployed a Prompt Flow to a managed online endpoint and bolted on a Flask app. All of that is gone. If a team is following an old printout that mentions “package the Prompt Flow”, “managed online endpoint”, or a Flask UI, stop them — that content is deprecated. The artifact here is a hosted Foundry agent deployed with
azd ai agent. (A Flask/Streamlit UI is now its own Extra — Build a UI — that targets this endpoint.)
What this challenge proves
A team finishes when the Northfield IQ Assistant runs as a hosted, containerized agent with its own endpoint, its own version, and a per-agent managed identity, and they can invoke it over the production Responses protocol with auth enforced and runs visible in App Insights. This is the “ship it” capstone — real container deployment, not the “next steps only” hand-wave the reference labs stop at.
Assumes the Foundations end-state (or bootstrap). If the agent isn’t grounded locally, that’s a Foundations problem first.
The deployment pipeline at a glance
agent.yaml + Dockerfile + main.py → az acr build (or local docker) → image in ACR → azd ai agent create/deploy → hosted version provisions + per-agent identity → invoke Responses endpoint → run history + traces. The most common mistake is treating a successful azd ai agent deploy exit code as “done” — the version provisions asynchronously, so Step 2’s checkpoint waits for status == active.
Step-by-step coaching
Step 1 — agent.yaml + entrypoint
-
Protocol + port are load-bearing. Hosted agents must listen on
0.0.0.0:8088and declare theresponsesprotocol (v1.0.0) inagent.yaml. A container that binds127.0.0.1or a different port will deploy but never become healthy. -
Reuse the Foundations persona. The
instructions:block should be the same grounded, cite-your-sources persona from Foundations Step 3 — don’t let teams rewrite it here. -
The MAF server host (
AzureAIAgentServerHostor the equivalent in the currentagent-frameworkrelease) implements the Responses contract for them. Teams that try to hand-roll a Flask/responsesroute can do it, but it’s a time sink — steer them to the framework host. Verified reference:foundry-samples/samples/python/hosted-agents/agent-framework/responses/. -
Local smoke test before any cloud work:
docker build --platform linux/amd64 -t niq:test hosted/ && docker run -p 8088:8088 niq:test, thencurl -X POST http://localhost:8088/responses -d '{"input":"Where is the registrar?"}'. If this fails locally it will fail in ACR — fix it here.
Step 2 — Containerize + deploy
-
Cloud build is the default — most devcontainers don’t have a working Docker daemon. The
--source-acr-auth-id "[caller]"flag is mandatory; omitting it is the #1az acr buildfailure (auth context missing). Verified command is in the README. -
Unique image tag every build. Reusing
latest/v1serves a stale layer and changes won’t roll out. The README uses$(date +%Y%m%d%H%M). -
ACR pull permission: the Foundry project managed identity needs repository-scoped pull on the ACR (Foundry hosted agents use ABAC mode —
Container Registry Repository Reader, not registry-levelAcrPull).azd ai agentusually wires this; if the version fails to pull, this is why. -
activeis the gate.az ai agent show --query "version,status"; provisioning can take a couple of minutes. Don’t let teams move to Step 3 on aprovisioningversion — invokes will return424 FailedDependency/session_not_ready.
Step 3 — Invoke + identity
-
Two identities, keep them straight: (1) the caller (the student’s
DefaultAzureCredentialbearer token) authenticates into the endpoint; (2) the per-agent managed identity is what the agent uses to reach the model and knowledge base. The teaching point is that the agent no longer rides on the student’s credentials. -
Required role: the caller needs
Foundry User(formerlyAzure AI User) on the project to invoke. A403on an authenticated call is almost always a missing role assignment, not bad code. -
Auth-enforced check: an anonymous call (no
Authorizationheader) must return401/403. If it returns200, something is misconfigured — escalate, don’t ship. -
Responses route:
{endpoint}/agents/{agentName}/endpoint/protocols/openai/responses. Teams often fat-finger this path; have them printbase_urlbefore debugging deeper.
Step 4 — Monitoring back to Tracing
-
Hosted agents inherit the project’s App Insights, so the spans land in the same tables the team queried in the Tracing challenge. The only new dimension is
cloud_RoleName, which carries the agent/container name — that’s how they scope KQL to hosted runs. -
If a team skipped the Tracing challenge, they can still pass Step 4 via the portal Run history + Tracing tab; the
correlate.kqlreuse is the richer path but not required.
Timing (60 min)
- 0–20 min: Step 1 — agent.yaml, entrypoint, local container smoke test (biggest time sink).
- 20–40 min: Step 2 — ACR build +
azd ai agent deploy+ wait foractive. - 40–50 min: Step 3 — invoke + identity/auth verification.
- 50–60 min: Step 4 — run history + traces.
If time is tight, prioritize a working authenticated invoke (Steps 1–3). Step 4 can be a quick portal walkthrough.
Expected questions
-
“
azd ai agent deploysucceeded but invoke returns 424.” → version still provisioning. Wait forstatus == active. - “
az acr buildfails with an auth error.” → missing--source-acr-auth-id "[caller]". - “My code change didn’t take effect.” → reused image tag; rebuild with a fresh timestamp tag.
- “403 on an authenticated call.” → caller missing
Foundry User(formerlyAzure AI User) role on the project. -
“Where’s the Flask app / managed endpoint from the old challenge?” → removed. This is a hosted agent now; a UI is the separate Build a UI extra.
- “Container deploys but never goes healthy.” → not listening on
0.0.0.0:8088, or wrong protocol inagent.yaml.
Cleanup discipline
Remind teams to azd down (or az ai agent delete) after the event — a running hosted agent and its ACR image incur cost. Good hygiene to call out at the showcase wrap-up.
Success definition
validate.py --step 4 passes; the agent has an active hosted version; an authenticated Responses call returns a grounded answer; an anonymous call is rejected; and the team can point to the run in both run history and App Insights. No Prompt Flow, no managed online endpoint anywhere in their solution.