Coach Guide — Advanced: Deploy as a Hosted Agent

Coach-only. Do not share with students. This guide holds the verified deployment path, the failure modes teams hit with azd ai agent, and the facilitation arc.

⚠️ This challenge was rewritten away from Prompt Flow. The old version deployed a Prompt Flow to a managed online endpoint and bolted on a Flask app. All of that is gone. If a team is following an old printout that mentions “package the Prompt Flow”, “managed online endpoint”, or a Flask UI, stop them — that content is deprecated. The artifact here is a hosted Foundry agent deployed with azd ai agent. (A Flask/Streamlit UI is now its own Extra — Build a UI — that targets this endpoint.)

What this challenge proves

A team finishes when the Northfield IQ Assistant runs as a hosted, containerized agent with its own endpoint, its own version, and a per-agent managed identity, and they can invoke it over the production Responses protocol with auth enforced and runs visible in App Insights. This is the “ship it” capstone — real container deployment, not the “next steps only” hand-wave the reference labs stop at.

Assumes the Foundations end-state (or bootstrap). If the agent isn’t grounded locally, that’s a Foundations problem first.

The deployment pipeline at a glance

agent.yaml + Dockerfile + main.pyaz acr build (or local docker) → image in ACR → azd ai agent create/deploy → hosted version provisions + per-agent identity → invoke Responses endpoint → run history + traces. The most common mistake is treating a successful azd ai agent deploy exit code as “done” — the version provisions asynchronously, so Step 2’s checkpoint waits for status == active.

Step-by-step coaching

Step 1 — agent.yaml + entrypoint

  • Protocol + port are load-bearing. Hosted agents must listen on 0.0.0.0:8088 and declare the responses protocol (v 1.0.0) in agent.yaml. A container that binds 127.0.0.1 or a different port will deploy but never become healthy.

  • Reuse the Foundations persona. The instructions: block should be the same grounded, cite-your-sources persona from Foundations Step 3 — don’t let teams rewrite it here.

  • The MAF server host (AzureAIAgentServerHost or the equivalent in the current agent-framework release) implements the Responses contract for them. Teams that try to hand-roll a Flask /responses route can do it, but it’s a time sink — steer them to the framework host. Verified reference: foundry-samples/samples/python/hosted-agents/agent-framework/responses/.

  • Local smoke test before any cloud work: docker build --platform linux/amd64 -t niq:test hosted/ && docker run -p 8088:8088 niq:test, then curl -X POST http://localhost:8088/responses -d '{"input":"Where is the registrar?"}'. If this fails locally it will fail in ACR — fix it here.

Step 2 — Containerize + deploy

  • Cloud build is the default — most devcontainers don’t have a working Docker daemon. The --source-acr-auth-id "[caller]" flag is mandatory; omitting it is the #1 az acr build failure (auth context missing). Verified command is in the README.

  • Unique image tag every build. Reusing latest/v1 serves a stale layer and changes won’t roll out. The README uses $(date +%Y%m%d%H%M).

  • ACR pull permission: the Foundry project managed identity needs repository-scoped pull on the ACR (Foundry hosted agents use ABAC mode — Container Registry Repository Reader, not registry-level AcrPull). azd ai agent usually wires this; if the version fails to pull, this is why.

  • active is the gate. az ai agent show --query "version,status"; provisioning can take a couple of minutes. Don’t let teams move to Step 3 on a provisioning version — invokes will return 424 FailedDependency / session_not_ready.

Step 3 — Invoke + identity

  • Two identities, keep them straight: (1) the caller (the student’s DefaultAzureCredential bearer token) authenticates into the endpoint; (2) the per-agent managed identity is what the agent uses to reach the model and knowledge base. The teaching point is that the agent no longer rides on the student’s credentials.

  • Required role: the caller needs Foundry User (formerly Azure AI User) on the project to invoke. A 403 on an authenticated call is almost always a missing role assignment, not bad code.

  • Auth-enforced check: an anonymous call (no Authorization header) must return 401/403. If it returns 200, something is misconfigured — escalate, don’t ship.

  • Responses route: {endpoint}/agents/{agentName}/endpoint/protocols/openai/responses. Teams often fat-finger this path; have them print base_url before debugging deeper.

Step 4 — Monitoring back to Tracing

  • Hosted agents inherit the project’s App Insights, so the spans land in the same tables the team queried in the Tracing challenge. The only new dimension is cloud_RoleName, which carries the agent/container name — that’s how they scope KQL to hosted runs.

  • If a team skipped the Tracing challenge, they can still pass Step 4 via the portal Run history + Tracing tab; the correlate.kql reuse is the richer path but not required.

Timing (60 min)

  • 0–20 min: Step 1 — agent.yaml, entrypoint, local container smoke test (biggest time sink).
  • 20–40 min: Step 2 — ACR build + azd ai agent deploy + wait for active.
  • 40–50 min: Step 3 — invoke + identity/auth verification.
  • 50–60 min: Step 4 — run history + traces.

If time is tight, prioritize a working authenticated invoke (Steps 1–3). Step 4 can be a quick portal walkthrough.

Expected questions

  • azd ai agent deploy succeeded but invoke returns 424.” → version still provisioning. Wait for status == active.

  • az acr build fails with an auth error.” → missing --source-acr-auth-id "[caller]".
  • “My code change didn’t take effect.” → reused image tag; rebuild with a fresh timestamp tag.
  • “403 on an authenticated call.” → caller missing Foundry User (formerly Azure AI User) role on the project.
  • “Where’s the Flask app / managed endpoint from the old challenge?” → removed. This is a hosted agent now; a UI is the separate Build a UI extra.

  • “Container deploys but never goes healthy.” → not listening on 0.0.0.0:8088, or wrong protocol in agent.yaml.

Cleanup discipline

Remind teams to azd down (or az ai agent delete) after the event — a running hosted agent and its ACR image incur cost. Good hygiene to call out at the showcase wrap-up.

Success definition

validate.py --step 4 passes; the agent has an active hosted version; an authenticated Responses call returns a grounded answer; an anonymous call is rejected; and the team can point to the run in both run history and App Insights. No Prompt Flow, no managed online endpoint anywhere in their solution.


WTH AI Hackathon — Built with ❤️ for students and coaches

This site uses Just the Docs, a documentation theme for Jekyll.