AI Engineering with Python
Module 4: Agent Loops & Workflows (Recap)
Review: AI engineering module 4 recap
Task
Build `eval_run(prompts, expected, model)`: the smallest eval harness that's still useful.

1. Mock the model: `mock(i, p) = expected[i] if i % 2 == 0 else "WRONG"`. (A real eval would call the SDK here; we mock so the run is deterministic.)
2. For each prompt, compare the model output to `expected[i]`. On a miss, append `(prompt, expected, got)` to `failures`.
3. Return `{"pass_rate": <float>, "failures": [...], "model": model}`.

The harness runs 4 prompts, with even indices passing and odd indices failing; you should see `pass_rate=0.5` and two failure tuples.
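One possible way to satisfy the three steps is sketched below. It is not the lesson's reference solution. The sample prompts, the `"mock-model"` label, and the choice to report a pass rate of `0.0` for an empty prompt list are assumptions made for illustration; the task does not specify them. Each failure tuple stores the expected value for that index, which is one reading of `(prompt, expected, got)`.

```python
def eval_run(prompts, expected, model):
    """Score a mocked model against the expected outputs."""

    def mock(i, p):
        # Deterministic stand-in for a real SDK call: even indices return
        # the expected answer, odd indices return the string "WRONG".
        return expected[i] if i % 2 == 0 else "WRONG"

    failures = []
    for i, prompt in enumerate(prompts):
        got = mock(i, prompt)
        if got != expected[i]:
            # Record enough context to debug the miss later.
            failures.append((prompt, expected[i], got))

    total = len(prompts)
    # Assumption: an empty run reports 0.0 rather than dividing by zero.
    pass_rate = (total - len(failures)) / total if total else 0.0
    return {"pass_rate": pass_rate, "failures": failures, "model": model}


# Hypothetical sample data: four prompts, so indices 0 and 2 pass.
prompts = ["2+2?", "Capital of France?", "Opposite of hot?", "First prime?"]
expected = ["4", "Paris", "cold", "2"]

report = eval_run(prompts, expected, model="mock-model")
print(report["pass_rate"])
print(report["failures"])
```

With this sample data the two failures come from indices 1 and 3, matching the behaviour the task describes. Keeping `mock` as an inner function means a real client call could later replace it without changing the return shape.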