Is CodeMentor AI free?

The first 15 Python lessons are free with no signup and no credit card. After that, the 7-day Pro trial unlocks every track; cancel anytime. Pro is $12/month or $89/year.

Can I learn Python without installing anything?

Yes. Every lesson runs Python in your browser — Skulpt for lightweight lessons and Pyodide (full CPython) on the playground. No Anaconda, no pyenv, no terminal commands. Open the page and hit Run.

Is CodeMentor AI good for complete beginners?

Yes — the Foundations track starts with print('Hello, World!') and assumes zero programming background. The first 15 lessons are free to verify the difficulty curve matches you before any signup.

Does the AI tutor replace a human mentor?

It replaces 80% of "I'm stuck at 21:00 and Stack Overflow scared me" moments. You get hints calibrated to your code + a chat for follow-up questions. For project review and career advice the team also answers support@learnpython.academy directly.

Can I learn Python for an AI engineering job?

Yes. The AI Engineering track covers production patterns the US dev community uses in 2026 — Claude/LLM APIs, tool use, RAG, agent loops, prompt caching, evals, voice agents. Build production AI features end-to-end.

Are the courses available in languages other than English?

Yes — the platform UI and most lessons are translated into 18 languages including Ukrainian, Russian, Polish, German, French, Spanish, Portuguese, and more. Pick yours in the language switcher.

← All projects

L3AI Engineering· 25-40h total

AI Telegram bot with conversation memory & tool use

AI Engineering is one of the fastest-growing junior areas right now, and the entry signal teams look for is 'has shipped a real LLM-backed system, not a demo'. This project covers the whole stack — API integration, memory, tools, deployment, evals — in one resume bullet.

▶Open in GitHub Codespaces· free 60h/mo Open in Gitpod

Resume bullet (when finished)

“Shipped a production AI Telegram bot with multi-turn conversation memory, tool use (web search + calculator), Redis persistence, webhook deployment, and a small eval suite. 99.9% uptime over the first month.”

Locked tech stack

No "choose your language" — analysis paralysis kills completion. Follow the stack to the letter on your first build.

Python 3.12Anthropic SDK (Claude Opus 4.6)python-telegram-bot v21Redis (conversation memory + rate limit)FastAPI (webhook receiver)Fly.io (deploy)pytest + httpx

Milestones (7 · ~29h)

M1~3h
Local echo bot — Telegram webhook reaches FastAPI
Set up a bot via @BotFather. ngrok tunnels localhost. /api/telegram/webhook receives Update payloads and echoes the text back.
CHECK BEFORE MOVING ON:
- What's the difference between long-polling and webhooks for a Telegram bot?
- Why must your webhook always respond <5s, even if the LLM call takes longer?
$ git commit -m "feat: telegram webhook receiver + echo behaviour"
M2~3h
Single-turn Claude call
Replace the echo with a Claude API call (claude-opus-4-6, streaming off for simplicity). Reply text comes back to Telegram via the bot API.
CHECK BEFORE MOVING ON:
- Why call the API server-side instead of from the user's device?
- What's `max_tokens` for and what happens if you set it too high?
$ git commit -m "feat(ai): single-turn Claude response"
M3~5h
Multi-turn memory via Redis
Store conversation history keyed by `chat_id` in Redis. Pass last 10 messages (or last 4000 tokens — whichever is smaller) on each turn.
CHECK BEFORE MOVING ON:
- Why does the system message live OUTSIDE the rolling history?
- What's the failure mode if you forget to truncate history?
$ git commit -m "feat(memory): redis-backed multi-turn conversation context"
M4~6h
Tool use — calculator + web search
Declare two tools (`calculator(expression: str)`, `web_search(query: str)`). On `tool_use` stop reason, execute, append `tool_result`, loop until the model emits text.
CHECK BEFORE MOVING ON:
- Why does the spec require you to send tool_result back in the SAME ordering as tool_use blocks?
- What's the safety risk of letting Claude call a calculator that uses `eval()` directly?
$ git commit -m "feat(tools): calculator + web_search tool use loop"
M5~3h
Rate limit + cost cap per chat_id
5 messages/min per chat. 200 messages/day per chat. Friendly throttle message when hit. Each chat capped at $0.20/day in API spend (rough token counting).
CHECK BEFORE MOVING ON:
- Why per-chat rate limits matter even for an internal bot — what's the abuse vector?
- What's a cheap way to estimate Claude input/output tokens without a tokenizer dep?
$ git commit -m "feat: per-chat rate limit + daily cost cap"
M6~5h
Eval suite — 30 golden conversations
Build a `golden.jsonl` of 30 hand-crafted prompts with `must_include` / `must_not_include` checks. CI runs the eval; merge blocked on regression.
CHECK BEFORE MOVING ON:
- Why eval BEFORE you optimize the prompt, not after?
- What's a good mix of happy-path / edge / adversarial in a 30-item eval set?
$ git commit -m "test: 30-item golden eval suite + CI gate"
M7~4h
Deploy to Fly.io with secrets + uptime probe
Multi-stage Dockerfile, fly.toml with health check on /health, Telegram webhook re-registered to the public URL. UptimeRobot pings /health every 5 min.
CHECK BEFORE MOVING ON:
- Why store the Anthropic key in `fly secrets set` instead of fly.toml env?
- What's a good action when /health flaps but uptime says 99.9%?
$ git commit -m "ops: fly.io deploy with secrets + uptime probe"

60-second demo storyboard

What you say in the recruiter screen when they ask "tell me about your latest project." Practice it out loud.

0-5s: 'I built an AI Telegram bot — conversation memory, tools, evals, the works.'
5-15s: Send it 3 messages in a Telegram clip — it remembers context from message 1 in message 3.
15-30s: 'Calculate 27 * 31 + sqrt(144).' — show tool_use roundtrip, the bot returns 849.
30-45s: Show the eval suite running in CI — 28/30 passing, the 2 failing are flagged for review.
45-55s: One architectural decision (e.g. 'I cap each chat at $0.20/day because…') in plain English.
55-60s: 'Repo + deployment URL. Would love your feedback on the tool-loop error handling.'

STAR talking points for behavioral round

STAR — PRODUCTION INCIDENT

Situation: bot started replying with 'I cannot help with that' to ~10% of valid requests. Task: figure out why. Action: added per-message logging of full Claude response, found the safety classifier was firing on a system-prompt phrase ('act as'). Result: rewrote the system message in instructional tone, false-refusal rate dropped from 10% to under 1%, eval suite caught the regression next iteration.

STAR — DESIGN TRADE-OFF

Situation: had to choose between trimming history by message count vs by token count. Task: pick the right one. Action: chose token count with a 4000-token budget. Reason: long messages from one user shouldn't push out 20 short messages from someone else. Result: more even cost per chat, no truncation surprises in high-volume threads.

STAR — EVALUATION DISCIPLINE

Situation: a prompt change felt better in manual testing but I wasn't sure. Task: prove it objectively. Action: ran both prompts through the 30-item golden eval, compared per-item scores. Result: the 'better' prompt actually regressed on 4 edge cases — caught before deploy. Lesson: evals catch what intuition misses.

Production references — how grown-up systems do this

Anthropic →

Anthropic's tool use docs are the canonical reference for the request/response loop and stop_reason handling.

Telegram →

Bot API docs — read the Webhook section twice, it's the source of 80% of first-time-bot bugs.

Vercel AI SDK →

Different language, same shape — Vercel's AI SDK documents the multi-turn + tool-use pattern in a way that maps cleanly to your Python code.

Self-review rubric (before you claim done)

Correctness

Bot remembers the last N user turns and demonstrably uses them.
Tool use loops until `stop_reason: end_turn`; partial tool_use never reaches the user.
Rate limiter triggers cleanly (visible 'slow down' message) at 6 messages/min.
Eval suite passes ≥27/30 on the latest commit; CI fails on regression.

Code quality

System message + tool schemas in a separate module — not inline in the handler.
Anthropic SDK calls wrapped with retries on 529 (overloaded) and rate-limit responses.
All env var access goes through a single config module with validation.
Per-chat metric counters (messages, tokens, cost) — observable in logs.

Testing

Golden eval suite committed; running it locally takes <2 minutes.
At least 2 unit tests for the tool-execution layer (calculator happy + bad-input).
Mock Claude responses in unit tests — no real API calls in CI.
Integration test exercises the full webhook → Claude → tool → reply roundtrip with a fake Telegram client.

Docs

README explains how to provision a bot via @BotFather and where each secret comes from.
Architecture diagram: Telegram → webhook → FastAPI → Claude + Redis → Telegram.
Three design decisions written up in plain English (history strategy, rate-limit math, deploy choice).
Eval suite README explains how to add a new golden item.

✱ AI code review

Get a senior-style review before you call it done

Push your finished work to GitHub, open a PR, paste the PR URL below. Claude reviews the diff against this project's rubric and replies with strengths, must-fix items, and one teachable principle.

Tick the rubric items honestly, write the README, push to GitHub, get the AI review above. Once it's clean, email support@learnpython.academy with the repo link — we feature the best ones on /success-stories.

Need Python first? Start Foundations →