Task
📝 **Question:** **Write the function** `chunk_count(total_tokens, chunk_size, overlap)` that returns the number of chunks a document produces with sliding-window chunking.

Spec: every chunk is `chunk_size` tokens, except the last, which may be shorter. Each chunk overlaps the previous one by `overlap` tokens, so the *step* (new tokens contributed per chunk) is `chunk_size - overlap`. The first chunk starts at position 0, and chunks keep being added until one reaches the end of the document.

Formula:

```
step = chunk_size - overlap
count = ceil(max(0, total_tokens - overlap) / step)
```

(Edge case: if `total_tokens <= chunk_size`, that's still 1 chunk, even when the formula alone would give 0 because `total_tokens <= overlap`.)

Then print chunk counts for these three configurations (`chunk_size`/`overlap`) of a 50,000-token document. Expected output:

```
500/50: 111
800/80: 70
1500/200: 39
```

Going from 111 chunks to 39 is nearly a 3× drop in the number of chunks to embed. That's the embedding cost trade-off senior devs make explicit before indexing 1M docs.

💡 **Hint:** Re-read the theory above if unsure.
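One possible solution is sketched below, following the formula and edge case in the task. Two details are assumptions rather than part of the spec: the `ValueError` raised when `overlap >= chunk_size` (the step would be zero or negative), and the use of integer ceiling division in place of `math.ceil` to avoid floating-point rounding. An empty document returns 1 only because the stated edge case is applied literally.

```python
def chunk_count(total_tokens: int, chunk_size: int, overlap: int) -> int:
    """Number of sliding-window chunks for a document of total_tokens tokens."""
    step = chunk_size - overlap
    if step <= 0:
        # Assumption: the task does not say what to do here; a non-positive
        # step would never advance through the document, so reject it.
        raise ValueError("overlap must be smaller than chunk_size")

    # Edge case from the task: a document that fits in one chunk is 1 chunk.
    if total_tokens <= chunk_size:
        return 1

    remaining = max(0, total_tokens - overlap)
    # Ceiling division on integers: equivalent to ceil(remaining / step).
    return -(-remaining // step)


# The three configurations from the task, for a 50,000-token document.
for chunk_size, overlap in [(500, 50), (800, 80), (1500, 200)]:
    print(f"{chunk_size}/{overlap}: {chunk_count(50_000, chunk_size, overlap)}")
```

As a check on the first configuration: the step is 450 and `(50000 - 50) / 450` is exactly 111, so 111 chunks cover the document with no partial chunk left over.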
