Skip to main content
← CoursesAI Engineering with PythonModule 3 · RAG (Retrieval-Augmented Generation)Vector databaseswrite34 / 101
💬 Discuss🧪 Playground+100 XP
Task
📝 **Question:** **Write the function** \`recommend_vector_db(stack, vector_count, ops_burden_ok)\` that returns \`(db_name, reason)\`. Use this decision tree (in order): 1. If \`vector_count >= 10_000_000\` → \`("Qdrant or Pinecone", "10M+ vectors exceed pgvector's comfortable range")\` 2. Else if \`stack == "postgres"\` → \`("pgvector", f"Already on Postgres + {vector_count:,} vectors — one DB, less ops")\` 3. Else if \`ops_burden_ok\` → \`("Qdrant", "Self-hosted, Rust, excellent metadata filtering")\` 4. Else → \`("Pinecone", "Managed, easiest to operate; pay for the convenience")\` Then run it against four real startup shapes: \`\`\` stack=postgres n= 500,000 ops_ok=True -> pgvector (Already on Postgres + 500,000 vectors — one DB, less ops) stack=postgres n=50,000,000 ops_ok=True -> Qdrant or Pinecone (10M+ vectors exceed pgvector's comfortable range) stack=none n= 100,000 ops_ok=False -> Pinecone (Managed, easiest to operate; pay for the convenience) stack=none n= 100,000 ops_ok=True -> Qdrant (Self-hosted, Rust, excellent metadata filtering) \`\`\` The 500K-on-Postgres case is the **most common production answer** and the one juniors get wrong by reaching for Pinecone — operational simplicity (one DB, one backup, one connection pool) beats Pinecone's marginal performance edge at this scale. Migrate ONLY when you've measured pgvector hitting limits. 📋 Pick the right answer. 💡 **Hint:** Re-read the theory above if unsure.

Keep going

✏️ Write your code here
🐍
Loading Python...
First visit only — ~5-10s. Stays cached afterward.
📊 Result
Press Run to see result...
📣 Help someone learn PythonShare this lesson with a friend — the first 15 are free, no signup.Tweet

💬 Discussion

Be the first to ask a question or share a tip.
Sign in to join the discussion. Reading is free.
Loading discussion…