Expense tracker CLI with CSV import + monthly reports
Tiny but interview-defensible CRUD project that teaches schema design, idempotent CSV import, deterministic rules engines, money math with Decimal, and PDF generation — all skills senior engineers reach for daily.
Resume bullet (when finished)
“Built a SQLite-backed expense tracker with CSV bank-statement import, deterministic category-rules engine, monthly reports as PDF, and 87% pytest coverage.”
Locked tech stack
No "choose your language" — analysis paralysis kills completion. Follow the stack to the letter on your first build.
Milestones (5 · ~13h)
- M1~2h
SQLite schema + first migration
Tables `accounts`, `transactions`, `rules`. Migration via plain SQL files (no ORM).
CHECK BEFORE MOVING ON:
- Why SQLite, not Postgres, for this project?
- Why migrations as plain SQL rather than Alembic?
$ git commit -m "feat(db): initial SQLite schema + migrations" - M2~3h
CSV import — bank format
`expenses import statement.csv --account checking` is idempotent (re-running doesn't duplicate). Uses Decimal for amounts.
CHECK BEFORE MOVING ON:
- Why never store money as float?
- How do you make CSV import idempotent without a deduplication ID column?
$ git commit -m "feat(import): idempotent CSV ingest with Decimal amounts" - M3~2h
Rules-based categorisation
`rules add 'STARBUCKS' coffee`. On import, every uncategorised row matching the substring gets the category. First match wins, deterministic order.
CHECK BEFORE MOVING ON:
- Why deterministic order matters for a categorisation engine.
- When would you reach for ML instead of rules — and why don't you here?
$ git commit -m "feat(rules): substring-match categorisation engine" - M4~4h
Monthly PDF report
`expenses report --month 2026-05 --pdf out.pdf` generates a PDF with totals by category, top-5 expenses, and a sparkline.
CHECK BEFORE MOVING ON:
- Why ReportLab instead of WeasyPrint or wkhtmltopdf?
- What's the right way to test PDF generation without comparing bytes?
$ git commit -m "feat(report): monthly PDF via ReportLab" - M5~2h
87% test coverage + CI
pytest + coverage gate at 85% in GitHub Actions. Tests use an in-memory SQLite for speed.
CHECK BEFORE MOVING ON:
- Why a coverage *gate* in CI, not just a *report*?
- What's wrong with chasing 100% coverage?
$ git commit -m "test: 87% coverage + CI gate"
60-second demo storyboard
What you say in the recruiter screen when they ask "tell me about your latest project." Practice it out loud.
- 0-5s: 'CLI expense tracker — CSV in, PDF report out, deterministic rules engine.'
- 5-25s: live demo — import a CSV, add a rule, generate the monthly PDF.
- 25-45s: show the rules engine code + 'why first-match-wins matters'.
- 45-60s: pytest --cov; 87% in 1.4s.
STAR talking points for behavioral round
STAR — CORRECTNESS
Situation: first version stored amounts as float — `0.1 + 0.2 != 0.3` showed up in tests. Task: fix money math. Action: switched to Decimal everywhere, added a custom SQLite adapter, and made the schema TEXT for amounts. Result: arithmetic exact to the cent and tests went green permanently.
STAR — IDEMPOTENCY
Situation: re-importing a CSV duplicated rows. Task: make import safe to re-run. Action: introduced a synthetic dedup key — sha256 of (account, date, amount, raw_description) — added a unique constraint. Result: rerunning yesterday's import is a no-op, which is exactly what users expect.
Production references — how grown-up systems do this
Self-review rubric (before you claim done)
Correctness
- Money math via Decimal, not float.
- CSV re-import is a no-op.
- Reports total to the cent.
- Rules apply deterministically.
Code quality
- DB code separated from CLI layer.
- Migrations live in a versioned directory.
- Decimal serialisation centralised, not scattered.
Testing
- ≥85% line coverage on the `expenses/` package.
- Migration tests verify forward + reverse.
- PDF tests verify structure, not byte-exact output.
Docs
- README explains the dedup-hash design.
- Sample CSV + sample PDF committed for grading.
- One-paragraph rationale: 'why SQLite for this'.
✱ AI code review
Get a senior-style review before you call it done
Push your finished work to GitHub, open a PR, paste the PR URL below. Claude reviews the diff against this project's rubric and replies with strengths, must-fix items, and one teachable principle.
Tick the rubric items honestly, write the README, push to GitHub, get the AI review above. Once it's clean, email support@learnpython.academy with the repo link — we feature the best ones on /success-stories.
Need Python first? Start Foundations →