Personal Budget: ML-Powered Finance Analytics
Most personal finance tools ask you to categorise every transaction by hand. The premise is that you, the human, know what each merchant string means and where it belongs in your budget. The reality is that you do not, your bank does not give you consistent merchant strings anyway, and after three months of dutiful tagging the whole system goes cold. I wanted the opposite: import the raw bank data, let the system discover the structure, and surface insights I could act on.
The problem
Bank CSV exports are clean enough to import and messy enough to defeat analysis. Merchants repeat under different labels because a single coffee shop turns into "MERCHANT 472921 ABC" on one statement and "ABC Coffee Pty Ltd" on another. Internal transfers between your own accounts look like spending. Subscriptions hide in plain sight because they look like any other recurring debit. And the interesting questions, the ones a spreadsheet pivot will not answer, are about behaviour over time: am I creeping up on lifestyle inflation, which bills are actually recurring, what would the snowball method cost me versus the avalanche method given my real cash-flow shape?
I also wanted the whole platform local and generic. No hardcoded categories. No assumptions about my account layout. Anyone should be able to clone the repo, import their own data, and have the platform learn the shape of their financial life rather than mine.
The approach
Three layers, with the analytics doing the heavy lifting that a typical SaaS finance tool offloads to the user.
Ingestion. Bank CSV import (manual upload or semi-automated export via Playwright on a headless server). Transactions land in SQLite via FastAPI. The schema is opinionated about what a transaction is and agnostic about what it means.
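As a rough illustration of that split, the sketch below stores each transaction with fixed, required fields (account, date, signed amount, raw description) and leaves `category` empty for the analytics layer to fill in later. It uses only Python's standard library and skips the FastAPI layer entirely. The table layout, the `import_csv` helper, and the `Date`/`Description`/`Amount` column names are assumptions made for the example, not the platform's actual schema.

```python
import csv
import hashlib
import io
import sqlite3
from decimal import Decimal

# Hypothetical schema: strict about the shape of a transaction,
# silent about its meaning (category starts out NULL).
SCHEMA = """
CREATE TABLE IF NOT EXISTS transactions (
    id           INTEGER PRIMARY KEY,
    account      TEXT    NOT NULL,
    posted_on    TEXT    NOT NULL,   -- date string as exported by the bank
    amount_cents INTEGER NOT NULL,   -- signed: negative means money out
    description  TEXT    NOT NULL,   -- raw merchant string, untouched
    category     TEXT,               -- filled in later by the analytics layer
    row_hash     TEXT    NOT NULL UNIQUE
);
"""


def import_csv(conn, account, csv_text):
    """Insert rows from one bank export and return how many were new."""
    conn.executescript(SCHEMA)
    before = conn.total_changes
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumes plain decimal amounts such as "-4.50" (no currency symbols).
        cents = int((Decimal(row["Amount"]) * 100).to_integral_value())
        key = "|".join([account, row["Date"], str(cents), row["Description"]])
        row_hash = hashlib.sha256(key.encode("utf-8")).hexdigest()
        # The UNIQUE hash makes re-importing an overlapping export a no-op.
        conn.execute(
            "INSERT OR IGNORE INTO transactions "
            "(account, posted_on, amount_cents, description, row_hash) "
            "VALUES (?, ?, ?, ?, ?)",
            (account, row["Date"], cents, row["Description"], row_hash),
        )
    conn.commit()
    return conn.total_changes - before


conn = sqlite3.connect(":memory:")
export = "Date,Description,Amount\n2024-03-01,ABC Coffee Pty Ltd,-4.50\n"
print(import_csv(conn, "everyday", export))  # count of newly stored rows
```

Hashing the row makes repeated imports harmless, at the cost of collapsing two genuinely identical same-day transactions. A real importer would probably need a tie-breaker such as a running balance or a bank reference number.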
Analytics engine. Python pipeline using pandas, scikit-learn, and statsmodels.
- Auto-categorisation with TF-IDF on transaction text plus kNN, learning from bank labels and user corrections. The system improves as you teach it, without retraining from scratch each time.
- Subscription detection through periodicity analysis across accounts.
- Transfer pairing by cross-account matching for internal movements, so the household does not register a $2,000 self-transfer as a $2,000 spend.
- Anomaly detection for unusual amounts and spending velocity spikes that deserve a second look.
- Behavioural scoring for impulse, lifestyle creep, discipline, and pay-cycle spending shapes. Abstract patterns become concrete numbers you can react to.
- Debt modelling for snowball versus avalanche, with what-if projections that show the trade-off in months and dollars rather than in opinion.
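The debt comparison in the last bullet comes down to a month-by-month simulation. The following is a minimal sketch, assuming monthly compounding, fixed minimum payments, and a fixed total monthly budget. The `Debt` class and `simulate` function are hypothetical names, and the balances and rates in the usage lines are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Debt:
    name: str
    balance: float   # current amount owed
    apr: float       # annual rate as a fraction, e.g. 0.22 for 22%
    minimum: float   # minimum monthly payment


def simulate(debts, monthly_budget, strategy):
    """Return (months_to_payoff, total_interest) for one payoff strategy."""
    if strategy not in ("snowball", "avalanche"):
        raise ValueError(f"unknown strategy: {strategy}")
    balances = {d.name: d.balance for d in debts}
    months, interest_paid = 0, 0.0

    while any(b > 0.005 for b in balances.values()):
        months += 1
        if months > 1200:
            raise RuntimeError("budget too small to ever clear the debts")

        # 1. Accrue one month of interest on every open balance.
        for d in debts:
            interest = balances[d.name] * d.apr / 12
            balances[d.name] += interest
            interest_paid += interest

        # 2. Pay the minimum on every debt.
        remaining = monthly_budget
        for d in debts:
            pay = min(d.minimum, balances[d.name], remaining)
            balances[d.name] -= pay
            remaining -= pay

        # 3. Send whatever is left to the strategy's target debts in order.
        if strategy == "snowball":
            order = sorted(debts, key=lambda d: balances[d.name])  # smallest balance first
        else:
            order = sorted(debts, key=lambda d: -d.apr)            # highest rate first
        for d in order:
            pay = min(remaining, balances[d.name])
            balances[d.name] -= pay
            remaining -= pay

    return months, round(interest_paid, 2)


debts = [
    Debt("card", balance=5000.0, apr=0.22, minimum=100.0),
    Debt("loan", balance=1500.0, apr=0.06, minimum=50.0),
]
for strategy in ("snowball", "avalanche"):
    months, interest = simulate(debts, monthly_budget=500.0, strategy=strategy)
    print(f"{strategy}: {months} months, ${interest:.2f} interest")
```

Running both strategies over the same debts and budget gives the trade-off directly in months and dollars. A what-if projection is the same call with a different `monthly_budget`.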
Presentation. React with TypeScript and Recharts dashboards. Financial health score (0 to 100), budget adherence, debt dashboard, category breakdowns. Discord webhooks for weekly reports and import summaries, so the insight reaches me without me opening the app.

The user-correction loop is the part that matters. Every time I move a transaction to a different category, that correction becomes a labelled example the model learns from. Within a few weeks the platform's categorisation matches the way I think about money rather than the way the bank labels it.
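A correction loop of this kind can be sketched with scikit-learn's `TfidfVectorizer` and `KNeighborsClassifier`. Everything specific here is an assumption for illustration: the `Categoriser` class, character n-grams as features, a single nearest neighbour, cosine distance, and the rule that the latest label for a given string wins. kNN is a lazy learner, so teaching it amounts to adding one more labelled example. For simplicity this sketch also re-fits the small TF-IDF vocabulary on the next prediction, which may differ from how the platform avoids retraining.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline


class Categoriser:
    """Hypothetical TF-IDF + kNN categoriser fed by labels and corrections."""

    def __init__(self):
        self.texts = []
        self.labels = []
        self.model = None

    def teach(self, text, label):
        """Record a bank label or a user correction for one transaction string."""
        if text in self.texts:
            # A correction overrides whatever label this string had before.
            self.labels[self.texts.index(text)] = label
        else:
            self.texts.append(text)
            self.labels.append(label)
        self.model = None  # rebuild lazily on the next prediction

    def predict(self, text):
        """Return the label of the most similar known transaction string."""
        if self.model is None:
            self.model = make_pipeline(
                # Character n-grams tolerate noisy merchant strings better
                # than whole-word tokens would.
                TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
                KNeighborsClassifier(n_neighbors=1, metric="cosine"),
            )
            self.model.fit(self.texts, self.labels)
        return self.model.predict([text])[0]


cat = Categoriser()
cat.teach("ABC Coffee Pty Ltd", "Dining")            # label from the bank
cat.teach("WOOLWORTHS 1234 SYDNEY", "Groceries")
cat.teach("NETFLIX.COM", "Entertainment")
print(cat.predict("ABC COFFEE PTY LTD SYDNEY"))

cat.teach("ABC Coffee Pty Ltd", "Coffee")            # user correction
print(cat.predict("ABC COFFEE PTY LTD SYDNEY"))
```

With one neighbour, a single correction immediately changes the answer for every similar merchant string. That is the behaviour the loop relies on, though it also means one mistaken correction propagates just as quickly.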

Evidence
- Categories emerge from the data, not from a configuration file
- Subscriptions and transfers detected algorithmically rather than declared up front
- Behavioural metrics (impulse score, lifestyle creep, pay-cycle shape) make patterns visible that a category-by-category view would miss
- Debt strategies comparable side by side with payoff projections, in months and dollars
- Local-first by design: the data never leaves the machine I run it on
- Acts as the data backbone for the rest of the personal stack: tax-assist snapshots its database for receipt matching, and the home agents pull live balances and transactions for daily nudges
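To make the "detected algorithmically" point in the list above concrete, here is one simple way such detectors could work, using only the standard library. It is a sketch under stated assumptions, not the platform's method: `pair_transfers` greedily matches a debit with an equal and opposite credit in a different account within a couple of days, and `estimate_period` treats a series of charges as recurring when the gaps between them are nearly constant. Both function names, the dict layout, and the thresholds are hypothetical.

```python
from datetime import date
from statistics import median


def pair_transfers(txns, max_days=2):
    """Return (debit_id, credit_id) pairs that look like internal transfers."""
    credits = [t for t in txns if t["amount_cents"] > 0]
    debits = sorted((t for t in txns if t["amount_cents"] < 0),
                    key=lambda t: t["date"])
    used, pairs = set(), []
    for debit in debits:
        for credit in credits:
            if (credit["id"] not in used
                    and credit["account"] != debit["account"]
                    and credit["amount_cents"] == -debit["amount_cents"]
                    and abs((credit["date"] - debit["date"]).days) <= max_days):
                used.add(credit["id"])   # each credit can settle one debit only
                pairs.append((debit["id"], credit["id"]))
                break
    return pairs


def estimate_period(dates, tolerance_days=3, min_occurrences=3):
    """Return the typical gap in days if charges recur regularly, else None."""
    if len(dates) < min_occurrences:
        return None
    ordered = sorted(dates)
    gaps = [(later - earlier).days
            for earlier, later in zip(ordered, ordered[1:])]
    if max(gaps) - min(gaps) > tolerance_days:
        return None
    return median(gaps)


txns = [
    {"id": 1, "account": "everyday", "date": date(2024, 3, 1), "amount_cents": -200000},
    {"id": 2, "account": "savings", "date": date(2024, 3, 2), "amount_cents": 200000},
    {"id": 3, "account": "everyday", "date": date(2024, 3, 3), "amount_cents": -450},
]
print(pair_transfers(txns))
print(estimate_period([date(2024, 1, 15), date(2024, 2, 15), date(2024, 3, 15)]))
```

In practice the dates passed to `estimate_period` would first be grouped by merchant and amount, so that the periodicity check runs per candidate subscription rather than over all debits at once.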
The platform demonstrates the full insight-consulting loop I bring to enterprise work: ingest messy source data, model it rigorously, visualise for decision-makers, close the loop with feedback. Same shape, different scale, no SaaS dependency.