Case Study
Codeword Pipeline
A 6-stage, real-time ML pipeline that monitors live audio, transcribes speech, classifies content, and extracts structured data, all running unattended in production.
Python
scikit-learn
TF-IDF
Vosk
FFmpeg
500+ Identified
0 Missed
6 False Positives
6/6 Self-Caught
0 Failures
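To put the figures above in classifier terms, here is a rough sketch of the precision and recall they imply. It rests on assumptions the page does not confirm: "500+ Identified" is treated as exactly 500 correct identifications (a lower bound), "Missed" as false negatives, and "False Positives" as items flagged in error. The function name `precision_recall` is illustrative only, not part of the pipeline.

```python
def precision_recall(true_positives: int, false_positives: int,
                     false_negatives: int) -> tuple[float, float]:
    """Return (precision, recall) computed from raw counts."""
    # Precision: of everything flagged, how much was correct.
    precision = true_positives / (true_positives + false_positives)
    # Recall: of everything that should have been flagged, how much was found.
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall


# Assumption: "500+" is taken as exactly 500 true positives (a lower bound).
precision, recall = precision_recall(
    true_positives=500, false_positives=6, false_negatives=0
)
print(f"precision >= {precision:.3f}, recall = {recall:.3f}")
```

Under those assumptions, precision comes out at roughly 98.8% (500 / 506) and recall at 100%. A higher true count behind the "500+" would only push precision up.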
The Build Story
The tension between training confidence and production accuracy, and why the metrics you optimise for during development aren't always the metrics that matter in production.
Full Case Study
The complete case study is in progress. It will cover the 6-stage architecture, classifier design decisions, the training-vs-production tension in detail, and what I'd do differently.
In the meantime, the carousel above covers the core story: why the system I almost didn't trust turned out to be effectively perfect.