ML Engineer & Founder
Dave A'Hearne
I build production ML systems. 13 years of software engineering and engineering leadership, now focused on applied NLP, ML platform engineering, and distributed systems. The gap between notebook ML and production ML is real, and it matters.
Skills
Languages
Applied ML / NLP
Data Science
MLOps
Cloud & Architecture
Engineering Practice
Experience
Paid engagement to build Keenu.io, an NLP-based IELTS assessment system. Designed and shipped two production systems from scratch.
- Designed an NLP scoring pipeline using spaCy word vectors and cosine similarity across eight subscores: paragraph cohesion, sentence-to-document centroid similarity (min and max), lexical richness, vocabulary sophistication, spelling error count and category, and punctuation placement and type.
- Integrated a HuggingFace BERT-based punctuation restoration model for grammatical analysis, precomputed and cached at startup to keep inference latency low under concurrent load.
- Built a routed LLM client supporting OpenAI (GPT-4o-mini) and Ollama (Qwen 2.5 7B) as interchangeable backends, with versioned structured prompts iterated against a 100-sample evaluation dataset. Applied linear regression calibration, placing 94% of marks within ± one band and 86% within ± half a band of the reference score. Evaluated using mean absolute error (MAE) and quadratic weighted kappa (QWK).
- Implemented concurrent subscore computation via ThreadPoolExecutor, running all eight NLP subscores in parallel per request.
- Set up MLflow for experiment tracking and dataset versioning. Built a full CI/CD pipeline via GitHub Actions: running tests, building Docker images, pushing to AWS ECR, and deploying to AWS ECS. Configured all AWS infrastructure including IAM roles, ECR repositories, ECS task definitions, services, and load balancer health checks from scratch.
- Built a platform in Go for ingesting, storing, and human-scoring IELTS test submissions from real students, producing the ground-truth labels that the MLflow experiments ran against.
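As an illustration of the concurrent subscore computation described above, here is a minimal sketch using only the Python standard library. It is not the Keenu.io implementation: the function names, the three subscores shown, and the toy two-dimensional sentence vectors are all assumptions made for the example, and in the real pipeline the vectors would come from spaCy.

```python
from concurrent.futures import ThreadPoolExecutor
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Element-wise mean of a non-empty list of vectors (the document centroid)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def centroid_similarity_min(sentence_vectors):
    """Similarity of the sentence least similar to the document centroid."""
    c = centroid(sentence_vectors)
    return min(cosine(v, c) for v in sentence_vectors)


def centroid_similarity_max(sentence_vectors):
    """Similarity of the sentence most similar to the document centroid."""
    c = centroid(sentence_vectors)
    return max(cosine(v, c) for v in sentence_vectors)


def lexical_richness(tokens):
    """Type-token ratio, used here as a stand-in richness measure (an assumption)."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def score_essay(tokens, sentence_vectors):
    """Run every subscore in its own worker thread and collect results by name."""
    # Hypothetical subscore registry: name -> (function, argument).
    tasks = {
        "lexical_richness": (lexical_richness, tokens),
        "centroid_min": (centroid_similarity_min, sentence_vectors),
        "centroid_max": (centroid_similarity_max, sentence_vectors),
    }
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = {name: pool.submit(fn, arg) for name, (fn, arg) in tasks.items()}
        # result() blocks until that subscore finishes and re-raises any error.
        return {name: future.result() for name, future in futures.items()}


if __name__ == "__main__":
    print(score_essay(["the", "cat", "the", "dog"], [[1.0, 0.0], [0.0, 1.0]]))
```

Threads suit this pattern when the subscores spend most of their time in native code or I/O (model inference, spell-check lookups); for pure-Python CPU work the interpreter lock would limit the gain.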
Owned the BrightSafe product end to end, from ideation and prioritisation through delivery across five countries: Australia, New Zealand, Canada, Ireland, and the UK.
- Hired and led a cross-functional team of 9 to 15 engineers, designers, and QA. Responsible for development plans, 1:1s, mentoring, and performance management.
- Acted as the primary interface between engineering, product, and delivery stakeholders, shaping work into deliverable increments while balancing technical constraints and business priorities.
- Delivered third-party identity server integrations, a rebuilt documents platform, the engineering blog, and back office tooling improvements.
- Produced early-stage prototypes and architecture diagrams using microservices and CQRS patterns. Led live production incident response across all five operating regions.
Contract role on the EcoIQ smart heating platform, a distributed .NET system with teams in France and China.
- Collaborated with lead engineers to modernise shared .NET Standard libraries across teams.
- Worked with embedded systems engineers to design, build, and test integrations with Panasonic heat pumps using .NET 6, applying TDD to ensure reliable communication between backend services and physical hardware.
Built and led the investments engineering team from the ground up, working closely with product and stakeholders to shape technical direction and identify milestones.
- Hired and technically mentored the team. Architected greenfield solutions, created technical prototypes, and established logging, alerting, and incident post-mortem processes.
- Ran company-wide talks and 1:1 mentoring on TDD, BDD, trunk-based development, feature toggling, safe refactoring, and CI/CD.
- Developed strategies for removing technical debt and ensuring quality delivery through integration, end-to-end, and contract testing.
- Worked across multiple teams refactoring monoliths into microservices, applying TDD and agile practices. Languages and tools included C#, Java, JavaScript, Python, Terraform, and AWS.
- Collaborated with product owners to identify thin vertical slices of value, delivering through pairing, mobbing, and independent work.
- Mentored junior and graduate engineers through code reviews, pairing, and leading department Campfire talks.
- Built and maintained internal and public-facing websites and APIs using C#, MVC, React, Node.js, and .NET Core, supporting order tracking, diagnostics, and network management systems.
- Built prototypes on AWS Lambda, Azure, and .NET Core to inform technical direction. Refactored legacy code in an agile environment with rapid feedback cycles.
- Built features and fixed bugs using C#, JavaScript, VB.NET, jQuery, and SQL. Frameworks included MVC, MVVM, and Angular. Applied BDD and TDD using NUnit and SpecFlow.
Early-career commercial development role.
Projects
Clinical code RAG pipeline built for NICE (National Institute for Health and Care Excellence), developed as the employer project for the Cambridge Data Science Career Accelerator. Given a free-text clinical research question, the system parses it into typed clinical entities, retrieves and ranks relevant codes across SNOMED CT, ICD-10, QOF, and NHS reference sets, and returns results with source attribution and confidence scoring.
- Two-stage RAG pipeline: offline ingestion encoding clinical code descriptions into biomedical embeddings (SapBERT/BioBERT) stored in sqlite-vec, and an online query stage combining semantic vector search with TF-IDF hybrid retrieval, reranking, and LLM-assisted reasoning.
- Stanza NER (bc5cdr and i2b2 biomedical models) decomposes free-text queries into typed entities (conditions, drugs, exclusions), driving structured retrieval across multiple code systems simultaneously.
- FastAPI service with WebSocket streaming of pipeline progress so results appear incrementally as each stage completes.
- MLflow evaluation tracking F1, precision, and recall against gold-standard question-to-code-set pairs, with an acceptance threshold of 0.70 across all three metrics and confidence scoring above 0.65.
- Dockerised with automatic ingestion on first boot and Stanza model caching baked into the image layer.
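The evaluation gate described above can be sketched as a set comparison between retrieved codes and a gold-standard code set. This is an assumed shape rather than the project's code: the function names and the example code identifiers are hypothetical, and the sketch treats the 0.70 threshold as inclusive.

```python
def precision_recall_f1(predicted, gold):
    """Set-based precision, recall, and F1 for one question's retrieved codes."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def passes_acceptance(predicted, gold, threshold=0.70):
    """True when precision, recall, and F1 all reach the threshold (assumed inclusive)."""
    return all(metric >= threshold for metric in precision_recall_f1(predicted, gold))


if __name__ == "__main__":
    # Placeholder identifiers, not real SNOMED CT or ICD-10 codes.
    gold = {"CODE-A", "CODE-B", "CODE-C", "CODE-E"}
    predicted = {"CODE-A", "CODE-B", "CODE-C", "CODE-D"}
    print(precision_recall_f1(predicted, gold))
    print(passes_acceptance(predicted, gold))
```

In a tracked run, the three values could be recorded per question or averaged across the gold set before being compared with the threshold; which aggregation the project used is not stated here.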
Two-sided senior tech recruitment platform for the UK market, built and run as a founder project. Candidate-role matching applies hard-constraint filtering first, then scores the remaining pairs with the BAAI/bge-large-en-v1.5 sentence transformer. A Scout service pulls from ATS adapters across Greenhouse, Lever, Workable, and Ashby. Go was chosen deliberately for compiled binaries, minimal Alpine container images, and first-class concurrency via channels wired into WebSocket connections.
Education
Links
Curriculum Vitae — 2026