ML Engineer & Founder

Dave A'Hearne

I build production ML systems. 13 years of software engineering and engineering leadership, now focused on applied NLP, ML platform engineering, and distributed systems. The gap between notebook ML and production ML is real, and it matters.

01

Skills

Languages

Python Go C# SQL

Applied ML / NLP

spaCy HuggingFace Transformers BERT sentence embeddings cosine similarity OpenAI API Ollama RAG pipelines vector search sqlite-vec Stanza NER

Data Science

scikit-learn XGBoost Keras TensorFlow PyTorch pandas numpy statsmodels BERTopic

MLOps

MLflow FastAPI Docker AWS ECS AWS ECR GitHub Actions CI/CD pytest

Cloud & Architecture

AWS (ECS, ECR, S3, IAM) Azure microservices CQRS hexagonal architecture event-driven design distributed systems

Engineering Practice

TDD / BDD trunk-based development feature toggling contract testing incident management technical hiring mentoring
02

Experience

Jan 2026 – Present
Keenu.io
ML Engineer, Turing Innovation Catalyst

Paid engagement to build Keenu.io, an NLP-based IELTS assessment system. Designed and shipped two production systems from scratch.

Gradia — IELTS Writing Assessment API
  • Designed an NLP scoring pipeline using spaCy word vectors and cosine similarity across eight subscores: paragraph cohesion, sentence-to-document centroid similarity (min and max), lexical richness, vocabulary sophistication, spelling error count and category, and punctuation placement and type.
  • Integrated a HuggingFace BERT-based punctuation restoration model for grammatical analysis, precomputed and cached at startup to keep inference latency low under concurrent load.
  • Built a routed LLM client supporting OpenAI (GPT-4o-mini) and Ollama (Qwen 2.5 7B) as interchangeable backends, with versioned structured prompts iterated against a 100-sample evaluation dataset. Applied linear regression calibration, achieving 94% of marks within ± one band and 86% within ± half a band. Evaluated using mean absolute error (MAE) and quadratic weighted kappa (QWK).
  • Implemented concurrent subscore computation via ThreadPoolExecutor, running all eight NLP subscores in parallel per request.
  • Set up MLflow for experiment tracking and dataset versioning. Built a full CI/CD pipeline via GitHub Actions: running tests, building Docker images, pushing to AWS ECR, and deploying to AWS ECS. Configured all AWS infrastructure including IAM roles, ECR repositories, ECS task definitions, services, and load balancer health checks from scratch.
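As an illustration of the concurrent subscore pattern described above, here is a minimal standard-library sketch. It is not the Gradia code: the toy vectors stand in for spaCy sentence vectors, only three of the eight subscores are shown, and `lexical_richness` uses a plain type-token ratio as an assumed stand-in. All function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Element-wise mean of the sentence vectors, used as the document vector."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def centroid_similarity(sentence_vectors, reduce_fn):
    """Min or max sentence-to-document-centroid similarity, chosen by reduce_fn."""
    doc = centroid(sentence_vectors)
    return reduce_fn(cosine(v, doc) for v in sentence_vectors)


def lexical_richness(tokens):
    """Type-token ratio: an assumed stand-in for the real lexical richness subscore."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def score_essay(sentence_vectors, tokens):
    """Submit every subscore to a thread pool and collect the results by name."""
    tasks = {
        "centroid_sim_min": lambda: centroid_similarity(sentence_vectors, min),
        "centroid_sim_max": lambda: centroid_similarity(sentence_vectors, max),
        "lexical_richness": lambda: lexical_richness(tokens),
    }
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = {name: pool.submit(fn) for name, fn in tasks.items()}
        return {name: future.result() for name, future in futures.items()}


# Toy inputs: three 2-dimensional "sentence vectors" and a tokenised sentence.
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
tokens = ["the", "cat", "sat", "on", "the", "mat"]
print(score_essay(vectors, tokens))
```

Threads suit this shape of work when the subscores spend their time in native code or I/O; for pure-Python CPU-bound subscores a process pool may be the better fit.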
MarkVerify — Evaluation Scoring Platform
  • Built in Go: a platform for ingesting, storing, and human-scoring IELTS test submissions from real students, producing the ground-truth labels MLflow experiments ran against.
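The evaluation measures named for Gradia (share of marks within a band tolerance, MAE, and QWK) can be sketched against human-assigned bands as follows. This is an illustrative standard-library version, not the project's evaluation code. It assumes IELTS bands from 0 to 9 in half-band steps and predictions already rounded to the nearest half band; the function names and sample scores are made up.

```python
def within_band_rate(human, predicted, tolerance):
    """Fraction of predictions whose absolute error is at most `tolerance` bands."""
    hits = sum(1 for h, p in zip(human, predicted) if abs(h - p) <= tolerance + 1e-9)
    return hits / len(human)


def mean_absolute_error(human, predicted):
    """Average absolute difference between human and predicted bands."""
    return sum(abs(h - p) for h, p in zip(human, predicted)) / len(human)


def quadratic_weighted_kappa(human, predicted, step=0.5, min_band=0.0, max_band=9.0):
    """Cohen's kappa with quadratic weights over half-band categories."""
    n_cat = int(round((max_band - min_band) / step)) + 1

    def to_index(band):
        return int(round((band - min_band) / step))

    # Observed confusion matrix: rows are human bands, columns are predictions.
    observed = [[0] * n_cat for _ in range(n_cat)]
    for h, p in zip(human, predicted):
        observed[to_index(h)][to_index(p)] += 1

    n = len(human)
    hist_human = [sum(row) for row in observed]
    hist_pred = [sum(col) for col in zip(*observed)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_cat):
        for j in range(n_cat):
            weight = (i - j) ** 2 / (n_cat - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count if the two raters were independent.
            weighted_expected += weight * hist_human[i] * hist_pred[j] / n

    # Convention chosen for this sketch: no expected disagreement means kappa of 1.0.
    if weighted_expected == 0:
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


# Made-up scores for illustration only.
human_bands = [6.0, 6.5, 7.0, 5.5]
model_bands = [6.0, 7.0, 6.0, 5.5]
print(within_band_rate(human_bands, model_bands, 0.5))
print(within_band_rate(human_bands, model_bands, 1.0))
print(mean_absolute_error(human_bands, model_bands))
print(quadratic_weighted_kappa(human_bands, model_bands))
```

QWK penalises large disagreements more heavily than small ones, which is why it is often reported alongside MAE for ordinal scales such as bands.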
Sept 2023 – Dec 2024
BrightHR
Delivery Lead

Owned the BrightSafe product end to end, from ideation and prioritisation through delivery across five countries: Australia, New Zealand, Canada, Ireland, and the UK.

  • Hired and led a cross-functional team of 9 to 15 engineers, designers, and QA. Responsible for development plans, 1:1s, mentoring, and performance management.
  • Acted as the primary interface between engineering, product, and delivery stakeholders, shaping work into deliverable increments while balancing technical constraints and business priorities.
  • Delivered third-party identity server integrations, a rebuilt documents platform, the engineering blog, and back office tooling improvements.
  • Produced early-stage prototypes and architecture diagrams using microservices and CQRS patterns. Led live production incident response across all five operating regions.
May 2023 – Sept 2023
Schneider Electric
Contract Engineer

Contract role on the EcoIQ smart heating platform, a distributed .NET system with teams in France and China.

  • Collaborated with lead engineers to modernise shared .NET Standard libraries across teams.
  • Worked with embedded systems engineers to design, build, and test integrations with Panasonic heat pumps using .NET 6, applying TDD to ensure reliable communication between backend services and physical hardware.
Aug 2020 – May 2023
OpenMoney
Tech Lead

Built and led the investments engineering team from the ground up, working closely with product and stakeholders to shape technical direction and identify milestones.

  • Hired and technically mentored the team. Architected greenfield solutions, created technical prototypes, and established logging, alerting, and incident post-mortem processes.
  • Ran company-wide talks and 1:1 mentoring on TDD, BDD, trunk-based development, feature toggling, safe refactoring, and CI/CD.
  • Developed strategies for removing technical debt and ensuring quality delivery through integration, end-to-end, and contract testing.
Sept 2018 – Aug 2020
Raytheon
Senior Software Engineer
  • Worked across multiple teams refactoring monoliths into microservices, applying TDD and agile practices. Languages and tools included C#, Java, JavaScript, Python, Terraform, and AWS.
  • Collaborated with product owners to identify thin vertical slices of value, delivering through pairing, mobbing, and independent work.
  • Mentored junior and graduate engineers through code reviews, pairing, and leading department Campfire talks.
Apr 2016 – Sept 2018
Zen Internet
Systems Developer
  • Built and maintained internal and public-facing websites and APIs using C#, MVC, React, Node.js, and .NET Core, supporting order tracking, diagnostics, and network management systems.
  • Prototyped AWS Lambda, Azure, and .NET Core to inform technical direction. Refactored legacy code in an agile environment with rapid feedback cycles.
Jun 2013 – Apr 2016
Swinton Insurance
Junior C# Developer
  • Built features and fixed bugs using C#, JavaScript, VB.NET, jQuery, and SQL. Frameworks included MVC, MVVM, and Angular. Applied BDD and TDD using NUnit and SpecFlow.
Oct 2012 – Jun 2013
Parker Sandfords
Junior Developer

Early-career commercial development role.

03

Projects

Nightshift RAG / Biomedical NLP

Clinical code RAG pipeline built for NICE (National Institute for Health and Care Excellence), developed as the employer project for the Cambridge Data Science Career Accelerator. Given a free-text clinical research question, the system parses it into typed clinical entities, retrieves and ranks relevant codes across SNOMED CT, ICD-10, QOF, and NHS reference sets, and returns results with source attribution and confidence scoring.

  • Two-stage RAG pipeline: offline ingestion encoding clinical code descriptions into biomedical embeddings (SapBERT/BioBERT) stored in sqlite-vec, and an online query stage combining semantic vector search with TF-IDF hybrid retrieval, reranking, and LLM-assisted reasoning.
  • Stanza NER (bc5cdr and i2b2 biomedical models) decomposes free-text queries into typed entities (conditions, drugs, exclusions), driving structured retrieval across multiple code systems simultaneously.
  • FastAPI service with WebSocket streaming of pipeline progress so results appear incrementally as each stage completes.
  • MLflow evaluation tracking F1, precision, and recall against gold-standard question-to-code-set pairs, with an acceptance threshold of 0.70 across all three metrics and confidence scoring above 0.65.
  • Dockerised with automatic ingestion on first boot and Stanza model caching baked into the image layer.
Stanza NER SapBERT BioBERT sqlite-vec FastAPI WebSockets MLflow RAG SNOMED CT ICD-10
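For the gold-standard evaluation mentioned above, precision, recall, and F1 over a retrieved code set can be sketched like this. It is a minimal illustration under assumptions: codes are compared as plain strings, metrics are computed per question (how the project aggregates across questions is not stated), and the code identifiers are placeholders rather than real SNOMED CT or ICD-10 codes. Only the 0.70 threshold comes from the description.

```python
def code_set_metrics(predicted_codes, gold_codes):
    """Set-based precision, recall, and F1 for one question's retrieved codes."""
    predicted, gold = set(predicted_codes), set(gold_codes)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def meets_acceptance(metrics, threshold=0.70):
    """True only when every metric reaches the acceptance threshold."""
    return all(value >= threshold for value in metrics.values())


# Placeholder identifiers, not real clinical codes.
gold = {"CODE_A", "CODE_B", "CODE_C", "CODE_E"}
retrieved = {"CODE_A", "CODE_B", "CODE_C", "CODE_D"}

metrics = code_set_metrics(retrieved, gold)
print(metrics, meets_acceptance(metrics))
```

Requiring all three metrics to clear the threshold guards against a pipeline that reaches a high precision by returning very few codes, or a high recall by returning too many.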
Covet Founder / Platform

Two-sided senior tech recruitment platform for the UK market, built and run as a founder project. Candidate-role matching applies hard constraint filtering first, then scores the remaining matches with the BAAI/bge-large-en-v1.5 sentence transformer. A Scout service pulls from ATS adapters across Greenhouse, Lever, Workable, and Ashby. Go chosen deliberately for compiled binaries, minimal Alpine container images, and first-class concurrency via channels wired into WebSocket connections.

Go sentence-transformers BAAI/bge-large-en-v1.5 WebSockets Greenhouse Lever Workable Ashby Docker
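The filter-then-score matching idea can be sketched in a few lines. This is an illustration only, not Covet's implementation: the constraint fields (`min_salary`, `remote_only`, `max_salary`, `remote`) are hypothetical, and the two-dimensional vectors stand in for real BAAI/bge-large-en-v1.5 embeddings.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Candidate:
    name: str
    min_salary: int      # hypothetical hard constraint
    remote_only: bool    # hypothetical hard constraint
    embedding: list      # toy stand-in for a sentence embedding


@dataclass
class Role:
    title: str
    max_salary: int
    remote: bool
    embedding: list


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def meets_hard_constraints(candidate, role):
    """Reject pairs that can never match, before any embedding work is done."""
    if candidate.min_salary > role.max_salary:
        return False
    if candidate.remote_only and not role.remote:
        return False
    return True


def rank_candidates(role, candidates):
    """Filter on hard constraints, then rank survivors by embedding similarity."""
    eligible = [c for c in candidates if meets_hard_constraints(c, role)]
    scored = [(c.name, cosine(c.embedding, role.embedding)) for c in eligible]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


role = Role("Staff Engineer", max_salary=100, remote=True, embedding=[1.0, 0.0])
pool = [
    Candidate("cand_a", min_salary=200, remote_only=False, embedding=[1.0, 0.0]),
    Candidate("cand_b", min_salary=80, remote_only=True, embedding=[0.6, 0.8]),
    Candidate("cand_c", min_salary=90, remote_only=False, embedding=[0.8, 0.6]),
]
print(rank_candidates(role, pool))
```

Filtering first keeps the similarity score from ranking a strong semantic match that fails a non-negotiable requirement, and it avoids scoring pairs that would be discarded anyway.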
04

Education

Sept 2024 – Present
BSc (Hons) Mathematics
Open University
Oct 2025 – Present
Cambridge Data Science Career Accelerator
University of Cambridge, PACE
Sept 2013 – Jun 2014
BSc (Hons) Computing — 1st Class Honours
In progress at time of first commercial role