Machine Learning & Artificial Intelligence (ML/AI) Software Engineer
Dave A'Hearne
I build production ML systems, backed by 13 years of software engineering and engineering leadership. Now focused on applied NLP, ML platform engineering, and distributed systems. I build complete systems, not just models.
Skills
Languages
Applied ML / NLP
Data Science
MLOps
Cloud & Architecture
Engineering Practice
Experience
Paid engagement to build Keenu.io, an NLP-based IELTS assessment system. Designed and shipped two production systems from scratch.
- Designed an NLP scoring pipeline using spaCy word vectors and cosine similarity across eight subscores: paragraph cohesion, sentence-to-document centroid similarity (min and max), lexical richness, vocabulary sophistication, spelling error count and category, and punctuation placement and type.
- Integrated a HuggingFace BERT-based punctuation restoration model for grammatical analysis, precomputed and cached at startup to keep inference latency low under concurrent load.
- Built a routed LLM client supporting OpenAI (GPT-4o-mini) and Ollama (Qwen 2.5 7B) as interchangeable backends. Applied linear regression calibration achieving 94% of marks within ±1 band and 86% within ±0.5 band. Evaluated using MAE and QWK.
- Implemented concurrent subscore computation via ThreadPoolExecutor, running all eight NLP subscores in parallel per request.
- Built a full CI/CD pipeline via GitHub Actions: running tests, building Docker images, pushing to AWS ECR, and deploying to AWS ECS. Configured all AWS infrastructure including IAM roles, ECR repositories, ECS task definitions, and load balancer health checks from scratch.
- Built in Go: a platform for ingesting, storing, and human-scoring IELTS test submissions from real students, producing the ground-truth labels MLflow experiments ran against.
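A minimal sketch of the concurrent subscore pattern described above, not the production code. It assumes sentence vectors are already available as plain lists of floats (in the real pipeline they come from spaCy), shows only three of the eight subscores, and uses a type-token ratio as a stand-in for lexical richness. All function names are hypothetical.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Element-wise mean of the sentence vectors (the document centroid)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def min_centroid_similarity(sentence_vectors):
    c = centroid(sentence_vectors)
    return min(cosine(v, c) for v in sentence_vectors)


def max_centroid_similarity(sentence_vectors):
    c = centroid(sentence_vectors)
    return max(cosine(v, c) for v in sentence_vectors)


def lexical_richness(tokens):
    # Assumption: type-token ratio stands in for the real richness measure.
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def score_essay(sentence_vectors, tokens):
    """Run each subscore in its own worker thread and collect the results."""
    tasks = {
        "centroid_min": (min_centroid_similarity, sentence_vectors),
        "centroid_max": (max_centroid_similarity, sentence_vectors),
        "lexical_richness": (lexical_richness, tokens),
    }
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = {name: pool.submit(fn, arg) for name, (fn, arg) in tasks.items()}
        return {name: future.result() for name, future in futures.items()}
```

Submitting every subscore before collecting any result lets the slowest one set the request latency rather than the sum of all of them, provided the underlying work releases the GIL or is I/O bound.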
Owned the BrightSafe product end to end, from ideation and prioritisation through delivery across five countries: Australia, New Zealand, Canada, Ireland, and the UK.
- Hired and led a cross-functional team of 9 to 15 engineers, designers, and QA. Responsible for development plans, 1:1s, mentoring, and performance management.
- Acted as the primary interface between engineering, product, and delivery stakeholders, shaping work into deliverable increments while balancing technical constraints and business priorities.
- Delivered third-party identity server integrations, a rebuilt documents platform, the engineering blog, and back office tooling improvements.
- Produced early-stage prototypes and architecture diagrams using microservices and CQRS patterns. Led live production incident response across all five operating regions.
Contract role on the EcoIQ smart heating platform, a distributed .NET system with teams in France and China.
- Collaborated with lead engineers to modernise shared .NET Standard libraries across teams.
- Worked with embedded systems engineers to design, build, and test integrations with Panasonic heat pumps using .NET 6, applying TDD to ensure reliable communication between backend services and physical hardware.
Built and led the investments engineering team from the ground up, working closely with product and stakeholders to shape technical direction and identify milestones.
- Hired and technically mentored the team. Architected greenfield solutions, created technical prototypes, and established logging, alerting, and incident post-mortem processes.
- Ran company-wide talks and 1:1 mentoring on TDD, BDD, trunk-based development, feature toggling, safe refactoring, and CI/CD.
- Developed strategies for removing technical debt and ensuring quality delivery through integration, end-to-end, and contract testing.
- Worked across multiple teams refactoring monoliths into microservices, applying TDD and agile practices. Languages and tools included C#, Java, JavaScript, Python, Terraform, and AWS.
- Collaborated with product owners to identify thin vertical slices of value, delivering through pairing, mobbing, and independent work.
- Mentored junior and graduate engineers through code reviews, pairing, and leading department Campfire talks.
- Built and maintained internal and public-facing websites and APIs using C#, MVC, React, Node.js, and .NET Core, supporting order tracking, diagnostics, and network management systems.
- Built prototypes with AWS Lambda, Azure, and .NET Core to inform technical direction. Refactored legacy code in an agile environment with rapid feedback cycles.
- Built features and fixed bugs using C#, JavaScript, VB.NET, jQuery, and SQL. Frameworks included MVC, MVVM, and Angular. Applied BDD and TDD using NUnit and SpecFlow.
Early-career commercial development role.
Projects
End-to-end MLOps pipeline predicting the probability of each driver finishing on the podium for a given F1 race. Trained on historical Ergast data from 1990 to 2024, validated against live 2025 results, and served via a containerised FastAPI application. Documented as a multi-part blog series covering the full arc from EDA and heuristic baselines through to automated retraining and drift-gated model promotion.
- LightGBM binary classifier trained with walk-forward cross-validation across 10-year rolling windows, with engineered features covering driver and constructor rolling podium rates, championship position, circuit-specific form, regulation era, and mechanical DNF rates.
- LakeFS provides Git-style data versioning with a staging-to-main branch strategy; every training run links to the exact LakeFS commit SHA via MLflow dataset logging, giving full data-to-model lineage.
- Model exported to ONNX at training time and registered in the MLflow model registry; deployment is an alias swap — the serve layer fetches by alias so promotion requires no code change and no redeploy.
- Evidently drift detection runs after each race weekend against a pinned reference dataset; a Prefect flow orchestrates ingestion, validation, conditional retraining, and margin-gated promotion end to end.
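Two ideas from the bullets above, walk-forward validation over rolling windows and margin-gated promotion, can be sketched in a few lines. This is an illustration under assumptions rather than the project's code: the function names, the `margin` default, and the "higher score is better" convention are all hypothetical.

```python
def walk_forward_splits(seasons, train_window=10):
    """Yield (training_seasons, validation_season) pairs in time order.

    Each fold trains on the `train_window` seasons immediately before the
    validation season, so no fold ever sees data from its own future.
    """
    ordered = sorted(set(seasons))
    for i in range(train_window, len(ordered)):
        yield ordered[i - train_window:i], ordered[i]


def should_promote(candidate_score, champion_score, margin=0.01):
    """Promote only when the candidate beats the champion by at least `margin`.

    Assumption: scores are "higher is better"; the margin value is illustrative.
    """
    return candidate_score >= champion_score + margin
```

With seasons 1990 to 2024 and a 10-year window, the first fold trains on 1990 to 1999 and validates on 2000, and the last trains on 2014 to 2023 and validates on 2024. The margin keeps a retrained model from replacing the current one on noise alone.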
Clinical code RAG pipeline built for NICE (National Institute for Health and Care Excellence), developed as the employer project for the Cambridge Data Science Career Accelerator. Given a free-text clinical research question, the system parses it into typed clinical entities, retrieves and ranks relevant codes across SNOMED CT, ICD-10, QOF, and NHS reference sets, and returns results with source attribution and confidence scoring.
- Two-stage RAG pipeline: offline ingestion encoding clinical code descriptions into biomedical embeddings (SapBERT/BioBERT) stored in sqlite-vec, and an online query stage combining semantic vector search with TF-IDF hybrid retrieval, reranking, and LLM-assisted reasoning.
- Stanza NER (bc5cdr and i2b2 biomedical models) decomposes free-text queries into typed entities (conditions, drugs, exclusions), driving structured retrieval across multiple code systems simultaneously.
- FastAPI service with WebSocket streaming of pipeline progress so results appear incrementally as each stage completes.
- MLflow evaluation tracking F1, precision, and recall against gold-standard question-to-code-set pairs, with an acceptance threshold of 0.70 across all three metrics and confidence scoring above 0.65.
- Dockerised with automatic ingestion on first boot and Stanza model caching baked into the image layer.
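The evaluation gate above can be illustrated per question as set overlap between retrieved codes and a gold-standard code set. This is a hedged sketch: how the real evaluation aggregates across questions is not stated here, and the function names and code identifiers are placeholders, not real clinical codes.

```python
def precision_recall_f1(retrieved, gold):
    """Set-based precision, recall, and F1 for one question."""
    retrieved, gold = set(retrieved), set(gold)
    true_positives = len(retrieved & gold)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def passes_acceptance(retrieved, gold, threshold=0.70):
    """True only if precision, recall, and F1 all reach the threshold."""
    return all(metric >= threshold for metric in precision_recall_f1(retrieved, gold))


# Placeholder identifiers for illustration only.
gold_codes = {"CODE_A", "CODE_B", "CODE_C", "CODE_E"}
good_run = {"CODE_A", "CODE_B", "CODE_C", "CODE_D"}
weak_run = {"CODE_A", "CODE_X"}
```

Requiring all three metrics to clear the bar stops a run from passing on high precision with poor recall, or the reverse.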
End-to-end image classification pipeline built as an MLOps exercise. A PyTorch CNN trained on CIFAR-10 and exported to ONNX for runtime decoupling, served via a production-style FastAPI REST API. Emphasis on the gap between a working model and a deployable one.
- PyTorch CNN with a 90/10 train/validation split, per-epoch checkpointing, and a held-out test set evaluated only at the end. ONNX export uses a pinned opset version to prevent silent runtime drift.
- FastAPI serving layer with API key auth, request ID propagation through context vars, structured log formatting, rotating file handler, and a timeout middleware returning 504 on breach.
- Separate Docker images for training and serving: the train image persists checkpoints and the ONNX artifact via mounted volumes; the serve image mounts the ONNX file at runtime rather than baking it in.
- Test suite using pytest with a mocked ONNX InferenceSession, covering preprocessing shape contracts, softmax numerical stability, postprocessing label mapping, and the inference endpoint.
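The softmax stability check mentioned above guards against a well-known failure: exponentiating large logits overflows. A minimal sketch of the standard fix, subtracting the maximum logit before exponentiating, is below. It uses pure Python to stay self-contained; the serving code presumably operates on arrays, and the function name is hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax.

    Subtracting the max logit leaves the result unchanged mathematically
    but keeps every exponent at or below zero, so exp() cannot overflow.
    """
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

A naive `math.exp(1000.0)` raises `OverflowError`, while `softmax([1000.0, 1001.0, 1002.0])` returns the same distribution as `softmax([0.0, 1.0, 2.0])` up to floating-point error.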
Four analytical projects completed across the Cambridge Data Science Career Accelerator (PACE), covering unsupervised learning, supervised classification, NLP, and time series forecasting. Each applied to a real-world dataset with a defined business problem and written report.
K-Means clustering applied to 68,000+ e-commerce customers aggregated to a customer-level view with engineered RFM-style features. Optimal k=4 selected via elbow method, silhouette scoring, and agglomerative clustering, then validated through PCA and t-SNE dimensionality reduction.
XGBoost and neural network classifiers trained to predict student dropout for Study Group across three progressively richer data stages — enrolment, attendance, and academic performance. Macro F1 and AUC used as primary metrics to account for the 85/15 class imbalance.
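A small sketch of why macro F1 suits an 85/15 split. The labels here are synthetic and only mirror the class ratio, not the project's data, and the function names are hypothetical. A classifier that always predicts the majority class scores 85% accuracy but a much lower macro F1, because the minority class contributes an F1 of zero.

```python
def f1_for_class(y_true, y_pred, cls):
    """F1 for a single class, treating `cls` as the positive label."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so each class counts equally."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Synthetic labels with an 85/15 split; the "model" always predicts class 0.
y_true = [0] * 85 + [1] * 15
y_pred = [0] * 100
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred)
```

Here `accuracy` is 0.85 while `score` is roughly 0.46, which is the gap the metric choice is meant to expose.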
Multi-method NLP pipeline applied to 12 months of PureGym reviews across Google and TrustPilot. BERTopic topic modelling, BERT-based emotion classification, LDA, and LLM-assisted theme extraction combined to surface customer pain points, with structured improvement recommendations generated per theme.
Classical (AutoARIMA/SARIMA), machine learning (XGBoost), deep learning (LSTM), and hybrid forecasting models compared against weekly and monthly Nielsen BookScan sales data. Tuned XGBoost dominated short-term weekly forecasting (MAE 41.52 vs SARIMA's 142.96); SARIMA outperformed at the monthly horizon where aggregation removes the lag-feature signal XGBoost relies on.
Two-sided senior tech recruitment platform for the UK market, built and run as a founder project. Candidate-role matching applies hard-constraint filtering first, then scores the remaining pairs with the BAAI/bge-large-en-v1.5 sentence transformer. A Scout service pulls from ATS adapters across Greenhouse, Lever, Workable, and Ashby. Go chosen deliberately for compiled binaries, minimal Alpine container images, and first-class concurrency via channels wired into WebSocket connections.
Education
Links
Curriculum Vitae — 2026
ML & Software Engineer — PDF download
Technical Blog
blog.daveahearne.com — writing on ML, systems, and building