What I do
Three closely related disciplines, owned end to end, not thrown over the fence between teams.
Programming
Production-grade software, built test-first — with AI to move fast without cutting corners.
- Backends & APIs (Django, Scala / Akka-HTTP)
- Test-driven development & clean architecture
- AI-assisted coding with senior review
- Web UIs, native apps & PWAs on CloudFront
SRE & Reliability
Make systems observable, resilient and survivable — before the incident, not during it.
- SLOs, error budgets & alerting that doesn't cry wolf
- Incident readiness, runbooks & postmortems
- Observability: metrics, logs, tracing
- Performance & cost under real load
Cloud Architecture & Automation
Infrastructure as code, golden paths and pipelines that let you move fast without breaking production.
- AWS architecture & Well-Architected reviews
- Terraform modules & reusable foundations
- CI/CD, GitOps & deployment safety
- Security-by-design & least privilege
Notes from the field
Practical writing on reliability, architecture and operating real systems — no hype, no thought-leadership theatre.
SLOs that don't lie: measuring what users actually feel
Most SLOs are green while users suffer — they measure the system, not the person. How to build SLIs from real user journeys, give each journey the target it deserves, turn the gap into a team-owned error budget, and wire alerts that drill straight to the cause.
Designing alerts nobody ignores
Noisy alerts train your team to ignore the real one. A deep, practical guide to symptom-based, multi-window, multi-burn-rate SLO alerting: the burn-rate maths, copy-pasteable PromQL, and the on-call process that makes pages trustworthy again.
Terraform modules that scale with your team, not against it
Reusable modules only scale if you treat them like products: small, reviewed, tested and versioned. A practical guide to building, releasing and consuming Terraform modules straight from GitHub — pinned to a tag or, when it matters, an immutable commit hash — with a Terragrunt layout that mirrors your estate.
Got a system that has to stay up?
Whether it's an architecture review, an SRE engagement, or a reliability fire you need help putting out — let's talk.
Get in touch