The Problem
A well-instrumented engineering operation requires four things: error tracking with full stack traces, centralised log aggregation for debugging production incidents, a VPN for secure production access, and CI runner capacity for compute-intensive jobs. The SaaS options for all four are good products. At agency scale, though, the combined cost across all developer seats, all managed stores, and the volume of data each generates becomes a meaningful line item.
The question was whether self-hosted equivalents could deliver the same capability without disproportionate operational overhead. The answer across all four was yes.
Sentry — Error Tracking & Performance Monitoring
Self-hosted Sentry deployed via Docker Compose on dedicated infrastructure. All managed client stores and agency tooling instrumented with the Sentry SDK — full error tracking with stack traces, breadcrumbs, and user context; release tracking that correlates deployment events with error rate changes; performance monitoring for transaction tracing and slow query detection.
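As an illustration of what instrumentation against a self-hosted instance might look like, here is a minimal Python sketch for agency tooling. The stores themselves run Magento and would use the PHP SDK instead; the DSN host, release string, and sample rate below are hypothetical placeholders, not values from this deployment. The sketch assumes the `sentry-sdk` package is installed and degrades to a no-op when it is not.

```python
def sentry_options(dsn, release, environment, traces_sample_rate=0.2):
    """Collect the init options in one place so every tool is tagged consistently.

    release is what lets Sentry correlate a deployment with error-rate changes;
    traces_sample_rate is the fraction of transactions sent for performance
    monitoring. The default here is an assumed value, not a recommendation.
    """
    if not 0.0 <= traces_sample_rate <= 1.0:
        raise ValueError("traces_sample_rate must be between 0.0 and 1.0")
    return {
        "dsn": dsn,
        "release": release,
        "environment": environment,
        "traces_sample_rate": traces_sample_rate,
    }


def init_sentry(options):
    """Initialise the SDK if it is available; return False when it is not."""
    try:
        import sentry_sdk  # third-party: pip install sentry-sdk
    except ImportError:
        return False
    sentry_sdk.init(**options)
    return True


# Hypothetical DSN pointing at the self-hosted instance rather than sentry.io.
EXAMPLE_OPTIONS = sentry_options(
    dsn="https://examplePublicKey@sentry.example.internal/1",
    release="deploy-tool@1.4.2",
    environment="production",
)
```

The only difference from a SaaS setup is the DSN host: the SDK sends events wherever the DSN points.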
Self-hosted Sentry is a mature, well-documented deployment, and the upgrade path is straightforward. The operational overhead consists of periodic version updates and monitoring of the Sentry instance itself (ironic, but solved by pointing the Sentry instance’s own SDK at the same instance).
The equivalent Sentry SaaS plan at the same organisation size, project count, and event volume would cost substantially more monthly. Data residency is also cleaner — client error data doesn’t leave the managed infrastructure.
ELK Stack — Log Aggregation
Elasticsearch, Logstash (replaced by Filebeat for most ingestion paths), and Kibana deployed on dedicated infrastructure. Centralised aggregation for:
- Nginx access logs (all managed stores)
- PHP-FPM logs (slow request detection, error patterns)
- Magento exception logs
- Deployment event logs
- RabbitMQ consumer logs
- Varnish MISS/HIT logs (cache behaviour analysis)
Query speed for debugging production incidents is the key differentiator over per-server log review: correlating an error trace against nginx access logs, PHP-FPM slowlog, and Magento exception log simultaneously, in Kibana, in seconds. Previously this required SSHing into each server and grepping across multiple log files manually.
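The correlation described above amounts to one time-bounded search across several indices. A minimal sketch of such a query body, built in Python, follows. The index patterns and the 30-second window are assumptions for illustration; `@timestamp` and `host.name` are the fields Filebeat populates by default, but the actual mapping in this deployment is not stated.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical index patterns, one per log source being correlated.
INDEX_PATTERNS = ["nginx-access-*", "php-fpm-*", "magento-exception-*"]


def search_path(patterns=INDEX_PATTERNS):
    """Elasticsearch accepts a comma-separated list of index patterns."""
    return "/" + ",".join(patterns) + "/_search"


def incident_query(host, error_time, window_seconds=30):
    """Build a query body for every log line from one host within
    +/- window_seconds of the error timestamp, oldest first."""
    delta = timedelta(seconds=window_seconds)
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"host.name": host}},
                    {"range": {"@timestamp": {
                        "gte": (error_time - delta).isoformat(),
                        "lte": (error_time + delta).isoformat(),
                    }}},
                ]
            }
        },
        "sort": [{"@timestamp": "asc"}],
        "size": 500,
    }
```

Posting the body returned by `incident_query` to `search_path()` returns nginx, PHP-FPM, and Magento entries interleaved in time order, which is the same view Kibana provides interactively.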
Retention is configured per log type based on operational need and any client-specific requirements. Logs older than the retention threshold are deleted, so storage costs do not grow without bound.
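The per-type retention rule can be sketched as a small function. This assumes daily indices named `<log-type>-YYYY.MM.DD` and uses made-up retention periods; the text does not say how deletion is actually implemented, and Elasticsearch's built-in index lifecycle management can enforce the same policy without custom code.

```python
from datetime import date, datetime, timedelta

# Hypothetical retention periods in days, keyed by index prefix.
RETENTION_DAYS = {
    "nginx-access": 30,
    "php-fpm": 14,
    "magento-exception": 90,
}


def expired_indices(index_names, today, retention=RETENTION_DAYS):
    """Return the indices whose date suffix is older than their type's retention.

    Indices with an unknown prefix or an unparseable date are left alone,
    so nothing is deleted by accident.
    """
    expired = []
    for name in index_names:
        prefix, _, stamp = name.rpartition("-")
        if prefix not in retention:
            continue
        try:
            day = datetime.strptime(stamp, "%Y.%m.%d").date()
        except ValueError:
            continue
        if day < today - timedelta(days=retention[prefix]):
            expired.append(name)
    return expired
```

Keeping the policy in one mapping makes a client-specific override a one-line change.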
OpenVPN — Production Access
Production and staging environment access gated behind OpenVPN with certificate-based authentication. Per-developer certificates issued on onboarding, revoked immediately on offboarding. Certificate revocation takes effect across the entire fleet within seconds.
The alternative — firewall rules per IP address — is operationally untenable at team scale: developer IP addresses change, remote workers use different networks, and IP-based access control doesn’t integrate with offboarding workflows.
Self-hosted OpenVPN requires no per-seat licensing. The operational overhead is certificate lifecycle management — issuing, tracking expiry, and revoking. Automated certificate expiry notifications handle the monitoring.
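A minimal sketch of the expiry check behind such notifications is shown below. It assumes the expiry dates have already been collected as strings in the format printed by `openssl x509 -enddate -noout` (for example `Jun  1 12:00:00 2025 GMT`); the developer names and the 30-day warning window are hypothetical.

```python
from datetime import datetime, timezone

# Date format used by OpenSSL for notBefore/notAfter.
OPENSSL_DATE_FORMAT = "%b %d %H:%M:%S %Y %Z"


def parse_not_after(value):
    """Parse an OpenSSL date such as 'Jun  1 12:00:00 2025 GMT' as UTC."""
    normalised = " ".join(value.split())  # collapse the double space before single-digit days
    return datetime.strptime(normalised, OPENSSL_DATE_FORMAT).replace(tzinfo=timezone.utc)


def expiring_soon(certs, now, warn_days=30):
    """certs maps a developer name to a notAfter string.

    Returns (name, days_left) pairs for certificates that expire within
    warn_days or have already expired (negative days_left), soonest first.
    """
    flagged = []
    for name, not_after in certs.items():
        days_left = (parse_not_after(not_after) - now).days
        if days_left <= warn_days:
            flagged.append((name, days_left))
    return sorted(flagged, key=lambda item: item[1])
```

Run daily from a scheduler, the returned list is what a notification job would send on to email or chat.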
Self-Hosted GitHub Actions Runners
GitHub-hosted (public) Actions runners have two constraints relevant to agency-scale CI: per-minute billing for heavy compute (Magento E2E test suites run long) and restrictions on the privileged Docker operations that some CI jobs require.
On-premises self-hosted runners address both. Compute-intensive jobs (the full Playwright E2E suite, Magento install-and-test, Docker builds for production images) run on self-hosted runners at infrastructure cost, on hardware that is already running, rather than at per-minute GitHub billing. Privileged Docker operations for Warden-based Magento test environments run without restriction.
Public runners remain in use for lightweight jobs where per-minute cost is negligible and isolation is preferable.
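The split between the two runner types reduces to a simple routing rule. The sketch below expresses it as a Python function that returns the value a workflow would use for `runs-on`; the 10-minute threshold and the job names are assumptions, while `self-hosted` and `linux` are among the default labels GitHub assigns to self-hosted Linux runners.

```python
from dataclasses import dataclass

SELF_HOSTED = ["self-hosted", "linux"]
GITHUB_HOSTED = ["ubuntu-latest"]


@dataclass
class Job:
    name: str
    expected_minutes: int
    needs_privileged_docker: bool = False


def runs_on(job, heavy_threshold_minutes=10):
    """Pick runner labels for a job.

    Jobs that need privileged Docker, or that run long enough for per-minute
    billing to matter, go to self-hosted runners; everything else stays on
    GitHub-hosted runners, where isolation is better and cost is negligible.
    """
    if job.needs_privileged_docker or job.expected_minutes >= heavy_threshold_minutes:
        return SELF_HOSTED
    return GITHUB_HOSTED
```

In practice the same decision is usually written directly into each workflow file as a `runs-on` value; the function only makes the rule explicit.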
Operational Reality
Self-hosting carries operational overhead: infrastructure to maintain, software to update, and incidents to handle. The honest assessment across all four tools is that the overhead is low for mature, stable self-hosted products with active communities and clear upgrade paths.
None of these are bleeding-edge projects. Sentry, ELK, and OpenVPN have been self-hosted at scale by organisations for years. The operational patterns are well-understood. The main requirement is treating the self-hosted infrastructure with the same care as the production infrastructure it monitors.