Uptime Monitor

A serverless approach to infrastructure reliability and performance monitoring.

Fleet-Wide Availability Tracking

A lightweight SRE laboratory for managing multi-site uptime with high-signal visibility and minimal operational overhead.

Reliability as a Priority

Uptime Monitor is a serverless platform engineering lab for operating reliable health checks across a fleet of targets with clear visibility into status, latency, and history.

It demonstrates practical serverless SRE work: automated scheduling, historical data persistence in S3, and clean architectural separation between monitoring and presentation.

The project favors event-driven simplicity and cost-efficient cloud primitives over complex, heavyweight monitoring solutions.

Why It Matters

  • Shows ownership of serverless architecture, event-driven triggers, and S3-based data lifecycles.
  • Demonstrates SRE judgment through multi-site fleet management and historical trend analysis.
  • Adheres to high engineering standards with 80%+ test coverage and structured logging.
  • Maintains transparency through a narrative Evolution Log and documented Architecture Decision Records (ADRs).

Fleet Signals

Lambda Operations

Executes lightweight, concurrent health checks via AWS Lambda, keeping compute costs low and each fleet-wide run short.
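
As a rough illustration of the concurrent checks described above, the sketch below fans out one goroutine per target and collects the results in order. It is not the project's actual code: the `Result` type, the `Probe` signature, and the `newHTTPProbe` and `checkAll` helpers are hypothetical names, and the rule that 2xx and 3xx responses count as "up" is an assumption. The usage example probes two local test servers so that it runs without network access or a Lambda runtime.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
	"time"
)

// Result is a hypothetical record for one health check.
type Result struct {
	Target    string
	Status    int
	Up        bool
	LatencyMS int64
	Err       string
}

// Probe performs a single check and returns the HTTP status code.
type Probe func(target string) (int, error)

// newHTTPProbe returns a Probe that issues a GET bounded by timeout.
func newHTTPProbe(timeout time.Duration) Probe {
	client := &http.Client{Timeout: timeout}
	return func(target string) (int, error) {
		resp, err := client.Get(target)
		if err != nil {
			return 0, err
		}
		defer resp.Body.Close()
		return resp.StatusCode, nil
	}
}

// checkAll probes every target concurrently and returns the results
// in the same order as targets. Each goroutine writes only to its own
// slice index, so no mutex is needed.
func checkAll(targets []string, probe Probe) []Result {
	results := make([]Result, len(targets))
	var wg sync.WaitGroup
	for i, target := range targets {
		wg.Add(1)
		go func(i int, target string) {
			defer wg.Done()
			start := time.Now()
			status, err := probe(target)
			r := Result{Target: target, Status: status, LatencyMS: time.Since(start).Milliseconds()}
			if err != nil {
				r.Err = err.Error()
			} else {
				// Assumption: 2xx and 3xx responses count as "up".
				r.Up = status >= 200 && status < 400
			}
			results[i] = r
		}(i, target)
	}
	wg.Wait()
	return results
}

func main() {
	// Local stand-ins for real targets: one healthy, one failing.
	healthy := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	defer healthy.Close()
	failing := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusServiceUnavailable)
	}))
	defer failing.Close()

	probe := newHTTPProbe(5 * time.Second)
	for _, r := range checkAll([]string{healthy.URL, failing.URL}, probe) {
		fmt.Printf("up=%v status=%d latency=%dms err=%q\n", r.Up, r.Status, r.LatencyMS, r.Err)
	}
}
```

Because the probes run in parallel, the wall-clock time of a run is roughly that of the slowest target rather than the sum of all checks, which matters when Lambda bills by execution duration.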

S3 Persistence

Stores historical check results in S3, used as a low-cost data store; this history drives the 'Pulse' trendline visualization.
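
One way to treat S3 as a simple history store is to write each check result as a small JSON object under a date-partitioned key, so that a day of history for one site shares a prefix. The sketch below shows only that idea. The `CheckRecord` fields, the `checks/<site>/<yyyy>/<mm>/<dd>/<hhmmss>.json` layout, and the `objectKey` helper are assumptions for illustration, not the project's documented schema, and the upload itself (for example an S3 `PutObject` call through the AWS SDK) is left out to keep the example dependency-free.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// CheckRecord is a hypothetical shape for one stored check result.
type CheckRecord struct {
	Target    string `json:"target"`
	Status    int    `json:"status"`
	LatencyMS int64  `json:"latency_ms"`
	CheckedAt string `json:"checked_at"` // RFC 3339, UTC
}

// objectKey builds a date-partitioned key from a site identifier and a
// Unix timestamp in seconds. The layout is an assumed convention.
func objectKey(siteID string, unixSec int64) string {
	t := time.Unix(unixSec, 0).UTC()
	return fmt.Sprintf("checks/%s/%s/%s.json", siteID, t.Format("2006/01/02"), t.Format("150405"))
}

// newRecord fills a CheckRecord, normalizing the timestamp to UTC.
func newRecord(target string, status int, latencyMS, unixSec int64) CheckRecord {
	return CheckRecord{
		Target:    target,
		Status:    status,
		LatencyMS: latencyMS,
		CheckedAt: time.Unix(unixSec, 0).UTC().Format(time.RFC3339),
	}
}

func main() {
	now := time.Now().Unix()
	rec := newRecord("https://example.com", 200, 42, now)

	// body is what would be uploaded as the object's contents.
	body, err := json.Marshal(rec)
	if err != nil {
		panic(err)
	}
	fmt.Println(objectKey("example-com", now))
	fmt.Println(string(body))
}
```

With a layout like this, a trendline view could list a single site-and-day prefix to fetch only the objects it needs, and S3 lifecycle rules could expire old prefixes without any application code.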

Infrastructure as Code

Provisioned and managed declaratively with Terraform, ensuring reproducible cloud environments.

Go Performance

Built with Go for fast startup (short cold starts) and reliable concurrent network operations.