A serverless uptime monitoring solution designed for personal fleet management, demonstrating AWS proficiency and SRE principles through Go, Lambda, S3-backed persistence, and scheduled EventBridge triggers.
Architecture Blueprint
Core Components & Logic
- AWS Lambda Execution
  AWS Lambda serves as the execution runtime. It runs a Go API router that handles the HTTP endpoints and validates request actions.
- Amazon S3 Storage
  State is stored as JSON objects directly in Amazon S3: latest.json holds the current fleet status snapshot, and history.json holds a rolling history used for trend metrics.
- EventBridge Cron Trigger
  An Amazon EventBridge schedule invokes the Lambda cron path every hour, so uptime checks run in the background without manual triggering.
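The routing described above can be sketched in plain Go. This is a minimal illustration, not the project's actual router: the paths (`/health`, `/check`, `/latest`, `/history`) are inferred from the test names further down, and the `Router` fields, the `Response` type, and the 202 status for an accepted check are assumptions. The Lambda event adapter is left out so the sketch depends only on the standard library.

```go
package main

import "net/http"

// Response is a simplified stand-in for what a Lambda Function URL handler returns.
type Response struct {
	Status int
	Body   string
}

// Router holds hypothetical dependencies; the real project's fields may differ.
type Router struct {
	Latest   func() (string, error) // reads latest.json
	History  func() (string, error) // reads history.json
	RunCheck func() error           // runs one round of uptime checks
}

// HandleHTTP dispatches by path, then validates the method for that path.
func (r Router) HandleHTTP(method, path string) Response {
	switch path {
	case "/health":
		return allow(method, http.MethodGet, func() Response {
			return Response{Status: http.StatusOK, Body: `{"status":"healthy"}`}
		})
	case "/check":
		return allow(method, http.MethodPost, func() Response {
			if err := r.RunCheck(); err != nil {
				return internalError()
			}
			return Response{Status: http.StatusAccepted, Body: `{"status":"accepted"}`}
		})
	case "/latest":
		return allow(method, http.MethodGet, func() Response { return read(r.Latest) })
	case "/history":
		return allow(method, http.MethodGet, func() Response { return read(r.History) })
	default:
		return Response{Status: http.StatusNotFound, Body: `{"error":"not found"}`}
	}
}

// HandleCron is the path a scheduled invocation would take: no HTTP
// validation, just one check run.
func (r Router) HandleCron() error {
	return r.RunCheck()
}

// allow rejects requests whose method does not match the one the path expects.
func allow(got, want string, next func() Response) Response {
	if got != want {
		return Response{Status: http.StatusMethodNotAllowed, Body: `{"error":"method not allowed"}`}
	}
	return next()
}

// read returns the stored JSON body, or a generic 500 if the load fails.
func read(load func() (string, error)) Response {
	body, err := load()
	if err != nil {
		return internalError()
	}
	return Response{Status: http.StatusOK, Body: body}
}

// internalError hides failure details from the caller; they belong in logs.
func internalError() Response {
	return Response{Status: http.StatusInternalServerError, Body: `{"error":"internal error"}`}
}
```

Keeping the HTTP path and the cron path as separate methods on one router means both triggers share the same check function, which is one way to keep a single Lambda serving both roles.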
Validation & Resiliency Testing
Reproducibility
The environment is provisioned entirely from declarative Terraform modules, so it can be rebuilt reproducibly with a consistent set of AWS resources and IAM roles confined to defined boundaries.
Automated Verification
Unit tests use mock AWS client interfaces and test harnesses to verify JSON marshaling, state validation, and router responses under failure conditions.
Telemetry Pipeline
Health-check failures emit structured JSON logs, standard error output is captured, and the frontend renders historical pulse trendlines to visualize operational availability.
Design Trade-offs & Verification Logs
Humble Pivots
Serverless S3 Over a Managed Database
Chose S3 JSON storage rather than DynamoDB or RDS. This minimized running costs and database operational overhead, since fleet snapshots and short histories only require basic key-value read/write access patterns.
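The read/write pattern this relies on is small enough to sketch: read history.json, append the new entry, trim to a fixed window, and write the result back. The `Entry` fields, the function names, and the idea of a fixed entry limit are assumptions for illustration, and the S3 get/put calls around this logic are omitted.

```go
package main

import "encoding/json"

// Entry is an assumed shape for one history.json record.
type Entry struct {
	CheckedAt string `json:"checked_at"`
	Up        int    `json:"up"`
	Down      int    `json:"down"`
}

// appendRolling appends e and keeps only the newest limit entries.
func appendRolling(history []Entry, e Entry, limit int) []Entry {
	if limit <= 0 {
		return nil
	}
	history = append(history, e)
	if len(history) > limit {
		// Copy the tail so the result does not pin the old backing array.
		history = append([]Entry(nil), history[len(history)-limit:]...)
	}
	return history
}

// updateHistory takes the current history.json body (empty if the object
// does not exist yet) and returns the body to write back.
func updateHistory(raw []byte, e Entry, limit int) ([]byte, error) {
	var history []Entry
	if len(raw) > 0 {
		if err := json.Unmarshal(raw, &history); err != nil {
			return nil, err
		}
	}
	return json.Marshal(appendRolling(history, e, limit))
}
```

This is a plain read-modify-write with no locking, so it is only safe when a single invocation writes at a time. Two concurrent writers could each read the same body and one update would be lost.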
Lambda Function URLs Instead of API Gateway
Used Lambda Function URLs for the backend API instead of API Gateway. This significantly reduced infrastructure complexity, since the project does not need advanced routing, request transformation, or edge authorizers at this stage.
Objective Clarity
Evaluated under a single-writer assumption: only one invocation writes state at a time. The fleet is limited to 5 monitored endpoints, which are checked concurrently, and each request is capped at a 10-second timeout to keep Lambda execution time bounded and control costs.
Verifiable Outputs
Validation Suite (Unit Tests)
=== RUN TestRouter
=== RUN TestRouter/health_returns_healthy_response
=== RUN TestRouter/check_accepts_post
=== RUN TestRouter/latest_returns_stored_snapshot
=== RUN TestRouter/history_returns_stored_entries
--- PASS: TestRouter (0.00s)
    --- PASS: TestRouter/health_returns_healthy_response (0.00s)
    --- PASS: TestRouter/check_accepts_post (0.00s)
    --- PASS: TestRouter/latest_returns_stored_snapshot (0.00s)
    --- PASS: TestRouter/history_returns_stored_entries (0.00s)
PASS
ok uptime-monitor/internal/api 0.004s
=== RUN TestCheck
=== RUN TestCheck/marks_200_response_up
=== RUN TestCheck/marks_500_response_down
=== RUN TestCheck/records_request_error
--- PASS: TestCheck (0.00s)
    --- PASS: TestCheck/marks_200_response_up (0.00s)
    --- PASS: TestCheck/marks_500_response_down (0.00s)
    --- PASS: TestCheck/records_request_error (0.00s)
PASS
ok uptime-monitor/internal/monitor 0.003s
=== RUN TestStoreLatest
=== RUN TestStoreLatest/returns_stored_latest_response
--- PASS: TestStoreLatest (0.00s)
    --- PASS: TestStoreLatest/returns_stored_latest_response (0.00s)
=== RUN TestStoreHistory
=== RUN TestStoreHistory/returns_stored_history_response
--- PASS: TestStoreHistory (0.00s)
    --- PASS: TestStoreHistory/returns_stored_history_response (0.00s)
PASS
ok uptime-monitor/internal/storage 0.004s