How I Built a Full Status Alert System
Website uptime, Supabase health, API monitoring, Garmin failure alerts, Fail2ban reports, personal changelog, and a weekly summary email — all automated from a headless home server
Ingredients
- Headless Linux server — the old laptop from the previous two posts, running 24/7 (already set up)
- cron — Linux task scheduler, built-in (free)
- Resend — email API for all alert delivery (free tier: 3,000 emails/month)
- Claude Code — terminal AI for writing every script ($200/yr)
- healthchecks.io — dead-man’s switch that alerts when a scheduled job stops running (free tier)
- Supabase project — for the Numerator game database health check (already set up)
The Problem: A Server You Can’t See
A headless server is quiet by design. That’s the point — it runs in the background, lid closed, in another room. But quiet also means invisible. If joseandgoose.com goes down at 2am, I won’t know until someone tells me. If Supabase has an outage and my contact form stops saving submissions, I’ll find out when I check the database manually (which I never do). If the Garmin recap cron job silently fails, I get no email and no clue.
The solution isn’t to check things manually — that defeats the purpose of automation. The solution is to make the system tell you when something is wrong. Every important job should either succeed quietly or fail loudly. Here’s the full stack:
- Alert 1: Website Uptime Monitor — checks joseandgoose.com every 5 minutes
- Alert 2: Garmin Recap Failure Check — dead-man’s switch if the 7am recap doesn’t run
- Alert 3: Nightly Fail2ban Ban Report — daily delta of new SSH attack IPs blocked
- Alert 4: Supabase Health + GitHub Activity — Sunday database ping and weekly code activity
- Alert 5: Personal Server Changelog — Claude writes a plain-English weekly standup
- Alert 6: Weekly Status Report Email — everything in one Sunday morning digest
- The Meta-Alert: healthchecks.io — alerts if the server itself goes offline
Alert 1: Website Uptime Monitor
The most basic question: is joseandgoose.com responding? A curl request every 5 minutes, checked against an expected HTTP status code. If it returns anything other than 200, send an alert.
🔧 Developer section: Uptime monitor script
- `curl -s -o /dev/null -w "%{http_code}" https://joseandgoose.com` — gets the HTTP status code silently
- If status ≠ 200: sends a Resend email with the status code, timestamp, and a note to check Vercel logs
- If status = 200: logs the timestamp quietly to `~/.system-reports/uptime.log` with no email
- Cron: `*/5 * * * *` — runs every 5 minutes, 288 checks per day
- Also logs to a daily CSV so I can spot patterns (slow responses during peak hours, etc.)
In the first month of running: two downtime events. One was a Vercel deployment that briefly returned a 503 during a cold start. One was my own fault — a broken build that I caught within 5 minutes because the alert email beat me to it.
Add a cooldown: only alert once per hour per incident. If the site is still down an hour later, send another. One alert per incident is actionable; a flood of them is just noise.
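As a rough sketch, the check-plus-cooldown logic can be structured like this (the state-file path and the `send_alert` helper are illustrative placeholders, not the actual script):

```shell
#!/usr/bin/env bash
# Uptime check with a one-hour alert cooldown (sketch).
URL="https://joseandgoose.com"
STATE="$HOME/.system-reports/uptime-last-alert"   # epoch of the last alert sent
LOG="$HOME/.system-reports/uptime.log"
COOLDOWN=3600                                     # one alert per hour per incident

# True (exit 0) when enough time has passed since the last alert.
should_alert() {  # args: last_alert_epoch now_epoch cooldown_secs
    [ $(( $2 - $1 )) -ge "$3" ]
}

check_site() {
    local status now last
    status=$(curl -s -o /dev/null -w "%{http_code}" --max-time 10 "$URL")
    if [ "$status" = "200" ]; then
        echo "$(date -Is) OK" >> "$LOG"           # silent pass
        return 0
    fi
    now=$(date +%s)
    last=$(cat "$STATE" 2>/dev/null || echo 0)
    if should_alert "$last" "$now" "$COOLDOWN"; then
        echo "$now" > "$STATE"
        send_alert "Site down: HTTP $status"      # hypothetical Resend wrapper
    fi
}
```

The state file is what prevents a long outage from generating 288 duplicate emails: the timestamp of the last alert is the only memory the script needs.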
Alert 2: Garmin Recap Failure Check
The Garmin recap runs at 7am. By 8am, a recap file should exist for today. If it doesn’t, something broke overnight — and I should know before I’ve been waiting all day for a recap email that’s never coming.
🔧 Developer section: Garmin failure check script
- Runs at 8:00am daily: `0 8 * * *`
- Checks if `~/.garmin-recap/recaps/garmin-recap-YYYY-MM-DD.md` exists for today
- If it exists: silent pass, no email
- If it’s missing: sends an alert email with the last 20 lines of the recap log
- Alert subject: "⚠️ Garmin Recap Failed — [DATE]"
This is a dead-man’s switch pattern: instead of the job alerting on success, a second job alerts on missing success. It catches silent failures — crashes, auth errors, network timeouts — that don’t generate their own error output.
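A sketch of the file-existence check (the recap directory is from the post; the log path and the `send_alert` helper are assumptions):

```shell
#!/usr/bin/env bash
# Dead-man's switch: alert only when today's recap file is missing (sketch).
RECAP_DIR="$HOME/.garmin-recap/recaps"
LOG="$HOME/.garmin-recap/recap.log"               # assumed log location

check_recap() {
    local f="$RECAP_DIR/garmin-recap-$(date +%F).md"
    if [ -f "$f" ]; then
        return 0                                  # silent pass, no email
    fi
    # Missing: include the tail of the recap log so the email is diagnosable.
    send_alert "⚠️ Garmin Recap Failed — $(date +%F)" \
        "$(tail -n 20 "$LOG" 2>/dev/null)"        # hypothetical Resend wrapper
}
```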
Alert 3: Nightly Fail2ban Ban Report
Fail2ban bans IPs automatically, but I wanted a daily snapshot: how many new IPs got banned today? Is that number trending up (could indicate a targeted scan) or holding steady (normal background noise)?
🔧 Developer section: Fail2ban report script
- `sudo fail2ban-client status sshd` — outputs total banned count
- Script reads the current total, compares to yesterday’s saved count
- Calculates: new bans = current total − previous total
- Saves today’s count to `~/.system-reports/fail2ban-lastcount.txt` for tomorrow
- Sends an email: "🛡️ Fail2ban Daily Report — X new bans today (Y total)"
- Cron: `0 19 * * *`
The delta matters more than the total. A large cumulative count after weeks of running is expected. An unusual spike in a single day is worth investigating.
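The delta logic is a few lines of shell. A sketch (parsing assumes fail2ban's standard "Total banned:" status line; `send_alert` is a hypothetical Resend wrapper):

```shell
#!/usr/bin/env bash
# Nightly ban delta: today's total minus yesterday's saved total (sketch).
COUNT_FILE="$HOME/.system-reports/fail2ban-lastcount.txt"

# Parse the running total out of fail2ban's status output.
current_total() {
    sudo fail2ban-client status sshd | awk '/Total banned/ {print $NF}'
}

ban_delta() {  # args: current_total previous_total
    echo $(( $1 - $2 ))
}

report() {
    local total prev new
    total=$(current_total)
    prev=$(cat "$COUNT_FILE" 2>/dev/null || echo 0)
    new=$(ban_delta "$total" "$prev")
    echo "$total" > "$COUNT_FILE"   # save for tomorrow's delta
    send_alert "🛡️ Fail2ban Daily Report — $new new bans today ($total total)"
}
```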
Alert 4: Supabase Health + GitHub Activity (Sunday)
Two separate checks that share a Sunday timeslot because they’re both weekly sanity checks rather than urgent alerts:
🔧 Developer section: Supabase health check
- Runs a simple query against the Supabase REST API: count rows in the `contacts` table
- If the API returns a valid response: logs the count, no email
- If it errors (503, timeout, auth failure): sends an alert with the error body
- Also checks the `numerator_rounds` table — confirms the game database is live
- Uses the Supabase service role key from `.env.local`
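A minimal sketch of the ping, assuming the standard Supabase REST conventions (`SUPABASE_URL` and the key variable name are placeholders loaded from `.env.local`; `select=id` assumes an `id` column; `send_alert` is a hypothetical helper):

```shell
#!/usr/bin/env bash
# Supabase REST health ping (sketch). Table names are from the post.

check_table() {  # arg: table name; prints the HTTP status code
    curl -s -o /dev/null -w "%{http_code}" --max-time 10 \
        -H "apikey: $SUPABASE_SERVICE_ROLE_KEY" \
        -H "Authorization: Bearer $SUPABASE_SERVICE_ROLE_KEY" \
        "$SUPABASE_URL/rest/v1/$1?select=id&limit=1"
}

check_supabase() {
    local table code
    for table in contacts numerator_rounds; do
        code=$(check_table "$table")
        if [ "$code" != "200" ]; then
            send_alert "Supabase check failed: $table returned HTTP $code"
        fi
    done
}
```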
🔧 Developer section: GitHub activity report
- Calls the GitHub API: `GET /users/joseandgoose/events`
- Filters for the past 7 days of events: pushes, PRs, issues, stars
- Formats into a short summary and includes in the Sunday report email
- Formats into a short summary and includes in the Sunday report email
- Uses a personal access token from the env file (read-only, public repo scope)
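The fetch-and-summarize step might look like this sketch (the endpoint is from the post; the token variable name and the `jq` dependency are assumptions):

```shell
#!/usr/bin/env bash
# Weekly GitHub activity summary (sketch). Requires jq.

fetch_events() {
    curl -s -H "Authorization: Bearer $GITHUB_TOKEN" \
        "https://api.github.com/users/joseandgoose/events?per_page=100"
}

summarize_events() {  # arg: ISO-8601 cutoff; reads an events JSON array on stdin
    jq -r --arg since "$1" '
        [.[] | select(.created_at > $since)]   # keep events in the window
        | group_by(.type)
        | map("\(.[0].type): \(length)")
        | .[]'
}

github_summary() {
    local since
    since=$(date -u -d '7 days ago' +%Y-%m-%dT%H:%M:%SZ 2>/dev/null \
            || date -u -v-7d +%Y-%m-%dT%H:%M:%SZ)   # GNU or BSD date
    fetch_events | summarize_events "$since"
}
```

ISO-8601 timestamps sort lexicographically, which is why a plain string comparison against the cutoff works.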
Alert 5: Personal Server Changelog (Sunday)
Every Sunday morning, Claude writes a short narrative of what the server did that week. It’s not a metrics dump — it’s a 3–5 sentence changelog in plain English, like a standup from the server to me.
🔧 Developer section: Claude-generated changelog
- Script collects stats: Garmin recaps generated this week, new Fail2ban bans, AI jobs completed, uptime log entries, disk usage, site downtime events
- Passes stats to `claude -p "..."` with a prompt asking for a casual, 3–5 sentence changelog
- Claude output is saved to a temp file and folded into the Sunday weekly report
- Cron: `0 7 * * 0` — 7am Sunday, runs before the 9am weekly report email so it’s ready
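The generation step is tiny. A sketch (`claude -p` is from the post; the stats file, prompt wording, and output path are illustrative assumptions):

```shell
#!/usr/bin/env bash
# Changelog generation step (sketch).
STATS_FILE="$HOME/.system-reports/week-stats.txt"   # assumed location
OUT="/tmp/server-changelog.txt"

build_prompt() {  # arg: collected stats as text
    printf 'You are my home server. Based on these stats from the past week, write a casual changelog in plain English, 3-5 sentences, first person:\n\n%s\n' "$1"
}

generate_changelog() {
    claude -p "$(build_prompt "$(cat "$STATS_FILE")")" > "$OUT"
}
```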
Claude writes the server’s weekly standup. No log-diving required.
Alert 6: Weekly Status Report Email
All the pieces come together in one Sunday email: changelog, Supabase health, GitHub activity, Fail2ban weekly total, disk space, and a resource summary. It’s the one email that tells me everything about the past week without opening a terminal.
🔧 Developer section: Weekly report assembly
- Runs at 9am Sunday — after all the 7am and 8am jobs have completed
- Reads the Claude changelog from the temp file generated at 7am
- Reads the Supabase and GitHub outputs from the 8am jobs
- Pulls resource stats: `df -h` for disk, last CSV entry from `resources-YYYY-MM-DD.csv`
- Assembles into an HTML email via Resend API
- Subject: "🖥️ Server Weekly — [WEEK OF DATE]"
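The assembly step can be sketched like this (sender/recipient addresses and temp-file paths are placeholders; the `/emails` endpoint and JSON shape follow Resend's API; `jq` is used for safe JSON escaping):

```shell
#!/usr/bin/env bash
# Weekly report assembly + send via Resend (sketch).

build_report() {  # args: changelog, health, disk summaries; prints HTML
    printf '<h2>Server Weekly</h2><h3>Changelog</h3><p>%s</p><h3>Health</h3><p>%s</p><h3>Disk</h3><pre>%s</pre>' \
        "$1" "$2" "$3"
}

send_report() {
    local html
    # "data unavailable" fallbacks keep the report sending even if an
    # upstream Sunday job failed to write its file.
    html=$(build_report \
        "$(cat /tmp/server-changelog.txt 2>/dev/null || echo 'data unavailable')" \
        "$(cat /tmp/supabase-health.txt  2>/dev/null || echo 'data unavailable')" \
        "$(df -h / | tail -1)")
    curl -s -X POST https://api.resend.com/emails \
        -H "Authorization: Bearer $RESEND_API_KEY" \
        -H "Content-Type: application/json" \
        -d "$(jq -n --arg html "$html" \
              '{from:"server@example.com", to:["me@example.com"],
                subject:("🖥️ Server Weekly — " + (now|strftime("%Y-%m-%d"))),
                html:$html}')"
}
```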
The Meta-Alert: healthchecks.io
There’s one failure mode none of the above covers: what if the server itself goes down? If the machine crashes, no cron runs, no emails send, and I notice nothing until I happen to SSH in. The solution is a dead-man’s switch hosted externally.
🔧 Developer section: healthchecks.io heartbeat
- Free account at healthchecks.io creates a unique URL — the "check"
- If the URL isn’t pinged within a set interval, healthchecks.io sends a failure alert
- Added to cron: every 30 minutes, `curl -s https://hc-ping.com/[uuid]` sends a heartbeat
- If the server goes offline for 30+ minutes, I get an email from healthchecks.io
- Cron: `*/30 * * * *`
Everything else I built monitors from the server. healthchecks.io monitors the server — from outside. It’s the only alert that can fire when the machine itself is unreachable. Without it, a power outage or crash is invisible until you notice the silence.
The Full Cron Schedule
Everything running on a single crontab:
🔧 Developer section: Complete cron schedule
- `* * * * *` — AI job queue worker (processes inbox/*.txt files)
- `*/5 * * * *` — site uptime monitor (joseandgoose.com)
- `*/10 * * * *` — resource sampler (CPU, memory → CSV)
- `*/30 * * * *` — healthchecks.io heartbeat
- `0 7 * * *` — Garmin recap generation
- `0 8 * * *` — Garmin failure check (alerts if no recap file)
- `0 8 * * 1-5` — daily market briefing email
- `0 19 * * *` — Fail2ban nightly ban report
- `0 7 * * 0` — Claude personal changelog generation
- `0 8 * * 0` — Supabase health + GitHub activity
- `0 9 * * 0` — weekly status report email
- `0 23 * * 6` — Lynis security audit (Saturday night)
- `0 8 * * 1` — disk snapshot
- `0 1 * * 0` — log archiver (weekly)
- `0 2 * * 0` — security apt upgrades + journal vacuum
- `0 2 1 * *` — apt autoremove/autoclean (monthly)
- `0 4 1 * *` — scheduled monthly reboot
Final Output
The server now manages itself. I never log in to check if things are running. I get emails when something is wrong, and I get a weekly report that tells me everything is fine. The no-email state is the good state.
- Site downtime → email within 5 minutes
- Garmin recap failure → email by 8am
- Claude API credits exhausted → email immediately on failure
- SSH login from outside LAN → email within 3 seconds
- Server offline → healthchecks.io alert within 30 minutes
- Everything working fine → one Sunday morning summary email
What went fast
- Each individual script — Claude Code wrote every bash script from a plain-English description. Uptime monitor: 15 minutes. Fail2ban report: 20 minutes. Each one is simple; the value comes from having all of them running together.
- Resend API reuse — same API key, same sender, same pattern for every email. Once the first alert email worked, every subsequent one took 5 minutes to wire up.
- Cron scheduling — `crontab -e`, paste a line, save. Linux cron is reliable and dead-simple for time-based jobs.
What needed patience
- Alert fatigue tuning — initial versions sent too many emails. Had to add cooldowns, state files (to remember last-alerted time), and delta calculations (ban count delta, not total). Getting the signal-to-noise ratio right took iteration.
- Cron environment — cron jobs run in a minimal shell environment without your normal PATH. Scripts that work fine in a terminal can silently fail in cron because `python3`, `node`, or `claude` can’t be found. Fix: explicitly set `PATH` at the top of every cron command or in the crontab header.
- Sunday job ordering — the weekly report at 9am depends on outputs from the 7am and 8am jobs. If any upstream job is slow, the 9am job reads an empty file. Added fallback messages ("data unavailable") for each section so the report always sends even if one input is missing.
- healthchecks.io setup — the concept clicked immediately; finding the right "grace period" setting (how long to wait before alerting) took some tuning. Too short (5 minutes) and a brief network hiccup triggers a false alarm. 35 minutes works well for a 30-minute heartbeat interval.
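Concretely, the cron-environment fix is a couple of lines at the top of the crontab. A sketch with placeholder paths and username:

```
# Top of crontab (edit with `crontab -e`). Give cron the PATH your login
# shell has so python3/node/claude resolve; paths and "you" are placeholders.
SHELL=/bin/bash
PATH=/usr/local/bin:/usr/bin:/bin:/home/you/.local/bin

# Example entry: redirect output to a log so failures leave a trace.
*/5 * * * * /home/you/scripts/uptime-check.sh >> /home/you/.system-reports/cron.log 2>&1
```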
The hardest part of a home server isn’t setting it up — it’s knowing what’s happening on it without babysitting it. This alert stack is the answer. Every meaningful event surfaces as an email. Everything else is silence, which is good news.