The Script
How CleverUptime sees your server — explained clearly for everyone who wants to understand Linux monitoring.
So you're curious about how CleverUptime works? That’s awesome. We’re glad you’re here.
We know that Linux and Bash can seem intimidating at first. All those strange symbols, commands, and file paths… It looks like code from a hacker movie!
Here’s a fun fact: We were Windows users once.
We spent years clicking through File Explorer, scared to open a terminal.
The first time we typed ls instead of dir, it felt like speaking a foreign language.
But then something changed. We needed to host a website. We broke a server. We lost data. And suddenly, we had no choice — we had to learn.
We read guides. We made mistakes. We rebooted more than we’d like to admit. But slowly, the pieces fell into place. And one day, it clicked: Linux isn’t scary — it’s empowering, because you're in charge!
That’s why we built CleverUptime. Not for experts. For people like us — and like you — who just want their server to work, and maybe learn something along the way.
Here’s the truth: It’s not magic. It’s just a conversation between you and your server — written in a language you can learn.
And the best part? You don’t need to be a sysadmin to understand this. We’ll explain everything — line by line, command by command — with links to our knowledge base so you can go deeper.
Ready? Let’s go on a journey through the CleverUptime monitoring script. By the end, you’ll not only understand it — you’ll trust it.
Before We Begin: What Is a Server?
A server is just a computer that runs 24/7 and hosts websites, apps, databases, or services. Unlike your laptop, it lives in a data center and talks to the internet all the time.
Most servers run Linux — a powerful, free operating system. Linux gives you full control. But with great power comes great responsibility.
And just like your car needs regular checkups, your server needs monitoring.
Why Monitoring Matters
Imagine your website suddenly goes down. No error. No warning. Your users can’t reach you. You only find out when someone tweets “your site is broken.”
Monitoring is like a doctor for your server. It checks the heartbeat (CPU), breathing (network), and temperature (disk) — and tells you when something’s wrong.
CleverUptime does that. But instead of magic, it uses a simple script — a list of commands that your server runs automatically.
What Is a Shell? What Is Bash?
When you log into a Linux server, you’re dropped into a shell — a text-based interface where you type commands.
The most common shell is Bash (Bourne Again SHell). It’s like the voice of your server — you ask it questions, it gives answers.
A Bash script is just a text file full of these commands. Instead of typing them one by one, you save them and run the whole list at once.
That’s what this script is: A conversation with your server, saved in a file.
The First Line: Telling Linux How to Run This
#!/bin/bash
This is called a shebang. It tells Linux: “Use the Bash shell to run this script.” Without it, the system has to guess which shell to use, and Bash-specific syntax further down might fail.
The Header: A Promise of Transparency
################################################################################
# CleverUptime #
# Monitoring Script v2.5 #
# 2026-06-11 #
# #
# https://cleveruptime.com #
# #
# This script collects data from your server so you can monitor it in #
# CleverUptime. #
# #
# WARNING: NEVER RUN A SCRIPT YOU DON'T UNDERSTAND. #
# That's why we explain EVERY LINE below — even if you're new to Linux. #
# #
# You can: #
# - Read every command and learn what it does #
# - Comment out lines you don't want to send #
# - Modify values (like server name) #
# - Learn how real server monitoring works #
# #
# Nothing is sent without your permission. #
# No hidden code. No surprises. #
# #
# Want to learn more about this script? #
# Visit: https://cleveruptime.com/the-script #
# We'll explain every single line of code with links to documentation. #
################################################################################
This is a comment block — ignored by the computer, but crucial for humans. It introduces the script, its purpose, and our promise: transparency.
We’re not hiding anything. We’re teaching you how it works — because you should never run code you don’t understand.
Section: Configuration — Your Server’s Identity
The script starts by setting up some values — like filling out a form.
# This is your private key. It tells CleverUptime: "This data is from me."
# It's like a password for your server. Never share it.
UPLOAD_KEY="your-upload-key"
UPLOAD_KEY is your secret password.
It proves to CleverUptime that the data is coming from you.
If someone else used your key, they could fake data — so keep it safe.
You can generate one at your account page.
# This script version helps us make sure everything works correctly.
# Please do not change it.
SCRIPT_VERSION="2.4"
We use SCRIPT_VERSION to make sure your script matches what our servers expect.
If you change it, things might break — so we ask you not to.
# This is where the data is sent. Do not change.
URL="https://upload.cleveruptime.net/"
This is the URL where your data goes. Like mailing a letter to a specific address.
# How often to send data (in seconds)
# - Free plan: every 300 seconds (5 minutes)
# - Pro/Business: every 60 seconds (1 minute)
# DO NOT SET BELOW 60! Otherwise your account may be blocked.
SLEEP_INTERVAL="60"
This controls how often data is sent. Too fast, and you might overload your server or get blocked. That’s why we say: don’t set it below 60 seconds.
# Optional: Give your server a custom ID (e.g. "web01")
SERVER_ID="web01"
# Optional: A friendly name (e.g. "Main Web Server")
SERVER_NAME="Main Web Server"
# Optional: Add a description
DESCRIPTION="Primary production web server"
# Optional: Group servers (e.g. "blog", "api")
PROJECT="blog"
# Optional: Add tags to filter later (comma-separated)
TAGS="web,prod"
# Optional: Add location (helps with monitoring)
# If not set, we'll guess from your IP.
CONTINENT="Europe"
COUNTRY="Germany"
ZONE="eu-central"
CITY="Munich"
HOSTER="Hetzner"
DATACENTER="FSN1"
RACK="R12"
These are optional labels to help you organize your servers in the dashboard.
You can leave them as-is or customize them.
For example, set SERVER_NAME="My Blog Server".
Section: Safety Checks — Making Sure We Can Run
Before doing anything, the script checks if it can run safely.
# Check if 'gzip' is installed
# gzip compresses the data so it uploads faster
if ! command -v gzip &> /dev/null; then
echo "ERROR: gzip is not installed." >&2
echo "It's used to compress data before sending." >&2
echo "Install it with: sudo apt install gzip" >&2
exit 1
fi
This checks for gzip — a tool that compresses data (like zipping a file).
If it’s missing, the script tells you how to install it.
The command -v command checks if a program exists.
&> /dev/null sends both the normal output and any error messages to /dev/null, so nothing is printed. The if only cares whether the command succeeded.
# Check if 'curl' is installed
# curl sends the data to CleverUptime
if ! command -v curl &> /dev/null; then
echo "ERROR: curl is not installed." >&2
echo "It's used to send data to the internet." >&2
echo "Install it with: sudo apt install curl" >&2
exit 1
fi
Same thing for curl — the tool that sends data to CleverUptime. Without it, the script can’t communicate with our servers.
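Both checks follow the same pattern, so here is a minimal sketch of how they could be folded into one helper. The function name require_tool is hypothetical and not part of the CleverUptime script, and the sketch uses >/dev/null 2>&1, the portable spelling of &> /dev/null:

```shell
# Hypothetical helper (not in the original script): returns 0 if the
# named program can be found, 1 otherwise.
require_tool() {
    tool="$1"
    # command -v prints the program's path on success. We discard that
    # path and any error text, and only use the exit status.
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "ERROR: $tool is not installed." >&2
        return 1
    fi
    return 0
}

require_tool sh && echo "sh is available"
```

In a script you would typically write require_tool gzip || exit 1, which stops the script as soon as a tool is missing.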
# Check if script is run as root (administrator)
# Many system files can only be read by root
if [ "$EUID" -ne 0 ]; then
echo "ERROR: This script needs root permissions." >&2
echo "It reads system files like /proc/meminfo and /etc/passwd." >&2
echo "Run it with: sudo ./cleveruptime.sh" >&2
exit 1
fi
This checks if you're running as root (administrator).
Why? Because accessing certain system data — like disk health with the smartctl command —
requires elevated privileges.
That’s why you need to run the script with sudo — "do as super user".
Many of the commands would still work without superuser permissions, but you’d miss out on important insights, like SMART data, detailed network stats, or full process lists. That is why the script stops here instead of running with partial data.
Section: Server Identity — Who Are You?
CleverUptime needs to recognize your server, even if its IP changes.
# Read the server's hostname (e.g. "my-server")
HOSTNAME=$(cat /proc/sys/kernel/hostname)
This reads your server’s name using the cat command from the file /proc/sys/kernel/hostname.
This file is created by the Linux kernel and contains your server’s hostname.
# Read hardware ID (only on physical servers)
# If not available (e.g. in Docker), we'll use "n/a"
PRODUCT_UUID=$(cat /sys/class/dmi/id/product_uuid 2>/dev/null || echo "n/a")
This reads the hardware ID of a physical server from /sys/class/dmi/id/product_uuid.
If the file doesn’t exist or can’t be read (e.g. in a Docker container), the script uses “n/a” instead.
The 2>/dev/null hides errors.
The || means “if that fails, do this instead.”
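To see the two pieces working together, here is a small sketch of the same "read it, or fall back to n/a" idea wrapped in a function. The name read_or_na and the non-existent path are made up for this example:

```shell
# Hypothetical helper: print a file's contents, or "n/a" if it cannot be read.
read_or_na() {
    # 2>/dev/null hides cat's error message; || runs echo only if cat failed.
    cat "$1" 2>/dev/null || echo "n/a"
}

MACHINE_ID=$(read_or_na /etc/machine-id)
MISSING=$(read_or_na /this/path/does/not/exist)

echo "machine id:   ${MACHINE_ID}"
echo "missing file: ${MISSING}"
```

The second call shows the fallback: the file is not there, cat fails quietly, and the variable ends up holding n/a.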
# Read system ID (unique to this Linux install)
# Found in /etc/machine-id
MACHINE_ID=$(cat /etc/machine-id 2>/dev/null || echo "n/a")
This reads the system ID from /etc/machine-id — a unique ID assigned when Linux is installed.
It’s stable across reboots.
Section: The Main Loop — Continuous Monitoring
Now we enter the heart of monitoring:
while true; do
This means: do this forever. The script will keep running, checking your server once per interval (every 60 seconds with the SLEEP_INTERVAL shown above). You can stop it anytime with Ctrl + C.
# Record when this cycle started (for performance tracking)
startTime=$(date +"%s%3N")
This records the current time in milliseconds. We use it to measure how long the script takes to run. The date command formats the time.
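Here is a rough sketch of how such a start time can be turned into a duration. The variable names endTime and elapsedMs are assumptions for this example, and the sleep is just a stand-in for the real work. Note that %3N needs GNU date, which is standard on Linux:

```shell
# %s  = seconds since 1970-01-01 (the Unix epoch)
# %3N = first three digits of the nanoseconds, i.e. milliseconds
startTime=$(date +"%s%3N")

sleep 0.2                      # stand-in for the real data collection

endTime=$(date +"%s%3N")
elapsedMs=$((endTime - startTime))

echo "Cycle took ${elapsedMs} ms"
```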
# Show a message in the terminal
echo -n "$(date -u +'%Y-%m-%d %H:%M:%S') Running CleverUptime..."
This prints a message in the terminal.
echo displays text.
The -n flag means “don’t add a newline.”
It keeps the HTTP status on the same line.
Section: Command Structure — How We Capture and Send Data
Now we come to one of the most powerful — and confusing — parts of the script:
status=$(
{
# ... many commands ...
} 2>/dev/null | gzip --fast | curl -X POST ...
)
Let’s break this down piece by piece. Even experienced admins can get lost here — so don’t worry if it looks strange.
What Does status=$(...) Mean?
The $() is called command substitution.
It means: “Run everything inside the parentheses, and return the output.”
Here, we’re capturing the HTTP status code from curl and storing it in the status variable.
What Does { ... } Do?
The curly braces { } create a command group.
They group all the commands inside so their combined output can be processed together — like putting all your server’s data into a single box.
Think of it like this:
- You run 20 echo commands
- Instead of printing them separately, you wrap them in { }
- Now you can compress them, send them, or save them as one unit
This is how CleverUptime collects data efficiently.
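A tiny sketch makes the grouping visible. Three separate echo commands are wrapped in { }, and the pipe after the closing brace receives all of their output as one stream. Here wc -l simply counts the lines, standing in for gzip and curl:

```shell
# Three commands, one combined output stream.
lineCount=$(
    {
        echo "first"
        echo "second"
        echo "third"
    } | wc -l
)

echo "The group produced ${lineCount} lines"
```

Without the braces, the pipe would only apply to the last echo, and wc would count a single line.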
Section: Setup — Language and Delimiters
Before collecting data, we set two important values:
# Force all commands to use English output
# Why? So we can parse the output correctly later
export LC_ALL=C
This forces all commands to use English (C locale) output.
Why? Because if your server uses German, French, or Chinese, the output of commands like df or ps might be translated —
and our system wouldn’t be able to parse it correctly.
LC_ALL=C ensures consistency.
# This marker helps CleverUptime split the output into sections
# Example: "***CleverUptime_proc:cpuinfo" marks the start of CPU data
MARKER="***CleverUptime_"
This is a special delimiter — a unique string that marks the start of each data section. For example:
***CleverUptime_proc:loadavg
1.20 0.85 0.67
On our servers, we split the incoming data at each ***CleverUptime_ to extract individual metrics.
It’s like labeling folders in a filing cabinet — so we know which data is which.
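To get a feel for how marker-based splitting can work, here is a minimal sketch using awk. It only illustrates the idea and is not CleverUptime's server-side parser. The function name extract_section and the sample uptime values are assumptions for this example:

```shell
MARKER="***CleverUptime_"

# Hypothetical helper: read a payload on stdin and print only the body
# of the section whose key is given as the first argument.
extract_section() {
    awk -v start="${MARKER}$1" -v marker="${MARKER}" '
        $0 == start            { inside = 1; next }   # wanted section begins
        index($0, marker) == 1 { inside = 0 }         # any other marker ends it
        inside                 { print }
    '
}

# A small fake payload with two sections.
payload=$(
    echo "${MARKER}proc:loadavg"; echo "1.20 0.85 0.67"
    echo "${MARKER}proc:uptime";  echo "12345.67 54321.00"
)

loadavg=$(printf '%s\n' "$payload" | extract_section "proc:loadavg")
echo "loadavg section: ${loadavg}"
```

Because every section starts with the same unusual prefix, the parser never needs to know in advance how many lines a section contains.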
Section: Sending Data — The Echo Commands
Now we start sending data using echo — a command that prints text. Each line follows this pattern:
echo "${MARKER}category:key"; echo "value"
The first echo sends the marker and key.
The second sends the actual data.
This structure makes parsing reliable.
echo "${MARKER}general:uploadKey"; echo "${UPLOAD_KEY}"
Sends your private key so we know which account this data belongs to. This variable was set at the top of the script.
Advanced users might wonder: Why not send the key in an HTTP header or as a curl parameter?
Because those could appear in the process list (ps aux) — visible to other users on the server.
By embedding the key inside the command block and sending it in the request body, we ensure it stays hidden — even from other users with shell access. It’s never logged, never shared — used only for secure authentication.
echo "${MARKER}general:scriptVersion"; echo "${SCRIPT_VERSION}"
Tells us which version of the script you’re running. Helps us keep everything compatible.
echo "${MARKER}general:hostname"; echo "${HOSTNAME}"
Sends your server’s name (e.g. web01.example.com).
Remember: we read it earlier from /proc/sys/kernel/hostname.
echo "${MARKER}general:productUuid"; echo "${PRODUCT_UUID}"
Sends the hardware ID (if available).
Useful for identifying physical servers.
From /sys/class/dmi/id/product_uuid.
echo "${MARKER}general:machineId"; echo "${MACHINE_ID}"
Sends the system ID — unique to this Linux installation.
From /etc/machine-id.
Next come the optional metadata lines. These are all the labels you can set at the top of the script — they’re sent the same way, and every one is optional. Leave them blank and we simply skip them; fill them in and they help you slice and filter your servers in the dashboard:
echo "${MARKER}general:serverId"; echo "${SERVER_ID}"
echo "${MARKER}general:serverName"; echo "${SERVER_NAME}"
echo "${MARKER}general:description"; echo "${DESCRIPTION}"
echo "${MARKER}general:project"; echo "${PROJECT}"
echo "${MARKER}general:tags"; echo "${TAGS}"
echo "${MARKER}general:continent"; echo "${CONTINENT}"
echo "${MARKER}general:country"; echo "${COUNTRY}"
echo "${MARKER}general:zone"; echo "${ZONE}"
echo "${MARKER}general:city"; echo "${CITY}"
echo "${MARKER}general:hoster"; echo "${HOSTER}"
echo "${MARKER}general:datacenter"; echo "${DATACENTER}"
echo "${MARKER}general:rack"; echo "${RACK}"
Same pattern throughout: a marker line naming the field, then the value. Location fields (continent, country, city, hoster…) are handy if you run servers in more than one place — and if you leave them blank, we make a best guess from your IP.
# === Time & Timezone ===
# We record start time to measure performance and calculate metrics like Bytes/sec
echo "${MARKER}time:start"; timeout 5 date -u +"%Y-%m-%dT%H:%M:%S.%3NZ"
echo "${MARKER}time:timezone"; timeout 5 date +"%Z"
Records the exact time data collection started.
Used to measure script execution time and calculate metrics like network throughput per second.
date outputs in ISO 8601 format.
We use timeout 5 as a safety measure: if the command takes longer than 5 seconds, it’s canceled. This prevents our script from adding extra load to a server that’s already struggling.
The second line sends the server’s current timezone (e.g. “UTC”, “CET”, “PST”).
It’s best practice to run your server in UTC (Coordinated Universal Time). Why? Because timezones with daylight saving time (like CET or EST) can cause real problems:
- Cron jobs may run twice or skip when clocks "fall back" or "spring forward".
- Logs become ambiguous — did that error happen at 2:30 AM before or after the switch?
- Database or monitoring gaps and duplicates can appear in time-series data during the transition.
UTC avoids all of this. It has no daylight saving changes — making it stable, predictable, and safe.
# === /proc — Performance Data ===
# These files contain live system metrics
# They are updated regularly by the Linux kernel
echo "${MARKER}proc:version"; timeout 5 cat /proc/version
echo "${MARKER}proc:uptime"; timeout 5 cat /proc/uptime
echo "${MARKER}proc:cpuinfo"; timeout 5 cat /proc/cpuinfo
echo "${MARKER}proc:stat"; timeout 5 cat /proc/stat
echo "${MARKER}proc:loadavg"; timeout 5 cat /proc/loadavg
echo "${MARKER}proc:meminfo"; timeout 5 cat /proc/meminfo
echo "${MARKER}proc:sys:vm:swappiness"; timeout 5 cat /proc/sys/vm/swappiness
echo "${MARKER}proc:partitions"; timeout 5 cat /proc/partitions
echo "${MARKER}proc:diskstats"; timeout 5 cat /proc/diskstats
echo "${MARKER}proc:mdstat"; timeout 5 cat /proc/mdstat
echo "${MARKER}proc:swaps"; timeout 5 cat /proc/swaps
echo "${MARKER}proc:mounts"; timeout 5 cat /proc/mounts
echo "${MARKER}proc:net:dev"; timeout 5 cat /proc/net/dev
echo "${MARKER}proc:pressure:cpu"; timeout 5 cat /proc/pressure/cpu
echo "${MARKER}proc:pressure:io"; timeout 5 cat /proc/pressure/io
echo "${MARKER}proc:pressure:memory"; timeout 5 cat /proc/pressure/memory
echo "${MARKER}proc:vmstat"; timeout 5 cat /proc/vmstat
echo "${MARKER}proc:net:snmp"; timeout 5 cat /proc/net/snmp
echo "${MARKER}proc:net:netstat"; timeout 5 cat /proc/net/netstat
echo "${MARKER}proc:sys:fs:file-nr"; timeout 5 cat /proc/sys/fs/file-nr
echo "${MARKER}proc:sys:net:netfilter:nf_conntrack_count"; timeout 5 cat /proc/sys/net/netfilter/nf_conntrack_count
echo "${MARKER}proc:sys:net:netfilter:nf_conntrack_max"; timeout 5 cat /proc/sys/net/netfilter/nf_conntrack_max
echo "${MARKER}proc:net:sockstat"; timeout 5 cat /proc/net/sockstat
echo "${MARKER}proc:net:sockstat6"; timeout 5 cat /proc/net/sockstat6
echo "${MARKER}proc:sys:kernel:pid-max"; timeout 5 cat /proc/sys/kernel/pid_max
echo "${MARKER}proc:sys:kernel:entropy-avail"; timeout 5 cat /proc/sys/kernel/random/entropy_avail
That’s the full /proc set — live system metrics the Linux kernel keeps continuously up to date. A few highlights:
- /proc/loadavg: CPU load over the last 1, 5 and 15 minutes (high load with low CPU usage often means I/O wait).
- /proc/meminfo: total RAM, free memory, buffers and cache.
- /proc/stat: CPU time split across user, system and idle.
- /proc/diskstats: read/write activity and wait time for every disk.
- /proc/pressure/cpu: how often processes wait for CPU (a modern PSI metric).
Every line is read with cat (which prints a file’s contents), wrapped in timeout 5 so a slow read can never stall the script.
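If you'd like to see the timeout guard in action, this short sketch runs one command that is too slow and one that finishes in time. With GNU timeout, exit status 124 means the command had to be stopped. The variable names are made up for this example:

```shell
# A command that would take 5 seconds, but is only allowed 1.
timeout 1 sleep 5
slowStatus=$?

# A command that finishes well within its 5-second limit.
timeout 5 echo "quick" > /dev/null
fastStatus=$?

echo "slow command exit status: ${slowStatus}"   # 124 = stopped by timeout
echo "fast command exit status: ${fastStatus}"   # 0   = finished in time
```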
Section: Hardware Sensors — Temperatures, Disks, Memory, Network
Next we read hardware and kernel state straight from /sys — no extra tools required. This covers the server’s identity (model, serial), live temperatures and fan speeds, CPU frequency and power, ECC memory errors (an early warning for failing RAM), and each network interface’s link state:
# === /sys/class/dmi — Hardware Info ===
# Only available on physical servers
# Not present in Docker or some VMs
echo "${MARKER}dmi:id:boardSerial"; timeout 5 cat /sys/class/dmi/id/board_serial
echo "${MARKER}dmi:id:chassisSerial"; timeout 5 cat /sys/class/dmi/id/chassis_serial
echo "${MARKER}dmi:id:chassisVendor"; timeout 5 cat /sys/class/dmi/id/chassis_vendor
echo "${MARKER}dmi:id:productFamily"; timeout 5 cat /sys/class/dmi/id/product_family
echo "${MARKER}dmi:id:productName"; timeout 5 cat /sys/class/dmi/id/product_name
echo "${MARKER}dmi:id:sysVendor"; timeout 5 cat /sys/class/dmi/id/sys_vendor
# === /sys/class/hwmon — Temperatures, Fans (straight from the kernel) ===
# The Linux kernel exposes hardware sensors as plain files here.
# No extra tools needed (no lm-sensors) — we just read the files.
# Temperatures are in milli-°C (86750 means 86.75 °C), fans in RPM.
echo "${MARKER}sys:hwmon:name"; timeout 5 grep -H . /sys/class/hwmon/hwmon*/name
echo "${MARKER}sys:hwmon:tempInput"; timeout 5 grep -H . /sys/class/hwmon/hwmon*/temp*_input
echo "${MARKER}sys:hwmon:tempLabel"; timeout 5 grep -H . /sys/class/hwmon/hwmon*/temp*_label
echo "${MARKER}sys:hwmon:fanInput"; timeout 5 grep -H . /sys/class/hwmon/hwmon*/fan*_input
echo "${MARKER}sys:hwmon:fanLabel"; timeout 5 grep -H . /sys/class/hwmon/hwmon*/fan*_label
echo "${MARKER}sys:hwmon:powerInput"; timeout 5 grep -H . /sys/class/hwmon/hwmon*/power*_input
echo "${MARKER}sys:hwmon:currInput"; timeout 5 grep -H . /sys/class/hwmon/hwmon*/curr*_input
# === CPU frequency, throttling & power (kernel sysfs, no tools) ===
# scaling_cur_freq/cpuinfo_max_freq in kHz; energy_uj is a running
# microjoule counter (two samples give watts). Server does the math.
echo "${MARKER}sys:cpufreq:cur"; timeout 5 grep -H . /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq
echo "${MARKER}sys:cpufreq:max"; timeout 5 grep -H . /sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_max_freq
echo "${MARKER}sys:cpufreq:governor"; timeout 5 grep -H . /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
echo "${MARKER}sys:throttle"; timeout 5 grep -H . /sys/devices/system/cpu/cpu*/thermal_throttle/*_count
echo "${MARKER}sys:powercap"; timeout 5 grep -H . /sys/class/powercap/*/name /sys/class/powercap/*/energy_uj
# === ECC memory errors (EDAC) — predicts failing RAM ===
# ce = correctable, ue = uncorrectable. Per controller (mc*) and per DIMM.
echo "${MARKER}sys:edac:ce"; timeout 5 grep -H . /sys/devices/system/edac/mc/mc*/ce_count
echo "${MARKER}sys:edac:ue"; timeout 5 grep -H . /sys/devices/system/edac/mc/mc*/ue_count
echo "${MARKER}sys:edac:dimmLabel"; timeout 5 grep -H . /sys/devices/system/edac/mc/mc*/dimm*/dimm_label
echo "${MARKER}sys:edac:dimmCe"; timeout 5 grep -H . /sys/devices/system/edac/mc/mc*/dimm*/dimm_ce_count
echo "${MARKER}sys:edac:dimmUe"; timeout 5 grep -H . /sys/devices/system/edac/mc/mc*/dimm*/dimm_ue_count
# === btrfs allocation — unallocated space (the real ENOSPC signal) ===
# df can show terabytes free while btrfs is wedged: when there is no
# unallocated space left to carve a new metadata chunk, btrfs flips the
# filesystem read-only. The truth lives in chunk allocation, not df.
# allocation/ → per-profile total_bytes/disk_total/bytes_used;
# devices/ → which devices back each filesystem (sizes come from
# /proc/partitions above). Unallocated = device size − allocated.
echo "${MARKER}sys:btrfs:allocation"; timeout 5 grep -rH . /sys/fs/btrfs/*/allocation/ 2>/dev/null
echo "${MARKER}sys:btrfs:devices"; timeout 5 grep -H . /sys/fs/btrfs/*/devices/*/size 2>/dev/null
# === Network interfaces — link state ===
# operstate up/down, carrier 1/0 (flapping), mtu.
# NOTE: /sys/class/net/*/speed is deliberately NOT read — it can block
# for seconds on virtual/down NICs; fetch via ethtool on-demand instead.
echo "${MARKER}sys:net:operstate"; timeout 5 grep -H . /sys/class/net/*/operstate
echo "${MARKER}sys:net:carrier"; timeout 5 grep -H . /sys/class/net/*/carrier
echo "${MARKER}sys:net:mtu"; timeout 5 grep -H . /sys/class/net/*/mtu
Most of these only exist on physical servers — in a container or VM they simply come back empty, and that’s fine.
We deliberately skip /sys/class/net/*/speed: reading it can hang for seconds on some NICs, so we fetch link speed on demand instead.
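The hwmon temperature files hold whole numbers in milli-degrees Celsius, so reading them by hand takes a small conversion. Here is a sketch using only integer arithmetic. The function name millic_to_celsius is hypothetical, and the sketch assumes a non-negative reading:

```shell
# Hypothetical helper: convert milli-degrees Celsius to degrees with
# two decimals. Assumes the value is zero or positive.
millic_to_celsius() {
    raw="$1"
    # Whole degrees, then the remainder reduced to two decimal digits.
    printf '%d.%02d\n' "$((raw / 1000))" "$(( (raw % 1000) / 10 ))"
}

millic_to_celsius 86750    # the example value from the script's comment
```

On a physical server you could feed it a real reading, for example millic_to_celsius "$(cat /sys/class/hwmon/hwmon0/temp1_input)", assuming that file exists on your hardware.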
Section: System Config — Users and OS
A few /etc files tell us about your operating system and accounts:
# === /etc — System Config Files ===
# These help us understand your OS and users
# Note: Passwords are NOT stored in /etc/passwd — they're in /etc/shadow
echo "${MARKER}etc:passwd"; timeout 5 cat /etc/passwd
echo "${MARKER}etc:group"; timeout 5 cat /etc/group
echo "${MARKER}etc:osRelease"; timeout 5 cat /etc/os-release
echo "${MARKER}etc:resolv.conf"; timeout 5 cat /etc/resolv.conf
/etc/passwd lists user accounts — and note that passwords are not there; they live in /etc/shadow, which we never read.
We also read group, os-release (your distro and version), and resolv.conf (DNS).
Section: Commands — Processes, Network, Packages
Last in the data block, a handful of standard commands. You can comment out any line you’d rather not send:
# === Commands — Network & System ===
# You can comment out any line if you don't want to send that data
echo "${MARKER}command:who"; timeout 5 who
echo "${MARKER}command:ip:address"; timeout 5 ip address
echo "${MARKER}command:ss"; timeout 5 ss -lntup
echo "${MARKER}command:df"; timeout 5 df -l --output
echo "${MARKER}command:df:inodes"; timeout 5 df -li
# ZFS pool capacity — CoW pools degrade past ~80% regardless of size.
# 'cap' is the percentage we band on; -Hp gives exact bytes, no header.
echo "${MARKER}command:zpool"; timeout 5 zpool list -Hp -o name,size,alloc,free,cap,frag,health
# LVM thin pools — a thin pool can exhaust (data OR metadata) while the
# filesystem df looks half-empty, then errors/goes read-only. Only the
# device-mapper status exposes the fill level; emits nothing without pools.
echo "${MARKER}command:dmsetup:thin"; timeout 5 dmsetup status --target thin-pool
echo "${MARKER}command:apt"; timeout 5 apt list --installed
echo "${MARKER}command:wg"; timeout 5 wg
The highlights: ps aux lists every running process; ss -lntup shows listening sockets — which ports are open and which program owns each; df reports disk usage.
Alongside them we read logged-in users (who), interface addresses (ip address), inode usage (df -li), ZFS pool capacity (zpool), LVM thin-pool fill level (dmsetup), installed packages (apt), and WireGuard status (wg).
# === SMART Data — Disk Health ===
# Reads health info from hard drives and SSDs
# Requires 'smartmontools' to be installed
# Only runs if the disk exists
for d in /dev/sd[a-z] /dev/nvme[0-9]n1; do
if [ -e "${d}" ]; then
echo "${MARKER}command:smartctl:${d}"; timeout 10 /usr/sbin/smartctl -a "${d}"; echo "exit status: $?"
fi
done
This block checks the health of your server’s hard drives and SSDs using
smartctl, a tool from the smartmontools package.
Let's break it down in more detail:
Line 1: The for Loop — Scanning for Devices
for d in /dev/sd[a-z] /dev/nvme[0-9]n1; do
This starts a for loop — a way to repeat an action for multiple items.
It looks for devices matching two common naming patterns:
- /dev/sd[a-z]: Traditional hard drives and SSDs (e.g. /dev/sda, /dev/sdb)
- /dev/nvme[0-9]n1: NVMe SSDs (e.g. /dev/nvme0n1)
The loop assigns each matching device name to the variable d, then runs the commands inside.
Line 2: Check If Device Exists
if [ -e "${d}" ]; then
This checks if the device file actually exists on the system using the test command ([ is shorthand for test).
-e means “does this file or device exist?”
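This prevents errors when a pattern matches nothing. In that case Bash leaves the pattern unchanged (for example the literal text /dev/nvme[0-9]n1), and without this check smartctl would be run on a device that doesn’t exist.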
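You can watch this happen safely in an empty temporary directory. The sketch below uses the pattern ./sd[a-z] instead of /dev/sd[a-z], so it never touches real devices. The variable names workdir and seen are assumptions for this example:

```shell
# Work in an empty directory so the pattern cannot match anything.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

seen=""
for d in ./sd[a-z]; do
    seen="$d"                       # holds the unexpanded pattern itself
    if [ -e "${d}" ]; then
        echo "found device: ${d}"
    else
        echo "skipped: ${d} does not exist"
    fi
done

# Clean up the temporary directory.
cd / && rmdir "$workdir"
```

The loop body runs exactly once, with d set to the pattern text, and the -e test quietly skips it.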
Line 3: Run smartctl and Capture Output
echo "${MARKER}command:smartctl:${d}"; timeout 10 /usr/sbin/smartctl -a "${d}"; echo "exit status: $?"
This does three things:
- Marker: echo "${MARKER}..." labels the output so CleverUptime knows it’s SMART data for this device.
- Health Check: smartctl -a "${d}" prints all the SMART (Self-Monitoring, Analysis, and Reporting Technology) information the drive has recorded. It reports:
  - Drive temperature
  - Power-on hours
  - Bad sectors
  - Reallocated sectors
  - Wear level (for SSDs)
  - Predicted failure status
- Exit Status: echo "exit status: $?" prints the command’s exit code. $? holds the result of the last command: 0 = success, anything else = error. This helps us detect if smartctl failed (e.g. no permission, unsupported drive).
Line 4: End the if Block
fi
fi is if spelled backwards — it closes the if statement.
Line 5: End the for Loop
done
This marks the end of the loop. The script goes back and processes the next device — until all are checked.
Why This Matters
Disk failures are one of the most common causes of server outages. SMART data helps predict failures before they happen.
But not all servers have smartctl installed.
That’s why we:
- Use timeout 10, in case the drive is unresponsive
- Check if the device exists, to avoid errors
- Capture the exit status, to diagnose issues later
This is defensive scripting: we gather useful data when possible — but never at the cost of your server’s stability.
Section: Application Checks — Your Services
If you run common services, the script peeks at those too — all optional, and harmless if a service isn’t installed (the command simply fails and we move on):
# === Application Checks ===
# Optional: Monitor services you use
# Comment out if not installed
echo "${MARKER}docker:ps"; timeout 5 docker ps -a
echo "${MARKER}apache:serverNames"; timeout 5 grep -h -E 'ServerAlias|ServerName' /etc/apache2/sites-enabled/*
echo "${MARKER}mysql:processlist"; timeout 5 mysql -u root -N -e "SET STATEMENT MAX_STATEMENT_TIME = 5 FOR SHOW PROCESSLIST"
echo "${MARKER}mysql:databases"; timeout 5 mysql -u root -N -e "SET STATEMENT MAX_STATEMENT_TIME = 5 FOR SHOW DATABASES"
echo "${MARKER}mysql:globalStatus"; timeout 5 mysql -u root -N -e "SET STATEMENT MAX_STATEMENT_TIME = 5 FOR SELECT * FROM information_schema.GLOBAL_STATUS WHERE VARIABLE_NAME IN ('COM_SELECT','COM_INSERT','QUERIES','ROWS_READ')"
echo "${MARKER}mysql:tables"; timeout 5 mysql -u root -N -e "SET STATEMENT MAX_STATEMENT_TIME = 5 FOR SELECT * FROM information_schema.TABLES WHERE TABLE_SCHEMA NOT IN ('mysql','information_schema','performance_schema')"
echo "${MARKER}mysql:partitions"; timeout 5 mysql -u root -N -e "SET STATEMENT MAX_STATEMENT_TIME = 5 FOR SELECT * FROM information_schema.PARTITIONS WHERE PARTITION_NAME IS NOT NULL"
echo "${MARKER}ceph:status"; timeout 5 ceph status
Docker containers, Apache virtual hosts, a quick MySQL/MariaDB snapshot (process list, databases, a few global counters), and Ceph status — each wrapped in timeout 5. If the tool isn’t there, nothing is sent for it.
Section: Your Own Data — Custom & Conditional
The data block ends with examples you can copy: add your own one-line outputs, or collect something only on specific servers. Then it stamps the end time:
# === Custom Data (Example) ===
# You can add your own commands here
echo "${MARKER}custom:myOwnData"; timeout 5 echo "You can add your own data here."
echo "${MARKER}custom:someOtherData"; timeout 5 echo "Just return a single line of text."
# === Conditional Data (Example) ===
# Only collect data on specific servers
if [[ "${HOSTNAME}" =~ "myHostName" ]]; then
echo "${MARKER}custom:myHostData"; timeout 5 echo "Only collected on myHostName."
fi
# === End timestamp ===
# Marks the end of this cycle
echo "${MARKER}time:end"; timeout 5 date -u +"%Y-%m-%dT%H:%M:%S.%3NZ"
Anything that prints a single line of text can become a metric — so you can extend monitoring to whatever matters for your setup, without touching anything else.
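As a sketch of what a custom metric might look like, the example below counts the entries in /tmp. The key name custom:tmpFileCount and the wrapper function custom_metrics are made up for this example. One detail worth knowing: timeout runs a single program, not a pipeline, so a command containing | is wrapped in sh -c '...':

```shell
MARKER="***CleverUptime_"

# Hypothetical custom metric: how many entries are in /tmp right now.
custom_metrics() {
    echo "${MARKER}custom:tmpFileCount"; timeout 5 sh -c 'ls /tmp | wc -l'
}

# Capture and show what would be added to the payload.
output=$(custom_metrics)
printf '%s\n' "$output"
```

In the real script you would place just the echo line inside the data block, next to the other custom examples.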
Section: Sending Data — The Pipeline Explained
Now comes the most powerful part of the script: the pipeline. This is where all the collected data is compressed and sent securely to CleverUptime.
Here’s the full chain:
} 2>/dev/null \
| gzip --fast \
| curl -X POST \
--silent \
--fail \
--max-time 5 \
-H "Content-Encoding: gzip" \
--output /dev/null \
--write-out "%{http_code}" \
--data-binary @- \
"${URL}"
This is a pipeline — a Unix concept where the output of one command becomes the input of the next, connected by | (pipes).
It’s like an assembly line: data flows from left to right, transformed at each step.
Step 1: Close the Command Block and Capture Output
}
This closes the { ... } command group we opened earlier.
Everything inside this block — all the echo commands, cat calls, and data — is now ready to be processed as a single stream.
Step 2: Suppress Error Messages
2>/dev/null
In Unix, 2 refers to stderr — the error output stream.
/dev/null is a special "black hole" device that discards anything written to it.
So 2>/dev/null means:
“If any command inside the block fails or prints an error, don’t show it.”
Why? Because we don’t want error messages (like “permission denied” or “file not found”) to mix with the actual data — that would corrupt the payload.
This keeps the output clean and parseable.
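Here is a small sketch of that effect. One command inside the group fails on purpose, yet only the real data ends up in the captured output. The file path is deliberately one that does not exist:

```shell
payload=$(
    {
        echo "good data"
        cat /this/file/does/not/exist     # fails and writes to stderr
        echo "more good data"
    } 2>/dev/null
)

printf '%s\n' "$payload"
```

The error message from cat is thrown away, and the commands after it still run.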
Step 3: Compress the Data
gzip --fast
This compresses the data using gzip with the --fast option.
It reduces the size of the data — often by 90% — so it uploads faster and uses less bandwidth.
We use --fast instead of maximum compression because we prioritize speed and low CPU usage — especially on busy servers.
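If you want to try this yourself, the sketch below compresses some repetitive sample text, compares the sizes, and unpacks it again. The sample line and the variable names are assumptions for this example, and real savings depend on your data:

```shell
# 200 identical lines as stand-in data (real metrics are less repetitive).
original=$(yes "cpu 100 200 300 idle 400" | head -n 200)

originalSize=$(printf '%s\n' "$original" | wc -c)
fastSize=$(printf '%s\n' "$original" | gzip --fast | wc -c)

# Compress and decompress again to confirm nothing is lost.
roundTrip=$(printf '%s\n' "$original" | gzip --fast | gzip -d)

echo "original: ${originalSize} bytes, gzip --fast: ${fastSize} bytes"
```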
Step 4: Send the Data with curl
curl -X POST \
curl is a command-line tool for transferring data over the internet. Here, we use it to send an HTTP POST request to CleverUptime’s server.
-X POST explicitly sets the HTTP method. curl would already use POST here because of --data-binary, so the flag mainly makes the intent obvious.
Step 5: Run Silently
--silent
This tells curl not to show the progress bar or transfer stats.
Without it, you’d see a lot of noise in the terminal — not what we want in a monitoring script.
Step 6: Fail on HTTP Errors
--fail
Normally, curl exits with status 0 even if the server returns 404 or 500.
--fail changes that: if the HTTP response is an error (4xx or 5xx), curl will exit with a non-zero status.
This allows us to detect and handle upload failures.
Step 7: Set a Time Limit
--max-time 5
This tells curl to give up if the request takes longer than 5 seconds.
Without it, curl might hang forever on a slow or unresponsive connection — blocking the entire script.
This is another safety measure: we never want CleverUptime to make your server less responsive.
Step 8: Tell the Server the Data Is Compressed
-H "Content-Encoding: gzip"
This adds an HTTP header to inform CleverUptime’s server that the incoming data is compressed with gzip. Without this, the server wouldn’t know how to decode it.
Step 9: Discard the Server’s Response
--output /dev/null
This tells curl to discard the server’s response body.
We don’t need it — we only care about the HTTP status code.
Step 10: Capture the HTTP Status Code
--write-out "%{http_code}"
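This makes curl print the HTTP status code (e.g. 200) after the transfer. Because the response body goes to /dev/null, the code is the only thing curl writes to stdout. If no HTTP response arrives at all (for example, on a timeout), curl prints 000.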
This value is captured by the outer status=$(...) and used for error handling.
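The sketch below shows the capture end to end without contacting CleverUptime. It assumes nothing is listening on port 1 of your own machine, so the connection is refused and curl reports 000 as the status. The URL, the demo payload, and the --noproxy '*' flag (added so a configured proxy cannot answer instead) are assumptions for this demo only.

```shell
# Assumption: nothing listens on 127.0.0.1 port 1, so the connection is refused.
URL="http://127.0.0.1:1/demo"

curl_exit=0
status=$(
  echo "demo=1" \
    | gzip --fast \
    | curl -X POST --silent --fail --max-time 5 --noproxy '*' \
        -H "Content-Encoding: gzip" \
        --output /dev/null \
        --write-out "%{http_code}" \
        --data-binary @- \
        "${URL}"
) || curl_exit=$?

# status holds what --write-out printed; curl_exit holds curl's own exit code.
echo "status=${status} curl_exit=${curl_exit}"
```

A status of 000 is not a real HTTP code. It just means no response came back, and the error handling later treats it like any other unexpected status.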
Step 11: Read Data from Standard Input
--data-binary @-
This tells curl to send the data it receives from stdin (standard input) — which comes from the pipe.
@- means “read the entire body from stdin”.
This is how the compressed output from gzip becomes the request body.
Step 12: Send to the Correct URL
"${URL}"
This is the endpoint where your data is sent — defined earlier in the script. It’s like the address on an envelope.
Why This Design?
This pipeline is carefully designed to be:
- Efficient: Compressed data uses less bandwidth
- Safe: The timeout prevents hangs, and error suppression keeps stray messages out of your terminal
- Reliable: HTTP status is captured and checked
- Transparent: You can see every step — no black box
It’s not magic — it’s Unix philosophy in action: “Write programs that do one thing well. Chain them together.”
And that’s exactly what we’re doing here.
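One way to check the “no black box” claim is a dry run: keep the first stages of the pipeline, but decompress locally instead of uploading. This is a sketch with hypothetical labels and a deliberately missing file, not the real payload.

```shell
# Dry run: same group, redirection, and compression, but gzip -dc replaces curl.
roundtrip=$(
  {
    echo "scriptVersion=demo"      # hypothetical label
    echo "hostname=example-host"   # hypothetical label
    cat /nonexistent/demo-file     # fails on purpose; its error is discarded
  } 2>/dev/null \
    | gzip --fast \
    | gzip -dc
)
printf '%s\n' "${roundtrip}"
```

What comes out is exactly what went in, minus the error message. Swapping gzip -dc back for the curl stage is the only difference between this and a real upload.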
Section: Error Handling — Making the Script Resilient
After trying to send data, we check the result. Not every request will succeed — and that’s okay. A good monitoring script doesn’t crash at the first error. Instead, it responds appropriately to different kinds of problems.
That’s what this block does:
# Handle errors
case "${status}" in
"200") true ;; # All good
"400")
echo "ERROR: Bad Request. Check scriptVersion, hostname, productUuid, machineId." >&2
exit 1
;;
"401")
echo "ERROR: Unauthorized. Check your upload key." >&2
exit 1
;;
"403")
echo "ERROR: Forbidden. Upload key may be invalid or revoked." >&2
exit 1
;;
"429")
echo "WARNING: Too Many Requests. Increasing interval." >&2
sleep "${SLEEP_INTERVAL}"
;;
*)
echo "WARNING: Unexpected HTTP status: ${status}. Retrying..." >&2
sleep 5
;;
esac
This uses the case statement — a way to match the HTTP status code and take different actions based on what went wrong.
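To experiment with case on its own, here is a small sketch that maps a status code to a one-word action. The classify_status function and its action names are hypothetical helpers for this demo. The real script acts directly inside each branch instead.

```shell
# Hypothetical helper: maps a status code to the kind of action the loop takes.
classify_status() {
  case "$1" in
    "200")             echo "ok" ;;
    "400"|"401"|"403") echo "fatal" ;;    # configuration problems: retrying cannot help
    "429")             echo "backoff" ;;  # rate limited: wait longer
    *)                 echo "retry" ;;    # everything else, including 5xx and curl's 000
  esac
}

for code in 200 401 429 503 000; do
  echo "${code} -> $(classify_status "${code}")"
done
```

The patterns are checked from top to bottom and the first match wins, which is why the * wildcard has to come last.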
Status 200: Success
"200") true ;;
This means the server received the data successfully.
true is a command that does nothing and returns 0 — it’s a clean way to say “no action needed.”
The script continues normally.
Status 400: Bad Request
"400") echo "ERROR: Bad Request." >&2; exit 1 ;;
This means CleverUptime couldn’t understand the data — maybe a required field is missing.
This usually indicates a script or configuration issue.
We echo an error message to stderr (>&2) and exit 1, which stops the script.
It won’t retry, because the problem won’t fix itself.
Status 401: Unauthorized
"401") echo "ERROR: Unauthorized." >&2; exit 1 ;;
This means the UPLOAD_KEY is missing or invalid.
You don’t have permission to send data.
We log the error and exit — no point in retrying with a wrong key.
Status 403: Forbidden
"403") echo "ERROR: Forbidden." >&2; exit 1 ;;
Similar to 401, but stronger: your key may be revoked, or your account blocked. We exit to prevent repeated failed attempts.
Status 429: Too Many Requests
"429") echo "WARNING: Too Many Requests." >&2; sleep "${SLEEP_INTERVAL}" ;;
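This means you’re sending data too often, for example because SLEEP_INTERVAL is set below 60 seconds.
Instead of exiting, we sleep for an extra SLEEP_INTERVAL seconds and let the loop continue. Together with the regular pause at the end of the loop, this doubles the wait for that cycle.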
Any Other Status (Wildcard)
*) echo "WARNING: Unexpected HTTP status: ${status}. Retrying..." >&2; sleep 5 ;;
This catches anything we didn’t expect, like 500 (server error) or 503 (unavailable). These are usually temporary issues; maybe CleverUptime is restarting. We log a warning and wait an extra 5 seconds before the next try. The script keeps running, ready to resume when things stabilize.
Back to the Top: Pausing and Repeating
# Wait before next run
sleep "${SLEEP_INTERVAL}"
After handling the response, the script pauses for SLEEP_INTERVAL seconds (default: 60).
This ensures we don’t overload your server or our API.
It’s part of being efficient and respectful of system resources.
done
This closes the while true loop.
After the sleep, the script jumps back to the top and starts the next monitoring cycle.
This is how continuous monitoring works: regular, predictable checks — forever.
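Putting the error handling and the final sleep together, you can work out how long the script waits between upload attempts. The sketch below only does the arithmetic and never actually sleeps. cycle_wait is a hypothetical helper, and the 5-second value follows the wildcard branch in the full listing.

```shell
SLEEP_INTERVAL=60   # default named in the text

# Hypothetical helper: seconds spent sleeping in one loop cycle for a given status.
cycle_wait() {
  case "$1" in
    "400"|"401"|"403") echo "exit"; return 0 ;;  # these branches stop the script
    "429") extra="${SLEEP_INTERVAL}" ;;          # extra pause inside the case block
    "200") extra=0 ;;
    *)     extra=5 ;;                            # wildcard branch
  esac
  echo "$((extra + SLEEP_INTERVAL))"             # plus the sleep that ends every cycle
}

for code in 200 429 503 401; do
  echo "${code}: $(cycle_wait "${code}")"
done
```

With the default interval this prints 60 for a successful upload, 120 after a 429, 65 after an unexpected status, and exit for 401.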
You Made It!
Congratulations — you’ve just read and understood a real-world monitoring script. That’s no small feat.
And now you know: CleverUptime isn’t magic. It’s just clear code, honest comments, and a desire to help you learn.
We believe in a simple rule: Never run a script you don’t understand. But now? You’ve read every line. You know what it does. You’re in control.
This is Linux at its best: transparency, freedom, and power in your hands. Feel free to comment out anything you don’t want, change the labels, or even use it as a template for your own tools.
When you're ready, try it out — and watch how CleverUptime transforms raw system data into clear, actionable insights.