When setting up a new server, it's tempting to install the OS and jump straight into deploying services. But if auto-boot after power loss isn't configured, you'll be driving to the office at 3 AM. If a kernel panic strikes, you'll need physical monitor access. If disk failure goes undetected, you'll lose data. This guide shares the 8-step checklist verified across 16 production servers.
Why You Need a Checklist
Setting up one server from memory is fine. But when you manage multiple machines, questions like "Did I register SSH keys on this one?" or "Was journald configured?" start creeping in. We've actually had incidents on servers where a step was skipped.
What Happens When You Skip a Step
With a checklist, every server gets the same quality setup regardless of who installs it. Create it once, and it scales from 1 server to 100.
1. BIOS — Auto-Boot After Power Loss
Servers run headless by default. When a power outage occurs, the machine must boot automatically when power is restored. Without this setting, you'll need to physically press the power button after every outage.
BIOS Configuration
```
BIOS → Power Management (or Advanced → ACPI)
  Restore on AC Power Loss → [Power On]
─────────────────────────────────────
  Power Off   : Stay off after outage (default)
  Power On    : Auto-boot after outage ✅
  Last State  : Restore pre-outage state
```
The menu name varies by motherboard manufacturer: "AC Power Recovery", "After Power Failure", or "Restore on AC Power Loss". Always set it to Power On.
Mini PCs (like N100-based systems) can have tricky BIOS access, so it's best to verify this setting during initial installation. This single configuration enables remote recovery after overnight power outages.
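After deliberately cutting and restoring power once as a drill, you can confirm the unattended recovery worked by checking the boot timestamp over SSH. This sketch assumes `uptime -s` is available (it ships with procps on Ubuntu):

```shell
# Print when the machine last booted; after a test power-cut, this
# timestamp should fall a few moments after power was restored.
uptime -s
```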
2. Install SSH Server
Ubuntu Desktop doesn't ship with SSH server pre-installed. Even the Server edition may miss it if you skip the checkbox during installation.
Install and Enable SSH
```shell
# Install
sudo apt update && sudo apt install -y openssh-server

# Start and enable on boot
sudo systemctl enable --now ssh

# Verify status
sudo systemctl status ssh
# ● ssh.service - OpenBSD Secure Shell server
#      Active: active (running)
```
Once SSH is running, you can connect from another machine with ssh user@server-ip. In the next step, we'll switch from password to key-based authentication.
3. SSH Key Authentication
Password authentication is vulnerable to brute-force attacks. Switching to ed25519 key authentication lets you connect securely without passwords. For the full walkthrough, see our SSH Key Multi-Server Management guide. Here are the essential commands.
Run on Your Admin Machine
```shell
# Generate key pair (skip if you already have one)
ssh-keygen -t ed25519 -C "admin@office"

# Copy public key to the new server
ssh-copy-id -i ~/.ssh/id_ed25519.pub user@new-server-ip
```
Disable Password Authentication on the Server
```shell
# Edit /etc/ssh/sshd_config
sudo sed -i 's/^#\?PasswordAuthentication.*/PasswordAuthentication no/' \
    /etc/ssh/sshd_config

# Restart SSH
sudo systemctl restart ssh
```
Always verify key-based login works before disabling password authentication. If you disable passwords without a registered key, you'll lock yourself out.
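One way to make that check explicit, sketched here with the placeholder `user@new-server-ip`, is to force the client to attempt key authentication only. If this prints `key-ok` without a password prompt, disabling passwords is safe:

```shell
# Force key-only auth for this one connection; any fallback to a
# password will fail loudly instead of silently masking a missing key.
ssh -o PasswordAuthentication=no \
    -o PreferredAuthentications=publickey \
    user@new-server-ip 'echo key-ok'
```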
4. SSH Config Aliases
Memorizing IP addresses for multiple servers gets old fast. Add aliases to ~/.ssh/config and connect with ssh web-server instead.
~/.ssh/config Example
```
# Admin machine ~/.ssh/config
Host web-server
    HostName 10.0.10.10
    User admin
    IdentityFile ~/.ssh/id_ed25519

Host gpu-server
    HostName 10.0.10.20
    User admin
    IdentityFile ~/.ssh/id_ed25519

Host backup-server
    HostName 10.0.10.30
    User admin
    IdentityFile ~/.ssh/id_ed25519
```

This works identically on Linux, macOS, and Windows (OpenSSH). On Windows, the config path is C:\Users\YourName\.ssh\config with the same format.
| OS | Config Path | Notes |
|---|---|---|
| Linux / macOS | ~/.ssh/config | Built-in |
| Windows 10+ | %USERPROFILE%\.ssh\config | OpenSSH built-in |
| Windows (PuTTY) | — | Use saved sessions instead |
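Since all three entries share the same User and IdentityFile, a `Host *` block can factor out the duplication. ssh_config uses the first value obtained for each option, so shared defaults belong after the specific hosts. A sketch (the ServerAliveInterval keep-alive is an extra assumption, not part of the checklist):

```
Host web-server
    HostName 10.0.10.10

Host gpu-server
    HostName 10.0.10.20

Host backup-server
    HostName 10.0.10.30

# Shared defaults — first-match wins, so keep this block last
Host *
    User admin
    IdentityFile ~/.ssh/id_ed25519
    ServerAliveInterval 60
```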
5. kernel.panic Auto-Reboot
By default, a kernel panic leaves the server frozen. You'd need to connect a monitor and manually reboot — a serious problem for remote servers. Setting kernel.panic triggers an automatic reboot after a panic event.
Configuration
```shell
# Add to /etc/sysctl.conf
echo "kernel.panic = 10" | sudo tee -a /etc/sysctl.conf

# Apply immediately
sudo sysctl -p

# Verify
sysctl kernel.panic
# kernel.panic = 10
```
| Value | Behavior | Best For |
|---|---|---|
| 0 (default) | No reboot (stays frozen) | Dev environments (debugging needed) |
| 10 (recommended) | Auto-reboot after 10 seconds | Production servers (uptime priority) |
| 30 | Auto-reboot after 30 seconds | When crash dump collection is needed |
kernel.panic = 10 means "reboot 10 seconds after a kernel panic." 10 seconds is enough for logs to flush to disk while still recovering quickly.
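A common companion setting, an addition of mine rather than part of the original checklist, is `kernel.panic_on_oops`: an oops can leave the kernel half-alive and the machine effectively hung without ever triggering `kernel.panic`. A sketch:

```shell
# Escalate kernel oopses to full panics so the 10-second
# auto-reboot above also covers partial kernel failures.
echo "kernel.panic_on_oops = 1" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
```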
6. journald Persistent Logging
Unless /var/log/journal exists, systemd-journald's default Storage=auto keeps logs in memory only, and they all vanish on reboot. After a crash and reboot, you can't answer "What happened right before the reboot?"
Enable Persistent Storage
```shell
# Create log directory
sudo mkdir -p /var/log/journal

# Edit /etc/systemd/journald.conf
sudo tee /etc/systemd/journald.conf > /dev/null << 'JEOF'
[Journal]
Storage=persistent
SystemMaxUse=1G
SystemMaxFileSize=100M
MaxRetentionSec=3month
JEOF

# Restart journald
sudo systemctl restart systemd-journald

# Verify: list previous boot logs
journalctl --list-boots
```
| Setting | Value | Description |
|---|---|---|
| Storage | persistent | Save logs to disk permanently |
| SystemMaxUse | 1G | Maximum total log size |
| SystemMaxFileSize | 100M | Maximum single log file size |
| MaxRetentionSec | 3month | Auto-delete logs older than 3 months |
Capping at SystemMaxUse=1G prevents excessive disk usage while safely retaining the last 3 months of logs for troubleshooting.
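With persistence enabled, the payoff is being able to interrogate the boot before a crash. For example, using standard journalctl flags:

```shell
# Errors and worse from the previous boot (-b -1), i.e. the one that crashed
journalctl -b -1 -p err --no-pager

# How much disk the journal currently occupies (bounded by SystemMaxUse)
journalctl --disk-usage
```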
7. smartmontools + Email Alerts
Disk failures strike without warning. But by monitoring SMART (Self-Monitoring, Analysis and Reporting Technology), you can catch early signs of degradation. smartmontools automates SMART checks and sends email alerts on anomalies.
Installation and Setup
```shell
# Install
sudo apt install -y smartmontools

# Check SMART support
sudo smartctl -i /dev/sda
# SMART support is: Available
# SMART support is: Enabled

# Current health status
sudo smartctl -H /dev/sda
# SMART overall-health self-assessment test result: PASSED
```
/etc/smartd.conf Configuration
```
# /etc/smartd.conf example
# DEVICESCAN auto-detects all disks
DEVICESCAN \
    -d removable \
    -n standby \
    -s (S/../../1/02|L/../../5/03) \
    -W 0,45,50 \
    -m admin@example.com \
    -M exec /usr/share/smartmontools/smartd_warning.sh

# -s : Short test every Monday 2AM, Long test every Friday 3AM
# -W 0,45,50 : Warning at 45°C, Critical at 50°C
# -m : Alert recipient email
# -M exec : Alert notification script
```
Enable the Service
```shell
# Start smartd service
sudo systemctl enable --now smartd

# Verify status
sudo systemctl status smartd
# ● smartd.service - Self Monitoring and Reporting Technology
#      Active: active (running)
```
| Test Type | Schedule | Duration | Scope |
|---|---|---|---|
| Short self-test | Every Monday | ~2 min | Basic read/electrical checks |
| Long self-test | Every Friday | ~2 hours (varies by capacity) | Full surface scan |
For NVMe SSDs, use smartctl -i /dev/nvme0. NVMe SMART attributes differ from SATA, so thresholds need separate configuration. Email alerts require mailutils + SMTP relay setup.
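For a box with several disks, a small loop over `smartctl --scan` (which prints one `device -d type` pair per line) gives a one-shot health sweep covering both SATA and NVMe. A sketch:

```shell
# Iterate over every device smartd would see and print its health verdict.
sudo smartctl --scan | while read -r dev _; do
    echo "== $dev =="
    sudo smartctl -H "$dev" | grep -iE 'overall-health|result'
done
```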
8. One-Command Verification
After completing the seven steps above, run a verification script to confirm every setting is properly applied. Automated checks are faster and more reliable than manually verifying each item.
server-check.sh
```shell
#!/bin/bash
# server-check.sh — Server initial setup verification script
echo "=== Server Setup Verification ==="
echo ""

# 1. SSH service
echo -n "[1] SSH Service: "
systemctl is-active ssh > /dev/null 2>&1 && echo "✅ Running" || echo "❌ Not running"

# 2. Password authentication
echo -n "[2] Password Auth: "
grep -q "^PasswordAuthentication no" /etc/ssh/sshd_config && \
    echo "✅ Disabled" || echo "⚠️ Still enabled"

# 3. kernel.panic
echo -n "[3] kernel.panic: "
val=$(sysctl -n kernel.panic 2>/dev/null)
[ "$val" -gt 0 ] 2>/dev/null && \
    echo "✅ Reboot after ${val}s" || echo "❌ Not configured (0)"

# 4. journald persistence
echo -n "[4] journald Storage: "
[ -d /var/log/journal ] && echo "✅ Persistent" || echo "❌ Volatile"

# 5. smartd service
echo -n "[5] smartd Service: "
systemctl is-active smartd > /dev/null 2>&1 && echo "✅ Running" || echo "❌ Not running"

echo ""
echo "=== Verification Complete ==="
```

Sample Output
```
=== Server Setup Verification ===
[1] SSH Service: ✅ Running
[2] Password Auth: ✅ Disabled
[3] kernel.panic: ✅ Reboot after 10s
[4] journald Storage: ✅ Persistent
[5] smartd Service: ✅ Running
=== Verification Complete ===
```
Transfer this script to new servers via scp and run it. If any item shows a failure mark, revisit that step. For bulk-checking multiple servers, combine it with SSH config aliases to run ssh web-server 'bash -s' < server-check.sh for remote batch verification.
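That remote one-liner extends naturally to a loop over the aliases from step 4 (the host names here are the same hypothetical ones used earlier):

```shell
# Run the verification script on every server in one pass.
for host in web-server gpu-server backup-server; do
    echo "===== $host ====="
    ssh "$host" 'bash -s' < server-check.sh
done
```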
Checklist Template
Copy this table and use it every time you set up a new server. When every item is checked off, the server is production-ready.
| # | Item | Verification Command / Method | Expected Result |
|---|---|---|---|
| 1 | BIOS Auto-Boot | BIOS → AC Power Loss | Power On |
| 2 | SSH Server | systemctl is-active ssh | active |
| 3 | SSH Key Auth | ssh alias (no password prompt) | Connection succeeds |
| 4 | Password Auth Disabled | grep PasswordAuthentication /etc/ssh/sshd_config | no |
| 5 | SSH Config Aliases | ~/.ssh/config Host entries added | Connect by name |
| 6 | kernel.panic | sysctl kernel.panic | 10 |
| 7 | journald Persistence | ls /var/log/journal | Directory exists |
| 8 | smartmontools | systemctl is-active smartd | active |
This checklist covers the minimum essential settings. Depending on server role, you may also want to add firewall (ufw), fail2ban, time sync (chrony), and swap configuration. For monitoring, see our Grafana + Prometheus monitoring guide.
After applying this checklist across 16 servers, auto-recovery from power outages, automatic kernel panic reboots, and proactive disk anomaly detection all worked flawlessly. Repeating this same procedure for every new server ensures consistently reliable infrastructure.