Ansible: Stop SSH-ing Into 50 Servers Like a Caveman š¤āļø
Ansible: Stop SSH-ing Into 50 Servers Like a Caveman š¤āļø
Real confession: I once spent 4 hours SSH-ing into 40 production servers one-by-one to update an Nginx config file. Server 23 got a typo. Server 37 was accidentally skipped. Server 12 rebooted mid-config. The result? An incident ticket, a very angry CTO, and me stress-eating my third coffee. ā
You know what would have fixed all of that? Ansible. In 3 minutes. With zero typos.
Welcome to configuration management - the thing between you and a complete server meltdown at 2 AM. š„
What Is Ansible, and Why Should You Care? š¤
Ansible is an automation tool that lets you describe your server state in YAML files called playbooks, then applies those changes across all your servers simultaneously.
The elevator pitch:
# BEFORE ANSIBLE (manual caveman approach)
ssh user@server-01 "sudo apt-get update && sudo apt install -y nginx"
ssh user@server-02 "sudo apt-get update && sudo apt install -y nginx"
ssh user@server-03 "sudo apt-get update && sudo apt install -y nginx"
# ... 47 more servers š
# Pray you didn't make a typo on server-14
# AFTER ANSIBLE (civilized human approach)
ansible-playbook install-nginx.yml
# Done. All 50 servers. Simultaneously. Consistently. ā
Why Ansible beats the alternatives:
- No agents required - uses plain SSH (zero extra software on servers!)
- YAML playbooks - readable by humans, even junior devs
- Idempotent - run it 10 times, same result each time (won't break if already configured)
- Huge module library - apt, yum, file, template, systemd, docker... everything's built-in
The Horror Story That Converted Me š
After setting up CI/CD pipelines for several Laravel and Node.js projects, I thought I had server management figured out. I did not.
The scenario: A critical OpenSSL vulnerability dropped. Needed to patch and restart services on 40 AWS EC2 instances. No Ansible. Just me, a bash one-liner, and misplaced confidence.
# My "brilliant" plan
for i in {01..40}; do
ssh -i ~/.ssh/prod.pem [email protected].$i \
"sudo yum update openssl -y && sudo systemctl restart nginx"
done
What actually happened:
- Servers 01-18: ā Patched fine
- Server 19: SSH key mismatch (different key pair!)
- Servers 20-25: ā Patched fine
- Server 26: Nginx wasn't installed there (it was Apache!) - restart command failed silently
- Server 27: The loop counter hit
10.0.0.27... which was the database server š - Servers 28-40: Patched fine, but I was already sweating
Damage: 1 server missed, 1 database restarted unnecessarily, 2 hours of panic. No incidents, but pure luck.
After that incident, I spent a weekend learning Ansible. Never looked back.
Ansible 101: Your First Playbook š
Step 1: The Inventory File
Tell Ansible which servers exist:
# inventory.ini
[web]
web-01 ansible_host=10.0.1.10
web-02 ansible_host=10.0.1.11
web-03 ansible_host=10.0.1.12
[app]
app-01 ansible_host=10.0.2.10
app-02 ansible_host=10.0.2.11
[db]
db-01 ansible_host=10.0.3.10
[production:children]
web
app
db
[all:vars]
ansible_user=ec2-user
ansible_ssh_private_key_file=~/.ssh/prod.pem
ansible_python_interpreter=/usr/bin/python3
Group your servers logically: web, app, db, production. Target them individually or all at once.
Step 2: Your First Playbook
# install-nginx.yml
---
- name: Install and configure Nginx
hosts: web # Run on "web" group only
become: yes # sudo mode
tasks:
- name: Install Nginx
package:
name: nginx
state: present # Ensure it's installed
- name: Start and enable Nginx
systemd:
name: nginx
state: started
enabled: yes # Start on boot too
- name: Copy Nginx config
template:
src: templates/nginx.conf.j2
dest: /etc/nginx/nginx.conf
owner: root
group: root
mode: '0644'
notify: Restart Nginx # Only restart if config changed!
handlers:
- name: Restart Nginx
systemd:
name: nginx
state: restarted
Run it:
# Dry run (check what WOULD happen)
ansible-playbook install-nginx.yml --check
# Actually apply it
ansible-playbook install-nginx.yml
# Only on specific servers
ansible-playbook install-nginx.yml --limit web-01
# See verbose output
ansible-playbook install-nginx.yml -v
The beautiful part: Run this 10 times. If Nginx is already installed and configured correctly, Ansible does nothing. If server-03 drifted out of config? It fixes only that server. This is idempotency - the DevOps superpower! āØ
Real-World Playbook: Laravel App Deployment š
Here's an actual playbook I use to deploy Laravel applications:
# deploy-laravel.yml
---
- name: Deploy Laravel Application
hosts: app
become: yes
vars:
app_dir: /var/www/myapp
app_user: www-data
php_version: "8.2"
tasks:
- name: Ensure app directory exists
file:
path: "{{ app_dir }}"
state: directory
owner: "{{ app_user }}"
group: "{{ app_user }}"
mode: '0755'
- name: Pull latest code from Git
git:
repo: [email protected]:myorg/myapp.git
dest: "{{ app_dir }}"
version: main
accept_hostkey: yes
force: yes
become_user: "{{ app_user }}"
- name: Install Composer dependencies
composer:
command: install
working_dir: "{{ app_dir }}"
no_dev: yes # Production! No dev deps!
optimize_autoloader: yes
become_user: "{{ app_user }}"
- name: Copy .env file
template:
src: templates/laravel.env.j2
dest: "{{ app_dir }}/.env"
owner: "{{ app_user }}"
mode: '0600' # Secrets! Lock it down!
- name: Run database migrations
command: php artisan migrate --force
args:
chdir: "{{ app_dir }}"
become_user: "{{ app_user }}"
- name: Clear and rebuild cache
command: "{{ item }}"
with_items:
- php artisan config:cache
- php artisan route:cache
- php artisan view:cache
args:
chdir: "{{ app_dir }}"
become_user: "{{ app_user }}"
- name: Set correct permissions
file:
path: "{{ app_dir }}/storage"
state: directory
recurse: yes
owner: "{{ app_user }}"
mode: '0775'
- name: Restart PHP-FPM
systemd:
name: "php{{ php_version }}-fpm"
state: restarted
A CI/CD pipeline that saved our team: We wired this directly into GitHub Actions. Every push to main triggers ansible-playbook deploy-laravel.yml, and 30 seconds later, all app servers are updated. Zero manual steps. Zero SSH-ing. Zero 3 AM mistakes. šÆ
Variables and Templates: Configuration Done Right šØ
Hard-coding values is for amateurs. Use variables:
# group_vars/production.yml
---
app_env: production
db_host: "{{ vault_db_host }}" # Encrypted with Ansible Vault!
db_password: "{{ vault_db_pass }}" # Never plain text in Git!
max_connections: 100
nginx_worker_processes: 4
# group_vars/staging.yml
---
app_env: staging
db_host: staging-db.internal
max_connections: 20
nginx_worker_processes: 2
Jinja2 templates for configs:
# templates/nginx.conf.j2
worker_processes {{ nginx_worker_processes }};
events {
worker_connections 1024;
}
http {
upstream app {
{% for host in groups['app'] %}
server {{ hostvars[host]['ansible_host'] }}:9000;
{% endfor %}
}
server {
listen 80;
server_name {{ server_name }};
location / {
proxy_pass http://app;
}
}
}
What this generates automatically: An Nginx config with ALL your app servers added to the upstream block. Add a new app server to inventory? Re-run the playbook. Done. No manual edits. š¤Æ
Ansible Vault: Stop Committing Passwords to Git š
After countless deployments, I've seen teams commit .env files to Git "just temporarily." Ansible Vault prevents this:
# Encrypt a file
ansible-vault encrypt group_vars/production/secrets.yml
# Edit encrypted file
ansible-vault edit group_vars/production/secrets.yml
# Run playbook with vault password
ansible-playbook deploy.yml --ask-vault-pass
# Or use a password file (for CI/CD)
ansible-playbook deploy.yml --vault-password-file ~/.vault_pass
Your secrets file (encrypted at rest):
# group_vars/production/secrets.yml (encrypted!)
---
vault_db_host: prod-mysql.cluster.amazonaws.com
vault_db_pass: S3cur3P@ssw0rd!
vault_redis_pass: AnotherS3cret!
vault_app_key: base64:yourapplicationkeyhere==
Docker taught me the hard way about secrets in images. Ansible Vault teaches you the same lesson for server configs - encrypt everything, commit nothing sensitive in plain text.
Roles: Organizing Playbooks Like a Pro š
Once you have 5+ playbooks, they get messy. Roles are the solution:
# Generate role structure
ansible-galaxy init roles/nginx
ansible-galaxy init roles/php
ansible-galaxy init roles/laravel
# Structure
roles/
āāā nginx/
ā āāā tasks/main.yml # What to do
ā āāā handlers/main.yml # What to do when notified
ā āāā templates/ # Jinja2 templates
ā āāā files/ # Static files
ā āāā defaults/main.yml # Default variables
āāā php/
ā āāā tasks/main.yml
āāā laravel/
āāā tasks/main.yml
A clean master playbook:
# site.yml
---
- name: Configure web servers
hosts: web
become: yes
roles:
- nginx
- certbot
- name: Configure app servers
hosts: app
become: yes
roles:
- php
- laravel
- supervisor
- name: Configure database servers
hosts: db
become: yes
roles:
- mysql
- mysql-backup
Now your whole infrastructure is documented as code. New team member? Point them at site.yml. Spin up a new server? Run site.yml. Rebuild from scratch? Run site.yml. šÆ
Before vs After: Real Numbers š
Patching 40 servers with manual SSH:
| Task | Manual SSH | Ansible |
|---|---|---|
| Update packages | 40 minutes | 3 minutes |
| Config change | 25 minutes | 90 seconds |
| App deployment | 45 minutes | 2 minutes |
| Error rate | ~15% (human error!) | ~0% |
| Auditability | "I think I did it right?" | Full YAML history in Git |
| Reproducibility | "Uh... I took notes?" | Run the playbook again |
After countless deployments, the speed isn't even the main benefit. It's the consistency and auditability. I can open a PR for a config change, have a teammate review it, merge, and the pipeline runs the playbook automatically. No mystery. No "works on server-14 but not server-23". š
Common Mistakes to Avoid šŖ¤
Mistake #1: Not Using --check First
# BAD: Run blind in production š
ansible-playbook site.yml
# GOOD: Check what would change
ansible-playbook site.yml --check --diff
# Shows exact file diffs BEFORE applying!
# BETTER: Test on staging first
ansible-playbook site.yml -i inventory/staging.ini --check
Mistake #2: Running as Root Everywhere
# BAD
- hosts: all
become: yes # Everything as root
# GOOD
- hosts: all
tasks:
- name: Install nginx
package:
name: nginx
become: yes # Only escalate when needed!
- name: Deploy app files
copy:
src: app/
dest: /var/www/app/
become_user: www-data # Deploy as app user!
Mistake #3: Hardcoding Secrets
# BAD š±
- name: Configure database
template:
vars:
db_password: "my_actual_password123"
# GOOD ā
- name: Configure database
template:
vars:
db_password: "{{ vault_db_password }}"
# vault_db_password lives in encrypted Ansible Vault file
Mistake #4: No Tags on Long Playbooks
# Tag your tasks!
- name: Install Nginx
package:
name: nginx
tags:
- nginx
- packages
- name: Deploy Laravel app
git:
...
tags:
- deploy
- laravel
# Run ONLY deploy tasks (skip package installs)
ansible-playbook site.yml --tags deploy
# Skip the slow database migration task
ansible-playbook site.yml --skip-tags migrations
Time saved: Running a full site.yml might take 15 minutes. Running with --tags deploy? 90 seconds. Tag everything! ā”
Quick Start: Your First Ansible Setup ā”
# Install Ansible
pip install ansible
# Test connectivity
ansible all -i inventory.ini -m ping
# Run a quick ad-hoc command on all web servers
ansible web -i inventory.ini -m shell -a "uptime"
# Check disk space
ansible all -i inventory.ini -m shell -a "df -h"
# Your first playbook!
ansible-playbook install-nginx.yml -i inventory.ini --check
My recommended directory structure:
ansible/
āāā inventory/
ā āāā production.ini
ā āāā staging.ini
āāā group_vars/
ā āāā all.yml
ā āāā production/
ā ā āāā vars.yml
ā ā āāā secrets.yml # Encrypted with Vault!
ā āāā staging/
ā āāā vars.yml
āāā roles/
ā āāā nginx/
ā āāā php/
ā āāā laravel/
āāā site.yml # Full infrastructure
āāā deploy.yml # Just app deployment
TL;DR ā Stop Doing It Manually šÆ
Ansible turns server management from "12-step manual process where step 7 is 'pray'" into declarative, repeatable, reviewable code.
- ā One command updates 50 servers consistently
- ā Config lives in Git (review it, audit it, roll it back!)
- ā Vault encrypts secrets (stop committing .env files!)
- ā Idempotent (safe to re-run anytime)
- ā Works with CI/CD (automate your playbooks!)
- ā No agents on servers (just SSH!)
The moment Ansible clicked for me: I ran a playbook to update a config file across 35 servers, made a cup of tea, came back, and it was done. All 35 servers. Correct. No typos. No missed servers. Just done.
That's the whole pitch. You deserve that cup of tea. āš¤
Still SSH-ing into servers manually? Connect with me on LinkedIn - I'll share more Ansible tips from 7 years of production deployments!
Want to see real Ansible playbooks? Check out my GitHub - where I keep templates from actual production setups!
Now go write some YAML and reclaim your weekends! šāļøāØ
P.S. If you're running the same command on more than 3 servers manually, that's your sign to write an Ansible playbook. Your future self will thank you at 3 AM. š
P.P.S. Yes, you can use Ansible with your existing Terraform setup. They complement each other perfectly - Terraform provisions the infrastructure, Ansible configures it. That's a post for another day! š