What to Do When Your Rails App Crashes in Production

When production breaks, your job is to slow things down, gather facts, and bring in the right help. This simple plan keeps you calm, limits damage, and speeds up recovery. Share it with your team.

Step 1: Pause and set status

Tell your team you see the issue. If users are affected, post a short status note. Keep it simple:

We are investigating a production error. Some users may see failures. Next update in 30 minutes.

Promise a time for the next update and stick to it.

Step 2: Capture what users see

Ask support to collect four details:

The exact time the problem started
The page or action that fails
Any error message or code shown
The user's email or account, if it helps

These details let engineers match reports to logs.

Step 3: Check simple health signals

Open your host dashboard or uptime tool. Confirm:

Are there 500 errors or timeouts?
Are servers or containers restarting?
Is the database reachable?

Do not change anything yet. You are just confirming scope.
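
If you want a repeatable check, a small read-only script can do it. The Ruby sketch below assumes your app answers at a URL you supply. The names `probe` and `classify_status` are made up for this example. It only sends a GET request, so it changes nothing.

```ruby
require "net/http"
require "uri"

# Turn an HTTP status code into a coarse health signal.
def classify_status(code)
  case code.to_i
  when 200..399 then :ok
  when 500..599 then :server_error
  else :unexpected
  end
end

# Send one GET request and report what came back.
# Hypothetical helper: the URL and timeout are yours to choose.
def probe(url, timeout: 5)
  uri = URI(url)
  response = Net::HTTP.start(uri.host, uri.port,
                             use_ssl: uri.scheme == "https",
                             open_timeout: timeout,
                             read_timeout: timeout) do |http|
    http.get(uri.request_uri)
  end
  classify_status(response.code)
rescue Net::OpenTimeout, Net::ReadTimeout
  :timeout
rescue SocketError, SystemCallError
  :unreachable
end
```

Calling `probe` with your own site address returns `:ok`, `:server_error`, `:unexpected`, `:timeout`, or `:unreachable`. Note the result and the time in your internal thread.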

Step 4: Review recent changes

Ask two questions:

What code shipped in the last 24 hours?
What infra changes happened in the last week?

If a change lines up with the start of the outage, consider rolling back to the last good release or image.

Step 5: Check Rails and app logs

You do not need to read Ruby code to help. Search for repeating errors near the start time. Common signals:

Exception, Timeout, NoMethodError
ActiveRecord, PG:: (Postgres), Redis, Sidekiq

Save a sample error with its timestamp. This is the “receipt” an expert will need.
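
The search above can be sketched in Ruby. This is a minimal example, not a full log parser. It assumes each log line carries a timestamp like 2024-05-01T14:03:22, which is what Ruby's standard logger formatter writes; adjust the pattern if your logs differ. The name `error_samples` and the 10-minute window are assumptions for illustration.

```ruby
require "time"

# The signals listed above, as one pattern.
SIGNALS = /Exception|Timeout|NoMethodError|ActiveRecord|PG::|Redis|Sidekiq/
# Assumed timestamp shape: 2024-05-01T14:03:22
TIMESTAMP = /\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}/

# Count matching lines near start_time and keep the first line seen
# for each signal as the sample to hand to an engineer.
def error_samples(lines, start_time, window_seconds: 600)
  found = {}
  lines.each do |line|
    stamp = line[TIMESTAMP]
    signal = line[SIGNALS]
    next unless stamp && signal
    next if (Time.parse(stamp) - start_time).abs > window_seconds

    entry = (found[signal] ||= { count: 0, sample: line.chomp })
    entry[:count] += 1
  end
  found
end
```

To run it against a file, pass the lines and the reported start time, for example `error_samples(File.foreach("log/production.log"), Time.parse("2024-05-01T14:03:00"))`. The path and the date here are placeholders; use your own.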

Step 6: Verify key services

Most Rails apps depend on a few services. Check each one is healthy:

Database: Postgres or MySQL is up and connections are not maxed out
Cache/Queue: Redis and Sidekiq are running and queues are not stuck
Storage: S3 or similar is reachable
Vendors: Payments, email, and other APIs show green on their status pages

If a vendor is down, note their incident link in your internal thread.
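
A quick way to confirm that a service is at least reachable is to try opening a connection to its port. The Ruby sketch below does only that. It does not tell you whether connections are maxed out or queues are stuck. The helper names are made up for this example, and you supply the hosts. Ports 5432 and 6379 are the usual defaults for Postgres and Redis; yours may differ.

```ruby
require "socket"

# True if something accepts a TCP connection on host:port.
def port_open?(host, port, timeout: 2)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  false
end

# services is a hash like:
#   { "postgres" => ["db.internal", 5432], "redis" => ["cache.internal", 6379] }
# The host names above are hypothetical placeholders.
def check_services(services)
  services.transform_values { |(host, port)| port_open?(host, port) }
end
```

A `false` for any service is worth noting in your internal thread along with the time you checked.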

Step 7: Contact your host

Open a support ticket with a short summary:

When the issue started
The error sample from logs
Any recent deploy or infra change

Ask them to confirm there is no platform incident, full disk, network block, or SSL problem. Keep the ticket updated.

Step 8: Reduce blast radius

If one feature is causing crashes, turn it off with a feature flag if you have one. Consider read-only mode to protect data while you investigate. If needed, put up a short maintenance page while you roll back.
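
If you do not already have a maintenance page, your engineers can add one as a small piece of Rack middleware. The sketch below is one possible shape, not a standard Rails feature. The class name `MaintenanceMode` and the `MAINTENANCE_MODE` environment variable are assumptions for this example.

```ruby
# Answers every request with a short 503 page while the switch is on,
# and passes requests through to the app when it is off.
class MaintenanceMode
  MESSAGE = "We are performing maintenance. Please try again shortly."

  # enabled is any callable returning true or false.
  # By default it reads a hypothetical MAINTENANCE_MODE variable.
  def initialize(app, enabled: -> { ENV["MAINTENANCE_MODE"] == "1" })
    @app = app
    @enabled = enabled
  end

  def call(env)
    return @app.call(env) unless @enabled.call

    headers = { "content-type" => "text/plain", "retry-after" => "300" }
    [503, headers, [MESSAGE]]
  end
end
```

In a Rails app this could be registered with `config.middleware.insert_before 0, MaintenanceMode` in config/application.rb, provided the class is loaded by then. Because the default switch is an environment variable, flipping it usually means restarting the app on your host.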

Step 9: Call for Rails emergency help

If you do not have in-house Rails help, bring in a specialist. Rails Fever handles production emergencies. We work with your host, read your logs, stabilize first, then find the root cause. Have your error sample, deploy history, and vendor list ready. Having these on hand saves time and cost.

Step 10: Do a fast postmortem

Keep it to one page:

What happened in plain words
Start and end time
User and revenue impact
Trigger and root cause
What will change so it does not happen again

Share it with the team and close the loop.

Prevention: build your safety net

Crashes will happen. Your goal is to spot them early and make them small.

1) Monitoring and alerts

Use an uptime tool and a metrics dashboard that watches CPU, memory, queue size, error rate, and DB connections. Set alerts that wake a human within minutes.

2) Error tracking

Install an error tracker like Sentry, Honeybadger, or Rollbar. These tools group errors, tag releases, and show which users are hit. They turn “it is broken” into “this line, after this deploy.”

3) Centralized logs

Send Rails, Sidekiq, Nginx, and host logs to one place. Fast search during a fire saves hours.

4) Backups and restore drills

Backups matter only if you can restore them. Test a DB restore on staging every month.

5) Safer deploys

Ship small changes often. Add health checks and automatic rollback. Small changes are easier to unwind than big ones.

6) Capacity checks

Watch database size, disk usage, and connection limits. Many “crashes” are really resource exhaustion.
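
The idea is simple: compare what you are using against the limit and warn before you reach it. Here is a minimal Ruby sketch. The 80% threshold and the name `capacity_warnings` are assumptions for this example, and the numbers you feed in would come from your own host or database tools.

```ruby
# readings maps a resource name to [used, limit].
# Returns a warning line for each resource at or above the threshold.
def capacity_warnings(readings, threshold: 0.8)
  readings.filter_map do |name, (used, limit)|
    ratio = used.to_f / limit
    format("%s at %d%% of limit", name, (ratio * 100).round) if ratio >= threshold
  end
end
```

For example, passing `{ "db_connections" => [95, 100], "disk_gb" => [40, 100] }` flags only the database connections. An empty result means nothing is near its limit.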

7) SSL and domain hygiene

Track certificate and domain renewals. Set calendar reminders 30 and 7 days before expiration.
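
Calendar reminders can be backed up with a small automated check. The Ruby sketch below reads a site's certificate and applies the same 30-day and 7-day marks. The helper names are made up for this example, and it covers certificates only, not domain registration.

```ruby
require "socket"
require "openssl"

# Connect to the host and return the time its certificate expires.
def certificate_expiry(host, port = 443)
  tcp = TCPSocket.new(host, port)
  ssl = OpenSSL::SSL::SSLSocket.new(tcp)
  ssl.hostname = host # needed when one server hosts several sites
  ssl.connect
  ssl.peer_cert.not_after
ensure
  ssl&.close
  tcp&.close
end

# Map an expiry time to a reminder level and the whole days left.
def renewal_alert(not_after, now: Time.now)
  days_left = ((not_after - now) / 86_400).floor
  level =
    if days_left.negative? then :expired
    elsif days_left <= 7 then :urgent
    elsif days_left <= 30 then :soon
    else :ok
    end
  [level, days_left]
end
```

Running `renewal_alert(certificate_expiry("your-domain-here"))` on a daily schedule gives you a second line of defense if a calendar reminder is missed. Replace the placeholder with your own domain.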

8) Security and patching

Keep Rails and gems current. Schedule a monthly patch window. Outdated libraries cause errors and add security risk.

Need help with Rails maintenance? We offer comprehensive Rails Care Plans for ongoing support, technical audits to assess your current state, and Rails upgrades to keep you current. View our pricing plans to find the right fit for your needs.

Schedule a consultation or email hello@railsfever.com to discuss your Rails needs.