Resend is committed to providing reliable and consistent service to our customers. Incidents are inevitable. What matters is how quickly and clearly we respond, and what we learn so the same failure does not happen again.
This document explains how incidents are declared, handled, and closed.
An incident can start in a few ways:
We declare an incident when there is customer impact, degraded service, or a strong signal that customer impact is likely. When in doubt, we declare early. A false positive is cheaper than a late response.
If a provider issue affects our customers, we still treat it as our incident. Our customers experience Resend, not our dependencies.
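The declaration rule above can be read as a simple "any one of these is enough" check. The following is a minimal sketch of that reading, not a description of any real tooling; the `Signal` type, its field names, and `should_declare` are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Signal:
    """Hypothetical summary of what is known when a report comes in."""
    customer_impact: bool = False
    degraded_service: bool = False
    impact_likely: bool = False       # a strong signal that impact is likely
    caused_by_provider: bool = False  # recorded for context only


def should_declare(signal: Signal) -> bool:
    # Any one of the three conditions is enough to declare.
    # caused_by_provider is deliberately ignored: a provider issue that
    # reaches customers is still treated as our incident.
    return (
        signal.customer_impact
        or signal.degraded_service
        or signal.impact_likely
    )
```

Note that the check has no "wait for confirmation" branch. That mirrors the preference for declaring early: a strong signal alone is sufficient.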
When an incident is declared, we create or join the incident channel and use it as the source of truth for triage, decisions, and updates.
Every incident should have clear ownership. One person coordinates the response. One person owns customer communication. Early in the incident, the same person may temporarily do both, but ownership should always be explicit.
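One way to keep ownership explicit is to record both roles on the incident and flag any that are unassigned. This is only a sketch under assumed names (`Incident`, `coordinator`, `comms_owner`, `missing_roles` are hypothetical). It allows the same person to hold both roles, as described above for the early phase of an incident.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Incident:
    """Hypothetical incident record with the two roles named explicitly."""
    title: str
    coordinator: Optional[str] = None  # coordinates the response
    comms_owner: Optional[str] = None  # owns customer communication


def missing_roles(incident: Incident) -> List[str]:
    # Return the roles nobody has explicitly taken yet.
    # One person holding both roles is allowed; an empty role is not.
    missing = []
    if not incident.coordinator:
        missing.append("coordinator")
    if not incident.comms_owner:
        missing.append("comms_owner")
    return missing
```

For example, `missing_roles(Incident("API errors", coordinator="ana", comms_owner="ana"))` returns an empty list, while an incident with no names assigned returns both role names.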
Triage quickly
The first step is to decide whether the report should be accepted as an incident. If not, we close it and document why. If yes, we move into incident mode immediately.
Huddle when needed
Written updates are useful, but they are not always enough. We don't expect the entire company to follow the incident channel and read every message. When needed, we join a huddle to pass context and make decisions fast.
Stay urgent until the incident is resolved
Incidents take priority over normal work until customer impact has been removed and the system is stable. Until then, the incident remains the top priority.
Centralize communication
Findings, decisions, and next steps belong in the incident channel so everyone is working from the same context.
Mitigate first
Our first priority is to reduce or eliminate customer impact. Rollbacks, feature flags, traffic shifts, or operational workarounds come before root cause analysis. We can investigate deeply once the system is stable.
Communicate clearly
If customers are impacted, we communicate early and clearly through the appropriate channel. Broad incidents should be reflected on the status page. A single-customer incident may be handled directly.
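The choice of channel can be sketched as a small decision function. The names here (`Channel`, `pick_channel`) are hypothetical, and two details are assumptions rather than stated policy: "broad" is treated as more than one affected customer, and direct contact is treated as the default for a single customer, even though the text only says such incidents may be handled that way.

```python
from enum import Enum


class Channel(Enum):
    NONE = "none"                # no customers impacted, nothing to send
    DIRECT = "direct"            # contact the affected customer directly
    STATUS_PAGE = "status_page"  # publish on the status page


def pick_channel(affected_customers: int) -> Channel:
    # No customer impact: no customer communication is needed.
    if affected_customers <= 0:
        return Channel.NONE
    # Assumption: exactly one affected customer is handled directly.
    if affected_customers == 1:
        return Channel.DIRECT
    # Assumption: anything beyond one customer counts as broad.
    return Channel.STATUS_PAGE
```

In practice the line between a single-customer and a broad incident is a judgment call, so a real version would likely take more than a count as input.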
An incident is closed only when customer impact has ended and the system is stable.
Before closing an incident, we confirm: