Incident response

Routing and maintenance

Routing controls decide who gets alerted for service incidents, while maintenance windows suppress expected alert noise during planned work.

Audience: Admins, SREs, on-call managers

Escalation policies

Name policies after ownership or severity, such as "Critical service routing".
Scope each policy to a service, a team, a severity, or an org-wide fallback route.
Choose a primary alert channel for initial notification.
To escalate incidents that go unacknowledged, choose a secondary alert channel and an escalation delay.
Use severity to keep low-risk incidents away from high-urgency channels.
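
The policy fields above can be sketched in Python as follows. This is an illustration under stated assumptions, not the product's API: the names EscalationPolicy, Incident, select_policy, and channels_to_notify, the channel strings, and the rule that narrower scopes win over the org-wide fallback are all hypothetical.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List, Optional

# Assumed precedence: lower rank wins, so the org-wide route is the fallback.
SCOPE_RANK = {"service": 0, "team": 1, "severity": 2, "org": 3}


@dataclass(frozen=True)
class EscalationPolicy:
    name: str
    scope_type: str                     # "service", "team", "severity", or "org"
    scope_value: Optional[str]          # None for the org-wide fallback
    primary_channel: str
    secondary_channel: Optional[str] = None
    escalation_delay: Optional[timedelta] = None


@dataclass(frozen=True)
class Incident:
    service: str
    team: str
    severity: str


def matches(policy: EscalationPolicy, incident: Incident) -> bool:
    """An org-wide policy matches everything; others compare one incident field."""
    if policy.scope_type == "org":
        return True
    return getattr(incident, policy.scope_type) == policy.scope_value


def select_policy(policies: List[EscalationPolicy],
                  incident: Incident) -> Optional[EscalationPolicy]:
    """Return the most specific matching policy, or None if nothing matches."""
    candidates = [p for p in policies if matches(p, incident)]
    if not candidates:
        return None
    return min(candidates, key=lambda p: SCOPE_RANK[p.scope_type])


def channels_to_notify(policy: EscalationPolicy,
                       unacknowledged_for: timedelta) -> List[str]:
    """Primary channel always; secondary once the escalation delay has passed."""
    channels = [policy.primary_channel]
    if (policy.secondary_channel is not None
            and policy.escalation_delay is not None
            and unacknowledged_for >= policy.escalation_delay):
        channels.append(policy.secondary_channel)
    return channels
```

In this sketch, a low-severity incident on a service with no dedicated policy falls through to the org-wide route, which keeps it away from the high-urgency channel attached to a service-scoped policy.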

On-call schedules

Create schedules for teams or organization-wide responder groups.
Set the time zone, rotation start time, handoff hour, and rotation length.
Add responders in the order they should rotate.
Use temporary overrides for vacation coverage, event staffing, or incident bridge ownership.
Schedule, member, and override changes are audit logged.
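
To show how these settings could determine who is on call at a given moment, here is a minimal Python sketch. It assumes a rotation measured in whole days, a handoff at the same local hour every rotation, and overrides that always take precedence over the rotation. Schedule, Override, and on_call are hypothetical names, not the product's data model.

```python
from dataclasses import dataclass, field
from datetime import date, datetime, time, timedelta, tzinfo
from typing import List


@dataclass(frozen=True)
class Override:
    responder: str
    start: datetime   # timezone-aware, inclusive
    end: datetime     # timezone-aware, exclusive


@dataclass
class Schedule:
    tz: tzinfo                    # schedule time zone
    rotation_start: date          # local date of the first handoff
    handoff_hour: int             # local hour (0-23) at which shifts change
    rotation_length_days: int
    responders: List[str]         # ordered rotation positions
    overrides: List[Override] = field(default_factory=list)


def on_call(schedule: Schedule, at: datetime) -> str:
    """Return the responder on call at the timezone-aware instant `at`."""
    # Assumption: a temporary override wins over the regular rotation.
    for override in schedule.overrides:
        if override.start <= at < override.end:
            return override.responder

    # Compare wall-clock times in the schedule's zone so the handoff stays
    # at the same local hour even across daylight-saving changes.
    anchor = datetime.combine(schedule.rotation_start, time(schedule.handoff_hour))
    local = at.astimezone(schedule.tz).replace(tzinfo=None)
    if local < anchor:
        raise ValueError("instant is before the rotation start")

    shifts_elapsed = (local - anchor) // timedelta(days=schedule.rotation_length_days)
    return schedule.responders[shifts_elapsed % len(schedule.responders)]
```

Any tzinfo object works for tz, for example datetime.timezone.utc or a zoneinfo.ZoneInfo instance for a named region.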

Maintenance windows

Create a window before planned infrastructure or application work.
Scope windows to a service, monitor, or the organization.
Set start and end times carefully so suppression does not outlive the planned work.
Use the reason field to explain the change for responders reviewing history.
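
The scoping and timing rules above can be illustrated with a short Python sketch. It assumes a window suppresses alerts from its start time up to, but not including, its end time, and that an organization-scoped window covers every service and monitor. MaintenanceWindow and active_window are hypothetical names used only for this example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class MaintenanceWindow:
    scope_type: str            # "service", "monitor", or "org"
    scope_id: Optional[str]    # None when the scope is the whole organization
    start: datetime            # timezone-aware
    end: datetime              # timezone-aware, exclusive
    reason: str                # shown to responders reviewing history

    def __post_init__(self) -> None:
        # Reject inverted or empty windows up front.
        if self.end <= self.start:
            raise ValueError("window end must be after its start")


def active_window(windows: List[MaintenanceWindow], service: str,
                  monitor: str, at: datetime) -> Optional[MaintenanceWindow]:
    """Return the first window suppressing this alert at `at`, else None."""
    for window in windows:
        if not (window.start <= at < window.end):
            continue
        if (window.scope_type == "org"
                or (window.scope_type == "service" and window.scope_id == service)
                or (window.scope_type == "monitor" and window.scope_id == monitor)):
            return window
    return None
```

Returning the matching window rather than a plain boolean lets the caller record the window's reason alongside the suppressed alert.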

Governance

Creating and deleting policies or maintenance windows requires owner or admin access.
Routing, maintenance, and on-call mutations require routing management permission.
Routing and maintenance mutations are audit logged.
Maintenance windows are part of the operational record for incident review.
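
The access rules above might be enforced along these lines. The source does not say how the role requirement and the routing management permission combine, so this Python sketch assumes both must hold when creating or deleting a policy or maintenance window. The permission identifier, the Actor type, and the audit entry shape are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List

ROUTING_MANAGE = "routing:manage"   # hypothetical permission identifier


@dataclass(frozen=True)
class Actor:
    name: str
    role: str                                   # e.g. "owner", "admin", "member"
    permissions: FrozenSet[str] = frozenset()


def can_mutate(actor: Actor, action: str, resource: str) -> bool:
    """Check whether the actor may perform a routing-related mutation."""
    # Every routing, maintenance, or on-call mutation needs the permission.
    if ROUTING_MANAGE not in actor.permissions:
        return False
    # Assumption: create and delete additionally require owner or admin.
    if action in ("create", "delete") and resource in ("policy", "maintenance_window"):
        return actor.role in ("owner", "admin")
    return True


def mutate(actor: Actor, action: str, resource: str,
           audit_log: List[Dict[str, str]]) -> None:
    """Authorize the mutation, then append an audit entry for it."""
    if not can_mutate(actor, action, resource):
        raise PermissionError(f"{actor.name} may not {action} {resource}")
    audit_log.append({"actor": actor.name, "action": action, "resource": resource})
```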

Related documentation

Alert channels

Configure email, Slack, webhook, SMS, and WhatsApp delivery targets and test alert delivery.

Incidents

Acknowledge, investigate, route, resolve, and review service incidents.

Audit log

Review audit events for access, configuration, API keys, incidents, reviews, and team management.