Shoreline releases Incident Insights to help users identify causes of incidents

Shoreline launches Shoreline Incident Insights, a free product that helps Cloud Ops teams analyze their incidents so they can improve reliability, customer satisfaction, and on-call experience.

Shoreline Incident Insights

The tool automatically ingests ticketing data from incident management systems and applies a machine learning algorithm to filter and group tickets, which makes it easier to identify the underlying causes of incidents. Incident Insights lets users highlight top issues and calculate important metrics such as MTTA (mean time to acknowledge) and MTTR (mean time to repair).
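As a rough illustration of the two metrics named above, the following Python sketch averages acknowledgement and repair times over a handful of tickets. The record layout and field names (created, acknowledged, resolved) are assumptions made for this example and are not the Incident Insights data model; MTTR is measured here from ticket creation to resolution, which is one common convention.

```python
from datetime import datetime
from statistics import mean

# Hypothetical ticket records; the field names are illustrative only.
tickets = [
    {"created": "2023-01-01T10:00:00",
     "acknowledged": "2023-01-01T10:05:00",
     "resolved": "2023-01-01T10:45:00"},
    {"created": "2023-01-02T22:00:00",
     "acknowledged": "2023-01-02T22:15:00",
     "resolved": "2023-01-02T23:15:00"},
]


def minutes_between(start, end):
    """Return the elapsed minutes between two ISO 8601 timestamps."""
    delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
    return delta.total_seconds() / 60


def mtta_minutes(records):
    """Mean time to acknowledge: creation to first acknowledgement."""
    return mean(minutes_between(r["created"], r["acknowledged"]) for r in records)


def mttr_minutes(records):
    """Mean time to repair: creation to resolution (assumed convention)."""
    return mean(minutes_between(r["created"], r["resolved"]) for r in records)


print(mtta_minutes(tickets))  # 10.0
print(mttr_minutes(tickets))  # 60.0
```

Teams that start the repair clock at acknowledgement rather than creation would change only the first argument passed in mttr_minutes.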

Incident data is usually messy, often computer-generated, and full of duplicates, which makes it hard for managers to see the trends and patterns that could improve team performance. Every analysis requires time spent scrubbing and aggregating records manually. This extra work creates so much friction that the reports often never get created, and potential improvements are lost.
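To give a sense of the manual scrubbing involved, here is a minimal Python sketch that collapses near-duplicate, machine-generated ticket titles by masking the numbers in them and counting what remains. It is a simple rule-based stand-in written for illustration, not the machine learning algorithm Incident Insights uses, and the sample titles and function names are hypothetical.

```python
import re
from collections import Counter


def normalize(title):
    """Lowercase a title, mask digit runs, and collapse whitespace."""
    text = title.lower()
    text = re.sub(r"\d+", "<n>", text)        # host-12 and host-7 become host-<n>
    text = re.sub(r"\s+", " ", text).strip()  # tidy stray spacing
    return text


def top_groups(titles, n=3):
    """Return the n most frequent normalized titles with their counts."""
    return Counter(normalize(t) for t in titles).most_common(n)


# Hypothetical alert titles, as a monitoring system might generate them.
titles = [
    "Disk usage above 90% on host-12",
    "Disk usage above 95% on host-7",
    "disk usage above 91%  on host-12",
    "API latency high in us-east-1",
]

for group, count in top_groups(titles):
    print(count, group)
```

Even a crude rule like this shows how three alerts that look different are the same underlying problem; real ticket data generally needs more than number masking, which is where automated grouping helps.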

Incident Insights comes packaged with a simple-to-use data import engine and out-of-the-box reports, so managers can import their data in minutes and begin receiving insights in seconds. This eliminates the hours of drudgery it typically takes to create incident reports.

“Too many Cloud Ops teams are flying blind,” said Anurag Gupta, founder and CEO of Shoreline.io. “I ask leaders which of their teams are carrying the heaviest on-call burden, and they don’t know. I ask which incidents are most common, and they don’t know that either. We need to learn from incidents to continuously improve availability for customers and make on-call better for our teams.”

Incident Insights’ out-of-the-box reports and dashboards include:

Top problems – top incident categories are determined automatically from the tickets themselves, so engineers can focus on choosing the right action to improve the situation. Users can order incident groups by key metrics such as frequency, severity, and MTTR.

Reports can be filtered by service, user, category, or search, and settings are automatically saved. This brings clarity to noisy ticket data.

Drilling into a specific category provides detailed statistics such as MTTA and participants, along with links back to the original source data. This detail is crucial for identifying the top opportunities for automation, or for a root-cause software fix, based on the on-call actions actually taken.

Operational efficiency – summary-level data shows how the on-call team is performing, with incident count by time period, average MTTR, and tickets by service. This shows actual performance against the SLAs/SLOs promised to customers.

Team health – shows how on-call is impacting each team, which team members are carrying a disproportionate burden, and where there are individual performance gaps.

Historical trends – clearly see whether the key metrics are trending in the right direction and whether new initiatives are having the expected impact. This keeps the team working towards their quarterly and annual goals for continuous improvement.

Shoreline Incident Insights comes with an out-of-the-box integration with PagerDuty. Integrations for the Opsgenie, ServiceNow, and Zendesk ticketing systems are coming soon. Shoreline is SOC 2 certified. The product was built by AWS experts, and data security best practices are fully baked into the design, including end-to-end data encryption in transit and at rest. Incident Insights is a read-only tool and cannot disrupt production systems.

Cloud leaders rely on Shoreline’s Cloud Reliability Platform to self-heal common incidents in production, broaden the team that can safely repair incidents, and perform live site debugging of new incidents.
