Version: 1.2.0

Manage Incident RCAs

This topic covers how to track and manage incident root cause analyses (RCAs).

Note: The AI SRE Agent currently supports only Jira as a tracking tool.

View Incident RCAs

  1. Go to Incident RCA in the left sidebar.

  2. On the Incident RCA page, select an incident from the drop-down list on the right side of the page.

  3. The Root Cause Analysis tab opens by default. Review the root cause analysis of the incident.

    Note: The Root Cause Analysis tab supports human-in-the-loop coordination. For more information, see Perform Human-in-the-Loop RCA.

  4. Go to the Remediation actions tab to view the remediation actions for resolving this incident.

  5. Go to the Graph View tab and expand Interactive RCA Graph View to view the interactive service graph of this incident. The graph provides a legend that displays the number of critical, degraded, and healthy connections.

  6. Go to the Agent Trace tab for real-time visibility into agent processing and LLM calls.

  7. Scroll down to view the RCA Analyzer.
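
The legend described in step 5 is essentially a tally of connections by health status. The following is a minimal sketch of that idea in Python, assuming a hypothetical list of service-to-service connections tagged as critical, degraded, or healthy. The service names and the `legend_counts` helper are illustrative only and are not part of the AI SRE Agent.

```python
from collections import Counter

# Hypothetical connections of a service graph: (source, target, status).
# These names are made up for illustration.
connections = [
    ("checkout", "payments", "critical"),
    ("checkout", "inventory", "degraded"),
    ("payments", "ledger", "healthy"),
    ("inventory", "warehouse-db", "healthy"),
]

def legend_counts(edges):
    """Count connections per status, always reporting all three statuses."""
    counts = Counter(status for _, _, status in edges)
    return {s: counts.get(s, 0) for s in ("critical", "degraded", "healthy")}

print(legend_counts(connections))
# {'critical': 1, 'degraded': 1, 'healthy': 2}
```

Reporting all three statuses even when a count is zero keeps the summary stable from one incident to the next.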

View Metadata of an Incident

For every incident, you can view the following metadata attributes on the left side of the Incident RCA page:

  1. Symptoms: Hover over the attribute and click View Details. The Symptoms pane lists the symptoms of the incident and their severity.

  2. Signals: Hover over the attribute and click View Details. The Signals pane lists the signals from all integrations.

  3. Confidence Score: Hover over the attribute and click View Details. The Confidence Score pane lists the tasks and their calculated confidence.

  4. Topology Impact: Hover over the attribute and click View Details. The Topology Impact pane lists the impacted service topology or infrastructure.

  5. Log Summary: Hover over the attribute and click View Details. The Log Summary pane lists the logs related to the incident.

Perform Human-in-the-Loop RCA

  1. Go to Incident RCA in the left sidebar.

  2. On the Incident RCA page, select an incident from the drop-down list on the right side of the page.

  3. The Root Cause Analysis tab opens by default. Review the root cause analysis of the incident.

  4. Scroll down the Root Cause Analysis pane to the Incident Collaboration Hub.

  5. In the text box, provide feedback on the root cause analysis findings. For example, you can suggest a different investigation direction or request a review of additional information. If the system does not have access to all the required data, attach supporting evidence using the Attach Evidence button.

  6. Click Send & Re-calibrate RCA for the AI SRE Agent to generate a recalibrated RCA based on the new information.
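
If your team wants its own record of what was sent in each recalibration round, the feedback can be modeled as a small record that pairs the comment with the attached evidence. The following Python sketch is one way to do that. It assumes nothing about the AI SRE Agent's internals: the `RcaFeedback` class, its fields, the incident ID, and the file name are all hypothetical and exist only for local note-keeping.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class RcaFeedback:
    """Hypothetical local record of one feedback round (not a product API)."""
    incident_id: str
    comment: str
    evidence_files: list[str] = field(default_factory=list)
    # UTC timestamp of when this record was created.
    submitted_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def summary(self) -> str:
        """One-line summary suitable for a ticket comment or a log entry."""
        attached = ", ".join(self.evidence_files) or "none"
        return f"[{self.incident_id}] {self.comment} (evidence: {attached})"

# Made-up example values for illustration.
note = RcaFeedback(
    incident_id="INC-0001",
    comment="Check the database connection pool before blaming the API gateway.",
    evidence_files=["db-pool-metrics.csv"],
)
print(note.summary())
```

A record like this makes it easier to compare successive recalibrated RCAs against the feedback that prompted them.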