Kyndryl AIOps

Incidents per server per Month (Pervasive Management)
Published On Aug 30, 2024 - 10:42 PM

This section describes the Incidents per server per Month insight for Integrated AIOps (I-AIOps).
This Pervasive Management insight shows the trend of incidents per server per month and a trend analysis of the top five affected servers.
Incidents per server per month is a key metric of Advanced delivery: the ratio of incidents to servers must stay below 0.8 for the month.
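As an illustration of the target above, the following minimal sketch computes the ratio and compares it with the 0.8 limit. The function names and the sample counts are hypothetical and are not part of the dashboard; only the strict "less than 0.8" rule comes from the text.

```python
# Hypothetical helper names; only the < 0.8 target comes from the description above.
TARGET_RATIO = 0.8


def incidents_per_server(incident_count: int, server_count: int) -> float:
    """Return the ratio of incidents to servers for one month."""
    if server_count <= 0:
        raise ValueError("server_count must be positive")
    return incident_count / server_count


def meets_target(ratio: float, target: float = TARGET_RATIO) -> bool:
    """The metric is met only when the ratio is strictly below the target."""
    return ratio < target


if __name__ == "__main__":
    # Made-up sample: 360 incidents on 500 servers in the month.
    ratio = incidents_per_server(360, 500)
    print(f"ratio={ratio:.2f} meets_target={meets_target(ratio)}")
```

Because the comparison is strict, a ratio of exactly 0.8 does not meet the target in this sketch.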
The Pervasive Management dashboard provides a trend analysis of the top five affected servers (top talkers) and identifies where noise needs to be removed or where an area of concern needs to be addressed within the underlying environment.
Continuously addressing the top five affected servers at the account and squad level helps reduce the noise and thus achieve the metric.
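The idea of "top talkers" can be sketched as a simple count of tickets per server. This is only an illustration of the concept, not the dashboard's actual implementation; the function name and the server names in the sample data are made up.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_talkers(ticket_servers: Iterable[str], n: int = 5) -> List[Tuple[str, int]]:
    """Return the n servers with the most incident tickets.

    ticket_servers holds one server name per incident ticket (hypothetical input shape).
    """
    return Counter(ticket_servers).most_common(n)


if __name__ == "__main__":
    # Made-up sample: one entry per ticket, naming the affected server.
    tickets = ["srv-a"] * 6 + ["srv-b"] * 4 + ["srv-c"] * 3 + ["srv-d"] * 2 + ["srv-e", "srv-f"]
    for server, count in top_talkers(tickets):
        print(server, count)
```

Reviewing this list each month shows which servers generate the most tickets and are therefore the first candidates for monitoring tuning or root-cause work.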
The following image shows how to use the Integrated AIOps Pervasive Management insight within the account management system. The insight supports identifying incident noise and pervasive issues within the managed environment. It lets the account team see where non-actionable incident tickets are being created, which should be addressed by tuning monitoring or by working with the customer to resolve underlying systemic issues.

Business Value and benefits

  • The Pervasive Management dashboard identifies noise within the environment and areas of concern that the account team needs to address to reduce the overall number of incident tickets on the account.
  • By reducing the noise, the account team can redeploy resources to focus on Continuous Improvement Opportunities, customer needs, and other value-added activities.
  • The account team can move from fire-fighting mode to proactive management of customer needs.

Metrics

The following table provides a description and calculation for each KPI/metric used within the insight.
Filters:
From a specific filter drop-down, select the items you want, then click Update and Apply. To reset all the filters, click Reset Filters at any time. This removes all selected filters and returns to the default view.
To remove a specific filter, unselect the item from the drop-down, then click Update and then Apply. Alternatively, go to the Adjustable filter line at the top and use x to remove any filter of your choice.
| KPI/Metric Name | KPI/Metric Description |
|---|---|
| Incidents per server per month | Calculated for the last three months. For the current month, the tickets received month to date are projected to the full month, to give a realistic view of where the account will end up on this metric at the end of the month. See the "Calculation of projected incident per server per month value" formula at the bottom of this page. This widget is not affected by filters applied to the dashboard. |
| Servers linked with tickets | Count of distinct servers associated with the tickets |
| Total Ticket Count | Count of incident tickets |
| MTTR Incl Hold (Minutes) | Mean time taken to resolve server tickets, including hold time |
| MTTR Excl Hold (Hours) | Mean time taken to resolve server tickets, excluding hold time |
| Capacity Vs Non Capacity | Storage incidents compared to non-storage incidents. Incidents that occur because of storage issues, such as insufficient disk space, are classified as Capacity tickets. Incidents not related to storage, such as server or network issues, are classified as Non-Capacity tickets. |
| Server Region | Region where the server is located. Example: EMEA |
| Priority View | Count of tickets by priority (P1 - Critical, P2 - Major, P3 - Minor, P4 - Low). Hover over any tile of the Priority View tree map to see a tooltip with the priority and the count of tickets for that priority. Example: 'P2- Major Count: 611' indicates that 611 tickets have Major priority. Similar information is available for Critical, Minor, and Low priority tickets. The sum of the per-priority ticket counts (P1+P2+P3+P4) shown in Priority View always matches the Total Ticket Count. |
| Top 50 servers | Top 50 servers associated with the tickets |
| Server Function | Different types of server function |
| OpCo | OpCo (sub-companies) based on the ticket count |
| Top 50 Category | Top 50 categories associated with the incident tickets |
| Assignment Group | Assignment group to which the servers are assigned |
| Day wise Category Trend | Day-by-day trend of tickets broken down by category |
| Day wise Priority Trend | Day-by-day trend of tickets broken down by priority |
| Month Wise Trend | Monthly trend of ticket creation on the respective server |
| Week Wise Trend | Weekly trend of ticket creation on the respective server |
| Top Servers On Issues | Top 10 servers that are associated with different issues. Hover over the heat map to view the server name, disk space, and issue type. |
| Top Issues on Servers | Top five categories of issues impacting the servers. Hover over the heat map to view the ticket category, server name, and issue count. |
| Service Line | Number of tickets broken out by service line |
| Incident Details | Detailed listing of incident tickets, including all the available fields for this report. You can download the details to an Excel spreadsheet. |
Calculation of projected incident per server per month value:
projected incident per server per month = (in-scope Kyndryl incidents received month to date / elapsed days within the month) x number of days of full month
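The projection can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the dashboard's own code: the function names are hypothetical, the elapsed days are assumed to include the current day, and the final division by the number of servers is an assumption based on the metric's name, since the formula as written does not show a server count.

```python
import calendar
from datetime import date


def projected_incidents_for_month(incidents_month_to_date: int, as_of: date) -> float:
    """Scale the month-to-date incident count up to the full month.

    Assumption: elapsed days = day of month of as_of (current day included).
    """
    elapsed_days = as_of.day
    days_in_month = calendar.monthrange(as_of.year, as_of.month)[1]
    return incidents_month_to_date / elapsed_days * days_in_month


def projected_incidents_per_server(
    incidents_month_to_date: int, as_of: date, server_count: int
) -> float:
    """Assumption: the projected count is divided by the in-scope server count."""
    if server_count <= 0:
        raise ValueError("server_count must be positive")
    return projected_incidents_for_month(incidents_month_to_date, as_of) / server_count


if __name__ == "__main__":
    # Made-up sample: 120 incidents received by 10 August 2024, 500 servers.
    as_of = date(2024, 8, 10)
    print(projected_incidents_for_month(120, as_of))        # 120 / 10 * 31
    print(projected_incidents_per_server(120, as_of, 500))  # projected count / 500
```

With this sample, 120 incidents in the first 10 days of a 31-day month project to 372 incidents for the full month.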