Automate Root Cause Analysis
The typical cost of an hour of service downtime is $100,000, and $72,000 per minute on average when revenue is lost (according to DEJ). Identifying the root cause (Mean Time To Identification, or MTTI) typically takes hours, but it can run into days for very complex outages. The MTTI is lengthy because IT organizations can rely only on siloed, domain-specific tools. This forces IT teams to correlate data from multiple sources across multiple domains, sit through lengthy conference calls, and waste precious resources. Sound familiar?
This problem is expected to get worse as companies accelerate their digital business transformation initiatives and move to hybrid IT by embracing virtualization, cloud, DevOps or containers. IT teams will need to manage millions of data points (up from today’s tens of thousands) and therefore the current set of tools and manual processes will no longer cut it. IT teams will need to look for advanced analytics (AIOps) to automate this process.
The industry has been trying to solve this problem for years. The wait is over!
FixStream is a software-only solution that automates the root cause analysis by:
- Providing an integrated end-to-end view of business transaction flows correlated to application services and infrastructure entities
- Correlating millions of data points per business application in real-time
- Providing agentless auto-discovery and topology mapping of up to 2,000 data center entities within 3-4 hours
- Enriching the analysis by correlating data from existing IT tools (APM, ITSM, SIEM, ITOM) through open API connectors
- Deploying as SaaS or on-premises in hours, not days
You can identify the root cause of a revenue-impacting outage in a few clicks, potentially saving hundreds of thousands of dollars in lost revenue. How long would it have taken in your environment?
Contextual Root Cause Analysis
- A topology map is created through the auto-discovery of the entire environment, without any agents!
- The discovery process requires only the IP address range and service account information
- This map shows all the entities as well as all the physical, logical and virtual connections of your infrastructure
- Imagine that, just like in Google Maps, you could focus on what you care about, say “eCommerce”.
- The application overlay visualizes just the faults and alerts impacting the specific application
- You can identify the root cause in a few clicks!
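The application overlay idea above can be pictured with a small sketch. This is not FixStream's implementation; it only assumes that the topology map can be treated as a graph and that each alert names the entity that raised it. All entity names, alert messages, and function names here are hypothetical.

```python
from collections import deque

# Hypothetical topology: each entity maps to the entities it depends on.
TOPOLOGY = {
    "ecommerce-app": ["web-vm-1", "db-vm-1"],
    "web-vm-1": ["host-a"],
    "db-vm-1": ["host-b"],
    "host-a": ["switch-1"],
    "host-b": ["switch-1"],
    "hr-app": ["hr-vm-1"],
    "hr-vm-1": ["host-c"],
    "host-c": ["switch-2"],
}

# Hypothetical alerts, each tagged with the entity that raised it.
ALERTS = [
    {"entity": "host-b", "message": "disk latency high"},
    {"entity": "switch-2", "message": "port flapping"},
]


def entities_supporting(app, topology):
    """Return every entity reachable from the application (breadth-first)."""
    seen = {app}
    queue = deque([app])
    while queue:
        node = queue.popleft()
        for neighbor in topology.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def application_overlay(app, topology, alerts):
    """Keep only the alerts raised by entities the application depends on."""
    scope = entities_supporting(app, topology)
    return [alert for alert in alerts if alert["entity"] in scope]


overlay = application_overlay("ecommerce-app", TOPOLOGY, ALERTS)
print(overlay)
```

In this toy data, only the alert on host-b is shown for “ecommerce-app”, because switch-2 supports a different application.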
FixStream’s patented Flow2Path™ algorithm allows you to visualize the paths each flow takes so that you can measure and improve their performance and uptime.
Application Flow Analysis
- By selecting a specific application you can measure the performance of individual business transactions
- The Application map visualizes faults, alerts, and tickets in your business application at the level of individual application flows
- FixStream automatically discovers the infrastructure topology as well as the flows that represent the entire business application
- If you select a specific flow, you can visualize the path taken across the supporting infrastructure
- By applying the Fault or Alert overlay you can now visualize the specific flows that are impacted by them
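To make the idea of a flow's path and a fault overlay concrete, here is a minimal sketch. It is not the Flow2Path™ algorithm, whose details are not described here; it swaps in a plain breadth-first shortest-path search over a hypothetical topology, then marks a flow as impacted when a faulty entity lies on its path. Entity names, flow names, and function names are assumptions for illustration.

```python
from collections import deque

# Hypothetical undirected links between infrastructure entities.
LINKS = [
    ("web-vm", "host-a"), ("host-a", "switch-1"),
    ("switch-1", "host-b"), ("host-b", "db-vm"),
    ("switch-1", "host-c"), ("host-c", "cache-vm"),
]

# Hypothetical application flows: name -> (source, destination).
FLOWS = {
    "web-to-db": ("web-vm", "db-vm"),
    "web-to-cache": ("web-vm", "cache-vm"),
}


def build_graph(links):
    """Build an adjacency map from a list of undirected links."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def flow_path(graph, source, destination):
    """Return the shortest hop-by-hop path between two entities, or None."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == destination:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


def impacted_flows(graph, flows, faulty_entities):
    """Return the names of flows whose path crosses a faulty entity."""
    impacted = []
    for name, (source, destination) in flows.items():
        path = flow_path(graph, source, destination)
        if path and faulty_entities & set(path):
            impacted.append(name)
    return impacted


GRAPH = build_graph(LINKS)
print(flow_path(GRAPH, "web-vm", "db-vm"))
print(impacted_flows(GRAPH, FLOWS, {"host-b"}))
```

A fault on host-b touches only the web-to-db flow in this example, while a fault on switch-1 would touch both flows, since both paths cross it.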
FixStream correlates information from business transactions through application services to infrastructure.
Business Transactions Analysis
- You can go deeper than this by analyzing the performance of the business transactions
- You can see the number of successful and failed transactions, their response time over time, as well as the latency across each hop in the application services layer
- By clicking on a failed transaction, you can identify the specific Java code in the JVM where the failure occurred!
- This information can be passed to your Java developers to fix.
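The transaction metrics described above (successes, failures, and latency at each hop) can be sketched as follows. The record layout, hop names, and latency values are hypothetical and are not taken from FixStream; the sketch only shows how such counts and per-hop averages might be derived.

```python
from statistics import mean

# Hypothetical transaction records with per-hop latency in milliseconds.
TRANSACTIONS = [
    {"id": "t1", "ok": True, "hops_ms": {"web": 12, "app": 30, "db": 8}},
    {"id": "t2", "ok": False, "hops_ms": {"web": 15, "app": 410, "db": 9}},
    {"id": "t3", "ok": True, "hops_ms": {"web": 11, "app": 35, "db": 7}},
]


def summarize(transactions):
    """Count successes and failures and average the latency of each hop."""
    succeeded = sum(1 for t in transactions if t["ok"])
    hop_names = sorted({hop for t in transactions for hop in t["hops_ms"]})
    avg_hop_ms = {
        hop: mean(t["hops_ms"][hop] for t in transactions if hop in t["hops_ms"])
        for hop in hop_names
    }
    return {
        "succeeded": succeeded,
        "failed": len(transactions) - succeeded,
        "avg_hop_ms": avg_hop_ms,
    }


def slowest_hop(transaction):
    """Return the hop that contributed the most latency to one transaction."""
    hops = transaction["hops_ms"]
    return max(hops, key=hops.get)


summary = summarize(TRANSACTIONS)
failed = [t for t in TRANSACTIONS if not t["ok"]]
print(summary["succeeded"], summary["failed"])
print([(t["id"], slowest_hop(t)) for t in failed])
```

Here the single failed transaction spends most of its time in the hypothetical "app" hop, which is where a deeper, code-level investigation would start.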
Data stored as time series helps you determine patterns to prevent issues from recurring in the future.
- We are able to determine the root cause behind transaction performance issues in a few clicks. This problem seems to be happening every Thursday morning, though. Why?
- Let’s run the analysis for the last several weeks
- By clicking on the DVR-like play button, we can play back all the events (flows, paths, alerts, etc.) that occurred in this time period
- The playback will highlight every change of status (from 2 to 5 events, for instance), so that we can identify the patterns behind the root causes.
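The “every Thursday morning” question can be illustrated with a small sketch over stored event timestamps. This is only an assumption about how such a pattern might be surfaced, not a description of FixStream's playback feature: it groups hypothetical events by weekday and hour and reports the slots that recur on several distinct days.

```python
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical event timestamps collected over several weeks.
EVENTS = [
    datetime(2024, 1, 4, 9, 5),    # Thursday
    datetime(2024, 1, 4, 9, 20),   # Thursday, same morning
    datetime(2024, 1, 11, 9, 40),  # Thursday
    datetime(2024, 1, 16, 14, 0),  # Tuesday
    datetime(2024, 1, 18, 9, 15),  # Thursday
]


def recurring_slots(timestamps, min_days=3):
    """Return (weekday, hour) slots with events on at least min_days distinct days."""
    days_per_slot = {}
    for ts in timestamps:
        slot = (WEEKDAYS[ts.weekday()], ts.hour)
        days_per_slot.setdefault(slot, set()).add(ts.date())
    return sorted(
        (slot, len(days))
        for slot, days in days_per_slot.items()
        if len(days) >= min_days
    )


patterns = recurring_slots(EVENTS)
print(patterns)
```

Counting distinct days rather than raw events keeps one noisy morning from looking like a weekly pattern.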