Configure system alerts and warnings for your RudderStack implementation.
Real-time alerts are critical to every system. You need to be notified when the system is not working as expected. Categorizing all possible alerts and being able to identify a given alert helps you resolve issues quickly. This way, you can maximize the availability of your system and avoid undetected downtime.
The open-source version of RudderStack supports the following alerts:
RudderStack Crash Alerts
If you configure the BUGSNAG_KEY environment variable, crashes are automatically sent to the Bugsnag destination.
RudderStack Running Mode Alert
If the RudderStack server starts in a degraded or maintenance mode, you will be alerted. We support PagerDuty and VictorOps integrations for this alert.
In addition to the default open-source alerts, the enterprise version of RudderStack comes with over 30 built-in warnings and critical alerts. These alerts are customizable and can be configured according to your infrastructure requirements.
This alerting system is designed to help you identify and debug problems quickly and efficiently, before they escalate.
The following are a few examples where the alerts would be triggered:
Jobs DB table count (This is a metric to measure the unprocessed events)
Control Plane API errors
You can set up warnings to check whether RudderStack is behaving as expected. Warnings are best suited to acceptable anomalies in the system.
For example, disk usage reaching 40% for a short duration due to a sudden spike in your customer activity might be acceptable. But consistent disk usage of 80% is something that needs your immediate attention.
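The distinction above between a short-lived spike and sustained high usage can be sketched in a few lines of Python. This is an illustrative sketch only, not RudderStack's implementation: the `Sample` type, the `classify` function, and the `sustain_secs` window are hypothetical names and values, and the 40% and 80% thresholds are simply taken from the example above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    timestamp: float  # seconds since some reference point
    disk_pct: float   # disk usage percentage, 0-100


def classify(samples: List[Sample],
             warn_pct: float = 40.0,      # from the example above
             crit_pct: float = 80.0,      # from the example above
             sustain_secs: float = 300.0  # hypothetical "consistent" window
             ) -> str:
    """Return 'critical', 'warning', or 'ok' for time-ordered samples."""
    if not samples:
        return "ok"
    latest = samples[-1]

    # Walk backwards to find when the current run at or above crit_pct began.
    run_start = None
    for sample in reversed(samples):
        if sample.disk_pct >= crit_pct:
            run_start = sample.timestamp
        else:
            break

    # Only usage that has stayed high for the whole window is critical.
    if run_start is not None and latest.timestamp - run_start >= sustain_secs:
        return "critical"
    # Anything at or above the warning threshold, including a brief
    # excursion above crit_pct, is reported as a warning.
    if latest.disk_pct >= warn_pct:
        return "warning"
    return "ok"
```

With these assumed defaults, a single reading of 45% yields a warning, while readings that stay above 80% for five minutes yield a critical alert.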
Critical alerts are the alerts that need to be paged immediately to your engineering or ops teams. Our runbooks will help you understand the problem and suggest possible remedies for each alert.
For example, warehouse upload failures could indicate a change in access to the warehouse.
RudderStack provides default values for alert configurations that work well in most cases. However, we recommend setting the thresholds based on your event volume and the values that are acceptable for your requirements.
A sample alert configuration in a RudderStack Enterprise Kubernetes deployment is as shown:
Enterprise alerting has native integrations with various third-party incident management tools like PagerDuty, VictorOps and OpsGenie, as well as notification tools such as Slack and Mattermost. It also supports webhooks, so that you can easily integrate any third-party tool that has HTTP API endpoints.
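As a rough illustration of the webhook route, the sketch below shows a small HTTP endpoint that accepts a JSON alert and decides whether to page or just notify. The payload shape is an assumption made for this example: the `severity` field, the `route_alert` and `serve` helpers, and the port are hypothetical and are not taken from RudderStack's documentation.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def route_alert(payload: dict) -> str:
    """Map a (hypothetical) alert payload to an action: 'page' or 'notify'."""
    severity = str(payload.get("severity", "")).lower()
    return "page" if severity == "critical" else "notify"


class AlertHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            payload = None
        # Reject bodies that are not a JSON object.
        if not isinstance(payload, dict):
            self.send_response(400)
            self.end_headers()
            return
        action = route_alert(payload)
        # A real integration would forward the alert to a paging or
        # chat tool here; this sketch only echoes the decision.
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(action.encode("utf-8"))


def serve(host: str = "127.0.0.1", port: int = 8080) -> None:
    """Run the receiver until interrupted (hypothetical host and port)."""
    HTTPServer((host, port), AlertHandler).serve_forever()
```

Calling `serve()` starts the receiver locally. Keeping the routing decision in a plain function makes it easy to test without opening a socket.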