You will likely generate hundreds of event types across your websites and apps to understand how customers interact with your product. These events are captured in a specific format that generally includes an event name, properties, and metadata. Consistent formatting of your event data is crucial if your teams are to gain reliable insight into customer behavior.

However, because multiple stakeholders often define and implement event specifications, formatting inconsistencies can creep into your data. These inconsistencies can take the form of missing fields, capitalization errors (e.g., one event sets an event name in lower case while another sets it in upper case), and unit errors (e.g., one event tracks revenue in pounds while another tracks it in dollars).

RudderStack’s Data Governance API was built to help you quickly address data inconsistencies.

Data Governance API

RudderStack's Data Governance API allows you to programmatically diagnose inconsistencies in your event data. The feature gives you access to information on all your events and their metadata including event schema, event payload versions, data types, and more. With this access, your teams can quickly pinpoint the specific nature and source of event data inconsistencies.

After you have identified the inconsistencies in your data, you can implement governance processes and set alerts for your workflows. For example, you can create alerts for missing keys in your events, or use RudderStack's Transformations to validate your event schemas and transform faulty event payloads.
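As a rough illustration of the missing-keys alert mentioned above, the check below walks a payload and reports required keys that are absent. It is a generic Python sketch, not RudderStack's Transformations API; the function names, the dotted-path convention, and the sample event are all assumptions made for the example.

```python
def has_path(payload, dotted_path):
    """Return True if every segment of a dotted path exists in the nested dict."""
    current = payload
    for part in dotted_path.split("."):
        if not isinstance(current, dict) or part not in current:
            return False
        current = current[part]
    return True


def missing_keys(event, required_paths):
    """List the required dotted paths that the event payload does not contain."""
    return [path for path in required_paths if not has_path(event, path)]


# Hypothetical event payload and requirements, for illustration only.
sample_event = {
    "event": "Product Purchased",
    "userId": "u1",
    "properties": {"revenue": 10},
}
required = ["event", "userId", "properties.revenue", "properties.currency"]
missing = missing_keys(sample_event, required)
```

A non-empty `missing` list is the condition you would alert on.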

Using the Data Governance API

Refer to `this blog` for examples of the Data Governance API in use.

Step 1: Get all event models from your data plane using RudderStack’s Data Governance API call: schemas/event-models.

Event models are the distinct events that you send through RudderStack. By pulling in all the event models, you can inspect these for inconsistencies. For example, you can check that distinct event models do not refer to the same event.
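A minimal sketch of this step in Python follows. It assumes the API is served from your data plane URL and protected by HTTP basic auth, and that each returned record carries its event name in an `EventIdentifier` field; verify both against your own deployment. The helper names and sample records are hypothetical.

```python
import base64
import json
import urllib.request
from collections import defaultdict


def build_request(data_plane_url, path, username, password):
    """Build an authenticated GET request (basic auth is an assumption here)."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    url = f"{data_plane_url.rstrip('/')}/{path.lstrip('/')}"
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def fetch_event_models(data_plane_url, username, password):
    """Call schemas/event-models and decode the JSON body."""
    req = build_request(data_plane_url, "schemas/event-models", username, password)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def find_name_collisions(models, name_key="EventIdentifier"):
    """Group models whose names differ only by case, spacing, or underscores."""
    groups = defaultdict(set)
    for model in models:
        name = model.get(name_key) or ""
        normalized = " ".join(name.lower().replace("_", " ").split())
        groups[normalized].add(name)
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}


# Hypothetical records shaped like an event-models response.
sample_models = [
    {"EventIdentifier": "Product Purchased", "WriteKey": "wk_web"},
    {"EventIdentifier": "product_purchased", "WriteKey": "wk_ios"},
    {"EventIdentifier": "Cart Viewed", "WriteKey": "wk_web"},
]
collisions = find_name_collisions(sample_models)
```

Here `collisions` flags "Product Purchased" and "product_purchased" as two models that probably describe the same event.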

Step 2: Identify the source of the offending event model(s).

Inspect the WriteKey of each offending event model to identify the source that sent it; each RudderStack source has its own write key. This lets you address the inconsistency at its root.
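One way to do this, sketched in Python under the assumption that each event model record exposes `EventIdentifier` and `WriteKey` fields (the function name and sample records are hypothetical):

```python
from collections import defaultdict


def sources_by_event(models, offending_names, name_key="EventIdentifier"):
    """Map each offending event name to the sorted WriteKeys that emitted it."""
    result = defaultdict(set)
    for model in models:
        name = model.get(name_key)
        if name in offending_names:
            # Fall back to a placeholder if a record has no WriteKey.
            result[name].add(model.get("WriteKey", "<unknown>"))
    return {name: sorted(keys) for name, keys in result.items()}


# Hypothetical records shaped like an event-models response.
offending_models = [
    {"EventIdentifier": "Product Purchased", "WriteKey": "wk_web"},
    {"EventIdentifier": "product_purchased", "WriteKey": "wk_ios"},
    {"EventIdentifier": "product_purchased", "WriteKey": "wk_android"},
    {"EventIdentifier": "Cart Viewed", "WriteKey": "wk_web"},
]
sources = sources_by_event(
    offending_models, {"Product Purchased", "product_purchased"}
)
```

In this sample, the lower-case variant traces back to two write keys, which points at the mobile implementations rather than the web source.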

Step 3: Count your event models.

If you notice multiple event models for the same event, it helps to know how often each variant occurs so you can tell whether one is more prevalent than the others. This might influence how you reconcile the differences.
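The comparison can be sketched as follows, assuming each event model record includes a per-model count in a `TotalCount` field; that field name, the function name, and the sample numbers are assumptions for illustration.

```python
def relative_counts(models, names, name_key="EventIdentifier", count_key="TotalCount"):
    """Return {name: (count, share)} for the given competing event names."""
    totals = {name: 0 for name in names}
    for model in models:
        name = model.get(name_key)
        if name in totals:
            totals[name] += int(model.get(count_key, 0))
    grand_total = sum(totals.values())
    return {
        name: (count, count / grand_total if grand_total else 0.0)
        for name, count in totals.items()
    }


# Hypothetical counts for two variants of the same event.
counted_models = [
    {"EventIdentifier": "Product Purchased", "TotalCount": 900},
    {"EventIdentifier": "product_purchased", "TotalCount": 100},
    {"EventIdentifier": "Cart Viewed", "TotalCount": 5000},
]
shares = relative_counts(counted_models, ["Product Purchased", "product_purchased"])
```

With these sample numbers, "Product Purchased" accounts for 90% of the combined volume, which suggests standardizing on that spelling.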

Step 4: Identify differences in schema versions for a single event model using RudderStack's Data Governance API call: schemas/event-versions.

Investigate variances in the schemas of individual events and determine how to resolve these differences.
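To make the comparison concrete, the sketch below diffs two schema versions. It assumes each version's schema can be read as a flat mapping from key path to data type; the exact response shape of schemas/event-versions may differ, and the function name and sample schemas are hypothetical.

```python
def diff_schemas(old, new):
    """Report keys added, removed, or retyped between two flat schemas."""
    old_keys, new_keys = set(old), set(new)
    return {
        "added": sorted(new_keys - old_keys),
        "removed": sorted(old_keys - new_keys),
        "type_changed": {
            key: (old[key], new[key])
            for key in sorted(old_keys & new_keys)
            if old[key] != new[key]
        },
    }


# Hypothetical schemas for two versions of one event model.
version_a = {"userId": "string", "properties.revenue": "float64"}
version_b = {
    "userId": "string",
    "properties.revenue": "string",
    "properties.currency": "string",
}
changes = diff_schemas(version_a, version_b)
```

In this sample, `changes` shows that `properties.currency` was added and that `properties.revenue` switched from a number to a string, which is the kind of variance worth resolving at the source.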

Tracking Plans

Tracking plans let you proactively monitor and act on non-compliant event data coming into your RudderStack sources, based on predefined plans. This helps you minimize the risk of missing or improperly configured event data breaking your downstream systems.

Contact us

For more information on the topics covered on this page, email us or start a conversation in our Slack community.