Data Governance

Identify inconsistencies in your event data.

You will likely generate hundreds of event types across your websites and apps to understand how customers interact with your product. These events are captured in a specific format that generally includes event name, properties, and metadata.

Consistent event formatting is crucial for your teams to gain insights into customer behavior. However, because multiple stakeholders often define and implement event specifications, formatting inconsistencies can creep into your data.

These inconsistencies could be in the form of:

  • Missing fields
  • Capitalization errors (for example, one event sets the event name in lowercase, while another sets it in uppercase)
  • Unit errors (for example, one event tracks revenue in pounds, while another tracks it in dollars)
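To make these three categories concrete, here is a minimal sketch that scans a batch of events for each of them. The sample events, the `REQUIRED_FIELDS` tuple, and the `currency` property are assumptions made for illustration; they are not a RudderStack schema.

```python
from collections import defaultdict

# Assumption for this sketch: every event should carry these top-level fields.
REQUIRED_FIELDS = ("event", "properties")


def find_inconsistencies(events):
    """Return a list of human-readable problems found in a batch of events."""
    problems = []
    spellings = defaultdict(set)   # case-folded name -> spellings seen
    currencies = defaultdict(set)  # case-folded name -> currency codes seen

    for index, event in enumerate(events):
        # Missing fields
        for field in REQUIRED_FIELDS:
            if field not in event:
                problems.append(f"event {index}: missing field '{field}'")

        name = event.get("event")
        if name is None:
            continue
        key = name.casefold()
        spellings[key].add(name)
        currency = event.get("properties", {}).get("currency")
        if currency is not None:
            currencies[key].add(currency)

    # Capitalization errors: the same name seen with different casing
    for key, names in spellings.items():
        if len(names) > 1:
            problems.append(f"'{key}': inconsistent capitalization {sorted(names)}")
    # Unit errors: the same event reported in more than one currency
    for key, codes in currencies.items():
        if len(codes) > 1:
            problems.append(f"'{key}': mixed currency units {sorted(codes)}")
    return problems


# Hypothetical sample data
sample_events = [
    {"event": "Order Completed", "properties": {"revenue": 20.0, "currency": "USD"}},
    {"event": "order completed", "properties": {"revenue": 15.0, "currency": "GBP"}},
    {"event": "Product Viewed"},
]

for problem in find_inconsistencies(sample_events):
    print(problem)
```

Grouping by the case-folded name lets one pass over the batch catch both the capitalization and the unit mismatch.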

RudderStack’s data governance features help you quickly and efficiently address these data inconsistencies.

RudderStack’s Data Governance offerings include the following features:

Event Audit API

RudderStack’s Event Audit API allows you to programmatically diagnose inconsistencies in your event data. It gives you access to information on all your events and their metadata, including event schemas, event payload versions, data types, and more. With this access, your teams can quickly pinpoint the specific nature and source of event data inconsistencies.

After you have identified the inconsistencies in your data, you can implement governance processes and set alerts for your workflows. For example, you can create alerts for missing keys in your events, or use RudderStack’s Transformations to validate your event schemas and transform faulty event payloads.
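As a rough illustration of the "validate, alert, and repair" idea, the sketch below checks one event against a small rule set, normalizes its name, and collects alert messages for missing keys. The function name `validate_and_repair`, both rule tables, and the property names are hypothetical. This is plain Python and not the RudderStack Transformations API, so adapt it to whatever entry point your transformation environment expects.

```python
# Hypothetical rules: canonical spelling per event, and required keys per event.
CANONICAL_NAMES = {"order completed": "Order Completed"}
REQUIRED_PROPERTIES = {"Order Completed": {"order_id", "revenue", "currency"}}


def validate_and_repair(event):
    """Return (repaired_event, alerts) without mutating the input event."""
    repaired = dict(event)
    alerts = []

    # Repair: map any casing of a known event name to its canonical spelling.
    name = repaired.get("event", "")
    canonical = CANONICAL_NAMES.get(name.casefold(), name)
    if canonical != name:
        repaired["event"] = canonical
        alerts.append(f"renamed '{name}' to '{canonical}'")

    # Validate: report required keys that are absent from the properties.
    properties = repaired.get("properties") or {}
    missing = sorted(REQUIRED_PROPERTIES.get(canonical, set()) - set(properties))
    if missing:
        alerts.append(f"'{canonical}' is missing keys: {', '.join(missing)}")
    return repaired, alerts


fixed, alerts = validate_and_repair(
    {"event": "ORDER COMPLETED", "properties": {"order_id": "A-1", "revenue": 20.0}}
)
print(fixed["event"])
for alert in alerts:
    print("ALERT:", alert)
```

Returning the alerts alongside the repaired event keeps the decision about where to send them (logs, a pager, a dashboard) outside the validation logic.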

Note that the Event Audit API was formerly known as the Data Governance API.

Using the Event Audit API

  1. Get all event models from your data plane using RudderStack’s Event Audit API call: schemas/event-models.

Event models are the distinct events that you send through RudderStack. By pulling in all event models, you can inspect them for inconsistencies. For example, you can check that distinct event models do not refer to the same event.

  2. Identify the source of the offending event model(s).

Inspect the write key for each offending event to understand its source. This will allow you to address the root of the inconsistency.

  3. Count your event models.

If you notice multiple event models for the same event, it’s helpful to know each model’s relative volume to determine whether one is more prevalent than the other. This might influence how you reconcile the differences.

  4. Identify differences in schema versions for a single event model using RudderStack’s Event Audit API call: schemas/event-versions.

Investigate variances in the schemas of individual events and determine how to resolve these differences.
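The first three steps above (listing event models, tracing them to a source, and comparing their volume) can be sketched offline once the event models have been fetched. The records below are placeholders: the keys `name`, `write_key`, and `count` are assumptions for illustration, not the actual response fields of schemas/event-models, so map them to whatever the API reference documents.

```python
from collections import defaultdict

# Placeholder records standing in for a parsed schemas/event-models response.
event_models = [
    {"name": "Order Completed", "write_key": "wk_web", "count": 9200},
    {"name": "order_completed", "write_key": "wk_ios", "count": 310},
    {"name": "Product Viewed", "write_key": "wk_web", "count": 15400},
]


def normalize(name):
    """Drop case, spaces, underscores, and hyphens so near-duplicates match."""
    return "".join(ch for ch in name.casefold() if ch.isalnum())


def find_duplicate_models(models):
    """Group models by normalized name; keep groups with more than one spelling."""
    groups = defaultdict(list)
    for model in models:
        groups[normalize(model["name"])].append(model)

    duplicates = {}
    for key, members in groups.items():
        if len({m["name"] for m in members}) > 1:
            # Most frequent spelling first, to help choose the canonical one.
            duplicates[key] = sorted(members, key=lambda m: m["count"], reverse=True)
    return duplicates


for key, members in find_duplicate_models(event_models).items():
    total = sum(m["count"] for m in members)
    for m in members:
        share = m["count"] / total
        print(f"{m['name']!r} from {m['write_key']}: {m['count']} events ({share:.1%})")
```

Printing the write key next to each spelling points to the source that needs fixing, and the relative share suggests which spelling to keep.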

See the Event Audit API reference for usage details.
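For the last step above (comparing schema versions of a single event model), a small diff helper is often enough to see what changed between two versions. This sketch assumes each schema version has already been reduced to a flat mapping of property name to type name; that shape and the sample versions are hypothetical, not the documented schemas/event-versions response.

```python
def diff_schemas(old, new):
    """Compare two {property: type} mappings and report what changed."""
    old_keys, new_keys = set(old), set(new)
    return {
        "added": sorted(new_keys - old_keys),
        "removed": sorted(old_keys - new_keys),
        "type_changed": {
            key: (old[key], new[key])
            for key in sorted(old_keys & new_keys)
            if old[key] != new[key]
        },
    }


# Placeholder versions of one event model's schema.
version_1 = {"order_id": "string", "revenue": "float", "currency": "string"}
version_2 = {"order_id": "string", "revenue": "string", "coupon": "string"}

print(diff_schemas(version_1, version_2))
```

Separating added, removed, and type-changed properties makes it easier to decide whether a difference is a harmless extension or a breaking change for downstream destinations.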

Tracking plans

Tracking plans let you proactively monitor and act on non-compliant event data coming into your RudderStack sources, based on predefined plans. This helps you minimize the risk of missing or improperly configured event data breaking your downstream systems.
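Conceptually, checking an event against a predefined plan looks something like the sketch below. It is only an illustration of the idea: the `TRACKING_PLAN` structure, the `check_against_plan` function, and the property names are hypothetical and do not reflect how RudderStack tracking plans are defined or enforced.

```python
# Hypothetical plan: event name -> {property: expected Python type}.
TRACKING_PLAN = {
    "Order Completed": {"order_id": str, "revenue": float, "currency": str},
}


def check_against_plan(event, plan=TRACKING_PLAN):
    """Return a list of violations; an empty list means the event complies."""
    name = event.get("event")
    if name not in plan:
        return [f"unplanned event: {name!r}"]

    violations = []
    properties = event.get("properties", {})
    for prop, expected in plan[name].items():
        if prop not in properties:
            violations.append(f"missing property: {prop}")
        elif not isinstance(properties[prop], expected):
            actual = type(properties[prop]).__name__
            violations.append(f"{prop}: expected {expected.__name__}, got {actual}")
    return violations


event = {"event": "Order Completed", "properties": {"order_id": "A-1", "revenue": "20"}}
print(check_against_plan(event))
```

An event that is not in the plan at all is reported separately from one that is planned but malformed, since the two cases usually call for different actions.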

Geographic data residency

With RudderStack’s data residency feature, you can:

  • Ensure user events are received, temporarily stored, and processed within the geographic region in which they originated (for example, US or EU).
  • Ensure any derived data (for example, reporting data and live events) is also in compliance.
