Data Sharing and Third Parties

In addition to collecting in-house data from your own business processes and directly from your customers, you’ll most likely obtain data from external sources to better inform the decisions you and other departments in your organization are making.

This article will explain the definition and importance of this third-party data and identify potential sources for obtaining shared third-party data. It will outline the risks of using this data and sharing your own data, and explain best practices for mitigating these risks.

What is third-party data?

Third-party data is any data that you obtain that you did not generate or collect yourself. It includes data that you purchase from data vendors or obtain from marketing platforms. It does not include data your customers give to you directly — this is considered first-party data as you have directly collected it from its consenting owner.

Data sources — first-party, second-party, and third-party data

Data can be categorized by who collects or provides it, and their relationship with whomever subsequently uses it. This splits the data into first-party, second-party, and third-party data.

First-party data is provided directly by consenting customers to your business.

  • This will be sourced from your sales channels or engagement and interaction tracking tools, or be solicited from customers through direct feedback.
  • It will likely be highly individualized and often include personally identifiable information (PII).
  • First-party data is incredibly valuable as it is accurate, proprietary, and specific to your business.

Second-party data is collected through a trusted intermediary.

  • This may be a related business that you cooperate with that has its customers’ consent to collect their data and share it with you.
  • It may also be a service or platform that collects data on your behalf under these conditions, for your use only.
  • Second-party data is likely to be similar in content and value to first-party data, but must be treated according to the conditions set by the party sharing the data with you.

Third-party data is provided by data aggregators who collect data from a variety of sources, categorize it by use case (for example, by industry, demographic, or the data attributes present), and make it available for you to purchase.

  • The data could fall into any use category, depending on exactly what you purchased.
  • This data is usually non-specific and of uncertain origin.
  • Personally identifiable information (PII) will most likely have been stripped out or masked to comply with data sharing regulations.
  • However, third-party data can still be used to gain valuable audience insights that can be useful if applied skillfully.

Types of data — declared, inferred, observed, and modeled

Data is also split into different types depending on how it was collected and processed. Each type of data has differing levels of accuracy.

Declared data is gathered directly from users through self-reporting. This is the most accurate kind of data and is collected through surveys, product wish lists, and feedback. The alternative to this is inferred data (also referred to as observed data) that is gathered about the user without their active input. This data is still highly accurate and is collected from search histories, social media reactions, and content engagement metrics.

Data can also be collected for a cohort. Observed audiences are groups of people whose declared or inferred/observed data shares a common characteristic. For example, you may target an audience whose members all participate in a particular hobby.

Modeled audiences, conversely, do not exist in the real world — they are ‘mock’ audiences composed of real users that may not yet be identified as a cohesive cohort, but that match the behavior of a known audience. For example, if you have an observed audience that likes your product, you can create a modeled audience that mimics their characteristics using third-party data to broaden its scope and find new potential customers.
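
As a rough illustration of how a modeled audience might be assembled, the sketch below compares third-party profiles against an observed "seed" audience and keeps the ones that look similar. The attribute sets, the Jaccard similarity measure, and the 0.5 threshold are all assumptions made for this example; real lookalike modeling typically uses richer features and statistical models.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: shared attributes divided by all attributes."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def modeled_audience(seed_profiles, candidates, threshold=0.5):
    """Return ids of candidates whose best match against any seed profile
    meets the similarity threshold."""
    return sorted(
        cid
        for cid, attrs in candidates.items()
        if max((jaccard(attrs, seed) for seed in seed_profiles), default=0.0) >= threshold
    )


# Hypothetical observed audience: customers known to like the product.
seed = [
    {"cycling", "urban", "25-34"},
    {"cycling", "running", "25-34"},
]

# Hypothetical third-party profiles that are not yet customers.
third_party_profiles = {
    "t1": {"cycling", "urban", "35-44"},
    "t2": {"golf", "rural", "55-64"},
    "t3": {"running", "cycling", "25-34", "urban"},
}

lookalikes = modeled_audience(seed, third_party_profiles)
print(lookalikes)
```

Raising the threshold narrows the modeled audience to closer matches, while lowering it broadens reach at the cost of relevance.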

Third-party data uses

The specific uses of third-party data will be dependent on your business structure and monetization methods, but will usually contribute towards two key goals: growing your audience and enhancing your first-party data.

Third-party data can reveal new demographics that may find your product valuable. By finding audiences that are demographically similar to your existing users, you can tailor marketing campaigns and enhance your product to attract them.

By combining third-party data with your first-party data you can multiply the value of both. Your first-party data can be enriched using third-party data (for example, you may find additional information about your customers, such as their hobbies or location), and potentially useful patterns in third-party data can be confirmed using first-party data (for example, verifying the accuracy of third-party data against known information before applying it in expensive marketing campaigns).
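
The two directions described above, verifying third-party data against what you already know and then enriching your own records, could be sketched as follows. The record layout, the shared customer keys, and the field names are hypothetical; in practice the join key would usually be a pseudonymized identifier.

```python
# Hypothetical first-party records keyed by a shared customer key.
first_party = {
    "c1": {"city": "Austin", "plan": "pro"},
    "c2": {"city": "Denver", "plan": "free"},
    "c3": {"city": "Boston", "plan": "pro"},
}

# Hypothetical third-party records using the same key.
third_party = {
    "c1": {"city": "Austin", "hobby": "cycling"},
    "c2": {"city": "Dallas", "hobby": "cooking"},
    "c4": {"city": "Miami", "hobby": "golf"},
}


def agreement_rate(first, third, field):
    """Share of overlapping records where both sources agree on `field`."""
    shared = [
        key for key in first
        if key in third and field in first[key] and field in third[key]
    ]
    if not shared:
        return 0.0
    matches = sum(first[key][field] == third[key][field] for key in shared)
    return matches / len(shared)


def enrich(first, third, new_fields):
    """Copy only `new_fields` from third-party records.
    Existing first-party values are never overwritten."""
    enriched = {}
    for key, record in first.items():
        merged = dict(record)
        for name in new_fields:
            if name in third.get(key, {}) and name not in merged:
                merged[name] = third[key][name]
        enriched[key] = merged
    return enriched


# Verify first: how often does the vendor's "city" match what we already know?
rate = agreement_rate(first_party, third_party, "city")

# Then enrich with attributes we do not collect ourselves.
enriched = enrich(first_party, third_party, ["hobby"])
```

A low agreement rate on fields you can check is a reasonable signal to distrust the fields you cannot check before spending on a campaign built on them.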

Identifying the data you require

Before you can seek and obtain data from third parties, you need to know what you want. With clear goals set, you can identify and source the right data to achieve them.

Before you go looking for appropriate third-party data, it pays to see what first-party data you already have or are able to collect. It is advisable to collect as much first-party data as possible for potential future use cases. Data that you could collect from your in-house processes and your customers, but do not, is a wasted resource.

If your first-party data can be used to realize your goals, third-party data is not required. First-party data is both cheaper and more reliable than data sourced from third parties, so it should be your primary source whenever possible.

In addition to being comparatively expensive to obtain and maintain, the value of third-party data depreciates over time — you are usually paying for a data set that is only current for a certain moment and will quickly become stale.

Obtaining data from third parties

Once you’ve decided on the data you require, you can start to figure out where it can be acquired. Acquiring data directly from known third parties is often the most efficient method, followed by sourcing the data from data brokers and marketplaces.

Directly sourcing third-party data

You may already have a source for the data you require — a marketing company you work with, or another business that you cooperate with, that is willing to trade information with you.

When working with third parties, even if they are known, a suitable data sharing agreement must be in place. Using data cleanrooms is also recommended when directly sourcing data from third parties.

Data brokers and data marketplaces

Data brokers sell third-party data. They collect user data, categorize and anonymize it, then package it for sale in bulk to those targeting a certain demographic. Data brokers will usually obtain their data from one of the following:

  • Demographic and behavior data collected from cookies and analytics
    • For example, Nielsen collects and provides data organized into over 20,000 audience segments, including demographic and behavioral data
  • User-generated data gathered via web scraping from social networks using bots
    • Social Searcher scrapes popular social media websites for mentions of specific terms so that you can monitor how your brand is perceived online
  • Data collected from public sources like government registries
    • Voter registers/electoral rolls and company registers like Companies House are a common source of publicly available information on individuals and businesses
  • Other third-party data that has been acquired by the data broker from their own third parties, including demand-side platforms (DSPs), product suppliers and manufacturers, and media agencies and digital marketing systems
    • Companies like OnAudience provide aggregated audience data organized into billions of user profiles

Data brokers traditionally operated on their own platforms, but now data marketplaces are emerging as a centralized location for sourcing data from brokers in a compliant manner. Data marketplaces simplify the process of finding the data you need, purchasing it, and using it safely.

Snowflake's Marketplace is one example of a data marketplace that does all of the above. It allows you to purchase third-party data and move it directly into your Snowflake environment for immediate use.

Legal compliance

Data is becoming increasingly regulated. The EU’s General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA) are the most prominent privacy laws, and other jurisdictions are implementing their own laws that regulate user data.

Privacy regulations such as the GDPR distinguish between data controllers and data processors, and your legal responsibilities differ based on which category you fall into.

Data controllers are the entities that determine how and when data is to be used, assess the associated risks, and require that data processors take the necessary measures to protect user data. Data processors are the entities that do the actual processing, usually on behalf of the controller, ensuring that the protective measures defined by the controller are met in line with regulation.

If you are receiving third-party data, you are most likely acting as a data processor, whereas if you are sharing your own data with a third party, you will be acting as the controller. In some scenarios, you may be acting as both.

Due to the differences in regulation depending on your users’ location, and the changes that may occur to them over time, it is important to use data sharing tools with built-in compliance. Using compliant tools will allow you to more easily meet your legal responsibilities whether acting as the data controller or data processor.

It is your responsibility to ensure that you handle data in a manner that complies with the regulations that cover the users from whom the data was collected. Failure to comply with these laws leaves you open to fines or liable for damages. Your reputation is at stake, too, as security incidents have become popular news items.

Data sharing agreements

Data sharing agreements lay out exactly what data will be shared, how the data will be handled, and the applications it will be used for. These agreements must be in place both to comply with regulation and to confirm that both the parties sharing and receiving data are fully aware of what is being shared and their responsibilities.

Data sharing agreements should be as detailed as possible and include a clear identification of the parties involved, an inventory of the data being shared, and the lawful basis for the sharing. Additional legal requirements will also need to be met depending on the location of your users, so it is best to consult with a legal authority on the contents and validity of your agreement.

Data sharing agreements should be regularly reviewed and updated. Customer data management (CDM) platforms can assist with this process by keeping track of what kinds of customer data are stored. This helps you assess the risks of sharing the data prior to sharing it, and make sure the data sharing agreements cover only the intended usage of each kind of data.

Sharing your own data

You may wish to mutually share your first-party data with someone else or collaborate on jointly-acquired third-party data — for example, you may partner with another business to share data and target your audiences with coordinated offers.

When sharing your first-party data, verify that the party that you are sharing with is compliant and that your customers are informed. As evidence of compliance, the parties that you exchange data with should hold a relevant security or privacy certification, for example SOC 2 or ISO 27001, or other certification from an accredited body that covers the data laws applicable to your use case.

If you are re-sharing third-party data, make sure that you have permission to do so. Data sharing agreements should be in place in all cases to enshrine the use case and legal basis for the sharing.

Data cleanrooms

Data cleanrooms allow you to share and collaborate on data from multiple sources while respecting users’ privacy and remaining compliant with privacy regulations. They provide a secure environment where data is anonymized and aggregated from first- and third-party sources.

By using data cleanrooms, you can match your first-party data with third-party data to confirm trends and build a better picture of your audience without having to worry about leaking PII. When working with others, you can identify the meaning behind the data and safely leverage your first-party data while extracting the maximum value for all parties contributing data to a project.
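
To make the matching idea concrete, here is a minimal sketch of the kind of logic a cleanroom applies, not any vendor's actual implementation. It assumes both parties pseudonymize email addresses with a key agreed between them, and that only aggregate counts above a minimum group size are released. The key, the threshold, and the segment names are all hypothetical.

```python
import hashlib
import hmac
from collections import Counter

SHARED_KEY = b"example-key-agreed-out-of-band"  # hypothetical shared secret
MIN_GROUP_SIZE = 2  # hypothetical threshold: smaller groups are suppressed


def pseudonymize(email: str) -> str:
    """Replace an email with a keyed hash so raw PII is never exchanged."""
    normalized = email.strip().lower().encode("utf-8")
    return hmac.new(SHARED_KEY, normalized, hashlib.sha256).hexdigest()


def aggregate_overlap(our_segments: dict, their_tokens: set) -> dict:
    """Count matched users per segment and release only large-enough groups."""
    counts = Counter(
        segment for token, segment in our_segments.items() if token in their_tokens
    )
    return {segment: n for segment, n in counts.items() if n >= MIN_GROUP_SIZE}


# Our first-party data: pseudonymized user -> audience segment.
ours = {
    pseudonymize("ann@example.com"): "cyclists",
    pseudonymize("Bob@example.com "): "cyclists",
    pseudonymize("cy@example.com"): "runners",
}

# The other party's pseudonymized user list.
theirs = {
    pseudonymize(email)
    for email in ["ann@example.com", "bob@example.com", "cy@example.com", "dee@example.com"]
}

result = aggregate_overlap(ours, theirs)
print(result)
```

Only the "cyclists" count is released here; the single matched runner is suppressed because a group of one could identify an individual.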

Data sharing risks

There are risks involved whether you are sharing data with or receiving shared data from third parties. In both cases, you are broadening your exposure to data leaks, breaches, and misuse.

The primary risks involved are:

  • Leaks and breaches: data is either exposed unintentionally or stolen, leaving you liable if preventative measures were not taken or the breach was not disclosed to the affected parties.
  • Lack of traceability: when processing data, it can become mixed and the provenance can be lost, resulting in data being used for purposes its owners did not consent to.
  • Low standards: the party you share data with may not be as careful as you are or comply with regulation or your data agreement. Depending on data regulation and your relationship with your users and the third party, you may be liable for these mistakes.
  • Loss of control: once data leaves your hands, you can no longer govern how it is used. If you share data with someone with low standards, it may be leaked, sold on, or otherwise misused.

These risks can never be completely eliminated, and data privacy regulations usually recognize this. To remain compliant, all reasonable measures must be taken to protect customer data and ensure its accuracy.

Reducing risk — third-party data sharing best practices

When working with data supplied by others, store and process it strictly according to the data sharing agreements you have established, as the supplier may have its own legal obligations that are covered in the agreement.

By de-identifying and anonymizing PII before sharing your data, you reduce the risk of sensitive information being further shared or leaked, and can potentially free the data from the constraints of privacy regulations. You should also track your data as far as possible, at least until it leaves your control, so that you can watch for potential misuse.
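
One way to de-identify records before sharing is sketched below, under stated assumptions: direct identifiers are dropped, the customer ID is replaced with a keyed hash, and age and postcode are generalized. The field names and the secret key are hypothetical. Note that keyed hashing is pseudonymization rather than full anonymization, so the output may still count as personal data under laws such as the GDPR.

```python
import hashlib
import hmac

SECRET_KEY = b"keep-this-key-private"  # hypothetical; never share it with the recipient
DIRECT_IDENTIFIERS = {"name", "email", "phone"}
TRANSFORMED_FIELDS = {"customer_id", "age", "postcode"}


def age_band(age: int) -> str:
    """Generalize an exact age into a ten-year band, e.g. 34 -> '30-39'."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"


def deidentify(record: dict) -> dict:
    """Return a copy of `record` that is safer to share."""
    # Keep only fields that are neither direct identifiers nor transformed below.
    out = {
        key: value
        for key, value in record.items()
        if key not in DIRECT_IDENTIFIERS and key not in TRANSFORMED_FIELDS
    }
    # A stable reference lets you trace the record without exposing the real ID.
    digest = hmac.new(SECRET_KEY, str(record["customer_id"]).encode("utf-8"), hashlib.sha256)
    out["customer_ref"] = digest.hexdigest()[:16]
    # Generalize quasi-identifiers that could re-identify someone in combination.
    out["age_band"] = age_band(record["age"])
    out["postcode_area"] = record["postcode"][:3]
    return out


customer = {
    "customer_id": 1042,
    "name": "Ann Example",
    "email": "ann@example.com",
    "phone": "555-0100",
    "age": 34,
    "postcode": "78701",
    "plan": "pro",
}

shared = deidentify(customer)
print(shared)
```

Because the reference is derived with a key that stays with you, you can still map shared records back to your own customers if misuse is reported, while the recipient cannot.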

Mitigating the risks from data handling processes

Potential avenues for data misuse can be reduced by clearly defining the processes involved and properly recording user consent:

  • Regularly audit your security and data policies and practices, and communicate the risks with your team.
  • Verify that the third parties you work with are reputable and can demonstrate that they have user consent for the data sharing practices in place.
  • Ensure that all stored data is properly identified and labeled and trace all usage, especially when mixing with other data. This will make it easier to check that third-party data from various sources is always handled according to the data agreements in place.
  • Choose your friends wisely — only share data with trusted parties and services with a good reputation for data security.
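
The labeling and tracing practice from the list above could look something like the following sketch. The `Agreement` and `LabeledDataset` types are hypothetical names invented for this example; the idea is that mixed data inherits the agreements of every source, and a purpose is allowed only when all of those agreements permit it.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Agreement:
    """Hypothetical summary of one data sharing agreement."""
    source: str
    allowed_purposes: frozenset


@dataclass
class LabeledDataset:
    """A dataset that carries the agreements of every source mixed into it."""
    name: str
    agreements: list
    usage_log: list = field(default_factory=list)

    def permitted(self, purpose: str) -> bool:
        # Unlabeled data is refused; otherwise every source must allow the purpose.
        return bool(self.agreements) and all(
            purpose in agreement.allowed_purposes for agreement in self.agreements
        )

    def use(self, purpose: str, actor: str) -> None:
        allowed = self.permitted(purpose)
        # Record every attempt, including refused ones, so usage can be traced.
        self.usage_log.append((actor, purpose, allowed))
        if not allowed:
            raise PermissionError(f"{self.name}: '{purpose}' is not covered by all agreements")


def mix(name: str, *datasets: LabeledDataset) -> LabeledDataset:
    """Combine datasets while keeping the labels of every input."""
    return LabeledDataset(name, [a for d in datasets for a in d.agreements])


crm = LabeledDataset("crm", [Agreement("first-party", frozenset({"analytics", "marketing"}))])
broker = LabeledDataset("broker", [Agreement("broker-x", frozenset({"analytics"}))])
combined = mix("crm+broker", crm, broker)

combined.use("analytics", actor="data-team")  # allowed by both agreements
try:
    combined.use("marketing", actor="growth-team")  # refused: broker-x does not allow it
except PermissionError as err:
    print(err)
```

Taking the intersection of permitted purposes is a deliberately conservative choice: once sources are mixed, the strictest agreement governs the combined data.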

Protecting data by choosing compliant tools

When using services that store or process your data, make certain that you are fully aware of how your data will be treated. You may be inadvertently consenting to the further sharing or use of data you upload based on the platform's terms and conditions. Additionally, something as innocuous as using a cloud storage provider could be considered sharing or transferring that data in contravention of privacy laws.

Check that the online platforms you use to store, process, and share data meet the regulations for the data you handle, and that they are hosted in a jurisdiction that data can be legally transferred into and out of. The data platforms you use should have a good history and transparent data handling practices.

Third-party data and CDPs

Customer data platforms (CDPs) consume data from both first- and third-party sources. This data can then be processed and stored to give your business insights into existing and potential new audiences to drive growth.

When choosing a CDP, ensure that the supplier provides a detailed data sharing agreement that conforms with your use case, local regulations, and your users’ expectations and any agreements they have made with you.

Your CDP should encourage data accountability and traceability by logging all interactions with data, so that any misuse can be quickly identified and stopped. This will mean that the third-party data shared with you is protected, and the data you share is high quality.

For more information visit: What is a CDP?

Further reading

This article explained the concepts behind sharing data with third parties, and what you need to do to remain compliant while sending and receiving shared user data. For more information on customer data, and how (and why) it needs to be protected, see our other learning center articles.
