By RudderStack Team

How to Load data from Stripe to MS SQL Server

This post shows you how to load your data from Stripe into MS SQL Server. If you want analytics-ready data without the manual hassle, you can integrate Stripe with MS SQL Server using RudderStack and focus on what matters: getting value out of your business data.

Extract your Stripe data

Stripe is an API-first product: a unified set of APIs and tools that instantly enables businesses to accept and manage online payments. Its web API follows RESTful principles and relies on as many built-in HTTP features as possible, which makes it accessible to off-the-shelf HTTP clients. Responses are serialized as JSON.

Stripe provides two types of keys for authentication, one for test mode and one for live mode. With the test mode key, you can exercise every aspect of the API without touching your real data. Keep in mind that calls to the Stripe API must be made over HTTPS for security reasons; plain HTTP calls will fail, and so will unauthenticated calls. So do not forget to use your test mode key when you want to experiment with the API.
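
As a minimal sketch of the authentication scheme described above, the following Python snippet builds (but does not send) an authenticated HTTPS request using only the standard library. It assumes the key is passed as the HTTP Basic username with an empty password, which is what the "-u key:" form of curl does. The function name build_stripe_request and the placeholder key are made up for illustration.

```python
import base64
import urllib.request

API_BASE = "https://api.stripe.com"  # always HTTPS; plain HTTP calls will fail


def build_stripe_request(path: str, api_key: str) -> urllib.request.Request:
    """Build a GET request for a Stripe endpoint, authenticated with HTTP Basic."""
    # The key is the username and the password is empty, hence the trailing colon.
    token = base64.b64encode(f"{api_key}:".encode("ascii")).decode("ascii")
    request = urllib.request.Request(API_BASE + path, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    return request


# Hypothetical placeholder key; use your own test mode key when experimenting.
req = build_stripe_request("/v1/charges?limit=3", "sk_test_placeholder")
```

Passing the resulting object to urllib.request.urlopen would perform the call; keeping the request construction separate makes it easy to inspect the headers before anything goes over the wire.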

Currently, the Stripe API is built around the following ten core resources:

  • Balance – An object that represents your Stripe balance.
  • Charges – To charge a credit or debit card, you create a charge.
  • Customers – Customer objects allow you to perform recurring charges and track multiple charges that are associated with the same customer.
  • Disputes – A dispute occurs when a customer questions your charge with their bank or credit card company.
  • Events – Events are Stripe’s way of letting you know when something interesting happens in your account.
  • File uploads – There are various times when you’ll want to upload files to Stripe (for example, when uploading dispute evidence).
  • Refunds – Refund objects allow you to refund a charge that has previously been created but not yet refunded.
  • Tokens – Tokens can be created with your publishable API key.
  • Transfers – A transfer is created when Stripe sends you money or when you initiate a transfer to a bank account.
  • Transfer reversals – A previously created transfer can be reversed if it has not yet been paid out.

All of the above resources support CRUD operations through HTTP verbs on their associated endpoints. Because it is a web API, you can access it with tools like curl or Postman, or with your favorite HTTP client for the language or framework of your choice. Some options are the following:

  • Apache HttpClient for Java
  • Spray-client for Scala
  • Hyper for Rust
  • Ruby rest-client
  • Python http.client
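
Whichever client you pick, the mapping from CRUD operations to HTTP verbs and paths stays the same. The sketch below illustrates it for the customers resource in Python. The table and the endpoint_for helper are hypothetical names introduced here, and not every resource supports every operation, so treat it as an illustration of the pattern rather than a complete description of the API.

```python
from typing import Optional, Tuple

# Operation -> (HTTP verb, whether the path needs an object id).
# Note that the Stripe API uses POST both to create and to update an object.
CRUD_VERBS = {
    "create": ("POST", False),
    "retrieve": ("GET", True),
    "update": ("POST", True),
    "delete": ("DELETE", True),
    "list": ("GET", False),
}


def endpoint_for(resource: str, operation: str,
                 object_id: Optional[str] = None) -> Tuple[str, str]:
    """Return the (verb, path) pair for an operation on a resource."""
    verb, needs_id = CRUD_VERBS[operation]
    if needs_id and not object_id:
        raise ValueError(f"operation '{operation}' requires an object id")
    path = f"/v1/{resource}"
    if needs_id:
        path += f"/{object_id}"
    return verb, path


list_call = endpoint_for("customers", "list")
retrieve_call = endpoint_for("customers", "retrieve", "cus_7hy0yQ55razJrh")
```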

There is also a large number of libraries that wrap the Stripe API and offer an easier way to interact with it, both community-developed and official ones from Stripe. For more information, check the libraries section of the API documentation.

Stripe, like any other service you might be using, has (hopefully) figured out the optimal data model for its own operations. But when we fetch data from such services, we usually want to answer questions or do things that fall outside the context in which they operate, which makes their models sub-optimal for our analytic needs.

For this reason, we should always keep in mind that when we work with data coming from external services, we need to remodel it and bring it to the right form for our needs.

So let’s assume that we want to perform some churn analysis for our company, and to do that, we need customer data that indicates when customers have canceled their subscriptions. For this, we’ll have to request the customer objects that Stripe holds for our company from the /v1/customers endpoint. All list endpoints are queried in the same way; for example, the following command lists up to three charges:

curl "https://api.stripe.com/v1/charges?limit=3" \
  -u sk_test_BQokikJOvBiI2HlWgH4olfQ2:

A typical response looks like the following (truncated here):

{
  "object": "list",
  "url": "/v1/charges",
  "has_more": false,
  "data": [
    {
      "id": "ch_17SY5f2eZvKYlo2CiPfbfz4a",
      "object": "charge",
      "amount": 500,
      "amount_refunded": 0,
      "application_fee": null,
      "balance_transaction": "txn_17KGyT2eZvKYlo2CoIQ1KPB1",
      "captured": true,
      "created": 1452627963,
      "currency": "usd",
      "customer": null,
      "description": "thedude@grepinnovation.com Account Credit",
      "destination": null,
      "dispute": null,
      "failure_code": null,
      "failure_message": null,
      "fraud_details": {
      },
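
The has_more flag in a list response is what drives pagination: when it is true, the next page is requested by passing the id of the last object received as the starting_after parameter. Here is a minimal Python sketch of that loop. The fetch_page callable is a hypothetical stand-in for whatever HTTP client you use, and the in-memory fake_fetch backend exists only so that the example runs without network access.

```python
from typing import Callable, Dict, Iterator

_FAKE_CHARGES = [{"id": f"ch_{i}"} for i in range(5)]  # made-up data


def iterate_list(fetch_page: Callable[[Dict[str, str]], dict],
                 limit: int = 100) -> Iterator[dict]:
    """Yield every object of a list endpoint, following the cursor."""
    params = {"limit": str(limit)}
    while True:
        page = fetch_page(dict(params))
        items = page.get("data", [])
        yield from items
        if not page.get("has_more") or not items:
            break
        # The cursor is the id of the last object on the current page.
        params["starting_after"] = items[-1]["id"]


def fake_fetch(params: Dict[str, str]) -> dict:
    """Hypothetical backend that mimics the shape of a list response."""
    limit = int(params["limit"])
    start = 0
    after = params.get("starting_after")
    if after is not None:
        start = next(i for i, c in enumerate(_FAKE_CHARGES) if c["id"] == after) + 1
    chunk = _FAKE_CHARGES[start:start + limit]
    return {"object": "list", "has_more": start + limit < len(_FAKE_CHARGES), "data": chunk}


all_ids = [charge["id"] for charge in iterate_list(fake_fetch, limit=2)]
```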

Inside the customer object there’s a list of subscription objects that look like the following JSON document:

{
  "id": "sub_7hy2fgATDfYnJS",
  "object": "subscription",
  "application_fee_percent": null,
  "cancel_at_period_end": false,
  "canceled_at": null,
  "current_period_end": 1455306419,
  "current_period_start": 1452628019,
  "customer": "cus_7hy0yQ55razJrh",
  "discount": null,
  "ended_at": null,
  "metadata": {},
  "plan": {
    "id": "gold2132",
    "object": "plan",
    "amount": 2000,
    "created": 1386249594,
    "currency": "usd",
    "interval": "month",
    "interval_count": 1,
    "livemode": false,
    "metadata": {},
    "name": "Gold ",
    "statement_descriptor": null,
    "trial_period_days": null
  },
  "quantity": 1,
  "start": 1452628019,
  "status": "active",
  "tax_percent": null,
  "trial_end": null,
  "trial_start": null
}

These objects, together with part of the customer object, contain the information we need to perform churn analysis. Of course, we’ll have to extract the fields we need, map them to the schema of our data warehouse, and then load the data into it following the instructions in this post.
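
To make the extract-and-map step concrete, here is a minimal Python sketch that flattens a subscription object like the one above into a single row. The choice of columns and the function names are assumptions made for illustration; your warehouse schema will likely differ. Timestamps are treated as Unix epoch seconds and converted to UTC datetimes, and null values are kept as None.

```python
from datetime import datetime, timezone
from typing import Optional


def to_utc(ts: Optional[int]) -> Optional[datetime]:
    """Convert epoch seconds to an aware UTC datetime; null stays None."""
    return None if ts is None else datetime.fromtimestamp(ts, tz=timezone.utc)


def subscription_to_row(sub: dict) -> dict:
    """Pick the fields a churn analysis needs and flatten the nested plan."""
    plan = sub.get("plan") or {}
    return {
        "subscription_id": sub["id"],
        "customer_id": sub["customer"],
        "status": sub["status"],
        "plan_id": plan.get("id"),
        "plan_amount": plan.get("amount"),  # smallest currency unit, e.g. cents
        "plan_interval": plan.get("interval"),
        "quantity": sub.get("quantity"),
        "started_at": to_utc(sub.get("start")),
        "canceled_at": to_utc(sub.get("canceled_at")),
        "ended_at": to_utc(sub.get("ended_at")),
    }


# A subset of the subscription document shown above.
sample = {
    "id": "sub_7hy2fgATDfYnJS",
    "customer": "cus_7hy0yQ55razJrh",
    "status": "active",
    "canceled_at": None,
    "ended_at": None,
    "start": 1452628019,
    "quantity": 1,
    "plan": {"id": "gold2132", "amount": 2000, "interval": "month"},
}
row = subscription_to_row(sample)
```

A subscription counts toward churn once canceled_at or ended_at is set, so keeping those two columns nullable lets the analysis distinguish active customers from churned ones.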

About Stripe

Stripe is a developer-focused platform for accepting payments online. It aims to expand internet commerce by making it easy to process transactions and manage an online business; in its own words, it wants to increase the GDP of the internet. Stripe treats enabling more transactions as a problem rooted in code and design, not finance, and it is built for developers, makers, and creators. When Stripe started, it was becoming easier on almost every front to build and launch an online business, but payments remained dominated by clunky legacy players. It seemed clear that there should be a developer-focused, instant-setup payment platform that would scale to any size. Stripe launched in September 2011.

Stripe now processes billions of dollars a year for thousands of businesses, from newly-launched start-ups to Fortune 500 companies. Since Stripe powers so many new businesses, it’s a snapshot of how the internet is changing; many users are in categories that barely existed five years ago.

Stream data using API from Stripe to MS SQL Server

It is also possible to set up a streaming data infrastructure that collects your Stripe data and pushes it into your data warehouse as it is generated. This can be achieved with the webhooks functionality that Stripe supports: you register an endpoint for the events you care about, and every time one of them occurs, Stripe pushes a message to your webhook.


For more information about that, check the API documentation on webhooks.
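
Before a webhook message is written to the warehouse, it is worth checking that it really came from Stripe. The following Python sketch verifies the Stripe-Signature header using only the standard library, assuming the documented scheme: an HMAC-SHA256 over the timestamp, a dot, and the raw request body, keyed with the endpoint's signing secret. The function name, the placeholder secret, and the sample event are assumptions made for illustration; Stripe's official libraries ship their own verification helpers, which are usually the better choice in production.

```python
import hashlib
import hmac
import time
from typing import Optional


def verify_stripe_signature(payload: bytes, header: str, secret: str,
                            tolerance: int = 300,
                            now: Optional[float] = None) -> bool:
    """Return True if the header carries a valid, recent v1 signature."""
    timestamp = None
    signatures = []
    for item in header.split(","):
        key, _, value = item.strip().partition("=")
        if key == "t":
            timestamp = value
        elif key == "v1":
            signatures.append(value)
    if timestamp is None or not signatures:
        return False
    try:
        sent_at = int(timestamp)
    except ValueError:
        return False
    # Reject old messages to limit replay attacks.
    current = time.time() if now is None else now
    if abs(current - sent_at) > tolerance:
        return False
    signed_payload = timestamp.encode("utf-8") + b"." + payload
    expected = hmac.new(secret.encode("utf-8"), signed_payload, hashlib.sha256).hexdigest()
    return any(hmac.compare_digest(expected.encode("utf-8"), s.encode("utf-8"))
               for s in signatures)


# Build a signed message locally to exercise the function (all values made up).
secret = "whsec_placeholder"
body = b'{"id": "evt_example", "type": "customer.subscription.deleted"}'
ts = 1700000000
sig = hmac.new(secret.encode(), f"{ts}.".encode() + body, hashlib.sha256).hexdigest()
header = f"t={ts},v1={sig}"
is_valid = verify_stripe_signature(body, header, secret, now=ts)
```

The check must run on the raw request body, before any JSON parsing, because re-serializing the document can change its bytes and invalidate the signature.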

Load Data from Stripe to MS SQL Server

As a feature-rich and mature product, MS SQL Server offers a large and diverse set of methods for loading data into a database. One way of importing data is the SQL Server Import and Export Wizard, which lets you bulk load data from a number of supported data sources through a visual interface.

You can import data from another SQL Server, an Oracle database, flat files, an Access data source, PostgreSQL, MySQL, and Azure Blob Storage. If you are using a managed version of MS SQL Server on Azure, you should definitely consider the Azure Blob Storage connection.

In this way, you will be loading data as Blobs on Azure, and your MS SQL Server database will sync with it through the Import and Export Wizard.

Another way of importing bulk data into SQL Server, both on Azure and on-premises, is the bcp utility, a command-line tool built specifically for bulk loading data into and unloading data from an MS SQL Server database.

Finally, and for compatibility reasons, especially if you are managing databases from different vendors, you can use BULK INSERT SQL statements.
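
As a rough sketch of the BULK INSERT route, the Python snippet below serializes rows to CSV text and builds the matching T-SQL statement. The table name, the file path, and both helper functions are hypothetical. It also assumes that values contain no commas or quotes and that the file is written to a location the SQL Server instance itself can read.

```python
import csv
import io
from typing import Iterable, List


def rows_to_csv(rows: Iterable[dict], columns: List[str]) -> str:
    """Serialize rows to CSV text with a header line and LF line endings."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    for row in rows:
        # None is written as an empty field.
        writer.writerow(["" if row.get(col) is None else row[col] for col in columns])
    return buffer.getvalue()


def bulk_insert_statement(table: str, server_path: str) -> str:
    """Build a BULK INSERT statement for a file laid out as above."""
    escaped = server_path.replace("'", "''")  # escape quotes in the path literal
    return (
        f"BULK INSERT {table}\n"
        f"FROM '{escaped}'\n"
        # FIRSTROW = 2 skips the header; 0x0a is the LF row terminator.
        "WITH (FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR = '0x0a');"
    )


rows = [{"subscription_id": "sub_7hy2fgATDfYnJS", "status": "active", "canceled_at": None}]
csv_text = rows_to_csv(rows, ["subscription_id", "status", "canceled_at"])
statement = bulk_insert_statement("dbo.stripe_subscriptions", r"C:\data\subscriptions.csv")
```

The same file can also be loaded with bcp; the statement string is only generated here and would still have to be executed through your database driver of choice.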

About Microsoft SQL Server

Microsoft SQL Server is one of the oldest and most mature database systems. Its first version was introduced in 1989, and Microsoft has consistently supported and extended the product ever since.

So, it’s no surprise that Microsoft SQL Server has one of the richest feature sets among the currently available database systems.

SQL Server is delivered in different editions, or flavors. The most notable is the Enterprise edition, which can manage databases as large as 524 petabytes while utilizing up to 12 terabytes of memory and 640 processors. A free, scaled-down edition is called Express.

There is also a Business Intelligence edition that focuses on use cases where BI is performed on-premises. This edition is actually a bundle of different products, including the core database system together with other Microsoft products that can be used for BI purposes like visualization and data management.

In addition, there are also plenty of specialized versions of the database like the Compact edition that can be used on small devices and of course the Azure version, which is the cloud-based edition of SQL Server. Microsoft SQL Server incorporates a modular architecture that can extend the database with additional services. Replication services can extend the database to a cluster version and thus help with scaling and fault tolerance.

The SQL Server Analysis Services augment the database with OLAP and data mining capabilities, making the database ideal for the workloads that we care about in this guide.

As with most other databases, you can also use standard INSERT statements to add data row by row directly to a table. This is the most basic and straightforward way of adding data to a table, but it doesn’t scale well to larger data sets.

So, for bulk loads, you should consider one of the bulk loading methods described earlier.

Updating your Stripe data on MS SQL Server

As you generate more data on Stripe, you will need to update your older data in MS SQL Server. This includes new records as well as updates to older records that have, for any reason, changed on Stripe.

You will need to periodically check Stripe for new data and repeat the process described previously, updating your existing data where needed. Updating an existing row in a SQL Server table is achieved with UPDATE statements.
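
When a batch mixes new and changed records, one alternative to issuing separate INSERT and UPDATE statements is a single T-SQL MERGE statement keyed on the Stripe object id. The Python sketch below only generates such a statement with "?" parameter placeholders, as used by ODBC-style drivers. The table and column names are hypothetical, and identifiers are interpolated as-is, so they must come from trusted configuration and never from user input.

```python
from typing import List


def merge_statement(table: str, key: str, columns: List[str]) -> str:
    """Build a parameterized MERGE (upsert) statement for a single row."""
    non_key = [c for c in columns if c != key]
    if key not in columns or not non_key:
        raise ValueError("columns must include the key and at least one other column")
    placeholders = ", ".join("?" for _ in columns)
    column_list = ", ".join(columns)
    update_set = ", ".join(f"t.{c} = s.{c}" for c in non_key)
    insert_values = ", ".join(f"s.{c}" for c in columns)
    return (
        f"MERGE INTO {table} AS t\n"
        f"USING (VALUES ({placeholders})) AS s ({column_list})\n"
        f"ON t.{key} = s.{key}\n"
        f"WHEN MATCHED THEN UPDATE SET {update_set}\n"
        f"WHEN NOT MATCHED THEN INSERT ({column_list}) VALUES ({insert_values});"
    )


sql = merge_statement(
    "dbo.stripe_subscriptions",
    "subscription_id",
    ["subscription_id", "status", "canceled_at"],
)
```

The row values are then bound to the placeholders in column order when the statement is executed, which keeps the data itself out of the SQL text.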

Another issue you need to take care of is the identification and removal of duplicate records in your database. Duplicates might be introduced either because Stripe does not have a mechanism to identify new and updated records or because of errors in your data pipelines.
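
One simple safeguard is to deduplicate each batch on the Stripe object id before it reaches the database. The Python sketch below keeps the last version seen for every id, under the assumption that records arrive ordered from oldest to newest; the function name and sample data are made up for illustration.

```python
from typing import Iterable, List


def deduplicate(records: Iterable[dict], key: str = "id") -> List[dict]:
    """Keep one record per key; later records overwrite earlier ones."""
    latest = {}
    for record in records:
        latest[record[key]] = record
    return list(latest.values())


batch = [
    {"id": "sub_a", "status": "active"},
    {"id": "sub_b", "status": "active"},
    {"id": "sub_a", "status": "canceled"},  # newer version of sub_a
]
unique = deduplicate(batch)
```

This only removes duplicates within a batch; duplicates across batches still have to be handled in the database, for example by keying writes on the object id.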

In general, ensuring the quality of the data inserted into your database is a big and difficult issue. MS SQL Server features like transactions can help tremendously, although they do not solve the problem in the general case.

The best way to load data out of Stripe to MS SQL Server

So far, we have just scratched the surface of what you can do with MS SQL Server and how to load data into it. Things can get even more complicated if you want to integrate data coming from different sources.

Are you striving to achieve results right now?

Instead of writing, hosting, and maintaining a flexible data infrastructure yourself, use RudderStack, which can handle everything automatically for you.

RudderStack, with one click, integrates with sources or services, creates analytics-ready data, and syncs your Stripe data to MS SQL Server right away.

Get started today

Start building smarter customer data pipelines today with RudderStack. Our solutions engineering team is here to help.