Shopify extract source

Sync data from Shopify to your warehouse destination via RudderStack.

danger
RudderStack’s Cloud Extract (ETL) product will be sunset on December 1, 2025. See this release note for more details.

Shopify is a popular ecommerce platform that gives you all the tools to start, run, and grow your business effectively. It offers online retailers a variety of services around digital payments, marketing, product shipping, customer engagement and retention, and more.

This document guides you in setting up Shopify as a source in RudderStack. Once configured, RudderStack automatically ingests your Shopify data and routes it to your specified data warehouse destination.

warning
Cloud Extract sources can send data only to a data warehouse destination.

Getting started

To set up Shopify as a source in RudderStack, follow these steps:

  1. Log in to your RudderStack dashboard.
  2. Go to Sources > New source > Cloud Extract and select Shopify from the list of sources.
  3. Assign a name to your source and click Continue.

Connection settings

Enter the following connection credentials to set up the Shopify source:

Shopify credentials
  • Shopify Store: Enter the name of your Shopify store from the URL. For example, if your URL is https://NAME.myshopify.com, then the store name would be NAME.
  • Replication Start Date: Select the date from when RudderStack should ingest your Shopify data. RudderStack will not replicate any data before this date.
  • API Password: Enter the Admin API access token, which you can obtain by following these steps:
  1. Log in to your Shopify account.
  2. In the left sidebar, go to Apps > App and sales channel settings.
  3. Click Develop apps > Create an app.
  4. Enter a name for the app.
  5. Select the relevant developer in the App developer dropdown menu.
  6. Click Create app > Create Custom App.
  7. From the app configuration screen, select the event version as 2022-10.
  8. Configure the following Admin API scopes:
    • read_customers
    • read_draft_orders
    • read_inventory
    • read_locations
    • read_orders
    • read_price_rules
    • read_products
    • read_shopify_payments_payouts
    • read_content
    • read_fulfillments
    • read_assigned_fulfillment_orders
    • read_merchant_managed_fulfillment_orders
    • read_third_party_fulfillment_orders
    • read_discounts
    • read_script_tags
    • read_themes
    • read_files
    • read_publications
    • read_online_store_pages
    • read_product_feeds
  9. Click Install App.
  10. You can see the Admin API access token in the API credentials tab.
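
Before entering the credentials in RudderStack, you may want to confirm that the store name and token are correct and that every scope listed above was granted. The following is a minimal sketch, not part of RudderStack: the function names are hypothetical, and it assumes Shopify's Admin API `access_scopes.json` endpoint with the `X-Shopify-Access-Token` header. It only builds the request and inspects a response body, so you can send it with any HTTP client.

```python
from urllib.parse import urlparse
from urllib.request import Request

# Scopes copied from the list above.
REQUIRED_SCOPES = {
    "read_customers", "read_draft_orders", "read_inventory", "read_locations",
    "read_orders", "read_price_rules", "read_products",
    "read_shopify_payments_payouts", "read_content", "read_fulfillments",
    "read_assigned_fulfillment_orders", "read_merchant_managed_fulfillment_orders",
    "read_third_party_fulfillment_orders", "read_discounts", "read_script_tags",
    "read_themes", "read_files", "read_publications", "read_online_store_pages",
    "read_product_feeds",
}

SUFFIX = ".myshopify.com"


def store_name_from_url(url: str) -> str:
    """Return NAME from https://NAME.myshopify.com (hostname is lowercased)."""
    host = urlparse(url).hostname or ""
    store = host[: -len(SUFFIX)] if host.endswith(SUFFIX) else ""
    if not store:
        raise ValueError(f"not a myshopify.com store URL: {url!r}")
    return store


def build_scope_check_request(store: str, token: str) -> Request:
    """Build (but do not send) a request that lists the token's granted scopes."""
    url = f"https://{store}{SUFFIX}/admin/oauth/access_scopes.json"
    return Request(url, headers={"X-Shopify-Access-Token": token})


def missing_scopes(response_body: dict) -> set:
    """Compare the scopes in a decoded JSON response with REQUIRED_SCOPES."""
    granted = {s["handle"] for s in response_body.get("access_scopes", [])}
    return REQUIRED_SCOPES - granted
```

To use it, pass the result of `build_scope_check_request` to `urllib.request.urlopen`, decode the JSON body, and call `missing_scopes` on it. An empty set means the token carries every scope in the list.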

Destination settings

The following settings specify how RudderStack sends the data ingested from Shopify to the connected warehouse destination:

  • Table prefix: RudderStack uses this prefix to name the tables it creates in your data warehouse and loads your Shopify data into them.
warning
RudderStack does not add separator characters like - or _ to the prefix by default. If you want one, include it when you set the prefix.
  • Schedule Settings: RudderStack gives you three options to ingest the data from Shopify:
    • Basic: Runs the syncs at the specified time interval.
    • CRON: Runs the syncs based on the user-defined CRON expression.
    • Manual: Syncs run only when you trigger them manually.
info
For more information on the schedule types, refer to the Common Settings guide.

Selecting the data to import

You can choose the Shopify data you want to ingest by selecting the required resources. The following table lists the sync type and the Shopify API endpoint for each resource. All resources use id as the primary key.

| Resource | Sync type | Shopify API endpoint |
| --- | --- | --- |
| Articles | Incremental | /articles.json |
| MetafieldArticles | Incremental | /articles/[object_id]/metafields.json |
| Blogs | Incremental | /blogs.json |
| MetafieldBlogs | Incremental | /blogs/[object_id]/metafields.json |
| Customers | Incremental | /customers.json |
| MetafieldCustomers | Incremental | /customers/[object_id]/metafields.json |
| Orders | Incremental | /orders.json |
| MetafieldOrders | Incremental | /orders/[object_id]/metafields.json |
| DraftOrders | Incremental | /draft_orders.json |
| MetafieldDraftOrders | Incremental | /draft_orders/[object_id]/metafields.json |
| Products | Incremental | /products.json |
| MetafieldProducts | Incremental | /products/[object_id]/metafields.json |
| ProductImages | Incremental | /products/{product_id}/images.json |
| MetafieldProductImages | Incremental | /product_images/{image_id}/metafields.json |
| ProductVariants | Incremental | products/{product_id}/variants.json |
| MetafieldProductVariants | Incremental | variants/{variant_id}/metafields.json |
| AbandonedCheckouts | Incremental | checkouts.json |
| CustomCollections | Incremental | custom_collections.json |
| SmartCollections | Incremental | smart_collections |
| MetafieldSmartCollections | Incremental | /smart_collections/[object_id]/metafields.json |
| Collects | Incremental | collects.json |
| Collections | Incremental | collections/{collection_id}.json |
| MetafieldCollections | Incremental | collections/{object_id}/metafields.json |
| BalanceTransactions | Incremental | shopify_payments/balance/transactions.json |
| OrderRefunds (Sub resource) | Incremental | orders/{order_id}/refunds.json |
| OrderRisks | Incremental | orders/{order_id}/risks.json |
| Transactions | Incremental | orders/{order_id}/transactions.json |
| TenderTransactions | Incremental | tender_transactions.json |
| Pages | Incremental | pages.json |
| MetafieldPages | Incremental | /pages/[object_id]/metafields.json |
| PriceRules | Incremental | price_rules.json |
| DiscountCodes | Incremental | price_rules/{price_rule_id}/discount_codes.json |
| Locations | Full Refresh | locations.json |
| MetafieldLocations | Incremental | /locations/[object_id]/metafields.json |
| InventoryLevels | Incremental | locations/{location_id}/inventory_levels.json |
| InventoryItems | Incremental | inventory_items.json?ids={ids} |
| FulfillmentOrders | Incremental | orders/{order_id}/fulfillment_orders.json |
| Fulfillments | Incremental | orders/{order_id}/fulfillments.json |
| Shop | Full Refresh | shop.json |
| MetafieldShops | Incremental | metafields.json |
info
For more information on the Full Refresh and Incremental sync modes, refer to the Common Settings guide.
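
Many of the endpoints above are nested under a parent record ID, written either as {order_id} or as [object_id]. As a rough illustration of how such a template maps to a concrete Shopify Admin REST URL, here is a small sketch. The `expand_endpoint` helper is hypothetical and is not how RudderStack itself is known to build requests; the API version 2022-10 is assumed to match the version selected during app setup.

```python
import re

API_VERSION = "2022-10"  # assumption: same version chosen in the app configuration


def expand_endpoint(store: str, template: str, **ids) -> str:
    """Fill the ID placeholders of an endpoint template and return a full URL.

    Accepts both placeholder styles used in the table ([object_id] and
    {object_id}) and templates with or without a leading slash.
    """
    # Rewrite [name] as {name} so str.format can fill both styles.
    path = re.sub(r"\[(\w+)\]", r"{\1}", template).lstrip("/")
    base = f"https://{store}.myshopify.com/admin/api/{API_VERSION}/"
    # str.format raises KeyError if a required ID is not supplied.
    return base + path.format(**ids)


# Example: refunds are fetched per order, so an order ID is required.
print(expand_endpoint("my-store", "orders/{order_id}/refunds.json", order_id=7))
```

Resources whose endpoint contains a parent ID can only be fetched after the parent records are known. For example, refunds need the list of order IDs first.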

Shopify is now configured as a source. RudderStack will start ingesting data from Shopify as per your specified schedule and frequency.

You can further connect this source to your data warehouse by clicking Add Destination.

success
Select Use Existing Destination if you already have a data warehouse destination configured in RudderStack. To configure a data warehouse destination from scratch, select Create New Destination.

FAQ

Is it possible to have multiple Cloud Extract sources writing to the same schema?

Yes, it is.

RudderStack associates a table prefix with every Cloud Extract source writing to a warehouse schema. This way, multiple Cloud Extract sources can write to the same schema with different table prefixes.

How does RudderStack count the events for Cloud Extract sources?

RudderStack counts the number of records returned by the source APIs when queried during each sync. It considers each record as an event.

How does RudderStack set the table name for the data sent via Cloud Extract sources?

RudderStack sets the table name for each resource you sync to the warehouse by appending rudder_ and the resource name to the Table prefix you set while configuring your Cloud Extract source in the dashboard.

For example, if you set test_ as the Table prefix in the dashboard, RudderStack sets the table name as test_rudder_<resource_name>, where <resource_name> is the name of the resource you are syncing (for example, contacts, messages, etc.).

warning
Note that RudderStack does not add the character _ to the prefix by default. Hence, you need to specify it while setting the prefix.
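
The naming rule described above can be sketched in a few lines. The `warehouse_table_name` function is a hypothetical illustration of the documented pattern, not RudderStack code, and it assumes the resource name is used exactly as given.

```python
def warehouse_table_name(prefix: str, resource_name: str) -> str:
    """Return <prefix>rudder_<resource_name>; no separator is added to the prefix."""
    return f"{prefix}rudder_{resource_name}"


# With a trailing underscore in the prefix:
print(warehouse_table_name("test_", "contacts"))  # test_rudder_contacts

# Without one, the prefix runs straight into rudder_:
print(warehouse_table_name("test", "contacts"))   # testrudder_contacts
```

The second call shows why the separator must be part of the prefix you enter in the dashboard.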
