The RudderStack warehouse table schemas are fully compatible with Segment. You can start routing new events to your existing warehouse tables through RudderStack, without losing any historical data.
This guide lists the steps and best practices for switching over your warehouse destinations from Segment to RudderStack.
Because backend services can be upgraded predictably and reliably, switching over server-side sources is simpler than switching over client-side sources.
Create a new warehouse destination and set the namespace to be the same as the schema that Segment is writing to. RudderStack will then write to the same set of tables as Segment.
Follow these steps:
- Switch all the server-side clients to route your event data to RudderStack
- Any events that are pending at Segment will be routed into your warehouse
- RudderStack will then start routing new events into your warehouse
If both Segment and RudderStack try to write to the same tables at the same time, it could result in Serializable isolation errors in some warehouses like Redshift. This is an intermittent issue, and the write should succeed on retry.
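If you run your own jobs against the same tables during the switchover, the retry behavior described above can be sketched as follows. This is only an illustration: `SerializationConflict` and `run_with_retry` are hypothetical names, and a real job would catch whatever exception its warehouse driver raises for a serializable isolation error.

```python
import time


class SerializationConflict(Exception):
    """Hypothetical stand-in for a driver's serializable isolation error."""


def run_with_retry(operation, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Run `operation`, retrying on SerializationConflict with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except SerializationConflict:
            if attempt == max_attempts:
                raise  # give up after the final attempt
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

The `sleep` parameter is injectable so the backoff can be exercised without waiting in tests.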
Some clients may still be running an older version of your application (e.g. Android / iOS) and sending events to Segment.
Follow the steps below to migrate to RudderStack while storing the event data in the same tables as Segment.
- Create a new warehouse destination and set the namespace to be the same as the schema that Segment is writing to. RudderStack will then write to the same set of tables as Segment.
- Create a new source of type Segment. This is to collect events from users who are still sending events to Segment.
- Copy the webhook URL. Replace <DATA_PLANE_URL> with your data plane URL.
- In Segment, create a new webhook destination on the source that your data warehouse is connected to.
- Configure the webhook URL in the webhook settings.
- Enable the webhook destination once the sync intervals are configured, as described in Step 4 below.
Some important points to note here:
- We want to capture all events in RudderStack and load them after the final switchover.
- Configure RudderStack's warehouse destination so that it starts syncing your data after Segment's warehouse load completes. For example, if Segment finishes loading the latest batch at 10 PM, RudderStack's warehouse load should start after that (say, at 10:30 PM or 11 PM).
If both Segment and RudderStack try to write to the same tables at the same time, it could result in Serializable isolation errors in some warehouses like Redshift. This is an intermittent issue, and should succeed after retrying.
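The scheduling guidance above amounts to simple time arithmetic. The sketch below shows one way to work out a safe start time. The function name, the 30-minute buffer, and the example date are assumptions for illustration, not RudderStack settings.

```python
from datetime import datetime, timedelta


def rudderstack_sync_start(segment_load_end, buffer_minutes=30):
    """Return a start time that falls after Segment's load has finished."""
    return segment_load_end + timedelta(minutes=buffer_minutes)


# Segment finishes loading the latest batch at 10 PM (example date).
segment_done = datetime(2024, 1, 15, 22, 0)
start = rudderstack_sync_start(segment_done)
print(start.strftime("%H:%M"))  # 22:30
```

A larger buffer lowers the chance that both tools write to the same tables at once, at the cost of slightly later data.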
- Once Segment's latest warehouse load is complete, you can disable Segment's warehouse destination. RudderStack will already have the events from the webhook, and they will be de-duplicated when uploaded to the warehouse.
- RudderStack will keep loading new events as per the configured schedules.
- Once all clients are migrated over to RudderStack, you can disable the webhook destination in Segment.
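To illustrate why events that arrive through both Segment's last load and the webhook do not end up as duplicate rows, here is a minimal sketch of de-duplication by event ID. It is not RudderStack's actual implementation: the `messageId` key and the `dedupe_events` helper are assumptions made for this example.

```python
def dedupe_events(events, key="messageId"):
    """Keep the first occurrence of each event, identified by `key`."""
    seen = set()
    unique = []
    for event in events:
        event_id = event[key]
        if event_id in seen:
            continue  # already present, e.g. loaded earlier by Segment
        seen.add(event_id)
        unique.append(event)
    return unique


already_loaded = [{"messageId": "a1", "event": "Signup"}]
from_webhook = [
    {"messageId": "a1", "event": "Signup"},    # overlaps with Segment's load
    {"messageId": "b2", "event": "Purchase"},  # new event
]
result = dedupe_events(already_loaded + from_webhook)
print([e["messageId"] for e in result])  # ['a1', 'b2']
```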
If you come across any issues while migrating your warehouse destinations from Segment to RudderStack, please feel free to contact us. You can also start a conversation in our Slack community; we will be happy to talk to you!