April 21, 2021
Back in 2018, Google announced that loading speed would be a factor for Google Search and Google Ads. This started a big conversation among marketeers around performance. Performance has always been important, especially in eCommerce contexts, but it became an urgent issue when Google raised the stakes.
As companies like Google and Facebook have set the standard for instant, frictionless app and website performance, loading speed has become one of the most important factors of any internet experience. If your load time pushes past 2 seconds, many users simply won’t stay on your page.
Thankfully, top companies increasingly treat site speed as an engineering issue, especially when load time translates directly to revenue.
While engineering teams might not be able to avoid every SDK marketing wants, there are modern tools that allow engineering to provide marketing with the data they need without sacrificing performance.
One of the main factors that impacts the performance of a web page is the third-party libraries it uses. There are two ways that these libraries slow load times:
- Libraries have to be loaded to the browser. If the libraries have a considerable size, we will have to wait.
- Libraries perform tasks. In many cases, they have to communicate over the network to perform those tasks, which adds latency.
There are also other factors. For example, libraries might have bugs or lack an optimized implementation, both common issues in the lifecycle of any software product. Though understandable, bugs and slow code can directly impact revenue if a site loads too slowly.
For products like RudderStack, things are even more complicated. Here are the two primary complexities related to performance:
- Destination Integrations Impact Library Size
- Variations in User-Defined JSON Payloads
Specifically, RudderStack users define arbitrary JSON documents sent over the network when a specific change on the DOM is detected. As the DOM is affected by user actions, some metadata is captured and included in the payload. The library then makes sure that the JSON documents are delivered in the right order to the server.
Towards a Solution
Because of the above two conditions, it’s very hard to offer a consistent experience across all library uses or reason about performance before the SDK is deployed in production.
Here’s how we did it.
The two problems we mentioned earlier require different tactics for improving speed. We will start by explaining what we did to handle the variable size of the library.
Reducing the Size of the Library
Solution: Only Loading What You Need
The high-level solution here was quite simple, thanks in large part to some good thinking by David and his team.
Let’s look at a quick example to see how this works in practice.
When a call such as rudderanalytics.requireIntegration("GA") is made, the SDK automatically fetches the Google Analytics instrumentation code (such as GAPlugin.js) that handles the transformation and mapping logic for the RudderStack event payload. This includes the call type and the API calls.
The core SDK maintains a queue for all the calls. When a requireIntegration call happens, any related calls are queued, and the SDK starts fetching the necessary library. When this is done, the SDK starts executing the enqueued calls.
This strategy accomplishes the following:
- Any native libraries are fetched to the client only when required (or, said another way, only if a call happens that requires the library).
- Simultaneously, any subsequent calls are queued, so there’s no blocking on the client and thus no added performance penalty.
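The load-on-demand queue described above can be sketched roughly as follows. This is a minimal illustration, not the RudderStack SDK's actual implementation: the names LazyIntegrations, Loader, and Handler are hypothetical, and the loader is assumed to report completion through a callback, much like a script tag's onload event.

```typescript
// Hypothetical types for illustration; not the RudderStack SDK's internal API.
type Handler = (event: string) => void;
// A loader fetches an integration's code and calls onReady once it is usable.
type Loader = (name: string, onReady: (handler: Handler) => void) => void;

class LazyIntegrations {
  private readonly handlers = new Map<string, Handler>();
  // A queue exists for an integration only while its code is still loading.
  private readonly queues = new Map<string, string[]>();
  private readonly load: Loader;

  constructor(load: Loader) {
    this.load = load;
  }

  // Start fetching the integration the first time it is required.
  requireIntegration(name: string): void {
    if (this.handlers.has(name) || this.queues.has(name)) return;
    this.queues.set(name, []);
    this.load(name, (handler) => {
      this.handlers.set(name, handler);
      const queued = this.queues.get(name) ?? [];
      this.queues.delete(name);
      // Replay the calls that arrived during loading, in their original order.
      for (const event of queued) handler(event);
    });
  }

  // Never blocks: the call runs immediately if the code is loaded,
  // otherwise it is queued until the loader reports completion.
  track(name: string, event: string): void {
    this.requireIntegration(name);
    const handler = this.handlers.get(name);
    if (handler) handler(event);
    else this.queues.get(name)?.push(event);
  }
}
```

In a browser, the loader might inject a script tag or use a dynamic import; the callback shape here is only one possible design, chosen to keep the sketch small.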
By implementing the above and removing unnecessary integration snippets, Loveholidays cut load times to almost a tenth of what they were with their previous solution (Segment’s analytics.js).
Controlled tests clocked load times between 20-60ms, down from 200-300ms.
XHR vs. sendBeacon or Sync vs. Async
To further improve the performance of our SDK, we decided to experiment with sendBeacon.
The Beacon API is used to send a small amount of data asynchronously to a web server. The main reason it was introduced was for analytics use cases, which perfectly fits many of our performance-conscious eCommerce customers.
The most important aspect, though, is Beacon’s asynchronous nature. We assumed that by using sendBeacon, we could reduce the input delay even further.
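As a rough sketch of the idea, the function below prefers sendBeacon and falls back to another transport when the browser cannot queue the data. The function name sendEvent, the BeaconCapable and Fallback types, and the example URL are all hypothetical; only navigator.sendBeacon itself is a real browser API, and it returns false when the data could not be queued.

```typescript
// Minimal structural type so the sketch does not depend on DOM typings.
// In a browser, the global `navigator` object satisfies this shape.
interface BeaconCapable {
  sendBeacon?: (url: string, data: string) => boolean;
}
// Hypothetical fallback transport, e.g. an asynchronous XHR.
type Fallback = (url: string, body: string) => void;

// Returns the transport that was used, purely for illustration.
function sendEvent(
  nav: BeaconCapable,
  url: string,
  payload: Record<string, unknown>,
  fallback: Fallback
): "beacon" | "fallback" {
  const body = JSON.stringify(payload);
  // sendBeacon hands the request to the browser and returns immediately.
  // A false return value means the browser refused to queue the data.
  if (typeof nav.sendBeacon === "function" && nav.sendBeacon(url, body)) {
    return "beacon";
  }
  fallback(url, body);
  return "fallback";
}

// Possible browser usage (the URL is a placeholder):
// sendEvent(navigator, "https://example.com/v1/track", { event: "Order Completed" },
//   (url, body) => {
//     const xhr = new XMLHttpRequest();
//     xhr.open("POST", url, true); // third argument true = asynchronous
//     xhr.send(body);
//   });
```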
The Details: Beacon vs. XHR
Running synchronous XHR calls for tracking can increase the first input delay (FID) of the page, reducing responsiveness as measured by tools like Lighthouse.
We worked with David and the Loveholidays team to A/B test Beacon vs. XHR for the RudderStack SDK. Spoiler alert: it worked. We saw the average FID drop from 200ms to 20ms using Beacon. Much of the improvement came from offloading to the batched, asynchronous calls supported by the new RudderStack SDK.
This is an actual chart from the Loveholidays test:
[Chart: results of the Loveholidays Beacon vs. XHR test]
While sending data asynchronously via Beacon might not be right for every company, it can drive material improvement for highly performance-sensitive use cases.
At RudderStack, our mission is to build the most performant customer data pipelines possible. We understand that whatever we do with our SDKs will affect the experience the end-user has, and for this reason, we take performance extremely seriously.
Of course, as engineers, there are still many things we are exploring to improve performance even further. One of our next experiments will be implementing the Beacon API together with a queue.
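One way such a combination could look is sketched below. This is a speculative illustration, not a description of what the SDK does or will do: the BatchQueue class, the Transport type, the batch size, and the { batch: [...] } payload shape are all assumptions.

```typescript
// Hypothetical transport: returns true if the data was accepted for delivery.
// In a browser this could wrap navigator.sendBeacon.
type Transport = (url: string, body: string) => boolean;

class BatchQueue {
  private buffer: object[] = [];
  private readonly url: string;
  private readonly transport: Transport;
  private readonly maxBatch: number;

  constructor(url: string, transport: Transport, maxBatch: number) {
    this.url = url;
    this.transport = transport;
    this.maxBatch = maxBatch;
  }

  // Buffer the event and flush once the batch is full.
  enqueue(event: object): void {
    this.buffer.push(event);
    if (this.buffer.length >= this.maxBatch) this.flush();
  }

  // Send everything buffered so far as a single request.
  flush(): boolean {
    if (this.buffer.length === 0) return true;
    const batch = this.buffer;
    this.buffer = [];
    const accepted = this.transport(this.url, JSON.stringify({ batch }));
    // If the transport refused the data, keep the events for a later attempt.
    if (!accepted) this.buffer = batch.concat(this.buffer);
    return accepted;
  }

  pending(): number {
    return this.buffer.length;
  }
}

// Possible browser usage (the URL is a placeholder):
// const queue = new BatchQueue("https://example.com/v1/batch",
//   (url, body) => navigator.sendBeacon(url, body), 10);
// document.addEventListener("visibilitychange", () => {
//   if (document.visibilityState === "hidden") queue.flush();
// });
```

Batching trades a small delivery delay for fewer requests, so flushing when the page becomes hidden helps avoid losing events that are still buffered.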
Performance is fascinating and increasingly critical, so stay tuned for updates from the engineering team on how we’re ensuring our SDKs are as optimized as possible.