How to load data from AdWords to PostgreSQL
Extract your data from Google AdWords
The AdWords API allows applications to interact directly with the AdWords platform. You can build applications to more efficiently manage large or complex AdWords accounts and campaigns. Contrary to the rest of the APIs that we have covered in this series of posts, the Google AdWords API is implemented only using the SOAP protocol and it doesn’t offer a RESTful web implementation.
Nevertheless, Google offers a number of client libraries that you can use with your language or framework of choice, and it officially supports clients for several languages.
The AdWords API is quite a complex product that exposes a lot of functionality to the user, ranging from reporting to bidding and programmatic advertising. As the scope of this post is the extraction of data, with the aim of loading it into a data warehouse for further analysis, we’ll focus only on that part of the AdWords API.
There are many ways of interacting with the data that the AdWords API exposes. One way is to link your Google Analytics and AdWords accounts and enrich your analytics data with data coming from AdWords. Another way, if you have the luxury of a Google Analytics Premium account, is to load your data directly into Google BigQuery. From there, you can either do your analysis in BigQuery or export your data to another data warehouse.
We’ll assume that you do not have a Google Analytics Premium account (to be honest, if you had one you probably wouldn’t be reading this post) but that you still want to extract data and load it into your own data warehouse. To do that, we’ll use the report-related functionality of the AdWords API. The API supports a large number of reports, and you can change the granularity of the results by passing specific parameters. You can define the data you want a report to return in two different ways:
- Using an XML-based report definition.
- Using an AWQL-based report definition.
If you want to use an XML-based report definition, you have to include a parameter named “__rdxml” that contains an XML-serialised definition of the report you want to retrieve.
<reportDefinition xmlns="https://adwords.google.com/api/adwords/cm/v201509">
  <selector>
    <fields>CampaignId</fields>
    <fields>Id</fields>
    <fields>Impressions</fields>
    <fields>Clicks</fields>
    <fields>Cost</fields>
    <predicates>
      <field>Status</field>
      <operator>IN</operator>
      <values>ENABLED</values>
      <values>PAUSED</values>
    </predicates>
  </selector>
  <reportName>Custom Adgroup Performance Report</reportName>
  <reportType>ADGROUP_PERFORMANCE_REPORT</reportType>
  <dateRangeType>LAST_7_DAYS</dateRangeType>
  <downloadFormat>CSV</downloadFormat>
</reportDefinition>
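A definition like this does not have to be written by hand. The following is a minimal sketch in Python that builds the same XML with the standard library and places it in the “__rdxml” parameter of a form-encoded body. The function name build_report_definition and its arguments are hypothetical, chosen only for illustration, and the request URL and authentication headers are deliberately left out.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace copied from the report definition shown in this post.
NS = "https://adwords.google.com/api/adwords/cm/v201509"


def build_report_definition(fields, statuses, report_name, report_type,
                            date_range="LAST_7_DAYS", download_format="CSV"):
    """Return a report definition serialised as an XML string."""
    root = ET.Element("reportDefinition", {"xmlns": NS})
    selector = ET.SubElement(root, "selector")
    for name in fields:
        ET.SubElement(selector, "fields").text = name
    # A single predicate that filters on the Status field.
    predicates = ET.SubElement(selector, "predicates")
    ET.SubElement(predicates, "field").text = "Status"
    ET.SubElement(predicates, "operator").text = "IN"
    for status in statuses:
        ET.SubElement(predicates, "values").text = status
    ET.SubElement(root, "reportName").text = report_name
    ET.SubElement(root, "reportType").text = report_type
    ET.SubElement(root, "dateRangeType").text = date_range
    ET.SubElement(root, "downloadFormat").text = download_format
    return ET.tostring(root, encoding="unicode")


rdxml = build_report_definition(
    fields=["CampaignId", "Id", "Impressions", "Clicks", "Cost"],
    statuses=["ENABLED", "PAUSED"],
    report_name="Custom Adgroup Performance Report",
    report_type="ADGROUP_PERFORMANCE_REPORT",
)
# Form-encoded body that carries the definition in the "__rdxml" parameter.
post_body = urlencode({"__rdxml": rdxml})
```

Building the document with an XML library rather than string concatenation means field names and values are escaped correctly.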
AWQL is a SQL-like language for performing queries against the most common AdWords API services. Any service with a query method is supported; the queryable fields for each service are listed in the API documentation.
For comparison with the XML definition above, this is how a query is expressed in AWQL:
CampaignPage p = campaignService.query("SELECT Id, Name WHERE Status = 'ENABLED' ORDER BY Name DESC LIMIT 0,50");
As we can see, Google’s API for AdWords has a very expressive way of defining what data we want to get from it and various options to do that. If you feel more comfortable with SQL-like languages you can use AWQL, or if you prefer XML you can use that for defining your reports.
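To make the AWQL option concrete for reports, here is a small Python sketch that composes a report query equivalent to the XML definition shown earlier. The helper build_awql is hypothetical. The parameter names “__rdquery” and “__fmt” are stated from memory of the AdWords reporting documentation and should be treated as assumptions to verify against the API version you use.

```python
from urllib.parse import urlencode


def build_awql(fields, report_type, status_in, during):
    """Compose an AWQL report query from its parts."""
    return (
        f"SELECT {', '.join(fields)} "
        f"FROM {report_type} "
        f"WHERE Status IN [{', '.join(status_in)}] "
        f"DURING {during}"
    )


query = build_awql(
    fields=["CampaignId", "Id", "Impressions", "Clicks", "Cost"],
    report_type="ADGROUP_PERFORMANCE_REPORT",
    status_in=["ENABLED", "PAUSED"],
    during="LAST_7_DAYS",
)
# Assumed parameter names: "__rdquery" for the query, "__fmt" for the format.
post_body = urlencode({"__rdquery": query, "__fmt": "CSV"})
```

The whole report fits in one readable line of AWQL, which is the main practical difference from the XML form.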
Regarding the format of the results you get from the API, there are also multiple options supported.
- CSVFOREXCEL – Microsoft Excel compatible format
- CSV – comma-separated output format
- TSV – tab-separated output format
- XML – XML output format
- GZIPPED_CSV – compressed CSV
- GZIPPED_XML – compressed XML
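Whichever format you choose, the response has to be turned into rows before it can be loaded anywhere. The sketch below parses a CSV report, optionally gzip-compressed, into a list of dictionaries using only the Python standard library. The sample payload is made up for illustration, and the sketch assumes the first line of the payload is the column header; real downloads may contain extra title or total rows that need to be skipped first.

```python
import csv
import gzip
import io


def parse_report(payload, gzipped=False):
    """Turn raw report bytes into a list of dicts keyed by column name."""
    if gzipped:
        payload = gzip.decompress(payload)
    text = payload.decode("utf-8")
    # Assumes the first line holds the column names.
    return list(csv.DictReader(io.StringIO(text)))


# Made-up sample standing in for a downloaded, compressed CSV report.
sample = gzip.compress(
    b"CampaignId,Clicks,Cost\n101,12,3400000\n102,7,1250000\n"
)
rows = parse_report(sample, gzipped=True)
```

All values come back as strings, so converting them to numeric or date types is still a separate step before loading.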
Google AdWords exposes a very rich API that lets you get very granular data about your account activity and use it for analytics and reporting purposes. This richness comes at a price, though: a large number of complex resources that have to be handled through an equally complex protocol.
About Google AdWords
Google AdWords is an online advertising service from Google for businesses that want to display ads on Google and its advertising network. At its core, AdWords is a Real-Time Bidding system where advertisers compete to display their advertising material to web users who are using Google products such as its search engine. Programmatic and instantaneous auctions are performed, similar to how financial markets operate. Among the benefits of AdWords are:
- Pay-per-click – advertisers pay only for ads that have been clicked by the user
- Any budget – You can start with any budget, although you have to be aware of the Real-Time Bidding nature of AdWords, which means that the effectiveness of your campaigns is linked to what your competitors are also willing to pay.
- Reach – you can reach billions of people worldwide.
Additionally, AdWords, just like every other product from Google, has excellent support and exposes a rich ecosystem of tools and APIs that you can use to get the most out of the service.
Google AdWords Data Preparation for PostgreSQL
To populate a PostgreSQL database instance with data, first, you need to have a well-defined data model or schema that describes the data. As a relational database, PostgreSQL organizes data around tables.
Each table is a collection of columns, each with a predefined data type such as INTEGER or VARCHAR. PostgreSQL, like any other SQL database, supports a wide range of data types.
A typical strategy for loading data from a source like AdWords into a PostgreSQL database is to create a schema in which each API endpoint maps to a table. Each key inside an AdWords API endpoint response should be mapped to a column of that table, and you should ensure the right conversion to a PostgreSQL-compatible data type.
For example, if an endpoint from AdWords returns a value as a String, you should convert it into a VARCHAR with a predefined maximum size or into the TEXT data type. Tables can then be created in your database using the CREATE TABLE SQL statement.
Of course, as the data types returned by the AdWords API might change, you will need to adapt your database tables accordingly; there’s no such thing as automatic data typecasting.
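The mapping step can be sketched in a few lines of Python. The type names on the left of TYPE_MAP, the table name, and the column names are all hypothetical and only illustrate the idea; check the actual field types of the report you download and adjust the mapping to match.

```python
# Hypothetical mapping from source field types to PostgreSQL column types.
TYPE_MAP = {
    "String": "TEXT",
    "Long": "BIGINT",
    "Integer": "INTEGER",
    "Double": "DOUBLE PRECISION",
    "Money": "BIGINT",  # assumed here to arrive as an integer amount
    "Date": "DATE",
}


def create_table_sql(table, columns):
    """Build a CREATE TABLE statement from (name, source_type) pairs.

    Table and column names are inserted as-is, so they must come from
    trusted configuration, never from user input.
    """
    lines = []
    for name, source_type in columns:
        if source_type not in TYPE_MAP:
            raise ValueError(f"no PostgreSQL type mapped for {source_type!r}")
        lines.append(f"    {name} {TYPE_MAP[source_type]}")
    return f"CREATE TABLE IF NOT EXISTS {table} (\n" + ",\n".join(lines) + "\n);"


ddl = create_table_sql(
    "adgroup_performance",
    [("campaign_id", "Long"), ("adgroup_id", "Long"),
     ("impressions", "Long"), ("clicks", "Long"), ("cost", "Money")],
)
```

Failing loudly on an unmapped type is deliberate: a new or changed type in the source is noticed immediately rather than silently stored in the wrong column type.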
After you have a complete and well-defined data model or schema for PostgreSQL, you can move forward and start loading your data into the database.
PostgreSQL or simply Postgres is one of the most well-known, popular, and well-supported databases. It can be used for different workloads, from simple single-machine applications to large-scale data warehousing scenarios.
PostgreSQL is ACID-compliant, transactional, and has one of the richest feature sets; including materialized views, triggers, foreign keys, stored procedures and an architecture that encourages its extensibility.
This last characteristic in particular has made Postgres one of the most forked databases. Amazon Redshift is based on an earlier version of PostgreSQL, as are other database systems such as Citus Data and Greenplum Database.
All the above characteristics of PostgreSQL, a rich set of aggregation functions, the ability to define both simple and materialized views, the support for user-defined functions, and the ability to scale to pretty large datasets, make it an ideal database for analytics-related tasks.
Let’s see what it takes to populate and maintain a PostgreSQL database with data for analytics and business intelligence purposes.
Load data from AdWords to PostgreSQL
Once you have defined your schema and you have created your tables with the proper data types, you can start loading data into your database.
The most straightforward way to insert data into a PostgreSQL database is by creating and executing INSERT statements. With INSERT statements, you will be adding data row-by-row directly to a table. It is the most basic and straightforward way of adding data into a table but it doesn’t scale very well with larger data sets.
The preferred way to add larger datasets to a PostgreSQL database is the COPY command. COPY loads data from a file on a file system that is accessible to the PostgreSQL instance, so much larger datasets can be inserted into the database in less time.
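As a rough illustration, the Python sketch below writes rows to an in-memory CSV buffer and hands it to COPY. It uses the FROM STDIN variant of COPY, which streams the data over the client connection, and it assumes the cursor is a psycopg2 cursor, whose copy_expert(sql, file) method runs a COPY statement against a file-like object. The function name copy_rows, the table, and the columns are hypothetical.

```python
import csv
import io


def copy_rows(cursor, table, columns, rows):
    """Stream rows to PostgreSQL with COPY ... FROM STDIN.

    `cursor` is assumed to be a psycopg2 cursor; table and column names
    are inserted as-is and must come from trusted configuration.
    """
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    buffer.seek(0)  # rewind so COPY reads from the first row
    sql = f"COPY {table} ({', '.join(columns)}) FROM STDIN WITH (FORMAT csv)"
    cursor.copy_expert(sql, buffer)
    return sql
```

With a real connection, this would be called as copy_rows(conn.cursor(), "adgroup_performance", ["campaign_id", "clicks"], rows), followed by conn.commit().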
You should also consult the documentation of PostgreSQL on how to populate a database with data. It includes a number of very useful best practices on how to optimize the process of loading data into your PostgreSQL database.
COPY from a file requires access to a file system that the database server can read. Nowadays, with cloud-based, fully managed databases, getting direct access to a file system is not always possible. If this is the case and you cannot use a COPY statement, another option is to use PREPARE together with INSERT to end up with optimized, more performant INSERT queries.
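A minimal sketch of the PREPARE plus INSERT approach follows, in Python. It assumes a DB-API cursor that uses %s placeholders, as psycopg2 does. The statement name adwords_insert and the function name are hypothetical.

```python
def insert_prepared(cursor, table, columns, rows):
    """Insert rows through a server-side prepared statement.

    `cursor` is assumed to be a DB-API cursor with %s placeholders;
    table and column names must come from trusted configuration.
    """
    # PostgreSQL numbers prepared-statement parameters as $1, $2, ...
    params = ", ".join(f"${i}" for i in range(1, len(columns) + 1))
    cursor.execute(
        f"PREPARE adwords_insert AS "
        f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({params})"
    )
    placeholders = ", ".join(["%s"] * len(columns))
    for row in rows:
        # The statement is planned once and only executed per row.
        cursor.execute(f"EXECUTE adwords_insert ({placeholders})", row)
    cursor.execute("DEALLOCATE adwords_insert")
```

The INSERT is parsed and planned once by PREPARE, so each row only pays the cost of EXECUTE. Wrapping the loop in a single transaction usually helps further.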
Updating your AdWords data on PostgreSQL
As you will be generating more data on AdWords, you will need to update your older data on PostgreSQL. This includes new records together with updates to older records that for any reason have been updated on AdWords.
You will need to periodically check AdWords for new data and repeat the process that has been described previously while updating your currently available data if needed. Updating an already existing row on a PostgreSQL table is achieved by creating UPDATE statements.
Another issue that you need to take care of is the identification and removal of any duplicate records on your database. Either because AdWords does not have a mechanism to identify new and updated records or because of errors on your data pipelines, duplicate records might be introduced to your database.
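One common way to handle both updates and duplicates in a single step, offered here as an alternative to separate UPDATE statements, is PostgreSQL’s INSERT ... ON CONFLICT DO UPDATE, available since PostgreSQL 9.5. The Python sketch below only builds the statement; the helper name, table, and columns are hypothetical, and it assumes a unique constraint or primary key exists on the key columns.

```python
def upsert_sql(table, columns, key_columns):
    """Build an INSERT ... ON CONFLICT DO UPDATE statement.

    Requires a unique constraint on `key_columns`. Uses %s placeholders,
    as expected by psycopg2-style drivers.
    """
    updates = [c for c in columns if c not in key_columns]
    if not updates:
        raise ValueError("need at least one non-key column to update")
    placeholders = ", ".join(["%s"] * len(columns))
    # EXCLUDED refers to the row that was proposed for insertion.
    assignments = ", ".join(f"{c} = EXCLUDED.{c}" for c in updates)
    return (
        f"INSERT INTO {table} ({', '.join(columns)}) "
        f"VALUES ({placeholders}) "
        f"ON CONFLICT ({', '.join(key_columns)}) "
        f"DO UPDATE SET {assignments}"
    )


statement = upsert_sql(
    "adgroup_performance",
    ["adgroup_id", "report_date", "clicks", "cost"],
    ["adgroup_id", "report_date"],
)
```

Because a row with an existing key is updated rather than inserted again, re-running the same load does not create duplicates.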
In general, ensuring the quality of the data that is inserted into your database is a big and difficult issue. PostgreSQL features like transactions can help tremendously, although they do not solve the problem in the general case.
The best way to load data from AdWords to PostgreSQL
So far we have just scratched the surface of what you can do with PostgreSQL and how to load data into it. Things can get even more complicated if you want to integrate data coming from different sources.
Are you striving to achieve results right now?
Instead of writing, hosting, and maintaining a flexible data infrastructure, use RudderStack, which can handle everything automatically for you.
With one click, RudderStack integrates with sources or services, creates analytics-ready data, and syncs your AdWords data to PostgreSQL right away.
Don't want to go through the pain of direct integration? RudderStack's Google Ads integration makes it easy to send data from Google Ads to PostgreSQL.