One of the biggest downsides of using only Google Analytics to track your data for high-traffic websites is the automated sampling, where results are no longer fully-relevant for the entire data-set, but rather for a small portion of your audience. As a result, you have less accurate data and you cannot benefit from the full extent of data analysis.
In this article I’m going to cover the benefits of using BigQuery, one of the best complementary solutions for Google Analytics. BigQuery is a web service that can store terabytes of data and can make big data processing more accessible. The cloud data warehouse comes as a solution for managing not only your Google Analytics data, but also any other data sets that you need to store and analyze.
How do you connect to BigQuery?
You can connect to BigQuery through the BigQuery Web UI, the bq command line, BigQuery Connector for Excel or Hadoop Connector. BigQuery is also integrated with Google Drive, so you can save the results of a query from the BigQuery UI into Google Sheets or automatically create tables in BigQuery from the files on your Google Drive.
You can also use third party tools to connect to BigQuery through Tableau, R or ODBC. Tableau provides a connector to the cloud data warehouse in order to create insightful dashboards and to offer a creative visualization of the data stored in BigQuery.
How do you retrieve data from BigQuery?
To retrieve the data located in the cloud, you will need to run SQL queries to interrogate the data. BigQuery supports both standard SQL, as well as legacy SQL. In order to select the SQL dialect you want to run your query in, you can go to the Query Pane in the Web UI, click on Show Options and select the desired SQL version.
BigQuery or Google Analytics? Which one is better?
Obviously, there are pros & cons to both solutions, and the best tool for you depends on the data that is available to you, the analysis that you want to perform and the desired output.
There are two main reasons why BigQuery is preferred over Google Analytics. The first and most important one is Google Analytics sampling, where the results of a report are based on a subset of data, instead of the entire dataset.
This occurs when a report retrieves more than 100 million sessions for Google Analytics 360 users (and more than 500 thousand sessions for regular users). Moreover, the use of more complex segments and multiple dimensions in your report can also lead to sampling.
Sampling can be prevented by downloading an unsampled report from the Google Analytics interface, however this is available only in Google Analytics 360 accounts, there is a limit to the maximum number of records, and you will need to export the report in the desired file.
BigQuery is a much more practical solution to the sampling issue, due to the fact that it allows you to query unsampled data in a short amount of time and you can connect it and query it directly from several data visualization tools such as Tableau to create interactive dashboards.
This results in a more accurate reporting compared to the sampled Google Analytics output.
The second reason for which BigQuery is more advantageous to use compared to Google Analytics is customization. If you want to perform more advanced analyses, you might find that Google Analytics will not provide the sufficient bandwidth for you to do that. On the other hand, you can use SQL in BigQuery to create complex metrics & in-depth analysis.
However, there are some drawbacks to BigQuery. First of all, you would need to know SQL in order to retrieve data from the tool, and the complexity of the metrics & analysis that you can create will depend on your knowledge of the SQL language.
Secondly, in order to retrieve metrics correctly, you need to have an in-depth overview on how the data is structured into the BigQuery tables. In this regard, the first thing that you need to learn when working with BigQuery is how sessions and hits are organized in the BigQuery table schema.
How can you migrate to BigQuery?
To migrate from Google Analytics to BigQuery, you will need a Google Analytics 360 account. With BigQuery, there are associated costs with both data storing and data processing. With regards to the latter, you are charged for ad-hoc queries, however if you are using automated reporting tools (such as Tableau dashboards that are connected to BigQuery & updated regularly), the costs can increase faster.
All in all, BigQuery is a great solution if you are looking to avoid Google Analytics sampling, if you are planning to create more complex analyses that cannot be done using segments & custom reports in Google Analytics and if you have some basic SQL knowledge. BigQuery also offers some great resources to get you started, such as the BigQuery Export Schema that provides an overview on how the data coming from Google Analytics is structured into tables.
Sources and more in-depth reviews of BigQuery