Five Practical Reasons to Use the GA4 API Over the Native BigQuery Transfer

by Paul Cote on Feb 08, 2023

We are big fans of the native BigQuery export in Google Analytics 4, and we have plenty of good things to say about how transformative it is for the Google Analytics product. Standard properties can export up to one million events per day through the daily export, and there are options for same-day streaming as well as previous-day event data that loads on the following day. That said, the export is not the right fit for every situation, so we've put together a few reasons why you should consider using the GA4 API instead of the native BigQuery export.

Here are five practical reasons that will lead you to use the GA4 API (in no particular order):

  • Session and user metrics are consistent between platforms
  • Data is consistently populated at your desired time
  • Historic data can be backfilled for missing dates
  • Querying the data is easier
  • You do not have access to the BigQuery dataset

Session and User Metrics Are Consistent Between Platforms

If you have tried to build dashboards using the events data that GA4 streams to Google BigQuery, you have likely noticed that many of your metrics do not match one-for-one with what you are seeing in the GA4 admin view. The difference in your metrics built from the events table in BigQuery is likely due to the approximation algorithm Google uses to calculate some metrics, such as sessions and users.

You are probably thinking to yourself right now, "If the data in the GA4 admin view is approximated, then isn't the data in my Google BigQuery table more accurate?" The answer is a resounding yes. So why would we use the API if we know BigQuery is giving us the true values? In short, sometimes ignorance is bliss.

The data from the GA4 API is going to match the values Google displays in the admin view. Many users love to see their data match exactly what they see in the platform's native reports, and they will usually question the accuracy of the data if the numbers differ. From what we have seen, the values are generally within 1-2% of the values in Google BigQuery, and, from what we can tell, Google is not approximating conversion metrics. Sometimes it's just easier to give your stakeholders the same numbers they see in the admin view.
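
As a rough illustration, the sketch below builds a request for daily sessions and total users, the two metrics discussed above, using the runReport method of the GA4 Data API. It only assembles the URL and JSON body. The property ID is a placeholder, build_run_report_request is a helper name of our own, and authentication and the HTTP call itself are left out.

```python
import json

# runReport endpoint of the GA4 Data API (v1beta); the property ID is filled in below.
RUN_REPORT_URL = (
    "https://analyticsdata.googleapis.com/v1beta/properties/{property_id}:runReport"
)


def build_run_report_request(property_id, start_date, end_date):
    """Return (url, body) for a daily sessions/users report.

    Hypothetical helper, not part of any Google library.
    Dates are 'YYYY-MM-DD' strings.
    """
    body = {
        "dateRanges": [{"startDate": start_date, "endDate": end_date}],
        "dimensions": [{"name": "date"}],
        "metrics": [{"name": "sessions"}, {"name": "totalUsers"}],
    }
    return RUN_REPORT_URL.format(property_id=property_id), body


if __name__ == "__main__":
    # "123456789" is a placeholder property ID.
    url, body = build_run_report_request("123456789", "2023-02-01", "2023-02-07")
    print(url)
    print(json.dumps(body, indent=2))
```

The body would be sent as an authenticated POST to the printed URL. The rows that come back are the figures to compare against the native reports.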

Data Is Consistently Populated at Your Desired Time

You can't control when yesterday's data is populated in your BigQuery table from the native GA4 stream. This is okay if the reports you're sharing only need to be updated as of two days prior, but if your goal is to deliver reports with no more than 24 hours' latency in the data, then you'll need to consider the API. Sure, yesterday's data will load at some point during the following day, but if the time is unreliable or it's late in the work day, it's going to have limited value for your stakeholders.
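
With the API, the load time is whatever you schedule. Below is a minimal sketch of the date logic such a job might use, assuming you trigger it yourself (cron, a cloud scheduler, or an ETL tool) and want "yesterday" as seen in the property's reporting time zone. The function name and the 06:00 UTC schedule are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone


def previous_day(run_at, property_tz):
    """Return yesterday's date (YYYY-MM-DD) in the property's time zone.

    run_at must be a timezone-aware datetime; property_tz is any tzinfo.
    """
    if run_at.tzinfo is None:
        raise ValueError("run_at must be timezone-aware")
    local_now = run_at.astimezone(property_tz)
    return (local_now.date() - timedelta(days=1)).isoformat()


if __name__ == "__main__":
    # Hypothetical schedule: the job fires at 06:00 UTC every day.
    run_at = datetime(2023, 2, 8, 6, 0, tzinfo=timezone.utc)
    # A fixed UTC-5 offset stands in for the property's reporting time zone.
    reporting_tz = timezone(timedelta(hours=-5))
    print(previous_day(run_at, reporting_tz))
```

The returned date can be used as both the start and end date of the API request, so each scheduled run loads exactly one day.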

Historic Data Can Be Backfilled for Missing Dates

As many Google Analytics users transition to GA4, they will also begin to use Google Cloud Console (GCC) and configure the GA4 BigQuery link. Many users will activate the link well after they start collecting data, which is problematic because there is currently no way to load historic dates using the GA4 BigQuery transfer: the data only populates from the time the link is established. Thankfully, you can use the API to get around this problem. To backfill data for your missing dates, you will need to extract the data from the API and load it into BigQuery using an ETL (extract, transform, and load) application with a source integration for GA4.
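
The first step of a backfill is working out which dates are actually missing. The sketch below assumes you already have the set of dates present in your destination table and groups the gaps into contiguous ranges, each of which could become one API request. The function name and sample dates are made up for illustration.

```python
from datetime import date, timedelta


def missing_date_ranges(start, end, loaded):
    """Return contiguous (first_day, last_day) gaps between start and end, inclusive.

    loaded is a set of dates already present in the destination table.
    """
    ranges = []
    gap_start = None
    day = start
    while day <= end:
        if day not in loaded:
            # Open a new gap if we are not already inside one.
            if gap_start is None:
                gap_start = day
        elif gap_start is not None:
            # A loaded day closes the gap that ended the day before.
            ranges.append((gap_start, day - timedelta(days=1)))
            gap_start = None
        day += timedelta(days=1)
    if gap_start is not None:
        ranges.append((gap_start, end))
    return ranges


if __name__ == "__main__":
    loaded = {date(2023, 1, 1), date(2023, 1, 2), date(2023, 1, 6)}
    for first, last in missing_date_ranges(date(2023, 1, 1), date(2023, 1, 10), loaded):
        print(first.isoformat(), "to", last.isoformat())
```

Requesting one range at a time keeps each call small and makes it easy to retry a single failed range without reloading everything.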

Querying the Data Is Easier

The GA4 events table that is built into Google BigQuery is great, and it's an absolute dream for a data analyst. If you're not a full-time data analyst and are not familiar with the SQL needed to unnest records, you should consider using an ETL (extract, transform, and load) application, like Launchpad. Launchpad writes GA4 data as a flat table, so in most cases there is very little SQL, if any, required to begin building visualizations. If your primary goal is to avoid the token limitations in Google Looker Studio and you are looking for the most straightforward way to load the data into your dashboard, then Launchpad is the solution for you.
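
To see why a flat table is easier to work with, consider the shape of a single exported event. In the BigQuery export, event parameters arrive as a repeated record of key/value pairs, with the value stored in a typed field such as string_value or int_value. The sketch below flattens one such event, represented as a Python dict, into a single-level row. It is only an illustration of the idea, not a description of how Launchpad works internally, and the param_ prefix is our own naming choice.

```python
def flatten_event(event):
    """Flatten one exported GA4 event (as a dict) into a single-level dict.

    Assumes the BigQuery export layout, where event_params is a list of
    {"key": ..., "value": {"string_value": ..., "int_value": ..., ...}}.
    """
    flat = {k: v for k, v in event.items() if k != "event_params"}
    for param in event.get("event_params", []):
        value = param.get("value", {})
        # Take the first typed field that is populated.
        for field in ("string_value", "int_value", "float_value", "double_value"):
            if value.get(field) is not None:
                flat["param_" + param["key"]] = value[field]
                break
    return flat


if __name__ == "__main__":
    # Made-up sample event for illustration.
    sample = {
        "event_date": "20230207",
        "event_name": "page_view",
        "event_params": [
            {"key": "page_title", "value": {"string_value": "Home", "int_value": None}},
            {"key": "ga_session_id", "value": {"int_value": 1675790000}},
        ],
    }
    print(flatten_event(sample))
```

Each parameter becomes its own column, which is the shape most dashboard tools expect.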

You Do Not Have Access to the BigQuery Dataset

Many organizations, whether they realize it or not, are beginning to stream their GA4 data to a partner-controlled Google Cloud Console account. We've seen the same trend with many other Google product launches over the years (Google Ads, Google Analytics, and Google My Business, to name a few), where the account ownership begins with an agency or partner, and as time goes on, the business owner asks to take control. To avoid a difficult and potentially costly transfer process, we suggest that each business have its own Google Cloud Console project. Ownership of the project ensures that the business maintains control and that data is not disrupted as partners change. Ownership also gives the business the flexibility to share access with any partner or platform.

If you've ever worked with external agencies or partners and tried to get direct access to your data, then these excuses may sound familiar:

  • It's against their company policy
  • It would give you access to other customers' data
  • It would provide access to their intellectual property
  • It's not possible because of other platform connections
  • They cannot locate the owner of the admin account

If you have contracted with an organization to provide dashboards or visualizations and you are not able to access their GA4 Google BigQuery dataset because it's controlled by another vendor, you may need to consider using the GA4 API to access the data.

We're Here to Help

If you need any assistance navigating the transition to Google Analytics 4, then you've come to the right place. The data experts at Calibrate Analytics will be more than happy to guide you through our full-service platform, Launchpad, or create a custom solution specifically to fit your unique business needs. Don't hesitate to get in touch!

  • Paul Cote

    About the Author

    Paul is head of analytical products at Calibrate Analytics. He is responsible for creating digital analytical solutions that enable better business decisions. He has over 19 years of digital-focused leadership, along with extensive experience building analytics solutions that deliver the right insights.