In addition to the standard relational database method of one-to-one relationships within a record and it's fields, Google BigQuery also supports schemas with nested and repeated data. This allows BigQuery to store complex data structures and relationships
Make no mistake however, these are the most complex data types that you can find in BigQuery datasets!
An array is a data type supported in SQL, even outside of Google BigQuery. Within an array, each element must have the same data type and the order of values is preserved. A struct can contain elements of different data types, including arrays. In BigQuery, nested records are arrays of structs.
For analyses, working with SQL on nested data can be very performant because nested data with arrays basically means working on pre-joined tables. Querying them can be very efficient but a lot of analysts are unfamiliar with semi-structured, nested data and struggle to make use of its full potential.
Maintaining nested records eliminate the need for repeating data, creating additional subtables or using expensive joins during data analysis, resulting in better query performance and lower storage costs.
To extract information from a repeated field in BigQuery, you must use a more exotic pattern. This is normally done using the UNNEST function, which converts an array of values in a table into rows. These can then be joined to the original table to be queried
Conclusion
Using scoped aggregation over nested and repeated fields is one of BigQuery's most powerful features, which can often eliminate expensive joins in queries.
Luckily, Launchpad supports the automated creation and importing of data represented in nested records.
Launchpad is our powerful and easy to use one stop shop to transfer data from an array of domains to destinations of your choice. It helps customers easily access their data and transfer it to their preferred data warehouse where it can then be transformed, analyzed and visualized.