Sometime ago the BigQuery Team renamed its SQL dialect to Legacy SQL and introduced beta support of ANSI 2011 SQL named Standard SQL in the BigQuery world.
This made life easier for the huge army of Digital Analysts, Data Science specialists, SQL developers and surely for me. Except that now I have had to migrate all the queries written in the last couple of years to a new Standard SQL. While it is true that in BigQuery, Standard SQL dialect is still at a beta stage it is pretty usable and tools like Tableau force users to switch.
Migrating from Legacy SQL to Standard SQL in BigQuery is a very straightforward process and for every single query I was able to get exactly the same query result by only fixing the dialect syntax leaving semantical structure untouched. As a second stage in the process, I modified some queries to better reflect Standard SQL (ANSI 2011) ideology, making queries simpler and easier to read. I didn't notice any substantial performance gain though. In fact, many queries written and adopted for Standard SQL executed 1.2 - 1.5 times slower than in Legacy SQL! What I have also noticed is that the internal caching engine does not work as well as it did for Legacy SQL. Well, that's probably why BigQuery Standard SQL is still in beta.
Changes required for transition from Legacy SQL to Standard SQL in BigQuery
- Comma replacements
Basically, you can't have comma "," before the FROM keyword. BigQuery Standard SQL is not that forgiving, plus the error message is not clear - so it is good to remember this.
Instead of using the square brackets, we now use the "`" symbol and project name and dataset name is now separated with a dot instead of a semicolon
The changes above are very straightforward and could even be done on multiple queries using regex
- Querying wildcard tables
The TABLE_DATE_RANGE function does not exist anymore. Instead there is a _TABLE_SUFFIX pseudo-column. Consider the following query extracting a maximum transaction revenue between 01 Jan 2017 and 31 Jan 2017
SELECT ROUND(MAX(totals.totalTransactionRevenue) / 1000000) as maxTransactionRevenue
WHERE _TABLE_SUFFIX between '20170101' and '20170131'
The pseudo-column _TABLE_PREFIX really made queries look cleaner and allowed for more freedom in some complex JOINs
- Substring search
LIKE operator is preferable for simple matches.
REGEXP_MATCH function is replaced with REGEXP_CONTAINS using same signature, so that's a quick fix
- UNION operation
UNION operation in BigQuery Standard SQL requires columns to be in exactly the same order while BigQuery Legacy SQL did matching by name
- WITHIN RECORD
This is where things need to get creative. In Legacy SQL to find if session had a specific Hit Scope Custom Dimension (or a specific Event) you could do something like
MAX(IF(hits.customDimensions.index = 3, hits.customDimensions.value, FALSE)) WITHIN RECORD AS ...,
This will not inflate rows and give you MAX value of the Hit Scope Custom Dimension within session.
There is no direct alternative to it in Standard SQL, so the solution could be to use a subquery
(SELECT MAX(IF(index = 3, value, FALSE)) from UNNEST(hits.customDimensions) ) AS ...,
This may not be optimal, especially if you are querying tens of Custom Dimensions this way.
A few tips & tricks:
As an Analytics 360 (formerly Google Analytics Premium) reseller and service provider we provide full support for Analytics 360 products and services including BigQuery. Talk to us if you need help migrating Legacy SQL queries to a Standard SQL. Pick up the phone, Live Chat, or email us if you would like us to share our skills and knowledge to achieve your business goals and targets.
Internetrix combines digital consulting with winning website design, smart website development and strong digital analytics and digital marketing skills to drive revenue or cut costs for our clients. We deliver web-based consulting, development and performance projects to customers across the Asia Pacific (APAC), Australia, Sydney, and Wollongong, ranging from small business sole traders to ASX listed businesses and all levels of Australian government. Thank you for browsing this post, stay tuned for more from Internetrix.