Hi, this is Jing, a data scientist with great passion for applying data science and big data technology in the industry. Today you are going to learn how to set up a Data Studio report with data from BigQuery to monitor metrics you are interested in on a daily basis. This is method here not limited for AB Testing. I just use AB test as an example here.
There are other more powerful BI tools, like Looker or Tableau. But due to the simplicity of Data Studio, it always becomes my first choice when it comes to some short live reports, such as monitoring the performance of an AB testing or a campaign on daily basis when the BigQuery is the main data warehouse.
First, I will introduce a SQL template to inspire on how you should construct a readable query for an AB test. Then I will talk about how to visualise the metrics in Data Studio for monitoring the daily performance of the AB testing.
SQL template for test events in BigQuery
If you are trying to monitoring the performance of testing a new module on a page, you can refer to this template SQL and adjust based on your own need. The SQL example is given on the test data tracked in Google Analytics and exported to BigQuery. For each individual case, the SQL will looks very differently since the data structure can be so different depends on how the data was tracked on your side. But the main logic should applied to most cases when you try to find out who actually converted better after viewed the test module.
- Step 1: Extract all the test data during the test period for the tested country, as test_dataset
WITH ---- get all the test sessions ---- test_dataset AS ( SELECT date, CONCAT(CAST(fullVisitorId AS STRING),CAST(visitId AS STRING))AS sessionID, fullVisitorId, device.deviceCategory, hits.isEntrance, hits.isExit, ( SELECT value FROM hits.customDimensions WHERE INDEX = 1) AS countryName, ( SELECT value FROM hits.customDimensions WHERE INDEX = 2) AS testVariant, hits.eventInfo.eventAction, hits.eventInfo.eventCategory, hits.eventInfo.eventLabel FROM `jingsblog.2020.ga_sessions_*`, UNNEST (hits)AS hits WHERE ----- test period ----- _table_suffix BETWEEN '20201031' AND '20201212' ----- remove the session without interaction events ----- AND totals.visits = 1
- Step 2: Select out all the tested users who actually experienced the test as impression
impression AS ( SELECT date, deviceCategory, testVariant, sessionID, fullVisitorId, ----- check whether the visitor landed on the test page ----- MAX( IF (isEntrance, sessionID, NULL)) AS landing, ----- check whether the visitor bounced on the test page ----- MAX( IF (isEntrance AND isExit, sessionID, NULL)) AS bounce, ----- check whether the visitor exited on the test page ----- MAX( IF ( isExit, sessionID, NULL) )AS exit FROM test_dataset WHERE ----- test variant ----- testVariant LIKE '%testname' GROUP BY 1, 2, 3, 4, 5)
- Step 3: Select out the metrics you are interested in, for example how many of them converted
conversions AS ( SELECT DISTINCT sessionID, fullVisitorId FROM test_dataset WHERE conversion conditions)
- Last step: By join the impression and the conversions, you could see how many of the tested users converted after viewed the tested module.
FROM impression LEFT JOIN conversions ON impression.sessionID = conversions.sessionID
Visualise in Data studio
When you have a BigQuery query or an existing Table in a dataset, what is the best way to connect Data Studio to BigQuery comes as a question. Here is how it works between BigQuery and Data Studio.
How BigQuery is connected with Data Studio
Every time Data Studio refreshes the data in your reports, certain queries are sent to BigQuery, getting fresh data in return, but this also incurs certain charges. Data Studio is advertised as a free service and it technically is. But BigQuery is not free. The price you pay for BigQuery depends on how much data you store there and how much you query for.
There are mainly three ways to connect Data Studio to BigQuery:
- Custom Query Data Studio makes separate requests for every single plot if I use custom query as a connection type. In BigQuery if you make the same requests multiple times the result is cached and is charged only for the first time. But Data Studio makes different requests to build every plot in the report you are building.
- View in the BigQuery Data Studio made different request to that view and the problem was the same. There is no big difference between connect Data Studio to a custom query and to a view in the BigQuery data warehouse.
- Materialised Table in the BigQuery Save the results of a query into a separate table in the Bigquery. This table is called Materialised table. Connect Data Studio directly to this table will make the calculation much cheaper and quicker since the data studio only process the table it connects to instead of running the QUERY which creates this table. The main drawback of this connection is the table is static and if you want to have a dynamic report, you need to updated the table accordingly. Consequently, scheduling is needed for updating the table.
By understanding this, to save the cost the third option is recommended. As for details for how to set up the Data Studio report in a daily scheduling way, I will write a separate blog about it.
You might have been curious from the very beginning, aren’t most the AB test tool providing auto-mating reporting and analysis. That’s not might be true. However, implement an AB testing does not always require an AB test tool, all you need is a fanatics developer. Then in this case, a good data scientist will work with what he/she has – being able to set up monitoring report by him or herself. Besides, writing a customised query and report separately could give you more freedom on interested metrics beyond the limitation of the AB test tool. That’s why I am sharing this solution here. Hope this will be helpful for you. Thanks for reading. I am Jing, a data scientist aiming to be better and better.