Create a session
The package provides a Session class that creates a session table from GA4 BigQuery raw data. You can find more details in the Session class API.
Declare GA4 BigQuery sources#
Before you can use the Session class, you need to declare the GA4 BigQuery sources. You can do this with the declareSources method:
    const ga4 = require("dataform-ga4-sessions");

    // Define your config
    const config = {
      dataset: "analytics_XXXXXX",
      incrementalTableName: "events_XXXXXX",
    };

    // Declare GA4 source tables
    ga4.declareSources(config);

The method expects config with these keys:
- database - GCP project id, if you query data from another project
- dataset - GA4 dataset name, like analytics_XXXXXX
- incrementalTableName - GA4 events table name, like events_XXXXXX, for incremental context
- nonIncrementalTableName - GA4 events table name, like events_XXXXXX, for non-incremental context
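To make the keys above concrete, here is a minimal sketch of a config that sets all four of them. The project id "my-gcp-project" is a placeholder, and describeSources is a hypothetical helper written only for this illustration (it is not part of the package). It shows which fully qualified table each context would presumably point to.

```javascript
// Illustrative config using every key from the list above.
// "my-gcp-project" is a placeholder project id, not a real value.
const config = {
  database: "my-gcp-project",
  dataset: "analytics_XXXXXX",
  incrementalTableName: "events_XXXXXX",
  nonIncrementalTableName: "events_XXXXXX",
};

// Hypothetical helper (NOT part of dataform-ga4-sessions): builds the
// fully qualified table name for each context from the config keys.
function describeSources({ database, dataset, incrementalTableName, nonIncrementalTableName }) {
  // Without a database key, the table is addressed by dataset only.
  const prefix = database ? `${database}.${dataset}` : dataset;
  return {
    incremental: `${prefix}.${incrementalTableName}`,
    nonIncremental: `${prefix}.${nonIncrementalTableName}`,
  };
}

console.log(describeSources(config));
```

The database key is only needed when the GA4 export lives in another project, so the sketch falls back to the dataset name when it is omitted.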
After that, you can use ref("events_XXXXXX") in your actions to SELECT from the GA4 export data.
Base session action#
To create a session table, create a new Session object and call its publish() method, like this:
    const ga4 = require("dataform-ga4-sessions");

    // Create sessions object
    const sessions = new ga4.Session(config);

    // Publish session table
    sessions.publish();

Before executing the action, you can check the Compiled Queries for the incremental and non-incremental tables. Run these queries in Dataform, or copy them and check them in BigQuery Studio.
After that, just execute the action. As a result, the package generates a sessions table in the dataform_staging dataset of the current GCP project:
- with unique session_id column, more details here
- predefined "standard" schema. You could find more details about it here
- source / medium fields. Read more about how to add custom source / medium rules
- two attribution models: last click and linear. Read how to customize them
- channel grouping field. Read more about channel grouping
Use cases#
Change the result table name and dataset#
By default, the package creates the sessions table in the dataform_staging dataset, but you can change the result table name and dataset like this:
    const sessions = new ga4.Session(sessionConfig);
    sessions.target = {
      schema: "my_schema",
      tableName: "my_sessions_table",
    };
    sessions.publish();

More about target property
Add custom columns from raw GA4 data#
If you want to extend the result table with custom columns from raw GA4 data, you can use the addColumns method like this:
    const sessions = new ga4.Session(sessionConfig);
    sessions.addColumns([
      { name: "device.web_info.browser", columnName: "browser" },
    ]);
    sessions.publish();

warning
If you add columns from a RECORD type, like device.* or geo.*, you should specify the columnName, because the result column name can't contain a dot (.).
Using this method you can add any column from raw GA4 data. If you need to unnest values from event_params or user_properties by key name, the class provides the special methods addEventParams and addUserProperties.
note
This method returns a session-scope value. It tries to get the first non-null column value from all events in the session (events with the same session_id), so the result is not necessarily the value from the first event.
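The session-scope behaviour can be illustrated with a small standalone sketch. It is not the package's implementation (the package generates SQL). The sample events, the sessionScopeValue helper, and the ordering by event_timestamp are assumptions made for this illustration.

```javascript
// Hypothetical sample events; only the fields needed for the illustration.
const events = [
  { session_id: "s1", event_timestamp: 1, browser: null },
  { session_id: "s1", event_timestamp: 2, browser: "Chrome" },
  { session_id: "s1", event_timestamp: 3, browser: "Safari" },
  { session_id: "s2", event_timestamp: 1, browser: null },
];

// Returns, per session_id, the first non-null value of `column` among
// that session's events, ordered by event_timestamp (an assumption).
function sessionScopeValue(events, column) {
  const result = {};
  const sorted = [...events].sort((a, b) => a.event_timestamp - b.event_timestamp);
  for (const event of sorted) {
    if (!(event.session_id in result)) result[event.session_id] = null;
    if (result[event.session_id] === null && event[column] != null) {
      result[event.session_id] = event[column];
    }
  }
  return result;
}

console.log(sessionScopeValue(events, "browser"));
// { s1: 'Chrome', s2: null }
```

Session s1 gets "Chrome" even though its first event has no browser value, which is the case the note above describes.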
More about addColumns method
Add value from event_params#
You can unnest a value from event_params by key name using the addEventParams method like this:
    const sessions = new ga4.Session(sessionConfig);
    sessions.addEventParams([{ name: "ignore_referrer", type: "string" }]);
    sessions.publish();

Because event_params contains values of different types, you should specify the value type. These types are supported: string, int, double, float, coalesce, coalesce_float.
Use the COALESCE type to get the first non-null value of any type, converted to a string.
Use the COALESCE_FLOAT type to get the first non-null numeric value, converted to a float.
For more details check out getSqlUnnestParam
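To show how the types differ, here is a rough standalone sketch. In the GA4 BigQuery export each event_params entry has a value record with string_value, int_value, float_value and double_value fields. The readParam helper is hypothetical (it is not the package's getSqlUnnestParam), and the order in which the fields are checked for the coalesce types is an assumption.

```javascript
// One GA4 event_params entry, shaped like the BigQuery export:
// normally only one of the *_value fields is populated.
const param = {
  key: "ignore_referrer",
  value: { string_value: null, int_value: 1, float_value: null, double_value: null },
};

// Hypothetical helper mimicking the described types. The field order
// used for "coalesce" and "coalesce_float" is an assumption.
function readParam(value, type) {
  const firstNonNull = (...candidates) => {
    const found = candidates.find((v) => v != null);
    return found === undefined ? null : found;
  };
  switch (type) {
    case "string":
      return value.string_value ?? null;
    case "int":
      return value.int_value ?? null;
    case "float":
      return value.float_value ?? null;
    case "double":
      return value.double_value ?? null;
    case "coalesce": {
      // First non-null value of any type, converted to a string.
      const v = firstNonNull(value.string_value, value.int_value, value.float_value, value.double_value);
      return v === null ? null : String(v);
    }
    case "coalesce_float": {
      // First non-null numeric value, converted to a number.
      const v = firstNonNull(value.int_value, value.float_value, value.double_value);
      return v === null ? null : Number(v);
    }
    default:
      throw new Error(`Unsupported type: ${type}`);
  }
}

console.log(readParam(param.value, "string")); // null
console.log(readParam(param.value, "coalesce")); // 1 (as a string)
```

In this sample the parameter is stored as an integer, so the string type finds nothing while coalesce still returns a value. That is the situation where the coalesce types are useful.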
note
This method returns a session-scope value. It tries to get the first non-null value from all events in the session (events with the same session_id).
More about addEventParams method
More about addUserProperties method
Add value from page_location query parameter#
A common task is extracting query parameters from a URL, for example to collect fbclid or another platform's click id. The class provides the addQueryParameters method for this. It tries to extract the query parameter from the event_params value with the key page_location in any session event, and returns the first non-null value.
You can use this method like this:
    const sessions = new ga4.Session(sessionConfig);
    sessions.addQueryParameters([
      { name: "fbclid" },
      { name: "ttclid" },
      { name: "gclid", columnName: "gclid_url" },
    ]);
    sessions.publish();

In this example, the gclid query parameter gets a different column name because the standard schema already has a gclid column from event_params.
note
The method returns session-scope value (the first value during the session) not the event-scope value.
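The extraction logic can be sketched in plain JavaScript. This only illustrates the described behaviour (the package itself works in SQL). The sample URLs and the firstQueryParameter helper are hypothetical.

```javascript
// Hypothetical page_location values from one session, in event order.
const pageLocations = [
  "https://example.com/",
  "https://example.com/landing?fbclid=abc123&utm_source=facebook",
  "https://example.com/cart?fbclid=zzz999",
];

// Returns the first non-null value of the query parameter `name`
// across the given URLs, or null when no URL carries it.
function firstQueryParameter(urls, name) {
  for (const url of urls) {
    let value = null;
    try {
      value = new URL(url).searchParams.get(name);
    } catch (error) {
      // Skip values that are not valid absolute URLs.
      continue;
    }
    if (value !== null) return value;
  }
  return null;
}

console.log(firstQueryParameter(pageLocations, "fbclid")); // abc123
console.log(firstQueryParameter(pageLocations, "gclid")); // null
```

The later fbclid value on the cart page is ignored, because only the first value found in the session is kept.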
More about addQueryParameters method
Apply extended preset#
You can apply the extended preset to get more session-scope columns, like this:
    const sessions = new ga4.Session(sessionConfig);
    sessions.applyPreset("extended");
    sessions.publish();

You could compare schemas for standard and extended presets here.
Delete columns from the result table#
If you need some columns for the processing steps but don't want to store them, you can delete them using sessions.postProcessing.delete like this:
    const sessions = new ga4.Session(sessionConfig);
    sessions.postProcessing.delete = [
      ...sessions.postProcessing.delete,
      ...["gclid", "content"],
    ];
    sessions.publish();

By default postProcessing.delete equals [source, medium, campaign]. These columns should be deleted because they are already stored in the last_click_attribution column.
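The spread pattern matters because assigning a new array replaces whatever was there before. The sketch below uses a plain object as a stand-in for a Session instance (an assumption made so that it runs without the package), with the default list taken from the text above. The Set deduplication is an optional extra, not something the package requires.

```javascript
// Stand-in object for illustration; a real Session comes from the package.
const sessions = {
  postProcessing: { delete: ["source", "medium", "campaign"] }, // defaults per the text above
};

// Append extra columns while keeping the defaults.
// Wrapping in a Set drops duplicates such as the repeated "source".
sessions.postProcessing.delete = [
  ...new Set([...sessions.postProcessing.delete, "gclid", "content", "source"]),
];

console.log(sessions.postProcessing.delete);
// [ 'source', 'medium', 'campaign', 'gclid', 'content' ]
```

Assigning only ["gclid", "content"] instead would drop the three default columns from the list.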
More about postProcessing
Add where condition for non-incremental table#
You can add different WHERE conditions for the incremental and non-incremental tables.
    const sessionConfig = {
      dataset: "analytics_XXXXXX",
      incrementalTableName: "events_XXXXXX",
      nonIncrementalTableEventStepWhere:
        "_table_suffix between format_date('%Y%m%d',date_sub(current_date(), interval 3 day)) and format_date('%Y%m%d',date_sub(current_date(), interval 1 day))",
    };
    const sessions = new ga4.Session(sessionConfig);
    sessions.publish();

For the incremental table, you can use the similar property incrementalTableEventStepWhere.
note
This condition applies to the first processing step, not the last one. Use it carefully, because you could filter out events before the needed columns, for example source / medium values, are extracted from them.
For example, you could use different WHERE conditions for scheduled daily updates.
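To see which daily tables the _table_suffix condition from the example selects, the date arithmetic can be reproduced in a short sketch. The suffixRange and toSuffix helpers are hypothetical and use UTC dates, on the assumption that current_date() is evaluated in BigQuery's default UTC time zone.

```javascript
// Formats a Date as YYYYMMDD in UTC, like format_date('%Y%m%d', ...).
function toSuffix(date) {
  return date.toISOString().slice(0, 10).replace(/-/g, "");
}

// Returns the inclusive _table_suffix range from `fromDaysAgo` to
// `toDaysAgo` days before `today` (hypothetical helper for illustration).
function suffixRange(today, fromDaysAgo, toDaysAgo) {
  const dayMs = 24 * 60 * 60 * 1000;
  return {
    from: toSuffix(new Date(today.getTime() - fromDaysAgo * dayMs)),
    to: toSuffix(new Date(today.getTime() - toDaysAgo * dayMs)),
  };
}

// "Today" is fixed to 2024-03-02 so the output is reproducible.
console.log(suffixRange(new Date(Date.UTC(2024, 2, 2)), 3, 1));
// { from: '20240228', to: '20240301' }
```

With an interval of 3 days to 1 day, the query reads three daily tables and skips the current day.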
More about source property
Adding updatePartitionFilter#
You can filter rows in an incremental table by adding updatePartitionFilter:
    const sessions = new ga4.Session(sessionConfig);
    sessions.updatePartitionFilter = "date >= date_sub(current_date(), interval 5 day)";
    sessions.publish();

In the same way, you can add any of the IBigQueryOptions.
Read more about:
More details#
- if you want to learn more about how the package works, you could read about processing steps
- and how to deal with intraday sessions