Create a session
The package provides a Session class that creates a session table from GA4 BigQuery raw data. You can find more details in the Session Class API.
Declare GA4 BigQuery sources
Before you can use the Session class, you should declare the GA4 BigQuery sources. You can do this with the declareSources method:
const ga4 = require("dataform-ga4-sessions");

// Define your config
const config = {
  dataset: "analytics_XXXXXX",
  incrementalTableName: "events_XXXXXX",
};

// Declare GA4 source tables
ga4.declareSources(config);
The method expects a config with these keys:
- database - GCP project ID, if you query data from another project
- dataset - GA4 dataset name, like analytics_XXXXXX
- incrementalTableName - GA4 events table name, like events_XXXXXX, for the incremental context
- nonIncrementalTableName - GA4 events table name, like events_XXXXXX, for the non-incremental context
After that, you can use ref("events_XXXXXX") in your actions to SELECT from the GA4 export data.
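As a minimal sketch of using the declared source in an action, the query builder below takes a Dataform-style context and calls ref on it. The names eventCountQuery, stubCtx and the event_counts table are hypothetical, and the stub only imitates how a reference might resolve so that the generated SQL can be inspected outside Dataform.

```javascript
// Query builder that receives a Dataform-style context exposing ref().
// eventCountQuery is an illustrative name, not part of the package.
const eventCountQuery = (ctx) =>
  [
    "SELECT event_name, COUNT(*) AS events",
    `FROM ${ctx.ref("events_XXXXXX")}`,
    "GROUP BY event_name",
  ].join("\n");

// Inside a Dataform definitions file this could be wired up roughly as:
//   publish("event_counts").query(eventCountQuery);

// Outside Dataform, a stub context is enough to look at the SQL.
// The project and dataset in the stub are placeholders (assumption).
const stubCtx = {
  ref: (name) => "`my-project.analytics_XXXXXX." + name + "`",
};

console.log(eventCountQuery(stubCtx));
```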
Base session action
To create a session table, create a new Session object and call its publish() method, like this:
const ga4 = require("dataform-ga4-sessions");

// Create the sessions object
const sessions = new ga4.Session(config);

// Publish the session table
sessions.publish();
Before executing these actions, you can check the compiled queries for the incremental and non-incremental tables. Run these queries in Dataform, or copy them and check them in BigQuery Studio.
After that, just execute this action. As a result, the package generates a sessions table in the dataform_staging dataset of the current GCP project:
- with unique session_id column, more details here
- predefined "standard" schema. You could find more details about it here
- source / medium fields. Read more about how to add custom source / medium rules
- two attribution models: last click and linear. Read how to customize them
- channel grouping field. Read more about channel grouping
Use cases
Change the result table name and dataset
By default, the package creates the sessions table in the dataform_staging dataset, but you can change the result table name and dataset like this:
const sessions = new ga4.Session(sessionConfig);
sessions.target = {
  schema: "my_schema",
  tableName: "my_sessions_table",
};
sessions.publish();
More about target property
Add custom columns from raw GA4 data
If you want to extend the result table with custom columns from raw GA4 data, you can use the addColumns method like this:
const sessions = new ga4.Session(sessionConfig);
sessions.addColumns([
  { name: "device.web_info.browser", columnName: "browser" },
]);
sessions.publish();
warning
If you add columns of RECORD type, like device.* or geo.*, you should specify the columnName, because the result column name can't contain a dot (.) symbol.
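One way to follow this rule for several nested fields at once is sketched below. The helper toColumnSpec is hypothetical (it is not part of the package), and replacing dots with underscores is just one possible naming choice; geo.country is used as a second sample field.

```javascript
// Hypothetical helper: turns a nested GA4 field path into an addColumns entry
// with a dot-free columnName (dots are replaced with underscores).
function toColumnSpec(path) {
  return { name: path, columnName: path.split(".").join("_") };
}

const columns = ["device.web_info.browser", "geo.country"].map(toColumnSpec);
console.log(columns);

// With the package loaded, the list could then be passed on like this:
//   sessions.addColumns(columns);
```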
Using this method, you can add any column from raw GA4 data. If you need to unnest values from event_params or user_properties by key name, the class provides the special methods addEventParams and addUserProperties.
note
This method returns a session-scope value. It means it tries to get the first not null column value from all events during the session (events with the same session_id). So the value does not necessarily come from the first event.
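The session-scope rule from the note can be illustrated in plain JavaScript. This is only a sketch of the semantics: the package does the work in SQL, the function name firstNonNullPerSession is made up, and ordering events by event_timestamp is an assumption.

```javascript
// Illustration only: picks, per session_id, the first not null value of a field.
// Ordering by event_timestamp is an assumption about what "first" means.
function firstNonNullPerSession(events, field) {
  const result = new Map();
  const ordered = [...events].sort(
    (a, b) => a.event_timestamp - b.event_timestamp
  );
  for (const event of ordered) {
    const value = event[field];
    if (value == null) continue; // skip null and undefined
    if (!result.has(event.session_id)) result.set(event.session_id, value);
  }
  return result;
}

// Made-up sample events
const events = [
  { session_id: "s1", event_timestamp: 1, browser: null },
  { session_id: "s1", event_timestamp: 2, browser: "Chrome" },
  { session_id: "s1", event_timestamp: 3, browser: "Safari" },
  { session_id: "s2", event_timestamp: 1, browser: "Firefox" },
];

const browsers = firstNonNullPerSession(events, "browser");
// For s1 the first event has no browser, so the value comes from the second one.
console.log(browsers.get("s1"));
```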
More about addColumns method
Add value from event_params
You can unnest a value from event_params by key name using the addEventParams method like this:
const sessions = new ga4.Session(sessionConfig);
sessions.addEventParams([{ name: "ignore_referrer", type: "string" }]);
sessions.publish();
Because event_params contains values of different types, you should specify the value type. These types are supported: string, int, double, float, coalesce, coalesce_float.
Use the COALESCE type to get the first not null value of any type converted to a string.
Use COALESCE_FLOAT to get the first not null numeric value converted to float.
For more details, check out getSqlUnnestParam.
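To make the COALESCE idea more concrete, here is a rough JavaScript imitation of it. In the GA4 export each event parameter carries a value record with string_value, int_value, float_value and double_value fields. The function coalesceParam is hypothetical, and the order in which the fields are checked is an assumption, not the package's documented behavior.

```javascript
// Illustration only: mimics a "coalesce" lookup over one event's event_params.
// The field order below is an assumption.
function coalesceParam(eventParams, key) {
  const param = eventParams.find((p) => p.key === key);
  if (!param) return null;
  const { string_value, int_value, float_value, double_value } = param.value;
  const first = [string_value, int_value, float_value, double_value].find(
    (v) => v != null
  );
  return first == null ? null : String(first);
}

// Made-up sample parameters
const eventParams = [
  { key: "ignore_referrer", value: { string_value: "true" } },
  { key: "ga_session_number", value: { int_value: 3 } },
];

console.log(coalesceParam(eventParams, "ignore_referrer"));
console.log(coalesceParam(eventParams, "ga_session_number"));
```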
note
This method returns a session-scope value. It means it tries to get the first not null value from all events during the session (events with the same session_id).
More about addEventParams method
More about addUserProperties method
Add value from page_location query parameter
A rather common task is to extract query parameters from a URL, for example to collect fbclid or another platform's click ID. The class provides the addQueryParameters method for this. The method tries to extract the query parameter from the event_params entry with the key page_location in any session event, and returns the first not null value.
You could use this method like this:
const sessions = new ga4.Session(sessionConfig);
sessions.addQueryParameters([
  { name: "fbclid" },
  { name: "ttclid" },
  { name: "gclid", columnName: "gclid_url" },
]);
sessions.publish();
In this example we change the column name for gclid from query parameters, because the standard schema already has a gclid column from event_params.
note
The method returns a session-scope value (the first value during the session), not an event-scope value.
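The behavior described in this section can be imitated in plain JavaScript with the standard URL API. This is a sketch of the semantics only: the package extracts the value in SQL, the function firstQueryParam is a made-up name, and the sample URLs are invented.

```javascript
// Illustration only: returns the first value of a query parameter found in a
// list of page_location URLs belonging to one session, or null if none has it.
function firstQueryParam(pageLocations, name) {
  for (const location of pageLocations) {
    let value = null;
    try {
      value = new URL(location).searchParams.get(name);
    } catch {
      continue; // skip values that are not parseable URLs
    }
    if (value !== null) return value;
  }
  return null;
}

// Made-up page_location values for one session, in event order
const pageLocations = [
  "https://example.com/",
  "https://example.com/landing?fbclid=abc&gclid=g1",
  "https://example.com/pricing?fbclid=later",
];

console.log(firstQueryParam(pageLocations, "fbclid"));
console.log(firstQueryParam(pageLocations, "ttclid"));
```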
More about addQueryParameters method
Apply the extended preset
You can apply the extended preset to get more session-scope columns, like this:
const sessions = new ga4.Session(sessionConfig);
sessions.applyPreset("extended");
sessions.publish();
You could compare schemas for standard and extended presets here.
Delete columns from the result table
If you don't want to store some columns but need them for the processing steps, you can delete them using sessions.postProcessing.delete like this:
const sessions = new ga4.Session(sessionConfig);
// Keep the default columns to delete and add two more
sessions.postProcessing.delete = [
  ...sessions.postProcessing.delete,
  "gclid",
  "content",
];
sessions.publish();
By default postProcessing.delete equals [source, medium, campaign]. These columns should be deleted because they are already stored in the last_click_attribution column.
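If the list is extended in several places, it may help to keep the defaults and avoid duplicates. The sketch below uses a plain object as a stand-in for a Session instance so that it runs outside Dataform; the helper addColumnsToDelete is hypothetical, and the default list is copied from the sentence above.

```javascript
// Stand-in for a Session instance (assumption): only the postProcessing.delete
// property with the default value described above.
const sessions = {
  postProcessing: { delete: ["source", "medium", "campaign"] },
};

// Hypothetical helper: appends column names, keeping defaults and removing duplicates.
function addColumnsToDelete(session, extra) {
  session.postProcessing.delete = [
    ...new Set([...session.postProcessing.delete, ...extra]),
  ];
}

addColumnsToDelete(sessions, ["gclid", "content", "source"]);
console.log(sessions.postProcessing.delete);
```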
More about postProcessing
Add where condition for non-incremental table
You can add different WHERE conditions for the incremental and non-incremental tables.
const sessionConfig = {
  dataset: "analytics_XXXXXX",
  incrementalTableName: "events_XXXXXX",
  nonIncrementalTableEventStepWhere:
    "_table_suffix between format_date('%Y%m%d',date_sub(current_date(), interval 3 day)) and format_date('%Y%m%d',date_sub(current_date(), interval 1 day))",
};
const sessions = new ga4.Session(sessionConfig);
sessions.publish();
For the incremental table, you can use the similar property incrementalTableEventStepWhere.
note
This condition applies to the first processing step, not to the last. So use it carefully, as you could filter out events before the needed columns, for example source / medium values, are extracted from them.
For example, you can use different WHERE conditions for scheduled daily updates.
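Since both conditions are plain SQL strings, they can be generated by a small function. The sketch below is only an illustration: tableSuffixWindow is a hypothetical helper, and the 30-day and 3-day windows are arbitrary sample values.

```javascript
// Hypothetical helper: builds a _table_suffix range covering the days from
// startDaysAgo to endDaysAgo, in the same form as the example above.
function tableSuffixWindow(startDaysAgo, endDaysAgo) {
  const day = (n) =>
    `format_date('%Y%m%d',date_sub(current_date(), interval ${n} day))`;
  return `_table_suffix between ${day(startDaysAgo)} and ${day(endDaysAgo)}`;
}

// Sample config: a wider window for full rebuilds, a narrow one for daily runs.
// The window sizes are illustrative assumptions.
const sessionConfig = {
  dataset: "analytics_XXXXXX",
  incrementalTableName: "events_XXXXXX",
  nonIncrementalTableEventStepWhere: tableSuffixWindow(30, 1),
  incrementalTableEventStepWhere: tableSuffixWindow(3, 1),
};

console.log(sessionConfig.incrementalTableEventStepWhere);
```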
More about source property
Adding updatePartitionFilter
You can filter rows in an incremental table by adding updatePartitionFilter:
const sessions = new ga4.Session(sessionConfig);
sessions.updatePartitionFilter =
  "date >= date_sub(current_date(), interval 5 day)";
sessions.publish();
In fact, you can set any of the IBigQueryOptions this way.
More details
- if you want to learn more about how the package works, you can read about the processing steps
- and how to deal with intraday sessions