Create a session

The package provides a Session class that creates a session table from GA4 BigQuery raw data. For more details, see the Session class API.

Declare GA4 BigQuery sources#

Before you can use the Session class, you must declare the GA4 BigQuery sources. You can do this with the declareSources method:

const ga4 = require("dataform-ga4-sessions");
// Define your config
const config = {
  dataset: "analytics_XXXXXX",
  incrementalTableName: "events_XXXXXX",
};
// Declare GA4 source tables
ga4.declareSources(config);

The method expects config with these keys:

  • database - GCP project id, if you query data from another project
  • dataset - GA4 dataset name, like analytics_XXXXXX
  • incrementalTableName - GA4 events table name, like events_XXXXXX for incremental context
  • nonIncrementalTableName - GA4 events table name, like events_XXXXXX for non-incremental context

After that, you can use ref("events_XXXXXX") in your actions to select from the GA4 export data.
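
As a rough sketch of how the keys listed above fit together, the snippet below assembles a config object with a small helper. The helper buildSourceConfig is hypothetical and not part of the package, "my-other-project" is a placeholder, and the choice of which keys are treated as required is an assumption made for illustration only.

```javascript
// Hypothetical helper, NOT part of dataform-ga4-sessions.
// Assumption: dataset and incrementalTableName are treated as required here;
// database and nonIncrementalTableName are added only when provided.
function buildSourceConfig({ database, dataset, incrementalTableName, nonIncrementalTableName }) {
  if (!dataset) {
    throw new Error("dataset is required, e.g. analytics_XXXXXX");
  }
  if (!incrementalTableName) {
    throw new Error("incrementalTableName is required, e.g. events_XXXXXX");
  }
  const result = { dataset, incrementalTableName };
  if (database) result.database = database;
  if (nonIncrementalTableName) result.nonIncrementalTableName = nonIncrementalTableName;
  return result;
}

const fullConfig = buildSourceConfig({
  database: "my-other-project", // placeholder GCP project id
  dataset: "analytics_XXXXXX",
  incrementalTableName: "events_XXXXXX",
  nonIncrementalTableName: "events_XXXXXX",
});

// In a Dataform definitions file you would then pass the object to the package:
// ga4.declareSources(fullConfig);
```

Checking the keys up front gives an early, readable error at compile time instead of a failed query later.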

Base session action#

To create a session table, create a new Session object and call its publish() method:

const ga4 = require("dataform-ga4-sessions");
// Create sessions object
const sessions = new ga4.Session(config);
// Publish session table
sessions.publish();

Before executing these actions, you can check the compiled queries for the incremental and non-incremental tables. Run these queries in Dataform, or copy them and check them in BigQuery Studio.

After that, execute the action. The package generates a sessions table in the dataform_staging dataset of the current GCP project.

Use cases#

Change the result table name and dataset#

By default, the package creates the sessions table in the dataform_staging dataset, but you can change the result table name and dataset like this:

const sessions = new ga4.Session(sessionConfig);
sessions.target = {
  schema: "my_schema",
  tableName: "my_sessions_table",
};
sessions.publish();

More about target property

Add custom columns from raw GA4 data#

To extend the result table with custom columns from the raw GA4 data, use the addColumns method like this:

const sessions = new ga4.Session(sessionConfig);
sessions.addColumns([
  { name: "device.web_info.browser", columnName: "browser" },
]);
sessions.publish();

warning

If you add columns nested in a RECORD type, such as device.* or geo.*, you must specify columnName, because the result column name cannot contain a dot (.).

With this method you can add any column from the raw GA4 data. If you need to unnest values from event_params or user_properties by key name, the class provides the dedicated methods addEventParams and addUserProperties.

note

This method returns a session-scope value: it takes the first non-null value of the column across all events in the session (events with the same session_id). The value therefore does not necessarily come from the first event.
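
To make the session-scope behavior concrete, here is a minimal plain-JavaScript illustration of "first non-null value across a session's events". It only mirrors the idea; the package computes this in SQL. The helper firstNonNull and the sample events are hypothetical, and the events are assumed to be already ordered by time.

```javascript
// Illustration only: the package does this in SQL, not in JavaScript.
// Returns the first non-null value of `key` among a session's events,
// assuming `events` is already ordered by event time.
function firstNonNull(events, key) {
  for (const event of events) {
    const value = event[key];
    if (value !== null && value !== undefined) {
      return value;
    }
  }
  return null; // no event in the session carries a value
}

// Hypothetical events that share one session_id.
const sessionEvents = [
  { event_name: "session_start", browser: null },
  { event_name: "page_view", browser: "Chrome" },
  { event_name: "purchase", browser: "Safari" },
];

// The first event has no value, so the result comes from the second event.
const sessionBrowser = firstNonNull(sessionEvents, "browser");
```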

More about addColumns method

Add value from event_params#

You can unnest a value from event_params by key name using the addEventParams method like this:

const sessions = new ga4.Session(sessionConfig);
sessions.addEventParams([{ name: "ignore_referrer", type: "string" }]);
sessions.publish();

Because event_params contains values of different types, you must specify the value type. These types are supported: string, int, double, float, coalesce, coalesce_float.

Use the COALESCE type to get the first non-null value of any type, converted to a string. Use COALESCE_FLOAT to get the first non-null numeric value, converted to a float.
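
The difference between the two coalesce types can be sketched in plain JavaScript. A GA4 event parameter value is a record with the fields string_value, int_value, float_value and double_value. The helper names below are hypothetical, and the order in which the fields are checked is an assumption; the SQL generated by the package may check them in a different order.

```javascript
// Illustration only: mimics the coalesce idea on a GA4 event_params value record.
// Assumption: the field order below is illustrative, not the package's actual order.
function firstDefined(candidates) {
  const found = candidates.find((v) => v !== null && v !== undefined);
  return found === undefined ? null : found;
}

// COALESCE-like: first non-null value of any type, converted to a string.
function coalesceToString(value) {
  const found = firstDefined([
    value.string_value,
    value.int_value,
    value.float_value,
    value.double_value,
  ]);
  return found === null ? null : String(found);
}

// COALESCE_FLOAT-like: first non-null numeric value, converted to a number.
function coalesceToFloat(value) {
  const found = firstDefined([value.int_value, value.float_value, value.double_value]);
  return found === null ? null : Number(found);
}

// Hypothetical parameter value where only int_value is populated.
const paramValue = { string_value: null, int_value: 42, float_value: null, double_value: null };
const asString = coalesceToString(paramValue);
const asFloat = coalesceToFloat(paramValue);
```

This is handy for parameters whose type varies between events, for example a value sent sometimes as an integer and sometimes as a string.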

For more details, check out getSqlUnnestParam.

note

This method returns a session-scope value: it takes the first non-null value across all events in the session (events with the same session_id).

More about addEventParams method

More about addUserProperties method

Add value from page_location query parameter#

A common task is extracting query parameters from a URL, for example to collect fbclid or another platform's click ID. The class provides the addQueryParameters method for this. The method tries to extract the query parameter from the event_params value with the key page_location, looking at every event in the session, and returns the first non-null value.

You can use this method like this:

const sessions = new ga4.Session(sessionConfig);
sessions.addQueryParameters([
  { name: "fbclid" },
  { name: "ttclid" },
  { name: "gclid", columnName: "gclid_url" },
]);
sessions.publish();

In this example, the gclid column extracted from the query parameters is renamed to gclid_url because the standard schema already has a gclid column from event_params.
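
The extraction logic described above can be illustrated in plain JavaScript with the standard URL class. This is only a sketch of the idea; the package performs the extraction in SQL, and the helpers getQueryParameter and firstQueryParameter as well as the sample URLs are hypothetical.

```javascript
// Illustration only: the package extracts query parameters in SQL.
// Returns the value of query parameter `name` from one page_location, or null.
function getQueryParameter(pageLocation, name) {
  try {
    return new URL(pageLocation).searchParams.get(name);
  } catch (error) {
    return null; // page_location was not a parseable absolute URL
  }
}

// Session-scope version: the first non-null value across the session's page locations,
// assuming the list is ordered by event time.
function firstQueryParameter(pageLocations, name) {
  for (const pageLocation of pageLocations) {
    const value = getQueryParameter(pageLocation, name);
    if (value !== null) {
      return value;
    }
  }
  return null;
}

// Hypothetical page_location values from one session.
const pageLocations = [
  "https://example.com/",
  "https://example.com/landing?fbclid=abc123&utm_source=facebook",
  "https://example.com/checkout?fbclid=zzz",
];

const fbclid = firstQueryParameter(pageLocations, "fbclid");
const ttclid = firstQueryParameter(pageLocations, "ttclid");
```

Here fbclid comes from the second URL, the first one that carries the parameter, and ttclid stays null because no URL in the session contains it.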

note

The method returns a session-scope value (the first value during the session), not an event-scope value.

More about addQueryParameters method

Apply extended preset#

You can apply the extended preset to get more session-scope columns, like this:

const sessions = new ga4.Session(sessionConfig);
sessions.applyPreset("extended");
sessions.publish();

You can compare the schemas of the standard and extended presets here.

Delete columns from the result table#

If you need some columns for the processing steps but don't want to store them in the result table, you can delete them using sessions.postProcessing.delete like this:

const sessions = new ga4.Session(sessionConfig);
sessions.postProcessing.delete = [
  ...sessions.postProcessing.delete,
  "gclid",
  "content",
];
sessions.publish();

By default, postProcessing.delete equals [source, medium, campaign]. These columns are deleted because their values are already stored in the last_click_attribution column.
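
Because the property is a plain list, assigning a new array without spreading the old one would drop the default entries. The sketch below shows one way to extend the list without duplicates. It uses a stand-in object instead of a real Session, and the helper withExtraDeletes is hypothetical, not part of the package.

```javascript
// Stand-in for the postProcessing property of a Session object (illustration only).
const postProcessing = { delete: ["source", "medium", "campaign"] };

// Hypothetical helper: appends extra column names, keeps the existing ones,
// and removes duplicates while preserving order.
function withExtraDeletes(current, extra) {
  return [...new Set([...current, ...extra])];
}

// "source" is already in the default list, so it is not added twice.
postProcessing.delete = withExtraDeletes(postProcessing.delete, ["gclid", "content", "source"]);
```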

More about postProcessing

Add where condition for non-incremental table#

You can add different WHERE conditions for the incremental and non-incremental tables.

const sessionConfig = {
  dataset: "analytics_XXXXXX",
  incrementalTableName: "events_XXXXXX",
  nonIncrementalTableEventStepWhere:
    "_table_suffix between format_date('%Y%m%d',date_sub(current_date(), interval 3 day)) and format_date('%Y%m%d',date_sub(current_date(), interval 1 day))",
};
const sessions = new ga4.Session(sessionConfig);
sessions.publish();

For the incremental table, use the similar property incrementalTableEventStepWhere.

note

This condition is applied at the first processing step, not the last. Use it carefully, because it can filter out events before the needed columns, for example the source / medium values, are extracted from them.

For example, you can use different WHERE conditions for scheduled daily updates.
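
As a sketch of how the two conditions might be kept consistent, the hypothetical helper below (tableSuffixBetween is not part of the package) builds the same _table_suffix filter as in the example above for an arbitrary day range. The 30-day and 3-day windows are illustrative assumptions, not recommended values.

```javascript
// Hypothetical helper, NOT part of dataform-ga4-sessions.
// Builds a _table_suffix filter covering the days from `fromDaysAgo`
// to `toDaysAgo` before the current date.
function tableSuffixBetween(fromDaysAgo, toDaysAgo) {
  const day = (n) => `format_date('%Y%m%d',date_sub(current_date(), interval ${n} day))`;
  return `_table_suffix between ${day(fromDaysAgo)} and ${day(toDaysAgo)}`;
}

// Assumed windows: a wider range for full rebuilds, a narrow one for daily updates.
const whereConfig = {
  dataset: "analytics_XXXXXX",
  incrementalTableName: "events_XXXXXX",
  nonIncrementalTableEventStepWhere: tableSuffixBetween(30, 1),
  incrementalTableEventStepWhere: tableSuffixBetween(3, 1),
};

// In a Dataform definitions file you would then create the session as usual:
// const sessions = new ga4.Session(whereConfig);
// sessions.publish();
```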

More about source property

Adding updatePartitionFilter#

You can filter rows in an incremental table by adding updatePartitionFilter:

const sessions = new ga4.Session(sessionConfig);
sessions.updatePartitionFilter =
  "date >= date_sub(current_date(), interval 5 day)";
sessions.publish();

In the same way, you can set any of the IBigQueryOptions.

More details#