Move beyond standard reporting by integrating GA4 with BigQuery. This technical guide explains how to calculate Customer Lifetime Value and build advanced dashboards without a background in data science.
The standard Google Analytics 4 interface is an exceptional tool for real-time monitoring and basic trend analysis. However, for organisations aiming to achieve a “Gold Standard” in data maturity, the interface eventually becomes a bottleneck. Constraints such as data sampling, the 14-month limit on user-level data retention, and the lack of flexibility in custom calculations often leave analysts wanting more. This is where the BigQuery export becomes indispensable.
BigQuery is Google’s serverless, highly scalable data warehouse that allows you to store and query petabytes of data using SQL. By exporting your GA4 data to BigQuery, you move from a “pre-packaged” reporting environment to a raw data environment. This transition is the prerequisite for calculating one of the most vital metrics in business: Customer Lifetime Value (LTV). Understanding LTV allows you to identify which acquisition channels bring in the most profitable customers over time, rather than just focusing on the initial click.
1. Why the BigQuery Export is Non-Negotiable
In the legacy era of Universal Analytics, the BigQuery export was a luxury reserved for GA360 (Enterprise) customers. With GA4, Google has democratised this feature, making it available to every account for free (the daily export for standard properties is capped at one million events per day). The primary reason to activate this link is data ownership. When you rely solely on the GA4 UI, you are essentially “borrowing” your own data from Google’s processing engine.
By moving your data to BigQuery, you gain full control over the raw event logs. This means you can join your website data with internal CRM data, offline sales records, or even weather and economic indices. Furthermore, it bypasses the “Thresholding” and “Sampling” issues that often occur in the GA4 interface when dealing with small audiences or high-cardinality dimensions.
2. Setting Up the Pipeline: GA4 to BigQuery
The integration process is relatively straightforward but requires careful configuration of the “Daily” vs “Streaming” export options. The Daily export sends a complete batch of the previous day’s events, while the Streaming export sends data in near real-time (and, unlike the daily export, incurs a small per-gigabyte streaming charge).
To set this up, you must create a project in the Google Cloud Console and link it to your GA4 property under the “Product Links” section. It is vital to ensure that your BigQuery dataset is located in the same geographic region as your primary business operations (e.g., London for UK-based firms) to comply with data residency regulations and minimise query latency.
3. The Architecture of the GA4 BigQuery Schema
One of the steepest learning curves for non-data scientists is the “nested” structure of the GA4 BigQuery schema. Unlike a traditional spreadsheet where each row is a simple entry, GA4 uses a “repeated” format. For every event, there is a sub-table of parameters.
For example, a purchase event doesn’t just have a column for value. It has a nested record containing keys and values for transaction_id, shipping, tax, and coupon. To access these, you must learn to use the UNNEST function in SQL. This architectural choice is what makes GA4 so flexible, but it requires a shift in how you write your queries to avoid common “counting” errors.
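As an illustrative sketch (the project and dataset names below are placeholders for your own), a query that digs a single parameter out of the nested record with UNNEST might look like this:

```sql
-- Extract the transaction_id parameter from purchase events.
-- `my-project.analytics_123456789` is a placeholder dataset reference.
SELECT
  event_timestamp,
  -- A subquery over the UNNESTed parameters pulls out one key at a time
  (SELECT value.string_value
   FROM UNNEST(event_params)
   WHERE key = 'transaction_id') AS transaction_id,
  ecommerce.purchase_revenue AS revenue
FROM `my-project.analytics_123456789.events_20240101`
WHERE event_name = 'purchase';
```

The subquery pattern is what prevents the classic “counting” error: because each parameter lives in a repeated field, a naive flat UNNEST would duplicate the event row once per parameter.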
4. Defining Customer Lifetime Value (LTV)
Before writing the SQL code, we must define what LTV means for your specific sector. In an e-commerce context, LTV is the total revenue generated by a unique user across all their sessions. In a SaaS context, it might be the total subscription fees paid minus the cost of support.
A common formula used in these dashboards is: LTV = (Average Purchase Value) x (Purchase Frequency) x (Customer Lifespan). For example, a customer who spends an average of £50 per order, buys four times a year, and remains a customer for three years has an LTV of £50 x 4 x 3 = £600.
By calculating this at the user level in BigQuery, you can segment your customers into “Gold,” “Silver,” and “Bronze” tiers. You can then export these audiences back into GA4 or Google Ads for highly targeted remarketing campaigns that focus on retaining your most valuable users.
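A hedged sketch of that tiering, reusing the placeholder dataset reference and BigQuery’s NTILE window function to split users into three equal buckets by lifetime revenue:

```sql
-- Bucket users into Gold/Silver/Bronze tiers by lifetime revenue.
WITH user_ltv AS (
  SELECT
    user_pseudo_id,
    SUM(ecommerce.purchase_revenue) AS ltv
  FROM `my-project.analytics_123456789.events_*`
  WHERE event_name = 'purchase'
    AND _TABLE_SUFFIX BETWEEN '20240101' AND '20241231'
  GROUP BY user_pseudo_id
)
SELECT
  user_pseudo_id,
  ltv,
  CASE NTILE(3) OVER (ORDER BY ltv DESC)
    WHEN 1 THEN 'Gold'
    WHEN 2 THEN 'Silver'
    ELSE 'Bronze'
  END AS tier
FROM user_ltv;
```

In practice you may prefer fixed revenue thresholds to equal-sized buckets; NTILE is simply the quickest way to prototype the segmentation.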
5. SQL Masterclass: Calculating LTV from Raw Events
To calculate LTV, your SQL query needs to aggregate total revenue grouped by the user_pseudo_id. This is the unique identifier GA4 assigns to a browser or device.
The query involves three main steps. First, you must filter for events where the event_name is purchase. Second, you unnest the event_params to find the value key. Third, you sum these values while grouping by the unique user ID. This basic query forms the foundation of your LTV dashboard. Once you have this baseline, you can add layers of complexity, such as “Recency” (how long since the last purchase) and “Frequency” (how many purchases in the last 6 months).
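Putting those three steps together, a minimal version of the query (the dataset reference and date range are placeholders) might read:

```sql
-- Lifetime revenue per device/browser, following the three steps above:
-- filter to purchases, unnest the value parameter, sum per user.
SELECT
  user_pseudo_id,
  COUNT(DISTINCT ecommerce.transaction_id) AS purchases,
  SUM(
    -- the event-level 'value' parameter; ecommerce.purchase_revenue
    -- is an alternative source for the same figure
    (SELECT COALESCE(value.double_value, CAST(value.int_value AS FLOAT64))
     FROM UNNEST(event_params)
     WHERE key = 'value')
  ) AS lifetime_value
FROM `my-project.analytics_123456789.events_*`
WHERE event_name = 'purchase'
  AND _TABLE_SUFFIX BETWEEN '20240101' AND '20241231'
GROUP BY user_pseudo_id
ORDER BY lifetime_value DESC;
```

Note that user_pseudo_id is device-scoped, so the same person on two devices appears as two rows; Section 6 shows how a logged-in user_id resolves this.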
6. Joining CRM Data for a 360-Degree View
The real magic happens when you bring in first-party data. GA4 knows what a user does on your site, but it doesn’t know what they do in your store or after they talk to your sales team.
By using the user_id feature in GA4 (which ties a web session to your internal database ID), you can join the BigQuery table with your CRM table. This allows you to see that a user who looked at five blog posts and then disappeared actually closed a £50,000 deal via a phone call three weeks later. Without BigQuery, that marketing touchpoint would look like a failure; with BigQuery, it is revealed as a high-value lead generator.
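As a sketch, assuming a hypothetical crm.deals table keyed by the same identifier you send to GA4 as user_id (your real CRM schema will differ):

```sql
-- Join web behaviour to CRM outcomes via the shared user_id.
-- `my-project.crm.deals` and its columns are hypothetical placeholders.
SELECT
  ga.user_id,
  COUNT(DISTINCT ga.event_date) AS active_days_on_site,
  ANY_VALUE(crm.deal_value) AS closed_deal_value
FROM `my-project.analytics_123456789.events_*` AS ga
JOIN `my-project.crm.deals` AS crm
  ON ga.user_id = crm.customer_id
WHERE ga.user_id IS NOT NULL
  AND ga._TABLE_SUFFIX BETWEEN '20240101' AND '20241231'
GROUP BY ga.user_id;
```

The WHERE ga.user_id IS NOT NULL filter matters: only sessions where the visitor authenticated carry the CRM key, so anonymous traffic simply drops out of the join.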
7. Visualising the Data in Looker Studio
Data sitting in a BigQuery table is useless if it isn’t communicated to stakeholders. Looker Studio (formerly Data Studio) has a native connector for BigQuery that makes visualisation seamless.
Instead of connecting Looker Studio directly to the raw GA4 export (which can be expensive due to query costs), it is best practice to create a “Processed Table” or “View” in BigQuery. This pre-calculated table contains only the final LTV metrics. This approach ensures your dashboards load instantly and your Google Cloud costs remain under control.
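One way to build such a view (table and dataset names are illustrative):

```sql
-- A pre-aggregated view so Looker Studio never scans the raw event tables.
CREATE OR REPLACE VIEW `my-project.reporting.user_ltv` AS
SELECT
  user_pseudo_id,
  SUM(ecommerce.purchase_revenue) AS lifetime_value,
  COUNT(DISTINCT ecommerce.transaction_id) AS purchases,
  MAX(event_date) AS last_purchase_date  -- YYYYMMDD strings sort correctly
FROM `my-project.analytics_123456789.events_*`
WHERE event_name = 'purchase'
GROUP BY user_pseudo_id;
```

For larger properties, a scheduled query that materialises this as a real table (rather than a view, which re-runs on every dashboard load) reduces costs further.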
8. Predictive LTV: Using BigQuery ML
For those who want to move from descriptive analytics to predictive analytics, BigQuery ML (Machine Learning) allows you to build models using standard SQL. You can train a “Linear Regression” model on your historical GA4 data to predict the future LTV of new users.
The model looks at early-stage behaviours—such as the number of pages viewed in the first session or the time spent on the pricing page—and predicts how much that user is likely to spend over the next 12 months. This allows you to bid more aggressively in Google Ads for users who “look like” your future high-LTV customers.
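A hedged sketch of the BigQuery ML workflow, assuming a hypothetical training table whose feature columns and label you would engineer yourself from the raw events:

```sql
-- Train a linear regression on early-session features to predict 12-month spend.
-- `my-project.ml.training_data` and its columns are illustrative placeholders.
CREATE OR REPLACE MODEL `my-project.ml.predicted_ltv`
OPTIONS (model_type = 'linear_reg', input_label_cols = ['ltv_12m']) AS
SELECT
  first_session_pageviews,
  time_on_pricing_page_sec,
  ltv_12m
FROM `my-project.ml.training_data`;

-- Score new users with the trained model:
SELECT user_pseudo_id, predicted_ltv_12m
FROM ML.PREDICT(MODEL `my-project.ml.predicted_ltv`,
                TABLE `my-project.ml.new_users`);
```

The predicted column is automatically named predicted_ plus the label name, which is how predicted_ltv_12m appears in the scoring query.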
9. Managing Costs and Performance
BigQuery is “pay-as-you-go,” which is both a blessing and a curse. A poorly written query that scans your entire historical database can cost several pounds. To keep costs low, limit every query to the dates and columns you actually need.
GA4 exports its data as date-sharded tables by default, one events_YYYYMMDD table per day, rather than a single partitioned table. When writing your SQL, always include a WHERE _TABLE_SUFFIX BETWEEN... clause to limit the scan to specific dates. Additionally, avoid using SELECT * as it forces the engine to read every column, including the heavy nested parameters you might not need.
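For instance, a cost-conscious query over a single week of daily shards, naming only the columns it needs, might look like this (dataset reference and dates are placeholders):

```sql
-- Restrict the scan to one week of daily shards; name columns explicitly.
SELECT
  event_date,
  COUNT(*) AS events
FROM `my-project.analytics_123456789.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20240101' AND '20240107'
GROUP BY event_date
ORDER BY event_date;
```

Before running anything, check the “bytes processed” estimate in the BigQuery console; it updates as you type and is the quickest sanity check on query cost.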
10. Conclusion: The Path to Data Maturity
Moving from the GA4 interface to BigQuery is one of the most important steps an analyst can take. It marks the transition from being a “report consumer” to a “data architect.”
While the initial hurdle of learning SQL and understanding nested schemas may seem daunting, the rewards are immense. You gain the ability to answer complex business questions, prove the long-term value of marketing efforts, and build a resilient data foundation that scales with your organisation. Start small with a simple LTV calculation, and soon you will be building the automated data pipelines that drive global business decisions.