Two of the most impactful features of GA4 are data thresholds (thresholding) and reporting identity. To interpret report data correctly, you must understand how these features affect GA4 predefined reports and GA4 Explorations.
GA4 thresholding
GA4 thresholding is a feature that withholds certain data from reports to protect user privacy. This is done by removing rows with very small numbers of users or events, preventing identification of individual users based on their data in GA4 reports. GA4 thresholding is applied when all of the following conditions are met:
- Google Signals is enabled.
- The reporting identity is either Blended or Observed.
- A report contains rows with small user or event numbers.
The exact number of users or events that triggers thresholding is not known, but it is typically around 50 or below. Also note that a GA4 360 license does not prevent GA4 thresholding – it impacts GA4 standard and 360-enabled properties alike.
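The three conditions above can be written as a simple check. This is a minimal sketch rather than Google's implementation: the function name, its parameters, and the cutoff of 50 are assumptions for illustration only, since the real trigger is not published.

```python
# Hypothetical helper that mirrors the three conditions listed above.
# ASSUMED_THRESHOLD is an illustrative guess; GA4 does not publish the real value.
ASSUMED_THRESHOLD = 50


def row_may_be_withheld(signals_enabled: bool,
                        reporting_identity: str,
                        row_count: int,
                        threshold: int = ASSUMED_THRESHOLD) -> bool:
    """Return True only when all three thresholding conditions hold for a row."""
    identity_uses_signals = reporting_identity.lower() in {"blended", "observed"}
    return signals_enabled and identity_uses_signals and row_count < threshold


# A small row under Blended with Google Signals on is a candidate for withholding.
print(row_may_be_withheld(True, "Blended", 12))
# The same row under the Device-based identity is not.
print(row_may_be_withheld(True, "Device-based", 12))
```

Treat the result as a hint that a row is at risk, not as a prediction of what GA4 will actually hide.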
GA4 reporting identity
A reporting identity is a way for GA4 to identify users across different sessions and devices. There are three different reporting identities:
- Device-based: This reporting identity uses the device ID to identify users.
  - Benefit = Privacy: This reporting identity prevents the identification of individual users with GA4 data, and therefore is not impacted by data thresholds.
- Observed: This reporting identity uses the user ID, Google Signals, and the device ID to identify users.
  - Benefit = Accuracy: This reporting identity can identify users across different sessions and devices, which can improve the accuracy of your reports.
- Blended: This reporting identity uses a combination of the user ID, Google Signals, the device ID, and modeled data to identify users.
  - Benefit = Accuracy beyond observed behavior: This reporting identity can identify users across different sessions and devices, and it can also use modeled data to fill in gaps in the data.
The reporting identity that you choose affects the data that appears in your reports. For example, under the Device-based reporting identity, GA4 cannot connect visits made on different devices, so a person who visits your website or app on multiple devices is counted as a separate user on each device.
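To illustrate how the choice of identity changes user counts, here is a toy sketch. It is a deliberate simplification and not GA4's actual logic: the event log, the field layout, and both function names are invented, and Google Signals and modeled data are left out entirely.

```python
# Toy event log of (device_id, user_id) pairs.
# user_id is None when the visitor is not signed in. All values are made up.
events = [
    ("phone-1", "u42"),
    ("laptop-7", "u42"),   # the same signed-in person on a second device
    ("tablet-3", None),    # an anonymous visitor
    ("phone-1", "u42"),
]


def count_users_device_based(rows):
    """Count users by device ID only, as a device-based identity would."""
    return len({device_id for device_id, _ in rows})


def count_users_with_user_id(rows):
    """Prefer the user ID and fall back to the device ID when it is missing."""
    keys = set()
    for device_id, user_id in rows:
        if user_id is not None:
            keys.add(("user", user_id))
        else:
            keys.add(("device", device_id))
    return len(keys)


print(count_users_device_based(events))   # 3 distinct devices
print(count_users_with_user_id(events))   # 2 distinct people
```

The same four events yield three users under the device-only count and two once the user ID is used to join devices, which is why totals can shift when you switch identities.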
You can change the reporting identity for your property in the Google Analytics 4 admin console. To do this, go to Admin > Property > Reporting identity.
What data is affected by GA4 thresholding?
The following could be incompletely reported due to GA4 thresholding in predefined reports, GA4 Explorations, and in data returned by the GA4 Data API:
- User counts.
- Event counts.
- User demographics.
- User interests.
- User engagement.
How can you see if GA4 thresholding is applied to your reports?
If you see a warning message in your GA4 reports that says “Data thresholds are applied,” thresholding is in effect. You can still view these reports, but rows with user or event counts below the threshold are withheld.
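For data pulled through the GA4 Data API, a similar check can be done in code. The Data API v1 documents a `subjectToThresholding` boolean in the response metadata of `runReport`; the sketch below assumes you already have the response as parsed JSON. The helper name and the sample payload are made up for illustration.

```python
def thresholding_applied(report_response: dict) -> bool:
    """Return True if a parsed runReport JSON response reports thresholding."""
    metadata = report_response.get("metadata", {})
    # The flag is absent when no thresholding occurred, so default to False.
    return bool(metadata.get("subjectToThresholding", False))


# Illustrative payload only; a real response also contains headers and rows.
sample_response = {
    "rowCount": 2,
    "metadata": {"subjectToThresholding": True, "currencyCode": "USD"},
}

if thresholding_applied(sample_response):
    print("Some rows were withheld; treat totals as incomplete.")
```

Logging this flag next to each scheduled export makes it easier to spot which pulls returned incomplete data.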
How to remove GA4 thresholding
Any of the following can remove GA4 thresholding:
- Change to the device-based reporting identity.
- Increase your report date range.
- Turn off Google Signals.
- Report using GA4 data exported to BigQuery.
To eliminate data thresholds, change your reporting identity to device-based, or increase your report date range so that your reported user counts rise above the GA4 data threshold trigger. If you do not want to change your reporting identity, or you need to report on a shorter date range, you can turn off Google Signals to remove thresholding. However, reports that contain user counts may remain subject to thresholding for a period of time after Google Signals is disabled.
Note that if you turn off Google Signals, the extra data it collects stops being recorded. We recommend that you keep Google Signals enabled and simply change your reporting identity to remove data thresholds. Changing the reporting identity does not alter the underlying GA4 data; it only changes how that data is reported in the UI. Turning off Google Signals, however, does affect the underlying data: you cannot restore the extra data that would have been recorded while it was disabled.
Reporting from GA4 data exported to BigQuery also eliminates data thresholds, as well as GA4 sampling and GA4 (other) rows (Universal Analytics also has sampling and row limits). However, BigQuery-exported GA4 data includes neither the extra data from Google Signals nor GA4 modeled data. As a result, reports built on BigQuery-exported GA4 data will differ meaningfully from GA4 predefined reports and GA4 Explorations.
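As a rough sketch of what reporting from the export can look like, the helper below builds a query for daily users and events against the standard `events_*` export tables. The function name, project ID, and dataset name are hypothetical; the columns (`event_date`, `user_pseudo_id`) and the `_TABLE_SUFFIX` filter come from the GA4 BigQuery export schema. Running the query is left to whichever BigQuery client you use.

```python
def build_daily_users_query(project: str, dataset: str, start: str, end: str) -> str:
    """Build SQL for daily users and events; dates are YYYYMMDD strings.

    Values are interpolated directly, so pass only trusted, validated inputs.
    """
    return f"""
SELECT
  event_date,
  COUNT(DISTINCT user_pseudo_id) AS users,
  COUNT(*) AS events
FROM `{project}.{dataset}.events_*`
WHERE _TABLE_SUFFIX BETWEEN '{start}' AND '{end}'
GROUP BY event_date
ORDER BY event_date
""".strip()


# Hypothetical project and dataset names, for illustration only.
print(build_daily_users_query("my-project", "analytics_123456789",
                              "20240101", "20240131"))
```

Because `user_pseudo_id` is a device-level identifier, counts from this query behave like the device-based identity and will not match Blended or Observed totals in the GA4 UI.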
Takeaways
- Always be on the lookout for the possible impact of data thresholds on your GA4 reporting.
- Switch between reporting identities to see the related impact on your GA4 reporting before drawing conclusions with data reported by a single identity.
- Keep Google Signals enabled, because the extra data it provides is not recorded while the feature is turned off.
- Remove data thresholds from GA4 reporting by switching to the device-based reporting identity, or increasing your report date range.
- Start exporting your GA4 data to BigQuery as soon as possible, so that this valuable data source reaches as far back in time as possible when your organization starts using it for analysis (which will likely happen sooner rather than later).