Regex for GA (For Dummies)
Are you an absolute wizard with regular expressions in GA4? If so, this blog is almost certainly not for you. If, however, you are new to regex and/or GA4, you’ve come to the right place. Welcome to the (hopefully) practical guide to regex for GA4. We’ll learn about some regex basics and review 4 specific ways we can use regular expressions in Google Analytics 4.
Here’s what we’re going to cover in this guide.
- What is regex? And what does “full match” vs. “partial match” mean?
- How can we use regex in GA4 Explorations?
- Where can we use regex for creating GA4 segments and audiences?
- Can we use regex in creating custom channel groupings? (Spoiler alert: yes)
- How can we use regex to create custom events in GA4?
You can read all about it below. Or watch the video version, if you’d prefer that.
And please keep the following offer in mind. If you’ve found more helpful ways to use regex in your Google Analytics work, consider this your invitation to write a guest blog post or serve as a guest instructor on the YouTube channel.
Can I Use Regex in GA4?
Short answer: yes! You can use regular expressions in Google Analytics 4.
Longer answer: yes, but…it’s different than using regex in Universal Analytics.
So, What is Regex Exactly?
Regex is short for “regular expression.” A regular expression is a string of characters that defines a pattern, and that pattern can be used for finding and matching text.
Regex (rhymes with “edge-x” not “egg-x”) can be used in GA4 to perform a variety of useful matching functions.
When regex does its matching work, there are multiple ways that it can go about the matching process.
Partial Match Regex vs. Full Match Regex in GA4
The default behavior of regular expressions is something called a “partial match.” With partial match, a regular expression will be true if your regex pattern is found anywhere within the data.
Partial match is the default way that regular expressions work in Universal Analytics. Google provides this example in its Analytics regex support article to explain partial match.
“…If you provide the pattern “India” the regex matches “India”, “Indian”, “Indiana”, “Indianapolis”, and so on. You don’t need to use metacharacters to achieve the partial match.”
Full match, on the other hand, is much more restrictive. With full match (sometimes called exact match), the regular expression must match exactly in order for it to be true. Full match is the default way that regular expressions work in Google Analytics 4.
Here is what Google has to say for itself about full match in that previously linked support article. Pay attention to the red arrow. This is where we get into things called “metacharacters”.
In the example above, the dot asterisk (.*) functions as a wildcard: the dot matches any single character and the asterisk repeats it zero or more times. In other words, India.* will match any value that begins with India (including India itself, and also Indian, Indiana, Indianapolis, etc.).
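To make the difference concrete, here is a small sketch using Python's built-in re module. This is an illustration only: Python is not the engine GA4 runs, but for patterns this simple the behavior should be the same. Here re.search stands in for partial match and re.fullmatch stands in for full match. The value "West India" is made up to show a case where the two modes disagree.

```python
import re

# Sample dimension values; "West India" is an invented extra to show the difference.
values = ["India", "Indian", "Indiana", "Indianapolis", "West India"]

# Partial match: the pattern may appear anywhere inside the value.
partial = [v for v in values if re.search("India", v)]

# Full match: the pattern must account for the entire value.
full_plain = [v for v in values if re.fullmatch("India", v)]

# Full match plus the .* wildcard: any value that starts with "India".
full_wildcard = [v for v in values if re.fullmatch("India.*", v)]

print(partial)        # ['India', 'Indian', 'Indiana', 'Indianapolis', 'West India']
print(full_plain)     # ['India']
print(full_wildcard)  # ['India', 'Indian', 'Indiana', 'Indianapolis']
```

Notice that "West India" passes the partial match but not the full match with India.*, because under full match the value has to begin with India.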
What is a Metacharacter in Regex?
A metacharacter has some sort of special meaning in a regular expression. Different metacharacters have different regex super powers. Here are some of the most important of these special characters.
Important Regex Metacharacters for GA4
Here is a partial list, prioritized based on potential utility in Google Analytics.
- Dot (.) matches any single character
- Dot asterisk (.*) wildcard match (any run of zero or more characters)
- Forward slash (/) a character that marks the start and end of a regular expression in some programming languages and regex testers, and that appears constantly in URLs. Those tools expect it to be escaped with the back slash (the next item in this list). GA4's RE2-style syntax should treat an unescaped forward slash as an ordinary character, and escaping it should be harmless.
- Back slash (\) treat the next character literally if it’s a metacharacter
- Caret (^) a position “anchor” that indicates the beginning of the string
- Dollar Sign ($) a position “anchor” that indicates the end of the string
- Question Mark (?) match the preceding character zero or one times
- Pipe (|) OR
Personally, I find myself using the pipe symbol most often in my GA4 regular expressions. We’ll see an example of that in the example Exploration below.
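Here is a quick way to poke at each of these characters, again sketched with Python's re module as a stand-in for GA4 (an assumption, since GA4 uses its own engine). The helper names full and partial, and sample values such as example.com and purchase, are made up for illustration.

```python
import re

def full(pattern: str, value: str) -> bool:
    """True when the pattern accounts for the entire value (full match)."""
    return re.fullmatch(pattern, value) is not None

def partial(pattern: str, value: str) -> bool:
    """True when the pattern is found anywhere in the value (partial match)."""
    return re.search(pattern, value) is not None

results = {
    # Dot: exactly one character of any kind.
    "dot": full("gr.y", "grey"),
    # Dot asterisk: any run of characters after the prefix.
    "dot_asterisk": full("/blog/.*", "/blog/ga4-regex/"),
    # An unescaped dot is a wildcard, so it also accepts "X".
    "unescaped_dot": full("example.com", "exampleXcom"),
    # The back slash makes the dot literal, so "X" is rejected.
    "escaped_dot": full(r"example\.com", "exampleXcom"),
    # Question mark: the trailing "s" is optional.
    "question_mark": full("affiliates?", "affiliate") and full("affiliates?", "affiliates"),
    # Pipe: either alternative is accepted.
    "pipe": full("page_view|purchase", "purchase"),
    # Caret: the value must START with /tag/, and this one starts with /blog.
    "caret": partial("^/tag/", "/blog/tag/"),
    # Dollar sign: the value must END with a forward slash.
    "dollar": partial("/$", "/tag/ga4-events/"),
}

for name, matched in results.items():
    print(f"{name}: {matched}")
```

Everything prints True except escaped_dot and caret, which are the two cases set up to fail on purpose.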
How to Use Regex in GA4 Explorations
Here’s an example of an Exploration that relies on a regex filter. The yellow arrow shows the landing page dimension. The red arrow shows the event name dimension. The metric is event count. But the report is not showing all events. Instead, it’s only showing two.
How did that happen? The answer is a regex filter.
Here’s the filter. We filter on Event name and set the filter type to “matches regex”. In this particular filter, we’re using the vertical pipe (|) metacharacter.
We used the caret (^) followed by the open parenthesis to begin the expression. Then, we entered the first event (page_view), the pipe symbol representing OR (|), and the second event name (internal_link_click), before the closing parenthesis. A couple clarifying notes:
- This expression will work with or without the caret and parentheses. In other words, page_view|internal_link_click will work just as well as ^(page_view|internal_link_click).
- The internal_link_click event is a custom event. If you haven’t set up this tracking in your GA4 property, you’ll need to pick a different event. Or, you can always follow this guide to set up your own custom event to track internal link clicks.
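If you'd like to sanity check that claim outside of GA4, the sketch below runs both versions of the expression over a handful of event names. It uses Python's re.fullmatch as a stand-in for GA4's full match behavior, which is an assumption. The list of event names is only an example.

```python
import re

# Example event names; only the first two should survive the filter.
event_names = ["page_view", "internal_link_click", "scroll",
               "page_view_tag_page", "session_start"]

with_anchor = "^(page_view|internal_link_click)"
without_anchor = "page_view|internal_link_click"

# Under full match, both expressions keep the same events.
kept_with = [e for e in event_names if re.fullmatch(with_anchor, e)]
kept_without = [e for e in event_names if re.fullmatch(without_anchor, e)]

# Under partial match, page_view_tag_page would sneak in as well.
kept_partial = [e for e in event_names if re.search(without_anchor, e)]

print(kept_with)     # ['page_view', 'internal_link_click']
print(kept_without)  # ['page_view', 'internal_link_click']
print(kept_partial)  # ['page_view', 'internal_link_click', 'page_view_tag_page']
```

The last line is a reminder of why full match is handy here: page_view_tag_page contains page_view, so a partial match would include it.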
Boom! You are totally using regex now.
How to Use Regex for Creating Segments and Audiences
We can also use regular expressions for creating GA4 segments.
When we build a custom segment, we specify whether we want a “User segment”, an “Event segment” or a “Session segment”. If you’re not familiar with segments, this piece on GA4 segments might be a good stop first. Otherwise, let’s proceed with creating a segment based on a specific subset of our sessions.
When we create a session segment we specify the conditions of which particular sessions to include (or exclude). The session segment below groups our sessions originating from Google Organic Search and Bing Organic Search. Since we want to group traffic this way, we are going to look at the Session source / medium traffic dimension. We can specify the conditions for that traffic with a regular expression.
You can see our expression in the red box below. We’ve changed the matching condition to “matches regex” and then have entered the following expression: google / organic|bing / organic.
There are other ways we could create this segment without using regex. But some people may find it simpler or just more comfortable to use a regular expression.
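As a rough check of the segment expression, here is a sketch that applies it to a few Session source / medium values, with Python's re.fullmatch standing in for GA4's matching (an assumption). The sample values and the shorter grouped variant are made up for illustration; they are not taken from the segment builder.

```python
import re

# The expression from the segment condition.
pattern = "google / organic|bing / organic"

# A shorter way to write the same thing by grouping the two sources.
grouped = "(google|bing) / organic"

# Example Session source / medium values.
source_mediums = [
    "google / organic",
    "bing / organic",
    "google / cpc",
    "(direct) / (none)",
    "newsletter / email",
]

in_segment = [sm for sm in source_mediums if re.fullmatch(pattern, sm)]
in_segment_grouped = [sm for sm in source_mediums if re.fullmatch(grouped, sm)]

print(in_segment)          # ['google / organic', 'bing / organic']
print(in_segment_grouped)  # ['google / organic', 'bing / organic']
```

The spaces and the forward slash in the pattern are matched literally, which is why google / cpc is left out.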
How to Use Regex in GA4 Custom Channel Groupings
Google announced custom channel groupings for GA4 in the spring of 2023.
When we use custom channel groupings we change the default channel definitions that Google has set for classifying traffic into channels. Here is a list of those default channel groupings for reference.
One such traffic channel is Affiliate traffic. As you can see in the definition below, traffic flows into the “Affiliates” traffic channel when the medium equals affiliate.
But what happens if your marketing team has been using both affiliate and affiliates in your UTM tagged affiliate links? How can you get both UTM variants into your Affiliates traffic channel?
Enter custom channel groupings. Here’s what the traffic channel looks like by default.
Change Your Channel Groupings
To change your channel groupings, access “Channel Groups” within your Data Settings in the admin panel. Hit the blue “create new channel group” button.
Scroll down to where you see the Affiliates channel. Here is the default.
Change the conditions to match your specific needs. In our case we want the traffic Medium to match either affiliate or affiliates. So we can use the vertical pipe to include both in our matches regex conditions.
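Be careful with casing here. Regular expressions are case-sensitive by default in most regex engines, so don’t assume that Affiliate and affiliate will be treated the same way; check the result against your own data. Then apply the change and you’re on your way.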
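The affiliate condition can be sketched in a couple of ways. The example below compares the pipe version with a question mark version, using Python's re.fullmatch as a stand-in for GA4 (an assumption). The list of medium values is invented, and "affiliated" is included only to show what full match rejects.

```python
import re

# Example medium values; "affiliated" is included to show what is NOT matched.
mediums = ["affiliate", "affiliates", "affiliated", "email", "referral"]

# Version with the vertical pipe: either spelling.
with_pipe = "affiliate|affiliates"

# Alternative with the question mark: the trailing "s" is optional.
with_question_mark = "affiliates?"

matched_pipe = [m for m in mediums if re.fullmatch(with_pipe, m)]
matched_question = [m for m in mediums if re.fullmatch(with_question_mark, m)]

print(matched_pipe)      # ['affiliate', 'affiliates']
print(matched_question)  # ['affiliate', 'affiliates']
```

Either form should do the job. The pipe version is arguably easier to read, and it extends naturally if a third spelling ever shows up in your UTMs.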
How to Use RegEx to Create Custom GA4 Events
There are two primary ways to create events in GA4. One way is to use Google Tag Manager (GTM) to create GA4 event tags that – after being paired with a proper trigger – send event data into GA4. With GTM you can create more sophisticated conditions using things like trigger groups, but you don’t always need that level of detail. In those simpler cases, you can use the built-in event creation tool in GA4. This second way does not involve Tag Manager at all. There are some limitations relative to the first method, but it can also be a quick and efficient way to create custom GA4 events.
It’s possible to use regular expressions to create custom events in the built-in GA4 event creation tool. You can also use regex in Google Tag Manager, but we’re not covering that right now.
When creating custom events with the built-in event creation tool, we specify certain event conditions. When those conditions match, we can create a brand new event. We can use the operator “matches regular expression” to make this happen.
Let’s see an example.
Sample GA4 Custom Event With Regex
The screenshot below shows a new custom event called page_view_tag_page. This creates an event of the same name every time multiple conditions are matched at the same time. Our specific goal is to create an event (called page_view_tag_page) that will fire every time a visitor views one of the tag pages on the Root and Branch site. By “tag pages” I mean all pages that begin with https://www.rootandbranchgroup.com/tag/.
The first condition is where the event_name equals page_view. That is pretty straightforward. The second condition is more complex and is where regex comes into play.
First, we change the Operator to “matches regular expression”. You can see this at the purple arrow below. Then, we select the specific event parameter we want to match with our expression. You can see this to the left of the purple arrow. Finally, we type our regular expression into the “Value” field. You can see this marked at the red arrow.
Here is the full version.
What does it mean, exactly?
What Is Happening in This Regular Expression?
Let’s diagnose this regular expression in the red box below. There are some specific special characters (metacharacters) that are doing work.
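We see three back slash symbols (\) marked by the red arrows. The back slash signifies that the next character should be treated literally and not as a metacharacter. The characters that follow the back slashes are a dot (.) and two forward slash symbols (/). The dot is a true metacharacter (it matches any single character), so escaping it keeps the match strict. The forward slash is treated as special by some regex tools, though GA4's RE2-style syntax should read it as an ordinary character, so escaping it should be harmless. Either way, we need these characters to be read literally since they are used in the URL string that we want to match.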
The second important part of the expression is the dot asterisk (.*) at the end of the expression. Remember the dot asterisk (.*) from Google’s example of India.* in explaining full match regex? This is exactly what we’re doing here. But instead of India.* matching Indian, Indiana, Indianapolis and so on, we want https://www.rootandbranchgroup\.com\/tag\/.* to match https://www.rootandbranchgroup.com/tag/ga4-custom-events/ and https://www.rootandbranchgroup.com/tag/ga4-events/ and https://www.rootandbranchgroup.com/tag/ga4-reports/ and so on.
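Here is a sketch of how the two conditions fit together, once more with Python's re.fullmatch as a stand-in for GA4 (an assumption, so treat the result as a sanity check and confirm in your own property). The function name creates_tag_page_event, the use of page_location as the event parameter, and the /blog/ URL are all hypothetical choices made for this example.

```python
import re

# The expression from the article: three characters are escaped with a back slash.
article_pattern = r"https://www.rootandbranchgroup\.com\/tag\/.*"

# The same expression with the forward slashes left unescaped.
unescaped_slashes = r"https://www.rootandbranchgroup\.com/tag/.*"

def creates_tag_page_event(event_name: str, page_location: str,
                           pattern: str = article_pattern) -> bool:
    """Both conditions must hold: a page_view event AND a full match on the URL.

    page_location is an assumed parameter name for this sketch.
    """
    return event_name == "page_view" and re.fullmatch(pattern, page_location) is not None

urls = [
    "https://www.rootandbranchgroup.com/tag/ga4-custom-events/",
    "https://www.rootandbranchgroup.com/tag/ga4-events/",
    "https://www.rootandbranchgroup.com/tag/ga4-reports/",
    "https://www.rootandbranchgroup.com/blog/",  # hypothetical non-tag page
]

fires = [creates_tag_page_event("page_view", u) for u in urls]
fires_unescaped = [creates_tag_page_event("page_view", u, unescaped_slashes) for u in urls]
scroll_on_tag_page = creates_tag_page_event("scroll", urls[0])

print(fires)               # [True, True, True, False]
print(fires_unescaped)     # [True, True, True, False]
print(scroll_on_tag_page)  # False
```

In this Python sketch the escaped and unescaped forward slashes behave identically. Whether the same holds in the GA4 event creation tool is worth verifying against real event data.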
Now we’re cooking with some GA4 regex gas!
Are There More Places to Use Regex in GA4?
Yes, there are. You can also use regular expressions in defining your internal traffic (for filtering) and for creating your referral exclusion list. Sound confusing? If so, you might consider reviewing this 10 step tutorial for properly setting up and configuring your GA4 property.
Looking for something else? Feel free to leave a note in the comments or on the Root and Branch YouTube channel at youtube.com/@rooted-digital. We’re always looking for new content ideas!
Wrapping Up
Don’t forget that Universal Analytics will stop processing data beginning July 1, 2023. In other words, now is the time to get more comfortable with GA4.
If you’re still learning about GA4 (as I am), I’d recommend checking out this GA4 vs. UA comparison or this list of updated GA4 questions. You can also subscribe to the Root and Branch YouTube channel for an updated video every week or so. I’ll see you there! There are explainers and tutorials on topics like these.
- GA4 page timer tracking
- Bounce rate in GA4 vs UA
- How to set up a GA4 form submission conversion
- View your UTM tagged campaign data in GA4
- How to create the Source/Medium traffic report in GA4
- Set up a custom dimension in 7 steps (and why you need to if you want to see event parameter data)
- Goals in GA4 explained vs UA
- How to link Google Ads and GA4
- How to link Google Search Console and GA4
About Root & Branch
Root & Branch is a certified Google Partner agency and focuses on paid search (PPC), SEO, Local SEO, and Google Analytics. You can learn more about us here. Or hit the button below to check out YouTube for more digital marketing tips and training resources.
Hey, thanks for article, but it has a few mistakes, could you please clarify?
“forward slash symbols (\) ” → That’s called a backslash.
How are forward slashes (/) interpreted in GA4? I found some sources saying you have to escape them with a backslash, and some saying you don’t. Unfortunately, neither approach seems to work for me in GA4.
Regex “https://rootandbranchgroup.com\/…” → You’re mixing the 2 approaches. So which one is correct?
Thanks.
Tom, hello!
Thanks for the thoughtful question and critiques. First, I appreciate the heads up on my forward slash (/) and back slash (\) goofs. I suppose I did say it was written by a dummy. Anyway, I’ve edited that and you have my thanks.
In terms of how forward slashes (/) are interpreted by GA4, my understanding was that they need to be escaped by the back slash (\), at least in the way I was trying to use them in the example in the article. When I’ve tested it (https://www.freeformatter.com/regex-tester.html), it has seemed to work out.
With that said, I also need to do some more digging on this now. Your comment made me look back at my event data and it’s not firing as I had expected. So clearly I’m missing something for GA4. I’ll do some more digging. In the meantime, if you happen to find the correct approach, I’ll be grateful for anything you can contribute to my continued edification. Thanks again.