Ten years in the wrong regex lane

Today I want to share the story of a couple of regular expressions — a tiny inaccuracy in them ended up costing the world more than 50 million hours of CPU time on iOS devices.

Disclaimer: this post is packed with technical details and might be a tough read if you don’t know how to code.

How content blockers work in Safari

First, let me explain how ad blockers in Safari actually work. AdGuard for iOS relies on Safari’s built-in mechanism called Safari Content Blockers.

These days, there’s also support for an alternative approach — declarativeNetRequest (DNR), partially compatible with Chrome and Firefox. Interestingly enough, in Safari’s case, DNR rules are silently translated into Safari Content Blocker rules under the hood.

The very first version of AdGuard for iOS was released in October 2015, and right away we ran into a nasty problem. Historically, ad blockers have always used their own filtering rule syntax. It’s powerful and specifically designed for web filtering. Apple, however, introduced their own take on it, which was quite different from what the community was used to.

To jump ahead a little: when Chrome introduced declarativeNetRequest, they also went their own way. At least when it came to URL matching, their syntax was much closer to what we were familiar with.

Here’s an example of how a standard AdGuard rule gets converted into syntax supported by Safari:

[Image: Converting an AdGuard rule into a Safari-accepted rule]

Notice what happens to the URL pattern. In AdGuard filters, we use a special wildcard-like syntax tailored specifically to URLs. The reason is simple: traditional regular expressions are too slow for this job.
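As a rough sketch of the target format: a Safari content blocker is a JSON array of rules, each with a `trigger` (holding a `url-filter` regex) and an `action`. The helper name `to_safari_rule`, the sample AdGuard rule, and the simplified regex below are illustrative placeholders, not the output of AdGuard's real converter.

```python
import json


def to_safari_rule(url_filter: str, action: str = "block") -> dict:
    """Wrap an already-converted regex in the Safari content blocker rule shape."""
    return {
        "trigger": {"url-filter": url_filter},
        "action": {"type": action},
    }


# AdGuard-style rule, shown only for comparison (hypothetical example).
adguard_rule = "||example.org/ads/"

# Simplified, hand-written regex standing in for the converted URL pattern.
safari_rule = to_safari_rule(r"^https?://example\.org/ads/")

# Safari expects a JSON array of such rules.
rules_json = json.dumps([safari_rule], indent=2)
print(rules_json)
```

The point is only the shape: the compact wildcard-like pattern on the AdGuard side has to become a regular expression string on the Safari side.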

Regular expressions

Safari Content Blockers, on the other hand, rely on regular expressions — although in a very stripped-down form, so that they can still be compiled into a structure that speeds up matching.

If you’re curious about the internals: Safari builds a DFA bytecode out of regex patterns, which is then executed by a custom interpreter: DFABytecodeInterpreter.cpp.
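To give a feel for why a DFA is attractive here, below is a toy table-driven matcher. It is not WebKit's bytecode format or interpreter; the states and the transition table are written by hand for one tiny pattern ("the input contains `ad`"), purely as an illustration.

```python
# Hand-built DFA for "input contains the substring 'ad'".
# State 0: no progress; state 1: just saw 'a'; state 2: saw 'ad' (accepting).
TRANSITIONS = {
    0: {"a": 1},
    1: {"a": 1, "d": 2},
}
ACCEPTING = 2


def dfa_contains_ad(text: str) -> bool:
    """Walk the input once; each character costs a single table lookup."""
    state = 0
    for ch in text:
        if state == ACCEPTING:
            break  # already matched, no need to read further
        # Any character without an explicit transition resets to state 0.
        state = TRANSITIONS[state].get(ch, 0)
    return state == ACCEPTING


print(dfa_contains_ad("https://example.org/ad.js"))
print(dfa_contains_ad("https://example.org/index.html"))
```

Matching never backtracks, which is why it is fast. The cost moves to the compilation step that has to build the table in the first place.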

Regexps are more versatile than the standard ad blocker syntax. Unfortunately, that flexibility isn’t really relevant for web content filtering. What we get instead is very slow rule compilation that eats up resources, plus strict limits on how many rules filters can contain. We’ve written before about Safari’s issues, and most of them are still around.

Pattern conversion v1.0

So, back in 2015, we had to figure out how to convert AdGuard’s URL patterns into regexps that Safari would accept.

We faced two major tasks at the time.

First, we needed to cut down the total number of rules in the final set. Back then, Safari had a hard limit of 50,000 rules. (We described how we tackled this in our post about ad blocking in Safari.)

By the way, the current limit has been raised to 150,000. However, due to process memory limits, you can only use around 60–80K in practice. We reported this to Apple a couple of times (Apple Feedback Assistant reports: FB19728743, FB13282146), but to no avail.

The second task was to ensure the regexps we generated were efficient enough to run fast and lightweight enough for iOS to compile them. In those days, we sometimes saw the system kill the com.apple.Safari.ContentBlockerLoader process because it consumed too many resources.

After a lot of manual testing, we settled on what seemed like the optimal conversion rules:

  • The symbol || (“start of URL”) became ^[htpsw]+://([a-z0-9-]+\.)?
  • The symbol ^ (“separator”) became [/:&?]
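The two substitutions above can be sketched as a small converter. This is a simplified illustration, not AdGuard's actual conversion code: the function name `convert_v1` is made up, and it handles only `||`, `^`, and `*`. The subdomain group is written here as `[a-z0-9-]+\.`, i.e. one whole label, which is what the single-level behavior described later implies.

```python
import re

# 2015-era fragments. The `+` inside the subdomain group is an assumption
# (one whole label), consistent with single-level subdomain matching.
START_OF_URL_V1 = r"^[htpsw]+://([a-z0-9-]+\.)?"
SEPARATOR_V1 = r"[/:&?]"


def convert_v1(pattern: str) -> str:
    """Convert a simple AdGuard URL pattern into a regex string (sketch)."""
    parts = []
    i = 0
    if pattern.startswith("||"):
        parts.append(START_OF_URL_V1)
        i = 2
    while i < len(pattern):
        ch = pattern[i]
        if ch == "^":
            parts.append(SEPARATOR_V1)   # AdGuard "separator" symbol
        elif ch == "*":
            parts.append(".*")           # AdGuard wildcard
        else:
            parts.append(re.escape(ch))  # everything else is a literal
        i += 1
    return "".join(parts)


print(convert_v1("||example.org^"))
# prints: ^[htpsw]+://([a-z0-9-]+\.)?example\.org[/:&?]
```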

We felt confident we’d done our homework, so we stopped worrying about it and left things as they were — for nearly ten years.

We were wrong

It all started with another bug report. The issue was that our standard conversion method slightly changed the semantics of the special || symbol. On iOS, it ended up matching only a single subdomain level, while in every other version of AdGuard it matched across all levels.

The simple fix was to use the regular expression originally suggested by the WebKit developers back in 2015 — the one we had dismissed as “non-optimal” at the time. But we were so sure of our choice back then that we didn’t even bother to recheck until recently.

The changes we should have made were dead simple:

  • Replace || with ^[^:]+://+([^:/]+\.)?
  • Replace ^ in most cases with [/:]
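To see the semantic difference from the bug report, here is a small comparison of the two prefixes on a hypothetical host. Python's `re` is a backtracking engine, unlike Safari's DFA, so this only illustrates what matches and says nothing about speed. The old subdomain group is written as `[a-z0-9-]+\.` (one whole label), which is an assumption consistent with the single-level behavior described above.

```python
import re

HOST = r"example\.org"

# Old conversion: "||" prefix plus the old separator class.
OLD = re.compile(r"^[htpsw]+://([a-z0-9-]+\.)?" + HOST + r"[/:&?]")
# New conversion: the prefix suggested by the WebKit developers plus [/:].
NEW = re.compile(r"^[^:]+://+([^:/]+\.)?" + HOST + r"[/:]")

urls = [
    "https://example.org/ad.js",
    "https://cdn.example.org/ad.js",
    "https://a.b.cdn.example.org/ad.js",
]

for url in urls:
    print(url, bool(OLD.search(url)), bool(NEW.search(url)))
# prints:
# https://example.org/ad.js True True
# https://cdn.example.org/ad.js True True
# https://a.b.cdn.example.org/ad.js False True
```

The old prefix allows at most one label before the host, because its character class excludes the dot. The new one uses `[^:/]+`, which includes dots and therefore spans any number of subdomain levels.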

But was there really that much of a difference? Turns out — oh yes, there was.

Oh boy, how wrong we were

After swapping in the new regexps, we ran a couple of quick tests and the results blew us away. Rule loading speed in Safari didn’t just improve a little — it skyrocketed.

To put it in numbers: compiling the Tracking Protection filter in Safari became 5.5 times faster, and compiling the Base filter became 2.8 times faster.

And to make it more tangible, just take a look at the video below:

When you add up the number of AdGuard users over the past decade, and how many times the app had to recompile filters, the wasted CPU time comes out to at least 50 million extra hours on iOS devices.

I honestly feel ashamed about this mistake, especially knowing that the correct solution was staring us in the face the whole time. In hindsight, it seems obvious that the new regexps would compile and run faster than the ones we had chosen.

So what exactly was our mistake back then? I think it all comes down to a flawed testing methodology. We tried to judge “by eye,” instead of:

  1. Defining a clear set of criteria: memory usage, speed, and actual content blocker performance in the browser.
  2. Most importantly: learning how to measure those things precisely. Not by eyeballing Activity Monitor or top, but by using a proper profiler.
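As a rough illustration of measuring instead of eyeballing, here is a minimal harness that reports median wall time and peak Python-level memory for a task. It is not a substitute for a real profiler, and it cannot see inside Safari's rule compiler. The `measure` helper and the stand-in workload are hypothetical.

```python
import time
import tracemalloc
from statistics import median


def measure(task, repeats: int = 5) -> dict:
    """Time task() several times, then trace one extra run for peak memory."""
    timings = []
    for _ in range(repeats):
        started = time.perf_counter()
        task()
        timings.append(time.perf_counter() - started)

    # Memory is traced in a separate run so tracing overhead
    # does not distort the timings above.
    tracemalloc.start()
    task()
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    return {"median_seconds": median(timings), "peak_bytes": peak_bytes}


# Stand-in workload; a real test would run the conversion or compilation step.
result = measure(lambda: [f"||example{i}.org^" for i in range(10_000)])
print(result)
```

Fixing the criteria and the repeat count up front makes two candidate regexps comparable by numbers rather than by impression.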

The good news is, this problem is now fixed. And I really hope we’ve learned the lessons we needed to, so we won’t repeat mistakes like this in the future.
