Anycast and BGP: How AdGuard DNS serves millions of queries
AdGuard DNS started as a pet project of ours, as a non-commercial addition to the ad blocker. But over time it has grown into a serious standalone product and has accumulated a great number of users around the world. It currently handles more than 1 million requests per second, and that rate is growing every day.
Originally, AdGuard DNS ran on a single server. As the number of users grew, we ran into two problems:
- One server can no longer handle the load.
- Since clients are located all over the world, for some of them the access time to a server somewhere in Europe is unacceptably long.
In this post we will tell you how we solved these problems.
No matter where you are located in the world, AdGuard DNS will respond quickly, as if the server were near you.
But what do you do when there are millions of clients around the world? Every one of them wants to connect to the closest server, yet they all type into their routers the same IP address that you published on your website. How do you decide which client should connect to which server? The solution is anycast routing.
I will briefly explain what it is in general terms, keeping it as simple as possible and using as little jargon as I can.
The first thing we need to talk about is how Internet routing works, and the easiest way to do this is to use an example. Let's assume that Kenny from Colorado is trying to connect to a remote server in another city or even country.
Kenny has already done everything he can: he found the IP address of the server he wants to connect to and sent a data packet to it. In practice, the packet first goes to his ISP's router.
Here's what happens next:
- Kenny's ISP's router sends the packet to another ISP's router (to which it is directly connected by a physical cable).
- That router sends the packet to the next router (again, via cable).
- The process continues until the data packet reaches the router to which the target server is connected.
It looks quite simple, but there is one non-obvious point. The entire Internet is a huge network of interconnected servers, routers, and other equipment, which belongs to different Internet service providers that connect end users to the Internet. Kenny's ISP is connected to more than one other ISP (such directly connected ISPs are called "BGP peers"). How does it know which of them to hand the data packet over to?
It works as follows:
- The router attached to the target server informs all neighboring routers (its BGP peers) that it is "responsible" for all IP addresses with a certain prefix. These addresses include the IP address of the target server. The BGP protocol (Border Gateway Protocol) is used to transmit this information.
- Those neighboring routers pass this information on to the other routers to which they are connected.
- Finally, Kenny's router receives information about all the routes that can be used to reach the target server.
- Selecting a particular route is then very simple: the router picks the route that passes through the fewest routers.
Note that I am simplifying things to make the description easier to follow. The actual information exchange takes place between so-called autonomous systems, each of which may pass traffic through a number of its own routers internally. An autonomous system is a collection of IP addresses and routers, usually managed by a single organization (e.g. an ISP).
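The steps above can be sketched in a few lines of Python. This is a toy model under loud assumptions, not how real BGP routers are implemented: the topology and the names (`KennyISP`, `TransitA`, `ServerISP` and so on) are made up, every loop-free path is passed on to every neighbor, and path length is the only selection criterion. Real routers advertise only their best path and apply local policy before path length is even compared.

```python
from collections import deque

# Hypothetical topology: each key is an autonomous system (AS),
# each value lists the ASes it is directly connected to (its BGP peers).
PEERS = {
    "KennyISP": ["TransitA", "TransitB"],
    "TransitA": ["KennyISP", "TransitC"],
    "TransitB": ["KennyISP", "ServerISP"],
    "TransitC": ["TransitA", "ServerISP"],
    "ServerISP": ["TransitB", "TransitC"],
}


def propagate(origin, peers):
    """Spread an announcement from `origin`; return all loop-free paths each AS learns."""
    learned = {name: [] for name in peers}
    queue = deque([(origin, [origin])])
    while queue:
        current, path = queue.popleft()
        for neighbor in peers[current]:
            # Loop prevention: ignore a path that already contains the receiver.
            if neighbor in path:
                continue
            new_path = [neighbor] + path
            learned[neighbor].append(new_path)
            queue.append((neighbor, new_path))
    return learned


def best_path(paths):
    """Toy selection rule from the text: the shortest path wins."""
    return min(paths, key=len)


learned = propagate("ServerISP", PEERS)
print(best_path(learned["KennyISP"]))  # ['KennyISP', 'TransitB', 'ServerISP']
```

In this made-up topology Kenny's ISP learns two routes to the server, one through `TransitB` and a longer one through `TransitA` and `TransitC`, and picks the shorter one.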
Okay, we have routing figured out. So how does that help us? What is anycast anyway? Let's imagine a situation in which many routers around the world say the same thing to their neighbors: "Hey, I am responsible for all IP addresses with this prefix." As in the previous example, this information eventually reaches Kenny's router.
And this router chooses the shortest route possible for Kenny's packet.
This mechanism is called anycast routing, and it is what we use in AdGuard DNS to make sure that the nearest server responds to you.
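Here is a minimal sketch of that idea, assuming a made-up topology in which two sites announce the same prefix. The names (`KennyISP`, `EuropeISP`, `US-Transit`, `EU-Transit`) and the two sites are purely illustrative, and "nearest" here means nothing more than the fewest hops between autonomous systems.

```python
from collections import deque

# Hypothetical AS-level topology; all names are illustrative.
LINKS = {
    "KennyISP": ["US-Transit"],
    "EuropeISP": ["EU-Transit"],
    "US-Transit": ["KennyISP", "Miami", "EU-Transit"],
    "EU-Transit": ["EuropeISP", "Amsterdam", "US-Transit"],
    "Miami": ["US-Transit"],
    "Amsterdam": ["EU-Transit"],
}

# Every site announces the same prefix, so clients use one IP address.
ANYCAST_SITES = ["Miami", "Amsterdam"]


def hops_from(origin, links):
    """Breadth-first search: number of hops from `origin` to every other AS."""
    distance = {origin: 0}
    queue = deque([origin])
    while queue:
        current = queue.popleft()
        for neighbor in links[current]:
            if neighbor not in distance:
                distance[neighbor] = distance[current] + 1
                queue.append(neighbor)
    return distance


def nearest_site(client, sites, links):
    """Return the announcing site that is the fewest hops away from `client`."""
    hop_counts = {site: hops_from(site, links)[client] for site in sites}
    return min(hop_counts, key=hop_counts.get)


print(nearest_site("KennyISP", ANYCAST_SITES, LINKS))   # Miami
print(nearest_site("EuropeISP", ANYCAST_SITES, LINKS))  # Amsterdam
```

Both clients send packets to the same address, yet each one ends up at a different site, simply because a different announcement won on the way to its ISP.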
Drawbacks of anycast
Anycast is a good solution, but unfortunately not an ideal one. The shortest route is not necessarily the fastest one, because route selection takes into account only the number of "hops" in the chain, not the quality of each "hop". Even so, BGP allows us to influence the routes within certain limits. For example, we can "artificially" make a certain route longer so that it looks less attractive.
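Making a route artificially longer is commonly done by repeating one's own AS number in the announced path, a technique known as AS path prepending. The sketch below shows the effect under simple assumptions: the AS numbers come from the range reserved for documentation, the two routes are invented, and the shortest path always wins.

```python
# AS numbers taken from the range reserved for documentation (64496-64511).
OUR_AS = 64496
TRANSIT_A, TRANSIT_B, TRANSIT_C = 64500, 64501, 64502


def prepend(as_path, times):
    """Repeat the origin AS (the last element) `times` extra times."""
    return as_path + [as_path[-1]] * times


def best_path(paths):
    """Toy selection rule: the shortest AS path wins."""
    return min(paths, key=len)


# Two hypothetical routes to the same anycast prefix, as seen by a client's ISP.
via_site_1 = [TRANSIT_A, OUR_AS]             # short, but slow in practice
via_site_2 = [TRANSIT_B, TRANSIT_C, OUR_AS]  # longer, but faster in practice

# Without prepending, the short route wins.
assert best_path([via_site_1, via_site_2]) == via_site_1

# Announce the first site with our AS number repeated two extra times.
lengthened = prepend(via_site_1, 2)
print(lengthened)  # [64500, 64496, 64496, 64496]

# Now the route through the second site is shorter and wins.
assert best_path([lengthened, via_site_2]) == via_site_2
```

Note that prepending is a blunt tool: the longer path is seen by every network that receives this announcement, not only by the one client we wanted to steer.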
Take the diagram above as an example. Kenny connected to the server in Miami because the route there was shorter, but in fact his connection would have worked much faster had he connected to the server in Amsterdam. Can we do something about it? The answer is "maybe" since it depends on what kinds of customization are provided by the autonomous systems we are "connected" to.
Many autonomous systems allow the use of so-called "BGP communities" for flexible routing setup. Basically, a "BGP community" is a kind of label that is transmitted along with the route information. Based on this label, the router receiving the route can artificially lengthen the route or get rid of the route altogether.
Let's try to use a "BGP community" so that Kenny's traffic will take a faster rather than shorter route.
In this example we are lucky: the ISPs on the path to Kenny allow us to use BGP communities that "nullify" the route towards Kenny's ISP, meaning that Kenny's router won't learn about such a route at all.
As a result, Kenny's ISP has not learned about the routes to Miami and Singapore, so we have achieved our goal: the traffic now goes to Amsterdam.
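To show how such labels work, here is a minimal sketch of a transit provider deciding what to announce to each of its peers. Everything specific in it is an assumption made for illustration: the community values (`64500:901`, `64500:911`), the action tables, and the AS numbers are invented, since every provider defines and documents its own set of communities.

```python
from dataclasses import dataclass

# AS numbers from the range reserved for documentation.
OUR_AS, TRANSIT_AS, KENNY_ISP, OTHER_ISP = 64496, 64500, 64510, 64511

# Hypothetical action tables of the transit provider (made-up values).
DO_NOT_ANNOUNCE_TO = {"64500:901": KENNY_ISP}  # community -> peer to skip
PREPEND_TO = {"64500:911": (KENNY_ISP, 2)}     # community -> (peer, extra hops)


@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: tuple
    communities: frozenset = frozenset()


def export(route, peer_as):
    """Return the route as the transit announces it to `peer_as`, or None if suppressed."""
    extra = 0
    for community in route.communities:
        if DO_NOT_ANNOUNCE_TO.get(community) == peer_as:
            return None  # the peer never learns about this route
        target = PREPEND_TO.get(community)
        if target is not None and target[0] == peer_as:
            extra = max(extra, target[1])
    # The transit adds its own AS number, plus any requested extra copies.
    new_path = (TRANSIT_AS,) * (1 + extra) + route.as_path
    return Route(route.prefix, new_path, route.communities)


hidden = Route("192.0.2.0/24", (OUR_AS,), frozenset({"64500:901"}))
longer = Route("192.0.2.0/24", (OUR_AS,), frozenset({"64500:911"}))

print(export(hidden, KENNY_ISP))          # None
print(export(hidden, OTHER_ISP).as_path)  # (64500, 64496)
print(export(longer, KENNY_ISP).as_path)  # (64500, 64500, 64500, 64496)
```

Unlike plain lengthening of the route at the source, the label affects only the peer it names: the other ISP in this sketch still receives the ordinary, short route.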
Unfortunately, in real life it is not so simple:
- Not all providers allow using BGP communities for flexible configuration.
- Sometimes we have to contact the ISPs directly to find out if such a configuration is available, because this information is either not published or buried in the depths of their websites.
- Finally, BGP communities do not always provide enough flexibility to achieve everything we need.
There is no one-size-fits-all solution, and proper routing is a constant work in progress that we keep improving.