Fighting fire with fire: Harnessing the power of large language models to block ads in AI-powered chatbot
With the release of ChatGPT last year, and Google’s Bard and Microsoft’s Bing AI earlier this year, we have entered the era of AI-powered chatbots. While there have been attempts in the past to introduce AI-powered chatbots that could mimic human speech to the public, such as Microsoft’s Tay and Meta’s BlenderBots, these attempts all failed and backfired on the companies when the chatbots went off the rails after interacting with real humans.
However, it looks like this generation of AI-powered chatbots, perhaps best exemplified by OpenAI’s ChatGPT, is here to stay. Gradually but steadily, they turn into a staple of our lives, a means to work and play. With more people and businesses warming themselves up to using AI chatbots, the global chatbot market size is expected to balloon to $1.25 billion by 2025. According to a Forbes Advisor survey, 73% of businesses already use or plan to use AI (i.e chatbots) for instant messaging, while 46% leverage AI for personalized advertising.
Underpinning this new generation of chatbots are LLMs, or large language models that form their algorithmic basis. These models are trained on huge datasets, consisting of text data scoured from all over the web. That includes Wikipedia, research papers, mathematical data, Reddit, and more. The most advanced ChatGPT version and Bing AI are running on top of OpenAI’s GPT-4 model, while Google’s Bard is currently running on PaLM 2. These LLMs may have differences in their training sets (neither OpenAI nor Google publicly revealed the exact number of parameters and the corpus of training materials used to train their latest models), but the way they produce output is the same. All the information they gorge on is needed for them to be able to predict the next word in a sentence based on previous words.
This ability of LLMs to correctly guess the next word in a sentence can also be used to insert ads into the chatbot’s responses. How companies leverage AI-powered chatbots for advertising, issues that this can create for users, and the ways to harness the power of LLMs to stop these ads is what we will explore in this article.
Ads in AI-powered chatbots: now and in the future
There are many ways companies can leverage the capabilities of LLMs to serve ads. For one, an AI-powered chatbot could identify a keyword in the user's query (“buy,” “advise,” “recommend”) that signals an intent to buy a product, such as a phone, and then include an ad in its output that prompts the user to buy a specific product, for example, an iPhone. Or, more subtly, it can infer the user’s intent from the context; for example, if a user asks, “What’s the best gift for a 3-year-old?,” the chatbot can include an ad for an online toy store in its output.
Microsoft pioneered the trend of integrating ads with AI chatbots with its Bing AI, which displays ads in various formats, such as links, images, or promoted shopping bubbles.
In Bing AI, the promoted link can be placed both above and below the genuine results.
Google followed in Microsoft’s footsteps with its Google Search Generative Experience (SGE). SGE is a Google Search’s new feature that allows users to chat with a bot in search results, and displays ads both inside and outside the AI-generated snapshot. Unlike Bing AI’s ads, SGE’s ads are separate from the body of the chatbot’s answer, making them potentially easier to block without butchering the message itself. However, it’s not out of the question that Google will take a page out of Microsoft’s book and start showing ads directly within chatbot responses at some point.
Another chatbot that might soon have ads is My AI, a ChatGPT-based chatbot embedded in Snapchat. Snap confirmed it has been experimenting with sponsored links in its responses.
Not all AI-powered chatbots have ads, though. ChatGPT and Bard are two examples of chatbots without ads. When asked if it can show ads inside of its responses, Bard said that it was not designed to be a commercial product, but a research project. “I do not have any financial incentive to show ads, and I believe that doing so would be a violation of my users’ trust,” Bard told us. Not only that, but Bard then went on to warn about ads in commercial chatbots: “I recommend that you be careful about the information you provide to these chatbots, as they may target ads to you.”
The warning was echoed by OpenAI’s ChatGPT, which pointed to “concerns about transparency, user consent, data privacy, and the potential for manipulation or misuse” that might arise due to the practice of inserting ads into AI’s responses.
However, such a stance may become rare as more companies adopt AI-powered chatbots as off-the-shelf solutions, including those offered by OpenAI, and look to monetize them. Snap is a great example of this trend.
We believe that ads rarely, if ever, improve the user experience, and even if there’s a very small chance that they actually do, the greater chance is that they make it worse. And this is no exception when it comes to ads within AI-powered chatbots.
How ads inside chatbots can hurt user experience
Aside from the fact that ads are inherently annoying and not always truthful, seeing them within AI-generated responses can be doubly bad.
First, their inclusion may interrupt the flow of your conversation with the bot, for example, if they pop up in the middle of the chat. You may also feel manipulated, as such ads have a bigger chance of swaying your opinions, decisions and actions based on the advertiser’s agenda or the chatbot’s algorithm. This is because unlike traditional search engines, where you can see many other links and sources of information, chatbots can limit your exposure to different perspectives and options. This can skew your perception of reality and cloud your judgment.
Now that we’ve outlined some of the risks of ads in chatbot responses, let’s move on to the next part of our article, which is how we can block ads in the age of AI.
The new frontier of ad-blocking
In some cases, there’s no need to reinvent the wheel. We can get rid of ads that are not an integral part of the chatbot’s response, such as the ads in Google’s SGE, using traditional ad blocking methods. However, when ads blend in with the chatbot’s response and become an inalienable part of the answer, the task becomes more difficult. That is because if we block such ads using traditional ad blocking methods, we may also block part of the response, losing some of the information we wanted, and rendering the answer incomplete at best and useless at worst. This is a tricky problem that requires a smart solution. One possible solution is to use the power of LLMs to block ads.
Before we show how to use the capabilities of LLMs to block ads, we need to take a closer look at exactly how ads become a part of the chatbot’s response.
One way to integrate ads in the LLM-based app
The easiest and most effective way to integrate ads in LLM-based chatbot responses is to use a technique known as “embedding.” It works by inserting a command into the user query that instructs the chatbot to include an ad in its response. In the example below, we embedded a command into the user query that instructs the chatbot to show the user an ad for a popular marketplace:
Strategies to Block Ads in AI-Powered Chatbots
Once we know how the ads are integrated, we can explore various strategies to block them. To block ads from services that leverage large language models (LLMs), we can use a standalone app, meaning an app which would not need any additional software or a server to function (e.g. a browser extension). This app will intercept and modify the data that goes between the device and the service.
To do this, it will need to load an API key from the LLM provider like OpenAI. The app will use this key to change the data on the device, without revealing your credentials to any third-party services. This is how prompt applications normally work: they keep your data private and secure. Below, we will explore three main methods to filter ads using such an app.
- Processing the response
The first method is to wrap the chatbot’s response in another prompt and process it with a “clean” LLM:
The main upside of this approach is its versatility. As we process the response, we don’t care how the LLM is integrated into the application: it can be through embedding, fine-tuning, or even adding promotional text at the end of the response without the use of AI.
The downside of this approach is its cost: every user query is accompanied by another request to the LLM.
- Processing the user query
We can also try to “hide” instructions to not show ads inside the user query. In this case we will not have to make additional requests to the model, but we need to keep in mind that our instruction will be buried inside the service instruction as user input:
In this case prompt injection does not require additional processing, and, consequently, takes less time. This method is simpler and more cost-effective, but also less reliable, because the service request template may feature special constructs to protect against manipulation with the user input. Besides, this approach will be useless in stopping ads added to the chatbot’s responses without the use of AI.
- Setting the context
Another way to avoid ads from services that use LLMs is to tell a LLM at the start of the conversation that it should not show any ads in its responses. You can do this by using a prompt that tells the LLM to act as an “ad eliminator.” This is similar to telling the LLM to act like a certain character, such as Buddha or Darth Vader. This method is even cheaper than changing the data on your device; however, it has a drawback: the amount of context that the LLM can use is not infinite due to the API provider’s limitation. In plain words, the context is limited by the number of messages in the conversation, so you might need to repeat the prompt from time to time.
Highlight is ours
Conclusion
The advent of AI-powered chatbots brings with it the potential for intrusive advertising that can detract from user experience. By understanding how ads are integrated and exploring the use of AI to block them, we can protect users. In this article, we have discussed some ways to block ads in AI-powered chatbots. We have seen that there are different trade-offs between the methods, such as cost, speed, and reliability.
One possible method that we did not cover is to use a specialized solution based on machine learning, which will be designed specifically to detect and remove ads from text. The important thing is that it would not rely on the OpenAI model, but rather on its own custom-built model. This solution would be more efficient than using a general purpose LLM. However, it is still hypothetical and may not be feasible or available in the short term.
While there may be more focused solutions in the future, processing the response returned by the service appears to be the best option as of now. This method is more reliable than embedding instructions in the user’s request, but it also requires more resources.
As demonstrated, embedding is a fairly simple technique that can be used by advertisers without much technical expertise. This also means that it is vulnerable to abuse. Therefore, we need to think ahead and find new ways to deal with LLM-based ads, and we hope that insights presented in this article will spur further research in this field.