Can’t take it back: Samsung ChatGPT fiasco highlights generative AI privacy woes
A word spoken is past recalling. This adage now rings all too true for three Samsung employees, who inadvertently leaked source code, a testing sequence, and internal company discussions to OpenAI, the company behind ChatGPT.
According to the Economist Korea, Samsung was wary about adopting the AI-powered chatbot at first, fearing it could leak internal information, such as trade secrets, to the world (if only they knew). However, as ChatGPT was taking the world by storm, Samsung decided to join the party and gave the green light for its use in the workplace so that employees could stay on top of technological changes. At the time, the company issued a notice to those planning to use ChatGPT to “pay attention to the security of internal information and do not enter private information” when feeding the AI assistant any prompts.
The message apparently fell on deaf ears: within the next 20 days, not one but three company engineers revealed what appeared to be sensitive Samsung corporate data, potentially giving OpenAI and Samsung’s competitors insights into its technology.
In the first case, a Samsung employee found a bug in the source code of the download program for the semiconductor plant measurement database and asked ChatGPT for a solution. In the second case, an employee used ChatGPT to optimize the test sequence for a program that identifies yield and defective chips. In the third case, an employee recorded an internal company meeting on his smartphone, transcribed it with a speech recognition application, and fed the transcript to ChatGPT to generate meeting minutes. All three are now facing disciplinary probes.
You might expect Samsung to ban ChatGPT right away after the leaks, but that’s not what happened. Instead, Samsung sought to educate its employees on the privacy risks of AI, telling them that whatever you tell ChatGPT will end up on OpenAI’s external servers. That means once you spill the beans, there’s no going back — no way to retrieve this data. Samsung has also put a cap on how much data each employee can upload to ChatGPT, and warned that if anyone lets something slip again, it will pull the plug on ChatGPT’s use for good.
Once you say something to ChatGPT, you can’t unsay it
Samsung is not the first, only the latest, victim of its employees’ penchant for discussing sensitive matters with OpenAI’s chatbot. Other firms are learning the ropes on the go as well.
Amazon restricted the use of ChatGPT as early as January, warning employees against inputting any confidential information, such as code, into the chatbot after the company had spotted output that “closely matched” internal Amazon data in ChatGPT’s responses. Another US retail giant, Walmart, initially blocked ChatGPT after detecting activity that “presented risk to the company,” but later opened it for use and issued a set of guidelines, including that employees should “avoid inputting any sensitive, confidential, or proprietary information”.
A slew of financial services companies have banned ChatGPT from the workplace altogether. Bank of America added ChatGPT to its list of unauthorized applications not allowed for business use. Other financial institutions that have blocked access to the chatbot include JPMorgan Chase, Citigroup Inc, Goldman Sachs, Deutsche Bank and Wells Fargo. The latter told Bloomberg that the ban is in line with standard third-party software restrictions, and that it will “continue to evaluate safe and effective ways” to use the tech.
The banking industry has been the most proactive in responding to the risks posed by ChatGPT, and understandably so: banks handle heaps of sensitive customer information and are heavily regulated. But the same risks apply to every industry. Once you feed the chatbot a piece of information, be it a chunk of proprietary code or a transcript of a board meeting, consider it no longer a secret, but part of the public domain.
OpenAI’s terms of service suggest that you shouldn’t expect any privacy when interacting with the chatbot, and put the onus on you not to let the cat out of the bag. For example, OpenAI says that it can view and use your conversations with ChatGPT to train AI, and that it won’t be able to help you remove specific prompts from your chat history.
That means that if you tell ChatGPT sensitive information about you or your company, someone else can potentially prompt ChatGPT for that data, and get ahold of it. In the case of Samsung, its competitors may try to grill the chatbot about the information the employees had leaked.
OpenAI does give you the option to opt out of having your data used for training — for that, you’ll need to fill out a special form. However, if you have already shared data with ChatGPT that you regret sharing, it might be too late for that. You may want to consider deleting your account, as that is the only way to have your chat history deleted. OpenAI says the process could take up to 30 days, and you won’t be able to restore your account after it’s gone.
But what happens after that is a big question, since users are generally unaware of how their information was used to train the algorithm, and have no way to verify that it has been removed. While it is theoretically possible to make the system “unlearn” the data, i.e., forget what it has learned, this process is considered immensely difficult, not least because it requires identifying the impact of specific data points on a trained model. While there has been research into so-called machine “unlearning” amid rising concerns about the security and privacy implications of generative AI, it is an area that needs more work. The prevailing approach seems to be retraining the whole model from scratch, which is unfeasible in practice.
So, if we can’t be sure that the information we feed ChatGPT is gone for good, even after we have exercised all our privacy options down to asking for it to be deleted, then it’s probably better not to trust AI helpers with your or your company’s innermost secrets at all. Who knows where they might turn up. As OpenAI itself says: “Please don’t share sensitive information in your conversations.” Indeed.