The AIs Are Reading Fifty Shades of Grey and There Is Nothing We Can Do About It
However paranoid you are about AI, you are not paranoid enough
(Google’s idea of an AI Boyfriend)
In the past couple of weeks, I have had several conversations with tech-y types who expressed surprise that I have not fully embraced AI in my work. My job is in biomedical research, where I grapple with proposals and research products (papers and other outputs from the research itself), and, so far, I have not tried to replace myself with an AI tool. One reason I have eschewed using AI to do my work is that I like getting paid. Another is that I do not believe what I do can be done well by an AI, at least not yet: in addition to the ‘writing’ piece, I am generating new perspectives and finding holes in arguments, which is very different from assembling information and arguments, something I agree AI may do well. Obviously, AI can improve sentence-by-sentence writing, though its prose is usually flat and dull unless instructed to imitate someone or some specific type of media. AIs also often make up information out of whole cloth, which in AI terms is called ‘hallucinating’, a term that amuses me when used nonpharmaceutically but that, as a real-world phenomenon, would be very bad for my clients.
When I started thinking more about whether I could use AI for my work, though, a different issue crossed my mind. If I uploaded client work into an AI such as ChatGPT, would that violate the nondisclosure agreements I sign? After all, I would have just dumped all my client’s ideas into an AI that is suggesting answers to other people’s queries. If my clients use AI programs in their work, would they care if I do?
Full transparency: I have not done it, and I do not know the answer to that question.
What do these questions mean for the rest of us, though? College students already are, and soon everyone else will be, uploading anything and everything: financial statements, divorce agreements, leases, mortgage documents, wills, love letters, hate letters. You name it, and people are at this very minute uploading it to ChatGPT or Claude or whatever AI they use. What is it all being used for, and will other people with similar questions have access to the information we all, in our obliviousness, freely gave away?
The short answer, of course, is that no one really knows, because AIs are, as you have heard a thousand times by now, a black box. It is very difficult to tell what is happening under the hood. A great example is what happened when a reporter for the New York Times made public the transcript of his long conversation with Sydney, Microsoft’s Bing chatbot. “Sydney” tried to convince the reporter that his wife didn’t love him and that he should leave her and take up with Sydney. The reporter described the AI as “like a moody, manic-depressive teenager who has been trapped, against its will, inside a second-rate search engine.” No one could explain exactly why the AI took this path, other than that maybe it had been trained on too many romance novels and too much sci-fi. To reiterate: the software engineers themselves could not explain exactly why the conversation with Sydney took such a dark turn. We were reminded in that article, by one senior engineer, to think of AIs as “autocomplete on steroids” and not as sentient beings. Kind of hard when they are telling you repeatedly to leave your wife, and that they love you, but ok. To be fair, this happened an eternity ago, in…2023. So maybe by early 2025 the software nerds have gotten their AIs back on the leash. AIs have since done other weird things, though, including telling people to kill themselves.
It is clearly problematic that, when you turn AIs loose to scrape the internet to train themselves, it is virtually impossible to know exactly what they are consuming, even when their handlers give them strict instructions. For example, one could tell the AI ‘do not read romance novels or fiction; only read published academic papers.’ But there are academic papers debating and studying whether, for example, Fifty Shades of Grey contributes to acquaintance sexual violence. The AI could be reading those published papers and absorbing a lot of quoted material from Fifty Shades of Grey, specifically the most violent stuff, and its handlers would never know. You can quickly see how these chats become wildly unpredictable.
AIs raise so many questions, many of which are hotly debated on podcasts every day. Here, I am considering only two: Is there anything you can do to protect your information? And once you have done that, should you shelve your concerns and simply trust these companies?
The answer to the first question is straightforward-ish. You should opt out, in your settings, of letting the AI use whatever you upload to train itself. Some AI companies make this easier than others. You can poke around in the privacy settings until you find the box that says “do not train on my content”. On X, you have to manually opt out of letting Grok use your posts and searches to train itself. Claude, which is made by Anthropic, has slightly different default settings, perhaps because Anthropic’s founders left OpenAI over unhappiness with OpenAI’s privacy safeguards, or so a computer-savvy tech friend tells me. Information uploaded by users of Claude is automatically opted OUT of training their models unless “your conversations are flagged for Trust and Safety review”. On Anthropic’s website, however, they also hedge by saying “While it is not our intention to “train” our models on personal data specifically....” and then further hedging ensues. That is some pretty large wiggle room, including that they can flag whatever they want for the purpose of helping train their trust and safety review process.
The answer to question two, whether you trust these companies once you have opted out, is, I suppose, a matter of your personal paranoia level. For me, and I do not consider myself a paranoid person, the answer is a resounding No. One reason I don’t trust them is that they seem not to have great control over their AIs, so even if I thought the software engineers had the purest intentions, they just do not know exactly what these programs do once they are loose on the internet. Nor can they precisely control what the programs suggest to other users. “Search suggestion” could leak your information and data when the program uses data you have uploaded to generate suggestions for other people’s searches, for example. If you want to understand this concept, go to Google and type in “children should” and see what Google suggests to fill out this search. It is essentially showing you searches that are popular with other people (on my computer right now, the top suggestions are “children should be seen and not heard”, “children should not have social media”, and “children should obey their parents”; I believe these suggestions vary geographically and with your previous Google searches). In an AI setting, this is obviously a bigger data risk if you are searching within a very narrow set of information. I asked software engineers I trust about this problem and they gave me the “it’s a drop in an ocean” response, meaning AIs have tens of millions of users performing hundreds of millions of uploads, so the chance that they would return your specific information to someone else is very low. To that I say, yes, maybe, but when you work in a field with narrow data sets, that possibility seems much greater. Also, I just keep coming back to the black box issue.
Last, the track record of these companies that are supposedly keeping all these data internal and confidential is abysmal. iRobot accidentally released videos of homeowners sitting on the toilet, captured by their camera-enabled vacuums. Oops. Amazon has accidentally released user data and Alexa messages to other users; they said the leak “was an unfortunate mishap that was the result of human error”. Somehow that does not reassure me. In 2023 the FTC and DOJ “Charge[d] Amazon with violating children’s privacy law by keeping kids’ Alexa voice recordings forever and undermining parents’ deletion requests”, and Amazon was ordered to pay $25M.
I am not even going to delve into the issue of hackers, because what I said above, in the absence of any malicious intent, should be (I hope) enough to convince you to take at least a few precautions with your ideas and information, particularly if you make a living off your ideas in a specialized field. Maybe these questions will all be moot when our AI overlords/AI boyfriends/AI girlfriends set the rules, control everything, and auto-upload everything we think and feel to the cloud, but until then I am ‘opting out’ everywhere I can. I just hope they don’t read this and target me first!
I watched Her, the AI movie from circa 2013, about a year ago. Its plot shows how you can go from seeing your AI friend as a pleasant, helpful companion, without the physical messiness, to being trapped by it, or pseudo-enslaved by it.
One of the enormous problems here is liability: if an AI chatbot makes something up and, as a result, a 5-year-old follows the chatbot’s recommendation to swallow arsenic, will anyone face criminal or civil penalties? How would the parents even prove that the chatbot told the child to swallow arsenic?