ChatGPT PSA: There is an invisible data collection default in CustomGPTs
It's not enough to turn training off in the settings
Hi everyone,
Today I found out that OpenAI has a hidden default setting for collecting user data. If you’re using CustomGPTs, it’s not enough to turn off training in the settings.
To be honest, I am furious.
But my main point today is that everyone should know about this: how their data might be exposed, the consequences, and how to turn it off if they build CustomGPTs.
Share with a friend!
The regular training setting
I rarely use ChatGPT (I use other chatbots). In part, that's because I distrust and oppose their data policies. One huge problem is that OpenAI trains on user data by default. Yes, all the conversations, what you type, what it types, sensitive information, all of it.
There is a way out: you can turn training off in the settings.
Of course, OpenAI has given the toggle the misleading name of "Improve the model for everyone" (under Data Controls). This name appeals to our desire to be helpful while hiding important facts:
We could be exposing sensitive information
We could be giving OpenAI the information they need to build agents that replace us
The platform makes us complicit in taking advantage of those who don't know about the toggle or don't understand the implications of data collection
In addition, someone once told me that he occasionally finds the setting back on after turning it off.
The invisible setting
Today I decided to use ChatGPT (regular Pro account) to build a CustomGPT. And what do you know, there is another hidden data collection default. This one is actually invisible!
You see it in the configuration menu when you build a CustomGPT, but only if you have uploaded files!
After you upload a file, an "Additional Settings" section appears at the bottom. You have to click on it to open it and see the one and only additional setting (which definitely didn't need a collapsible dropdown): "Use conversation data in your GPT to improve our models". And yes, it's on by default.
Consequences
What if data collection is on in a CustomGPT but I have turned off "Improve the model for everyone" in my account?
I couldn't find an official policy. But ChatGPT claims that the CustomGPT setting overrides the account-level one, and my data will be stolen regardless:
Is this true?
Who knows, given ChatGPT's hallucination rates, and especially given that the response uses an older name for the toggle, and that when I asked for links to a policy I got non-policy documents from 2023. But I am going to operate under the assumption that CustomGPTs expose my data.
Ugh.
Dessert
An AI-generated take on this post!
Ready for more?
Check out my "who trains on your data" series - chatbots, spell checkers, social media