We Can Remember It for You Wholesale
The article I really don’t like ChatGPT’s new memory dossier by Simon Willison describes a new feature that incorporates memories of context from prior queries to ChatGPT.
đź‘˝ Thanks to PKD for the title.
“I’m an LLM power-user. I’ve spent a couple of years now figuring out the best way to prompt these systems to give them exactly what I want.
“The entire game when it comes to prompting LLMs is to carefully control their context—the inputs (and subsequent outputs) that make it into the current conversation with the model.
“The previous memory feature—where the model would sometimes take notes on things I’d told it—still kept me in control. I could browse those notes at any time to see exactly what was being recorded, and delete the ones that weren’t helpful for my ongoing prompts.
“The new memory feature removes that control completely.
“I try a lot of stupid things with these models. I really don’t want my fondness for dogs wearing pelican costumes to affect my future prompts where I’m trying to get actual work done!”
He describes a quick analysis of how the feature seems to work.
“[…] it looks like this is yet another system prompt hack. ChatGPT effectively maintains a detailed summary of your previous conversations, updating it frequently with new details. The summary then gets injected into the context every time you start a new chat.”
In the example from the article, the image he’d generated included a giant sign that included text from a previous chat. In this case, it was immediately obvious that the LLM was using something other than the image, the prompt, the current conversation context, and the system prompt to generate the image.
But what if it weren’t that obvious? Are people going to notice a subtle detail that reveals something really private or secret? Take a look at the initial image he’d submitted and the final generated image, which purports to be a copy of the original with the details from the prompt added to it. If you compare those two images, you’ll see that, though the main elements look the same, there are enough subtle differences to show that all of the elements have been regenerated, not “copied”.
We’re seduced into thinking that they’ve been copied. They never have been. This regeneration had classically been influenced by the system prompt and conversation context. Now, it’s also being influenced by “memory” of other conversations. It’s going to be impossible to know which past details influenced the generation of that background—or what they might reveal about other conversations. This is just repeating the “Google Search Bubble” but in an even more obscure way.
The second half of the post describes not only how you can disable the feature (for now) but also prompts to (supposedly) cajole the contents of your conversational context out of the LLM. Willison doesn’t seem to consider how much confabulation/hallucination affects the response for that request.
Whether it’s “true” or not, the result is a large amount of detailed information that the chatbot collects and synthesizes. Taken together with most people’s tendency/compulsion to just believe anything that they read, especially if it seems to have been formulated in a science-y or intelligent-sounding way, we can look forward to a future where OpenAI’s business model is selling these profiles to your employer, health-insurance companies, and the tax authorities—and them then acting on these data ruthlessly and unquestioningly.
Initially, I thought Willison might be overreacting but now, after a bit of consideration, I’m more convinced that this feature—although it purports to be helpful—is actually quite hostile to the user’s ability to retain control over the tool—and not vice versa.
It’s time to have a concept like a web browser’s “private tabs” to keep things separate. Of course, this won’t protect most users as it’s easy to forget what’s going on the background with all of these tools. Most of our apps are designed to comfort us into following their pattern, not letting us tell them how we’d like to work.
At the very end, Willison offers hope for an actual user-empowering feature: including conversational context for projects, where you’ve tightly defined which conversations can be used for context where. I’m not sure how useful this would be, though. Some of the main advice for fixing context-poisoning that leads to pathologically unusable answers is to “throw everything away”. If that’s still the go-to answer for “fixing” a broken conversation, it seems very counterproductive and disempowering to have context included that you can’t remove.