Do early steps into agentic AI respect our needs for privacy and security?

AI assistants and agents are landing. The firms building them must move with great care as they innovate in this environment.

Key points
  • The move to AI assistants and agents risks a sea change in privacy and security.
  • We analyse OpenAI's early claims that “you’re always in control”.
  • There is great risk that these firms and the data these tools collect will become targets for data exploitation. 

We’ve been warning for a while now about the risks of AI Assistants. Are these assistants designed for us or to exploit us?

The answer to that question hinges on whether the firms building these tools are considering security and privacy from the outset. The initial launches over the last couple of years were not promising.

Now with OpenAI’s agent launch, users deserve to know whether these firms are considering these risks and designing their services for people in the real world. The OpenAI agent lets you issue requests that browse the web and query your connected services to develop plans for you, generate presentations, and produce files that analyse data.

The problems with AI assistants

The move to AI assistants and agents risks a sea change in privacy and security. These services’ usefulness increases with the quantity and quality of the data they have access to, and the temptation will be to lower the friction of data controls to allow the processing of your data. Consequently, we are worried that

  1. these AI tools will generate new datasets on you that create new risks,
  2. they could access and share your data at unprecedented levels, and
  3. they will store this data beyond your reach, across their services and in the cloud.

So we crafted a range of questions that the AI industry must answer as they continue to deploy these services. We called on the industry to build privacy into the tools from the ground up, make security a core design principle, and give users adequate controls.

OpenAI’s new agent

OpenAI claims “you’re always in control”. The examples in their product launch focused on two use cases: an agent conducting research for you by processing third-party data (in the example, publicly available data), and an agent processing your personal data through connectors.

For each use of these tools it’s essential to ask: What happens to your data, your family’s data, or your employer’s data?

One example used by OpenAI included asking the agent to ‘look at my calendar’ and ‘my email’ using ‘connectors’ to help you ‘because it knows you’… raising the questions: what permissions does it need, how does it manage those permissions, and what other data is it processing?

OpenAI admits early on: “This introduces new risks, particularly because ChatGPT agent can work directly with your data, whether it’s information accessed through connectors or websites that you have logged it into via takeover mode.”

The firm states that it has taken steps against prompt injection (and manipulation). It requires user confirmation before ‘consequential actions’. The agent also runs in a virtual machine on OpenAI’s servers. But ultimately the firm states that the tradeoffs are left to the user when deciding what information to provide to the agent.
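To make that ‘confirmation before consequential actions’ pattern concrete, here is a minimal sketch in Python of how such a gate could work. The action names and the require_confirmation() helper are our own assumptions for illustration; OpenAI has not published its implementation.

```python
# Illustrative sketch only, not OpenAI's code: a gate that pauses an agent
# before "consequential actions" and asks the user to approve them first.

CONSEQUENTIAL_ACTIONS = {"send_email", "make_purchase", "submit_form", "delete_file"}

def require_confirmation(action: str, details: str) -> bool:
    """Ask the user to approve an action before the agent carries it out."""
    answer = input(f"The agent wants to {action} ({details}). Allow? [y/N] ")
    return answer.strip().lower() == "y"

def perform(action: str, details: str) -> None:
    """Run an action, but only after user confirmation if it is consequential."""
    if action in CONSEQUENTIAL_ACTIONS and not require_confirmation(action, details):
        print(f"Blocked: user declined '{action}'.")
        return
    print(f"Executing '{action}': {details}")  # stand-in for the real side effect

# Example: drafting is treated as harmless, but sending needs explicit approval.
perform("draft_email", "subject: Quarterly plan")
perform("send_email", "to: boss@example.com")
```

Even with such a gate, the burden still falls on the user to understand what counts as consequential and what data the action will expose.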

Importantly, they state that you may limit the data that the model has access to: you can delete browsing data and log out of website sessions. They also state that the ChatGPT browser does not collect or store any data you enter, ‘because the model doesn’t need it, and it’s safer if it never sees it’.

ChatGPT’s agent uses ‘connectors’ to interface with your third-party applications, such as cloud data stores, calendars, email accounts, and GitHub. This allows ChatGPT’s agent to search data on those services, conduct deeper analysis, and sync data. This seems analogous to Anthropic’s ‘Model Context Protocol’, which provides context data from applications to LLMs. There are deeper issues with these interfaces that need greater study (coming soon), including on security, limits over data access, and even market competition issues.
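To illustrate the kind of interface at stake, below is a minimal, hypothetical connector sketch in Python. The class, method, and scope names are our own; real connectors and the Model Context Protocol are far more elaborate, but the core questions are the same: what data is exposed, and under which user-granted permissions?

```python
# Illustrative sketch only: a hypothetical "connector" exposing a narrow,
# permissioned slice of a calendar to an agent. These names are ours, not
# OpenAI's or Anthropic's API.

from dataclasses import dataclass, field
from datetime import date

@dataclass
class CalendarConnector:
    """A narrow, permissioned view onto a user's calendar."""
    events: dict[date, list[str]]
    granted_scopes: set[str] = field(default_factory=set)

    def grant(self, scope: str) -> None:
        """Record an explicit user permission, e.g. 'calendar.read'."""
        self.granted_scopes.add(scope)

    def list_events(self, day: date) -> list[str]:
        """Return one day's events, but only if the user granted read access."""
        if "calendar.read" not in self.granted_scopes:
            raise PermissionError("calendar.read was not granted by the user")
        return self.events.get(day, [])

# The agent sees nothing until the user grants a scope, and then only that slice.
connector = CalendarConnector({date(2025, 7, 21): ["09:00 stand-up", "14:00 dentist"]})
connector.grant("calendar.read")
print(connector.list_events(date(2025, 7, 21)))
```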

OpenAI’s explanations around connectors are intriguing:

  • if you have ‘Memory’ enabled, it can combine past shared data with the data in your queries, and may store some of this data in your account’s Memory store
  • the data can be used by ChatGPT when doing queries and searching the web, for instance
  • for some customers (i.e. enterprise, teams, students), this data is not used to train models, but otherwise the ‘Improve the model for everyone’ setting is turned on by default.

So this gives rise to (at least) two issues: when does OpenAI process your data to train their models, and what controls do you have over that processing?

To have some control over model training (if you’re not one of the customers for whom this is off by default), they provide a guide on managing your data, including how to opt out, i.e. stop the model from training on your data. Such a guide should not be necessary, particularly as this processing should be opt-in for all users alike.

Meanwhile, ‘Memory’ allows ChatGPT to remember details between chats, previous conversations, and your preferences and interests. OpenAI says you can see what your Memory holds by ‘just ask[ing]’: “What do you remember about me?” The firm claims that you are able to delete individual memories and suspend memory altogether. Your ability to control this data – to determine when, how, and where it is stored – is a key challenge, and has been a stumbling block for other services (see our writeup of Microsoft’s Recall).
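To show what those controls amount to in practice, here is a minimal, hypothetical memory store in Python with the three controls described above: inspecting what is remembered, deleting an individual memory, and suspending memory. The names are our own illustration, not OpenAI’s implementation.

```python
# Illustrative sketch only: a hypothetical memory store with user controls to
# inspect, delete, and suspend memories. Not OpenAI's code.

class MemoryStore:
    def __init__(self) -> None:
        self.memories: dict[str, str] = {}
        self.enabled = True

    def remember(self, key: str, value: str) -> None:
        if self.enabled:                       # nothing is stored while suspended
            self.memories[key] = value

    def what_do_you_remember(self) -> list[str]:
        """Answer the 'What do you remember about me?' question."""
        return [f"{k}: {v}" for k, v in self.memories.items()]

    def forget(self, key: str) -> None:        # delete a single memory
        self.memories.pop(key, None)

    def suspend(self) -> None:                 # stop recording new memories
        self.enabled = False

store = MemoryStore()
store.remember("interests", "hiking, privacy law")
print(store.what_do_you_remember())
store.forget("interests")                      # individual deletion
store.suspend()                                # no further memories are recorded
```

The harder questions remain the ones the sketch cannot answer: where the real store lives, how long it persists, and who else can reach it.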

Careful next steps

These firms must move with great care as they innovate in this environment. There is a real danger that these firms, and the data these tools collect, will become targets for data exploitation. This may become more challenging over the next period as these AI firms expand their business models and start to borrow from, and deepen, the practices of surveillance capitalism.

We are closely monitoring governments as they too expand their interest in AI tools. We’re very concerned about how governments will seek access to the data held on us all by these AI firms, particularly as people and organisations increase their interactions with AI tools – creating tempting datastores for studying people and our habits, interests, and activities. We’re also researching how governments use AI tools for the automation and processing of the data they hold on us all.

This quickly developing space is as intriguing as it is alarming. To learn more about our work in this space, please sign up to our mailing list, as we launch more analyses and reports in the months and years to come.