4 Steps to Ensure Data Confidentiality when using AI for Patent Application Drafting

Protect confidential patent data when using AI. Follow 4 essential steps to ensure security, compliance, and safe data use.


(From left to right) Paul Morico (Partner, Baker Botts), Hon. Nancy F. Atlas (ret.), Atlas ADR, Dr. Chris Parsonson (CEO, Solve Intelligence), and Tammy Rhodes (Of Counsel, Wenderoth, Lind & Ponack) on the AI ethics panel at the annual ABA-IPL conference.

AI is reshaping the patent landscape. There are several patent application drafting tools available, the best of which are leading to 50%+ efficiency and quality improvements.

However, using the wrong AI product can bring significant risks to the security and confidentiality of your data.

In a recent panel discussion at the annual ABA-IPL conference, Hon. Nancy F. Atlas (ret.), Atlas ADR, Paul Morico (Partner, Baker Botts), Tammy Rhodes (Of Counsel, Wenderoth, Lind & Ponack), and Dr. Chris Parsonson (CEO, Solve Intelligence) discussed a 4-step framework for patent attorneys and IT departments to use when assessing whether an AI product is suitable for processing confidential patent data.

What is an AI Model?

A large language model is a type of neural network architecture.

You can think of a neural network as a large collection of numbers, called model parameters, that performs a computation on an input to produce an output.
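To make this concrete, here is a minimal sketch of that idea: a handful of parameters (just numbers) that turn an input into an output through arithmetic. The parameter values are illustrative; a real large language model works the same way, only with billions of parameters.

```python
# A toy "neural network": a few parameters (numbers) that transform
# an input into an output. The values here are arbitrary examples.
weights = [0.5, -1.2]  # model parameters
bias = 0.3             # another model parameter

def forward(inputs):
    """One linear step followed by a ReLU activation."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return max(0.0, total)  # ReLU: clamp negative results to zero

print(forward([2.0, 1.0]))  # the output is just arithmetic on the parameters
```

Everything a model "knows" is encoded in those parameter values, which is why the question of whether your data updates them (training) or not (inference) matters so much for confidentiality.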

What is the Difference Between AI Training and Inference?

There are two key states of an AI model: Training mode and Inference mode.

Training: The model parameters are being updated such that the model takes your input and produces your desired output.

Example: Take the 1,000 most recently sold houses in Washington DC and record their square footage, number of bedrooms, and selling price → train an AI model that takes a house’s square footage and number of bedrooms and correctly predicts its selling price.

Inference: The model parameters are ‘frozen’ after training and are now used to make predictions on new, previously unseen data without updating the model parameters.

Example: Use the model trained above to predict the house price of a previously unseen property based on its square footage and number of bedrooms.
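The house-price example above can be sketched in a few lines of code. Here, "training" derives the model parameters from the sale data (with only three houses and three parameters, the fit can be solved exactly; real AI training adjusts billions of parameters iteratively), and "inference" uses those frozen parameters to predict the price of an unseen house. The data and parameter values are illustrative.

```python
# Training vs. inference for the house-price example.
TRAINING_DATA = [  # (square_feet, bedrooms, sale_price)
    (1000, 2, 120_000),
    (1500, 3, 180_000),
    (2000, 3, 230_000),
]

def train(data):
    """'Training mode': update/derive the parameters (w_sqft, w_beds, bias)
    so that w_sqft*sqft + w_beds*beds + bias matches the recorded prices.
    Here we solve the tiny linear system exactly via Gaussian elimination."""
    rows = [[float(sqft), float(beds), 1.0, float(price)]
            for sqft, beds, price in data]
    for i in range(3):
        # Partial pivoting: move the row with the largest entry in column i up.
        p = max(range(i, 3), key=lambda r: abs(rows[r][i]))
        rows[i], rows[p] = rows[p], rows[i]
        pivot = rows[i][i]
        rows[i] = [v / pivot for v in rows[i]]
        for j in range(3):
            if j != i:
                factor = rows[j][i]
                rows[j] = [vj - factor * vi for vj, vi in zip(rows[j], rows[i])]
    return [rows[i][3] for i in range(3)]  # [w_sqft, w_beds, bias]

def predict(params, sqft, beds):
    """'Inference mode': the parameters are frozen; the new input is used for
    a prediction but nothing about it is learned or retained by the model."""
    w_sqft, w_beds, bias = params
    return w_sqft * sqft + w_beds * beds + bias

params = train(TRAINING_DATA)    # training: parameters change
print(predict(params, 1800, 3))  # inference: parameters do not change
```

Note the asymmetry: the training data shapes the parameters permanently, while the inference input passes through without leaving a trace in the model.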

What is the Difference Between Open and Closed Source AI?

Open source software is “code that is designed to be publicly accessible - anyone can see, modify, and distribute the code as they see fit”.

In the context of AI, open source models are models where the model parameters are publicly available. Anyone can download the models and run them on their own machines.

Closed source models, on the other hand, are models where the model parameters are only known to the creator of the model. No one else can view or download the model parameters.

What is the Difference Between Self-Hosted and Third-Party-Hosted AI?

Self-hosted models are models whose parameters you have downloaded and are running on your own machines. The model can be a closed source model (e.g. your own proprietary model) or an open source model you downloaded.

A third-party-hosted model is a model whose parameters sit on the machines of a third-party provider. You can only send input to the third party, have the third party run inference to produce an output, and then receive the output back. The model can be either a closed source model created by the third party, or an open source model that the third party downloaded and now serves for you.
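The practical difference is where your confidential input physically goes. A rough sketch, with a toy computation standing in for the model and a hypothetical provider URL and payload shape:

```python
import json

def self_hosted_inference(model_parameters, user_input):
    """Self-hosted: the parameters live on your machine, so the confidential
    input is processed locally and never leaves your control."""
    return sum(w * x for w, x in zip(model_parameters, user_input))

def build_third_party_request(user_input):
    """Third-party-hosted: you never see the parameters. Your input must be
    serialized and transmitted to the provider, who runs inference and sends
    the output back. The URL and payload shape below are hypothetical."""
    return {
        "url": "https://api.example-ai-provider.com/v1/inference",
        "body": json.dumps({"input": user_input}),
        # The moment this request is sent, your data is on someone else's
        # machines -- which is why the 4 questions in this article matter.
    }
```

Third-party hosting is not inherently unsafe, but it shifts the confidentiality question from your own infrastructure to the provider's contractual and technical guarantees.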

How Can I Ensure Security and Confidentiality When Using AI?

Many attorneys (incorrectly) only focus on which AI model is being used and if it’s closed or open source to determine if AI-enabled software is suitable for handling confidential data.

However, which AI model is being used and whether it is closed or open source does not, on its own, address whether or not your data will be kept secure and confidential.

Below is a framework with 4 key questions to ask every AI product provider you talk to when assessing whether you should allow for the AI to process confidential data.

1. Is my Data Being Used for Training an AI Model?

If a model is in ‘training mode’ and you upload confidential data to that model, the model’s parameters will be updated based on your inputs. This means that anyone else accessing the same model may be able to ‘reverse engineer’ your data. 

Example: You have a description of an invention for a time machine. You upload it to a model which is in training mode and which someone else also has access to. Because that model’s parameters are being updated with respect to your inputs, the other person will be able to ask the model how to build a time machine, and the model will answer. You have now disclosed your time machine invention!

Data put into an AI model which is in training mode and which other people have access to will not be kept confidential.

If a model is in ‘inference mode’ and you upload confidential data to that model, the model’s parameters will not be updated based on your inputs. This means that anyone else accessing the same model will not be able to ‘reverse engineer’ your data. 

Example: You have a description of an invention for a time machine. You upload it to a model which is in inference mode and which someone else also has access to. Because the model’s parameters are not being updated with respect to your inputs, the AI model retains no knowledge of your data. You have not disclosed your time machine invention!

Data put into an AI model which is in inference mode and which other people have access to may be kept confidential.

2. Is My Data Being Monitored by Any Third Party?

If a third party (i.e. not you) is hosting the AI model for you (be it a closed or open source model), you need to check if that third party is viewing or monitoring your data in any way.

Example: Even if you have an enterprise ChatGPT or Microsoft Azure subscription and opt out of having your data used for training, both providers store your data for up to 30 days by default for abuse and misuse monitoring purposes. Humans at the third party are looking at your data to check that you’re not, for example, trying to manufacture a bomb.

Data put into an AI product whose provider guarantees that there is no third-party monitoring of any sort may be kept confidential.

3. Where is My Data Being Transported and Stored?

Under the hood, one software product is usually powered by many other software products (‘subprocessors’). If any of these subprocessors store copies of your data, you might find yourself in a situation where you’re not sure where your data is, which makes managing and controlling where your data goes difficult.

Example: You are in the US and sign up for a contract reviewing software product. You upload a confidential contract. Under the hood, that piece of software passes your contract to a subprocessor in Europe to perform OCR and extract key info. You later delete your contract from the software product you signed up for. However, because the software provider does not have a zero data retention agreement with its subprocessor, your confidential data is still sitting somewhere in Europe!

Data put into an AI product whose provider guarantees zero data retention may be kept confidential.

Even with zero data retention agreements in place, you should still know exactly where your data travels between the moment you input it and the moment you receive an output from the software.

This is because, for the most sensitive information (for example, an inventor at a defense company asking their legal counsel to draft a patent application), companies and governments may have rules about which geography the data must stay in (e.g. that it must never leave the US or must never leave Europe).

4. What is My Data Retention Policy?

The final question to ask is how your data is retained by the AI product you are using.

If you delete the data, is it permanently wiped off the face of the earth? Or is there a back-up of it somewhere? If there is a back-up, how long does it exist, and where?

Moreover, if you are rolling out an AI product across your organization, you should make sure that you can control exactly what the data retention policy of your organization is for all of your users.

For example, many of the large law firms and enterprises using AI will configure the product to warn users if they have uploaded confidential data to the product and left it there without accessing it for 30 days. After 45 days, the organization admin will be notified, and after 60 days, the data will be automatically deleted. Administrators are in complete control of their organization’s data and can permanently delete it, without backups, at any moment.
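A retention schedule like the one just described can be expressed as a simple policy check. The thresholds (30/45/60 days) mirror the example above; the function and action names are illustrative, not those of any particular product.

```python
# Illustrative sketch of the retention schedule described above:
# warn the user at 30 days idle, notify the org admin at 45, delete at 60.
RETENTION_POLICY = [
    (60, "delete"),        # permanently delete the data
    (45, "notify_admin"),  # escalate to the organization admin
    (30, "warn_user"),     # remind the uploader the data is still there
]

def retention_action(days_idle: int) -> str:
    """Return the action to take for data left untouched for `days_idle` days."""
    for threshold, action in RETENTION_POLICY:
        if days_idle >= threshold:
            return action
    return "none"

print(retention_action(35))  # data idle for 35 days triggers a user warning
```

The point is not the specific thresholds but that the policy is explicit, organization-wide, and under the administrator's control rather than the vendor's.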

Conclusion: AI can Significantly Increase the Efficiency and Quality of Your Work - But Ask these 4 Questions to Ensure Your Data is Kept Secure and Confidential

You can use AI to handle confidential data.

The AI models can be either closed or open source, and they can be hosted by a third party while still ensuring the security and confidentiality of your data. You can also customize the AI to your unique style, and you can get significant efficiency and quality gains.

However, the 4 key questions to ask to ensure the security and confidentiality of your data are:

  1. Is my data being used for training an AI Model?
  2. Is my data being monitored by any third party?
  3. Where is my data being transported and stored?
  4. What is my data retention policy?

AI for patents.

Be 50%+ more productive. Join thousands of legal professionals around the world using Solve’s Patent Copilot™ for drafting, prosecution, invention harvesting, and more.
