Is ChatGPT a medical device?
The term generative AI refers to systems capable of producing text, images, or other media in response to user-entered prompts. In recent months, ChatGPT (a generative AI tool that uses deep learning to enable human-like conversation on almost any topic) has become hugely popular. This has prompted discussion about the potential applications of generative AI software across different industries, which has naturally spilled over into the field of healthcare.
It is not difficult to see how such tools may be used in the medical context. When given the prompt “I am experiencing joint pain, what should I do?”, ChatGPT returned a five-step answer which included the following advice:
Take over-the-counter pain medications: Non-steroidal anti-inflammatory drugs (NSAIDs) such as ibuprofen or aspirin can help reduce pain and inflammation in the joints. However, you should consult with a doctor before taking any medication, especially if you have any underlying medical conditions or are currently taking any prescription drugs.
Users can provide further context (e.g. severity of pain, possible causes, and details of other medication being taken) to narrow the recommendations provided by ChatGPT. With sufficient information and prompting, the software could potentially provide tailored treatment advice for many medical conditions. This raises the question of whether ChatGPT and other similar generative AI tools could be regarded as medical devices.
What makes a product a medical device?
In most countries, the general approach to deciding what qualifies as a medical device is to look at the product’s intended purpose. Only products which are intended to be used to diagnose, treat, cure or prevent a medical condition are regarded as medical devices.[1]
The fact that a product has characteristics which may lend it to the treatment of a medical condition is generally irrelevant unless the manufacturer indicates (through labelling, instructions for use or any other materials) that the product is intended to be used in this manner. So, for example, a smartwatch which tracks the wearer’s heart rate will not be a medical device unless specific claims are made that it can help monitor or detect potential heart conditions (e.g. atrial fibrillation).
On this basis, ChatGPT is not currently regarded as a medical device, since its manufacturer (OpenAI) has made no claims that the software can be used for a medical purpose. However, if an identical tool were developed and marketed for use in a medical context, its manufacturer would likely find that the product qualifies as a medical device on the basis of its intended purpose.
In addition, while ChatGPT is not itself a medical device, it may still be indirectly caught by medical device regulation as SOUP (Software of Unknown Provenance) if it is used in the development of a medical device. SOUP is defined as software that is used in a medical device but has not been developed for that purpose. For instance, the developer of an app for diagnosing skin conditions might use ChatGPT to make the app’s user interface more friendly and empathetic. In that case, because the app has an intended purpose that is covered by the medical device regulations, ChatGPT would also be subject to indirect regulation as an incorporated component of the device.
SOUP concerns
However, the use of ChatGPT as SOUP in medical devices may complicate the regulatory approval process, potentially to the point of preventing such products from reaching the market. Under IEC 62304 (the medical device software lifecycle standard),[2] manufacturers must specify the requirements for any SOUP components they use and test their performance. Manufacturers must also include a specific plan for managing the risks relating to the SOUP in their technical documentation. Since most generative AI tools (including ChatGPT) are only available via API, documentation on how the underlying models are built, trained and maintained is not currently publicly available. Without access to this information, it may be difficult for manufacturers to develop their products in accordance with the SOUP requirements of IEC 62304.
SOUP also creates issues from a risk management perspective, as manufacturers cannot control when new updates are implemented or when the software is withdrawn from the market entirely. While manufacturers can typically "freeze" SOUP versions to manage such updates at their end, this is more complicated for tools like ChatGPT, which are designed to learn continuously from user-inputted questions and responses. In addition, generative AI is well known to "hallucinate", which poses particular risks when used in a medical context. Manufacturers will therefore need to implement rigorous monitoring mechanisms to ensure that their devices continue to function as intended. If, over time, new risks are introduced or existing risks are modified, manufacturers will be required to obtain additional regulatory approval for the change.
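By way of illustration, the sketch below shows one way a manufacturer might attempt to "freeze" a hosted generative AI component: pinning a dated model snapshot rather than a floating alias, so that the SOUP version referenced in the technical documentation stays fixed. This is a minimal, hypothetical example assuming the OpenAI Python SDK; the model name shown is purely illustrative, and pinning a snapshot does not stop the provider from retiring or altering it, which is precisely the control gap described above.

```python
# Hypothetical sketch: pinning a dated model snapshot ("version freezing")
# rather than a floating alias such as "gpt-4", so that the SOUP component
# recorded in the device's technical file stays fixed between releases.
# Assumes the OpenAI Python SDK; the model name is illustrative only.
from openai import OpenAI

PINNED_MODEL = "gpt-4-0613"  # dated snapshot recorded in the risk management file

client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable

response = client.chat.completions.create(
    model=PINNED_MODEL,  # never the floating "gpt-4" alias, which the provider may update silently
    temperature=0,       # reduce run-to-run variability for verification testing
    messages=[
        {"role": "system", "content": "Rephrase the following text in a friendly, empathetic tone."},
        {"role": "user", "content": "Your scan results are ready for review."},
    ],
)

print(response.choices[0].message.content)
```

Even with this approach, the manufacturer remains dependent on the provider keeping the pinned snapshot available and unchanged, so the monitoring and change-control obligations described above still apply.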
Ultimately, it is possible for products incorporating generative AI such as ChatGPT to be placed on the market as medical devices. However, additional controls will be needed to address the risks associated with SOUP-based products. In a recent blog post, the MHRA (the UK’s regulatory authority for medical devices) stated that it would be difficult for generative AI-based devices to comply with the medical device regulations and reiterated the importance of any medical devices having documented evidence of their safety and effectiveness. However, the MHRA also stated that it remains open-minded to the possibility of regulating generative AI tools as medical devices in the future.
[1] Notably, this is the approach taken in the EU, UK, US and Australia.
[2] IEC 62304 is a recognised standard in many jurisdictions, including the EU, UK and US.