One thing is certain about AI: we know very little about it. ChatGPT and its siblings have excited our imagination because, well, because they “chat”. They converse with us, respond to questions and give information in a human-like manner, and therefore look “intelligent”. But AI is much more than that. One could even say it is anything but that. And if there is one field where AI is at home, it is data management.
Generative AI algorithms are not only capable of sorting through massive amounts of seemingly scrambled data, which computers could do long before the emergence of AI; they can also find patterns, establish hidden correlations, create their own ways of looking at data and ultimately come up with conclusions, thereby creating new value. They can also learn from experience, just as humans do, and that is what makes them “intelligent”, so to speak.
In drug development, we know data. We generate data, collect data, verify and clean up data. From the quest for new molecules with healing properties to the long-term follow-up of drugs on the market, large amounts of data are exchanged, processed, analyzed, presented and commented on. Could AI help us do a better job? Could it help us work faster and discover hidden patterns inside our data banks?
Of course, other sectors such as particle physics have already embarked on massive use of AI to drill through their mountain-high piles of data, looking for the fundamental structure of our world. But despite the impressive amount of data, they face few constraints in the way they manipulate numbers.
It is a very different story when it comes to dealing with data about people. Data about disease. Data that will one day become a pill that you and I will have to swallow.
In this article, we will discuss how AI is used today and how it might be used in the near future to boost data management capacity, and we will look at the advantages and the risks.
According to a recent article in the MIT Sloan Management Review¹, there are five common data management areas where AI is playing important roles:
• Classification, which encompasses obtaining, extracting, and structuring data from documents, photos, handwriting, and other media.
• Cataloging: helping to locate data.
• Quality: reducing errors in the data.
• Security: keeping data safe from bad actors and making sure it’s used in accordance with relevant laws, policies, and customs.
• Data integration: helping to build “master lists” of data, including by merging lists.
All these areas apply to pharmaceutical research, but one should keep in mind that the use of AI varies for each one of them.
Classification, in the wider sense, is mostly done by the scientists and clinical teams, although extracting information from written documents by natural language processing (NLP), or from images, may be an area of great development for AI.
A very promising, although less developed, area where AI could make a big difference is cataloging. This is described as “helping to locate data”. What it could do in the near future is sweep through historical data and identify signals and patterns that will lead to new discoveries. It may sound overly optimistic, but the increasing amount of available data may contain hidden indications that can only be revealed through the iterative and self-adjusting processes of generative AI.
Given the massive amount of data generated in clinical trials, AI algorithms also have the ability to identify new opportunities for a molecule that is being studied, for example by analyzing adverse events in combination with other study parameters.
Quality in clinical trials is paramount. Errors are tracked and corrected in many ways, including on-site source data verification (SDV) and automatic and manual edit checks. However, the cost of such operations can quickly rise. Analytical methodologies such as risk-based monitoring (RBM) have emerged to identify the areas where errors and deviations are most likely to occur. AI can greatly improve the efficiency of these approaches.
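To make the idea concrete, here is a minimal rule-based sketch in Python of an automatic edit check and a simple risk-based ranking of sites. It is not an AI model, only the kind of baseline that a learning algorithm could refine. The field names (`heart_rate`, `visit_date`, `consent_date`), the plausibility range and the threshold of 1.5 standard deviations are assumptions chosen for illustration, not values taken from any guideline.

```python
from datetime import date
from statistics import mean, pstdev


def edit_check(record):
    """Return the rule violations found in one visit record (hypothetical fields)."""
    issues = []
    # Illustrative plausibility range, not a clinical reference value.
    if not 30 <= record["heart_rate"] <= 220:
        issues.append("heart_rate out of plausible range")
    if record["visit_date"] < record["consent_date"]:
        issues.append("visit precedes informed consent")
    return issues


def flag_sites(error_rates, threshold=1.5):
    """Flag sites whose error rate is more than `threshold` standard
    deviations above the mean error rate of all sites."""
    mu = mean(error_rates.values())
    sigma = pstdev(error_rates.values())
    if sigma == 0:
        return []  # all sites behave identically, nothing stands out
    return sorted(
        site for site, rate in error_rates.items()
        if (rate - mu) / sigma > threshold
    )


if __name__ == "__main__":
    record = {
        "heart_rate": 250,
        "visit_date": date(2023, 1, 1),
        "consent_date": date(2023, 2, 1),
    }
    print(edit_check(record))
    print(flag_sites({"A": 0.02, "B": 0.03, "C": 0.02, "D": 0.15, "E": 0.03}))
```

In this toy setting, monitoring visits would be directed first to the flagged sites. An AI-based approach would replace the fixed rules and threshold with patterns learned from past studies.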
Security and confidentiality have become major issues in recent years, and their management has grown more complex with ever-changing regulations, both global and local. Data collected from participants in clinical trials, either by the clinical investigators or directly from the patients (patient-reported outcomes, PRO), must be handled in such a way as to preserve confidentiality, and used only for the purposes described in the informed consent form (ICF) acknowledged by the participants. AI can help identify potential breaches of these rules and alert the sponsors.
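As a simple illustration of what such an alert could rest on, the sketch below compares each recorded use of data with the purposes a participant consented to. It is a plain rule, not an AI algorithm, and everything in it is hypothetical: the structure of the access log, the participant identifiers and the purpose labels are assumptions made for the example.

```python
def find_consent_breaches(consents, access_log):
    """Return the log entries whose purpose is not covered by the
    participant's consent.

    consents:   dict mapping a participant id to the set of consented purposes
    access_log: list of dicts with 'participant_id' and 'purpose' keys
    """
    breaches = []
    for entry in access_log:
        # A participant with no recorded consent has no allowed purposes.
        allowed = consents.get(entry["participant_id"], set())
        if entry["purpose"] not in allowed:
            breaches.append(entry)
    return breaches


if __name__ == "__main__":
    consents = {
        "P001": {"primary_analysis", "safety_reporting"},
        "P002": {"primary_analysis"},
    }
    access_log = [
        {"participant_id": "P001", "purpose": "safety_reporting"},
        {"participant_id": "P002", "purpose": "marketing"},
        {"participant_id": "P003", "purpose": "primary_analysis"},
    ]
    for breach in find_consent_breaches(consents, access_log):
        print("Potential breach:", breach)
```

Where AI could add value is upstream of this rule, for instance in inferring the actual purpose of a data request from free-text descriptions.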
Data integration has been a topic of IT work for a long time. However, the multiplication of sources and the increase in data volumes call for new ways to integrate information and, most importantly, to extract meaning and value from it. AI algorithms can improve integration by identifying new ways to combine seemingly unrelated data.
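The “master list” idea can be sketched in a few lines of Python. The example merges site records from two hypothetical sources by normalizing the site name. This is deliberately naive, exact matching on a cleaned-up key, whereas an AI-based approach would learn which records refer to the same entity even when the names differ more widely. The record fields and source names are assumptions for illustration.

```python
import re


def normalize(name):
    """Lowercase, drop punctuation and collapse whitespace so that
    spelling variants of the same name share one key."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower())
    return " ".join(cleaned.split())


def build_master_list(*sources):
    """Merge lists of records (dicts with a 'name' field) into one
    record per normalized name. The first value seen for a field wins."""
    master = {}
    for source in sources:
        for record in source:
            merged = master.setdefault(normalize(record["name"]), {})
            for field, value in record.items():
                merged.setdefault(field, value)
    return list(master.values())


if __name__ == "__main__":
    # Hypothetical extracts from two systems describing the same site.
    ctms_sites = [{"name": "St. Mary's Hospital", "country": "FR"}]
    edc_sites = [{"name": "st marys  hospital", "site_id": "101"}]
    print(build_master_list(ctms_sites, edc_sites))
```

The two spellings collapse into a single record carrying both the country and the site identifier.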
In short, as we all know very well, exciting times lie ahead of us. Stay tuned as AI unfolds the future.
_________________________________________
¹ How AI Is Improving Data Management. Artificial intelligence is quietly improving the management of data, including its quality, accessibility, and security. Thomas H. Davenport and Thomas C. Redman, MIT Sloan Management Review, December 20, 2022 (https://sloanreview.mit.edu/article/how-ai-is-improving-data-management/)