
Can AI Predict the Future?

Yale Medicine Magazine, Spring 2025 (issue 174): AI for Humanity in Medicine
By Rachel Tompa, PhD


Yale scientists are using sophisticated AI models to forecast disease risk, improve diagnoses, and develop better treatments.

Imagine visiting a doctor's office for new symptoms and getting tested with thousands of different drugs in the blink of an eye—only it’s not your physical body receiving the medications, it’s a faithful replica of you stored inside the doctor’s computer.

This concept is known as a digital twin, and it could one day help physicians make diagnoses and predict the best treatment for each patient. While these virtual models of ourselves are still far from clinical use, such researchers as Jun Deng, PhD, professor of therapeutic radiology and biomedical informatics and data science at Yale School of Medicine (YSM), see a future in which our computational doppelgangers could help physicians make personalized medicine a reality.

“The beauty and the power of the digital twin is that it’s individualized,” says Deng, who is working on digital twins for cancer patients. “Each person has their own digital twin, and their data becomes the basis of that twin.”

If the idea of a collection of ones and zeros (that is, binary code) making diagnoses or prescribing treatments gives you pause, not to worry—computers aren’t going to replace human doctors anytime soon. In medicine, AI is a tool, not a stand-in. Human doctors are still very much needed to catch errors and make final decisions.

“Ultimately, all these tools are evaluated as human versus human plus AI,” says Xenophon Papademetris, PhD, professor of biomedical informatics and data science, and radiology and biomedical imaging at YSM. “It’s not AI versus human. We haven’t reached that point. That might happen, but not anytime soon.”

But we also don’t need to wait for scientists to program full human replicas to harness the power of AI in predictive medicine: Scientists at YSM are building powerful computational models to forecast our disease risk, speed accurate diagnoses, and build more precise treatments. AI is helping researchers delve deeper into many aspects of human biology and disease, from predicting the detailed molecular structure of an immune protein when it encounters a diseased cell to modeling the human brain and how it changes in psychiatric disorders.

AI before it was cool

Although AI is the au courant buzzword, its applications in predictive medicine far predate the 2022 launch of the chatbot ChatGPT. AI has been used in medical imaging for decades, aiding diagnostics and flagging potentially abnormal results. The first FDA-approved AI-based tool, authorized in 1998, was a computational method to assist radiologists in detecting breast cancer on mammography images. Of the more than 1,000 AI-based technologies the FDA has since approved, around 75% are related to imaging. But other areas of medical AI are catching up quickly.

In research too, AI has been helping scientists make important discoveries about human biology for many years. Generative AI, the type of AI that underlies such popular tools as ChatGPT, has its place in research, but there are many other computational research methods that have been developed outside the limelight.

“Scientists have been doing AI since before it was cool to do AI,” explains Mark Gerstein, PhD, Albert L. Williams Professor of Biomedical Informatics and professor of molecular biophysics and biochemistry, of computer science, and of statistics and data science at YSM. “In the past few years, there’s been somewhat of a revolution, but for us, it’s more of an evolution in terms of what we’re doing with AI.”

Gerstein’s expertise is in using machine learning, a subset of AI, to mine large amounts of molecular data for biological meaning. Recently, he led an analysis of nearly 400 human brains that offered new insights into the genetic variants that underlie such neurological and psychiatric diseases as schizophrenia, bipolar disorder, and Alzheimer’s disease. A better understanding of the genetic bases of these diseases is critical, because while most brain diseases and disorders are highly heritable, the specific biological factors that cause these diseases remain a mystery.

“Genetics for mental disease is extremely predictive, but people have no clue how these diseases work,” Gerstein says. “That’s a big problem when it comes to developing drugs.”

In that study, Gerstein and other researchers from the large nationwide collaboration known as PsychENCODE built a computational model that encapsulates key aspects of the human brain, using molecular information from millions of brain cells from 388 different people who’d donated their brains to science for use after death. Some of the donors had been healthy; some had had brain diseases and disorders.

The team’s model was built to replicate how cells in the brain connect to each other and the complicated networks of proteins and gene expression in individual cells. The brain model can predict whether a certain genetic variant leads to a psychiatric disease, such as schizophrenia, and suggest the kinds of brain cells in which that variant plays the biggest role. The computational approach also shows how dialing up or down individual genes can tune the brain model toward a healthy or diseased state, offering insight as to how new drugs might tackle these disorders.
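To make the idea of “dialing” a gene up or down concrete, here is a minimal sketch, not the PsychENCODE model itself: a toy classifier trained on synthetic gene-expression data, whose predicted disease probability shifts when one gene’s expression in an input profile is perturbed. Every number and variable here is invented for illustration.

# Toy illustration (not the PsychENCODE brain model): train a simple classifier
# that maps gene-expression profiles to a disease probability, then "dial" one
# gene up in a single profile and watch the prediction shift. Data are synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_samples, n_genes = 500, 20
X = rng.normal(size=(n_samples, n_genes))          # synthetic expression profiles
true_weights = np.zeros(n_genes)
true_weights[3] = 2.0                              # pretend gene 3 drives disease risk
y = (X @ true_weights + rng.normal(size=n_samples) > 0).astype(int)

model = LogisticRegression().fit(X, y)

profile = X[0:1].copy()                            # one baseline expression profile
baseline = model.predict_proba(profile)[0, 1]

perturbed = profile.copy()
perturbed[0, 3] += 2.0                             # dial gene 3 up
shifted = model.predict_proba(perturbed)[0, 1]
print(f"disease probability: {baseline:.2f} -> {shifted:.2f}")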

Gerstein’s work is also helping to make sense of cancer’s complexity. He’s led studies seeking to understand the full repertoire of mutations that drive cancer’s growth. Although tumors often contain thousands of genetic mutations, the widely held view in the field is that each cancer’s formation is driven by only one or a few of those mutations, which fall in so-called driver genes. Gerstein and his colleagues used computational modeling to show that all the other cancer mutations, those in the passenger genes, also impact tumor development and growth, so they shouldn’t be ignored when it comes to drug development.

From health records to digital twins

Deng is also using AI to make headway in preventing cancer. He’s developed models that predict cancer risk earlier and more accurately, potentially allowing physicians to intervene before the disease takes hold. “Instead of waiting for the patient to come to us with symptoms, we want to move the battlefield earlier—detect and diagnose the cancer earlier,” Deng says.

Unlike Gerstein’s models, which rely on molecular analyses of individual cells, Deng’s prediction models use patients’ electronic health records. He and his colleagues built a model they dubbed a “statistical biopsy,” which predicts the risk of 15 different cancer types for men and 16 for women. The model uses such data as age, body mass index (BMI), family history, blood test results, and patient-reported symptoms. Deng and his colleagues trained the model on a public dataset from cancer patients in the United States and then tested the model’s accuracy on a different database from the United Kingdom. In that test, the statistical biopsy was able to predict cancer likelihood with 80 to 90% accuracy for 10 different cancer types, including female breast cancer, colorectal cancer, lung cancer, and melanoma.
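As a rough illustration of the kind of tabular risk model the statistical biopsy represents, the sketch below trains a standard scikit-learn classifier on synthetic records built from the features the article mentions (age, BMI, family history, a blood test value). It is not Deng’s model, and the held-out score it prints reflects only the made-up data, not the 80 to 90% accuracy reported for the real tool.

# Illustrative sketch only, not the "statistical biopsy": a tabular cancer-risk
# classifier trained on synthetic patient records.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 2000
age = rng.integers(30, 85, n)
bmi = rng.normal(27, 5, n)
family_history = rng.integers(0, 2, n)
blood_marker = rng.normal(1.0, 0.3, n)
X = np.column_stack([age, bmi, family_history, blood_marker])

# Synthetic "cancer" label loosely tied to the risk factors above
risk = 0.04 * (age - 30) + 0.5 * family_history + 1.5 * (blood_marker - 1.0)
y = (risk + rng.normal(0, 1, n) > 2.0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)
model = GradientBoostingClassifier().fit(X_train, y_train)
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print("held-out AUC on synthetic data:", round(auc, 2))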

Such prediction models are early steps toward Deng’s ultimate goal: building digital twins for cancer patients. These sophisticated models would incorporate not only patients’ health records and history, but also detailed virtual representations of individual organs and body systems, including the immune system or gut microbiome.

Some aspects of the digital twin will be based on nonindividualized but highly detailed models of parts of the human body—patients won’t have to undergo invasive biopsies to precisely replicate all their organs in the model, for example. But a cancer patient may have their tumor biopsied and its complete genetic profile sequenced, providing valuable personalized data that can be added to the model. Digital twins are also dynamic, Deng adds, incorporating new information such as how a tumor responds to a given treatment to update themselves and provide accurate real-time predictions.

Looking ahead, digital twins might one day even be used to speed clinical trials, Deng says. If new drugs can be tested virtually before they’re given to patients in the physical world, that would save resources and precious time for patients who are waiting for new treatments for their cancers. Right now, Deng and his collaborators at Yale and at other institutions are working on models of organs and body systems. The next challenge will be integrating those individual models into a larger, multiscale model of the entire body. Even in an optimistic scenario, Deng predicts, it will be at least five years before a true digital twin becomes available for clinical use.

“We don’t want to underestimate the complexity of the human body,” Deng says. “There is so much data generated every day, and so many phenomena where we still don’t completely understand the biology.”

Protein binding prediction

On a smaller physical scale, YSM Associate Professor of Biomedical Informatics and Data Science María Rodríguez Martínez, PhD, is using AI to make predictions that could lead to better cancer immunotherapies—a type of cancer treatment that harnesses the natural power of the immune system to attack tumors.

Rodríguez Martínez is building models to understand how certain kinds of immune cells known as T cells interact with their targets. T cells patrol our bodies, searching out and identifying cells or molecules that shouldn’t be there, such as cancerous cells or foreign invaders including bacteria and viruses. T cells’ surfaces are studded with proteins known as T-cell receptors that allow the immune cells to sample their environment by binding to other cells or molecules. T-cell receptors are amazingly flexible—both literally and figuratively. These proteins can bind to a wide repertoire of other molecules, and they rapidly change their 3D shapes to do so.

But not every T-cell receptor can recognize every disease protein. We all have many different types of T-cell receptors, and not everyone’s T-cell receptors are the same. Understanding whether a given T-cell receptor will bind to a given disease molecule would be a huge leap in the field of cell therapy—a form of immunotherapy in which a patient’s own T cells are engineered to better recognize and destroy cancer. Rodríguez Martínez is working on models to predict T-cell receptor binding, the first step toward designing new T-cell receptors that are tailored to certain kinds of cancer cells.

To date, many researchers have used AI to attempt to predict T-cell receptor binding, but there’s room for improvement, Rodríguez Martínez says. Current models predict binding from the sequences of T-cell receptors and disease proteins, but the sequences alone don’t capture the full picture.

“It’s like trying to define Mount Everest without a three-dimensional map; it’s akin to just giving someone the mountain’s total height and telling them to go ahead and climb it,” Rodríguez Martínez says. “The whole community has been doing it this way because until very recently we didn’t know how to do better.”
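For readers curious what the sequence-only setup looks like in practice, here is a deliberately simple, hypothetical sketch: T-cell receptor and peptide sequences are reduced to amino-acid composition vectors and fed to a classifier trained on made-up binding labels. Real sequence-based models are far more sophisticated; this only illustrates the sequence-in, binding-probability-out framing described above.

# Hypothetical sketch of sequence-only binding prediction, with synthetic
# sequences and random stand-in labels; not any published model.
import numpy as np
from sklearn.linear_model import LogisticRegression

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
rng = np.random.default_rng(2)

def composition(seq):
    """Fraction of each of the 20 amino acids in a sequence."""
    counts = np.array([seq.count(a) for a in AMINO_ACIDS], dtype=float)
    return counts / max(len(seq), 1)

def random_seq(length):
    return "".join(rng.choice(list(AMINO_ACIDS), size=length))

# Synthetic paired examples: (TCR sequence, peptide sequence, binds or not)
pairs = [(random_seq(15), random_seq(9), int(rng.integers(0, 2))) for _ in range(300)]
X = np.array([np.concatenate([composition(t), composition(p)]) for t, p, _ in pairs])
y = np.array([label for _, _, label in pairs])

model = LogisticRegression(max_iter=1000).fit(X, y)
tcr, peptide = random_seq(15), random_seq(9)
features = np.concatenate([composition(tcr), composition(peptide)]).reshape(1, -1)
print("predicted binding probability:", round(model.predict_proba(features)[0, 1], 2))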

Rodríguez Martínez is now working on models that incorporate T-cell receptors’ 3D structures. The models are made possible by advances such as the recent Nobel Prize-winning AlphaFold, an AI technology that predicts protein structures from their amino acid sequences. Because T-cell receptors are such good shape-shifters, however, it’s not as simple as just plugging their information into AlphaFold, Rodríguez Martínez explains. Still, AlphaFold and related technologies give researchers a good starting point for predicting T-cell receptor binding more accurately.

Doctor, meet AI

Deng and Papademetris are also working to bridge the gap between humans and computers. Deng is working on models that could better personalize radiation treatments in real time, incorporating patient preferences into the model’s input in the clinic. The AI is designed to help the radiation oncologists and their patients in shared decision-making, Deng says.

Papademetris is increasingly focused on training current and future physicians, software engineers, and others in medicine and industry on how to interact with medical AI. As with any sophisticated new tool, education is a key part of AI’s rollout to the clinic. Papademetris has launched a new online nondegree program at YSM, Medical Software and Medical Artificial Intelligence, and an accompanying series of video interviews with experts in the field of medical AI aimed at anyone interested in or currently working in this area.

“How do we train the people who are going to be working in this space to create these products, to use these products, to manage their integration into hospitals?” Papademetris asks. “It’s not just about building better software, because the software also needs to talk to humans. If the humans are part of the system, we can improve things by training the human as well, not just by improving the AI.”
