Multi-modality AI in medicine
By Moein Shariatnia  |  Mar 28, 2023
Unlike most AI, which is trained to handle just one specific modality, multimodal AI models can handle different modalities such as image, text, and speech at the same time, while still performing narrow and specific tasks. This means that, if and when multi-modality AI progresses to the point where it can be deployed on a vast scale, it will truly revolutionize medicine and many other fields.

TEHRAN - Professors often emphasize to medical students the importance of every aspect of a patient’s visit to the clinic, as each can provide valuable insights for an accurate diagnosis. To begin the process, the doctor must listen attentively as the patient describes their symptoms and complaints, gathering a thorough medical history. The next crucial step is the physical examination. At this step, the doctor inspects (looks at the body), palpates (touches the body gently), auscultates (listens to heart, lung, and abdominal sounds), and percusses (taps on the body) to obtain as much relevant information as possible and help narrow down the list of differential diagnoses. By using all of their senses, doctors can arrive at the best possible diagnoses and treatment plans for patients.

The majority of machine learning (ML) and deep learning (DL) applications in medicine today - not to mention most other fields beyond healthcare - take the opposite tack to this holistic approach: They are much narrower and focus on doing only one task in a single modality, such as detecting pneumonia in chest X-rays (CXR), determining whether a pathology slide shows patterns of a certain cancer, or segmenting tumor cells in a brain magnetic resonance imaging (MRI) scan. At their current level of narrow focus, these applications can at best serve as supportive tools - much as a calculator supports a mathematician - that boost the efficiency of healthcare providers by a few percentage points, or that double-check results to avoid missing crucial diagnoses. This is helpful and useful, but far from anything that truly revolutionizes medicine.

The development of artificial intelligence (AI) systems that can handle various forms of data and modalities - such as natural language text describing patient history, lab tests as tabular numeric data, imaging studies like CXR, computed tomography (CT) scans, MRIs, ultrasonography, genom

The content herein is subject to copyright by The Yuan. All rights reserved. The content of the services is owned or licensed to The Yuan. Such content from The Yuan may be shared and reprinted but must clearly identify The Yuan as its original source. Content from a third-party copyright holder identified in the copyright notice contained in such third party’s content appearing in The Yuan must likewise be clearly labeled as such.