Multimodal Model Architectures May Enhance Clinical AI Performance
By George Mastorakos  |  Feb 07, 2022
George Mastorakos believes that combining data types into what are called "multimodal models" may be the key to moving clinical artificial intelligence into its next phase: better performance and broader applicability in clinical decision-making.

SCOTTSDALE, ARIZONA - Healthcare encompasses a plethora of data types and sources: demographic data, lab values, scans, videos, speech studies, medication dosages, insurance coverage data, and data from wearables such as the Fitbit or Apple Watch, to name just a few.

Despite this diversity, most machine learning models in healthcare incorporate only one type of data, whether that be image data, e.g., labeled magnetic resonance imaging (MRI) scans used to detect brain tumors, or time series data, such as electrocardiograms used to detect arrhythmias. Research on multimodal models, i.e., models that incorporate multiple data types or, mathematically speaking, different modes, is sparse. Why aren’t multimodal models the standard for clinical artificial intelligence (AI)?

Key Challenges

For multimodal models to work properly, several pre-processed, cleanly labeled, and relevant datasets must be readily accessible. This prerequisite unearths a few key obstacles. Firstly, compiling even a single new database, let alone several, can be a challenge. Much patient data is scattered across the electronic medical record, image storage systems, e.g., picture archiving and communication systems, and other clinical data stores. It is difficult to parse, sort, and organize, and it is usually scraped manually by medical students or assistants conducting clinical research, which is an error-prone process. Even if proper care and energy were put into organizing a single database, it may not contain all the data types needed to train a multimodal model. A typical cancer registry, for example, may include patient characteristics, chemotherapy regimens, and treatment outcomes, but likely doesn’t include x-ray image files or specific lab values over time.
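To make this first obstacle concrete, the following is a minimal Python sketch of checking how many patients actually have every data type once separate stores are lined up by patient ID. It is an illustration under stated assumptions, not a description of any real system: the three in-memory dictionaries, the patient IDs, the file names, and the helper functions `modality_coverage`, `complete_patients`, and `missing_modalities` are all hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical stand-ins for separate clinical data stores, keyed by patient ID.
# All names and values are invented for illustration only.
registry = {
    "p1": {"regimen": "A", "outcome": "remission"},
    "p2": {"regimen": "B", "outcome": "relapse"},
    "p3": {"regimen": "A", "outcome": "remission"},
}
imaging = {"p2": "xray_p2.dcm", "p3": "xray_p3.dcm"}
labs = {"p1": [4.1, 3.9], "p2": [5.2, 5.0]}

sources: Dict[str, Dict[str, object]] = {
    "registry": registry,
    "imaging": imaging,
    "labs": labs,
}


def modality_coverage(sources: Dict[str, Dict[str, object]]) -> Dict[str, Set[str]]:
    """Map each patient ID to the set of source names that hold data for it."""
    coverage: Dict[str, Set[str]] = {}
    for name, records in sources.items():
        for patient_id in records:
            coverage.setdefault(patient_id, set()).add(name)
    return coverage


def complete_patients(sources: Dict[str, Dict[str, object]]) -> List[str]:
    """Return the sorted IDs of patients present in every source."""
    required = set(sources)
    coverage = modality_coverage(sources)
    return sorted(pid for pid, have in coverage.items() if have == required)


def missing_modalities(sources: Dict[str, Dict[str, object]]) -> Dict[str, Set[str]]:
    """Return, for each incomplete patient, the sources with no data for them."""
    required = set(sources)
    coverage = modality_coverage(sources)
    return {pid: required - have for pid, have in coverage.items() if have != required}


print(complete_patients(sources))
print(missing_modalities(sources))
```

In this toy example only one of the three patients appears in all three stores, which mirrors the point above: a registry can look complete on its own while most of its patients still lack the images or lab series a multimodal model would need.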

Secondly, labeling multiple data types requires careful consideration of the end use case. A mo

The content herein is subject to copyright by The Yuan. All rights reserved. The content of the services is owned or licensed to The Yuan. The copying or storing of any content for anything other than personal use is expressly prohibited without prior written permission from The Yuan, or the copyright holder identified in the copyright notice contained in the content.