DALLAS, TEXAS - Artificial intelligence (AI)-based image interpretation often fails to live up to vendor claims once installed in a clinical environment, as many clinical users have discovered the hard way. Since most vendors can cite research studies showing high accuracy, why does the discrepancy exist? The answer generally lies in the data and process used to train the AI, which is why independent third-party groups must verify AI algorithms against recent clinical data.
When training any AI algorithm, the most crucial element is good data. Training with flawed, incomplete, or biased data leads to a poor outcome. One common problem is that available datasets do not represent the population on which the AI will be used. Different device manufacturers also produce tremendous variation in the Digital Imaging and Communications in Medicine (DICOM) data files.
This variation means that an X-ray, computed tomography (CT), or magnetic resonance imaging (MRI) study acquired on a General Electric (GE) machine is very different from one taken on a Philips or Siemens device. The differences are generally not appreciable to the eye, because many layers of the DICOM image are not displayed. The AI algorithm, however, sees all of the data and may be confounded when presented with images from an unfamiliar manufacturer.
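One practical way to catch this kind of vendor skew is to profile a training set before it is used. The sketch below is illustrative only: it assumes a hypothetical `manufacturers` list standing in for the values of the DICOM Manufacturer (0008,0070) tag across a dataset, and flags the dataset when a single vendor exceeds a chosen share.

```python
from collections import Counter

def manufacturer_bias(manufacturers, threshold=0.6):
    """Return (vendor, share) for the dominant vendor if its share of
    the dataset exceeds `threshold`, otherwise None.

    `manufacturers` holds one entry per study, e.g. the value of the
    DICOM Manufacturer (0008,0070) tag read from each file's header.
    """
    counts = Counter(manufacturers)
    vendor, n = counts.most_common(1)[0]
    share = n / len(manufacturers)
    return (vendor, share) if share > threshold else None

# Illustrative (made-up) dataset: 7 of 10 studies come from one vendor.
sample = ["GE"] * 7 + ["Philips"] * 2 + ["Siemens"]
print(manufacturer_bias(sample))  # → ('GE', 0.7)
```

In a real pipeline the same check would be repeated for other header fields that encode acquisition differences (model name, institution, protocol), since any of them can introduce the hidden bias described above.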
Similarly, the techniques of the radiographer taking the images can vary greatly between institutions in the placement, orientation, and rotation of the imaged anatomy. When a high percentage of the images used for training AI come from the same manufacturer or were acquired with similar techniques, an inherent bias is present in the data. Datasets should also match the age range of patients, a recent study in the American Journal of Neuroradiology (AJNR) suggests.
In the AJNR example, the AI was designed to identify spinal fractures. It performed reasonably well with younger patients, but a