Open medical data is a no-brainer

By Satyen K. Bordoloi | Jul 05, 2022

Medical AI is revolutionizing global healthcare, even reengineering human genes, but it cannot do so without data. This generation may be the last to record ‘pure’ medical data, since future records will be tainted by humans with altered genes, so today's data will be priceless for posterity, says Satyen K. Bordoloi as he marks the 110th anniversary of Alan Turing’s birth to push for open access.

MUMBAI - The COVID pandemic has exposed shocking fault lines in our healthcare systems. Take pulse oximeters, noninvasive devices that gauge oxygen in the blood by shining red and infrared light into capillaries. As their use became more common, we were horrified to learn they had been calibrated only on white skin and so gave incorrect readings on darker skin tones.

Medical science already knew this, yet the problem was never addressed. The reason was simple - there was no need, since most of the stupendous developments in medicine over the last 50 years have come from the West, with the test subjects and data used primarily for, of, and by white people. In effect, for half a century 15 percent of the world’s population has decided the entire world’s medical needs.

As the globe prepares for the next 50 years of explosive growth in medicine and healthcare, the charge will be led by Artificial Intelligence (AI) today, Quantum Intelligence a decade or so from now, and gene editing built on clustered regularly interspaced short palindromic repeats (CRISPR) - a family of DNA sequences found in the genomes of prokaryotic organisms. All of this growth will be based on medical data. Hence, the most important question we can ask today is: should datasets be open or closed?

Open datasets are publicly funded data made available to developers and researchers for their use, with few or no caveats and at minimal or no cost. Closed datasets are those guarded by private corporations and closed to everyone but their own researchers or those to whom they grant access.

The pros and cons of open datasets are well known.¹ The pros are increased efficiency at reduced cost, innovation, greater transparency, and reduced corruption. The cons are incomplete, incorrect, or missing data, lack of privacy and consent, and the risk of identity theft.

The content herein is subject to copyright by The Yuan. All rights reserved. The content of the services is owned or licensed to The Yuan. Such content from The Yuan may be shared and reprinted but must clearly identify The Yuan as its original source. Content from a third-party copyright holder identified in the copyright notice contained in such third party’s content appearing in The Yuan must likewise be clearly labeled as such.
