STRASBOURG, FRANCE - The European Union defines a disease as rare if it afflicts fewer than one in 2,000 individuals.1 Although each rare disease is uncommon on its own, collectively they make an outsized contribution to overall morbidity. About 95 percent of rare diseases lack effective treatment options.2 Most are thought to result from gene mutations that are either inherited or arise spontaneously (‘de novo’) in an individual who is the first in an otherwise healthy family to display phenotypes. Fortunately, the exploration and diagnosis of rare diseases are now easier thanks to advances in next-generation sequencing technologies, which have gathered momentum in the past decade. Exome and genome sequencing reveal thousands to millions of genetic variants in a typical individual. The main challenge is therefore pinpointing the small subset of variants, usually just one or a few, responsible for a disease's phenotype.
In medical genetics research, identifying the variants associated with a disease remains an ongoing challenge. Over the last few years, several technologies have been developed to identify genetic variants based on their biochemical and genomic features, but only a few of these consider patient characteristics and evidence from the literature. Current methods often rely on resources such as the Human Phenotype Ontology (HPO), which defines and organizes human phenotypic abnormalities, or Online Mendelian Inheritance in Man (OMIM), a catalogue of human genes and genetic disorders.3 Conventional approaches frequently require clinical presentations to conform to standardized vocabularies, making it harder to capture patient symptoms precisely. These vocabularies are often incomplete and do not cover evolving research concepts or terms that are not explicitly defined, which in turn limits the accuracy of ontology-based tools.