Multi-country Project: Research and development on the reuse of EHR data in clinical research through openEHR and knowledge graphs.

The growing amount of clinical data in electronic format presents a great opportunity to enhance research and healthcare. However, much of this data in electronic health records (EHRs) is in free-text format, making it difficult to organize, ensure interoperability between systems, and consequently use in clinical research. Our project aims to address this issue by using standards such as openEHR and SNOMED CT, which enable better organization and access to healthcare information.

The main objective is to maximize the use of EHR data in clinical research by leveraging these standards together with advanced techniques such as natural language processing (NLP) and knowledge graphs. As a result, the project will generate tools and methodologies that, on the one hand, facilitate the structuring and normalization of narrative clinical texts and, on the other, enable the use of information contained in openEHR repositories for clinical research.

We will focus on the following areas:

  • Structuring Spanish-language clinical texts into openEHR format, which involves identifying and organizing the relevant information within these texts.
  • Creating a knowledge graph from clinical texts and openEHR data to enable more precise and advanced searches in electronic health records.
  • Developing an advanced query engine that enables searches guided by SNOMED CT terminology on openEHR repositories, improving the accuracy of analyses.

The following figure shows the general framework of the work areas listed above and the relationships between them.

One of the most important innovations will be the semantic query engine, which will enable complex semantic searches in electronic health records. For example, by fully leveraging the definition of SNOMED CT concepts, it will be possible to answer questions such as "patients with a disease localized in the lung that is not a neoplasm and who do not have any type of heart disease." This engine, supported by SNOMED CT and the knowledge graph, will facilitate a more detailed and accurate analysis of clinical data.
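
To make the example question above more concrete, here is a minimal Python sketch of how such a query could be evaluated by reasoning over concept definitions. It is only an illustration under stated assumptions, not the project's engine: the concept names, the `IS_A` and `FINDING_SITE` tables, the `PATIENTS` data and all function names are hypothetical, and the toy hierarchy stands in for SNOMED CT rather than using real SNOMED CT identifiers or an openEHR repository.

```python
from typing import Dict, List, Set

# Toy "is-a" hierarchy (child -> parents). Hypothetical names, not SNOMED CT codes.
IS_A: Dict[str, List[str]] = {
    "disease": [],
    "neoplasm": ["disease"],
    "heart_disease": ["disease"],
    "pneumonia": ["disease"],
    "lobar_pneumonia": ["pneumonia"],
    "lung_cancer": ["neoplasm"],
    "myocardial_infarction": ["heart_disease"],
    "body_structure": [],
    "lung_structure": ["body_structure"],
    "lung_lobe": ["lung_structure"],
    "heart_structure": ["body_structure"],
}

# Toy defining attribute: the anatomical site of each disorder (assumed values).
FINDING_SITE: Dict[str, str] = {
    "pneumonia": "lung_structure",
    "lobar_pneumonia": "lung_lobe",
    "lung_cancer": "lung_structure",
    "myocardial_infarction": "heart_structure",
}

# Hypothetical patient records: patient id -> coded diagnoses.
PATIENTS: Dict[str, List[str]] = {
    "p1": ["pneumonia"],
    "p2": ["lung_cancer"],
    "p3": ["pneumonia", "myocardial_infarction"],
    "p4": ["lobar_pneumonia", "lung_cancer"],
}


def ancestors_or_self(concept: str) -> Set[str]:
    """Return the concept plus every concept reachable through is-a links."""
    seen: Set[str] = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(IS_A.get(current, []))
    return seen


def is_subsumed_by(concept: str, ancestor: str) -> bool:
    """True if `concept` is `ancestor` or one of its descendants."""
    return ancestor in ancestors_or_self(concept)


def located_in(concept: str, site: str) -> bool:
    """True if the concept's finding site is `site` or a part listed under it."""
    finding_site = FINDING_SITE.get(concept)
    return finding_site is not None and is_subsumed_by(finding_site, site)


def matches(diagnoses: List[str]) -> bool:
    """Lung-located, non-neoplastic disease present and no heart disease at all."""
    has_lung_non_neoplasm = any(
        is_subsumed_by(d, "disease")
        and located_in(d, "lung_structure")
        and not is_subsumed_by(d, "neoplasm")
        for d in diagnoses
    )
    has_heart_disease = any(is_subsumed_by(d, "heart_disease") for d in diagnoses)
    return has_lung_non_neoplasm and not has_heart_disease


def run_query(patients: Dict[str, List[str]]) -> List[str]:
    """Return the sorted ids of the patients that satisfy the query."""
    return sorted(pid for pid, dx in patients.items() if matches(dx))


print(run_query(PATIENTS))  # ['p1', 'p4']
```

In this sketch, patient `p4` is retrieved even though the record only states "lobar pneumonia": the match follows from the concept's definition (it is a kind of pneumonia whose site is a lung lobe), which is the kind of inference a plain text or exact-code search would miss. Patient `p3` is excluded because of the heart disease, and `p2` because the only lung disease recorded is a neoplasm.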

This project is expected to significantly improve the ability to perform semantic analysis and searches, benefiting both researchers and healthcare providers, who will be able to access more detailed information for decision-making. Additionally, the project could have a major impact on how clinical research is conducted and on how electronic health records are developed, as software companies will be able to integrate these innovations into their products, improving service quality. It also contributes to the goals of the PERTE for Cutting-Edge Health by promoting an innovative data system that enhances prevention, diagnosis, treatment, rehabilitation, and health research.

The project will be carried out by the company Veratech for Health in collaboration with the German company Vitagroup (https://hip.vitagroup.ag/en/), which will contribute its expertise and tools for managing openEHR data.

The project is funded by the Ministry of Science, Innovation, and Universities and by the CDTI within the "Multi-country" Projects call linked to the PERTE for Cutting-Edge Health under the Recovery, Transformation, and Resilience Plan.