Healthcare Analytics Using Graph Database Neo4j with Cypher Query Language

Yaswanth Kumar Togarapu
3 min read · Apr 25, 2023

Healthcare analytics is the analysis of data collected from four areas within healthcare:

  1. Electronic Health Records (EHRs)
  2. Claims and billing data
  3. Clinical data
  4. Patient-generated data

Healthcare analytics is a growing industry and is expected to keep growing over time.

Use Case

I have developed a self-explanatory example use case to demonstrate the capabilities of a graph database in healthcare. In this demo, we ingest and analyze data from the FDA Adverse Event Reporting System.

FDA Dataset

The FDA Adverse Event Reporting System (FAERS or AERS) is a computerized information database designed to support the U.S. Food and Drug Administration’s (FDA) post-marketing safety surveillance program for all approved drug and therapeutic biologic products.

The FDA uses FAERS to monitor for new adverse events and medication errors that might occur with these products. It is a system that measures occasional harms from medications to ascertain whether the risk–benefit ratio is high enough to justify continued use of any drug and to identify correctable and preventable problems in health care delivery (such as the need for retraining to prevent prescribing errors).

Public Dashboard of FDA

Data, Modeling, and Graph Ingestion

We downloaded one of the publicly available FDA FAERS datasets, cleaned it, and narrowed the demographics to the United States. FAERS data is tabular, in the traditional RDBMS style, so we translate it to a graph-based data model.
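The translation from tables to a graph could look roughly like the sketch below. It is an illustration under stated assumptions, not the exact ingestion script used for this demo: the file names (`demographics.csv`, `drugs.csv`), the column names, the node labels (`Case`, `AgeGroup`, `Drug`), and the relationship types (`FALLS_UNDER`, `INVOLVES_DRUG`) are all hypothetical. The constraint syntax assumes Neo4j 4.4 or later.

```cypher
// Assumed schema: one Case node per adverse event report.
// A uniqueness constraint keeps MERGE fast and prevents duplicate cases.
CREATE CONSTRAINT case_id IF NOT EXISTS
FOR (c:Case) REQUIRE c.primaryid IS UNIQUE;

// Hypothetical file and column names; adjust to the cleaned FAERS extract.
LOAD CSV WITH HEADERS FROM 'file:///demographics.csv' AS row
MERGE (c:Case {primaryid: toInteger(row.primaryid)})
SET c.age = toFloat(row.age), c.gender = row.sex
// Skip rows without an age group, since MERGE cannot match on a null property.
WITH c, row
WHERE row.age_grp IS NOT NULL
// A shared lookup node replaces what would be a repeated column value in a table.
MERGE (a:AgeGroup {name: row.age_grp})
MERGE (c)-[:FALLS_UNDER]->(a);

// A join table in the RDBMS becomes a relationship in the graph.
LOAD CSV WITH HEADERS FROM 'file:///drugs.csv' AS row
MATCH (c:Case {primaryid: toInteger(row.primaryid)})
MERGE (d:Drug {name: row.drugname})
MERGE (c)-[:INVOLVES_DRUG]->(d);
```

Using `MERGE` rather than `CREATE` makes the load repeatable: rerunning the script does not duplicate nodes or relationships.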

Data Model

Next, we perform data ingestion to prepare the FAERS graph and run a few example analytics queries to see the interesting output. Some interesting queries are:

  • What are the top five drugs that consumers directly reported for side effects?
  • Which ten drug combinations have the most side effects when consumed together?
  • What age group reported the highest side effects, and what are those side effects?
  • What are the most common side effects reported in children and what drugs caused these side effects?
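As a rough sketch, the first question could be expressed in Cypher as follows. The labels, relationship types, and the `'Consumer'` value are assumptions made for illustration, since they depend on how the FAERS graph was actually modeled.

```cypher
// Assumed schema:
//   (Case)-[:REPORTED_BY]->(ReportSource)
//   (Case)-[:INVOLVES_DRUG]->(Drug)
//   (Case)-[:HAS_REACTION]->(Reaction)
MATCH (c:Case)-[:REPORTED_BY]->(:ReportSource {name: 'Consumer'})
MATCH (c)-[:INVOLVES_DRUG]->(d:Drug)
MATCH (c)-[:HAS_REACTION]->(r:Reaction)
// Count each report and each reaction once per drug.
RETURN d.name AS drug,
       count(DISTINCT c) AS reports,
       count(DISTINCT r) AS distinctSideEffects
ORDER BY reports DESC
LIMIT 5;
```

The other questions follow the same pattern: match the path that describes the question, aggregate, sort, and limit.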

You’ll notice these queries are truly analytical in nature. They are also not easy to write and run with a traditional RDBMS and its query language. With Neo4j and the power of Cypher, this becomes much easier.
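To make the comparison concrete, here is a hedged sketch of the drug-combination question. In SQL it would typically need a self-join on a case-to-drug table plus a further join to reactions. In Cypher the pattern reads much like the question itself. The labels and relationship types (`Case`, `Drug`, `Reaction`, `INVOLVES_DRUG`, `HAS_REACTION`) are assumed for illustration and may differ from the actual model.

```cypher
// Find pairs of drugs that appear in the same case.
MATCH (d1:Drug)<-[:INVOLVES_DRUG]-(c:Case)-[:INVOLVES_DRUG]->(d2:Drug)
// Ordering the names drops self-pairs and mirrored duplicates (A,B) vs (B,A).
WHERE d1.name < d2.name
MATCH (c)-[:HAS_REACTION]->(r:Reaction)
RETURN d1.name AS drugA,
       d2.name AS drugB,
       count(DISTINCT r) AS sideEffects
ORDER BY sideEffects DESC
LIMIT 10;
```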

Example Analysis

Yaswanth Kumar Togarapu

Hello everyone, I'm Yaswanth from India. I love solving real-world problems and building the confidence to achieve that.