🌞 1 - Definition of the Knowledge Graph and Schema

# Definition

## Knowledge Graph

We define a knowledge graph as $\mathcal{G}=\{\mathcal{E}, \mathcal{R}, \mathcal{F}\}$, where $\mathcal{E}$, $\mathcal{R}$ and $\mathcal{F}$ are the sets of entities, relations and facts, respectively. A fact is denoted as a triple $(h, r, t) \in \mathcal{F}$.

In practice, a fact takes one of two forms:

1. entity - relation - entity
2. entity - attribute - attribute value

## Medical Data Used in This Project

![image.png](https://cos.easydoc.net/46811466/files/l894hi9v.png)

[[source GitHub](https://github.com/liuhuanyong/QASystemOnMedicalKG)] [[data on Huggingface](https://huggingface.co/datasets/nlp-guild/medical-data)]

**Analysis**:

1. Each sample holds the information about one disease (a dict / JSON object).
2. An example:

![image.png](https://cos.easydoc.net/46811466/files/l8auh9u9.png)

## Schema

![image.png](https://cos.easydoc.net/46811466/files/l894b2nh.png)

[[introduction to the schema](https://mp.weixin.qq.com/s/t7F9GDEsn0wmqwZ-eh-Ypg)]

**Analysis**: The schema is built from the entity types found in the raw Medical Data and the common-sense relations between entities of different types (e.g. "cold" "has symptom" "fever").

# Build KG

Goal: convert the raw Medical Data into a KG that follows the schema above, and write it to a Neo4j DB.

[[code](https://colab.research.google.com/gist/leoxiang66/bf52b9045187a95e276914451954c667/build_kg.ipynb)] [[generated KG on neo4j](https://console.neo4j.io/#databases/f54cadff/detail)]

![image.png](https://cos.easydoc.net/46811466/files/l8auovxx.png)

![image.png](https://cos.easydoc.net/46811466/files/l8auphz2.png)

**Remark**: *For first-time use, create a Neo4j AuraDB instance on the Neo4j website and save its username and password.*

# Summary

![image.png](https://cos.easydoc.net/46811466/files/l8ausdis.png)
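
**Appendix: from a record to triples.** The Build KG step turns each disease record into facts $(h, r, t)$. The snippet below is a minimal sketch of that idea and not the code from the linked notebook. The field names (`name`, `symptom`, `department`, `desc`), the relation labels and the sample record are all assumptions chosen for illustration.

```python
from typing import Any

# Hypothetical mapping from a record field to a relation label.
# Fields listed here hold lists of other entities (entity - relation - entity).
RELATION_FIELDS = {
    "symptom": "has_symptom",
    "department": "belongs_to",
}

# Hypothetical fields whose values are plain attribute values
# (entity - attribute - attribute value).
ATTRIBUTE_FIELDS = ("desc",)


def record_to_triples(record: dict[str, Any]) -> list[tuple[str, str, str]]:
    """Convert one disease record into a list of (h, r, t) triples."""
    head = record["name"]  # the disease is the head entity of every fact
    triples = []

    # One fact per related entity.
    for field, relation in RELATION_FIELDS.items():
        for tail in record.get(field, []):
            triples.append((head, relation, tail))

    # One fact per attribute that is present in the record.
    for field in ATTRIBUTE_FIELDS:
        if field in record:
            triples.append((head, field, record[field]))

    return triples


# Made-up sample record, not taken from the dataset.
sample = {
    "name": "cold",
    "symptom": ["fever", "cough"],
    "desc": "A common viral infection.",
}
triples = record_to_triples(sample)
for triple in triples:
    print(triple)
```

Keeping the field-to-relation mapping in a dictionary means the schema can change without touching the conversion loop. Each resulting triple could then be written to the graph database as two nodes and one edge.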