CJLIS (Traditional Chinese Medicine)

A Quantitative Study on the Automatic Expansion of a Chinese TCM Ontology

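- Funding: National Science and Technology Major Project of the Ministry of Science and Technology (2012ZX09304003-001); Industry Special Project of the State Administration of Traditional Chinese Medicine (201207001-21); Special Project for Basic Science and Technology Work of the Ministry of Science and Technology (2009FY120300); Innovation Team Project of the China Academy of Chinese Medicine Science (PY1306); Fujian Province 2011 Collaborative Innovation Center for TCM Health Management
- First author: WANG Da-yu, postdoctoral researcher; research interest: medical informatics. E-mail: sywdy@qq.com
- Corresponding author: CUI Meng, research fellow; research interest: TCM informatics. E-mail: cm@mail.cintcm.ac.cn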

WANG Da-yu, LI Yuan-bai, YANG Yang, CUI Meng*

Institute of Information on Traditional Chinese Medicine, China Academy of Chinese Medicine Science, Beijing 100700

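Abstract: Objective To quantitatively study the automatic expansion of a Chinese TCM ontology using new knowledge sources. Methods An experimental ontology was built from TCM medical terms and the relationships among those terms. Cases reported in TCM academic papers were used as the knowledge source for expanding the ontology, and also served as the test set for evaluating the quality of the ontology before and after expansion. Results An ontology containing 41,652 instances could be expanded with medical information from 3,000 clinical visits. Its coverage of disease names occurring in real clinical practice rose from 52.3% to 72.4%, of syndrome names from 14.8% to 55.8%, of medicine names from 13.7% to 54.8%, and of therapy names from 25.8% to 77.2%. Conclusion Using cases reported in academic papers as a new knowledge source for automatic ontology expansion can significantly increase ontology coverage.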

Key words: ontology expansion; TCM ontology; TCM terminology set

CLC number: R2-03; Document code: A; Article ID: 2095-5707(2016)05-0009-05

A Quantitative Study on Automatic Expansion of Chinese TCM Ontology

WANG Da-yu, LI Yuan-bai, YANG Yang, CUI Meng*

(Institute of Information on Traditional Chinese Medicine, China Academy of Chinese Medicine Science, Beijing 100700, China)

Abstract: Objective To conduct a quantitative study on the automatic expansion of a Chinese TCM ontology with new knowledge sources. Methods An experimental Chinese TCM ontology was built from TCM terms and the relationships among them. Medical cases reported in TCM academic papers were used as knowledge sources for expanding the ontology, and also served as test sets for evaluating the quality of the ontology before and after expansion. Results An ontology with 41,652 instances could be expanded with information from 3,000 clinical visits. Its coverage of disease names in real clinical application increased from 52.3% to 72.4%, of syndrome names from 14.8% to 55.8%, of medicine names from 13.7% to 54.8%, and of TCM therapy names from 25.8% to 77.2%. Conclusion Using medical cases in TCM academic papers as new knowledge sources for automatic ontology expansion can significantly increase ontology coverage.

Key words: ontology expansion; TCM ontology; TCM terminology set
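
The coverage figures reported in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this assumes coverage means the fraction of distinct term names in a test set that are present in the ontology, and that expansion simply adds the names found in the source cases. The function names `coverage` and `expand` and the toy terms are hypothetical and are not taken from the paper.

```python
def coverage(ontology_terms, observed_terms):
    """Fraction of distinct observed terms that the ontology contains.

    Assumption: coverage is measured over distinct names, not over mentions.
    """
    observed = set(observed_terms)
    if not observed:
        return 0.0
    return len(observed & set(ontology_terms)) / len(observed)


def expand(ontology_terms, case_terms):
    """Return a new term set with the names from one case added."""
    return set(ontology_terms) | set(case_terms)


# Toy data (hypothetical): a small ontology, two source cases, one test set.
ontology = {"common cold", "cough"}
source_cases = [["common cold", "asthma"], ["stomachache"]]
test_terms = ["common cold", "asthma", "stomachache", "insomnia"]

before = coverage(ontology, test_terms)

expanded = set(ontology)
for case in source_cases:
    expanded = expand(expanded, case)

after = coverage(expanded, test_terms)

print(f"coverage before expansion: {before:.1%}")
print(f"coverage after expansion:  {after:.1%}")
```

In this toy run only one of the four test names is in the ontology at first, and three of the four are present after the two source cases are added. The same calculation would be repeated per term category (disease, syndrome, medicine, therapy) to obtain separate figures like those in the abstract.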

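The number of concepts, instances, and relations an ontology contains determines the breadth and depth of the intelligent algorithms it can support. If an instance cannot be found in the ontology, the ontology cannot provide accurate support for processing or computing on that instance. One could, of course, use similarity computation to find the instance in the ontology that is closest to the queried one, but this approach brings in other knowledge resources (such as a synonym dictionary) or computational models (similarity models such as the vector space model), so the ontology can no longer work independently and a dependency arises; it also lowers accuracy, because similar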
