CJLIS (Traditional Chinese Medicine)
A Quantitative Study on Automatic Expansion of Chinese TCM Ontology
WANG Da-yu, LI Yuan-bai, YANG Yang, CUI Meng*
(Institute of Information on Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China)
Abstract: Objective To quantitatively study the automatic expansion of a Chinese TCM ontology using new knowledge sources. Methods An experimental ontology was built from TCM medical terms and the relationships among them. Medical cases reported in TCM academic papers were used as the knowledge source for expanding the ontology, and also as the test set for evaluating ontology quality before and after expansion. Results An ontology containing 41,652 instances could be expanded with the medical information from 3,000 clinical visits. Its coverage of disease names occurring in real clinical practice increased from 52.3% to 72.4%, syndrome names from 14.8% to 55.8%, medicine names from 13.7% to 54.8%, and therapy names from 25.8% to 77.2%. Conclusion Using medical cases reported in TCM academic papers as a new knowledge source for automatic ontology expansion can significantly increase ontology coverage.
Key words: ontology expansion; TCM ontology; TCM terminology set
CLC number: R2-03 Document code: A Article ID: 2095-5707(2016)05-0009-05
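The coverage figures reported in the abstract can be illustrated with a minimal sketch. It assumes that coverage means the fraction of term mentions in a test set that are found in the ontology by exact string match, and that expansion simply adds the terms seen in the new cases. The function names `coverage` and `expand`, the toy terms, and the split between expansion cases and test cases are all hypothetical; the paper's actual procedure may differ.

```python
from typing import Iterable, Set


def coverage(ontology_terms: Set[str], observed_terms: Iterable[str]) -> float:
    """Fraction of observed term mentions found in the ontology by exact match."""
    observed = list(observed_terms)
    if not observed:
        return 0.0
    hits = sum(1 for term in observed if term in ontology_terms)
    return hits / len(observed)


def expand(ontology_terms: Set[str], source_terms: Iterable[str]) -> Set[str]:
    """Return a new term set that also contains every term from the knowledge source."""
    return ontology_terms | set(source_terms)


# Hypothetical toy data, not taken from the paper.
ontology = {"common cold", "cough"}
expansion_cases = ["stomachache", "common cold"]
test_cases = ["common cold", "stomachache", "dizziness", "cough"]

before = coverage(ontology, test_cases)
after = coverage(expand(ontology, expansion_cases), test_cases)
print(f"coverage before: {before:.1%}, after: {after:.1%}")
```

In this toy setting, a term that is absent from both the original ontology and the expansion cases ("dizziness") stays uncovered, which is why coverage rises but does not reach 100%.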
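The number of concepts, instances, and relations contained in an ontology determines the breadth and depth of the intelligent algorithms that the ontology can support. If an instance cannot be found in the ontology, the ontology cannot provide accurate support for processing or computing on that instance. One could, of course, use similarity computation to find the instance in the ontology that is closest to the queried one, but this approach brings in other knowledge resources (such as a synonym dictionary) or computational models (similarity models such as the vector space model), so the ontology can no longer work independently and acquires dependencies; it also lowers accuracy, because similar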