CJLIS (Traditional Chinese Medicine)

Extraction of Symptom and Medicine Information from Jin Gui Yao Lue Based on Multi-Feature Conditional Random Fields

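- Funding: 2014 research project of the Guangdong Provincial Administration of Traditional Chinese Medicine for building a province strong in Chinese medicine (20141073); Guangdong special fiscal fund (2013170). First author: YE Hui, lecturer; research interest: medical informatics. E-mail: yehui@gzucm.edu.cn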

YE Hui1, JI Dong-hong2

1. Guangzhou University of Chinese Medicine, Guangzhou, Guangdong 510016; 2. Wuhan University, Wuhan, Hubei 430007

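Abstract: Objective To study, with the help of natural language processing, a method that can effectively extract the symptom and medicine text entities contained in ancient TCM books. Methods Taking Jin Gui Yao Lue as an example, a conditional random field (CRF) algorithm was applied. The text was first segmented into words; part of speech and a key-value-pair-based TCM diagnosis marker set were then used as auxiliary features, and symptom-medicine BIO labels served as the training labels for the model. The trained model was then used to label the test-set text automatically. Results Automatic labeling with the multi-feature CRF reached a precision of 84.5%, a recall of 70.9%, and an F-measure of 77.1%. Conclusion The multi-feature model, obtained by training a CRF with part-of-speech and TCM diagnosis marker set features, effectively improves the ability of the CRF algorithm to extract entities from ancient TCM books, and the resulting model can be used to automatically extract symptom and medicine entity information from such texts.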
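The labeling scheme described in the Methods can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy sentence, the part-of-speech tags, the `markers` dictionary standing in for the key-value diagnosis marker set, the span format, and the `SYM`/`MED` tag names are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# (start_token, end_token_exclusive, entity_type); a hypothetical span format.
Span = Tuple[int, int, str]


def bio_labels(n_tokens: int, spans: List[Span]) -> List[str]:
    """Turn entity spans into one BIO label per token."""
    labels = ["O"] * n_tokens
    for start, end, kind in spans:
        labels[start] = f"B-{kind}"       # first token of the entity
        for i in range(start + 1, end):
            labels[i] = f"I-{kind}"       # remaining tokens of the entity
    return labels


def token_features(tokens: List[str], pos_tags: List[str],
                   markers: Dict[str, str], i: int) -> Dict[str, str]:
    """Word, part-of-speech and marker-set features for token i and its neighbors."""
    feats = {
        "word": tokens[i],
        "pos": pos_tags[i],
        "marker": markers.get(tokens[i], "NONE"),
    }
    if i > 0:
        feats["prev_word"] = tokens[i - 1]
        feats["prev_pos"] = pos_tags[i - 1]
    if i < len(tokens) - 1:
        feats["next_word"] = tokens[i + 1]
        feats["next_pos"] = pos_tags[i + 1]
    return feats


# Toy segmented sentence, invented for illustration (not from the paper's corpus).
tokens = ["病人", "发热", ",", "桂枝", "汤", "主之"]
pos_tags = ["n", "v", "w", "n", "n", "v"]        # hypothetical POS tags
markers = {"发热": "symptom", "桂枝": "herb"}     # hypothetical key-value marker set
spans = [(1, 2, "SYM"), (3, 5, "MED")]           # hypothetical gold annotations

labels = bio_labels(len(tokens), spans)
features = [token_features(tokens, pos_tags, markers, i)
            for i in range(len(tokens))]

for tok, lab in zip(tokens, labels):
    print(tok, lab)
```

Per-token feature dictionaries paired with a label sequence are the usual input to a linear-chain CRF trainer; the toolkit the authors used is not stated in this section.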

Keywords: conditional random fields; Jin Gui Yao Lue; symptom and medicine information extraction; ancient TCM books

CLC number: R222.3 Document code: A Article ID: 2095-5707(2016)05-0014-04

Research on Symptom and Medicine Information Extraction of TCM Book Jin Gui Yao Lue Based on Conditional Random Field

YE Hui1, JI Dong-hong2

(1. Guangzhou University of Chinese Medicine, Guangzhou, Guangdong 510006, China; 2. Wuhan University, Wuhan, Hubei 430007, China)

Abstract: Objective To find an efficient way to extract symptom and medicine information from the TCM book Jin Gui Yao Lue by applying natural language processing methods. Methods Taking Jin Gui Yao Lue as an example and using conditional random fields (CRF), the text was first segmented into words; part of speech and a key-value TCM diagnosis marker set were then used as auxiliary features, and symptom-medicine BIO labels were used as the training labels to train the model. The model was then used to label the test texts automatically. Results Automatic labeling based on the multi-feature CRF achieved a precision of 84.5%, a recall of 70.9%, and an F-measure of 77.1%. Conclusion The multi-feature model, trained with a CRF combined with part-of-speech and TCM diagnosis marker set features, effectively improves the extraction of entity information from ancient TCM books. The model can be used to automatically extract symptom and medicine entity information from ancient TCM books.
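The three reported figures can be checked against each other. The sketch below assumes the first figure (84.5%) is precision in the usual precision/recall sense and that the F-measure is the balanced harmonic mean (beta = 1); the function name `f_measure` is chosen for this example only.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision + recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Values reported in the abstract, written as fractions.
f1 = f_measure(0.845, 0.709)
print(f"F-measure: {f1 * 100:.1f}%")
```

Under these assumptions the computed value rounds to the 77.1% stated in the Results.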

Key words: conditional random fields (CRF); Jin Gui Yao Lue; symptom and medicine information extraction; ancient TCM books

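Chinese medicine has a large body of medical case records and ancient books, such as the classics Shang Han Lun and Jin Gui Yao Lue. By reading and understanding these classics, later generations can learn the classic prescriptions and treatment approaches of renowned physicians, and can even mine the medicine information in the ancient books and, through modern purification and refinement of the drugs, find specific remedies for certain diseases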
