ACTA Scientiarum Naturalium Universitatis Pekinensis
Research on Automatic Writing of Football Game News
WANG Wenchao1, LÜ Xueqiang1,†, ZHANG Kai2, ZHOU Jianshe2
1. Beijing Key Laboratory of Internet Culture and Digital Dissemination Research, Beijing Information Science and Technology University, Beijing 100101; 2. Beijing Advanced Innovation Center for Imaging Technology, Capital Normal University, Beijing 100048; † Corresponding author, E-mail: lxq@bistu.edu.cn
Abstract After analyzing the characteristics of different types of sports events, the authors propose, for the first time, an automatic writing method for football match news that uses real-time data as its data source. The real-time data are automatically annotated against historical news to obtain a training set. The annotated real-time data are then modeled with a convolutional neural network (CNN) to automatically identify the key events they contain. These events, held as structured information, are transformed into news-style natural language. Experiments show that the proposed method outperforms other methods, produces more detailed content, and can be easily extended to the automatic writing of news for other sports.
Key words automatic writing; football; sports news; real-time data
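Football is often called the world's most popular sport, with fans in every corner of the globe. As an important source of information about the game, football news usually accounts for the largest share of sports news [1]. Consequently, research on the automatic computer writing of football match reports is attracting growing attention.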
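The idea of automatic writing has a long history. With the development of big data, natural language processing, and other artificial intelligence technologies, recent years have seen growing exploration and practice in using algorithms to generate news reports automatically [2]. Because of the complexity of the Chinese language, automatic writing in Chinese is more complex than in English. In 2006, the Narrative Intelligence and Animation Generation group (NICA) at the Institute of Computing Technology, Chinese Academy of Sciences, developed PNAI (A Platform for Narrative and Animation Intelligence), an experimental platform that can generate narrative articles meeting user requirements [3]. Microsoft Research Asia publicly launched the first and second versions of the Microsoft couplet system in 2006 and 2008, respectively; given the first line of a couplet, the system automatically generates several candidate second lines [4]. In 2015, Dreamwriter, a writing robot developed by Tencent Finance, cited the August CPI data released by the National Bureau of Statistics and statistical analysts'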