Sogou, Xinhua virtual anchor heralds AI era

China Daily (USA) - WORLD INTERNET CONFERENCE - By WANG KEJU in Wuzhen, Zhejiang (wangkeju@chinadaily.com.cn)

Chinese search engine Sogou launched an AI virtual anchor, the world’s first human-replica intelligent virtual host, at the Fifth World Internet Conference in Wuzhen, East China’s Zhejiang province.

The technology simulates natural speech and expressions, integrating advanced image detection and prediction capabilities, as well as speech synthesis, to allow the virtual anchor to “broadcast” text inputs in real time.

The anchor’s appearance and voice are modeled after Zhang Zhao, a real anchor at Xinhua News Agency, an official State-run media outlet. Once a user inputs news text, a virtual Xinhua news anchor will appear on-screen.

The virtual anchor speaks in Zhang’s voice and offers a believable image of him, complete with appropriate mouth movements and natural facial expressions, meaning the virtual anchor is not much different from a real one.

According to Xinhua, “he” has become a member of its reporting team and can work 24 hours a day on its official website and various social media platforms, reducing news production costs and improving efficiency.

“Virtual assistants are rapidly gaining traction as an efficient way to solve daily problems,” said Wang Xiaochuan, CEO of Sogou. Creating a more realistic virtual character will facilitate more natural interactions and enable this technology to become an even more integral part of everyday life, said Wang.

While the company is still in the early stages of exploring potential applications for this technology, there is no doubt that Sogou will continue to push the boundaries of AI, Wang said.

Built on “Sogou avatar” technology, the AI Virtual Anchor uses such cutting-edge techniques as facial landmark localization and face reconstruction, and was developed through joint modeling and training on multimodal information.

According to Wang Yanfeng, general manager of the intelligent voice division at Sogou, “Sogou avatar” technology is one of the division’s core achievements, which follows the concept of “Nature Interaction plus Knowledge Computing”.

This form of broadcasting removes the restriction that virtual images must be created first, with the accompanying voice added later, Wang said. Using the “Sogou avatar” technology, the AI Virtual Anchor can produce synchronized video in real time.

Users can provide text in various ways, such as typing, voice input and machine translation. They then instantly obtain a real-time broadcast video. This method of newsmaking will greatly reduce post-production costs and improve efficiency, Wang said.

Since as early as 2000, researchers in both the academic and private sectors have worked to develop technology that could create a virtual anchor. This type of research has advanced quickly in recent years thanks to the evolution of AI-enabled technologies such as facial recognition, lip-reading and machine learning driven by big data analytics.

In developing its virtual anchor technology, Sogou’s team of AI researchers analyzed audio and visual data from a live anchor, allowing them to develop a model that could then produce a realistic virtual anchor.

With a focus on natural language processing and machine learning, Sogou has developed industry-leading capabilities in speech recognition and image recognition. Sogou’s speech recognition technology has an accuracy rate of over 97 percent, while its image recognition technology has achieved an accuracy rate of 96 percent.

Currently, Sogou receives 500 million voice requests each day. The engine processes these with multilingual and multitonal speech synthesis capabilities that help it to achieve personalized voice synthesis and emotional transference.

Wang said the technology has the potential to enable more natural interaction between humans and machines in a wide range of different scenarios. In addition to generating entertainment content, AI-generated characters could also be equipped with Sogou’s interactive voice operating system and utilized to deliver personalized content in the education, medical and legal fields.

Wang said he anticipates this new technology will improve social productivity and service efficiency, reduce industrial production costs, and enhance people’s experiences in science and technology.

Xinhua contributed to this story.
