The Rising Affection Towards Data Mining

In this article, we discuss five key trends that will enable wide adoption of Big Data solutions among enterprises.

PCQuest - Madhusudhan KM, Chief Technology Officer, Mindtree

Big Data adoption is on the rise. Most enterprises have either adopted Big Data solutions or are on the verge of implementing them. Big Data solutions started out as a means of low-cost storage and massively parallel computing in batch mode. Today, these solutions can perform real-time analytics using techniques such as streaming analytics, coupled with deeper data mining techniques. The following are some of the key trends we foresee in 2016.

Enterprise Data Lake – Organizations have spent decades integrating siloed data sources and continue to face challenges in deriving value from combining heterogeneous data sources (both internal and external). The inability to consistently access transactional and large historical data poses a challenge for data analysis techniques. A data lake addresses this challenge by bringing siloed data sources (structured, unstructured and semi-structured) under one umbrella (a Hadoop-like ecosystem). Data lakes offer unified data management capabilities in terms of metadata management and auditing. With data lakes gaining popularity, many public cloud providers now provide the data lake as a PaaS offering. Data security is another area where we see a rise in sophisticated new tools enabling advanced encryption and data governance mechanisms.

IoT – Big Data and analytics are an integral part of the Internet of Things. With the rising number of devices and smart sensors (in cars, buildings, cities, manufacturing plants, wearables, etc.), exabytes of data are generated every day. IoT solutions will leverage Big Data capabilities such as streaming analytics, real-time complex event processing and NoSQL databases for storing time-series data. Adoption of analytics at the edge (fog computing) is picking up to enable proximity computing. Fog computing enables local analytics to make quick real-time decisions before data is sent on to the cloud.
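
The fog computing idea can be illustrated with a minimal Python sketch. It is not taken from any particular IoT product: the EdgeNode class, its thresholds and the cloud_outbox list (a stand-in for a real upload to a cloud service) are all hypothetical names chosen for illustration. The node decides locally whether a sensor reading is anomalous and forwards only alerts and compact batch summaries, rather than every raw reading.

```python
from collections import deque
from statistics import mean


class EdgeNode:
    """Hypothetical fog node: decides locally, forwards only alerts and summaries."""

    def __init__(self, window_size=3, alert_threshold=5.0, batch_size=4):
        self.window = deque(maxlen=window_size)  # recent normal readings (baseline)
        self.alert_threshold = alert_threshold   # allowed deviation from the baseline
        self.batch_size = batch_size             # readings per summary message
        self.batch = []                          # readings awaiting summarization
        self.cloud_outbox = []                   # stand-in for an upload to the cloud

    def ingest(self, reading):
        """Process one reading at the edge; return True if a local alert fired."""
        alert = False
        # The local decision is made only once the baseline window is full.
        if len(self.window) == self.window.maxlen:
            alert = abs(reading - mean(self.window)) > self.alert_threshold
        if alert:
            self.cloud_outbox.append({"type": "alert", "value": reading})
        else:
            # Only normal readings update the baseline.
            self.window.append(reading)
        self.batch.append(reading)
        if len(self.batch) == self.batch_size:
            # Send one compact summary instead of every raw reading.
            self.cloud_outbox.append({
                "type": "summary",
                "count": len(self.batch),
                "mean": mean(self.batch),
                "max": max(self.batch),
            })
            self.batch.clear()
        return alert


node = EdgeNode()
for value in [20.0, 21.0, 20.0, 35.0, 21.0, 20.0, 22.0, 21.0]:
    if node.ingest(value):
        print("local alert for reading", value)
print(len(node.cloud_outbox), "messages queued for the cloud")
```

In this sketch, eight raw readings result in three messages for the cloud: one alert and two summaries. A real deployment would replace the simple rolling-mean check with whatever analytics the use case requires.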

Deep Learning – With the rise in the volume and diversity of data, it is becoming increasingly difficult to apply prebuilt models for machine learning. Hypothesis validation is becoming a cumbersome task. Deep learning, based on artificial neural network techniques, is being used to identify patterns and make predictions without applying prebuilt models. A large corpus of data is essential for these self-learning techniques to predict outcomes accurately. Deep learning techniques are being used for image processing, scene detection, predictive modeling, etc. Deep learning is expected to make a significant contribution to text analytics and image-to-text semantic generation.

New-age Big Data platforms – Big Data platforms leverage multiple tools and technologies for batch/micro-batch processing, real-time analytics, machine learning, graph processing, etc. This complicates the IT landscape, and the support and operation of Big Data platforms become increasingly challenging. New-age Big Data platforms such as Apache Spark are gaining popularity because they bring different types of workloads under a common platform. In-memory analytics capabilities make Spark perform better than traditional MapReduce programs. We expect enterprises to adopt new-age Big Data platforms to simplify the technology landscape.

Data Visualization – The quest to become a data-driven organization is driving enterprises to adopt data discovery and visualization tools. Apart from traditional enterprise reporting tools, organizations are expected to invest heavily in data discovery tools that enable business users to explore data freely. These tools also aid decision science by helping data scientists easily identify the feature metrics essential for machine learning techniques.

In summary, Big Data solutions open up enormous opportunities to build solutions that were never possible before.

