OpenSource For You

5. Hadoop


Social networking sites (Twitter, Facebook, LinkedIn, etc.), Web search engines (Google, Bing) and mobile devices are some of the sources of big data. Scientists, analysts and architects deal with big data on a routine basis. Much of this data contains extensive, undiscovered relationships that do not fit into traditional relational models.

Apache Hadoop is a software framework inspired by Google's MapReduce and Google File System (GFS) papers. Hadoop MapReduce is a programming model for writing applications that rapidly process huge amounts of data in parallel, on large clusters of compute instances.
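To make the MapReduce programming model concrete, here is a minimal single-process sketch of a word count in plain Python. It does not use Hadoop or any Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and chosen only for illustration. On a real cluster, the framework would run many map and reduce tasks in parallel and handle the grouping step itself.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key (done by the framework between phases)."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Collapse all values for one key into a single total."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big clusters"]))
```

Because each call to map_phase depends only on its own line, and each call to reduce_phase only on its own key, both phases can in principle be spread across many machines, which is what lets the model scale to large clusters.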

Hadoop can be used to analyse and process a variety of data to extract significant business operations intelligence, which remained hidden earlier. In normal scenarios, data moves to the computation node and is then processed; in Hadoop, data is processed where it resides. The types of questions Hadoop helps answer include event analytics, large-scale Web click-stream analytics, revenue assurance and price optimisation, financial risk management and affinity engines.
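The idea of processing data where it resides can be illustrated with a toy task-placement rule. The sketch below is not Hadoop's actual scheduler; pick_node, the node names and the slot counts are all hypothetical. It only shows the preference: run the task on a node that already stores the data block, and fall back to a remote node only when no such node has capacity.

```python
from typing import Dict, Optional, Set


def pick_node(block_replicas: Set[str], free_slots: Dict[str, int]) -> Optional[str]:
    """Choose a node for a task that reads one data block.

    block_replicas: names of nodes that store a copy of the block.
    free_slots: number of free task slots per node.
    """
    # Prefer a node that already holds the data, so nothing is copied.
    local = sorted(n for n in block_replicas if free_slots.get(n, 0) > 0)
    if local:
        return local[0]
    # Otherwise accept any node with capacity; the block must then travel.
    remote = sorted(n for n, slots in free_slots.items() if slots > 0)
    return remote[0] if remote else None


if __name__ == "__main__":
    slots = {"node-a": 1, "node-b": 1, "node-c": 0}
    print(pick_node({"node-b", "node-c"}, slots))
```

In this example the block lives on node-b and node-c; node-c has no free slot, so the task goes to node-b rather than to node-a, which is free but holds no copy of the data.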

Amazon Elastic MapReduce is a Web service in the category of a public cloud. It enables researchers, analysts, developers and organisations to process vast amounts of data easily and cost-effectively. It utilises a hosted Hadoop framework running on the elastic infrastructure of Amazon Elastic Compute Cloud (Amazon EC2) and Amazon S3, and pre-configured EC2 instances (slave nodes) to distribute the MapReduce process.
