5. Hadoop
Social networks (Twitter, Facebook, LinkedIn, etc.), Web search engines (Google, Bing) and mobile devices are some of the sources of big data. Scientists, analysts and architects deal with big data on a routine basis. Much of this data contains extensive, undiscovered relationships that do not fit into traditional relational models.
Apache Hadoop is a software framework inspired by Google's MapReduce and Google File System (GFS) papers. Hadoop MapReduce is a programming model for writing applications that rapidly process huge amounts of data in parallel on large clusters of compute instances.
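The MapReduce model can be illustrated with a word count, the customary introductory example. The sketch below is plain Python running in a single process, not Hadoop code; the function names (map_phase, shuffle, reduce_phase, word_count) are chosen here for illustration and are not part of any Hadoop API. On a real cluster, the map and reduce steps would run in parallel on many nodes, and the framework would perform the grouping step between them.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group emitted values by key (done by the framework on a cluster)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over an iterable of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    sample = ["big data big clusters", "data moves to compute"]
    print(word_count(sample))
```

Because each call to map_phase depends only on its own line, and each call to reduce_phase only on its own key, both steps can be spread across machines without coordination; that independence is what allows the model to scale to large clusters.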
Hadoop can be used to analyse and process a variety of data to extract significant business and operations intelligence that previously remained hidden. In conventional systems, data is moved to the computation node before it is processed; in Hadoop, the computation runs where the data resides. Typical problems Hadoop helps with include event analytics, large-scale Web click-stream analytics, revenue assurance and price optimisation, financial risk management and affinity engines.
Amazon Elastic MapReduce is a Web service in the public cloud category. It enables researchers, analysts, developers and organisations to process vast amounts of data easily and cost-effectively. It utilises a hosted Hadoop framework running on the elastic infrastructure of Amazon Elastic Compute Cloud (Amazon EC2) and Amazon S3, and uses pre-configured EC2 instances (slave nodes) to distribute the MapReduce process.