Taming the Elephant
It is now time for cloud and big data to come together
‘Big data’ refers to data sets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze. Because technology advances over time, the size of data sets that qualify as big data will also increase. The threshold also varies by sector, depending on the volume and velocity of the data sets, and can range from a few dozen terabytes to multiple petabytes.
The ability to store, aggregate, and combine data and then use the results to perform deep analysis has become even more accessible. The discussion on big data is at a preliminary stage with CIOs and IT managers today, but the tempo is picking up quickly.
The reason is that CIOs are faced with an enormous number of important data sets and are starting to mine them for insights that deliver business value. Also, most enterprises are contemplating virtualizing and moving to the cloud at some point. All this requires a predetermined plan for moving large data assets, securing them, and driving analytics.
Verticals that have traditionally been heavily consumer-centric and IT-savvy, such as retail and BFSI, and others that process significant amounts of information, such as oil & gas and healthcare, are turning to their existing data for business insight.
Increase in Demand
Another factor that will fuel the need for big data is the ever-increasing growth of digital data. A recent IDC-EMC study determined that India generated nearly 40,000 petabytes of data in 2010. It estimated that India’s share of digital information will grow 60 times by 2020, driven by the rollout of 3G/BWA networks, digitization of television networks, and increased technology adoption among individuals, SMBs, enterprises, and government services such as the unique ID project and the census, among others.
Big data analysis can help retail enterprises by improving the speed and scalability of a massively parallel data warehouse in a cost-effective way.
Many government functions already have a recognizable ‘cloud plus big data’ element, with more coming along all the time. Another exciting prospect is the National Intelligence Grid (Natgrid). When it becomes operational in India, it will integrate the existing 21 databases with central and state government agencies and other organizations in the public and private sectors, such as banks, insurance companies, stock exchanges, airlines, railways, telecom service providers, and chemical vendors.
Healthcare delivery has great potential to harness enormous volumes of patient records and provide evidence-driven recommendations that not only change recommended protocols but also shape overall healthcare policy. Before long, healthcare will undoubtedly be a ‘cloud plus big data’ industry.
The most important driver for big data is cloud computing. In one sense, the move to cloud (whether private, public, or hybrid) sets the stage for big data. Various industry experts suggest that now is the time for cloud to meet big data. Cloud computing makes big data possible by providing an elastic pool of resources to handle its massive scale. With cloud computing, IT resources are used more efficiently and IT teams become more productive, freeing up resources to invest in big data.