Large Scale Analytics in the Enterprise
August 9, 2012 No CommentsSOURCE: Think Big Analytics
The growth of Internet businesses led to a whole new scale of data processing
challenges. Companies like Google, Facebook, Yahoo, Twitter, and Quantcast now
routinely collect and process hundreds to thousands of terabytes of data on a daily basis.
This represents a significant change in the volume of data which can be processed, a
major reduction in processing time required, and of the cost required to store data. The
most important of the techniques used at these companies is storing data in a cluster of
servers and using a distributed data processing technique Google invented, called
MapReduce. Facebook, Yahoo, Twitter, and Quantcast all process data with an open
source technology implementation of MapReduce called Hadoop.