How Big Data Helps Ancestry.com Map People, Places and Time
June 13, 2012
Online genealogy service Ancestry.com is trying to become the Amazon or Netflix of family trees. Just as those companies use customer data to recommend products or movies customers might like, Ancestry.com wants to feed its users relevant historical records and other information on their ancestors without making them search through its database. And it’s taking in everything from newspaper clippings to your DNA to make this happen.
If you’ve used Ancestry.com recently, you’re probably thankful for its efforts. According to Head of Engineering Scott Sorenson, Ancestry.com has more than 10 billion records that are part of a 4-petabyte (or 4-million-gigabyte) data store. If you’re searching for “John Smith,” he explained, the service probably has about 60 million records for “Smith” and about 4 million for “John Smith,” but you’re only interested in the relative handful that are relevant to your John Smith.