SE News: Yahoo! Search Webmap (Yahoo! Developer Network blog)
Posted on February 22nd, 2008 in SE News
Just stopping by with a quick news update first, this time about the Yahoo! Search Webmap.
The Webmap build starts with every Web page crawled by Yahoo! and produces a database of all known Web pages and sites on the internet and a vast array of data about every page and site. This derived data feeds the Machine Learned Ranking algorithms at the heart of Yahoo! Search.
Some Webmap size data:
* Number of links between pages in the index: roughly 1 trillion links
* Size of output: over 300 TB, compressed!
* Number of cores used to run a single Map-Reduce job: over 10,000
* Raw disk used in the production cluster: over 5 Petabytes
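To give a rough feel for what a Map-Reduce job over a link graph looks like, here is a tiny single-machine sketch in Python that counts inbound links per page. It is only an illustration of the map and reduce steps, not Yahoo!'s Webmap code and not the Hadoop API; the function names and the `toy_crawl` data are made up for this example.

```python
from collections import defaultdict


def map_links(page_url, outlinks):
    """Map step: emit one (target, 1) pair for every link found on a page."""
    for target in outlinks:
        yield target, 1


def reduce_counts(pairs):
    """Shuffle + reduce step: group pairs by target and sum the counts."""
    totals = defaultdict(int)
    for target, count in pairs:
        totals[target] += count
    return dict(totals)


def count_inbound_links(crawl):
    """Run the map step over every crawled page, then reduce the results."""
    pairs = (
        pair
        for url, outlinks in crawl.items()
        for pair in map_links(url, outlinks)
    )
    return reduce_counts(pairs)


# Hypothetical toy crawl: page -> list of pages it links to.
toy_crawl = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
}

inbound = count_inbound_links(toy_crawl)
print(inbound)
```

On a real cluster the map and reduce steps run in parallel on many machines and the grouping by key happens across the network, which is where the thousands of cores mentioned above come in.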
Source: Hadoop running in production on the Yahoo! Search Webmap (Yahoo! Developer Network blog)

