From Kurzweil’s “A free database of the entire Web may spawn the next Google”:
“A nonprofit called Common Crawl is now using its own Web crawler and making a giant copy of the Web that it makes accessible to anyone.
The organization offers up over five billion Web pages, available for free so that researchers and entrepreneurs can try things otherwise possible only for those with access to resources on the scale of Google’s, MIT Technology Review reports.
Common Crawl has so far indexed more than five billion pages, adding up to 81 terabytes of data, made available through Amazon’s cloud computing service…”
