Common Crawl – A Free Database of the Entire Web May Spawn the Next Google…01.24.13

24 01 2013


From Kurzweill’s A free database of the entire Web may spawn the next Google:

“A nonprofit called Common Crawl is now using its own Web crawler and making a giant copy of the Web that it makes accessible to anyone.

The organization offers up over five billion Web pages, available for free so that researchers and entrepreneurs can try things otherwise possible only for those with access to resources on the scale of Google’s, MIT Technology Review reports.

Common Crawl has so far indexed more than five billion pages, adding up to 81 terabytes of data, made available through Amazon’s cloud computing service…”



Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s


Get every new post delivered to your Inbox.

Join 717 other followers

%d bloggers like this: