Common Crawl – A Free Database of the Entire Web May Spawn the Next Google…01.24.13

24 01 2013


From Kurzweil’s “A free database of the entire Web may spawn the next Google”:

“A nonprofit called Common Crawl is now using its own Web crawler and making a giant copy of the Web that it makes accessible to anyone.

The organization offers up over five billion Web pages, available for free so that researchers and entrepreneurs can try things otherwise possible only for those with access to resources on the scale of Google’s, MIT Technology Review reports.

Common Crawl has so far indexed more than five billion pages, adding up to 81 terabytes of data, made available through Amazon’s cloud computing service…”


One response

25 01 2013
Marlon

Thank you. I’ve been searching for information on this subject for ages, and yours is the best I have come across so far. But what about the conclusion? Are you certain about the source?
