Here is an excerpt from the Voice of America report “US Library of Congress’ Digital Collection One of World’s Largest”:
“…So far, the library has a total of 700 terabytes of data. But because of copyright issues, only 200 of those are available on the Web.
“A terabyte is about 1,600 CDs or about 330 hours of TV or about 2,000 books and we have about 500 terabytes that we keep in our long term preservation systems,” she adds.
At the Library of Congress, the numbers can be mind-boggling. Experts estimate they have more than 120 million books, 36,000 feature films, hundreds of thousands of music sheets and recordings, and the large collections of manuscripts, Web sites, posters and photography. Yet only one percent of it has been digitized.
Thomas Youkel is the senior systems engineer. ‘We have a scan lab here that scans anywhere from four to six million items a year,’ … While workers continue scanning and digitizing millions of items, they keep an eye on a migration plan, to move from obsolete technology to new technology – a never ending process.”
