
Tuesday, January 08, 2013

A Mountain of Tweets at the Library of Congress

The Library of Congress just put out a white paper on the status of its Twitter archive, which was started in 2010 with tweets from 2006-10 and continues with a streaming operation set up for tweets from 2010 to the present. We're talking 170 billion tweets so far, with a growth rate of 140 million tweets harvested per day.


While they have received over 400 serious research requests, they are not yet ready to provide research access to the archive. Explains the paper: "Currently, executing a single search of just the fixed 2006-2010 archive on the Library’s systems could take 24 hours. This is an inadequate situation in which to begin offering access to researchers, as it so severely limits the number of possible searches." It's no easy problem to solve, either, as it will take an extensive overhaul of their server infrastructure, which is cost-prohibitive for a public institution such as theirs. In the meantime, they are developing a "basic level of access that can be implemented while archival access technologies catch up," which doesn't tell us a whole lot, but it will be interesting to follow for sure.


Friday, December 17, 2010

Chronicling America: Historic American Newspapers

Remember, however meagerly we go about our daily lives, the Library of Congress is relentlessly building its digital tower of historical newspapers. Just this past week it uploaded a new batch of titles into Chronicling America, bringing its current total to 3.1 million pages from 414 newspapers in 23 states, published between 1899 and 1922. Chronicling America newspapers can be searched by word or phrase; or, for a broader, more regional approach, one can search by one or more states. Of course, this being the Library of Congress, the database is an open web resource, available to all, no UPenn status required.

Chronicling America is sponsored jointly by the National Endowment for the Humanities and the Library of Congress as part of the National Digital Newspaper Program (NDNP).


Tuesday, April 10, 2007

Update on Historical Newspapers

The horizon for historical newspapers just keeps expanding. Here are a couple of initiatives you may not know about:

From the Library of Congress:
Chronicling America: Historic American Newspapers is a prototype website providing access to information about historic newspapers and select digitized newspaper pages, produced by the National Digital Newspaper Program (NDNP). NDNP, a partnership between the National Endowment for the Humanities (NEH) and the Library of Congress, is a long-term effort to develop an Internet-based, searchable database of U.S. newspapers with descriptive information and select digitization of historic pages. Supported by NEH, this rich digital resource will be developed and permanently maintained at the Library of Congress. An NEH award program will fund the contribution of content from, eventually, all U.S. states and territories. You can already do a lot at this beta site. You can search and read newspaper pages from 1900 to 1910 and find information about American newspapers published from 1690 to the present (publisher, years of publication, and a list of places with holdings).

Another initiative was actually started way back in 1999 by Cold North Wind, Inc., a privately held company based in Ottawa, which began using best-of-breed technology to turn newspaper archives on microfilm into high-resolution, searchable, digital images on the Internet. The company partnered with the National Newspaper Association to create an online search engine that accesses the digital archives of America's community newspapers, beginning with 3600 NNA member newspapers. The initiative became known as the Cold Wind Newspaper Archive Project and was launched with great fanfare at the National Newspaper Association's annual convention in 2002. Cold North Wind has also joined forces with PaperofRecord.com, which hosts The Toronto Star. The Star claims to have been the first newspaper in the world to have its entire history, from 1892 to the present, digitized for all on the Internet to see and search... and pay for, because it's not free. Cold North Wind has also embarked on an ambitious project to digitize the archives of thousands of newspapers from around the world, some dating back several hundred years. The oldest newspaper in the project to date is a paper from Spain dating back to 1692.

Of course, don't forget about Penn's ever-growing historical newspaper resources (right now PaperofRecord is not one of them). You may want to tack up this handy table on your wall (virtual or concrete).

