[Genealib] RE: Newspaper indexing

Donna Jo Atwood datwood at olatheks.org
Mon Jul 3 14:07:40 EDT 2006


Our local paper used to always provide us with a microfilm copy of the
papers for the month.  They did not keep any back issues on their premises
and sent everyone to us.  About three years ago, they abruptly stopped doing
that, but said they would provide us with CDROM instead.  We didn't think
that would be as useful to our library, so they didn't do that either.  We
now have nothing for the last three years, since we haven't the storage
space for more than three months of the hard copy.  And the paper still
sends their reporters over to us to find material from those three years.
We are working to resolve this, but things are still up in the air.

Donna Jo Atwood
Reference Librarian 
Olathe (KS) Public Library
-----Original Message-----
From: genealib-bounces at mailman.acomp.usf.edu
[mailto:genealib-bounces at mailman.acomp.usf.edu] On Behalf Of Larry Naukam
Sent: Monday, July 03, 2006 12:45 PM
To: 'Librarians Serving Genealogists'
Subject: [Genealib] RE: Newspaper indexing

Karen Miller wrote: Our local paper takes a dim view of digitization or even
presenting full text of ANYTHING on line

Amen! I recently had a chance to talk over the phone with the local Gannett
newspaper librarian, and I came away with the idea that with as many holes
as our clipping file has, it's far better than what they have themselves.
That is: when the afternoon paper (now out of business) moved to the main
building in the 1980s, THEY THREW OUT THE NEWSPAPER LIBRARY!!!!!

Back beyond 1970, they only have limited clips. There is virtually nothing
before 1920, and so on. What we do have, and what is our next digitization
project, is a WPA-produced, 500,000-item 19th-century newspaper index.
Goodness knows it's not perfect, but it is incredibly useful. We looked at
digitizing the underlying papers, and Cornell DCAPS came up with the
estimate that it would take 7 petabytes (yes, petabytes - that's 7 million
gigabytes) of storage just for the original scans, let alone any OCR'd
items.
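
For anyone who wants to check the arithmetic behind that figure, here is a
minimal sketch of the unit conversion. It assumes decimal (SI) units, which
is what "7 million gigabytes" implies. The 500 GB drive size is a
hypothetical number chosen only for illustration; it is not part of the
Cornell DCAPS estimate.

```python
# Decimal (SI) storage units, matching the "7 million gigabytes" figure.
BYTES_PER_GB = 10**9
BYTES_PER_PB = 10**15


def petabytes_to_gigabytes(petabytes):
    """Convert whole petabytes to whole gigabytes (decimal units)."""
    return petabytes * BYTES_PER_PB // BYTES_PER_GB


def drives_needed(petabytes, drive_gb):
    """Number of drives of a given size needed to hold the data.

    drive_gb is a hypothetical drive capacity in gigabytes; the result is
    rounded up because a partly filled drive is still a drive.
    """
    total_gb = petabytes_to_gigabytes(petabytes)
    return -(-total_gb // drive_gb)  # ceiling division on integers


if __name__ == "__main__":
    print(petabytes_to_gigabytes(7))   # 7000000
    print(drives_needed(7, 500))       # 14000
```

That count covers a single copy only. Every backup copy multiplies it again.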

Some things that putative digitizers sometimes leave out of the equation are
not only what media are you going to store it on, but how often are you
going to refresh it? Where will you keep the backups? How will you know that
they are being made? If you OCR it, what is an acceptable error rate
(besides zero percent)?
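
One common way to answer "how will you know that they are being made?" is a
fixity check: record a checksum for every master scan, then confirm that each
backup copy still matches. The sketch below is only an illustration under
stated assumptions, not a description of any library's actual workflow. The
function names and the directory layout (a master folder and a backup folder
with the same relative paths) are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(master_dir):
    """Map each file's path (relative to master_dir) to its checksum."""
    root = Path(master_dir)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_backup(manifest, backup_dir):
    """List (relative path, problem) pairs for missing or altered files."""
    root = Path(backup_dir)
    problems = []
    for rel_path, expected in manifest.items():
        candidate = root / rel_path
        if not candidate.is_file():
            problems.append((rel_path, "missing"))
        elif sha256_of(candidate) != expected:
            problems.append((rel_path, "checksum mismatch"))
    return problems
```

An empty list from verify_backup means every file in the manifest is present
in the backup and unchanged. Running the check on a schedule, and after every
media refresh, is what turns "we think we have backups" into something you
can demonstrate.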


_______________________________________________
genealib mailing list
genealib at mailman.acomp.usf.edu
http://mailman.acomp.usf.edu/mailman/listinfo/genealib