Newspaper Research Tip – Change the Letters

If you do newspaper research online as part of your genealogy and family history pursuits, then you have certainly been puzzled by some of the search results (or lack thereof) that you have received.

Creating newspaper images and applying the OCR (optical character recognition) process does not always produce what you might expect.

There is a simple explanation for this, and it all has to do with quality:

  • Quality of the original material – was the newspaper old and brittle when scanned?  Was it yellowed?  Did it have dirt on it or lots of ink spots?
  • Quality of the scan source – was the digital image (and the index built from it) scanned from the original paper, from a microfilm of the paper, or, worse, from a copy of the microfilm?  Every additional copy or scan degrades the image, so the index suffers when the OCR process is applied.
  • Quality of the OCR software – some programs are better than others.
  • Quality of the writing in the original newspaper.  Did the author get your ancestor’s name spelled correctly?
  • Quality of the typesetter – did the typesetter get every word from the author set up correctly?

Thus what you are searching is not a perfect digital database that represents what was originally written by the author and newspaper publisher.

What can we do about it?  There are lots of things to try; this article deals with changing the letters in your search criteria.  For example, if the surname you are searching for is “Wilson” and the letter “n” is often picked up as the letter “m”, why not search for “Wilsom”?

I guarantee that changing your search criteria will lead to an improvement of at least 5 to 10% in search results.  I heard from one reader that changing letter pairs got them a 20% improvement.

So which letter pairs are often confused? Here are a few of them:

  • rn and m  (ar en and em)
  • h and b
  • Capital D and O
  • i, l, 1, /, !, and I are all often interchanged
  • 0 and O
  • c and e
  • r and n
  • [, ] and l (el)
  • nl and m  (en el and em)
  • Capital R and B
  • n and ri  (en and ar eye)
  • v and y
  • Capital S and 8
  • Capital S and 5
  • Capital Z and 2
  • Capital G and 6
  • Capital B and 8
  • Capital K and |<

My suggestion?  Change your search criteria: swap letters in the string you are looking for with these alternative letters and letter pairs, and see what happens.  You might be pleasantly surprised!
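If you are comfortable with a little scripting, the swapping can be automated. The Python sketch below is only an illustration and is not tied to any particular newspaper site. The function name `search_variants` and the two tables are my own hypothetical names. The tables hold a subset of the pairs listed above, plus the n/m swap from the “Wilson” example. Each variant changes one spot in the name, and swaps are tried in both directions, which is an assumption on my part about how you would want to search.

```python
# Hypothetical table: a subset of the confusion pairs listed in the article,
# plus ("n", "m") from the "Wilson" example.
CONFUSION_PAIRS = [
    ("n", "m"), ("rn", "m"), ("nl", "m"), ("n", "ri"), ("r", "n"),
    ("h", "b"), ("c", "e"), ("v", "y"),
    ("D", "O"), ("0", "O"), ("R", "B"),
    ("S", "8"), ("S", "5"), ("Z", "2"), ("G", "6"), ("B", "8"),
]

# Characters that are all interchanged with one another.
# ("/" and "!" from the list are left out here to keep the output short.)
CONFUSION_GROUPS = [["i", "l", "1", "I"]]


def search_variants(term):
    """Return alternative spellings of term, one substitution per variant."""
    # Build the full set of (old, new) swaps, in both directions.
    swaps = set()
    for a, b in CONFUSION_PAIRS:
        swaps.add((a, b))
        swaps.add((b, a))
    for group in CONFUSION_GROUPS:
        for a in group:
            for b in group:
                if a != b:
                    swaps.add((a, b))

    variants = set()
    for old, new in swaps:
        # Replace each occurrence of `old` separately, one at a time.
        start = term.find(old)
        while start != -1:
            variants.add(term[:start] + new + term[start + len(old):])
            start = term.find(old, start + 1)

    variants.discard(term)  # never return the original spelling
    return sorted(variants)


if __name__ == "__main__":
    for variant in search_variants("Wilson"):
        print(variant)
```

Run it and paste each printed spelling into the search box of the site you are using. Limiting each variant to a single change keeps the list manageable; combining several swaps at once grows the list very quickly.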

______________________________________________________

Thank you for visiting The Ancestor Hunt!

______________________________________________________


Join the Conversation

12 Comments

  1. What a great tip, as are the others in the linked post! Thank you so much for sharing!

  2. Very good tip – I think people simply don’t realize how challenging it is for digitizing the old newspapers. I’ve got a good list in my head – now I’ll get it down on paper/computer so I don’t forget it for students when I’m teaching beginners in genealogy. Thanks Kenneth.

  3. Very useful tip – I changed the first letter from a capital G to a capital O and got another 45 entries in one newspaper.

  4. One of the surnames I research is Pagel – and I get hundreds of hits for “Page 1” listed in my results. I occasionally see interactive sites that allow you to edit the OCR in a side column for the next person. Wish more would do that.

    1. I agree Denise. The California Collection that I use all the time does that. It really helps.

  5. This is also useful in digitized books such as those on FamilySearch, HathiTrust and others. Another thing to think about when making searches like this is how wild a guess you can make. M for N is easy, but also try “HI” for M, or “ine” for “the” when the scan missed the top of the “h” and part of the “t”.
