If you’ve read any prior articles or tips on online newspaper research, then you know that the quality of the scanned newspaper and of the OCR process dictates the quality of the search index. You need to realize that you are searching for combinations of letters, not words. When you use those letter combinations as your search criteria, you are essentially trying to match them against that index.
Here’s a news flash – the search index for older newspapers, especially, may not be very good.
Here’s an example of a 115-year-old newspaper, with the original first, followed by the search index:
There are words missing and several of the words are “misspelled.” It would be charitable to call this search index a “50% correct” representation of the original.
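If you want to put a rough number on a “percent correct” figure like that, here is a minimal Python sketch. It assumes you have typed out a passage from the original and copied the matching OCR text; the two sample strings are made up for illustration and do not come from any real newspaper, and `word_accuracy` is a hypothetical helper name, not part of any newspaper site.

```python
from difflib import SequenceMatcher


def word_accuracy(original: str, ocr_text: str) -> float:
    """Fraction of the original's words that survive, in order, in the OCR text."""
    original_words = original.lower().split()
    ocr_words = ocr_text.lower().split()
    if not original_words:
        return 0.0
    # Compare word lists (not characters) and count the words that line up.
    matcher = SequenceMatcher(None, original_words, ocr_words, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(original_words)


# Made-up sample text (an assumption for illustration only).
original = "john murphy of this city died tuesday at his home"
ocr_text = "john murpby of tbis city dled tuesday at bis home"

print(f"Words indexed correctly: {word_accuracy(original, ocr_text):.0%}")
```

Notice that in this made-up sample the surname is one of the words that did not survive, which is exactly the kind of miss that makes a surname-only search come up empty.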
What many newspaper researchers forget, and why they get frustrated, is that they think they are searching against an index that is an EXACT (or near-exact) replica of the original newspaper article. Many researchers quit or get discouraged searching newspapers for three main reasons:
- They are so convinced that the search index is an exact or near-exact replica (say, 90% of words represented correctly) that when the index captures only 50% or less of the words in the original, they can’t handle the difference. It becomes “too hard to deal with.”
- The search criteria they create are too simple, such as just a person’s surname. They may get too many results for a common surname, or few or none for a more complex surname. They don’t add a date range, a first name, or other distinguishing words that would help. In this case, a lack of training or a lack of desire to learn results in failure.
- The information they are looking for is not there because the newspaper collection does not cover the relevant dates. For example, if you are looking for an event in a person’s life that happened between 1915 and 1935, and the online collection has no newspapers for that date range, guess what the search results will be? Not very many, if any.
Let’s look at this from an emotional or attitudinal perspective. Some of these emotions or attitudes are as follows:
- This stuff is in a database so it should be easy to find.
- Why should I spend time learning how to search newspapers – it can’t be that hard.
- Everything is online, isn’t it? So how come I can’t find anything?
- Why can’t I just put the name of the person or event in the box and have the system give me the results I want?
Here’s the deal and you may not like what I am going to say, but here goes:
If you can’t find what you are looking for, then ask yourself these questions: “Have I really tried to learn about successful search techniques, or am I just winging it? Do I have unrealistic expectations of the software vendor? Am I searching really old newspapers where the quality is most assuredly sub-optimal? Have I really tried to overcome the likely less-than-optimal search index by trying the many search tips that are available for me to learn?”
You see the problem is more than likely ATTITUDE. To be successful as a newspaper researcher, you must be DETERMINED. You must LEARN and apply what you have learned to create better search criteria.
You know why this ATTITUDE is important? Because more than likely the SEARCH INDEX is sub-optimal, meaning it likely does not match your expectations. And that, more than likely, is NOT the software vendor’s fault.
So, what do you do? Get determined to OUTSMART the search index. Read about different search techniques and tips and view tutorials. Apply what you have learned with vigor. Then and only then will you have as much success as you can get when searching old newspapers.
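One way to outsmart the index is to search for the misspellings the OCR is likely to have produced, not just the correct spelling. The Python sketch below builds a list of alternate spellings by swapping one look-alike letter at a time. The `CONFUSIONS` table and the `ocr_variants` name are assumptions made up for illustration; they are not a measured list of OCR errors and not a feature of any newspaper site, so adjust the pairs to the mistakes you actually see in your own results.

```python
# Illustrative look-alike letter pairs (an assumption, not a measured list).
CONFUSIONS = {"h": ["b"], "i": ["l"], "e": ["c"], "m": ["rn"], "u": ["n"]}


def ocr_variants(term: str) -> list[str]:
    """Return the term plus every spelling with exactly one look-alike swap."""
    term = term.lower()
    variants = {term}
    for pos, char in enumerate(term):
        for replacement in CONFUSIONS.get(char, []):
            # Swap only the character at this position, keep the rest intact.
            variants.add(term[:pos] + replacement + term[pos + 1:])
    return sorted(variants)


for spelling in ocr_variants("Murphy"):
    print(spelling)
```

Try each spelling as its own search, ideally combined with a first name or a date range so the extra searches don’t bury you in results.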
Researching old newspapers online is a battle of wits. Positive results are found when you outwit the index.
6 replies on “Want Better Newspaper Search Results? Get an Attitude”
Excellent point. I would say the same goes for research in censuses. My great grandfather, Einer, was listed as a female named Ida in a census as a child. A distant cousin could never find him in a census because of this understandable error: a family member with a thick Danish accent was providing info to a census worker who likely had a different language background. Thankfully I knew where he lived at the time and the names/ages of the other family members.
This was my problem when I first started researching censuses for my mother’s father’s grandparents, who emigrated from Germany/Prussia in 1868. I had been given the original spelling of the surname, and thought that’s how it was always spelled. With some help, I discovered a “variation”, and have since opened my mind to other possibilities. In some instances, knowing additional family members in the household helped! However, I still have great difficulty researching old newspapers, more because the areas where my ancestors resided didn’t seem to have newspapers in print during critical years…
It doesn’t help that none (to my knowledge) of the newspaper sites use fuzzy searching with their indexes. People have become very used to the quality of Google’s search algorithm, which matches things like misspellings and variants. I don’t know why none of them have invested in expanding that technology, or licensed it from those who do know it.
Just wanted you to know that I shared your great article in my “Friday finds” segment today 🙂 http://martinroe.com/…/06/16/friday-finds-week-24-2017/
Thank you Martin!
I have found at least 20 spelling variations of Murphy. E for U. F for PH. EY, IE or EE for Y. Add an extra R and make it Murphry to completely confuse a searcher.