Why Use Proximity Searches When Searching Historical Newspapers Online

When researching historical newspapers online, we can count on a few basic search capabilities that most of the underlying search engines share.

For example, one basic search that all of the search engines offer is to enter one or more words in the search box. When you enter multiple words without a search operator such as “AND” or “OR”, the engine generally assumes an “AND”. So if you are searching for someone named John Sousa, you can include the “AND” or leave it out, and either way the results will be all the pages that contain both the word “John” and the word “Sousa”.

Continuing our example, if you are looking for someone with exactly that name, you could enclose the two words in quotes. Searching for “John Sousa” returns only the pages where those two words appear next to each other.

Again with our example, the “AND” search would also return pages that contain the name John Philip Sousa. But with the quoted search, an occurrence of John Philip Sousa would not count as a match.
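To make the difference concrete, here is a small Python sketch of the two behaviors. It is only an illustration of the idea, not how any particular newspaper site is implemented; the function names `matches_and` and `matches_phrase` and the sample sentence are made up for this example.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into words, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())

def matches_and(text, terms):
    """True if every search term appears somewhere in the text (implied AND)."""
    words = set(tokenize(text))
    return all(term.lower() in words for term in terms)

def matches_phrase(text, phrase):
    """True if the words of the phrase appear next to each other, in order."""
    words = tokenize(text)
    target = tokenize(phrase)
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))

# Hypothetical page text, invented for this example.
page = "The band was led by John Philip Sousa at the fair."

print(matches_and(page, ["John", "Sousa"]))  # True: both words are on the page
print(matches_phrase(page, "John Sousa"))    # False: "Philip" sits between them
```

The quoted search misses this page only because of the middle name, which is the gap a proximity search is meant to fill.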

Now let’s discuss the basics of proximity searching. Simply put, proximity searching lets you search for two words within a specified number of words of one another. Let’s say your target person is named “John [middle name] Sousa”, and you don’t know the middle name. If, like John Philip Sousa, he used his middle name regularly, how would you find all the results? If you use quotes like “John Sousa” you won’t find him, and if you don’t use quotes you may get pages that merely have both John AND Sousa somewhere on the page.

The ideal way is to use a proximity search: you would search for John and Sousa WITHIN one word of each other. This would return results where a middle name or a middle initial is used.
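As a rough sketch of what a proximity search does behind the scenes, the Python below checks whether two words occur with at most a given number of words between them. The function name `matches_within` and the parameter `max_between` are invented for this example, and counting the words in between is an assumption: real search engines may count the distance differently, and some care about word order while this sketch does not.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into words, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())

def matches_within(text, first, second, max_between):
    """True if `first` and `second` occur with at most `max_between`
    other words between them, in either order (an assumed definition)."""
    words = tokenize(text)
    first_pos = [i for i, w in enumerate(words) if w == first.lower()]
    second_pos = [i for i, w in enumerate(words) if w == second.lower()]
    return any(
        abs(i - j) - 1 <= max_between
        for i in first_pos
        for j in second_pos
        if i != j
    )

print(matches_within("John Philip Sousa conducted.", "John", "Sousa", 1))  # True
print(matches_within("John P. Sousa conducted.", "John", "Sousa", 1))      # True
print(matches_within("John went to hear the Sousa band.", "John", "Sousa", 1))  # False
```

The last line shows why this is tighter than a plain “AND” search: both words are present, but they are too far apart to be a name.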

Obituaries are good candidates for proximity searches, since fathers, brothers, uncles, married female relatives, etc. may be named in the obit. So a proximity search for two names within 5 or 10 words of one another might give you some interesting results.

Another example would be an event such as the “Montgomery Bus Boycott”, where the word “bus” may not be used in all articles. A proximity search for “Montgomery” and “boycott” would catch those articles too.

Now, the good news is that several online newspaper collections provide proximity searching, although many do not. Chronicling America does, but you are restricted to distances of 5, 10, 50, or 100 words. Sites that use the Chronicling America search engine, such as South Carolina’s, have the same feature.

Texas and Oklahoma’s search engines allow you to perform proximity searching within a selection of 1, 2, 3, 4… up to 25.

Utah’s allows 1, 2, 3, 4, 5, or 10. New York’s NYS Historical Newspapers site only allows 5 or 10.
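Since each site offers a different menu of distances, it can help to keep them in one place. The Python sketch below records the choices listed above and picks the smallest available distance that is at least as wide as the one you want, so that nothing you are looking for is excluded. The dictionary keys are just labels chosen for this example, `pick_distance` is a made-up helper, and the values reflect only what is described in this post, so check each site before relying on them.

```python
# Distances each site offers, as described in this post (may change over time).
ALLOWED_DISTANCES = {
    "Chronicling America": [5, 10, 50, 100],
    "Texas": list(range(1, 26)),      # 1 through 25
    "Oklahoma": list(range(1, 26)),   # 1 through 25
    "Utah": [1, 2, 3, 4, 5, 10],
    "NYS Historical Newspapers": [5, 10],
}

def pick_distance(site, wanted):
    """Return the smallest offered distance that is >= `wanted`.
    If the site offers nothing that wide, return its largest option."""
    options = sorted(ALLOWED_DISTANCES[site])
    for value in options:
        if value >= wanted:
            return value
    return options[-1]

print(pick_distance("Chronicling America", 1))  # 5
print(pick_distance("Utah", 7))                 # 10
print(pick_distance("Texas", 1))                # 1
```

Rounding up means a wider net and a few more irrelevant hits, which is usually a better trade than missing the page you wanted.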

Sadly, most of the online collections’ search engines do not provide proximity searching capabilities.

So if it is available, try using a proximity search. It may deliver results that you would otherwise miss.

Join the Conversation

2 Comments

  1. I’m new at researching newspapers for obits and any little bit of info I can find on my family. I’ve found that it can be just as addictive as doing genealogy research! 😉 I just wanted to thank you for the examples you gave for proximity searches and I’m going to start adding that exact type of search to my list. I live in Georgia and most of my family research is in Georgia, Kentucky and Tennessee, where just about everyone goes by their first and middle names. My dad’s side of the family is in Ohio and Michigan where that’s not a common practice at all. I’ve found out the hard way that I can’t search newspapers like I would do a Google search. Well, I can, but I won’t get very many good results. There’s all kinds of tricks to getting around OCR, damaged papers, scans that were too dark, etc. So I’m taking the time to figure these things out. Your website is amazing, and that’s an understatement! I’ve learned so much from your videos and your blog! Thank you so much for all of your hard work!
