I recently conducted an interview with the CEO, Stefan Boddie. Below is a transcript of that interview. But first, a quick overview of Veridian:
Did you know that Veridian Software has collections of digitized newspapers that can be searched for free? Over 50 million pages!
Did you know that they power California, Colorado, Illinois, Indiana, Michigan, Virginia, Washington, and Wyoming state newspaper collections? You may have used their software and didn’t know it!
How about country-wide collections, such as Estonia, Israel, New Zealand, Singapore, and Switzerland?
Did you know that they are the folks behind the Elephind federated search facility that allows you to search many collections all at once?
Hello, Stefan. Thank you for taking the time to answer these questions. Could you provide a short history of the DL Consulting company and its mission?
DL Consulting was born in 2002, so we’ve been doing this for nearly 20 years. For about 5 years prior to that, I worked on a research project at the University of Waikato here in New Zealand, developing open source “digital library” software called Greenstone (several others who work here at DL also came out of that project). The Veridian product was first developed for the National Library of New Zealand “Papers Past” newspaper project in 2006. It’s still used by that project and is of course now used for many other digitisation projects.
Our mission is to make it easy for institutions to digitise their newspaper collections and to make previously difficult to access information available to everyone.
You seem to have been adding online collections at a steady but increasing rate. To what do you attribute this increase in activity?
Perseverance! We keep doing the best job we can and each successful project becomes a reference site and adds to our credibility for other potential projects. It’s kind of a slow snowball effect…
Veridian software is one of my favorites for personal newspaper research. It’s been around for over 10 years but always seems to get more robust. I am especially impressed with your recent support for non-Roman/Latin alphabets, such as Hebrew in the National Library of Israel collection, and Japanese characters in the Stanford/Hoover Collection of Japanese-American newspapers. Without getting too technical, how is that achieved?
One key point is that Veridian is constantly evolving and improving, and is dramatically more robust, feature-rich, stable, fast, and scalable than it was when it was first developed nearly 15 years ago. We’re a small and very specialised company, and digitisation projects are all we do, so the technology we’re using is always improving.
A second point I guess is we’ve found something of a niche in handling difficult digitisation projects. For example, if you have a simple collection of a few thousand English-language documents or photographs or some such thing there are dozens of products you can use to put them online. However, if you have a collection of several million pages of digitised newspapers, with large page sizes and poor OCR quality, and all the other issues that come with historic newspapers, then Veridian is one of only a few viable options. Likewise, if you’re dealing with newspapers in more difficult languages like Hebrew, Arabic, and Japanese. We pride ourselves on being able to handle the complex projects that others can’t so that naturally creates some opportunities for us to work on some interesting and unusual collections.
METS/ALTO is a standard that, I believe, was created by the Library of Congress, and it is used by Chronicling America and, of course, by you. Again, without getting too technical, what is it?
There’s an article on our website (https://veridiansoftware.com/knowledge-base/metsalto/) that hopefully explains it.
Another favorite of mine is your crowdsourcing of text correction, necessary because of the quality issues of old newsprint. Other than the obvious need for such a tool, what prompted you to create the feature over a decade ago? No other provider has it, as far as I know.
We borrowed the idea from the National Library of Australia, who included it in their Trove system (https://trove.nla.gov.au/newspaper/) when they launched it way back in about 2007 or 2008. Trove is a system NLA built themselves in-house; that is, it’s not Veridian or any other “product” that can be used by other projects, and I believe it’s still the only system with crowdsourced text correction, aside from Veridian. We’re not above borrowing good ideas from others, and crowdsourced text correction is a great idea!
For what it’s worth, while the obvious benefit of crowdsourced text correction is the improved text and search-ability, there are less obvious benefits that we think are just as important. The key one is that it tends to create communities of like-minded and very engaged users for these collections. Finding smart ways to encourage people to really engage with and care about the content of these collections is one of our key goals for Veridian.
I discovered Elephind over 5 years ago and always publish an article when you provide updates. It is a popular tool, since the federated search of multiple collections is so desirable. Any updates on that service?
No updates at this point. We actually built Elephind.com originally to test the search systems we use in Veridian, to see how they perform with 10 million or 20 million or 30 million pages of newspapers. It then became popular enough that we’ve left it online, with occasional content updates, for many years. We don’t currently have any plans to change it, but that may change in the future.
People may be surprised, but the number of pages in the collections that use your software is double that of Chronicling America. You’re the best-kept secret in the newspaper digitization world! Without giving away any of your secret plans, what can we expect in the next few years from Veridian?
We’ll continue to develop the software to add new features, as well as regularly upgrading all the existing Veridian-based collections to keep them looking modern and running well. And of course, we’ll hopefully continue to become less of a secret and to expand the group of newspaper digitisation projects using Veridian.
Thank you, Stefan. Very informative! To my readers, keep your eyes on Veridian Software, one of the best in the industry.