Early English Books Online-Text Creation Partnership (EEBO-TCP)

This is another database resource, and it’s one I’ve used a lot. It’s the result of a collaboration between ProQuest, a for-profit company, and the University of Michigan, the University of Oxford, and the Council on Library and Information Resources, with the cooperation of some 150 other libraries around the world.

Why am I writing about a commercial resource when my whole purpose is to promote open-access resources? Well, it’s because 25,368 texts published before 1700 came into the public domain just a few years ago, and the story of how that happened is an interesting example of how universities and libraries have interacted with the private sector to create a database. More about that later.

What they say

“130,000 works, microfilmed over 70 years from more than 200 libraries worldwide, were made available online by ProQuest in one collection. Early English Books Online (EEBO) is now one of the most successful research collections ProQuest has ever produced and it is used by students and scholars in over 1,000 institutions worldwide.”

https://www.sc.pages04.net/lp/43888/470018/1208PQ%20EEBOTCP%20Brox-Jap.pd

But…

If you’re paying attention here (and you’ll need to, because it’s complicated!) you’ll notice that they are talking about Early English Books Online (EEBO), not Early English Books Online Text Creation Partnership (EEBO-TCP), which is a different kettle of fish.

Here’s the point. EEBO is not open access. It consists of PDF files that have not been run through optical character recognition (OCR). If you want to see the actual pages of these early modern texts this is what you need, but there’s no way you can get access to it except through a university or some other institution. Well, I guess if you were a billionaire you could take out a subscription and not notice it, but it’s priced way beyond what the average person could afford.

So what is EEBO-TCP, and how is it connected to EEBO? Well, as you might expect, having digitized all these books as PDF files, the next thing people wanted to do with them was make them searchable. EEBO-TCP consists of (at the time of writing) 58,531 EEBO texts converted into text-searchable plain text.

“Wait a minute!” I hear you cry. “Didn’t you say there were 25,368 texts?”

Ah! So you have been paying attention. Yes, that’s right. 25,368 texts are in the public domain. But a further 34,963 texts are still only accessible if you have a log-in. Furthermore, if you have a log-in (which, you’ll remember, basically means you have to be a member of a participating institution) you can click through from the plain text version to the PDFs. This is particularly useful for those books which do not mark the page number, and many (perhaps most) early modern books do not. If you can see the PDF you can figure out the signature or folio, which is what was used before page numbers came into vogue. If you can’t see the PDF you can still access the text, but you can’t give a proper reference for it.

Smart, huh? I mean smart from ProQuest’s point of view. The text-readable material that has come into the public domain is a fantastic resource, but it would be that much better if you could also access the PDFs, and that requires a subscription.

This is the way “freemiums” work. A “freemium” is a resource that gives you a certain amount free, but holds back features you can only get by paying for them. They offer something, but they’ve got more than they offer.

Still, something’s better than nothing, and the 25,000+ public domain texts are still a resource worth knowing about if you’re interested in the literature of that period.

So how does it work?

A couple of years ago a colleague of mine in the world of antiquarian books posted a Christmas message on Facebook featuring what he said was the earliest occurrence of the salutation “Merry Christmas” in print. That’s what the Oxford English Dictionary (another resource you have to pay for if you want it in its complete form!) says, and that’s what he was going by.

Without wanting to go into killjoy mode I felt it was worth putting his claim to the test and searched for “Merry Christmas” on EEBO-TCP. I found that the first occurrence with this spelling was in 1577, several decades earlier than the text my colleague had featured in his Facebook post.
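If you download the public domain texts and want to run the same kind of check yourself, the idea can be sketched in a few lines of Python. This is only a rough illustration: the `Record` type, the `earliest_occurrence` function and the three sample records are all invented for the example, and none of it reflects how the EEBO-TCP search interface itself works.

```python
import re
from typing import NamedTuple, Optional

class Record(NamedTuple):
    year: int
    title: str
    text: str

# Placeholder records invented for illustration; not real EEBO-TCP data.
CORPUS = [
    Record(1612, "Sample C", "We wish you a merry Christmas and a good year."),
    Record(1588, "Sample A", "A MERRY  Christmas to all the household."),
    Record(1601, "Sample B", "They kept a mery Christmasse that winter."),
]

def earliest_occurrence(corpus, phrase) -> Optional[Record]:
    """Return the earliest-dated record containing the phrase in this exact spelling."""
    # Ignore case and allow any run of whitespace between the words.
    words = [re.escape(w) for w in phrase.split()]
    pattern = re.compile(r"\b" + r"\s+".join(words) + r"\b", re.IGNORECASE)
    hits = [rec for rec in corpus if pattern.search(rec.text)]
    return min(hits, key=lambda rec: rec.year) if hits else None

first = earliest_occurrence(CORPUS, "Merry Christmas")
if first is not None:
    print(first.year, first.title)
```

Notice that “Sample B” is skipped because of its spelling (“mery Christmasse”). Early modern spelling varies a great deal, so a search for one spelling only tells you about that spelling.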

In the same way, it has been possible to show that a fair number of the expressions credited to Shakespeare had in fact found their way into print before he used them. We can also establish patterns of usage. For example, we can check the use of the word “cruelty” in proximity to the word “Catholic” to see how closely these two things were connected in people’s minds, and we can check the frequency of such usage over time, to get some idea of whether attitudes changed, or were related to specific events, such as the Gunpowder Plot.
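For anyone curious what a proximity check like this looks like under the hood, here is a minimal sketch in Python. It assumes you already have a list of (year, text) pairs; the four sample texts, the function names and the five-word window are all assumptions made up for the example, not anything taken from EEBO-TCP.

```python
import re
from collections import Counter

# Placeholder (year, text) pairs invented for illustration; not real EEBO-TCP data.
TEXTS = [
    (1598, "the cruelty of the catholic princes was much spoken of"),
    (1606, "such cruelty was never seen; the Catholic plotters were taken"),
    (1609, "a catholic gentleman lived quietly for many years and none spoke of cruelty"),
    (1611, "no mention of either word here"),
]

def tokenize(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def near(tokens, a, b, window):
    """True if word a occurs within `window` words of word b."""
    pos_a = [i for i, tok in enumerate(tokens) if tok == a]
    pos_b = [i for i, tok in enumerate(tokens) if tok == b]
    return any(abs(i - j) <= window for i in pos_a for j in pos_b)

def hits_by_decade(texts, a, b, window=5):
    """Count the texts per decade in which a and b appear close together."""
    counts = Counter()
    for year, text in texts:
        if near(tokenize(text), a, b, window):
            counts[year // 10 * 10] += 1
    return dict(counts)

print(hits_by_decade(TEXTS, "cruelty", "catholic"))
```

The third sample text contains both words but they are eleven words apart, so it is not counted with a window of five. Widening or narrowing the window changes the results, which is one more reason not to rely on raw counts alone.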

Of course, we can’t just go by the raw results. We need to examine each result and see the full context, but being able to isolate all the relevant texts in this way is something that people could only have dreamed about until just a few years ago.

Slowly, the database is redefining our understanding of the early modern period. It’s frustrating that the whole database isn’t open access, together with the PDFs that underlie it, but it’s still a lot better than nothing!
