- What is the difference between Early English Books Online (EEBO) and EEBO-TCP?
- What is the difference between EEBO-TCP Phase I and Phase II?
- How much will it cost my library to join EEBO-TCP?
- How much does it cost to key and encode a single EEBO-TCP text?
- When will the texts be freely available?
- Why would I buy something that isn’t even finished yet?
- Why would I buy something that is going to become freely available?
- I’m not affiliated with an institution (or my institution doesn’t have EEBO or TCP), but I would be willing to pay for access to these texts. Do you have individual partnerships?
- Why don’t you use OCR?
- A work that I am interested in hasn’t been converted yet. When will you do it?
- Why does TCP only include one edition of a work?
- I found an error in a transcription.
What is the difference between Early English Books Online (EEBO) and EEBO-TCP?
Simply put, EEBO is a commercial product published by ProQuest LLC and available to libraries for purchase or license. EEBO-TCP is a project based at the University of Michigan and the University of Oxford, and supported by more than 150 libraries around the world.
EEBO consists of the complete digitized page images and bibliographic metadata (catalog records) for more than 125,000 early English books listed in Pollard & Redgrave’s Short-Title Catalogue (1475-1640) and Wing’s Short-Title Catalogue (1641-1700) and their revised editions, as well as the Thomason Tracts (1640-1661) collection and the Early English Books Tract Supplement. With EEBO alone, you can search for a book based on the information in the catalog record and you can flip through or download page images in TIFF or PDF format. With EEBO alone, it is not possible to search the full text of a book or to read a modern-type transcription of the text.
EEBO-TCP captures the full text of each unique work in EEBO. This is done by manually keying the full text of each work and adding markup to indicate the structure of the text (chapter divisions, tables, lists, etc.). The result is an accurate transcription of each work, which can be fully searched, or used as the basis of a new project. To date, EEBO-TCP has produced more than 40,000 texts. The EEBO-TCP text files are delivered back to ProQuest and indexed in EEBO, so users at partner libraries can seamlessly perform full text searches and view transcriptions right within the EEBO platform, although the texts can also be accessed in other ways. EEBO-TCP is administered by the University of Michigan Library, with teams of editors at Michigan and Oxford.
What is the difference between EEBO-TCP Phase I and Phase II?
The initial EEBO-TCP project began in 1999. Its goal was to key and encode 25,000 selected works from the EEBO corpus. This effort was completed in 2009, with the support of nearly 150 library partners. The 25,000 texts produced by this effort are called “Phase I.” This set of texts is currently available exclusively from ProQuest. On January 1, 2015, this “exclusivity period” will end, and the texts will be made freely available to the public.
With the encouragement of the project advisory board, and with the promise of another round of support from many libraries, in 2008 the TCP decided to continue the work of EEBO-TCP in a second phase. The goal now is to key and encode one edition of each unique work represented in EEBO. When complete, Phase II will consist of around 44,000 texts. EEBO-TCP Phase II is currently supported by about 100 libraries, and we are still actively seeking new library partners. Each library’s partnership fee goes directly toward the cost of keying new books that we would otherwise be unable to convert. Once this work is complete, the Phase II texts will be available exclusively from ProQuest for five years. At the end of that period, the texts will be made freely available to the public.
Ultimately, the entire EEBO-TCP corpus (Phase I and Phase II together) will consist of about 70,000 works.
How much will it cost my library to join EEBO-TCP?
The partnership fee depends on the size of your institution. Please see our fees.
How much does it cost to key and encode a single EEBO-TCP text?
The cost of keying and encoding a book depends on how long the book is and how difficult it is to capture and edit the text. A book might be particularly challenging due to the difficulty of the font, the quality of the microfilm, or simply the presence of unusual and complex textual features, such as large tables or genealogical charts. A work might consist of a single broadsheet, or thousands of pages. Our vendors charge a flat fee by the character or by the kilobyte of data captured. The costs of review and editing, which are done in-house at Michigan and Oxford, are measured in time, typically by counting how many books can be reviewed in a month. On average, we estimate that it costs $200-$250 to key, encode, and review a “typical” work.
A research library pays $60,000 to become a partner, so at $200-$250 per work, each library that joins supports the conversion of roughly 240-300 new books.
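For readers who want to check the arithmetic, here is a minimal sketch that divides the partnership fee by the estimated per-work cost. The function name `works_funded` is hypothetical and used only for illustration; the dollar figures are the estimates quoted above, not exact costs. At those estimates the result is about 240 to 300 works, in line with the rough figure given above.

```python
def works_funded(partnership_fee: float, cost_per_work: float) -> int:
    """Return the whole number of works a fee covers at a given per-work cost."""
    return int(partnership_fee // cost_per_work)


# Figures quoted in this FAQ (estimates, not exact costs).
RESEARCH_LIBRARY_FEE = 60_000
LOW_COST_PER_WORK = 200
HIGH_COST_PER_WORK = 250

# The higher per-work cost gives the smaller number of works, and vice versa.
fewest = works_funded(RESEARCH_LIBRARY_FEE, HIGH_COST_PER_WORK)
most = works_funded(RESEARCH_LIBRARY_FEE, LOW_COST_PER_WORK)

print(f"One research-library fee covers roughly {fewest} to {most} works.")
```

Because the cost of an individual work varies widely with its length and difficulty, the actual number of books funded by any one partnership will differ from this estimate.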
When will the texts be freely available?
Part of the mission of the TCP is that the text files we produce will ultimately be freely available to the public. The date on which restrictions on sharing and distributing the texts are lifted depends on when each project was completed.
- ECCO-TCP is already available to the public
- Evans-TCP will become available to the public June 30, 2014
- EEBO-TCP Phase I will become available to the public January 1, 2015
- EEBO-TCP Phase II will become available to the public five years from the completion of our work on the project
Why would I buy something that isn’t even finished yet?
TCP partnership is less a purchase than it is an investment, and a commitment among libraries to share the burden and reward of this work. Partner libraries contribute to the cost of creating tens of thousands of painstakingly produced electronic editions of early English works. Each new library that joins makes it possible for the project to key books that we otherwise would not, improving the corpus for everyone.
Why would I buy something that is going to become freely available?
The success of EEBO-TCP depends on the support of partner institutions. The partnership fee directly funds the conversion of new books, so by joining, your library will not only gain immediate access to the texts but also contribute to making a larger, more comprehensive corpus for everyone. In addition, the sooner we get through the corpus, the sooner the texts will be released to the public. While we can sympathize with the temptation to wait, doing so hurts both our progress today and the corpus that will ultimately be released to the public.
It is also important to keep in mind that the EEBO page images and the ProQuest EEBO interface will always be paid/subscription services. Only the text files will become freely available. For those who are not partners, once the texts are freely available, it will be possible to access the texts without page images through the interface hosted by the University of Michigan Library.
I’m not affiliated with an institution (or my institution doesn’t have EEBO or TCP), but I would be willing to pay for access to these texts. Do you have individual partnerships?
Unfortunately, we do not currently have a way to offer individual access to EEBO-TCP.
Why don’t you use OCR?
Because of the irregularity and difficulty of early printing, as well as the variable quality of the microfilm-based images from which we are working, optical character recognition cannot reliably “read” the EEBO images to produce an accurate electronic text. The review and correction of the text produced would be so expensive and labor-intensive that it is more efficient to simply key the work from scratch. However, there is currently a great deal of interest in Europe and North America in improving OCR for older works, and a number of research projects are investigating this, using TCP texts as a “ground truth” against which to compare their results.
A work that I am interested in hasn’t been converted yet. When will you do it?
Users affiliated with partner libraries are welcome to request works from EEBO that have not yet been keyed. To make a request, simply send an email to the project: email@example.com. It helps if you can provide the EEBO number for the work. If you are requesting that we key a different edition of a work that has already been converted, please offer some explanation of the significance of keying this new edition.
For the time being, we can only accept requests for works in EEBO. These requests go straight to the front of the queue and are added to our list of texts to be keyed for the next month. However, it still takes a while to turn them around. If you want the source text for the work, we can most likely get it to you within a few months. It will take longer (six months to a year) for the work to appear online in the ProQuest or Michigan interface.
Why does TCP only include one edition of a work?
We recognize that each edition of a work is unique, that one cannot stand in for others, and that for many scholarly purposes, there is value in examining closely the differences between editions. However, given limited funding, our first priority is to capture as many different works as we can, usually focusing on the first edition of each work. Simply put, for every book that we choose to convert, a different book does not get converted. However, we have keyed additional editions where there is sufficient justification for doing so.
I found an error in a transcription.
We are very grateful to those who report errors to us, and will incorporate corrections into our next release of the texts. Unfortunately we don’t yet offer a way to report (or correct) errors within the interface itself. Please get in touch at firstname.lastname@example.org.