The Future of the ESTC: A Vision

Henry L. Snyder

    This occasion is a temptation for many of us to celebrate what has been accomplished. There is certainly much to celebrate. The ESTC is unquestionably an extraordinary accomplishment in the annals of bibliography. It has many illustrious predecessors, though in size and scope it possesses qualities that tempt one to use that much-abused word, unique. Above all it is a tribute to what can be accomplished by cooperation on a grand, international scale. We are here not to bury, not to praise, but to plan for the future. The ESTC is a living, constantly growing and changing organism. We must continue to nourish and nurture it so it can continue to benefit us and our successors in the manner envisioned by its founders.
It is both the frustration and virtue of the ESTC that it is different each day when the work of the preceding day is loaded. It is ever changing and improving. Each searcher finds new information, new help, new opportunities, new data that sheds new light on his or her studies. And this will continue so long as it remains accessible and current. This is the twentieth year that I have directed the ESTC in North America, and I continue to marvel at its attributes and its value. I continue to be amazed at the new uses to which it can be put, the new ways it can be managed and shaped, the new ways in which it can be improved, and the new ways in which it can be made available.
    The ESTC was the result of a technological revolution: the application of computer technology to cataloging. I need hardly remind those present that the rate of change in the technological revolution in which we live is ever accelerating. I say this not to review the past but to point out that with change ever present we can plan for but not forecast the future. We cannot predict new developments, but we can be sure that they will cause changes as radical as those we have experienced in the past. The one thing we have learned over the years is that, sooner or later (and probably sooner), whatever mechanism and system we rely upon today will become passé, and will be supplanted by some “rough beast” not yet imagined.
When we began, storage space was limited and expensive, and response time was slow. As a consequence, the base format for the record which we still employ for the eighteenth century is a truncated record: abridged in title and imprint information, shorn of all but the most essential of added entries, and lacking subject headings, although the notes field is extensive. The next generation of records in the file, those for the pre-1701 period, are far richer and far more valuable as a research tool because of the greatly increased amount of information they contain. This is one important challenge for our successors: to expand the eighteenth-century records to match those for the earlier period in the richness and variety of data they contain. For space is no longer an issue. The ESTC, which at one time could only have been mounted on a large mainframe, can now be mounted on a small table-top machine that is smaller in size than most of the individual workstations we routinely use at home and in the office.
    This revolution in capacity, power, and size suggests one of the ways in which the ESTC of the future may be managed differently than that of the present. As John Haeger has remarked to me, if we were beginning the project today he would not recommend building it in the Research Libraries Information Network (RLIN), but rather building it in-house and then perhaps loading it into a utility for online distribution. We have considered this matter very seriously, and do indeed have a UNIX machine that is fully capable of mounting and maintaining the ESTC. There are some palpable advantages to our doing so. We could perform many alterations electronically that we must now perform manually. Access to the supporting software was long denied us. The cost of staff and computing time in performing these tasks in a large bibliographical utility, not to mention the long delay in getting them scheduled and performed, continues to restrict us to maintaining the file in an increasingly outdated, archaic fashion. Accessibility is no longer a concern, given the flexibility of the latest generation of computers, though, because of accounting and other reasons, some shared responsibility for distribution still seems valid. At the same time, the development of the Internet and its increasing reliability, or at least speed, have caused even the utilities to restructure the way in which they connect with their users, suggesting that more of these tasks can be assumed in-house. One thing I can be sure of: whatever we decide today will be reconsidered and altered in the future, probably sooner rather than later. (1)
I have spoken about the substantial differences in the nature and content of our pre-1701 and post-1700 records. But there are ways in which both can be improved. We have tried, where possible, to include citations to specialist bibliographies that help to provide access and to identify the item being described. This has been hit or miss, with the exception of the STC and Wing and perhaps Foxon. (2) I have as a personal project a review of Hanson to make sure every Hanson item is entered and identified. In the process, I know I will find items that have escaped our canvass and will therefore try to locate copies from which we can create new records.
I think one of the great virtues of the ESTC is the consistency with which we have built the file. This is because we have had only three editorial centers. It is, moreover, a tribute to the care and attention of our full-time managers. But inconsistencies and omissions remain. Though cataloging rules are intended to be objective and formats are inflexible, catalogers are human, and much of what they enter is a personal decision, often subjective rather than objective. One of the reasons I am tempted to mount the file in-house is to have the ability to make global searches, identify inconsistencies, and correct them by a series of replacement commands. We have done that so successfully with subsets of the file that I know it can be done with the file as a whole, provided we have unrestricted access at a feasible cost. Then there are other additions we can make. For example, we do employ collation in the absence of pagination, and a team at the Folger Shakespeare Library is now adding collation in our file for all STC items the Folger possesses.
    One of my concerns is how we can reach the initial and still valid goal of having the ESTC contain a record of all relevant items in public repositories throughout the world. I believe we are well along in that goal in North America. But libraries are living, growing organizations like the ESTC. And we should keep current with their acquisitions, deaccessions, changes in shelfmarks, corrections, and revisions in identifying their holdings. There are far too many significant collections in the British Isles that are not reported. I mean this as no criticism of our British partners, but we must recognize the reality of the situation: that British collections are more numerous and the means have not been found to enter all their collections in our files. The file cannot be said to be complete in coverage, let alone holdings, until at least all the major institutions on both sides of the Atlantic are reported.
    When the ESTC was first projected, the cutoff date was set at 1800. The curse of the Magdeburg Centuriators is still with us! But as it is further characterized as a bibliography of the handpress era, a terminal date of 1830 would be more appropriate. This alone would be a formidable task. The inexorable increase in the annual number of publications accelerates rapidly if not exponentially at the end of the eighteenth century and continues into the nineteenth century. Perhaps this was why 1800 was chosen. We do have another project we can adapt and which sets out guideposts for the journey, the Nineteenth-Century Short-Title Catalogue, the brainchild of Frank Robinson. It could provide a basis for the expansion just as the STC and Wing provided the basis for the extension backwards in time to 1475.
I have mentioned en passant some current projects related to the ESTC. Let me describe them now because they are a portent of how data may be added in the future. I believe we all recognize the need to maintain our momentum until the pre-1701 section is complete, that is, until we have a proper bibliographical record for every item in the two printed catalogs which have defined the publishing history of printing in England from 1475 to 1700. So how else is the completion of the record to be achieved? Recognizing the need to find alternative methods, we asked Research Libraries Group (RLG) to investigate a new, limited means of access to the file that would permit libraries to enter their own holdings directly into the file; they could not enter the bibliographical record itself, and thus our editorial control would not be diluted. That study is currently underway. In the interim we have tried another, very fruitful alternative.
    In a new departure, the Folger began entering its holdings directly into the file. They send us only queries and reports for items new to the file which we promptly catalog. We had entered a major part of their STC holdings by consulting their printed catalog when we were at work on the STC. But now they are verifying those entries, adding copy notes and collations, completing the census, doing the same for the 1641–1700 period, and finally, reporting unreported eighteenth-century items. This project started so successfully that we developed a similar project with the Henry E. Huntington Library. We have followed a similar course with the Library Company of Philadelphia and are now negotiating similar arrangements with other libraries here and in the British Isles.
Our present method of recording holdings, both in-house and by surrogate, is a wholly manual one. It is labor intensive, and the cost is such that we cannot complete the task by this method alone or even by cooperative projects along the lines we are currently using. As a consequence we are exploring machine-loading. RLG has agreed to provide a mechanism for machine-loading that will not undermine the accuracy of our reporting. One of our longtime catalogers is also a gifted programmer. He has found a number of ways to manipulate data within our STAR system that have proved to be very useful. With the consent and support of the Bibliographical Society and the Modern Language Association, we now have machine-readable versions of both the STC and Wing mounted in-house. To the STC file the British Library added the holdings of the Oxford Colleges surveyed by Paul Morgan. We are now negotiating for a machine-readable version of the pre-1701 Cathedral Libraries Catalogue sponsored by the Bibliographical Society in England. In both cases the holdings of these institutions reported in the STC and Wing are limited in nature. And, of course, we have made no canvass of the STC and only a limited one for Wing so that our holdings record in the ESTC for the pre-1701 period is severely limited. But by uploading the enriched machine-readable versions of the STC and Wing directly into the ESTC in RLIN, we can remedy this deficit with minimal labor. I am convinced this is the way of the future.
    Over time every library will convert its manual catalog to a machine-readable format. If the relevant records can be extracted by machine manipulation and sent to us, we have the prospect of further machine loading. We are actively studying this process now. Once we have worked out algorithms for that process with a level of accuracy we can accept, we will test the programs file by file. Even though there may be many records the machine-match algorithm will be unable to handle and which will have to be examined on a one-by-one basis, at least the number we must deal with manually will be far smaller than it is now.
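By way of illustration only, the kind of machine matching described above might be sketched as follows. The field names (`id`, `local_id`, `title`, `year`), the six-word title key, and the sample records are all assumptions made for the example; this is not a description of the programs actually used for the ESTC. Records that match exactly one entry are accepted, and everything else is set aside to be examined by hand.

```python
import re
from collections import defaultdict


def match_key(title, year):
    """Build a crude comparison key: the first six title words, lowercased
    and stripped of punctuation, paired with the imprint year."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return (" ".join(words[:6]), year)


def sort_incoming(catalog, incoming):
    """Split a library's records into machine matches and a manual-review list.

    catalog  -- list of dicts with hypothetical keys "id", "title", "year"
    incoming -- list of dicts with hypothetical keys "local_id", "title", "year"
    """
    index = defaultdict(list)
    for rec in catalog:
        index[match_key(rec["title"], rec["year"])].append(rec["id"])

    matched = {}   # local_id -> catalog id, for unambiguous matches only
    review = []    # local_ids with no candidate or with several candidates
    for rec in incoming:
        candidates = index.get(match_key(rec["title"], rec["year"]), [])
        if len(candidates) == 1:
            matched[rec["local_id"]] = candidates[0]
        else:
            review.append(rec["local_id"])
    return matched, review


# Invented sample data: one unique title and one title with two printings.
catalog = [
    {"id": "rec-001", "title": "A Tale of a Tub", "year": 1704},
    {"id": "rec-002", "title": "The Dunciad", "year": 1728},
    {"id": "rec-003", "title": "The Dunciad", "year": 1728},
]
incoming = [
    {"local_id": "lib-1", "title": "A TALE of a Tub.", "year": 1704},
    {"local_id": "lib-2", "title": "The Dunciad", "year": 1728},
    {"local_id": "lib-3", "title": "An unknown pamphlet", "year": 1710},
]
matched, review = sort_incoming(catalog, incoming)
```

In this sketch the first incoming record is matched by machine, while the second (which could be either of two printings) and the third (which has no candidate) fall to the manual-review list. A working program would need a far more discriminating key than title and year alone.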
We have tried to make the ESTC what I like to call a “one-stop shopping center.” I believe that scholars are best served if they can locate not only all the copies of a particular item in the ESTC, but also facsimiles. Thus we have taken considerable pains to add citations to microfilm sets and hard copy facsimiles. In many cases the reader needs only access to the contemporary version of the text but not necessarily the original. The rarity of many of the items often makes access to the original impractical or impossible. I think this could be taken a step further now that we have hypertext markup language (HTML). I would like the scholar to be able to highlight the citation in such a fashion that an order could be transmitted for a copy and it could be delivered directly to the individual who seeks it. A second expansion could be the inclusion in the ESTC itself of visual images of the title page and other critical pages of the texts recorded. Cost again is the key factor, but as so many of the texts have been filmed, and the editorial offices at least have photocopies of a large percentage of the title pages, it may be possible to begin, at least on an experimental basis, without even having to go back again to the hard-copy original. And if the publishers of the major microfilm sets digitize those sets at some point in the future, the ESTC is the ideal candidate to index those images by hot-linking our records directly to the images.
    Even more exciting is the availability of the text itself in a digitized format. Facsimiles are one thing. They have inherent physical limitations, and because indexing gives only limited access to specific words or topics in the texts themselves, their use remains limited and time-consuming. But if the text can be digitized and accessed through the ESTC then another technological revolution affecting scholarship will result. To be cost-effective the texts must be scanned rapidly and virtually error-free. That is not possible given the current state of scanners and software. Hand-set type, with all the variations imposed by intensity of inking, fading, deterioration of paper, print-through, and the like, makes it currently impossible. The alternative of keying the texts is simply too costly for large-scale conversion projects, although a few examples exist. But I have no doubt that at some point in the next decade or two the technology will advance to the point where it will be possible. And if it can be done from microfilm the conversion could advance quite rapidly. I am particularly interested in seeing it take place with newspapers, above all the Burney Collection at the British Library, but digitization of all kinds of texts would result in major benefits for scholars. It is impossible to predict how the searching and retrieval of digitized texts would transform scholarship. I think of Charlton Hinman’s pioneering work on the comparison of copies of the First Folio of Shakespeare and how it revolutionized not only Shakespeare scholarship but our knowledge of the early English and European printing establishment. One day this will all be done by machine.
I have mentioned alternative methods of providing access and also the existence of similar national union catalogs, though none are so advanced as the ESTC. But there are machine-readable national bibliographies in Finland, Sweden, the Netherlands, Germany, Spain, and Italy. The Consortium of European Research Libraries (CERL) was formed to create a single European catalog of the handpress by merging these individual national catalogs. The ESTC is the largest of the national union catalogs made accessible by CERL. The goal is to be able to search the full range of records from the handpress in Europe with a single command. And by linking the ESTC to the CERL database we would address one of the possible expansions of the ESTC we have considered: adding continental translations of English texts. By linking the records with uniform titles, editions of a single text in both English and other languages could be retrieved with a single command. The ESTC also makes possible, or at the very least enhances, many derivative projects. Aside from the shared cataloging use and the ability for libraries to extract their own holdings (with records attached at minimal cost for loading into their in-house catalogs), there is a host of bibliographic projects that build on our efforts. Most recently we supplied 1,500 records to the compilers of a Wesley bibliography at Southern Methodist University.
    The institutional expertise and memory we have built up over two decades are too precious to be lost. As a result, we have been exploring for some years the means of providing a long-term home for the project, with the necessary resources not just to ensure its survival but to give it the means to carry out the ambitious program that still confronts us. I am pleased to report that both the physical facility and core staffing are now committed by the University of California at Riverside. We are also working to build an endowment to provide for a reliable base of operating funds. The recent award of a challenge grant by the National Endowment for the Humanities moves us along the way to realizing this goal.
    These public-spirited and deeply appreciated commitments take me to the final point I want to address today. The ESTC has been an enormous public undertaking on an unprecedented international scale. How did it happen? It happened because of a strong and continuing agreement by the world of learning and the agencies that support that world. The 1976 conferences in London and Washington, D.C., sponsored by the National Endowment for the Humanities and the British Library, and our 1984 conference held in this very room,(3) have affirmed the need, the goals, and the investment. I trust this meeting will reaffirm them.
    A project like the ESTC can only be made possible by a large-scale and continuing commitment of major funding agencies. It is only because we have great national agencies like the National Endowment for the Humanities and the Department of Education, and equally great philanthropic foundations such as Andrew W. Mellon, Ahmanson, and Pew Charitable Trusts that it has all come to pass. They did not back off. They have stayed the course, and the University of California at Riverside has taken the critical step to guarantee our success. I do want to reiterate in closing that it is only the existence of these great national funding agencies and major nonprofit organizations, both universities and libraries, that makes projects like the ESTC possible. It is the success of these large-scale projects that justifies the continuing existence of these organizations. It is a symbiotic relationship of the best kind. Long may it continue!

1. In the three years that have passed between the delivery of this paper and the preparation of this manuscript for publication, much of what I anticipated has come to pass. A mirror copy of the ESTC is now maintained in our office. Our database manager, Alain Veylit, has developed a number of user-friendly and sophisticated forms of access. As well as adding data directly online, we also enter it into the in-house file and then batch upload the new data periodically into RLIN. We are able to make all sorts of extractions and analyses that enable us to perform our task better. We can receive machine-readable files from contributing libraries and match more than 50% of the records directly. The remainder, most of which could be one of several printings, still have to be matched by hand. We are constantly expanding our use of the in-house file.
2. David F. Foxon: English verse 1701–1750: a catalogue of separately printed poems with notes on contemporary collected editions (London: Cambridge University Press, 1975).
3. Trustees Room, New York Public Library at 42nd and Fifth Avenue, New York City, NY.