The Future of the ESTC: A Vision
Henry L. Snyder
This occasion is a temptation for many of us to
celebrate what has been accomplished. There is certainly much to
celebrate. The ESTC is unquestionably an extraordinary accomplishment
in the annals of bibliography. It has many illustrious predecessors,
though in size and scope it possesses qualities that tempt one to use
that much-abused word, unique. Above all it is a tribute to what can be
accomplished by cooperation on a grand, international scale. We are
here not to bury, not to praise, but to plan for the future. The ESTC
is a living, constantly growing and changing organism. We must continue
to nourish and nurture it so it can continue to benefit us and our
successors in the manner envisioned by its founders.
It is both the frustration and virtue of the ESTC that it is different
each day when the work of the preceding day is loaded. It is ever
changing and improving. Each searcher finds new information, new help,
new opportunities, new data that sheds new light on his or her studies.
And this will continue so long as it remains accessible and current.
This is the twentieth year that I have directed the ESTC in North
America, and I continue to marvel at its attributes and its value. I
continue to be amazed at the new uses to which it can be put, the new
ways it can be managed and shaped, the new ways in which it can be
improved, and the new ways in which it can be made available.
The ESTC was the result of a technological
revolution: the application of computer technology to cataloging. I
need hardly remind those present that the rate of change in the
technological revolution in which we live is ever accelerating. I say
this not to review the past but to point out that with change ever
present we can plan for but not forecast the future. We cannot predict
new developments, but we can be sure that they will cause changes as
radical as those we have experienced in the past. The one thing we have
learned over the years is that, sooner or later (and probably sooner),
whatever mechanism and system we rely upon today will become
passé, and will be supplanted by some “rough beast” not yet
imagined.
When we began, storage space was limited and expensive, and response
time was slow. As a consequence the base format for the record which we
still employ for the eighteenth century is a truncated record: abridged
in title and imprint information, shorn of all but the most essential
of added entries, lacking subject headings, although the notes field is
extensive. The next generation of records in the file, those for the
pre-1701 period, are far richer and far more valuable as a research
tool because of the greatly increased amount of information they
contain. This is one important challenge for our successors: to expand
the eighteenth-century records to match those for the earlier period in
the richness and variety of data they contain. For space is no longer
an issue. The ESTC, which at one time could only have been mounted on a
large mainframe, can now be mounted on a small table-top machine that
is smaller in size than most of the individual workstations we
routinely use at home and in the office.
This revolution in capacity, power, and size
suggests one of the ways in which the ESTC of the future may be managed
differently than that of the present. As John Haeger has remarked to
me, if we were beginning the project today he would not recommend
building it in the Research Libraries Information Network (RLIN), but
rather building it in-house and then perhaps loading it into a utility
for online distribution. We have considered this matter very seriously,
and do indeed have a UNIX machine that is fully capable of mounting and
maintaining the ESTC. There are some palpable advantages to our doing
so. We could perform many alterations electronically that we must now
perform manually. Access to the supporting software was long denied us.
The cost of staff and computing time in performing these tasks in a
large bibliographical utility, not to mention the long delay in getting
them scheduled and performed, continues to restrict the way in which we
maintain the file in an increasingly outdated, archaic fashion. Our
concern about accessibility is no longer a problem with the flexibility
of the latest generation of computers, though, because of accounting
and other reasons, some shared responsibility for distribution still
seems valid. At the same time, the development of the Internet, and its
increasing reliability or at least speed, have caused even the
utilities to restructure the way in which they connect with their
users, suggesting that more of these tasks can be assumed in-house. One
thing I can be sure of: whatever we decide today will be reconsidered
and altered in the future, probably sooner rather than later. (1)
I have spoken about the substantial differences in the nature and
content of our pre-1700 and post-1700 records. But there are ways in
which both can be improved. We have tried, where possible, to include
citations to specialist bibliographies that help to provide access and
to identify the item being described. This has been hit or miss, with
the exception of the STC and Wing and perhaps Foxon. (2)
I have as a personal project a review of Hanson to make sure every
Hanson item is entered and identified. In the process, I know I will
find items that have escaped our canvass and will therefore try to
locate copies from which we can create new records.
I think one of the great virtues of the ESTC is the consistency with
which we have built the file. This is because we have had only three
editorial centers. It is, moreover, a tribute to the care and attention
of our full-time managers. But inconsistencies and omissions remain.
Though cataloging rules are intended to be objective and formats are
inflexible, catalogers are human, and much of what they enter is a
personal decision, often subjective rather than objective. One of the
reasons I am tempted to mount the file in-house is to have the ability
to make global searches, identify inconsistencies, and correct them by
a series of replacement commands. We have done that so successfully
with subsets of the file that I know it can be done with the file as a
whole, provided we have unrestricted access at a feasible cost. Then
there are other additions we can make. For example, we do employ
collation in the absence of pagination, and a team at the Folger
Shakespeare Library is now adding collation in our file for all STC
items the Folger possesses.
One of my concerns is how we can reach the initial
and still valid goal of having the ESTC contain a record of all
relevant items in public repositories throughout the world. I believe
we are well along in that goal in North America. But libraries are
living, growing organizations like the ESTC. And we should keep current
with their acquisitions, deaccessions, changes in shelfmarks,
corrections, and revisions in identifying their holdings. There are far
too many significant collections in the British Isles that are not
reported. I mean this as no criticism of our British partners, but we
must recognize the reality of the situation: that British collections
are more numerous and the means have not been found to enter all their
collections in our files. The file cannot be said to be complete in
coverage, let alone holdings, until at least all the major institutions
on both sides of the Atlantic are reported.
When the ESTC was first projected, the cutoff date
was set at 1800. The curse of the Magdeburg Centuriators is still with
us! But as it is further characterized as a bibliography of the
handpress era, a terminal date of 1830 would be more appropriate. This
alone would be a formidable task. The inexorable increase in the annual
number of publications accelerates rapidly if not exponentially at the
end of the eighteenth century and continues into the nineteenth
century. Perhaps this was why 1800 was chosen. We do have another
project we can adapt and which sets out guideposts for the journey, the
Nineteenth-Century Short-Title Catalogue, the brainchild of Frank
Robinson. It could provide a basis for the expansion just as the STC
and Wing provided the basis for the extension backwards in time to 1475.
I have mentioned en passant some current projects related to the ESTC.
Let me describe them now because they are a portent of how data may be
added in the future. I believe we all recognize the need to maintain
our momentum until the pre-1701 section is complete, that is, until we
have a proper bibliographical record for every item in the two printed
catalogs which have defined the publishing history of printing in
England from 1475 to 1700. So how else is the completion of the record
to be achieved? Recognizing the need to find alternative methods, we
asked Research Libraries Group (RLG) to investigate a new, limited
means of access to the file that would permit libraries to enter their
own holdings directly into the file; they could not enter the
bibliographical record itself, and thus our editorial control would not
be diluted. That study is currently underway. In the interim we have
tried another, very fruitful alternative.
In a new departure, the Folger began entering its
holdings directly into the file. They send us only queries and reports
for items new to the file which we promptly catalog. We had entered a
major part of their STC holdings by consulting their printed catalog
when we were at work on the STC. But now they are verifying those
entries, adding copy notes and collations, completing the census, doing
the same for the 1641–1700 period, and finally, reporting unreported
eighteenth-century items. This project started so successfully that we
developed a similar project with the Henry E. Huntington Library. We
have followed a similar course with the Library Company of Philadelphia
and are now negotiating similar arrangements with other libraries here
and in the British Isles.
Our present method of recording holdings, both in-house and by
surrogate, is a wholly manual one. It is labor intensive, and the cost
is such that we cannot complete the task by this method alone or even
by cooperative projects along the lines we are currently using. As a
consequence we are exploring machine-loading. RLG has agreed to provide
a mechanism for machine-loading that will not undermine the accuracy of
our reporting. One of our longtime catalogers is also a gifted
programmer. He has found a number of ways to manipulate data within our
STAR system that have proved to be very useful. With the consent and
support of the Bibliographical Society and the Modern Language
Association, we now have machine-readable versions of both the STC and
Wing mounted in-house. To the STC file the British Library added the
holdings of the Oxford Colleges surveyed by Paul Morgan. We are now
negotiating for a machine-readable version of the pre-1701 Cathedral
Libraries Catalogue sponsored by the Bibliographical Society in
England. In both cases the holdings of these institutions reported in
the STC and Wing are limited in nature. And, of course, we have made no
canvass of the STC and only a limited one for Wing so that our holdings
record in the ESTC for the pre-1701 period is severely limited. But by
uploading the enriched machine-readable versions of the STC and Wing
directly into the ESTC in RLIN, we can remedy this deficit with minimal
labor. I am convinced this is the way of the future.
Over time every library will convert its manual
catalog to a machine-readable format. If the relevant records can be
extracted by machine manipulation and sent to us, we have the prospect
of further machine loading. We are actively studying this process now.
Once we have worked out algorithms for that process with a level of
accuracy we can accept, we will test the programs file by file. Even
though there may be many records the machine-match algorithm will be
unable to handle and which will have to be examined on a one-by-one
basis, at least the number we must deal with manually will be far smaller
than it is now.
We have tried to make the ESTC what I like to call a “one-stop shopping
center.” I believe that scholars are best served if they can locate not
only all the copies of a particular item in the ESTC, but also
facsimiles. Thus we have taken considerable pains to add citations to
microfilm sets and hard copy facsimiles. In many cases the reader needs
access only to the contemporary version of the text, not necessarily to
the original. The rarity of many of the items often makes access to the
original impractical or impossible. I think this could be taken a step
further now that we have hypertext markup language (HTML). I would like
the scholar to be able to highlight the citation in such a fashion that
an order could be transmitted for a copy and it could be delivered
directly to the individual who seeks it. A second expansion could be
the inclusion in the ESTC itself of visual images of the title page and
other critical pages of the texts recorded. Cost again is the key
factor, but as so many of the texts have been filmed, and the editorial
offices at least have photocopies of a large percentage of the title
pages, it may be possible to begin, at least on an experimental basis,
without even having to go back again to the hard-copy original. And if
the publishers of the major microfilm sets digitize those sets at some
point in the future, the ESTC is the ideal candidate to index those
images by hot-linking our records directly to the images.
Even more exciting is the availability of the text
itself in a digitized format. Facsimiles are one thing. They have
inherent physical limitations, and because indexing gives only limited
access to specific words or topics in the texts themselves, their use
remains limited and time-consuming. But if the text can be digitized and accessed
through the ESTC then another technological revolution affecting
scholarship will result. To be cost-effective the texts must be scanned
rapidly and virtually error-free. That is not possible given the
current state of scanners and software. Hand-set type, with all the
variations imposed by intensity of inking, fading, deterioration of
paper, print-through, and the like, makes it currently impossible. The
alternative of keying the texts is simply too costly for large-scale
conversion projects, although a few examples exist. But I have no doubt
that at some point in the next decade or two the technology will
advance to the point where it will be possible. And if it can be done
from microfilm the conversion could advance quite rapidly. I am
particularly interested in seeing it take place with newspapers, above
all the Burney Collection at the British Library, but digitization of
all kinds of texts would result in major benefits for scholars. It is
impossible to predict how the searching and retrieval of digitized
texts would transform scholarship. I think of Charlton Hinman’s
pioneering work on the comparison of copies of the first folio of
Shakespeare and how it revolutionized not only Shakespeare scholarship
but our knowledge of the early English and European printing
establishment. One day this will be all done by machine.
I have mentioned alternative methods of providing access and also the
existence of similar national union catalogs, though none are so
advanced as the ESTC. But there are machine-readable national
bibliographies in Finland, Sweden, the Netherlands, Germany, Spain, and
Italy. The Consortium of European Research Libraries (CERL) was formed
to create a single European catalog of the handpress by merging these
individual national catalogs. The ESTC is the largest of the national
union catalogs made accessible by CERL. The goal is to be able to
search the full range of records from the handpress in Europe with a
single command. And by linking the ESTC to the CERL database we would
address one of the possible expansions of the ESTC we have considered:
adding continental translations of English texts. By linking the
records with uniform titles, both editions in English and other
languages of a single text could be retrieved with a single command.
The ESTC also makes possible, or, at the very least, enhances many
derivative projects. Aside from the shared cataloging use and the
ability for libraries to extract their own holdings (with records
attached at minimal cost for loading into their in-house catalogs),
there are a host of bibliographic projects that build on our efforts.
Most recently we supplied 1,500 records to the compilers of a Wesley
bibliography at Southern Methodist University.
The institutional expertise and memory we have built
up over two decades is too precious to be lost. As a result, we have
been exploring for some years the means of providing a long-term home
with the necessary resources not just to ensure its survival but to
give it the means to carry out the ambitious program that still
confronts us. I am pleased to report that both the physical facility
and core staffing are now committed by the University of California at
Riverside. We are also working to build an endowment to provide for a
reliable base of operating funds. The recent award of a challenge grant
by the National Endowment for the Humanities moves us along the way to
realizing this goal.
These public-spirited and deeply appreciated
commitments take me to the final point I want to address today. The
ESTC has been an enormous public undertaking on an unprecedented
international scale. How did it happen? It is because of a strong and
continuing agreement by the world of learning and the agencies that
support that world. The 1976 conferences in London and Washington, D.C.
sponsored by the National Endowment for the Humanities and the British
Library, and our 1984 conference held in this very room (3),
have affirmed the need, the goals, and the investment. I trust this
meeting will reaffirm them.
A project like the ESTC can only be made possible by
a large-scale and continuing commitment of major funding agencies. It
is only because we have great national agencies like the National
Endowment for the Humanities and the Department of Education, and
equally great philanthropic foundations such as Andrew W. Mellon,
Ahmanson, and Pew Charitable Trusts that it has all come to pass. They
did not back off. They have stayed the course, and the University of
California at Riverside has taken the critical step to guarantee our
success. I do want to reiterate in closing that it is only the
existence of these great national funding agencies and major nonprofit
organizations, both universities and libraries, that makes projects like
the ESTC possible. It is the success of these large-scale projects that
justifies the continuing existence of these organizations. It is a
symbiotic relationship of the best kind. Long may it continue!
Notes
1. In the three years between the delivery of this paper and the
preparation of this manuscript for publication, much
of what I anticipated has come to pass. A mirror copy of the ESTC is
now maintained in our office. Our database manager, Alain Veylit, has
developed a number of user-friendly and sophisticated forms of access.
As well as adding data directly online, we also enter it into the
in-house file and then batch upload the new data periodically into
RLIN. We are able to make all sorts of extractions and analyses that
enable us to perform our task better. We can receive machine-readable
files from contributing libraries and match more than 50% of the
records directly. The remainder, most of which could be one of several
printings, still have to be matched by hand. We are constantly
expanding our use of the in-house file.
2. David F. Foxon, English Verse 1701–1750: A Catalogue of
Separately Printed Poems with Notes on Contemporary Collected Editions
(London; Cambridge University Press, 1975).
3. Trustees Room, New York Public Library at 42nd and Fifth Avenue,
New York City, NY.