
Digitizing CUA's Student Newspaper


Other Digitization Services

Olive Software

Founded in 2000, Olive is the company WRLC-DCPC used to digitize The Eagle, the American University student newspaper. Olive can digitize and store archives and convert PDFs into searchable XML while preserving the original layout and full text of the items. Olive software also lets archive users search headlines without downloading PDFs and print or e-mail articles.

A demo is available on Olive's website.

Libraries that have used Olive Software include The British Library, Brooklyn Public Library, and Israel National Library.

Universities that have used Olive Software include Gettysburg College, Penn State University, Oxford University, and Princeton University.

Greenstone

Greenstone began as a project of the New Zealand Digital Library at the University of Waikato in New Zealand. Greenstone is open-source software that organizes information and publishes it on the Internet as a searchable, metadata-driven library. It runs on multiple platforms and has two interfaces, Reader and Librarian. The Librarian interface includes predefined metadata formats such as Dublin Core and RFC 1807 and allows other metadata schemas, such as MARC, XML, and BibTeX, to be added. Greenstone also accepts plug-ins for a variety of text and multimedia files.
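To make the idea of a Dublin Core record concrete, here is a minimal sketch in Python of what item-level metadata for a single newspaper issue might look like. It is an illustration only, not Greenstone's own import format: the wrapper element name `record`, the helper `build_dc_record`, and the date and identifier values are assumptions made up for the example. Only the Dublin Core element names and namespace come from the standard.

```python
import xml.etree.ElementTree as ET

# Namespace of the 15 simple Dublin Core elements.
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_dc_record(fields):
    """Build an XML element holding one Dublin Core element per (name, value) pair.

    The wrapper tag "record" is a hypothetical choice for this sketch.
    """
    ET.register_namespace("dc", DC_NS)
    record = ET.Element("record")
    for name, value in fields:
        element = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        element.text = value
    return record


# Hypothetical description of one issue; the date and identifier are invented.
issue_fields = [
    ("title", "The Tower"),
    ("publisher", "The Catholic University of America"),
    ("date", "1950-01-01"),
    ("type", "Text"),
    ("identifier", "tower-1950-01-01"),
]

record = build_dc_record(issue_fields)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

A record like this is the kind of descriptive metadata a Librarian-style interface collects for each item, so that readers can later search by title, date, or publisher.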

Libraries that have used Greenstone include Auburn University Library, Detroit Public Library, and WRLC Special Collections.

Flexible Extensible Digital Object Repository Architecture (FEDORA)

Originally created by Cornell University and later developed jointly by Cornell University and the University of Virginia, FEDORA is open-source software whose services include search engine integration, format conversion, bulk ingest, content versioning, and dynamic views of digital objects. FEDORA provides a "Lego-like" set of digital object building blocks. Access to the objects comes through disseminators, which can deliver either a particular portion of a digital object or a customized view of it. The building blocks also support uniform management of and access to multiple content formats such as text, images, and multimedia.

FEDORA has been used by Tufts University, the University of Virginia, and the University of Maryland.

Apex CoVantage

Apex CoVantage is an excellent, for-purchase option for digitizing The Tower archival collection. Its content conversion services are adaptable to a variety of materials and provide archival-quality scanning, optical character recognition (OCR), transcription, and re-keying. In particular, it converts microfilm, microform, electronic data and image files (including XML, SGML, PDF, JPEG, TIFF, and GIF), and handwritten content. Its SmartRef technology works with electronic publications to create interactive links.

Because much of its business is with universities, Apex CoVantage offers unique services that would benefit The Tower. Chief among these is its option to publish in print and electronically at the same time, which would allow for easier interoperability between the current Tower materials and the planned archives. In addition, the company recognizes the unique cultural value of newspapers and has created the Global Newspaper Initiative. This program is dedicated to preserving newspaper content in countries where funding for digitization is not available.

Apex CoVantage also offers a useful webinar on its website, titled "Avoiding the Pitfalls of Newspaper Digitization."

Eprints Free Software

Eprints Software is a downloadable program that allows users to create their own online repository. It is open-source software that allows for a wide range of customizations. The software is intuitive, easy to use, and highly interoperable. It uses a name authority to prevent common metadata typing mistakes. Users can easily import a variety of record types, including XML, BibTeX, and EndNote. It also integrates well with metadata interchange formats (Dublin Core, METS, and MODS) and external services (Google Maps and SIMILE Timeline).

There are a number of free support services on the site, including a wiki, an announcements mailing list, and a technical mailing list. As with other free software projects, fellow users provide most of the assistance. Though the software is free to download and use, Eprints also offers a number of fee-based premium support options, such as training seminars and customized help with issues like policy formation and site maintenance.

DSpace

DSpace is a free online platform that captures, distributes, indexes, and preserves data in any format. It provides a way to manage and organize a variety of materials in a professionally maintained repository, and it was designed specifically for higher education institutions. DSpace is extremely customizable, and its use by a large number of organizations helps ensure its continued viability and visibility. According to the DSpace website, files ingested in the near future will be recognized and validated via the Global Digital Format Registry (GDFR) and JHOVE application tools.

Unlike many of the other services, however, DSpace requires that The Tower staff be able to use and manipulate its tools on their own. Support is available through the website, but it comes from other users who answer questions posted on the DSpace-General mailing list and who provide content on the wiki pages.