DSpace and Omeka


I continue thinking about how to exploit a user-friendly tool like Omeka to enable library staff to build inviting exhibitions of digital objects—more particularly, those stored in our MARS (DSpace) repository.

We already stretch DSpace a bit: we use it both for the e-publishing functions common to most IRs and as a place to archive digital objects that may have nothing at all to do with scholarly communication (e.g., digital images from our Special Collections area, a few web sites, various data files, audio files, source code archives and so on).

This dual use of DSpace was a conscious decision from the start: we didn’t think the open access function would generate sufficient use or interest in the system quickly enough to give us an immediate and unmistakable win. By including a bit of digital archiving along the way, the library took an active role in building content and made MARS a much more interesting destination. One of these days the open access activity on campus will catch up with the support infrastructure we’re building, but there’s clearly value in keeping those disk drives spinning with related tasks until that happens. But I digress…

We also use DSpace as the “front door” for some of these collections, but it’s far from an optimal solution. While we have an attractive local instance of DSpace (thanks to Dorothea for the initial CSS work), there just isn’t that much you can do to make DSpace look like anything more than what it is: a list-centric online catalog of digital objects.

To use a metaphor from the world of art, DSpace gives us an unadorned catalogue raisonné, and I’d like to be able to offer visually interesting exhibition catalogs.

Having this ability would enable us to focus our DSpace installation on those things it does well (bitstream agnosticism, an OAI source for Google Scholar, OAIster and similar services, a standards-compliant metadata repository, a single store to back up and so on) and use other tools to enhance the discovery and utility of some of the objects stored therein.

I think I may have figured out a strategy of sorts, and I’m posting it here in the hope of getting feedback (and offers of assistance if it strikes the right note with any reader).

[Image: kludge.jpg]

A year or so ago I downloaded a copy of Harvester from the Public Knowledge Project at Simon Fraser University in western Canada. At the time I used their open source product to build a “union” catalog of several local DSpace installations including our own MARS system. It worked well and I still log in and do an update “crawl” of the contributing systems from time to time.

Now I’m thinking we could perhaps modify the open-source “harvest.php” code that ships with Harvester2 to pull metadata from our MARS system via the OAI protocol and use it to populate an Omeka database. Yes, we could just write a Postgres -> MySQL conversion utility to capture our DSpace metadata (which lives in a Postgres database) but I think an OAI import module might prove more useful to the Omeka community (after all, there are DSpace sites that don’t use Postgres and OAI sources that don’t use DSpace).
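To make the idea a little more concrete, here is a rough Python sketch of the harvesting half: build an OAI-PMH ListRecords request for unqualified Dublin Core and flatten the response into plain dictionaries that an importer could write into Omeka. This is not the harvest.php code, and it says nothing about Omeka's actual schema. The function names, the dictionary layout and the sample record are all made up for illustration; only the OAI-PMH verb, the oai_dc prefix and the namespace URIs come from the published specifications.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace URIs defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}
DC_PREFIX = "{" + NS["dc"] + "}"


def list_records_url(base_url, from_date=None):
    """Build a ListRecords request URL (hypothetical helper)."""
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    if from_date:
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Turn a ListRecords response into a list of flat dictionaries.

    Records flagged as deleted are skipped here. The output layout is an
    assumption for this sketch, not anything Omeka prescribes.
    """
    root = ET.fromstring(xml_text)
    items = []
    for record in root.iterfind(".//oai:record", NS):
        header = record.find("oai:header", NS)
        if header is None or header.get("status") == "deleted":
            continue
        item = {
            "oai_identifier": header.findtext("oai:identifier", default="", namespaces=NS),
            "datestamp": header.findtext("oai:datestamp", default="", namespaces=NS),
            "fields": {},
        }
        for el in record.iterfind("oai:metadata/oai_dc:dc/*", NS):
            if not el.tag.startswith(DC_PREFIX):
                continue
            name = el.tag[len(DC_PREFIX):]
            value = (el.text or "").strip()
            if value:
                # Dublin Core elements are repeatable, so keep a list per field.
                item["fields"].setdefault(name, []).append(value)
        items.append(item)
    return items


# A made-up response used only to exercise the parser.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:1234/56</identifier>
        <datestamp>2008-01-01T00:00:00Z</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample photograph</dc:title>
          <dc:identifier>http://hdl.handle.net/1234/56</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header status="deleted">
        <identifier>oai:example.org:1234/57</identifier>
        <datestamp>2008-01-02T00:00:00Z</datestamp>
      </header>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()

if __name__ == "__main__":
    for item in parse_records(SAMPLE_RESPONSE):
        print(item["oai_identifier"], item["fields"])
```

A real harvester would also have to follow the resumptionToken that OAI-PMH uses to page through large result sets; that is left out to keep the sketch short.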

Once an Omeka database of items was built using the DSpace metadata, non-technical staff could log into Omeka and build exhibits. While the metadata imported into Omeka would form the basis of the exhibition, when the time came to display a particular digital object, we’d have Omeka slipstream the bits from our DSpace repository (following the handle back to DSpace for the appropriate item).
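"Following the handle back" could start with something as simple as the Python sketch below: look through an item's harvested dc:identifier values for a handle URL and turn it into the item's address on the local DSpace server. The repository hostname is a placeholder and the helper names are invented for illustration. The sketch only assumes that items are served under /handle/ on the repository, and that handle URLs either point at hdl.handle.net or already contain /handle/ in their path.

```python
import re
from urllib.parse import urlparse

# A handle looks like "prefix/suffix", where the prefix is digits and dots.
HANDLE_RE = re.compile(r"^[\d.]+/\S+$")


def extract_handle(identifier):
    """Return 'prefix/suffix' if the identifier is a handle URL, else None."""
    parsed = urlparse(identifier.strip())
    path = parsed.path.lstrip("/")
    if parsed.netloc == "hdl.handle.net":
        candidate = path
    elif path.startswith("handle/"):
        candidate = path[len("handle/"):]
    else:
        return None
    return candidate if HANDLE_RE.match(candidate) else None


def local_item_url(repo_base, handle):
    """Build the item page URL on a given DSpace server."""
    return repo_base.rstrip("/") + "/handle/" + handle


def item_url_for(identifiers, repo_base):
    """Pick the first identifier that is a handle and map it to the local server."""
    for value in identifiers:
        handle = extract_handle(value)
        if handle:
            return local_item_url(repo_base, handle)
    return None


if __name__ == "__main__":
    # "mars.example.edu" is a placeholder hostname, not the real MARS address.
    ids = ["photo-0042", "http://hdl.handle.net/1234/56"]
    print(item_url_for(ids, "https://mars.example.edu/"))
```

This only gets as far as the item page; it does not pick out a particular file within the item.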

I can think of a few gotchas right off the bat:

  • DSpace handles don’t resolve to a particular bitstream (although it is possible to code an algorithm that moves from handle URL to bitstream URL on a given server). Figuring out the “appropriate” bitstream from an object that contains multiple streams is a problem we’ll have to deal with.
  • It will probably be necessary to store at least a thumbnail of images in the Omeka database—not only to reduce network traffic on browse pages but also to simplify the task of staff selecting items to use in a particular exhibition.
  • We’d have to figure out a mechanism to allow subsequent updating of the Omeka database to reflect changes in our DSpace installation (additions, edits, etc.).
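For the last of these, OAI-PMH already offers a hook: ListRecords accepts a "from" argument, so a repeat harvest can ask only for records added, changed or (where the repository reports them) deleted since the previous run. A minimal Python sketch of the bookkeeping might look like this. The in-memory dictionary stands in for whatever tables Omeka actually uses, and the record layout and function names are assumptions made for the example.

```python
def apply_harvest(store, records):
    """Apply one batch of harvested records to a local store.

    store   -- dict keyed by OAI identifier (a stand-in for the Omeka database)
    records -- iterable of dicts with 'oai_identifier', 'datestamp',
               'deleted' and 'fields' keys (a hypothetical layout)

    Returns the most recent datestamp seen, or None for an empty batch.
    OAI-PMH datestamps are UTC ISO 8601 strings, so plain string comparison
    orders them correctly as long as they share one granularity.
    """
    latest = None
    for rec in records:
        key = rec["oai_identifier"]
        if rec.get("deleted"):
            store.pop(key, None)          # item withdrawn from the repository
        else:
            store[key] = rec["fields"]    # insert new item or overwrite edits
        stamp = rec["datestamp"]
        if latest is None or stamp > latest:
            latest = stamp
    return latest


def next_request_params(last_datestamp=None):
    """Arguments for the next ListRecords call; 'from' is omitted on a first run."""
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    if last_datestamp:
        params["from"] = last_datestamp
    return params


if __name__ == "__main__":
    store = {"oai:example.org:1234/1": {"title": ["Old title"]}}
    batch = [
        {"oai_identifier": "oai:example.org:1234/1",
         "datestamp": "2008-03-01T10:00:00Z", "deleted": False,
         "fields": {"title": ["Corrected title"]}},
        {"oai_identifier": "oai:example.org:1234/2",
         "datestamp": "2008-03-02T09:30:00Z", "deleted": False,
         "fields": {"title": ["New item"]}},
    ]
    last = apply_harvest(store, batch)
    print(next_request_params(last))
```

Because the "from" bound is inclusive, the next run may fetch the newest record again; overwriting by identifier keeps that harmless. Whether deletions show up at all depends on how the repository advertises its deleted-record support.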

Taking the easy way out, I’ll just assume these are the sorts of optimizations we’ll tackle once we know whether the basic functionality can be achieved. We’re now at the “delve deeper into the harvest.php code from PKP” stage, and I’ll report back when we have something of wider interest…