Shhhh….Theory Testing In Progress

It is such a pleasure to work with people as smart and creative as Dan Cohen over at the Center for History and New Media. Yesterday’s announcement (read about it here, here, or read what Dan blogged yesterday) is the latest manifestation of an idea I know he’s been refining for several years.

Back in 2004, I met with Dan to talk about ways I might help the group working on SmartFox (Zotero’s original name) capture bibliographic data from our web-based library catalog. Truth is, they didn’t really need any help from me, but it was fun to spend time with them talking about the project’s future.

In one of those meetings I suggested that it would be really cool if a researcher could somehow expose his local database to other researchers across the network (I had in mind some sort of embedded SmartFox service–you’d give a person access to your machine and they’d be able to query the items in your local database that you tagged “shareable”). It might have been an interesting idea but it wasn’t terribly original or very ambitious. I guess that explains why Zotero doesn’t do something like that today.

You want original and ambitious? How about a site where thousands of researchers are uploading and downloading gigabytes of data without benefit of quality standards, agreed-upon metadata guidelines, normalized naming conventions, and so on? Chaos, right? Well, maybe not. As I read Dan’s post, I think this new Internet Archive partnership has game-changing potential.

Speaking as a librarian, it might be just what our profession needs to drive home the fact that it’s time to stop worrying so much about exchanging metadata and start focusing more energy on facilitating the actual exchange and use of digital objects.

As I understand it, the “metadata” for these objects will be drawn from OCR scans and (I’m guessing here) some sort of tagging (perhaps drawn from the contributor’s local Zotero database or something created on the IA server by the user community). There will be persistent URLs, which will have embedded timestamp metadata, but that’s about it. The actual content of the objects will be the primary source of both discovery and organization.

Won’t it be instructive to see if this turns into an unholy mess or a self-tuning, extremely valuable resource? Either way, we’ll have lots of new data on this question: “If we have full access to content, just how important is metadata, anyway?”