A year or so ago I gave a presentation to our university’s President’s Library Planning Task Force–a group of faculty charged with helping describe where the library needs to be in 2010. The focus of my contribution was technology in the library–where we’ve been and where we’re headed. In the course of that, I took a fairly lengthy detour to talk about federated searching–not because we make much use of it but because I wanted the group to better understand the reasons we haven’t embraced it more fully. After all, that single search box is a seductive thing.
No need to rehash the arguments here (a QuickTime video of the presentation slides is available in our MARS repository if you have both excess bandwidth and nothing better to do), but the theme was a simple one: federated searching just doesn’t work very well. Not everything gets searched; compromises are made that tend to dumb down the underlying target systems; true de-duping is virtually impossible; relevancy ranking loses its value because it is based only on the metadata the search returned, not on the underlying content; and so on. I was essentially talking about the difference between what results from “Just in Time” searching as opposed to “Just in Case” indexing.
I still think these problems and limitations are present in the metasearch products marketed to libraries but I’m beginning to see a future for metasearching that I was missing. Instead of relying on a centrally-hosted metasearch intermediary, why not pull the search piece back to the desktop where more flexible and powerful tools can be employed? In effect, this local application becomes your personal agent–searching the net in the background as you go about other tasks. If well designed, it can then handle de-duplication of results, rank them based on relevancy and present them to you in ways that transcend what a browser-based system can offer.
For the past week I’ve been reviewing an application that moves us toward that goal. It’s an OS X application (still sporting a “Panther” interface) but what it runs on is far less interesting than what it does.
Note to Windows users: head over to Copernic for a similar product. Ironically, Copernic was once the go-to product for this sort of thing on the Mac but they abandoned the platform when OS X arrived.
Out of the box, DEVONagent is a desktop metasearch engine for the open web–Google, Yahoo! and Metacrawler but also newsgroups, blogs and so on. You set the level of depth you want and enter your search. You can specify a fast scan (e.g., the top 100 results from just a few search engines) or go deep (you set the maximum number of hits per source across a long list of target options), with complex Boolean logic fully supported. DEVONagent runs the search, gathers the results, throws away junk hits, eliminates the “404s”, de-dupes the remaining content and then ranks the results. That alone is impressive but it’s really only the start.
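To make the gather, discard, de-dupe and rank sequence concrete, here is a minimal Python sketch of the general idea. It is not DEVONagent’s algorithm, which I haven’t seen; the function names, the URL normalization rules and the reciprocal-rank scoring are all my own assumptions, and the scoring in particular looks only at each engine’s ordering rather than at page content.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url):
    """Reduce a URL to a canonical form so trivial variants compare equal.

    Assumed rules (illustrative only): lowercase scheme and host, drop a
    leading "www.", drop a trailing slash, drop the #fragment.
    """
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme.lower(), host, path, parts.query, ""))


def merge_results(results_by_engine, dead_urls=()):
    """Merge per-engine hit lists into one de-duplicated, ranked list.

    results_by_engine maps an engine name to its URLs in that engine's own
    rank order. dead_urls holds links already known to be broken (the 404s).
    """
    dead = {normalize_url(u) for u in dead_urls}
    scores = {}
    for urls in results_by_engine.values():
        for rank, url in enumerate(urls, start=1):
            key = normalize_url(url)
            if key in dead:
                continue  # throw away known-dead links
            # Hypothetical scoring: sum of reciprocal ranks, so a page found
            # near the top of several engines rises in the merged list.
            scores[key] = scores.get(key, 0.0) + 1.0 / rank
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda k: (-scores[k], k))
```

Given hit lists from two engines that both return the same page under slightly different URLs, the page appears once in the merged list and outranks pages found by only one engine. Ranking on the content of the retrieved pages, as a desktop tool is free to do, would replace the scoring line.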
It also builds an interactive topic map that shows the primary concepts from the now-merged set of results. Click a topic in the map and you see lines drawn to others that are closely related. The text box below the map changes with each click, displaying excerpts from matching pages in the result set–each time highlighting the relevant topic in context. Click a new topic in the map and everything changes.
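I don’t know how DEVONagent actually computes its topic map, but simple co-occurrence counting gives a feel for how “closely related” topics might be found. The sketch below is a guess at the general technique, not the program’s method; the stopword list, the `topics` helper and the `related_topics` function are all made up for illustration.

```python
import re
from collections import Counter

# A tiny, illustrative stopword list; a real tool would use a far larger one.
STOPWORDS = frozenset({"the", "a", "an", "and", "of", "to", "in", "is", "for", "on"})


def topics(text):
    """Return the set of candidate topic words found in a page's text."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w for w in words if len(w) > 2 and w not in STOPWORDS}


def related_topics(pages, topic, top_n=3):
    """List the words that most often share a page with the given topic."""
    counts = Counter()
    for page in pages:
        words = topics(page)
        if topic in words:
            counts.update(words - {topic})
    # Most frequent first; ties broken alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_n]]
```

Clicking a topic in the map would then amount to calling `related_topics` with that word and drawing a line to each word returned.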
DEVONagent also offers the ability to save individual documents or the entire set to a local database (called the internal archive) for future use. Pages can be converted to RDF documents if desired.
If you already use DEVONthink (a personal information management package), the “send to DEVONthink” button on the toolbar comes in quite handy. I’m a Yojimbo user (can’t seem to give up the .Mac database syncing Yojimbo provides) so I removed the DEVONthink button, but the idea of integrating this sort of program with a local database manager is a good one. I get matches into Yojimbo via printing (selecting “PDF to Yojimbo” as my printer). It’s also possible to use the “Launch URL” service to send the link to Firefox for inclusion in a Zotero database.
But wouldn’t it be great if DEVONagent could also search things like JSTOR, FirstSearch and other “restricted” content–like those centrally hosted federated search systems I complained about earlier? Yes, and an XML plugin architecture is built into DEVONagent for just that purpose. It’s also here that the program comes up a bit short. Unfortunately, the “build your own plugin” function is poorly documented and much harder to configure and use than it should be. I was surprised to find that you can’t just go to DEVONtechnologies and download various XML plugins (in the way you can grab connection files at endnote.com). Reading through the support forums, I realize that this is increasingly what users are asking for, so I guess there’s hope. Perhaps DEVONtechnologies thinks bringing order to the chaos of open web searching is sufficient achievement, but I’d argue this could be a killer application with just a bit more work.
I did finally manage to build and use a JSTOR plugin but only after benefitting from a post in a DEVON forum and finally guessing the correct file extension to use when saving my plugin (turns out DEVONagent was looking for .plist instead of .xml). I was using TextMate to build my XML file instead of Apple’s Property List Editor so I didn’t get the automatic .plist extension.
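For anyone else hand-building plugin files, one way to avoid the extension trap is to let a script write the property list. The sketch below uses Python’s standard `plistlib` module to write a dictionary as an XML property list and force the `.plist` extension. The `save_plugin` function is my own invention, and the keys in the usage note are placeholders; I am deliberately not reproducing DEVONagent’s real plugin keys here, so check an existing plugin for those.

```python
import plistlib
from pathlib import Path


def save_plugin(settings, path):
    """Write settings as an XML property list, forcing a .plist extension.

    settings is a plain dict; its keys are whatever the target program
    expects (not specified here). Returns the path actually written.
    """
    # Replace any extension (e.g. .xml) with .plist, or add it if missing.
    target = Path(path).with_suffix(".plist")
    with open(target, "wb") as fh:
        plistlib.dump(settings, fh, fmt=plistlib.FMT_XML)
    return target
```

Calling `save_plugin({"ExampleName": "JSTOR", "ExampleSearchURL": "https://example.org/search?q="}, "jstor.xml")` writes `jstor.plist` next to where `jstor.xml` would have gone. Both key names are hypothetical.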
Note to DEVONtechnologies: If your program is going to ignore plugins that don’t carry a particular file extension you should probably mention the extension to use at least once in the documentation.
Deep Web / Proxied
What I haven’t been able to do is use DEVONagent to search a site like JSTOR when it’s behind a proxy/authentication server. I hope to solve this problem soon (and I’m close) but it’s uncharted territory. Like many libraries, we use EZproxy to deliver content to authenticated off-campus users and the proxy-by-port-number scheme appears to be complicating things. I’ve had some luck using DEVONagent’s built-in web browser (built with Apple’s Cocoa WebKit) to authenticate with EZproxy before running an agent search but things still don’t work quite right. I’ll post a fix if/when I figure this out.
I don’t want to end this discussion on a negative note. DEVONagent is a great program and after using it for a few days I can honestly report that I’m going directly to Google much less often. Not only am I bypassing advertisements and sponsored links, I’m able to do other things while I let my “agent” handle it. There are still many parts of the program that I’ve not yet explored/mastered and a couple more video tutorials I want to watch on the DEVONtechnologies site but I’m confident I’ll continue to rely on this program. I also highly recommend the downloadable PDF documentation from the developer’s site–it will definitely improve your use of the program.
A fully functional 30-day demo of DEVONagent is available. Should you decide to register the product, it sells for $49.95. Educational users can receive a 25% discount.