Borrowing by LC classification FY 2012-13


Working with historic circulation data the other day I discovered that affiliates of the Washington Research Library Consortium borrowed some 20,653 monograph titles from Mason’s libraries during FY 2012-13.  Thinking about Mason’s collections and those of fellow WRLC members, I assumed that the strength of our STEM collections would surely push those sorts of titles to the top of any “what gets borrowed sorted by LC classification” list…

Did that happen?

Not exactly.   Here’s an excerpt of the list that aggregates the number of titles borrowed by each book’s LC classification number.

To explain,  the first entry shows we circulated 1,067 titles with the  LC classification stem of PN (e.g.,  PN 4748.G7 C48 1997).
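For anyone who wants to try this kind of tally, here is a minimal Python sketch of grouping call numbers by their LC class stem. It is not the query behind the list: the function names (`lc_stem`, `titles_by_stem`) and every sample call number except the PN example are assumptions made for illustration.

```python
import re
from collections import Counter

# The leading one to three letters of an LC call number form its class stem (e.g., "PN").
_STEM = re.compile(r"^\s*([A-Z]{1,3})")

def lc_stem(call_number):
    """Return the alphabetic class stem of an LC call number, or None if there is none."""
    match = _STEM.match(call_number.upper())
    return match.group(1) if match else None

def titles_by_stem(call_numbers, minimum=1):
    """Count titles per stem, keeping only stems with at least `minimum` titles."""
    counts = Counter(stem for stem in map(lc_stem, call_numbers) if stem)
    return [(stem, n) for stem, n in counts.most_common() if n >= minimum]

# Hypothetical sample; only the first call number appears in the post.
sample = [
    "PN 4748.G7 C48 1997",
    "PN 1995 .B5 2001",
    "QA 76.73 .P98 L88 2010",
    "HV 6431 .S63 2005",
]
print(titles_by_stem(sample))
```

Raising `minimum` to 100 would mimic the cutoff used for the list, under the assumption that each call number represents one borrowed title.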

LC classifications with a circulation of at least 100 titles:


The top ten subject classifications account for 8,311 titles or just a fraction over 35% of all circulation to WRLC members. The first STEM subject appears in position #7 (the QAs) and not again until #24 on the list (Medicine – Internal Medicine). The STEM classifications shown in this list total 1,980 titles (or about 8% of all titles circulated). I can think of a couple of alternative explanations for this phenomenon:

  • STEM research, for the most part, doesn’t seem to involve books, or
  • WRLC member libraries have very strong STEM collections and don’t need to borrow books from Mason

A colleague pointed out yet another possibility: my 100-circulation threshold may be suppressing the count for STEM titles–given that those areas tend to have more finely grained LC classifications.

And finally, my friend Dorothea Salo (@LibSkrat) pointed out that “need it now” might be the reason STEM borrowers don’t request titles from other libraries:


More research needed to nail that last one down.

[update 3/28/2014]  Today my colleague Theresa Calcagno pointed out another factor that I hadn’t thought about–it might be that e-books in STEM subjects are suppressing the number of circulations for those subject areas.

I reworked my SQL and ran a variation of the query to get numbers on titles borrowed by Mason-affiliated borrowers during the same period: broken out as before by each title’s LC classification.   This dataset included 66,733 distinct titles (the WRLC dataset had 20,653 titles).    As an aside, some 6,357 titles were common to both datasets.
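A rough idea of what such a query could look like, sketched in Python with an in-memory SQLite database. The `loans` table, its columns, and the sample rows are all hypothetical (the real circulation schema isn't shown here), and the sketch counts distinct titles rather than individual circulations.

```python
import sqlite3

# Hypothetical schema and rows; the real circulation tables are not shown in the post.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loans (title_id INTEGER, lc_stem TEXT, borrower_group TEXT)")
conn.executemany(
    "INSERT INTO loans VALUES (?, ?, ?)",
    [
        (1, "PN", "WRLC"), (2, "PN", "WRLC"), (3, "QA", "WRLC"),
        (2, "PN", "MASON"), (3, "QA", "MASON"), (4, "QA", "MASON"),
        (4, "QA", "MASON"),  # a repeat loan of the same title
    ],
)

def titles_by_class(conn, group, minimum=1):
    """Distinct titles borrowed by one borrower group, per LC class stem."""
    return conn.execute(
        """
        SELECT lc_stem, COUNT(DISTINCT title_id) AS titles
        FROM loans
        WHERE borrower_group = ?
        GROUP BY lc_stem
        HAVING COUNT(DISTINCT title_id) >= ?
        ORDER BY titles DESC, lc_stem
        """,
        (group, minimum),
    ).fetchall()

def titles_in_common(conn):
    """Number of titles borrowed by both borrower groups."""
    return conn.execute(
        """
        SELECT COUNT(*) FROM (
            SELECT title_id FROM loans WHERE borrower_group = 'WRLC'
            INTERSECT
            SELECT title_id FROM loans WHERE borrower_group = 'MASON'
        )
        """
    ).fetchone()[0]

print(titles_by_class(conn, "MASON"))
print(titles_in_common(conn))
```

The `HAVING` clause plays the role of the chart threshold, and the `INTERSECT` subquery is one way to arrive at a "titles common to both datasets" figure.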

Here’s the result for any class with 450+ circulations:

QA (computer books for the most part) jumped up to #2 on this chart, but otherwise the STEM titles just don’t seem to circulate like those in the Social Sciences and Humanities (the next best showing for a STEM discipline was TK at #19). I’m ready to say I have an answer for what I’ll call the “Salo postulate.”


I’m now wondering what, if anything, the similarities and differences in these two lists can tell me about collection strengths and weaknesses at Mason and WRLC libraries.

1 million+



Last week we logged the 1,036,791st search on our Primo system. I think I’m most impressed by the nearly vertical line between the start of this semester (August 24) and early October.

“But aren’t you supposed to minimize redundancy in a database…”


I’m building an SQL database to help with assessment of library services.   Today’s autodidactic activity involved counting the number of students by status (undergrad, grad or law) based on the declared home address in the registrar’s database.

Finally nailed the SQL syntax but as I looked more closely at the results, excitement waned…


I suppose there are other ways to mangle Washington, DC during data entry.  Maybe if I were to just count DC in the “state” field–surely that would be a reasonable proxy for living in the District, right?


Ordinarily, yes, but that student from Bangalore, India, DC still leaves us with an off-by-one error.
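One way around both problems is to require the city and the state to agree before counting a record. The sketch below is only an illustration: the helper `is_dc_address`, the row layout, and the misspellings are invented, and matching on a "wash" prefix is a crude assumption that real registrar data would probably outgrow.

```python
import re
from collections import Counter

def is_dc_address(city, state):
    """Treat a record as a District address only if city and state both agree."""
    city_key = re.sub(r"[^a-z]", "", city.lower())
    state_key = re.sub(r"[^a-z]", "", state.lower())
    return state_key == "dc" and city_key.startswith("wash")

# Hypothetical rows of (status, city, state); the spellings are invented for illustration.
students = [
    ("undergrad", "Washington", "DC"),
    ("grad", "WASHINGTON DC", "DC"),
    ("law", "Washingtn", "D.C."),
    ("grad", "Bangalore, India", "DC"),
    ("undergrad", "Fairfax", "VA"),
]

# Count District residents by student status.
dc_by_status = Counter(
    status for status, city, state in students if is_dc_address(city, state)
)
print(dict(dc_by_status))
```

With this check the Bangalore record drops out even though its state field says DC, while the mangled Washington spellings still count.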



EZproxy Logs and Assessment


Seems librarians are more interested in assessment today than was the case just a few years ago.  There are many reasons for the quickening of interest but I suspect most cluster around one or the other of these themes:

  • a sense that libraries need to justify their existence, relevance, etc.
  • everybody’s talking about data so you need to have some to be seen as serious

In this sort of environment, the prudent course is to prepare for a dramatic rise in “Can we get some numbers on this?” questions. If you haven’t heard them yet they’re surely coming.

One way to get ready is to build in ways to measure new services as you develop them. Another is to look at data you already have and see if it can be enhanced to deliver a more compelling usage metric.

For my library, one option worth investigating–and the only place where I know I can capture every bit (and byte) of what’s happening with e-resources for off-campus users–is our EZproxy server. Is there a new way to look at the activity logs on that system? Let’s see…


690,000 searches




This graph charts the 690,000 searches conducted on our Primo system since January 1, 2013.   I am happy to see that vertical jump during the final week tracked by the graph–the first week of this term.


4%

We placed a search widget on our library’s home page once Primo went “live”–pretty standard fare for libraries that implement a discovery product. Our search widget looks like this:

[Screenshot: the search widget on the library home page]

You’ll notice we’ve put in a lot of explanatory text (which, of course, no one reads) and a number of options. That little “locally-held collections” box was added so a user could limit search results to just our Voyager catalog and our DSpace and LUNA systems–reducing the noise that enters a result set when you include Primo Central Index content.

Thanks to some logging this widget performs, we know that since January 8, 2013, it has been used to launch 104,186 searches.  For 4,180 of them, the “Limit to locally-held collections” box was also checked.

Which means our usage stats show that our little “limit” checkbox gets ignored 96% of the time.   Should be easy to make the case that we should just remove it, but still…

  • it is used in 4% of searches
  • it likely performs a useful function for the 4% that select it
  • it imposes no real penalty if you choose to ignore it
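The 4% and 96% figures come straight from the two widget-log counts quoted above. As a quick check of the arithmetic (the variable names are mine, not anything from the logging code):

```python
total_searches = 104_186   # searches launched from the widget since January 8, 2013
limited_searches = 4_180   # searches with the "locally-held collections" box checked

share_limited = limited_searches / total_searches
print(f"box checked: {share_limited:.1%}")
print(f"box ignored: {1 - share_limited:.1%}")
```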

I understand why some lean toward a search box that offers no options and very little explanation–just enter something and see what you get.   I also appreciate the fact that you can offer a user so many choices and options that all you’ve really done is increase the odds that they’ll choose the wrong thing.

What I haven’t quite figured out is when it’s right to toss a useful feature that you know only a small percentage of people use.


Content Neutrality


Participated in a panel at ALA last Saturday:

“Hiding in Plain Cite: The Growing Importance of Content Neutrality in Library Discovery Services”

Roger Schonfeld served as moderator. Joining me on the panel were Lisa O’Hara, University of Manitoba; Todd Carpenter, Executive Director of NISO; and Amira Aaron, Northeastern University.

One of the questions I took the lead on was, “What does Content Neutrality Mean to You?”   Here’s my response:

I’m sure most have heard the phrase “net neutrality”: a network model that says bandwidth providers should treat all data that moves across their network in the same way. It is certainly true that many ISPs are just in the bit-moving business, providing network access, but a smaller percentage also provide content. It’s that vertical integration (meaning significant parts of the supply chain fall under the same owner) that gives rise to trouble.

For example, in a net-neutral world:

  • Comcast as an access provider shouldn’t shape traffic in such a way that Netflix video streams end up slower than content flowing from Comcast’s own Xfinity platform.
  • Verizon shouldn’t count Amazon video streams against a user’s data allowance while exempting the same user from cap charges on a FIOS video stream.

“Content neutrality” is a similar idea.  Our “access provider” in this instance is the discovery platform vendor. The analogs to traffic shaping or billing distortions occur instead around the metadata that’s being searched to “discover” relevant content. As with ISPs and net neutrality, there are some companies that just provide a discovery platform and others that are also in the content business.  As before, vertical integration and perceptions of competitive advantage are problem incubators.


Primo Searches


Eleven weeks in, two things are clear:

1) use of Primo is increasing

2) researchers begin winding down a week before Spring Break and appear to need another week to get back into the swing of things.


[Graph: searches on Primo]


Seems we now average roughly 30,000 Primo searches a week.   Over half of the 250+ public workstations in our libraries still default to our “classic catalog” (Voyager) so I suspect we haven’t yet hit Peak Primo.


update…


Spent the past few days watching everyone move out of the area (due to impending construction of a new wrap-around expansion of Fenwick Library, various offices are emptying ahead of construction).  I’m still inside the office to the right (with the box near the door) but everyone else has moved to the Johnson Center library.

Plus ça change.

Over the next month, I’m sure we’ll get back to the way things looked just before we moved to Fenwick late in 2006.


In other news, we finally got a new Sun server installed over in the campus data center and for the past two days I’ve been learning more and more about ZFS. Built a working mirrored ZFS pool today that’s currently backing up our Voyager server (via NFS). Speaking of which, with two thirds of our server farm colo’d in the data center, I’m seeing the difference gigabit ethernet makes. When I back up a Fenwick-based server (we’re still 100BaseT over here) to a unit in Aquia (gigabit), it seems quite sluggish compared to moving files between the machines in the data center. Maybe we can upgrade to gigabit as they construct the Fenwick addition?

Thankfully, our proxy server (the hardest working machine in the library) now lives on the gigabit backbone.


As shown below, we continue to see building interest/usage for our Primo installation. This past month (January 14 – February 13) we’re averaging 2,300 searches per day. Earlier this week, we hit a daily high (4,554). That high occurred on a Tuesday, demonstrably our busiest day. I’ve added a red dot for each Tuesday in this graph.



Picking up speed


A short note to follow up on the Primo usage graph from a few weeks ago.

This graph shows the number of searches per day from Saturday, January 12th through Thursday, January 31st. We’re now regularly seeing well over 2,000 searches per day, with Wednesdays and Thursdays the busiest (the red dots are Thursdays).