For your textual pleasure…

Earlier today I was running a few SQL queries against our local Voyager system, preparing for the upcoming metadata migration to a consortial implementation of Alma. My tool of choice for this sort of thing is Navicat, and as I worked through a series of “count this for me” queries, like…

how many bib records have NULL in the NETWORK_NUMBER field? 54,995
how many have an OCLC number in that field? 1,640,304
exactly how many bib records are there in the database? 3,490,929

…I realized that Navicat made exporting data in a variety of formats reasonably trivial. Thinking it might be useful for people sharpening their text-mining chops in our new Digital Scholarship Center (2nd floor, Fenwick Library), I decided to build a text file of brief bibliographic data (author, title, publisher, date, etc.) from the 3+ million records in our Voyager database. A simple click in a checkbox produced both JSON and XML versions of the metadata.
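For anyone curious what those “count this for me” queries look like, here is a rough sketch using Python's built-in sqlite3 module against a tiny in-memory table. The table name BIB_TEXT, the sample rows, and the "(OCoLC)" prefix test are my assumptions for illustration only; the real queries ran against the Voyager database through Navicat and may well have been phrased differently.

```python
import sqlite3

# Toy stand-in for the catalog. The table name BIB_TEXT and these rows are
# assumptions for illustration, not the real Voyager schema or data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE BIB_TEXT (BIB_ID INTEGER PRIMARY KEY, NETWORK_NUMBER TEXT)"
)
conn.executemany(
    "INSERT INTO BIB_TEXT (BIB_ID, NETWORK_NUMBER) VALUES (?, ?)",
    [
        (1, None),
        (2, "(OCoLC)12345678"),
        (3, "(OCoLC)87654321"),
        (4, "(DLC)88012345"),
        (5, None),
    ],
)


def count_bibs(connection):
    """Run the three counting queries and return the results as a dict."""

    def scalar(sql):
        # Each query returns a single row with a single COUNT(*) value.
        return connection.execute(sql).fetchone()[0]

    return {
        "total": scalar("SELECT COUNT(*) FROM BIB_TEXT"),
        "null_network_number": scalar(
            "SELECT COUNT(*) FROM BIB_TEXT WHERE NETWORK_NUMBER IS NULL"
        ),
        # Assumes OCLC numbers are stored with an "(OCoLC)" prefix.
        "oclc_network_number": scalar(
            "SELECT COUNT(*) FROM BIB_TEXT WHERE NETWORK_NUMBER LIKE '(OCoLC)%'"
        ),
    }


counts = count_bibs(conn)
print(counts)
```

Note that the NULL and OCLC counts need not add up to the total, since records can carry some other kind of value in that field.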

The zipped versions of these files are roughly 200MB each.

Click the link below to retrieve the JSON recordset.

https://dl.dropboxusercontent.com/u/166896/MasonCatalog.json.zip

XML? Click below…

https://dl.dropboxusercontent.com/u/166896/MasonCatalog.xml.zip

Sample record in the JSON version of the file

The XML version has a couple more data fields (LCCN and SERIES) if available in a record.
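If you want a quick first look at the JSON file before pointing real text-mining tools at it, something like the following Python sketch might help. It makes no assumptions about field names; it simply tallies how often each field is populated. Two things are guesses on my part: that the archive holds a single .json file, and that the records are either a top-level list or a list wrapped in a single object. The function names load_records and field_counts are just illustrative.

```python
import json
import zipfile
from collections import Counter


def load_records(zip_path, member=None):
    """Return the list of record dicts from a zipped JSON export."""
    with zipfile.ZipFile(zip_path) as zf:
        # Default to the first .json member found in the archive.
        name = member or next(
            n for n in zf.namelist() if n.lower().endswith(".json")
        )
        with zf.open(name) as fh:
            # Reads the whole file into memory; expect this to be slow
            # and memory-hungry for a multi-million-record export.
            data = json.load(fh)
    if isinstance(data, list):
        return data
    # Assumption: otherwise the records sit in the first list-valued key
    # of a wrapper object.
    return next(v for v in data.values() if isinstance(v, list))


def field_counts(records):
    """Count, per field name, how many records have a non-empty value."""
    counts = Counter()
    for rec in records:
        counts.update(k for k, v in rec.items() if v not in (None, ""))
    return counts


# Example usage, assuming the download was saved in the working directory:
#   records = load_records("MasonCatalog.json.zip")
#   print(len(records))
#   print(field_counts(records).most_common())
```

Loading everything at once is the simplest approach, not the most frugal one; a streaming parser would be kinder to your RAM if the file proves too big.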

If you end up using this data for anything useful (or need a slightly different extract), send me a tweet.
