VuFind and OSX

We received one of the new Intel-based Apple Xserve machines the other day, and I thought it would be great to build a test VuFind installation just to see how the application scaled on a server-class machine (two 2.66 GHz Woodcrest CPUs with 4 GB of memory). The test database I had been using (300,000+ records) was a strain on a small single-CPU desktop machine, and beyond seeing that the application worked, I hadn’t been able to stretch it with any confidence. I was eager to see what might happen on a faster server with a larger database.

We have several G5-based Xserves here in the office (running major applications like MARS, our database portal, the campus directory, our video server, and so on), but I was still surprised at how much faster the new Intel-based Xserve appeared to be. I soon had proof that it didn’t just seem faster. Here are Geekbench 2.0 scores for the two machines (for reference, a 1.6 GHz G5 scores 1000 on this test):

  • Intel Xserve, dual 2.66 GHz Woodcrest Xeons: 5093
  • G5 Xserve, dual 2.3 GHz G5: 2065

After installing PHP and VuFind and the Yaz modules, I extracted just over 1 million records from our Voyager system (any record changed or added since 1995) and ran them through the Solr indexing module. It “burned” through about 500,000 records per hour, completing the task in just over 2 hours.

Show-stopper

There’s a serious snag to moving VuFind to the Intel Xserve: there isn’t an Intel-native port of Oracle or the InstantClient libraries for OS X, so there’s no way to install the PDO_OCI driver for Oracle on this Xserve. All the Oracle code available for OS X is quite old and built for the PPC processor.

So, as I write this post, I have Solr running on the Xserve, providing the initial search results page as well as the facet search information you see after an initial search. When you select a particular record, you’re back on the little Shuttle XPC running Ubuntu, because the XML file for each MARC record is stored on the Linux server. Because that XML file is created during the Solr import (which happens on the Xserve), I had to manually tar up the /vufind/data directory on the Apple and move it to the little Ubuntu machine. A kludge, but it works. With any luck at all, I’ll soon hear that Oracle has decided to make good on their promise to support OS X and we’ll see Intel-native code, at which point I can move all of VuFind to the Xserve.

UPDATE: May, 2008. Oracle has now posted an x86 Intel-native InstantClient for OSX.

Oh, and here’s a tip you’ll need if you too head down this sordid path. There are so many files in the /usr/local/vufind/data directory (where the .xml files are stored) that tar and other utilities will probably cough and die, typically because a wildcard expands to more arguments than the shell can pass along. Here’s a workaround: write the file names to a list first, then hand tar the list. These sample shell commands build a compressed tar file of all files in the directory that begin with 6 and end with an .xml extension:

find . -name '6*.xml' > /tmp/6.files.txt
tar -cvzf 6.tar.gz --files-from /tmp/6.files.txt
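
If you need to repeat this for every leading digit, the two commands above can be wrapped in a small function. This is only a sketch: archive_by_prefix is a name I made up for illustration, and it assumes file names without embedded newlines and a tar that accepts -C and --files-from (GNU tar and the bsdtar on OS X both should).

```shell
#!/bin/sh
# archive_by_prefix DIR PREFIX OUTFILE
# Hypothetical helper (not part of VuFind): archives DIR/PREFIX*.xml into
# OUTFILE without ever expanding the file list on a command line.
archive_by_prefix() {
    abp_dir=$1
    abp_prefix=$2
    abp_out=$3

    # Temporary file that holds one relative path per line.
    abp_list=$(mktemp) || return 1

    # The pattern is quoted so that find, not the shell, does the matching.
    if ! ( cd "$abp_dir" && find . -type f -name "${abp_prefix}*.xml" ) > "$abp_list"; then
        rm -f "$abp_list"
        return 1
    fi

    # -C makes tar resolve the listed relative paths inside DIR.
    tar -czf "$abp_out" -C "$abp_dir" --files-from "$abp_list"
    abp_status=$?

    rm -f "$abp_list"
    return $abp_status
}

# Example use (commented out so that sourcing this file has no side effects):
# for digit in 0 1 2 3 4 5 6 7 8 9; do
#     archive_by_prefix /usr/local/vufind/data "$digit" "/tmp/$digit.tar.gz"
# done
```

Splitting by leading digit keeps each archive to a manageable size, and a failure on one prefix doesn’t force you to redo the others.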

Timing is everything

Ironically, just as I was discovering the “gotcha” with Intel-native code, I received an email from Christopher Jones at Oracle, offering to help me get the pdo_oci driver working. It turns out he’d stumbled across the iNode post where I mentioned his “Underground Guide to PHP and Oracle” and my pdo_oci difficulties (isn’t the internet great?). By the time I heard from Chris, I had actually gotten all that working, but hey, here was a chance to speak to someone at Oracle, so I made my pitch.

A few minutes later I heard back from Chris, who told me he’d spoken to the database management group again and asked for an Intel port. He said he talked with them about “my” application, and while he couldn’t relate what they said, he assured me that my “vote” had been duly noted. Thanks, Chris.

VuFind on OSX installation notes

Here are my notes on installing VuFind on OSX. The process depends on MacPorts (née DarwinPorts). Make sure you install PHP with the +apache2, +mysql5, and +pear switches or it won’t be easy to recover. I don’t use “sudo” in the examples below as I just log in as root. If you don’t do that, be sure to preface the commands with “sudo”, as in sudo port install apache2.

1. install developer tools (Xcode)

2. download and install MacPorts

3. port selfupdate

(very optional) port install joe [this gets the jstar editor we old Borland people like to use]

4. port install apache2 (will also add apr, db44, expat, libiconv, ncursesw, readline,
sqlite3, apr-util, zlib, openssl, pcre)

5. port install mysql5

6. port install php5 +apache2 +mysql5 +pear (adds pkg-config, curl, freetype,
gettext, jpeg, libmcrypt, libpng, libxml2, libxslt, mhash, tiff)

7. port install wget

8. You can install yaz-marcdump via port, but you get the 2.x version.

To build version 3.0.8 from source, use this sequence instead:

wget http://ftp.indexdata.dk/pub/yaz/yaz-3.0.8.tar.gz
tar -zxvf yaz-3.0.8.tar.gz
cd yaz-3.0.8
./configure
make
make install

9. [optional step] It isn’t essential, but if you link the yaz binaries into /opt/local/bin, they’ll show up in the same path as all the other binary files that get installed via MacPorts. I find that convenient but YMMV.

ln -s /usr/local/bin/yaz-marcdump /opt/local/bin/yaz-marcdump

10. pear upgrade pear

then:

pear install --onlyreqdeps DB
pear install --onlyreqdeps DB_DataObject
pear install --onlyreqdeps Structures_DataGrid-beta
pear install --onlyreqdeps Structures_DataGrid_DataSource_DataObject-beta
pear install --onlyreqdeps Structures_DataGrid_Renderer_HTMLTable-beta
pear install --onlyreqdeps HTTP_Client
pear install --onlyreqdeps HTTP_Request
pear install --onlyreqdeps Mail
pear install --onlyreqdeps Mail_Mime
pear install --onlyreqdeps Pager
pear install --onlyreqdeps XML_Serializer-beta
pear install --onlyreqdeps Console_ProgressBar-beta
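
If you’d rather not paste twelve commands one at a time, the package list above can be driven from a loop. Treat this as a sketch rather than a tested installer: install_pear_packages and the PEAR_CMD override are names I invented for illustration, and the only pear invocation used is the same install --onlyreqdeps call shown above.

```shell
#!/bin/sh
# PEAR packages from the list above, in the same order.
# A "-beta" suffix asks pear for the beta release of that package.
PEAR_PACKAGES="
DB
DB_DataObject
Structures_DataGrid-beta
Structures_DataGrid_DataSource_DataObject-beta
Structures_DataGrid_Renderer_HTMLTable-beta
HTTP_Client
HTTP_Request
Mail
Mail_Mime
Pager
XML_Serializer-beta
Console_ProgressBar-beta
"

# install_pear_packages (hypothetical helper): runs
# "pear install --onlyreqdeps PACKAGE" for each package and reports failures.
# PEAR_CMD is an illustrative override; set PEAR_CMD="echo pear" to preview
# the commands without installing anything.
install_pear_packages() {
    ipp_failed=""
    for ipp_pkg in $PEAR_PACKAGES; do
        # ${PEAR_CMD:-pear} is deliberately unquoted so "echo pear" splits
        # into a command and its first argument.
        if ! ${PEAR_CMD:-pear} install --onlyreqdeps "$ipp_pkg"; then
            ipp_failed="$ipp_failed $ipp_pkg"
        fi
    done

    if [ -n "$ipp_failed" ]; then
        echo "failed to install:$ipp_failed" >&2
        return 1
    fi
    return 0
}

# Example use (commented out so that sourcing this file installs nothing):
# install_pear_packages
```

The loop keeps going after a failure and lists the packages that didn’t install at the end, so one bad download doesn’t hide the rest.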

Note: Will come back and finish this note when the Oracle code becomes available.