RSS feed from Voyager

I had a request, so here is the Perl code for creating the ‘on-the-fly’ RSS feed based on a call number prefix.

This script uses the intermediate newbooks.txt file that Michael Doran’s newbooks program creates. It isn’t necessarily the best Perl code, but if you realize that, you probably have the power to fix it up. I gave some thought to optimizing the thing (in terms of where the processing gets done), but I’m sure it could be tuned further.

Basically, the URL to call the feed looks like this:

http://myserver.edu/cgi-bin/newrss.pl?QA76

which will produce a feed of books whose call number begins with QA76. It uses the ‘squashed’ call number (without any spaces) to do the comparison, so whatever follows the ‘?’ in the URL must appear at the very start of the call number for the book to be retrieved.
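Before the full script, here is the matching rule in isolation. This is only a rough sketch: the helper names `stem_from_query` and `matches_stem` are made up for illustration, and stripping whitespace from the query is my assumption (the script itself only turns ‘+’ into a space).

```perl
use strict;
use warnings;

# Hypothetical helper: turn a raw QUERY_STRING into an upper-case stem.
sub stem_from_query {
    my ($query) = @_;
    $query = '' unless defined $query;
    $query =~ tr/+/ /;                               # '+' means space in a URL query
    $query =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;   # decode %XX escapes
    $query =~ s/\s+//g;   # assumption: squash spaces so "QA 76" also works
    return uc $query;
}

# Hypothetical helper: true only when the squashed call number STARTS with the stem.
sub matches_stem {
    my ($call, $stem) = @_;
    return 0 if $stem eq '';
    return index($call, $stem) == 0 ? 1 : 0;
}

my $stem = stem_from_query('qa76');
for my $call ('QA76.73.P22', 'TK5105.QA76', 'QA766.L3') {
    printf "%-14s %s\n", $call, matches_stem($call, $stem) ? 'in feed' : 'skipped';
}
```

The second call number is skipped because QA76 appears in it but not at the start. The third one is included, which is worth knowing: a stem of QA76 also picks up QA766, so use a longer stem (QA76. with the period, say) if you want to keep those out.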

#!/m1/shared/perl/5.8.5-09/bin/perl
# this program processes a flat file created by M. Doran's 
# newbooks system (newbooks.txt)
# author of this program:
# w. grotophorst, (c) 2005, Lost Packet Planet
# Program may be freely copied, modified & improved.
#########################################
#
# less variable variables
#
$fromlink = "http://lso.gmu.edu/index.php";
$inputfile = "newbooks.txt";
$NumToFeed = "15";
$URL2Voyager = "http://magik.gmu.edu/cgi-bin/Pwebrecon.cgi?BBID=";

#########################################

$ToFind = $ENV{'QUERY_STRING'};
$ToFind =~ tr/+/ /;
$ToFind =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C",hex($1))/eg;
$ToFind =~ tr/\cM/\n/;
# upper-case the stem (s/[a-z]/[A-Z]/ would insert the literal text "[A-Z]")
$ToFind =~ tr/a-z/A-Z/;

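# the stem comes straight from the URL, so escape it before it goes into the XML
$titlefind = $ToFind;
$titlefind =~ s/&/&amp;/g;
$titlefind =~ s/</&lt;/g;
$titlefind =~ s/>/&gt;/g;
$titlestr = "New Books - University Libraries - ".$titlefind;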
open(INFILE,$inputfile) or die "Cannot open $inputfile: $!";
print "Content-type: text/xml\n\n";

print "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n";
print "<!DOCTYPE rss PUBLIC ";
print "\"-//Netscape Communications//DTD RSS 0.91//EN\"\n";
print "\"http://my.netscape.com/publish/formats/rss-0.91.dtd\">\n";
print "<rss version=\"0.91\">\n";
print "<channel>\n";
print "<title>$titlestr</title>\n";
print "<link>$fromlink</link>\n";

$line = <INFILE>;
$foundit = 0;
$numfed = 0;

while ($line ne ""){

if ($numfed < $NumToFeed) {

# check string to see if characters matching call# stem appear anywhere, if
# not, go on to next line from newbooks.txt 


$t = index($line,$ToFind);


if ($t >= 0) {

    # call # stem is in the line, blow it apart & see if it is actually in call#
    # section...the 8th data element that M. Doran's system puts on the line
    $line =~ s/&/&amp;/g;
    $line =~ s/>/&gt;/g;
    $line =~ s/</&lt;/g;
    $line =~ s/"/&quot;/g;
    $line =~ s/'/&apos;/g;

        
    @itemdata = split(/\t/,$line);


    $call = $itemdata[7];

    $t = index($call,$ToFind);
    if ($t == 0) {

    # now assign other 'split' values from the itemdata array

    $bibid = $itemdata[0];
    $author= $itemdata[1];
    $title = $itemdata[2];
    $publ  = $itemdata[4];
    $location = $itemdata[5];

    $numfed++;

    print "<item>\n";
    print "<title>$title</title>\n";
    print "<link>$URL2Voyager$bibid</link>\n";
    print "<description>$author $title $publ $location $call</description>\n";
    print "</item>\n";
    }
   }
  }
 $line = <INFILE>;
 }

print "</channel>\n</rss>\n";
close INFILE;
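
If you want to poke at the record handling without a web server, here is a small stand-alone sketch of the two steps the script does to each matching line: split on tabs, then escape for XML. The helper names `xml_escape` and `parse_record` are hypothetical, and the sample record is invented. The field positions (0, 1, 2, 4, 5 and 7) are the ones the script uses; I don't use positions 3 and 6, so they are just placeholders here.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the five XML special characters ('&' must go first).
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    $text =~ s/'/&apos;/g;
    return $text;
}

# Hypothetical helper: split one tab-delimited line into named fields.
sub parse_record {
    my ($line) = @_;
    chomp $line;
    my @f = split /\t/, $line, -1;   # -1 keeps trailing empty fields
    my %rec;
    @rec{qw(bibid author title publ location call)} = @f[0, 1, 2, 4, 5, 7];
    return \%rec;
}

# Invented sample record, for illustration only.
my $sample = join "\t",
    '12345', 'Wall, Larry', 'Programming Perl & more', 'unused',
    "O'Reilly, 2000", 'Main Stacks', 'unused', 'QA76.73.P22W35';

my $rec = parse_record($sample);
print "<title>", xml_escape($rec->{title}), "</title>\n";
```

Escaping each field after the split, rather than the whole line before it, also means the call number comparison runs against the raw text.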

P.S. You may not know how hard it was to get this code listing to appear correctly in this blog entry…but if you do, then this will qualify as “tip of the week.”
I tried a couple of WordPress plugins, but nothing seemed to work right (the code was being rendered as HTML or, worse, WordPress was making bad assumptions about what I was trying to do). Finally, I opened the Perl script in SubEthaEdit on my desktop (to do some line shortening), and when I selected “all” and got ready to copy it back to an xterm window, there was the option I needed. It had never been more than a simple right mouse click away: “Copy as XHTML”. SubEthaEdit even threw in a bit of XHTML markup to make the little black box around the entry. Yet another reason that’s a great editor.
