Perl Case Study - News Grabber CGI
By Lisa Hui
The libwww-perl bundle includes two modules that fetch web pages for you, which turns out to be quite handy: LWP::Simple and LWP::UserAgent. Load whichever one you need with a use statement near the top of your script (for example, use LWP::Simple;).
LWP::Simple has a get() function that returns the HTML source of the URL you pass it, as one long string. To see how it works, try a simple test script like this:
use LWP::Simple;
$_ = get("http://www.thinkquest.org");   # fetch the page into the default variable
print "Content-type: text/html\n\n";     # CGI header, followed by a blank line
print;                                   # with no arguments, print outputs $_
See this script in action: lwpsimple-test.cgi
Running this short script on your server loads what appears to be the ThinkQuest front page in your browser. Check the URL - it isn't a redirection; the script simply prints a copy of the HTML found at the specified URL.
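One thing the test script glosses over is that get() returns undef when the fetch fails. Here is a minimal sketch of one way to handle that; the cgi_response helper and its fallback message are assumptions made for illustration, not part of the original script:

```perl
use strict;
use warnings;
use LWP::Simple qw(get);

# Hypothetical helper: wrap a page body in a minimal CGI response.
# get() returns undef on failure, so fall back to a short message.
sub cgi_response {
    my ($html) = @_;
    my $header = "Content-type: text/html\n\n";
    return $header . (defined $html ? $html : "<p>Could not fetch the page.</p>");
}

# Fetch the same URL as the test script and print the result.
print cgi_response(get("http://www.thinkquest.org"));
```

Keeping the header logic in a small subroutine means the script always sends a valid response to the browser, whether or not the remote page was reachable.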
Also notice that the whole page is stored in a single string variable (the default one, $_). But how do you get what you want out of this string? You'll want to use substitution expressions (pattern matching) to remove the unwanted data.
[Note: the script would not run on this server - possibly because the bundle files are not installed here]
Since the page is in the default variable, the substitution expressions don't have to name it: writing s///s; on its own is the same as explicitly writing $_ =~ s///s;. The leading "s" stands for substitution: whatever matches the pattern between the first two slashes is replaced with the text between the second and third slashes (leave that part empty to simply remove the match). The trailing "s" is a modifier that lets . match newline characters too, so a single pattern can span several lines of HTML.
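To make the substitution step concrete, here is a small sketch that trims a page down to one section. The marker comments, the trim_to_news name and the sample page are all hypothetical; a real page would need patterns written against its own HTML:

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the text between two marker comments.
sub trim_to_news {
    local $_ = shift;                 # work on the default variable, as in the text
    s/^.*<!-- news start -->//s;      # same as: $_ =~ s/^.*<!-- news start -->//s;
    s/<!-- news end -->.*$//s;        # drop the end marker and everything after it
    s/<[^>]+>//g;                     # strip remaining tags (crude, for illustration)
    s/^\s+|\s+$//g;                   # trim leading and trailing whitespace
    return $_;
}

# Made-up sample page standing in for the string returned by get().
my $page = "<html><body>menu<!-- news start --><b>Sample headline</b>\n"
         . "<!-- news end -->footer</body></html>";

print trim_to_news($page), "\n";      # prints "Sample headline"
```

Each line uses the bare s/// form, which acts on $_ just as the explicit $_ =~ s/// form would.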
What's the difference between LWP::Simple and LWP::UserAgent, then? LWP::Simple's get() can only make GET requests (in which the data is passed in the URL itself), whereas LWP::UserAgent can also send POST requests, built with the help of HTTP::Request::Common, and read back the response.
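For a rough idea of what the POST route looks like, here is a minimal sketch. The URL and the query form field are placeholders invented for this example, so expect the request to fail unless you point it at a real form handler:

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request::Common qw(POST);

# Hypothetical endpoint and form field, for illustration only.
my $request = POST 'http://www.example.com/cgi-bin/search.cgi',
                   [ query => 'perl' ];

# The user agent sends the request and hands back an HTTP::Response object.
my $ua       = LWP::UserAgent->new(timeout => 10);
my $response = $ua->request($request);

if ($response->is_success) {
    print $response->content;                                # body of the reply
} else {
    print "Request failed: ", $response->status_line, "\n";  # e.g. a 404 or a timeout
}
```

Unlike get(), which gives you only a string or undef, the response object also carries the status line, so the script can report why a request failed.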
We're not going to go into that just yet. We did cover what we set out to do - a quick run-through of how the simple module can "grab" news from a page - and I'll let you know when this section gets an overhaul.
Last Updated August 16, 1999
©1999 Team 26297 "Ad Infinitum Web." All rights reserved. Any reproduction of this document for commercial or redistribution purposes without the permission of the author is forbidden.