July 21, 2004

Scripting for hotties

As i previously mentioned, i had some difficulty installing the LWP Perl module, but i finally got it working before i went to bed yesterday. I wanted it installed because, for the first time, i had a reason to write a Perl script from scratch. So far, the only scripts i've run are ones I copied out of a book. This was the first time i had a problem that Perl seemed like the perfect tool for.

Here's what i was doing. I was going through the Civic Theatre web site logs, and i noticed an unusual number of referrals from the domain pickthehottie.com. I checked out the site, and it's one of those sites where you can post your picture to see how "hot" you are: your face is shown next to someone else's, and the casual web surfer votes on who is more attractive. The limited reporting options available for the GRCT site were unable to point me to the picture responsible for this traffic. I clicked around for a bit to see if i could stumble across it, but after spending more time on it than i should have, i decided this method was ineffective. I needed a way to automate the process. I chose Perl.

Perl makes it very easy to grab the HTML from a web server via the LWP::Simple module. I just call the get() function with something similar to "$html = get($pageURL);" and i'm ready to extract information. If you enter a particular picture code, the site will show you which pictures that one recently lost to or beat. I grabbed one of those pages and, via the magic of regular expressions, extracted the URL of each competing image as well as its corresponding picture code. I stored the picture codes i came across in a hash and "pushed" a link to each one's own results page onto an array. I then "shifted" through the values in the array to determine which page to crawl next. I used another regular expression to check each image URL i came across to see if it contained the "grct.org" domain. When i found one that did, the script would quit and show me the corresponding picture code.
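For the curious, here is a minimal sketch of that kind of crawl. It is not the actual script: the results URL, the example.com domain, and the HTML pattern are all made up for illustration, since the real page layout isn't described here. The fetcher is passed in as a code reference so the loop can be tried against canned pages, and it falls back to LWP::Simple's get() otherwise. The page limit is an extra safety net added for the sketch.

```perl
use strict;
use warnings;

# Hypothetical results-page URL; the real site's layout is an assumption.
my $RESULTS_URL = 'http://www.example.com/results?code=';

# Default fetcher: LWP::Simple::get, loaded only when it is actually used.
sub fetch_page {
    my ($url) = @_;
    require LWP::Simple;
    return LWP::Simple::get($url);
}

# Crawl outward from $start_code. Returns the picture code whose image URL
# contains $target, or undef if the queue runs dry or $max_pages is reached.
sub find_picture {
    my ($start_code, $target, $fetch, $max_pages) = @_;
    $fetch     ||= \&fetch_page;
    $max_pages ||= 500;

    my %seen  = ($start_code => 1);             # codes already queued
    my @queue = ($RESULTS_URL . $start_code);   # results pages left to visit
    my $pages = 0;

    while (defined(my $url = shift @queue)) {
        last if $pages++ >= $max_pages;
        my $html = $fetch->($url);
        next unless defined $html;              # skip pages that failed to load

        # Assumed markup: <a href="...code=CODE"><img src="IMAGE_URL"></a>
        while ($html =~ /code=(\w+)"[^>]*>\s*<img\s+src="([^"]+)"/gi) {
            my ($code, $img) = ($1, $2);
            return $code if $img =~ /\Q$target\E/i;   # found the referring picture
            next if $seen{$code}++;                   # already queued this code
            push @queue, $RESULTS_URL . $code;
        }
    }
    return;
}
```

Calling find_picture('A1', 'grct.org') would start from a (made-up) picture code A1 and crawl until an image URL containing grct.org turns up. The hash keeps the same picture from being queued twice, which matters because two pictures that faced each other link back and forth.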

It took me a bit of tinkering to get my script up and running. (My biggest source of mistakes seemed to be semicolon-related.) When i finally got something that worked, i went to the site, grabbed a picture code from the home page to start with, and let it run. Surprisingly, it took about seven hops and came back with my answer. It worked like a charm.

Admittedly, this was probably a silly thing to use Perl for, but it's nice to know i've added a few more tricks to my bag in case a more important task should arise. Oh yeah - the guilty party? Laurie. He he he.

[Update - 7/22/04] I went ahead and added the script to my code lab.

Posted by Matthew at July 21, 2004 11:44 PM