CF101: Writing an RSS Aggregator
Completing the task
By: Jeffry Houser
Aug. 31, 2006 10:00 AM
Two months ago I put together an article about building an RSS aggregator (CFDJ, Vol. 8, issue 5). Before reading this one, you might want to refresh your memory of the original article, which you can read at http://coldfusion.sys-con.com/read/235976.htm.
The last article stepped you through the thought process of designing the database and object model. We built two of the components: an RSSCategory component that is used to categorize the RSS feeds and an RSSFeed component that is used to enter an RSS feed into the database. We also wrote some admin code to enter a new RSS feed into the database. Unfortunately, the article lacked the real meat of the project, which is the RSSAggregator component. In this article, we'll flesh it out, along with the item component and the scheduled task for running it. Before we do that, I did find one bug, so let's fix that first.
One Quick Bug Fix
The code from the previous article read values out of the feed with expressions like this:
MyXMLVar.rss.channel.title.xmltext
The problem here is that the root element, RSS, is hard coded. When I tried to run this code against the weblogs.macromedia.com site, it didn't work. The reason is that the feed offered by weblogs.macromedia.com is RDF. The root element isn't RSS, it is rdf:RDF. The fix for this was easy:
MyXMLVar.xmlroot.channel.description.xmltext
Instead of hard coding the RSS root name, I used the xmlroot value. RDF and RSS handle items differently too, so this will come into play in some of the code from this article.
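To make the difference concrete, here is a minimal sketch that parses two tiny feeds and reads the channel title from each. The feed strings are made-up samples trimmed to the bare minimum, and the variable names rssText, rdfText, rssDoc, and rdfDoc are mine, not from the original code.

```cfml
<cfscript>
// Hypothetical, heavily trimmed sample feeds (not real feed content).
rssText = '<rss version="2.0"><channel><title>RSS sample</title></channel></rss>';

// A literal pound sign must be doubled (##) inside a CFML string.
rdfText = '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns##">'
        & '<channel><title>RDF sample</title></channel></rdf:RDF>';

rssDoc = XmlParse(rssText);
rdfDoc = XmlParse(rdfText);

// Hard-coded root name: this expression only works for the RSS document.
WriteOutput(rssDoc.rss.channel.title.XmlText & "<br>");

// XmlRoot: the same expression works for both documents.
WriteOutput(rssDoc.XmlRoot.channel.title.XmlText & "<br>");
WriteOutput(rdfDoc.XmlRoot.channel.title.XmlText & "<br>");
</cfscript>
```

Because XmlRoot refers to whatever the top-level element happens to be, the code no longer cares whether that element is named rss or rdf:RDF.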
Writing the Item Component
The component starts with the cfcomponent tag (of course) and the pseudo constructor code. The pseudo constructor sets up the instance variables of the component. Once again, our components borrow Hal Helms's BaseComponent from http://halhelms.com/webresources/BaseComponent.cfc. I use this instead of writing getter and setter methods by hand.
Other than the inherited methods, this component contains an init method and a commit method. The init method takes an ItemID and the datasource and loads all the relevant information from the database. The commit method inserts or updates the information in the database as needed.
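The shape of such a component might look roughly like the sketch below. It is not the original listing: the table name Items, the columns ItemID, Title, and Link, and the use of 0 to mean "not saved yet" are all assumptions for illustration. It also leaves out the BaseComponent inheritance, so the getters and setters the article relies on are not shown.

```cfml
<cfcomponent displayname="Item" hint="Sketch of a single feed item">

    <!--- Pseudo constructor: set up the instance variables --->
    <cfset variables.instance = StructNew()>
    <cfset variables.instance.ItemID = 0>
    <cfset variables.instance.Title = "">
    <cfset variables.instance.Link = "">
    <cfset variables.instance.dsn = "">

    <!--- Load an existing item; an ItemID of 0 (assumed) means a new item --->
    <cffunction name="init" access="public" returntype="any" output="false">
        <cfargument name="ItemID" type="numeric" required="true">
        <cfargument name="dsn" type="string" required="true">
        <cfset var qItem = "">

        <cfset variables.instance.dsn = arguments.dsn>

        <!--- Items, Title, and Link are hypothetical names --->
        <cfquery name="qItem" datasource="#arguments.dsn#">
            SELECT ItemID, Title, Link
            FROM Items
            WHERE ItemID = <cfqueryparam value="#arguments.ItemID#" cfsqltype="cf_sql_integer">
        </cfquery>

        <cfif qItem.RecordCount GT 0>
            <cfset variables.instance.ItemID = qItem.ItemID>
            <cfset variables.instance.Title = qItem.Title>
            <cfset variables.instance.Link = qItem.Link>
        </cfif>

        <cfreturn this>
    </cffunction>

    <!--- Insert a new row or update the existing one --->
    <cffunction name="commit" access="public" returntype="void" output="false">
        <cfif variables.instance.ItemID EQ 0>
            <!--- Retrieving the new ItemID is database specific and omitted here --->
            <cfquery datasource="#variables.instance.dsn#">
                INSERT INTO Items (Title, Link)
                VALUES (
                    <cfqueryparam value="#variables.instance.Title#" cfsqltype="cf_sql_varchar">,
                    <cfqueryparam value="#variables.instance.Link#" cfsqltype="cf_sql_varchar">
                )
            </cfquery>
        <cfelse>
            <cfquery datasource="#variables.instance.dsn#">
                UPDATE Items
                SET Title = <cfqueryparam value="#variables.instance.Title#" cfsqltype="cf_sql_varchar">,
                    Link = <cfqueryparam value="#variables.instance.Link#" cfsqltype="cf_sql_varchar">
                WHERE ItemID = <cfqueryparam value="#variables.instance.ItemID#" cfsqltype="cf_sql_integer">
            </cfquery>
        </cfif>
    </cffunction>

</cfcomponent>
```

Keeping the insert-or-update decision inside commit means calling code never has to know whether the item is already in the database.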
Creating the Aggregator
The first method is GetAllFeeds. It is a private method, so it cannot be called from outside the CFC. It runs a query to retrieve all the feeds that are being watched in the database, and it returns that query. There is nothing special about this.
The second method is called ItemExists. It accepts an item component (which you learned about in the previous section of this article) and checks whether the item already exists in the database. If it does, the method returns true; otherwise it returns false. I made the assumption that each item has a unique URL pointing to it, so that is the value the code checks to decide whether the item is unique.
The third method is an init method. This is the one that retrieves the feeds and stores the data in the database, if relevant. It is the only public method in the component; the GetAllFeeds and ItemExists methods are used by init. The init method starts by setting some local variables.
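As a rough illustration of the two private helpers described in this section, a sketch might look like the following. The table names RSSFeeds and Items, the column names, and the dsn argument are assumptions. To keep the example self-contained, ItemExists here takes the item's URL as a string rather than an item component, which is a simplification of what the article describes.

```cfml
<cfcomponent displayname="RSSAggregator" hint="Sketch of the private helper methods">

    <!--- Return every feed being watched (hypothetical table and columns) --->
    <cffunction name="GetAllFeeds" access="private" returntype="query" output="false">
        <cfargument name="dsn" type="string" required="true">
        <cfset var qFeeds = "">

        <cfquery name="qFeeds" datasource="#arguments.dsn#">
            SELECT FeedID, FeedURL
            FROM RSSFeeds
        </cfquery>

        <cfreturn qFeeds>
    </cffunction>

    <!--- True when an item with this URL is already stored --->
    <cffunction name="ItemExists" access="private" returntype="boolean" output="false">
        <cfargument name="itemLink" type="string" required="true">
        <cfargument name="dsn" type="string" required="true">
        <cfset var qCheck = "">

        <cfquery name="qCheck" datasource="#arguments.dsn#">
            SELECT ItemID
            FROM Items
            WHERE Link = <cfqueryparam value="#arguments.itemLink#" cfsqltype="cf_sql_varchar">
        </cfquery>

        <cfreturn qCheck.RecordCount GT 0>
    </cffunction>

</cfcomponent>
```

Checking on the URL alone is only as reliable as the assumption that every item has a unique link.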
Next comes a try block. Inside the try block is code to retrieve the RSS feed. If the feed times out, a catch block switches the error value to true. After the try block, if the error value is false, the code processes the feed. If the error value is true, the code skips the processing and goes right to the next feed. Although left out at this time, there should probably be some sort of logging for feeds that cause errors.
Earlier I spoke about the differences between RSS and RDF. Most blogs I read pass out data in the RSS format, but weblogs.macromedia.com was using RDF. You can read more about RDF at www.w3.org/RDF/. In RSS, items are stored inside the channel. In RDF they are not. The ItemArray is initialized differently depending on the root. (If this code tries to parse another flavor of XML, it will cause problems.)
The next code block uses cfloop to loop over the ItemArray. It creates an item object using the tempitem variable. It sets the relevant instance data, then it uses the ItemExists function to check whether or not the item exists yet. If it doesn't, the commit method is run to save the data. Otherwise, nothing happens. The loop ends, and the method returns true. This is simple stuff, right?
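The flow described above (fetch inside a try block, flag errors, pick the item location based on the root, then loop) can be sketched as a single function. This is not the original init method: the function name getFeedItems, the 10-second timeout, and returning an array of structs are my assumptions, and the sketch walks XmlChildren instead of assigning an ItemArray directly. The ItemExists and commit calls are only marked with a comment, since they depend on the item component.

```cfml
<cffunction name="getFeedItems" returntype="array" output="false">
    <cfargument name="feedURL" type="string" required="true">
    <cfset var error = false>
    <cfset var httpResult = "">
    <cfset var feedXML = "">
    <cfset var parentNode = "">
    <cfset var node = "">
    <cfset var entry = "">
    <cfset var items = ArrayNew(1)>
    <cfset var i = 0>

    <!--- Retrieve and parse the feed; any failure flips the error flag --->
    <cftry>
        <cfhttp url="#arguments.feedURL#" method="get" timeout="10"
                throwonerror="true" result="httpResult" />
        <cfset feedXML = XmlParse(httpResult.FileContent)>
        <cfcatch type="any">
            <cfset error = true>
            <!--- A real version would probably log the failing feed here --->
        </cfcatch>
    </cftry>

    <!--- On error, skip processing so the caller can move on to the next feed --->
    <cfif error>
        <cfreturn items>
    </cfif>

    <!--- RDF keeps items directly under the root; RSS keeps them in the channel --->
    <cfif feedXML.XmlRoot.XmlName EQ "rdf:RDF">
        <cfset parentNode = feedXML.XmlRoot>
    <cfelse>
        <cfset parentNode = feedXML.XmlRoot.channel>
    </cfif>

    <cfloop index="i" from="1" to="#ArrayLen(parentNode.XmlChildren)#">
        <cfset node = parentNode.XmlChildren[i]>
        <cfif node.XmlName EQ "item" AND StructKeyExists(node, "link")>
            <cfset entry = StructNew()>
            <cfset entry.Link = node.link.XmlText>
            <cfset entry.Title = "">
            <cfif StructKeyExists(node, "title")>
                <cfset entry.Title = node.title.XmlText>
            </cfif>
            <!--- The article's version would call ItemExists and commit here --->
            <cfset ArrayAppend(items, entry)>
        </cfif>
    </cfloop>

    <cfreturn items>
</cffunction>
```

Returning an empty array on failure lets the caller treat a broken feed the same way as a feed with no items.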
The Scheduled Task
The scheduled task is a short cfscript block. It is probably one of the easiest scheduled tasks I've ever written.
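For a sense of scale, the page that the scheduler requests could be as small as the sketch below. The component path RSSAggregator, the datasource name MyFriendDSN, and the idea that init takes the datasource as its only argument are all assumptions, not details from the original listing.

```cfml
<cfscript>
// Hypothetical component path and datasource name.
aggregator = CreateObject("component", "RSSAggregator");

// init retrieves every watched feed and stores any new items.
aggregator.init("MyFriendDSN");
</cfscript>
```

All the real work lives in the component, which is why the task itself stays this short.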
Conclusion
Every good project needs a code name, and I decided to give this project one. After some deep soul-searching, I've decided to name this project MyFriend. There are two reasons for this. The first is that I modeled the whole idea after the LiveJournal friends list. The second is that "My Friend" is the name of a song that my first band recorded the first time we went into a recording studio. It was my first time in a professional recording studio, and I had been playing bass for less than a month. The results came out better than you might have expected, really. I'm a sentimental freak. Check out www.jeffryhouser.com for the latest version of this code and let me know what you think.