The XPath Factor

XML and XPath

It's 3:00 P.M. on a sunny Saturday afternoon. The birds are chirping, the leaves are blowing, and you can hear the lake's waters breaking on its rocky shores. The sounds of a baseball game randomly crack in the distance, and the roar of competition erupts on the basketball courts nearby.

My burgers are just getting medium-well as my wife is returning from a little potty walk with the dog. Our blanket is set and our picnic looks like it will be wonderful. Who knew a Web geek like me could pull off such a seemingly perfect day? Interestingly enough, I owe much of its success to XML and XPath.

XPath is to XML what SQL is to databases. Databases would be quite pointless if you could not query information out of them, and the same holds true for XML documents. XPath is a language for finding information in XML documents. XPath provides access to all of the elements and attributes of an XML document. It became a World Wide Web Consortium (W3C) recommendation on November 16, 1999 and since then it has become a huge part of the XML world. It is a major element of the W3C's XSLT standard, and both XQuery and XPointer are built upon XPath expressions.

An expression is XPath's primary construct, a string that, in its basic form, resembles a file path. True XPath engines examine the expression and return a node-set (more on nodes later), Boolean, number, or string from the XML document. ColdFusion's implementation of XPath only supports the node-set return type. For more complete XPath support you can use one of the many Java libraries available on the Internet.

XML has reared its head all over the Web, and with XPath being such a cornerstone of the XML world, a solid understanding of it is crucial. XML can be found anywhere you need to provide data to, or acquire data from, a third-party vendor, client, application, or server. Web services, WDDX packets, and RSS feeds are all built on XML, and XPath can be used to tie any combination of them together. A good example of this type of integration is the weekend planner application I used to find the perfect Saturday activity.

The application reads an RSS feed of a local community events calendar. It leverages XPath to read the zip codes and dates of each event. These zip codes and dates are then sent to a Web service that provides the weather forecast for that area of town, on that day. That information showed me that a picnic at Blanchard Park on Saturday was a better idea than going to the carnival at the Central Florida Fair Grounds on Sunday.
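The flow described above can be sketched roughly as follows. This is a minimal illustration in Python rather than ColdFusion, using the standard library's xml.etree.ElementTree and its limited XPath support. The feed contents, the "zip" and "date" element names, and the fetch_forecast() helper are all assumptions made up for the example; the article's actual feed and Web service are not shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed: the <zip> and <date> element names are assumptions,
# not part of the RSS specification or of the article's actual feed.
feed_text = """<rss version="2.0"><channel>
  <item><title>Event A</title><zip>00001</zip><date>2000-01-01</date></item>
  <item><title>Event B</title><zip>00002</zip><date>2000-01-02</date></item>
</channel></rss>"""


def fetch_forecast(zip_code, date):
    """Placeholder for the weather Web service call; returns a dummy value."""
    return "unknown"


rss = ET.fromstring(feed_text)

events = []
# The path expression selects every <item> under <channel>.
for item in rss.findall("./channel/item"):
    zip_code = item.findtext("zip")
    date = item.findtext("date")
    events.append(
        (item.findtext("title"), zip_code, date, fetch_forecast(zip_code, date))
    )

for event in events:
    print(event)
```

In a real planner the placeholder function would call the forecast service, and the results would be compared to pick the best day and place.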

XPath can be used anytime you need a subset of data from a larger XML dataset. Its use is analogous to that of Regular Expressions, which retrieve a substring based on pattern matching within a string. They are very powerful, although one might not think so at first look, because ColdFusion exposes Regular Expressions as a single argument to a small handful of functions. A deeper look would reveal book after book dedicated to their intricacies. In the same manner, XPath makes its ColdFusion appearance as a lone argument of one function, yet has entire books dedicated to its deeper functionality.

XPath is leveraged via the function XmlSearch(). XmlSearch() requires two parameters: xmlDoc, an XML Object, and XPathString, an XPath expression. It returns an array, with each element of the array containing an XML node. As stated before, XmlSearch() cannot return strings, Booleans, or numbers.

The first parameter of XmlSearch() is an XML Object. When ColdFusion reads in XML, it treats it just like any other string. Calling the function XmlParse() on that string loads it into memory, then creates and returns a ColdFusion XML Object. Figure 1 shows the comparison of an unparsed XML document dumped on the left, and the same XML document parsed and dumped on the right.
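As a rough illustration of the string-versus-object distinction, here is a minimal sketch in Python rather than ColdFusion, with the standard library's xml.etree.ElementTree.fromstring() standing in for XmlParse(). The XML snippet and its element names are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML text; before parsing it is only a string.
raw_xml = "<picnic><item>burgers</item><item>blanket</item></picnic>"

# Parsing builds an in-memory tree (analogous in spirit to XmlParse()).
root = ET.fromstring(raw_xml)

print(type(raw_xml).__name__)           # the unparsed document is plain text
print(root.tag)                         # the parsed document exposes its root element
print([child.text for child in root])   # and that element's children
```

Only the parsed form can be navigated or queried by element name; the raw string would have to be picked apart with string functions.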

Now that our XML has been converted to an object, we can run our first XPath expression against it. In order to understand how to write XPath expressions, we must understand and identify the different parts and elements that make up an XML document.

The easiest way to identify the parts of an XML document is to compare it to a file structure. If we were to run the command "C:\inetpub\wwwroot" in Windows, the OS would open the "wwwroot" directory, showing all of its contents. In this example, "C:" of course is a reference to the disk, and "inetpub\wwwroot" tells the system to look into the "wwwroot" folder, which is nested inside the "inetpub" folder. As the nomenclature of nested sets goes, it could be said that "inetpub" is the parent of "wwwroot." Likewise, "wwwroot" is the child of "inetpub." Extending that paradigm out, "C:" would then be a grandparent of "wwwroot." In such instances, "C:" would be known as an ancestor and "wwwroot" would be known as a descendant. Since "C:" is the absolute oldest ancestor, or the top node in the nested set, it is also given a special name: root.

If we were to change our command to "C:\inetpub" and run it, the "inetpub" folder would open showing all of its children. Besides "wwwroot," there might be other directories or files. For example, I have an "ftproot" folder nested below "inetpub." Since "inetpub" is a parent to "ftproot" as well as to "wwwroot," "ftproot" and "wwwroot" are said to be siblings.
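The family relationships above can be checked mechanically. The following is a small sketch in Python (not ColdFusion) using the standard library's pathlib. It only manipulates the path strings from the example and never touches a real disk.

```python
from pathlib import PureWindowsPath

# The two folders from the example; pure paths never access the file system.
wwwroot = PureWindowsPath(r"C:\inetpub\wwwroot")
ftproot = PureWindowsPath(r"C:\inetpub\ftproot")

parent = wwwroot.parent              # the "inetpub" folder
ancestors = list(wwwroot.parents)    # every ancestor, nearest first
root = wwwroot.anchor                # the top of the tree (drive plus root)

# Siblings are distinct nodes that share the same parent.
are_siblings = wwwroot.parent == ftproot.parent and wwwroot != ftproot

print(parent)
print([str(p) for p in ancestors])
print(root)
print(are_siblings)
```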

XML nodes can also be identified using the same rules as the "family-name-game" above. Listing 1 is a breakdown of every battle from Iron Chef America: The Series (Season 1). In it, the "ICA" node is equivalent to "C:." It is the root element (strictly speaking, XPath's root node is the document itself, which contains "ICA"), an ancestor of the "IronChef" node, and the parent of the "Battle" node. Each "IronChef" node is a child of a "Battle" node, and any nodes that share the same "Battle" parent are siblings. In XPath, just as in our file path example above, we can reference the "IronChef" node using a simple path syntax: "/ICA/Battle/IronChef."
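To make the path syntax concrete, here is a hedged sketch in Python, with the standard library's xml.etree.ElementTree standing in for XmlParse() and XmlSearch(). Listing 1 is not reproduced here, so the document below is a stand-in: only the "ICA," "Battle," and "IronChef" names come from the article, and the values are placeholders. ElementTree supports only a subset of XPath and resolves paths relative to the element being searched, so the absolute path "/ICA/Battle/IronChef" is written as "./Battle/IronChef" starting from the "ICA" element.

```python
import xml.etree.ElementTree as ET

# Stand-in for Listing 1: only the ICA, Battle, and IronChef names come from
# the article; the values and document shape are placeholders.
xml_text = """
<ICA>
  <Battle><IronChef>Chef A</IronChef></Battle>
  <Battle><IronChef>Chef B</IronChef></Battle>
</ICA>
""".strip()

ica = ET.fromstring(xml_text)  # the parsed <ICA> element

# Relative form of "/ICA/Battle/IronChef", evaluated from <ICA>.
iron_chefs = ica.findall("./Battle/IronChef")

# Like XmlSearch(), the search hands back a list of nodes, not strings.
print(len(iron_chefs))
print([node.tag for node in iron_chefs])
print([node.text for node in iron_chefs])
```

Each match is a node, so its text has to be read from the node afterward, much as the array returned by XmlSearch() holds XML nodes rather than plain values.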

More Stories By Nik Molnar

Nik Molnar is a ColdFusion/Flex developer with over seven years experience. He has led teams through the development of enterprise applications for the mortgage, sports ticketing, and stock industries. He is an amateur Iron Chef and posts regularly at his blog: foodDuo.com. He lives with his wife Katy and his dog Jacques in Orlando, Florida.
