By Kelly Brown
August 31, 2001 12:00 AM EDT
It's almost impossible to pick up a trade magazine these days and not find a reference to XML. It's an important enabling technology for the sophisticated Web applications of the future.
This article will cut through some of the hype to explain what XML can be used for and, equally important, how it can work effectively with ColdFusion. To illustrate how these two technologies can be used in conjunction, I'll create a simple ColdFusion application allowing a user to edit the content of an XML file.
For those of you not familiar with XML, I'll introduce some of the basic concepts. XML stands for eXtensible Markup Language. It's similar to HTML, but instead of describing how to display the information, it describes what the information is. For instance, in HTML you use the <B> tag to tell the browser to display a name in a bold font. In XML you'd put a <person> tag around the name to indicate that the text is a person's name. So you've told the computer what the name is, but how do you get it to display properly? That's where eXtensible Stylesheet Language (XSL) comes into play. XSL allows you to define how tags in an XML file should be displayed. For example, we could use XSL to display the text in the <person> tag mentioned previously using a bold font.
XML allows us to separate the structure of the content from the manner in which the content is displayed. This makes XML content considerably more reusable than HTML content, where content and display logic are inextricably mixed.
Another big difference between XML and HTML is that in XML you get to define what your tags are. I used the example of a person tag. There's no predefined person tag in XML. There was a person's name in my data so I just decided I would create a tag to identify it.
You mean you can just create any tag you want? Won't it be confusing if you just go along creating new tags all the time? Absolutely. To create organized XML, you should first create a document type definition (DTD). The DTD defines what XML tags I'm going to use and how I'm going to use them. It sets the rules I'll use when I create my XML documents.
Each type of data you work with will have its own DTD with tags that are relevant for that data. There are several companies trying to standardize DTDs for certain purposes, but right now things are still up in the air. It's really a matter of agreeing on a DTD with whomever you're going to be sharing data. If you're creating the data you may just determine the DTD and provide it to partners so that they can use your XML content. Several companies in an industry may get together and decide they're all going to use the same DTD, thereby creating a standard for that industry.
This illustrates two more advantages of XML. First, you define whatever tags are a natural fit to your data. Second, a DTD provides an effective way of communicating the structure of your content to any other person or organization that wants to use it.
On to the Application
For purposes of this article, we'll be editing books. Each book will have a title, the number of pages, a price, an ISBN, and an author. The book can also belong to several categories. Listing 1 shows the DTD for our book.
The DTD is written as declarations enclosed in "less than" and "greater than" signs, so they look similar to tags in HTML. The first line in our DTD tells us what version of XML we're using, which is 1.0. Next we start defining the elements. Each element will be a tag in our XML document, and we create a definition for each one. The definition tells us the name of the element and what kind of data it will contain.
The first element we define is the title, which means there will be a <title> tag in our XML document. We also define what kind of data will be in the title tag. In this case it will be PCDATA, which stands for parsed character data. This basically means that the tag will have some text in it. We then define our page count, price, ISBN, and category as PCDATA.
The author tag is a bit different. We break the author name down to the first name and last name. In the DTD we define the author tag as a list of other tags, using the parentheses to select the elements that the author tag will consist of. Next we define the firstname and lastname elements that will be used for the author's name.
The last element is the book tag. The book tag contains all the other tags. This creates a hierarchy of tags within tags. So there will be a top level <book> tag with all the other tags in it. There are a couple things to notice about the book element. The firstname and lastname elements we defined aren't in the book. The book includes the author element, which includes these tags. The other thing is that the category element has an asterisk after it.
The asterisk indicates that the element can occur zero or more times. The other elements didn't have a notation after them, so they can occur only once in the XML document. There are also some other indicators you can use. A plus sign after the element indicates the element will occur one or more times in the XML, and a question mark indicates that the element will be used zero or one times. If you're familiar with regular expressions, this is the same syntax used for pattern matching.
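A DTD along the lines described above might look like the following minimal sketch. The element names pages and isbn are assumptions inferred from the text and may differ from the original Listing 1. The encoding attribute on the first line is also an assumption, added because the XML specification expects one when that line appears in a separate DTD file.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Simple elements that hold text only -->
<!ELEMENT title (#PCDATA)>
<!ELEMENT pages (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ELEMENT isbn (#PCDATA)>
<!ELEMENT category (#PCDATA)>
<!-- author is made up of two other elements, in this order -->
<!ELEMENT author (firstname, lastname)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT lastname (#PCDATA)>
<!-- book contains everything else; category* means zero or more categories -->
<!ELEMENT book (title, pages, price, isbn, author, category*)>
```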
Listing 2 shows a sample XML file that conforms to our DTD. The first line once again specifies what version of XML we're using. The second line tells us what DTD is used for this document. In this case we're using the book.dtd file that we defined above. We have our top-level book tag followed by the other tags we defined. The author tag contains the firstname and lastname tags that we defined.
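For illustration, a document that conforms to such a DTD could look like the sketch below. All values are placeholders, and the pages and isbn element names are assumed; none of this is taken from the original Listing 2.

```xml
<?xml version="1.0"?>
<!DOCTYPE book SYSTEM "book.dtd">
<book>
  <title>Sample Title</title>
  <pages>250</pages>
  <price>29.95</price>
  <isbn>0-000-00000-0</isbn>
  <author>
    <firstname>Jane</firstname>
    <lastname>Doe</lastname>
  </author>
  <!-- category may appear zero or more times -->
  <category>Programming</category>
  <category>XML</category>
</book>
```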
Now we have an XML file, but how do we access it from ColdFusion? Fortunately, Microsoft has provided us with an XML parser COM object that is both powerful and easy to use. The object is called XMLDOM. You can download this component from the Microsoft site for free. Windows 2000 ships with this COM object, but it's an older version and I suggest you download and use the latest version. You can find this component on the Microsoft site by searching for the Microsoft XML SDK.
Once you download and install the component you can instantiate it in ColdFusion with the following syntax:
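A minimal sketch of that instantiation, assuming the parser is registered under the version-independent ProgID Microsoft.XMLDOM and using the variable name objXMLDOM that appears in the later examples:

```cfml
<!--- Create an instance of the Microsoft XML parser as a COM object --->
<CFOBJECT TYPE="COM"
          NAME="objXMLDOM"
          CLASS="Microsoft.XMLDOM"
          ACTION="CREATE">
```

If a specific parser version is installed, its own ProgID may be preferable; check the documentation that ships with the component.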
Once the object is created we can use the LOAD method to retrieve an XML file. Before we do this we want to set the async property. Async stands for asynchronous; setting it to false tells the component to wait until the XML file is loaded before continuing. This is an important property to set. The default is true, which means your code will continue before the file is loaded and you won't be able to access any of the data. Note that you must provide the full path along with the file name as shown below (the path here is only a placeholder):
<cfset objXMLDOM.async = false>
<cfset loaded = objXMLDOM.load("c:\full\path\to\book.xml")>
After we make changes to the XML we can save the changes using the SAVE method. The syntax is the same as the LOAD and takes the full path for the XML file to be saved.
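A minimal sketch of the save call, assuming objXMLDOM already holds a loaded document. The file name book.xml is hypothetical; ExpandPath() is used here to turn it into the full path that SAVE expects.

```cfml
<!--- book.xml is a hypothetical file name, resolved to a full path --->
<CFSET xml_path = ExpandPath("book.xml")>
<!--- Write the in-memory document back to disk --->
<CFSET objXMLDOM.save(xml_path)>
```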
Finding Information Using XPath
Once we load an XML document we have several methods to access and manipulate the data. The structure of the data in the COM object is referenced as a tree structure of nodes. Figure 1 shows a visual representation of our XML book structure with each element as a node. The book is the top node with the title, pages, price, ISBN, author, and category as child nodes. The author node has two child nodes, firstname and lastname.
We can find data in our XML file using the SelectSingleNode method. This method uses the XPath language to find a node. XPath is a very powerful, full-featured search language, and we'll cover only some of its basic syntax. The XPath syntax is similar to a directory path in the file system. The top node is like the top directory that contains the other nodes as directories. The XPath expression /book/title would select the title node for our book.
Once we have the title node we access the TEXT property to get the data contained in the tag. If we wanted to print out the title in ColdFusion it would look like this:
<CFSET title_node = objXMLDOM.SelectSingleNode("/book/title")>
<CFOUTPUT>#title_node.text#</CFOUTPUT>
The text in the node can be changed by assigning data to the TEXT property of the node.
<CFSET title_node.text="My New Book">
This does not automatically save the changes back to the XML. It simply changes the data in memory. You have to use the SAVE method to update the file.
More Searching Techniques
There are other search mechanisms available to us in XPath. A double forward slash will search the entire XML structure for the specified node. The example below would return the firstname node even though it's a subnode of author.
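A short sketch of such a search, assuming objXMLDOM holds the loaded book document (the variable name first_node is illustrative):

```cfml
<!--- // searches the whole document, so no parent path is needed --->
<CFSET first_node = objXMLDOM.SelectSingleNode("//firstname")>
<CFOUTPUT>#first_node.text#</CFOUTPUT>
```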
Be careful when using this searching method. If I had another firstname tag, like the name of the editor for instance, I would not know whose first name I was accessing.
In our DTD we defined category with the ability to occur multiple times within our document. We need to use the SelectNodes method if we want to get more than the first occurrence of a node. The SelectNodes method returns a collection of nodes that meets the specified criteria. Here's how we'd get a list of the categories defined for our book:
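As a sketch, assuming objXMLDOM holds the loaded book document, the returned collection can be walked with CFLOOP (the variable names cat_nodes and cat_node are illustrative):

```cfml
<!--- SelectNodes returns every matching node, not just the first --->
<CFSET cat_nodes = objXMLDOM.SelectNodes("/book/category")>

<!--- Loop over the COM collection; each item is one category node --->
<CFLOOP COLLECTION="#cat_nodes#" ITEM="cat_node">
  <CFOUTPUT>#cat_node.text#<BR></CFOUTPUT>
</CFLOOP>
```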
XPath also provides us with the capability to search for nodes that contain specific data. We can include additional search information in square brackets as shown:
SelectSingleNode("/book/category[. = ""mycategory""]")
This would return a category node where the text is "mycategory". The doubled quotation marks ("") are how we get quotes within a quoted ColdFusion string.
Adding and Deleting Nodes
We can create new nodes for our XML using the CREATENODE method. This method takes three parameters. The first is the node type; type 1 is an XML element. The second parameter is the name of the node, which in this case is category. The last parameter is the namespace URI, which we aren't using, so we'll leave it blank.
<CFSET new_cat = objXMLDOM.createNode(1,"category","")>
Once we have a node we can add it to an existing node using the APPENDCHILD method. In the following example we get the book node and then append our new category node to it.
<CFSET book_node = objXMLDOM.SelectSingleNode("/book")>
<CFSET temp = book_node.appendChild(new_cat)>
When opening an XML file that you've added a new node to, you may notice some strange formatting. The new tag is added onto the end of the existing tag without any spacing. New nodes are actually separated by newline characters as specified in the XML standard, but in a Windows environment most editors won't show the newline character, so everything will appear to be on one line.
Deleting a node follows the same pattern. We find the node we want to delete and the node it belongs to in the hierarchy. This time we use the removeChild method to remove the category node from the book node.
<CFSET cat_node = objXMLDOM.SelectSingleNode("/book/category")>
<CFSET book_node = objXMLDOM.SelectSingleNode("/book")>
<CFSET temp = book_node.removeChild(cat_node)>
We have the basic tools we need, so let's take a look at a sample application. Our sample consists of six ColdFusion pages. That's a lot of pages, but they are fairly simple and each one presents a different principle. The first page is called list.cfm. This page shows a list of XML files we can edit and allows us to select one to edit.
The xmledit.cfm page displays most of the book fields in a form and allows us to change them. This demonstrates searching and displaying information from an XML document.
The next page is the xmledit_action.cfm page. This page handles the form submission from xmledit.cfm, and it demonstrates how to update nodes in an XML file. The category is handled differently from the other fields because it can occur multiple times within the XML.
We have an addcat.cfm page that allows you to add a new node to the XML file. The editcat.cfm page allows you to change an existing node and demonstrates some of the advanced searching capabilities of XPath. The final page is delcat.cfm, which allows you to delete a category by removing a node from the XML file.
The list.cfm page is pretty straightforward. It uses cfdirectory to get a list of all the XML files in the specified directory. It displays these files as links to xmledit.cfm with the file name passed in on the URL. Passing file paths over the URL is generally a poor security practice, but we're keeping things simple for this example. We also display the size of the XML file for good measure (see Listing 3). (Listings 3-8 can be found on the CFDJ Web site, www.sys-con.com/coldfusion/sourcec.cfm.)
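A rough sketch of this kind of listing page, not the original Listing 3. The URL parameter name file and the choice of the template's own directory are assumptions made for illustration.

```cfml
<!--- List the XML files in the directory that holds this template (an assumption) --->
<CFDIRECTORY ACTION="LIST"
             DIRECTORY="#GetDirectoryFromPath(GetCurrentTemplatePath())#"
             FILTER="*.xml"
             NAME="xml_files">

<!--- One link per file; "file" is a hypothetical URL parameter name --->
<CFOUTPUT QUERY="xml_files">
  <A HREF="xmledit.cfm?file=#URLEncodedFormat(xml_files.Name)#">#xml_files.Name#</A>
  (#xml_files.Size# bytes)<BR>
</CFOUTPUT>
```

Passing only the file name, and not a full path, keeps the directory out of the URL.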
In the xmledit.cfm page we load our XML file and create a form to edit our data. We put the file name in a hidden field to pass on to the action page. We use the SelectSingleNode method to get most of our data. You can see several different searching methods working in this page. Our category search can return more than one node, so we use cfloop to loop over the collection that is returned. For each category we generate a link to the editcat.cfm and delcat.cfm pages. We also provide a link to add a new category (see Listing 4).
The xmledit_action.cfm page gets the form data from the xmledit.cfm page and updates our XML document. We load the document like we did in the xmledit.cfm page. We find the nodes using the same search as before, but this time instead of outputting the node text in our form we set the node data to the form data. Once the data has been changed we use the SAVE method to write our changes back to the file, as seen in Listing 5.
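The update flow can be sketched as follows. This is an illustration and not the original Listing 5; the form field names file and title are hypothetical, and only the title field is shown.

```cfml
<!--- Form.file and Form.title are hypothetical field names posted by the edit form --->
<!--- GetFileFromPath() keeps only the file name, so directories cannot be passed in --->
<CFSET xml_path = ExpandPath(GetFileFromPath(Form.file))>

<CFOBJECT TYPE="COM" NAME="objXMLDOM" CLASS="Microsoft.XMLDOM" ACTION="CREATE">
<CFSET objXMLDOM.async = false>
<CFSET loaded = objXMLDOM.load(xml_path)>

<!--- Same search as the edit form, but now we assign instead of output --->
<CFSET title_node = objXMLDOM.SelectSingleNode("/book/title")>
<CFSET title_node.text = Form.title>

<!--- Nothing is written to disk until SAVE is called --->
<CFSET objXMLDOM.save(xml_path)>
```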
There isn't a separate action page for the category manipulation pages. The same page is used to generate the form and process it by checking to see if there's data to be processed. For the add form in addcat.cfm we pass along the document in the hidden field and give the user a single text box to enter the name of a category. The form is processed in a similar manner to the other pages we've seen. First we create the XML object and load the XML file. In this case, we create a new node with the CREATENODE method. Once we have the node we assign our data to it using the text property as we did in the xmledit_action.cfm page. Now we have our new node, but we need to add it into our structure. We want to add it to our main book node, so we get the book node and use the APPENDCHILD method to add the new node to it. The changes are complete, so we save them back to the file and send the user back to the xmledit.cfm page where the new category will now show up (see Listing 6).
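The processing half of such an add page might be sketched like this. It assumes objXMLDOM already holds the loaded document and xml_path holds its full path; Form.category and Form.file are hypothetical field names.

```cfml
<!--- Create an element node (type 1) named category with no namespace --->
<CFSET new_cat = objXMLDOM.createNode(1, "category", "")>
<!--- Form.category is a hypothetical field holding the new category name --->
<CFSET new_cat.text = Form.category>

<!--- Attach the new node under the top-level book node --->
<CFSET book_node = objXMLDOM.SelectSingleNode("/book")>
<CFSET temp = book_node.appendChild(new_cat)>

<!--- Persist the change, then return to the edit page --->
<CFSET objXMLDOM.save(xml_path)>
<CFLOCATION URL="xmledit.cfm?file=#URLEncodedFormat(Form.file)#">
```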
For the editcat.cfm page we're changing an existing node, so we need to save the file name and the original node data in hidden fields. We provide the user with a text box to enter the new category text for the node. On the processing side we load the data as usual. This time, however, we have to find the correct node to change. We're going to use the square brackets in our XPath search to find the correct node. In the code we use the original category name we passed in to find the category we want to edit. We replace the text of the node with the new category text and save the file. The user is then sent back to the xmledit.cfm page (see Listing 7).
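As a sketch of that lookup, assuming objXMLDOM holds the loaded document and xml_path its full path; the field names old_category and new_category are hypothetical:

```cfml
<!--- Build the XPath with the original name; "" yields a literal quote in the string --->
<CFSET xpath = "/book/category[. = ""#Form.old_category#""]">
<CFSET cat_node = objXMLDOM.SelectSingleNode(xpath)>

<!--- Replace the node text and write the file back --->
<CFSET cat_node.text = Form.new_category>
<CFSET objXMLDOM.save(xml_path)>
```

This sketch does no error handling: if no category matches, there is no node to assign to, and a category name that itself contains a double quote would break the expression.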
For the delcat.cfm page we also pass in the name of the category to be deleted. We place the document and category name in hidden fields and ask the user to verify the delete. If the user verifies the delete, we process the form; if not, we redirect them to xmledit.cfm without making any changes. By now you know the drill: we load up the XML file. The delete is similar to the edit. We find the node to delete and the book node. This time we use the removeChild method to remove the category node from the book node. We save our data and return to the xmledit.cfm page (see Listing 8).
Some Things to Add
One thing not dealt with in the sample application is adding and deleting the XML files. Since we're dealing with files, you can use the CFFILE tag to delete any files you don't want. To create a new file you could create all the nodes and append them together, but there's an easier way. You can create an XML template file with the correct structure, but with all the nodes blank. When you want to create a new XML file you can make a copy of this file and then send it to the editor. This reuses our edit pages and saves us the work of building a separate page for adding XML files.
XML is an important technology for the future. In a medium such as the Internet where content is king, XML is an enabling technology that will allow organizations to maximize the potential of their content. ColdFusion can be used to effectively manipulate XML and to incorporate XML into sophisticated Web applications. The ColdFusion application presented in this article has only scratched the surface of XML's potential.
Start with these basic tools and concepts, and keep learning. Welcome to the world of XML!