Welcome!

ColdFusion Authors: Maureen O'Gara, Hovhannes Avoyan, Yakov Fain, Pat Romanski, Liz McMillan

Related Topics: ColdFusion

ColdFusion: Article

Structurizing Your Database

Structurizing Your Database

Using CF structures to store data allows you to access individual records without querying a database. Until now, successfully implementing a scalable, high-volume site has entailed five basic principles:

1. Write tight, efficient code.
2. Use cached queries.
3. Set up a proper network architecture.
4. Use top-of-the-line hardware.
5. Implement a properly designed database.

This article suggests a new alternative to cached queries (principle #2): for some sites, using ColdFusion structures to store database data may reduce the bottlenecks caused by excessive database calls, thus improving overall function.

We'll demonstrate how structures, used in conjunction with databases, can let you build a more robust, scalable and potentially faster Web site. We conclude with a rundown of the pros and cons of "structurized" databases so you can judge which solution - Cold Fusion structures, cached queries or a combination of the two - is best for your site.

Transforming Tables into Structures
The first step in structurizing a database is to load all data from tables into structures, which can either be local to the template or persistent in the server's memory. To transform tables into structures, you'll need to use a persistent variable scope so the data can be accessed from any template for the life of the server. To accomplish persistency, you prepend all structures with "server." For example, to load the results of a product category table into memory, you'd use the code in Listing 1.

You may have noticed that the top-level structure is the root product category structure itself. All individual categories are implemented as nested structures, which are keys of the parent structure and the structures themselves. Structure nesting allows you to perform category-wide operations such as structcount() and structkeylist() on different structure levels.

This type of structure methodology can be expanded to include tables with relationships. For example, to implement a one-to-many relationship between the product category table and the product table, you could implement deeper levels of nested structures. Product records are implemented as nested structures of their corresponding product category structures.

To demonstrate adding product information to the product category structure, see Listing 2. As shown in the listing, creating structures that imitate one-to-many relationships is straightforward, but the task can become difficult to manage once nesting levels exceed three to four structures.

Furthermore, building a structure system that can simulate one-to-one or many-to-many relationships, although not impossible, can be arduous.

Retrieving Information from Structures
Once all the information from a database table has been loaded into a structure, you can access that data directly. For example, to access the category name of product category 1 in Figure 1, see Listing 3.

To return the number of records in a structure, you'd simply use the structcount() function. To return a list of a structure's keys, use the structkeylist() function. To return an array of keys, use the structkeyarray() function.

If you attempt to access a structure key that doesn't exist, ColdFusion will return an error. You can avoid this by prechecking for the existence of that key using the structkeyexists() function.

Multiple records can be retrieved by looping through the structure (see Listing 4). Note: Due to the nature of RAM, the structure keys won't necessarily be returned in any particular order. Also, looping through nested structures requires traversing via nested loops. (Structure sorting and traversing is beyond the scope of this article.)

Performing Updates, Deletes and Inserts
Structures aren't meant to replace your database but to augment and enhance its function. It's important to remember that any updates performed on your structures must also be performed on your database.

To insert data or update a structure's data, you need to set that particular structure key(s) to the new desired value(s), the same way you would a simple variable assignment. ColdFusion provides structinsert() and structupdate() functions to accomplish these tasks, but the easiest method is to use absolute variable declarations. For instance, to change the category name key of product category 1, see Listing 5.

Updating the primary key of a structure is a bit trickier because the key itself is the structure. First use the structcopy() function to copy the existing structure into a new one, then delete the original structure. Note: Structure functions such as structcopy() and structdelete() affect the structure and all its nested structures (see Listing 6).

Deleting structures can be done easily by using the structdelete() function. For an example of a structure deletion, refer to the call to the structdelete() function in Listing 6.

Finally, structure locking is vital. The importance of properly locking structures can't be overstated, whether they're being read from or written to. If structures aren't locked properly, you could run into synchronization problems where the same structure is accessed or modified at the same time by two different processes, resulting in potential data corruption. Structure locking can be accomplished with the <cflock> tag and isn't illustrated in any of the sample code provided.

Benefits of Using Structures
In theory, using structurized databases improves two important aspects of a Web site: speed and scalability. Storing data in structures can improve data-access speeds, especially in heavy load situations, because the structures reside in RAM. Rather than performing a database call, the Web server retains a copy of the database in memory and accesses it the same way it would a simple variable, significantly reducing data lookup times.

At this point you may be wondering why structures are better than cached queries. The answer is simple: a cached database query still requires calls to a database once data is modified or its cachedwithin attribute has expired. Structures, on the other hand, mirror the data in a database and can be likened to cached queries that never need to be refreshed. This attribute eliminates the need to access a database server when performing lookups. Structures also provide a means of exceeding ColdFusion's limit of 100 simultaneous cached queries. In essence, cached queries provide a static view of a database, while structures themselves are the evolving database.

In terms of scalability, Web servers with structurized databases can theoretically support large amounts of load because they generate far less network traffic and database activity. Moreover, structures can greatly reduce the need for additional database servers since the query load on them is minimal. Simply add more ColdFusion servers to your cluster to handle the extra load!

Drawbacks of Using Structures
Although structures can improve a site's speed and scalability, they have some potentially serious drawbacks that deserve careful consideration. One problem could arise from the type of primary key contained in your database tables. Structures work best with tables that have a single-integer primary key. Using structures with tables that have other datatype or multiple primary keys isn't recommended as the key itself is part of the structure name and must follow ColdFusion variable-naming requirements. Furthermore, the use of integer primary keys will facilitate the enumeration and sorting of your structures.

The size of the database could also create a potential problem with structures. Large amounts of data require Web servers with large amounts of RAM for storing the structures. This can be costly, and there's a limit to the amount of RAM your server can hold. Since RAM isn't permanent memory, all structures that reside in it are lost when your server shuts down. To reinstantiate your structures, a template must be run at boot time to reload all data. Depending on the amount of data stored in the database, reloading structures can significantly increase the server's reboot time.

The final drawback entails replication of structures in Web sites that implement multiple Web servers. When a structure is modified, it's modified only on the server on which the modification occurs. Other servers won't be aware of the structure modification unless a replication system is in place. Developing a replication system can be difficult and adds an additional step to the structure-modification process. With replication in place, structure modification would require:

1. Modification of the structure itself
2. Modification of the database
3. Replication of the updated structure across all Web servers

(which could take a long time, given multiple Web servers)

A simple way to develop a replication system could be to design a sender ColdFusion script that loops through a list of servers and posts structure modifications via WDDX packets to a receiver ColdFusion script that processes these modifications. Although this solution may seem straightforward, the scripts must take into account situations in which replication can't take place - for instance, when servers in its list are down or in the process of booting.

Is "Structurizing" Your Database the Right Solution?
Admittedly, structurizing a database is not an easy task. The various benefits and drawbacks of using ColdFusion structures this way can make it difficult to decide whether they're the right storage mechanism for your Web site.

For sites with relatively infrequent database updates, structures can be an excellent solution. For example, an e-commerce site with a relatively static product base could benefit from structures since updates are infrequent and replication wouldn't have to occur very often.

Sites that are highly dynamic, especially sites that frequently update their databases from user input, wouldn't reap the speed or scalability benefits of structurized databases because of the added code complexity and replication overhead. Perhaps the best solution to building more scalable and faster database-driven Web sites lies in a mix of traditional database calls and structures. For those sections of your site that are mainly lookup oriented, the right choice might be structures; for those sections that are update oriented, the right choice may be database queries.

More Stories By Bryan Murphy

Bryan Murphy is the owner of GuardianLogic, Inc. (www.guardianlogic.com), an information security firm that provides application and network vulnerability assessments and hardening. He is also one of the authors of Metazoa (www.metazoa.ca), a security-enhanced content management system; Membrane, an application-level firewall; and MetaGuard, a CFC that provides role-based login, authentication, and access control. Bryan has been an ethical hacker since the old-school BBS days. Visit his blog at www.downgrade.org.

More Stories By Shahriyar Neman

Shahriyar Neman is CTO of the Next Network, an ASP that delivers total computing packages to small- and medium-sized businesses through the Internet. He holds
a BA in computer science from NYU and is currently
pursuing his master's degree.

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.