Welcome!

ColdFusion Authors: John Ryan, Fuat Kircaali, Yeshim Deniz, Anatole Tartakovsky, Steve Lesem

Related Topics: ColdFusion

ColdFusion: Article

Displaying Grouped Output with ColdFusion

A great tool for your arsenal

Knock! Knock! Hey [insert your name here], can you create a report of all the users who have registered on our site? Geography is really important to the sales folk, so make sure you categorize it by state. Oh, and can I have it done today?"

"Sure, I think we can do that."

The Problem
So, you've been asked to create a report on all the users who have registered on your site. The report has to be categorized by state, for the sales folk. Our result listing will probably look something like this:

The name of the state is listed first, and all the users from a certain state are listed underneath their state category. You won't be able to do much without reviewing the data in the database. This is what you're dealing with:

The two relevant tables are a user table and a state table. The user table contains all the relevant user information, such as the user's name and some address information. The state information is stored in the state table. The two tables are linked using a relationship on the StateID. By looking at the user's StateID, you can find the state that user exists in. Using a StateID, you can find all the users in a specified state.

This Works, But It Isn't Perfect
How do you parse the data to generate the listing? If you are new to the SQL language, you might take an approach like this:


<cfquery name="GetStates" datasource="january05">
 select * from states
</cfquery>


<cfoutput query="GetStates">

 <cfquery name="GetUsers" datasource="january05">
  select * from users where users.StateID = #GetStates.StateID#
 </cfquery>

 <cfif GetUsers.RecordCount NEQ 0>
  <b>#GetStates.State#</b><br>
  <cfoutput query="GetUsers">
  #GetUsers.FirstName# #GetUSers.LastName#<br>
  </cfoutput>
  <br>
</cfif>

</cfoutput>

First, the code runs a single query to get the list of all states. The GetStates query is looped over, using cfoutput. The code inside the loop will retrieve all the users. If there are users for that given state, the state information is displayed along with the name of the user. This works and makes the boss happy, at least for now.

It Just Keeps Getting Better...
A couple weeks later, the boss gives a knock on your cubicle door. "Hey, you know that state/user report you created for me? I love the way that it works - that I can see all the users. Our membership is really increasing, but the page is running really slow. Can you fix it?

Let's start by analyzing the original code a bit more. How many queries will run on the page? One query will execute to retrieve the list of states. For each record returned from that query, another query will run. For U.S. states, you'll get at least 50 additional queries, more if you include areas such as the District of Columbia and Puerto Rico. With a small amount of data, this approach may not be a problem. However, if you have a lot of registrations on your Web site, this could start to become an inefficient approach. Fifty or more queries on a page are undesirable. I bet we can cut down the number of queries drastically.

If you join the user table and the state table in a single query, you can return all the data from the previous template in a single query:


<cfquery name="GetUsers" datasource="january05">
 select * 
 from users, states
 where users.StateID = states.stateID
 order by state
</cfquery>

This query will return all the data that we need. It will return all the states that have users in them, along with all the same user data. You'll just need a way to loop over the data to create the display. Here's one method:


<cfoutput query="GetUsers">
 <b>#GetUsers.State#</b><br>
 #GetUsers.FirstName# #GetUSers.LastName#<br>
</cfoutput>

Try this code in the browser. Unfortunately, with this approach the state is displayed for every user. The page is more cluttered and the functionality has changed. This won't do.

The state data should be displayed only when the state changes. You'll need to know the value of the state in the previous loops execution. You can store that in a variable. If the previous state value is different than the current state value, then the code has moved onto a new state; thus its name should be displayed. A variable, PreviousState, can hold the value of the previous state.


<cfset PreviousState = "">

<cfoutput query="GetUsers">
 <cfif PreviousState NEQ GetUsers.State >
  <br><b>#GetUsers.State#</b><br>
  <cfset PreviousState = GetUsers.State>
 </cfif>	

 #GetUsers.FirstName# #GetUSers.LastName#
</cfoutput>

PreviousState will have to be initialized, which I do to an empty string. That makes sure that the first state is displayed. The rest of the code should be straightforward.

...And Better
The report code is starting to get pretty tight, but I know it can be even better. In the previous iteration, we had to write some code so we can tell when the state information has changed. We can avoid having to write this additional code by using the group attribute of the cfoutput tag and a nested cfoutput block. The group attribute is used to something called query grouping. Portions of the query are split up into groups, based on a column of the query. The outer loop executes once for each entry in the column. The inner loop will execute for each individual row. This is better explained with an example. In our problem, we want to group the query by state.

A PreviousState variable is no longer needed, since the grouping handles this functionality for us. When outputting over the query, the group attribute is used. It specifies the field that the code will use to loop over the group:


<cfoutput query="GetUsers" group="State">
  <b>#GetUsers.State#</b><br>
  <cfoutput>
   #GetUsers.FirstName# #GetUSers.LastName#<br>
  </cfoutput>
  <br>
</cfoutput>

The first thing inside the loop is a display of the state data. Then we have another cfoutput block. In versions of ColdFusion prior to CFMX, this was the only place where you could use nested cfoutputs. The nested cfoutput will loop over all the rows until the state changes. Conceptually this is very similar to the second method, except we are not writing additional code to keep track of when the code moves onto a new state.

Many of you may have heard about the group by clause in SQL. I want to specify that using the group attribute of cfoutput is not identical to using a group attribute in SQL. They are similar, but different. In SQL, the group by clause is used to run aggregate functions against subsets of data. For example, if we wanted to get the number of users in a particular state, we could use the grouped output in CFML with a loop counter to find the number of users. However it would be better to calculate that information in the query, like this:


<cfquery name="GetUsers" datasource="january05">
 select states.state, count(users.lastname) as UserCount
 from users left outer join states on (users.StateID = states.stateID)
 group by state
 order by state
</cfquery>

The aggregate function, count, is used to return the number of users. Instead of returning the total number of records in the query, we get the state, and the number of users in that state. A great article on the group by clause can be found at www.4guysfromrolla.com/webtech/011202-1.shtml. More information about aggregate functions in SQL Server can be found at http://msdn.microsoft.com/library/default.asp?url=/ library/en-us/tsqlref/ts_fa-fz_9yuk.asp.

Conclusion
Grouped queries can be a powerful tool in your arsenal. It will work especially well combined with the use of aggregate functions. For example, you could use the count function to find the total number of users in any state. ColdFusion will also support multiple levels of grouping with cfoutput. You may want to use this if you are working in an international application and want to separate the output by country, then state.

About Jeffry Houser

Jeffry is a technical entrepreneur with over 10 years of making the web work for you. Lately Jeffry has been cooped up in his cave building the first in a line of easy to use interface components for Flex Developers at www.flextras.com . He has a Computer Science degree from the days before business met the Internet and owns DotComIt, an Adobe Solutions Partner specializing in Rich Internet Applications. Jeffry is an Adobe Community Expert and produces The Flex Show, a podcast that includes expert interviews and screencast tutorials. Jeffry is also co-manager of the Hartford CT Adobe User Group, author of three ColdFusion books and over 30 articles, and has spoken at various events all over the US. In his spare time he is a musician, old school adventure game aficionado, and recording engineer. He also owns a Wii. You can read his blog at www.jeffryhouser.com, check out his podcast at www.theflexshow.com or check out his company at www.dot-com-it.com.

Comments (2) View Comments

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


Most Recent Comments
Taylor 03/24/08 11:56:00 PM EDT

This is the best explanation of grouping data I have found yet. Thanks so much for the article and especiallly showing the progression you went through to reach more optimized code. You helped fix an issue I was having (missed a small step) and I greatly appreciate it.

D 10/29/05 10:05:36 AM EDT

Is there some way to limit the output in the inner loop? For example, if I wanted to display the first five records for a specific group then move on to the next group. MAXROWS is only available for the outer loop, not the inner loop.