Using Querysims to Analyze Log Files

Query simulations, or querysims, are a means of simulating returned records from a database when no database exists. This article explores a method of using the <cf_querysim> tag to create an easy approach to custom log file processing.

Querysims 101
The <cf_querysim> custom tag was developed by Hal Helms as a tool to make development of Fusebox applications less linear. The idea was to disconnect the front-end CFML development from the back-end database and query development. To do this, <cf_querysim> provides a way to generate ColdFusion recordsets without querying a database. Instead, lines of text data are converted into a recordset.

As an example, let's imagine we're building a site that needs to display a list of employees. We need to retrieve each employee's first and last names, employee identification number, department, and supervisor ID number from the database. A <cfquery> to satisfy this requirement is shown in Listing 1. All the listings in this article have Fusedoc blocks at the top to document the function of the template. More information on Fusedoc can be found at www.fusebox.org or www.halhelms.com.
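The listing itself isn't reproduced here, so the following is only a minimal sketch of what such a query file might look like. The recordset name (GetEmployees), the Employees table, and the request.dsn datasource variable are assumptions made for illustration; the Fusedoc block is omitted.

```cfml
<!--- qryGetEmployees.cfm (sketch; recordset, table, and datasource names are assumed) --->
<cfquery name="GetEmployees" datasource="#request.dsn#">
  SELECT firstName, lastName, employeeID, department, supervisorID
  FROM   Employees
  ORDER  BY lastName, firstName
</cfquery>
```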

Listing 2 details dspEmployees.cfm, a template that produces a table based on the data returned by qryGetEmployees.cfm.
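As a rough illustration of such a display template, a sketch might look like the one below. It assumes the query file produces a recordset named GetEmployees with the five fields described above; the table layout is an arbitrary choice.

```cfml
<!--- dspEmployees.cfm (sketch): expects a recordset named GetEmployees (assumed name) --->
<table border="1">
  <tr>
    <th>Name</th><th>ID</th><th>Department</th><th>Supervisor ID</th>
  </tr>
  <!--- One table row per record in the recordset --->
  <cfoutput query="GetEmployees">
    <tr>
      <td>#HTMLEditFormat(lastName)#, #HTMLEditFormat(firstName)#</td>
      <td>#employeeID#</td>
      <td>#HTMLEditFormat(department)#</td>
      <td>#supervisorID#</td>
    </tr>
  </cfoutput>
</table>
```

Note that the display file knows nothing about where the recordset came from, which is what makes it possible to swap the data source later.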

To tie the two together, we include them in a calling template, exampleOne.cfm, shown in Listing 3. This technique separates the back-end data portion of the code from the front-end display portion, another idea used extensively in Fusebox.
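A calling template of this kind can be very small. The sketch below assumes both files sit in the same directory as the caller.

```cfml
<!--- exampleOne.cfm (sketch): run the query file, then the display file --->
<cfinclude template="qryGetEmployees.cfm">
<cfinclude template="dspEmployees.cfm">
```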

The end result, produced by running exampleOne.cfm, is shown in Figure 1. This is familiar territory for most ColdFusion developers. The twist comes when we want to develop and test the display component of this example before the database exists, so that front-end work doesn't have to wait for the back end.

To accomplish this goal, we need a way to make qryGetEmployees.cfm produce output just as though the database were already in place. This is where the querysim custom tag comes in. Listing 4 shows a version of qryGetEmployees.cfm that creates a querysim of the desired data. The first line inside the <cf_querysim> tag defines the name of the recordset that will be produced, the second line specifies the field names, and the remaining lines specify the data, one record per line.
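To make the layout concrete, here is a sketch of a querysim along those lines. The sample rows are invented for illustration, the recordset name is an assumption, and the comma-separated field list on the second line reflects the tag's usual format as I understand it; check it against your copy of the tag. The custom tag file must also be somewhere ColdFusion can find it, such as the same directory.

```cfml
<!--- qryGetEmployees.cfm, querysim version (sketch; rows are made-up sample data) --->
<cf_querysim>
GetEmployees
firstName,lastName,employeeID,department,supervisorID
Stan|Cox|100|Development|0
Marie|Lopez|101|Development|100
Ken|Ito|102|Support|100
</cf_querysim>
```

Because the recordset name and field names match the real query, the display template doesn't need to change at all.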

When we run exampleOne.cfm using the new querysim, the output looks exactly the same as it did in Figure 1. The querysim has taken away the need for the database.

Common Uses for Querysims
As shown in the previous examples, querysims were developed to let developers get on with the work of creating an application's front end without having a complete database on hand. This means that the project's participants can work in parallel, reducing the calendar time required to build the application. ColdFusion coders can work on their side of the application, supported by querysims to represent live data, while database developers work independently on the back end. As the real SQL query files are written, they replace the querysims that have been standing in for them.

Querysims can be useful in other ways as well. For example, most of us have had to build a single form that is used both to add and to edit data. When adding a record, we need a blank form. When editing a record, we need the form to populate with data from the database. One typical solution to this problem is to create conditional logic for each input on the form, populating the input with data if a record is available and otherwise leaving the input without a value.

Querysims make this task much more manageable. We start with the idea that a form is always in edit mode. The only difference between creating a new record and editing an existing record is that, in the case of creating, we're really editing a record with all blank fields. So we create a single piece of conditional logic at the top of the form that checks to see if we're editing a record. If not, we create a recordset using <cf_querysim>. This recordset has one record with all blank fields. This way, the code that displays the record's values for editing won't throw an error for a creation action - the recordset always exists, regardless of whether we're editing an existing record or creating a new one. Listing 5 shows a simple example of this technique.

Notice that there is no conditional logic inside the form in Listing 5. All the work is done by the querysim. Regardless of whether we're creating or editing a user, we always deal with a recordset, so there's no need for cluttered conditional logic.
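The sketch below illustrates the "always in edit mode" idea. To stay on safe ground, it builds the blank record with ColdFusion's built-in QueryNew and QueryAddRow functions rather than with <cf_querysim> itself, since how the tag expresses empty fields may depend on the version you have. The Users table, field names, url.userID parameter, request.dsn datasource, and actSaveUser.cfm action page are all hypothetical.

```cfml
<!--- frmUser.cfm (sketch; table, field, and file names are assumed) --->
<cfparam name="url.userID" default="0">

<cfif url.userID NEQ 0>
  <!--- Editing: fetch the existing record --->
  <cfquery name="UserInfo" datasource="#request.dsn#">
    SELECT userID, firstName, lastName
    FROM   Users
    WHERE  userID = <cfqueryparam value="#url.userID#" cfsqltype="cf_sql_integer">
  </cfquery>
<cfelse>
  <!--- Creating: build a recordset with one all-blank record --->
  <cfset UserInfo = QueryNew("userID,firstName,lastName")>
  <cfset temp = QueryAddRow(UserInfo, 1)>
</cfif>

<!--- The form itself has no conditional logic; it always reads from UserInfo --->
<cfoutput query="UserInfo">
  <form action="actSaveUser.cfm" method="post">
    <input type="hidden" name="userID" value="#userID#">
    First name: <input type="text" name="firstName" value="#HTMLEditFormat(firstName)#"><br>
    Last name: <input type="text" name="lastName" value="#HTMLEditFormat(lastName)#"><br>
    <input type="submit" value="Save">
  </form>
</cfoutput>
```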

Parallel development and form manipulation are powerful uses of querysims, but something came up that led me to explore more ways to take advantage of them.

The Problem
Now that we've had a quick tour of querysims, I'll get to the problem that prompted this article. I recently had a request to create a project status page for one of my clients. The request was to provide daily status reports on the project using a Web page.

The restrictions on creating such a page were interesting, though. The client asked that it be quick, easy, cheap, and attractive. Quick means "Don't spend much of my money putting it together," easy means "Don't spend much of my money by making it time-consuming to update," cheap means "Don't spend much of my money," and attractive means "You're not allowed to shove a plain text page at me."

For a bunch of developers, this should be an easy request. After all, everyone on my team can write HTML, so it would be an easy matter to pop up a page of HTML and let everyone edit it daily to add their progress notes. We certainly could have gone this way, but this particular client has a habit of changing his mind, particularly where layout-related things are concerned. So I fully expected him to change his mind at some point about how he wanted these daily updates presented. That, combined with my ingrained Fusebox thinking that tells me to separate data from process and presentation, led me to consider something different.

The Solution: Idea One
The approach was simply to create a query file with a querysim in it to contain the daily update log. The querysim would present the log data for a display file to render for the user. With this approach, if the presentation requirements changed, we could just change the display file. In addition, we'd be able to use the same query file as input to a variety of displays, just in case things got interesting.

The query file I worked up is shown in Listing 6. I refer to this as "Idea One" as it became the foundation for more ideas in the same vein.
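For readers without the listing at hand, a query file of this kind might look roughly like the sketch below. The field names (logDate, author, entry) and the sample entries are assumptions invented for illustration, not the actual project log.

```cfml
<!--- qryWorkLog.cfm (sketch; field names and entries are made up) --->
<cf_querysim>
WorkLog
logDate,author,entry
2002-03-04|JP|Drafted the page layout for the status report.
2002-03-05|AK|Built the first version of the display file.
2002-03-06|JP|Reviewed the layout with the client.
</cf_querysim>
```

Updating the log is then a matter of adding one pipe-delimited line per entry.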

Listing 7 shows the display file I used to process the log, and Figure 2 shows the log displayed in a browser, again using a calling file (ideaOne.cfm) to pull together the query and display files.

Left at this point, the solution might have been fine. However, the ways of Fusebox, once learned, aren't easily ignored. Having developers edit the log data right in the querysim definition made me a little nervous. Everyone on the project knew better than to mess around with the CFML and knew to edit only the data inside the <cf_querysim> tag. Still, on the off chance that someone would slip a finger and accidentally delete the starting bracket on the </cf_querysim> closing tag, I decided to keep the data somewhere other than embedded directly in the <cf_querysim> tag. Enter Idea Two.

The Solution: Idea Two
Probably the simplest part of the solution, Idea Two represents the true power of this approach. The idea is simple: separate the data from the <cf_querysim> tag through the use of <cfinclude>. Using this idea, the qryWorkLog.cfm file became two files. The first is qryWorkLog2.cfm, which is just qryWorkLog.cfm with a small modification to remove the data and replace it with a <cfinclude> tag. The second is WorkLog.txt, which contains the data removed from qryWorkLog.cfm. These two files are shown in Listings 8 and 9.
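A sketch of the split might look like this. As before, the field names and sample rows are assumptions for illustration; the only real point is that the data rows move out of the tag body and into a plain text file that <cfinclude> pulls back in.

```cfml
<!--- qryWorkLog2.cfm (sketch): header lines stay here, data comes from the text file --->
<cf_querysim>
WorkLog
logDate,author,entry
<cfinclude template="WorkLog.txt">
</cf_querysim>

<!--- WorkLog.txt (sketch) holds only data rows, one pipe-delimited record per line, e.g.:
2002-03-04|JP|Drafted the page layout for the status report.
2002-03-05|AK|Built the first version of the display file.
--->
```

One thing to keep in mind with this arrangement is that <cfinclude> processes the included file as a template, so the text file should contain plain data only and nothing that looks like CFML.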

The end result is the same output as shown in Figure 2. Nothing has really changed about the data or how it's presented. On the back end, though, we now have a standalone text file that can be edited without fear of breaking the querysim code.

Having implemented this solution, I looked at WorkLog.txt and realized it was nothing more than a simple log file, much like those generated by Web servers. That realization led me back to some discussions from various listservs and newsgroups about Web statistics packages and parsing server logs. It occurred to me that the use of querysims represented an easy way to import a server log into a CF recordset for further processing. And so we go on to The Next Idea.

The Next Idea: Server Logs to Recordsets
The records in a querysim data file are pipe-delimited. That is, each field is separated from the next by a vertical pipe (or bar) character. Most server logs simply put spaces between fields, and because spaces also appear inside fields such as the timestamp and the quoted request line, they're awkward to parse reliably. In order to use the querysim tag, I'd have to take one of two approaches. I could either modify the querysim tag to parse the server log, or I could modify the server log to comply with the querysim tag's requirements. Because spaces aren't particularly good delimiters to begin with, I decided on the latter approach.
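A quick way to see the problem with spaces is to count list elements with ColdFusion's ListLen function. The sketch below uses a sample line in the style of the common log format example from the Apache documentation, once with spaces and once with pipes; the variable names are arbitrary.

```cfml
<!--- Compare how many "fields" each delimiter yields for the same seven-field record --->
<cfset spaced = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326'>
<cfset piped  = '127.0.0.1|-|frank|[10/Oct/2000:13:55:36 -0700]|"GET /apache_pb.gif HTTP/1.0"|200|2326'>

<cfoutput>
  Space-delimited elements: #ListLen(spaced, " ")#<br>
  Pipe-delimited elements: #ListLen(piped, "|")#
</cfoutput>
```

The space-delimited version comes out as 10 elements because the timestamp and request line are broken apart, while the pipe-delimited version yields the expected 7.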

Fortunately, I do most of my work on servers that run Apache, so modifying the server log was really very simple. I went into the Apache configuration file, httpd.conf, and added the following line along with the other LogFormat lines:

LogFormat "%h|%l|%u|%t|\"%r\"|%>s|%b" pipedcommon

This defines a new log format called "pipedcommon", which is identical to the common server log format except that it uses pipes instead of spaces between fields. I then modified the CustomLog directive to use this new log format:

CustomLog logs/access.log pipedcommon

A quick restart of Apache and it was ready to go. Every request to the server causes a line to be written to the access log, so I made a few page requests to add lines to a new log file, creating the file in Listing 10.

Then I took a copy of the log file over to my ColdFusion test directory, where I had a new file waiting for it. This file, qryWebLog.cfm, is shown in Listing 11. It's identical in concept to the qryWorkLog2.cfm file seen in Listing 8, but the querysim has a different name and the field headings are altered to match the format of the server's access log. In addition, I've added a <cfdump> tag to the end of the file to quickly show that the server log has indeed been processed into a recordset.
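A file along these lines might look like the following sketch. The column names are my own labels for the seven fields of the piped log format (%h, %l, %u, %t, %r, %>s, %b), and the recordset name and the log file's name and location are assumptions. I've deliberately avoided calling a column "request", since that is also the name of a ColdFusion scope.

```cfml
<!--- qryWebLog.cfm (sketch; recordset name, column names, and file location are assumed) --->
<cf_querysim>
WebLog
host,ident,authUser,requestTime,requestLine,statusCode,bytesSent
<cfinclude template="access.log">
</cf_querysim>

<!--- Quick check that the log really did become a recordset --->
<cfdump var="#WebLog#">
```

Conveniently, Apache writes a hyphen rather than nothing for fields such as %l, %u, and %b when no value is available, so each record should still carry all seven fields.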

For live use I created a display file (dspWebLog.cfm) and a calling file (nextIdea.cfm), similar to the examples shown earlier, to create an attractive display of the access log's data. These files are shown in Listings 12 and 13, and the output from running nextIdea.cfm is shown in Figure 3.

As you can see, the Web log is neatly displayed in the browser window, in just the format I specified. This is the launching point for whatever sort of log analysis you might wish to perform. Particularly with ColdFusion's query-of-query capability, you could do just about any sort of analysis you might want on this recordset.
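As one example of that kind of analysis, the sketch below counts requests per HTTP status code using a query of queries. It assumes the log has been loaded into a recordset named WebLog with a column named statusCode; both names are illustrative assumptions.

```cfml
<!--- Query of queries: summarize the in-memory recordset (names are assumed) --->
<cfquery name="HitsByStatus" dbtype="query">
  SELECT   statusCode, COUNT(*) AS hits
  FROM     WebLog
  GROUP BY statusCode
  ORDER BY statusCode
</cfquery>

<cfoutput query="HitsByStatus">
  #statusCode#: #hits# request(s)<br>
</cfoutput>
```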

Other Applications for Querysims
As you think about querysims, of course, more and more uses for them become apparent. You can take advantage of <cf_querysim> any time you might want to convert text data into a recordset without worrying about a custom parser.

For example, you might want to create a bulk loader for text data. With <cf_querysim> loading the data into a recordset for you, loading the data into a database becomes a simple matter of looping over the recordset with a <cfquery> to insert the data. No doubt your imagination will be able to come up with its own uses for this extraordinarily useful custom tag.
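A bulk loader in that spirit could be sketched as follows. It assumes a recordset named WorkLog with logDate, author, and entry columns has already been created, and that a WorkLogEntries table and a request.dsn datasource exist; all of those names are hypothetical.

```cfml
<!--- Bulk load (sketch): one INSERT per record in the recordset --->
<cfloop query="WorkLog">
  <cfquery name="InsertEntry" datasource="#request.dsn#">
    INSERT INTO WorkLogEntries (logDate, author, entry)
    VALUES (
      <cfqueryparam value="#WorkLog.logDate#" cfsqltype="cf_sql_varchar">,
      <cfqueryparam value="#WorkLog.author#" cfsqltype="cf_sql_varchar">,
      <cfqueryparam value="#WorkLog.entry#" cfsqltype="cf_sql_varchar">
    )
  </cfquery>
</cfloop>
```

Using <cfqueryparam> keeps quotes and other awkward characters in the text data from breaking the SQL statement.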

About the Author

Jeff Peters works for XAware, an open source data integration software company.
