Using Querysims to Analyze Log Files

Query simulations, or querysims, are a means of simulating returned records from a database when no database exists. This article explores a method of using the <cf_querysim> tag to create an easy approach to custom log file processing.

Querysims 101
The <cf_querysim> custom tag was developed by Hal Helms as a tool to make development of Fusebox applications less linear. The idea was to disconnect the front-end CFML development from the back-end database and query development. To do this, <cf_querysim> provides a way to generate ColdFusion recordsets without querying a database. Instead, lines of text data are converted into a recordset.

As an example, let's imagine we're building a site that needs to display a list of employees. We need to retrieve each employee's first and last names, employee identification number, department, and supervisor ID number from the database. A <cfquery> to satisfy this requirement is shown in Listing 1. All the listings in this article have Fusedoc blocks at the top to document the function of the template. More information on Fusedoc can be found at www.fusebox.org or www.halhelms.com.
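Since the listings are not reproduced here, the following is only a minimal sketch of what such a query template might look like. The datasource name (companyDSN), the table name (employees), and the column names are assumptions chosen to match the fields described above, and the Fusedoc block is omitted.

```cfml
<!--- qryGetEmployees.cfm (sketch): datasource, table, and column names are assumed --->
<cfquery name="GetEmployees" datasource="companyDSN">
  SELECT firstName, lastName, employeeID, department, supervisorID
  FROM   employees
  ORDER BY lastName, firstName
</cfquery>
```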

Listing 2 details dspEmployees.cfm, a template that produces a table based on the data returned by qryGetEmployees.cfm.
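As a rough illustration, a display template of this kind could look like the sketch below. It assumes a recordset named GetEmployees with columns firstName, lastName, employeeID, department, and supervisorID; those names are hypothetical, not taken from the original listing.

```cfml
<!--- dspEmployees.cfm (sketch): expects a recordset named GetEmployees (assumed name) --->
<table border="1">
  <tr>
    <th>Name</th><th>Employee ID</th><th>Department</th><th>Supervisor ID</th>
  </tr>
  <!--- One table row per record in the recordset --->
  <cfoutput query="GetEmployees">
  <tr>
    <td>#HTMLEditFormat(lastName)#, #HTMLEditFormat(firstName)#</td>
    <td>#HTMLEditFormat(employeeID)#</td>
    <td>#HTMLEditFormat(department)#</td>
    <td>#HTMLEditFormat(supervisorID)#</td>
  </tr>
  </cfoutput>
</table>
```

The template does not care where the recordset came from, which is what makes it possible to swap the data source later.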

To tie the two together, we include them in a calling template, exampleOne.cfm, shown in Listing 3. This technique separates the back-end data portion of the code from the front-end display portion, another idea used extensively in Fusebox.
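A calling template along these lines needs little more than two includes. This is a sketch that assumes both files sit in the same directory as the caller.

```cfml
<!--- exampleOne.cfm (sketch): run the data template first, then the display template --->
<cfinclude template="qryGetEmployees.cfm">
<cfinclude template="dspEmployees.cfm">
```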

The end result, produced by running exampleOne.cfm, is shown in Figure 1. This is familiar territory for most ColdFusion developers. The twist comes when we want to develop and test the display component of this example before the database exists, so that front-end work can continue whether or not the database is ready.

To accomplish this goal, we need a way to make qryGetEmployees.cfm produce output just as though the database were already in place. This is where the querysim custom tag comes in. Listing 4 shows a version of qryGetEmployees.cfm that creates a querysim of the desired data. The first line inside the <cf_querysim> tag defines the name of the recordset that will be produced, the second line specifies the field names, and the remaining lines specify the data, one record per line.
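The layout just described can be sketched as follows. The recordset name, the field names, and the two data rows are invented purely for illustration. The sketch also assumes the commonly distributed form of the tag, in which field names are comma-separated and data values are pipe-delimited, and that querysim.cfm is installed where ColdFusion can find custom tags.

```cfml
<!--- qryGetEmployees.cfm (sketch): querysim stand-in for the real query.
      Line 1 = recordset name, line 2 = field names, remaining lines = data.
      All names and values below are made-up sample data. --->
<cf_querysim>
GetEmployees
firstName,lastName,employeeID,department,supervisorID
Ann|Example|1001|Engineering|1000
Bob|Sample|1002|Engineering|1001
</cf_querysim>
```

Because the tag produces an ordinary recordset under the same name the real query would use, the display template needs no changes.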

When we run exampleOne.cfm using the new querysim, the output looks exactly the same as it did in Figure 1. The querysim has taken away the need for the database.

Common Uses for Querysims
As shown in the previous examples, querysims were developed to let developers get on with the work of creating an application's front end without having a complete database on hand. This means that the project's participants can work in parallel, reducing the calendar time required to build the application. ColdFusion coders can work on their side of the application, supported by querysims to represent live data, while database developers work independently on the back end. As the real SQL query files are written, they replace the querysims that stood in for them.

Querysims can be useful in other ways as well. For example, most of us have had to build a form that is used both to add and to edit data. When adding a record, we need a blank form. When editing a record, we need the form to populate with data from the database. One typical solution to this problem is to create conditional logic for each input on the form, populating the input with data if a record is available and otherwise leaving the input without a value.

Querysims make this task much more manageable. We start with the idea that a form is always in edit mode. The only difference between creating a new record and editing an existing record is that, in the case of creating, we're really editing a record with all blank fields. So we create a single piece of conditional logic at the top of the form that checks to see if we're editing a record. If not, we create a recordset using <cf_querysim>. This recordset has one record with all blank fields. This way, the code that displays the record's values for editing won't throw an error for a creation action - the recordset always exists, regardless of whether we're editing an existing record or creating a new one. Listing 5 shows a simple example of this technique.

Notice that there is no conditional logic inside the form in Listing 5. All the work is done by the querysim. Regardless of whether we're creating or editing a user, we always deal with a recordset, so there's no need for cluttered conditional logic.
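The pattern can be sketched as below. One substitution is made and should be noted plainly: the blank record is built with ColdFusion's built-in QueryNew, QueryAddRow, and QuerySetCell functions rather than with <cf_querysim>, because how an empty field is written inside the tag may depend on the version of querysim in use. The file names, field names, and the url.userID parameter are all hypothetical.

```cfml
<!--- dspUserForm.cfm (sketch): file, field, and URL parameter names are hypothetical --->
<cfif IsDefined("url.userID")>
  <!--- Editing: an assumed query template returns the existing record as GetUser --->
  <cfinclude template="qryGetUser.cfm">
<cfelse>
  <!--- Creating: build a one-row recordset of blanks.
        Built-in query functions stand in here for the article's <cf_querysim>. --->
  <cfset GetUser = QueryNew("userID,firstName,lastName")>
  <cfset temp = QueryAddRow(GetUser, 1)>
  <cfset temp = QuerySetCell(GetUser, "userID", "")>
  <cfset temp = QuerySetCell(GetUser, "firstName", "")>
  <cfset temp = QuerySetCell(GetUser, "lastName", "")>
</cfif>

<!--- The form itself contains no conditional logic: GetUser always exists --->
<cfoutput>
<form action="actSaveUser.cfm" method="post">
  <input type="hidden" name="userID" value="#HTMLEditFormat(GetUser.userID)#">
  First name: <input type="text" name="firstName" value="#HTMLEditFormat(GetUser.firstName)#"><br>
  Last name: <input type="text" name="lastName" value="#HTMLEditFormat(GetUser.lastName)#"><br>
  <input type="submit" value="Save">
</form>
</cfoutput>
```

The single check at the top is the only place where creating and editing differ.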

Parallel development and form manipulation are powerful uses of querysims, but something came up that led me to explore more ways to take advantage of them.

The Problem
Now that we've had a quick tour of querysims, I'll turn to the problem that prompted this article. I recently had a request to create a project status page for one of my clients. The request was to provide daily status reports on the project using a Web page.

The restrictions on creating such a page were interesting, though. The client asked that it be quick, easy, cheap, and attractive. Quick means "Don't spend much of my money putting it together," easy means "Don't spend much of my money by making it time-consuming to update," cheap means "Don't spend much of my money," and attractive means "You're not allowed to shove a plain text page at me."

For a bunch of developers, this should be an easy request. After all, everyone on my team can write HTML, so it would be an easy matter to pop up a page of HTML and let everyone edit it daily to add their progress notes. We certainly could have gone this way, but this particular client has a habit of changing his mind, particularly where layout-related things are concerned. So I fully expected him to change his mind at some point about how he wanted these daily updates presented. That, combined with my ingrained Fusebox thinking that tells me to separate data from process and presentation, led me to consider something different.

The Solution: Idea One
The approach was simply to create a query file with a querysim in it to contain the daily update log. The querysim would present the log data for a display file to render for the user. With this approach, if the presentation requirements changed, we could just change the display file. In addition, we'd be able to use the same query file as input to a variety of displays, just in case things got interesting.

The query file I worked up is shown in Listing 6. I refer to this as "Idea One" as it became the foundation for more ideas in the same vein.

Listing 7 shows the display file I used to process the log, and Figure 2 shows the log displayed in a browser, again using a calling file (ideaOne.cfm) to pull together the query and display files.

Left at this point, the solution might have been fine. However, the ways of Fusebox, once learned, aren't easily ignored. Having developers edit the log data right in the querysim definition made me a little nervous. Everyone on the project knew to leave the CFML alone and edit only the data inside the <cf_querysim> tag. Still, on the off chance that someone's finger would slip and delete the opening bracket of the </cf_querysim> closing tag, I decided to keep the data somewhere other than embedded directly in the <cf_querysim> tag. Enter Idea Two.

The Solution: Idea Two
Although it is probably the simplest part of the solution, Idea Two represents the true power of this approach: separate the data from the <cf_querysim> tag through the use of <cfinclude>. Using this idea, the qryWorkLog.cfm file became two files. The first is qryWorkLog2.cfm, which is just qryWorkLog.cfm with a small modification to remove the data and replace it with a <cfinclude> tag. The second is WorkLog.txt, which contains the data removed from qryWorkLog.cfm. These two files are shown in Listings 8 and 9.
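A query file split this way might look like the sketch below. The recordset name and the three field names are assumptions, since the original listing is not reproduced here, and WorkLog.txt is assumed to sit in the same directory.

```cfml
<!--- qryWorkLog2.cfm (sketch): the data lives in WorkLog.txt, not in this file.
      Recordset and field names are hypothetical. --->
<cf_querysim>
WorkLog
logDate,developer,entry
<cfinclude template="WorkLog.txt">
</cf_querysim>
```

WorkLog.txt would then hold nothing but pipe-delimited records, one per line, in the same field order. Because <cfinclude> processes the file as a template, the text file should not contain anything that looks like a CFML tag.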

The end result is the same output as shown in Figure 2. Nothing has really changed about the data or how it's presented. On the back end, though, we now have a standalone text file that can be edited without fear of breaking the querysim code.

Having implemented this solution, I looked at WorkLog.txt and realized it was nothing more than a simple log file, much like those generated by Web servers. That realization led me back to some discussions from various listservs and newsgroups about Web statistics packages and parsing server logs. It occurred to me that the use of querysims represented an easy way to import a server log into a CF recordset for further processing. And so we go on to The Next Idea.

The Next Idea: Server Logs to Recordsets
The records in a querysim data file are pipe-delimited. That is, each field is separated from the next by a vertical pipe (or bar) character. Most server logs simply have spaces between fields, which makes them awkward to parse because some fields, such as the timestamp and the request line, contain spaces themselves. In order to use the querysim tag, I'd have to take one of two approaches. I could either modify the querysim tag to parse the server log, or I could modify the server log to comply with the querysim tag's requirements. Because spaces aren't particularly good delimiters to begin with, I decided on the latter approach.

Fortunately, I do most of my work on servers that run Apache, so modifying the server log was really very simple. I went into the Apache configuration file, httpd.conf, and added the following line along with the other LogFormat lines:

LogFormat "%h|%l|%u|%t|\"%r\"|%>s|%b" pipedcommon

This defines a new log format called "pipedcommon", which is identical to the common server log format except that it uses pipes instead of spaces between fields. I then modified the CustomLog directive to use this new log format:

CustomLog logs/access.log pipedcommon

A quick restart of Apache and it was ready to go. Every request to the server causes a line to be written to the access log, so I made a few page requests to add lines to a new log file, creating the file in Listing 10.

Then I took a copy of the log file over to my ColdFusion test directory, where I had a new file waiting for it. This file, qryWebLog.cfm, is shown in Listing 11. It's identical in concept to the qryWorkLog2.cfm file seen in Listing 8, but the querysim has a different name and the field headings are altered to match the format of the server's access log. In addition, I've added a <cfdump> tag to the end of the file to quickly show that the server log has indeed been processed into a recordset.
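A file of this kind could be sketched as follows. The field names are hypothetical labels for the seven fields in the pipedcommon format (%h, %l, %u, %t, %r, %>s, %b), and the copied log is assumed to be named access.log and to sit beside the template.

```cfml
<!--- qryWebLog.cfm (sketch): field names are assumed labels for
      %h, %l, %u, %t, "%r", %>s, and %b in the pipedcommon format --->
<cf_querysim>
WebLog
remoteHost,remoteLogname,remoteUser,requestTime,requestLine,statusCode,bytesSent
<cfinclude template="access.log">
</cf_querysim>

<!--- Quick check that the log really became a recordset --->
<cfdump var="#WebLog#">
```

One caveat with this approach is that a request URL containing a pipe character would throw off the field count for that line, so real logs may need a sanity check before analysis.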

For live use I created a display file (dspWebLog.cfm) and a calling file (nextIdea.cfm), similar to the examples shown earlier, to create an attractive display of the access log's data. These files are shown in Listings 12 and 13, and the output from running nextIdea.cfm is shown in Figure 3.

As you can see, the Web log is neatly displayed in the browser window, in just the format I specified. This is the launching point for whatever sort of log analysis you might wish to perform. Particularly with ColdFusion's query-of-query capability, you could do just about any sort of analysis you might want on this recordset.
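As one small example of that kind of analysis, the sketch below counts requests per HTTP status code with a query of queries. It assumes the log recordset is named WebLog and has a column named statusCode; both names are hypothetical.

```cfml
<!--- Sketch: hits per HTTP status code, using a query of queries.
      WebLog and statusCode are assumed names. --->
<cfquery name="StatusCounts" dbtype="query">
  SELECT statusCode, COUNT(*) AS hits
  FROM   WebLog
  GROUP BY statusCode
  ORDER BY statusCode
</cfquery>

<cfoutput query="StatusCounts">
  #statusCode#: #hits# request(s)<br>
</cfoutput>
```

Querysim values arrive as text, so numeric work on a field such as the byte count may need an explicit conversion first.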

Other Applications for Querysims
As you think about querysims, of course, more and more uses for them become apparent. You can take advantage of <cf_querysim> any time you might want to convert text data into a recordset without worrying about a custom parser.

For example, you might want to create a bulk loader for text data. With <cf_querysim> loading the data into a recordset for you, loading the data into a database becomes a simple matter of looping over the recordset with a <cfquery> to insert the data. No doubt your imagination will be able to come up with its own uses for this extraordinarily useful custom tag.
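A loader of that sort might be sketched like this. The recordset name (WorkLog), the datasource (companyDSN), and the table and column names are all assumptions for illustration.

```cfml
<!--- Sketch of a bulk loader: one INSERT per record in the recordset.
      Datasource, table, and column names are hypothetical. --->
<cfloop query="WorkLog">
  <cfquery datasource="companyDSN">
    INSERT INTO work_log (log_date, developer, entry)
    VALUES (
      <cfqueryparam value="#WorkLog.logDate#" cfsqltype="cf_sql_varchar">,
      <cfqueryparam value="#WorkLog.developer#" cfsqltype="cf_sql_varchar">,
      <cfqueryparam value="#WorkLog.entry#" cfsqltype="cf_sql_varchar">
    )
  </cfquery>
</cfloop>
```

Using <cfqueryparam> keeps stray quotes in the text data from breaking the SQL statement.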

More Stories By Jeff Peters

Jeff Peters works for Open Source Data Integration Software company XAware.
