| Close Window |
There are many resources we should analyze to ensure optimal ColdFusion operation or to help diagnose problems. Fortunately, there's an awesome free tool that comes to our aid to turn voluminous data into useful information.
In this article, I'd like to introduce you to the free Log Parser tool from Microsoft. Yes, it's free. And while you may not run ColdFusion on Windows, that's okay. You can use it on a Windows machine to monitor resources on a Linux machine. The tool applies just as well to those running BlueDragon or any CFML, PHP, .NET, or other environment.
I'll show you the many ways you can use the tool to solve common challenges in ways that no other monitoring feature in CF or even the dedicated monitoring tools like SeeFusion or FusionReactor can do. In one example, I'll show you how it can provide application-specific error log information that many struggle to obtain.
Basics of the Tool
Despite its name the tool is about more than just log files. What kind of resources can Log Parser monitor? It started out focused on Web server log files, and indeed it can do that (in IIS, W3C, and NCSA formats), but it can also analyze all manner of tabular text files (CSV and TSV), as well as XML files, the Windows Registry, the file system, the IIS metabase, the Windows event logs, the Windows Active Directory, and NetMon files.
And while it can produce simple textual results, it can also generate output to a file in plain text, CSV, TSV, or XML formats, as well as produce charts, and store results in a database
But the best, most compelling, and truly unique aspect of the tool is how you go about analyzing the input files. There's no interface for the tool. So how do you describe what data to retrieve? Would you believe you use SQL? That makes it an especially compelling tool for CFML developers, since we're used to using SQL already.
I'd like to focus here on particular forms of information that would be useful for CF developers and administrators to analyze using the tool, so I won't focus on its installation or basic use. I'll direct you to other resources at the end of this article that will get you started.
I'd just like to clarify that Log Parser is meant to be used at the command line (logparser.exe). I'll assume that you have it installed and configured so you can issue commands such as its help option:
Logpart -h
Of course, you can also use such a tool from the CFEXECUTE tag, but again that's beyond the focus of this article. I'll point you to a resource later that will cover such additional details.
Analyzing CF Log Files
It may seem odd at first to contemplate using SQL to analyze log files and the other non-database resources mentioned above, but it really is effective. While most of the existing resources that introduce the tool focus on analyzing Web server logs, I'd like to start by showing an analysis of ColdFusion logs.
Now, how could a tool that analyzes log files regard the columns in that file as columns to be used in SQL? Well, many log files do have a first line called a "header" line that provides a list of names that identify each column in the log file.
Even if the log file doesn't offer a header file, the Log Parser tool has a clever mechanism to analyze the first 10 lines of the file to try to determine what kind of data is in each column and create generic column names.
For this article, let's consider the application.log file that tracks errors in your CFML code and is stored in the "logs" directory where CFMX is installed. On my machine, that's c:\cfusionmx\logs. To query all the columns in all the records of that file, I could issue:
logparser "select * from C:\CFusionMX\logs\application.log" -i:csv
Note the familiar "select *" SQL statement that says "select all columns" and the "from" clause that names the actual log file to be processed, all of which is embedded in double quotes. The additional "-i:csv" argument tells the logparser engine that the file is a CSV (comma-separated value) format file. A subset of the result as it might appear is:
Filename RowNumber Severity ThreadID Date Time Application Message
--------------------------------- --------- ----------- -------- -------- -------- ----------- -------------
C:\CFusionMX\logs\application.log 2 Information jrpp-0 06/21/06 19:53:15 <NULL>
C:\CFusionMX\logs\application.log initialized
C:\CFusionMX\logs\application.log 3 Error jrpp-0 06/21/06 19:53:15 <NULL>
Error Executing Database Query.Data source not found.
The specific sequence of files included or processed is: C:\Inetpub\wwwroot\test.cfm
Yes, it's a jumble for now (all dumping onto your screen at the command line), but later we'll see how to limit what columns to show as well as how to write the output to a file for more effective observation.
Understanding Column Headers
The first line above shows all the column headers that were found in that header file. You can also have the tool list the columns that it's found or detected using the option -queryinfo, such as this:
logparser "select * from C:\CFusionMX\logs\application.log" -i:csv -queryinfo
In the data shown will be the list of columns found. In this example, it would be:
Query fields:
| Filename (S) | RowNumber (I) | Severity (S) | ThreadID (S) |
| Date (S) | Time (S) | Application (S) | Message (S) |
Note that the "S" and "I" indicators mean the columns hold strings or integers, respectively. Other types are "T" (timestamp) and "R" (real). We can use any of these column names in the SELECT statement to limit what columns we display. More important, we can use them to limit what "records" we display.
Limiting the Records Found
Going back to the earlier example that just did a "select *" the result was basically the same as if we'd just dumped the file to the screen. Where the tool gets powerful is in using additional SQL clauses to refine the search. For instance, since we see that one of the columns is "severity," which has values such as "Error" or "Information," we could limit the list to just those with errors using:
logparser "select * from C:\CFusionMX\logs\application.log where severity='Error'" -i:csv
Note the addition of where severity='Error.' As with most SQL engines, the string value must be surrounded with single (not double) quotes. On the other hand, notice that the first character of the value is capitalized because, unlike most SQL engines, the comparison here is case-sensitive.
Of still more interest may be to limit the errors to a given application. Since that's another field, you can use this:
logparser "select * from C:\CFusionMX\logs\application.log
where severity='Error' and application='test'" -i:csv
This would list only those errors for the application "test."
Grouping Records
Now, perhaps a different interest would be to see a breakdown of errors by application. Again, this is easy in SQL and therefore easy in Log Parser:
logparser "select application, count(*) as NumErrors
from C:\CFusionMX\logs\application.log
where severity='Error' group by application" -i:csv
First note the use of the "group by application" clause, which changes the SQL to group all the records of the same application together. Then note the use in the SELECT clause of "COUNT(*) as NumErrors," which will count how many records there are in each such group (by application), reporting it as a NumErrors column. Finally, note the addition of application to the Select list to show the name of each application. The result of this might be:
| Application | NumError |
| --------------------------- | -------- |
| <NULL> | 473 |
| cfadmin | 9 |
| App1 | 1 |
| test_admin | 60 |
| demoapps | 3 |
| ContactManager | 1 |
Of course, if we wanted the list to be sorted either by the application names or the number of errors, we could add an "Order by" to the SQL.
Creating an Output File
The examples to this point
have simply output the results to the screen (command line), but you
may prefer to write the results to a file instead. You do this using an
INTO clause, which is placed at the end of the SELECT (and before the
FROM), to name a file, such as:
logparser "select * into test.txt from
C:\CFusionMX\logs\application.log where severity='Error'" -i:csv
Notice here that I've switched back to the earlier example of listing all the errors from the application.log file. Note as well that the INTO clause can specify a full or relative path (or a UNC path) for the file name, otherwise if none is provided, it will simply be stored in the current directory where the command takes place.
The file output format will default to what has been shown on screen to this point, which Log Parser calls a "NAT" format (native or tabular data format). But as mentioned at the outset, you can also cause the output (on screen or to a file) to be any of several other formats such as CSV, TSV, and XML. For instance, you could change the output file to an XML file:
logparser "select * into test.xml from
C:\CFusionMX\logs\application.log where severity='Error' " -i:csv o:xml
Creating an Output File Per Application: Finally Error Files Per App!
A unique feature of the Log Parser tool helps resolve a problem that
has long plagued CF administrators. If you have multiple applications
and development teams on one server, you (or they) have probably wished
you could create a separate error log file for each application. Well,
Log Parser makes this a snap, using a feature called "multiplexing."
It's really as simple as using wildcard characters in the INTO clause, which then correspond to the first column listed in the SELECT clause. Continuing the example above, recall that it's listing all the columns for all the messages that reflect an error. We can alter that to name the application column first, and while we're at it we may as well limit it to just that column, the filename, message, and date and time, as in:
logparser "select application,filename,message,date,time into errors_*.txt
from C:\CFusionMX\logs\application.log where severity='Error' " -i:csv -o:csv
Using the data in the log file reflected so far, where there were errors for the applications such as "App1" and "demoapps," this would create a file such as errors_App1.txt, with the errors for App1, and errors_demoapps.txt, with the errors for demoapps. You'll notice that I've changed the output format to CSV. For some reason, the default NAT format isn't supported for multiplexing, though TSV, XML, and others are. See the product docs for more information. This is really a very powerful mechanism. Imagine that after creating each app-specific file, you could go further and send each file to the appropriate application owner.
In fact, you can go further still to create a directory for each application instead. The secret is that you may use the * anywhere in the path. Just as the command above will create new files even if they didn't exist, you could have the tool create new directories as well. Consider the following:
logparser "select application,filename,message,date,time into *\errors.txt
from C:\CFusionMX\logs\application.log where severity='Error' " -i:csv -o:csv
Note the use of "*\errors.txt" for the Into clause. This will create a new directory for each application and store the file as errors.txt. (You could certainly repeat the * again if you wanted the application name in the file name too.) Imagine how you could now give the owners of each application access to "their" directory. And again, the path you name for the output directory could have used a UNC path to put the output on some shared server.
(As an aside regarding multiplexing, note that the first column listed is used only for determining the file name and doesn't appear in the result. If you really want it in the result, simply list the column twice.)
But How Do I Get Only the Latest Log Entries?
We're about out of space, but I'd like to demonstrate just one more
really compelling feature. Regarding the last example, you may be
thinking, "this all sounds kind of interesting, but it will stink as
the logs grow and grow in size. I wish I could just get the latest log
entries." That's a very logical concern. And like the old steak knife
commercials on TV, the Log Parser tool makes me want to say, "but wait,
there's more!"
The tool has a feature called checkpointing that's really as simple as naming a checkpoint file that will track the row number in the log file where processing stopped. It's really as simple as naming a checkpoint file. Consider the following:
logparser "select application,application,filename,message,date,time into *\errors_*.txt from
C:\CFusionMX\logs\application.log where severity='Error' " -i:csv -o:csv
-icheckpoint:test.ipc
Note the specification of the "-icheckpoint" argument, naming a file, in this case "test.ipc." Again, the checkpoint file specification can include a path so it can be put anywhere, or it will default to the directory where the logparser command is executed. The feature tracks where log processing has stopped and where subsequent log analysis should start. It can be used whether you're outputting the results to screen or file.
There's one thing to think about regarding use of this feature with the multiplexing (or indeed any log file output) operation: it will overwrite any previously existing file. This operation may make most sense when you're running the operation periodically and immediately sending the logfile somewhere else. That way, the recipient will get a new file whenever there's new content of interest to them. Nifty!
Other Log Parser Features: Too Many to Name
I
could go on. Clearly, this is a powerful tool, and we've only scratched
the surface. I've provided several resources to help you learn more and
see more examples. Consider briefly the following additional features:
- Again beyond log files, the tool can analyze Web server logs, the
metabase, the registry, the file system, windows event logs, the Active
Directory, and so on
- Input files can not only be pointed to on local and UNC paths, but
they can also be specified as URLs as well. So, for example, you can
use this tool to read remote CSV and XML files (including RSS feeds)
- You can pipe the output of command-line tools to be processed as input to the tool
- You can store the report output in a database
- You can generate charts from the tool
- Besides simple tabular output to the screen, there is also an available Windows datagrid
- You can choose to store the SQL in a file and point to it from the command line, and even pass in parameters to such a file
- You can name more than one file in the FROM (filename,filename), and
the tool will automatically connect the files together (a "union" in
SQL terms)
- Since you can read in XML files, consider that this lets you perform SQL against the XML, which CFML can't currently do
- The SQL grammar available is quite substantial. I've only used the simplest examples
- It also offers many useful SQL functions, including ones unique to the tool, like PropCount and PropSum
- The -h command-line option also offers help on SQL grammar and shows examples as well
- And this is just some of what it does
Also, I can think of additional CF-specific kinds of analysis, of various input sources, that I'd like to elaborate. For instance, I've only showed working with the application.log file, and only the logs in the \cfusionmx\logs directory. Note that there are many other log files in ColdFusion that could be useful to analyze. (You can even use Log Parser's ability to query the file system to report what log files exist.) Of course, it's often useful to analyze Web server logs (whether IIS, Apache, or the built-in Web server in CFMX) to help identify performance problems.
Beyond that, there's data in the event logs (related to running CF services), as well as possibly data in the registry (such as client variables, perhaps existing unexpectedly). And since the tool can read XML files generated and available via a URL, it could be used to monitor and report on the output of both SeeFusion and FusionReactor, which can output XML files. I'll offer a URL at the end to name a place where I plan to elaborate on such opportunities as an extension to this article.
Resources for Learning More
Clearly, there's much
more to learn. Perhaps the most important resource to start with is how
to download the tool. There are a couple of links to note. The
Microsoft link is www.microsoft.com/technet/scriptcenter/tools/logparser/default.mspx. The tool includes a help file with considerable extra detail. Beyond that, there's also an "unofficial" Web site at www.logparser.com/. It includes a knowledge base, forums, and links to articles.
But the best resource of all, to really learn and appreciate the tool, is the book Microsoft Log Parser Toolkit, available from Syngress at www.syngress.com/catalog/?pid=3110. Since it's a couple of years old, the book is also available used at Amazon. Even so, it's an awesome resource. You'll learn things you never dreamed were possible.You may enjoy the following articles:
Finally, as I alluded to above, I plan to create an area on my Web site to expand on this discussion and elaborate more CF-specific aspects of using Log Parser. Look for that at http://www.carehart.org/logparser/. I welcome your feedback.
© 2008 SYS-CON Media Inc.