|By Eric Brancaccio||
|April 8, 2002 12:00 AM EDT||
When an application is functioning exactly as expected, does that mean it's bug-free? Not necessarily.
ColdFusion provides developers with many useful and powerful debugging techniques. If I find that part of an application isn't functioning the way I expect, I can usually use one or more of them to target the problem.
I consider debugging to mean much more than targeting obvious malfunctions in an application. It also involves taking into account performance and security, and making sure that execution time of all pages is optimal and that no part of the application is vulnerable to malicious attacks.
Obviously, you could write an entire book on performance and security checks, but I hope to share some of the ones I feel are most important and most often overlooked. An application could seem to be functioning fine, but it could still be inefficient and unsafe once it's opened up to the outside world. I go through several distinct procedures to make sure my code is bug-free.
First I configure my development environment to give me as much feedback as possible about the pages being processed. This includes enabling all debug output to be appended to each processed page. From the security/performance standpoint this tells me a lot about execution time, and it lets me see all form, URL, and browser variables, which should always be dealt with carefully. This article discusses some of the practices I use, once this is in place, to keep my code fast and safe.
Paramaterized Queries and Bind Variables
I believe that database interaction is the area in which most developers can improve their coding techniques. Many performance and security issues arise when communicating with a database. I develop the majority of my applications using an Oracle 8.1.6 database. The <CFQUERYPARAM> tag provides tremendous performance and security benefits when used with this database because it enables the use of bind variables in queries.
Before ColdFusion 4.5.1 the only way to implement bind variables was to write a stored procedure. For example, if I want to insert a large block of text into my database, I'd have to write a pl/sql block or a stored procedure to convert a literal to a bind variable like this:
DECLARE Long_Text LONG;
Long_Text := '#LongTextVariable#';
The introduction of the <CF-QUERYPARAM> tag made the process much more manageable. It's possible to code an entire application without using a bind variable, but I've found that using them within my queries has resulted in performance increases of up to 700%. The reason for this performance increase lies in Oracle's execution path.
Let's look at the following query:
<CFQUERY DATASOURCE = "dsn">
WHERE survey_id = #session.survey_id#
If the survey table contains 50,000 records, depending on the session variable, the SQL statement generated by ColdFusion Server will be something like this:
SELECT survey_name FROM survey WHERE survey_id = 34791
When the Oracle database receives this statement, it must decide on an execution path for retrieving the data. The path it finds is the optimal way for it to retrieve the data from the table. For example, it would be more optimal to do an index read than a full table scan.
The execution path that Oracle generates can be represented as a tree structure in which each node represents an operation. This path can be analyzed by using Oracle's EXPLAIN PLAN command. To minimize the time it takes to find this optimal execution path, Oracle keeps a cache of queries and their optimal execution paths. Each value for the session variable above will result in a different cache entry, however, so not much time will be saved.
The key to taking advantage of Oracle's caching of queries and their execution paths is use of the <CF-QUERYPARAM> tag, which will pass the session variable to the database as a bind variable. Optimizing the query above, it becomes:
<CFQUERY DATASOURCE = "dsn">
WHERE survey_id = <CFQUERYPARAM VALUE = "#session.survey_id#" CFSQLTYPE =
"CF_SQL_NUMERIC" MAXLENGTH = "5">
The query that's sent to the database is now:
SELECT survey_name FROM survey WHERE survey_id =: 1
The ": 1" that's substituted in the query is a parameter reference, so now there will be only one cache entry and corresponding execution plan for every possible value of the session variable. When dealing with large tables, the SQL cache will fill up very fast if bind variables aren't used. Performance will be lost since the CPU will be busy parsing so many different execution plans.
Using the <CFQUERYPARAM> tag also takes care of handling potentially unsafe data, primarily FORM, URL, COOKIE, and browser variables, which can be especially threatening when we use them to interact with our database. Many applications include queries that contain FORM or URL variables. I once came across an application that would log in users with a query like:
<CFQUERY DATASOURCE = "dsn" NAME = "checkusers">
WHERE user_name = '#FORM.user_name#'
AND password = '#FORM.password#'
<CFIF CHECKUSERS.RECORDCOUNT IS 1>
< !--- Login is valid --->
Clearly this query would be very vulnerable to a malicious attack and is a very insecure login method. If someone were to modify the contents of the FORM.password variable to something like:
"(SELECT password FROM users WHERE user_name = 'jdoe')"
they could log in as any user whose username they happen to know. A simple solution to this problem would be to use a parameterized query:
<CFQUERY DATASOURCE = "dsn">
WHERE user_name = <CFQUERYPARAM VALUE = " #FORM.user_name#" CFSQLTYPE =
"CF_SQL_VARCHAR" MAXLENGTH = "10">
AND password = <CFQUERYPARAM VALUE = " #FORM.password#" CFSQLTYPE =
"CF_SQL_VARCHAR" MAXLENGTH = "10">
Now if a string longer than length 10 is passed in, an error will be thrown and the query won't be executed.
There are situations in which parameterized queries aren't the right choice. In the previous example the session variable will change often depending on which of the 50,000 surveys is being accessed. Let's consider a situation in which we are writing a query based on a variable that can have only a few values:
<CFQUERY DATASOURCE = "dsn" CACHEDWITHIN =
WHERE region_id = #region_id#
In this situation it's most efficient to cache the query.
There are a few final things I want to touch on concerning database interaction. Make sure to pay close attention to the query execution time in the debug output. Sometimes simply reorganizing the order of the SQL clauses can give a performance gain, so all possible structuring of a SQL statement should be tested. Also, when you expect a large amount of rows to be returned, make sure to specify a blockfactor. This will reduce unnecessary trips to the database and can cut down execution time substantially.
The GetTickCount( ) function is a valuable tool for analyzing code performance. ColdFusion Server allows a developer to get the execution time of entire templates, but often we need to get the execution time of specific pieces of code within a template. I use the GetTickCount( ) function along with the IsDebugMode( ) function throughout my applications so I can get feedback on every bit of processing that ColdFusion Server will perform. I surround blocks of code that I'm analyzing like this:
<CFIF IsDebugMode( ) >
<CFSET GetTickCount( ) = t1>
<!- - - Code Block Here - - ->
<CFIF IsDebugMode( ) >
<CFSET TotalTime = GetTickCount() - t1>
The IsDebug( ) function is used to test whether debugging has been enabled. If this function returns false, the variable "TotalTime" won't be displayed. This practice is very useful, especially on pages when a lot of server-side processing is being done. It tells a developer the most efficient coding techniques to use to accomplish certain tasks. It's a good idea to write a custom tag to time code blocks so you can analyze blocks of code like this (see Listings 1 and 2):
<!--- Code Block Here --->
If security is an issue, all data validation should be performed on the server side. Client-side validation is functionally fine, but it can easily be bypassed. Here are some server-side validation techniques I use to make sure the data I use is safe. First I make sure every request is coming from my server:
<cfif not Find( "myDomainName", CGI.HTTP_Referrer)>
<! --- This has not come from my server --->
This check alone is still not good enough because the http_referrer is a client variable and can be modified. A developer must look at every variable being handled and ask: "How could this variable have unwanted effects on my application?" It's important to check for the existence of all variables using the IsDefined( ) function. Then I use the <CFPARAM> tag, not to assign a default value if one doesn't exist, but to validate the data type:
<CFPARAM NAME = "user_id" TYPE = "numeric">
Not only does specifying a data type speed up processing, but also, if the user_id variable isn't numeric, ColdFusion will throw an error.
I believe it's important to become familiar with Regular Expressions. Often when I handle string data I need to put some complex validation rules in place, and the use of Regular Expressions facilitates this. For example, any variable that will be outputted on a page should be checked to make sure it doesn't contain any HTML tags other than <B> or <I>. If a </td> or a </div> tag is outputted on a page, it can ruin the display of the page. I have come across so many message board applications in which a user can post some text with a hyperlink that starts an infinite loop of window.open( ) calls and crashes the user's browser. A simple fix for problems like these, using Regular Expressions, is:
<CFSET TextToBeDisplayed =
For numeric data I always check it first with the Val( ) function. This returns either a number or zero (if the string passed in can't be converted to a number). This is useful because some values that ColdFusion considers to be numeric could have undesired results if handed off to a database. The Val( ) function converts these values to a number:
<CFSET MyNum = "5d3">
<CFPARAM NAME = "MyNum" TYPE = "numeric">
<CFSET MyNum = Val(MyNum)>
<CFOUTPUT> #MyNum# </CFOUTPUT>
<! --- 500 will be printed --->
I then make sure the length and range of the number are consistent with what I'm expecting.
Clearly, data validation is something that must be done; however, if it's performed on the client side, it's basically worthless protection against people who are purposely trying to disrupt your site.
Encrypting and Encoding
I believe in encrypting all URL and FORM and form variables. This makes it much more difficult for someone to tamper with an application. For example:
<input type = "hidden" name = "user_id" value = "25"> or
These would give a user much more information than:
<input type = "hidden" name = "user_id" value = "04J039M9384N"> or
ColdFusion provides the Encrypt( ) and Decrypt( ) functions, but I recommend developing your own tag, which should include some sort of check to make sure the data hasn't been changed between the encryption and decryption process.
Not only should URL variables be encrypted, but they should be encoded as well, using the URLEncodedFormat( ). This will ensure all characters that could cause problems will be escaped.
A stronger form of encryption can be achieved through the Hash( ) function. This can be used only to compare values since there's no way to decrypt the string. I use this occasionally as a way to check whether certain values have been changed.
A thorough debugging is a vital part of the development process. In many instances an application that my team developed ran perfectly in the test environment. Once we opened it up to the world, however, various problems would arise - primarily database performance problems due to the CPU on our Oracle box parsing so many execution plans. Even the load testing we did while testing failed to reveal this problem.
To improve our security I began logging all queries to better understand the attacks on our systems. This taught me all sorts of new methods for protecting my applications. Choosing the code-checking techniques I thought were most important and at the same time most infrequently used by most developers was a difficult task. Hopefully these techniques will improve the overall quality of your applications.
|Kevin Brown 04/09/02 10:44:00 AM EDT|
Being a reader of CFDJ for about a year now, I'd like to point out that this is one of the most useful/understandable articles I have seen so far. Could you please direct me to past articles written by Eric Brancaccio.
- Where Are RIA Technologies Headed in 2008?
- The Next Programming Models, RIAs and Composite Applications
- AJAX World RIA Conference & Expo Kicks Off in New York City
- Constructing an Application with Flash Forms from the Ground Up
- Building a Zip Code Proximity Search with ColdFusion
- Personal Branding Checklist
- CFEclipse: The Developer's IDE, Eclipse For ColdFusion
- Has the Technology Bounceback Begun?
- Adobe Flex 2: Advanced DataGrid
- i-Technology Viewpoint: We Need Not More Frameworks, But Better Programmers
- Web Services Using ColdFusion and Apache CXF
- Cloud People: A Who's Who of Cloud Computing