Untrusted Data Sources
By: Jackson Moore
Oct. 4, 2001 12:00 AM
To secure your Web-based application you must close all known holes in your hardware and software as well as those you inadvertently open in your application's code. This article addresses possible holes in your ColdFusion code that result from explicitly trusting the data your code accepts from URL parameters, form fields, cookies, browser variables, databases, or other external data sources. You must take measures to ensure that data from these sources won't cause your application to display improperly, crash, permit a security breach, or allow unintended server-side operations to be performed. Although the exploits described in this article aren't specific to ColdFusion and many have been around for years, we'll examine ColdFusion practices for protecting your application, including data validation, encryption, and data integrity.
Untrusted Data Sources
With URLs a user can simply add, change, or delete URL parameters in the query string. Form fields might not be as obvious, but it's a trivial exercise to save a form-based HTML page on your computer, change hidden form fields, or bypass client-side validation and resubmit the form. Likewise, a cookie can be easily manipulated by using a text editor to make a few careful changes. CGI variables that originate in the browser ("cgi.http_referer") must also be treated as suspect since they can be spoofed or blocked. Last, and maybe least considered, are the problems that can occur by not validating data retrieved from databases. Unless yours is the only database application, your application can't blindly trust the data in the database. What would happen if another application inserted a username containing a "<" character into the database? These are the biggest sources of untrusted data, but you must audit your own situation carefully. Other sources of untrusted data are CFFILE, CFHTTP, COM objects, and any other means of accessing data outside the direct control of your application.
What Can Happen
If you have a search form that allows users to search your Web site, you might display a response to the user such as "Your search for 'ColdFusion' resulted in a gazillion matches." But what happens if your user searches for something like "</TD>" or "<B>" or simply a "<"? This may cause your page to display improperly - reformatting your text or omitting portions of your page altogether. Furthermore, some browsers are known to crash when parsing ill-formed HTML. Improper display can also be caused when pulling data from a database that you don't exclusively control. A database of magazine articles may contain characters that cause display problems if you don't validate the data after you retrieve it from the database.
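One way to keep a search term from disturbing the page layout is to escape it before echoing it back. The following is a minimal sketch, not necessarily the approach the author had in mind; the field name "form.searchterm" and the variable "matchCount" are hypothetical.

```cfml
<!--- Hypothetical form field; default to an empty string if it was not submitted --->
<cfparam name="form.searchterm" default="">

<!--- Placeholder for the number of matches your search logic returns --->
<cfset matchCount = 0>

<cfoutput>
<!--- HTMLEditFormat() converts <, >, & and " to HTML entities so the browser displays them as text --->
Your search for '#HTMLEditFormat(form.searchterm)#' resulted in #matchCount# matches.
</cfoutput>
```

With this in place, a search for "</TD>" is displayed literally instead of being parsed as markup.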
Bypassed Client-Side Validation
As mentioned, "cgi.http_referer" (or any other client-side variable) can't be relied upon to verify the origin of a form submission. "cgi.http_referer" can be spoofed to make you believe that the form was submitted from your site or blocked (by a firewall or privacy software), preventing valid users from submitting forms.
Cross-Site Scripting
<script>window.open('http://somepornsite/');</script>
If you're not validating this text field, every user that views your message board will be "treated" to a new window that originates from and appears to be authorized by your site. Other potentially damaging HTML tags include <OBJECT> and <APPLET>.
Error Messages
As you can see from this figure, the user is presented with an error message revealing a physical path on the server. The amount of critical information displayed will depend on the type of error, your ColdFusion server settings, server platform, and database server.
Unauthorized Access
Unintended Server Operations
<cfquery name="qArticles" datasource="testDB">
This works fine if the URLs submitted by the user have valid article numbers, but not URLs like this:
http://magazine/display.cfm?article_id=14%20delete%20from%20articles
With Microsoft SQL Server (among others) this user just deleted every record in the "articles" table. Granted, the user must know the name of your table, but through the error messages mentioned earlier or simply guessing common table names, this isn't difficult.
You need to analyze your other server software for similar "features." For example, Allaire Security Bulletin ASB99-09 (www.allaire.com/handlers/index.cfm?id=11069&method=full) warns of an issue with Microsoft Access that allows users to append VBA commands to a SQL string. Any server software your application interfaces with that supports scripting needs to be audited for related patches. If you validate all user input, however, you're likely to avoid these vulnerabilities.
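As an illustration of validating "article_id" before it reaches the database, here is a minimal sketch using CFQUERYPARAM. The query name, datasource, and table name come from the example above; the column names "title" and "body" are assumptions made for illustration.

```cfml
<cftry>
    <!--- cfqueryparam checks that the value is an integer before the SQL is run.
          A value such as "14 delete from articles" fails validation and throws an error. --->
    <cfquery name="qArticles" datasource="testDB">
        SELECT title, body
        FROM articles
        WHERE article_id = <cfqueryparam value="#url.article_id#" cfsqltype="CF_SQL_INTEGER">
    </cfquery>

    <cfcatch type="any">
        <!--- Show a generic message instead of the raw error details --->
        <cfoutput>The article you requested could not be found.</cfoutput>
        <cfabort>
    </cfcatch>
</cftry>
```

Because the value is passed as a typed parameter rather than pasted into the SQL string, the appended "delete" never executes.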
What You Can Do
The built-in validation options are somewhat limited, though, so you'll probably need to write some of your own validation code. If you utilize the server-side method, be sure to use CFERROR to specify a more presentable validation error-handling template. Last, since the validation instructions (server-side) and JavaScript functions (client-side) are sent to the client with the form, they can be bypassed. That said, these techniques are useful in some cases and provide built-in validation for required fields, data types such as integer and date, as well as data values such as credit card and social security numbers. The online help provides documentation and examples for each method.
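For readers who haven't used the built-in server-side method, the sketch below shows hidden-field validation rules together with a CFERROR tag. The template names "register.cfm" and "validation_error.cfm" and the field names are hypothetical.

```cfml
<!--- In Application.cfm: send built-in validation failures to a custom template --->
<cferror type="validation" template="validation_error.cfm">

<!--- In the form page: hidden fields carry the validation rules.
      The suffix (_required, _integer) names the rule; the value is the error message. --->
<form action="register.cfm" method="post">
    First name: <input type="text" name="firstname">
    <input type="hidden" name="firstname_required" value="Please enter your first name.">

    Age: <input type="text" name="age">
    <input type="hidden" name="age_integer" value="Age must be a whole number.">

    <input type="submit" value="Register">
</form>
```

Since the hidden fields travel with the form, a user who saves and edits the page can simply delete them, which is why these rules can't be your only line of defense.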
Use Server-Side Validation
Furthermore, client-side validation can't perform some data validation such as preventing duplicate database records or testing uploaded files. The advantage to client-side scripting is that it provides immediate feedback to the user without a round-trip to the server for validation. I advocate using client-side scripting to check for required fields only, since you must validate the data on the server anyway. The advantages to this approach are that the required JavaScript implementation is small and you won't have to maintain two libraries of validation routines.
Checking for required fields can be easily implemented on the client-side through the CFINPUT tag without any knowledge of JavaScript. To make a field required, specify the "required" attribute:
<cfinput type="text" name="firstname" required="yes">
ColdFusion will supply the necessary JavaScript with the page returned to the browser. If the user tries to submit the form and leaves the "firstname" field blank, a JavaScript alert box will pop up to inform the user that this field is required (note that spaces are considered valid characters). Again, this assumes that JavaScript is enabled on the user's browser and hasn't been bypassed.
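Because the client-side check can be bypassed and accepts a field containing only spaces, the action page should repeat the test. A minimal sketch of such a server-side check, assuming the form posts a field named "firstname":

```cfml
<!--- Guard against the field being missing altogether --->
<cfparam name="form.firstname" default="">

<!--- Trim() removes leading and trailing spaces, so a value of only spaces counts as empty --->
<cfif Len(Trim(form.firstname)) eq 0>
    <cfoutput>Please enter your first name.</cfoutput>
    <cfabort>
</cfif>
```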
Scope Variables
Error Handling
Disabling the "Display the template path in error messages" setting in the ColdFusion Administrator will prevent physical paths from being displayed in most, but not all (see Figure 1), error messages that do get through to the user. For more information on handling errors look through your back issues of CFDJ for the "Toward Better Error Handling" series (Vol. 2, issues 10 and 12, and Vol. 3, issue 2) by Charles Arehart. These are good articles that will help even beginners get up to speed on error handling.
Data Type Validation
Two useful tags for this purpose are CFPARAM and CFQUERYPARAM. With CFPARAM, specify the "type" attribute with any of the following data types: array, binary, boolean, date, numeric, query, string, struct, or uuid.
<cfparam name="url.myValue" type="numeric">
If the value of "url.myValue" is anything other than numeric, ColdFusion will throw an error. Using this in a meaningful way requires CFTRY/CFCATCH blocks to catch the error and handle it properly.
CFQUERYPARAM tags are nested within CFQUERY tags and will give you some control over data validation. Refer to the ColdFusion documentation for CFQUERYPARAM for a list of SQL data types that can be validated. Again, ColdFusion throws an error if the validation fails.
ColdFusion also has several built-in functions (IsDate(), IsNumeric(), etc.) that allow you to determine data types without using error handling. If you need more specific data-type validations (such as integer or hexadecimal), you'll have to write your own validation code.
Another helpful ColdFusion function for validating numeric data types is Val(). Val() will return a number from the beginning of a string or 0 if the string can't be converted to a number. This function is particularly handy when dealing with a peculiar aspect of ColdFusion's support for scientific notation. ColdFusion treats the value "1D2" the same as "1E2" - both are scientific notations for "100". ColdFusion considers both values to be valid numeric values, but your database probably won't accept "1D2" even if it supports scientific notation. Using the Val() function will ensure that the value is converted to a plain number that your database will recognize:
#val("1d2")# <!--- 100 --->
For example, if your user enters his or her age into a form as "3D1" (30), you want to make sure that inserting this value into your database won't cause an error even though this is a valid number in ColdFusion.
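The pieces described above (CFPARAM inside CFTRY/CFCATCH, Val(), and hand-written checks for types ColdFusion doesn't cover) might be combined as in the following sketch. The variable name "cleanValue" and the error messages are illustrative only.

```cfml
<cftry>
    <!--- Throws an error if url.myValue is missing or not numeric --->
    <cfparam name="url.myValue" type="numeric">

    <cfcatch type="any">
        <cfoutput>The value supplied is not a number.</cfoutput>
        <cfabort>
    </cfcatch>
</cftry>

<!--- Normalize the value so that input such as "1D2" is stored as a plain number --->
<cfset cleanValue = Val(url.myValue)>

<!--- A hand-written integer test: the whole value must consist of digits only --->
<cfif not REFind("^[0-9]+$", cleanValue)>
    <cfoutput>The value supplied is not a whole number.</cfoutput>
    <cfabort>
</cfif>
```

Running Val() before the digit test means a value like "1D2" is judged by what will actually be sent to the database.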
Data Value Validation
http://magazine/display.cfm?article_id=14
If the user manually changes "article_id" to a different (but valid) integer, should the user be presented with the article carte blanche? In this example the answer is no. For each article the user requests, we must authenticate and verify that he or she is allowed to view the requested article. You can't assume that users won't try to guess new paths or values for URL parameters, form fields, or cookies in order to access restricted portions of your Web site.
Another case where data value validation is critical is when the range of legal data values for a data type in ColdFusion differs from your database. ColdFusion, for example, considers dates as large as December 31, 9999, as valid dates. In SQL Server 7, the "datetime" data type will accept this value, but the "smalldatetime" data type won't. To prevent database errors with date data types, you need to ensure that the date value is within an acceptable range for your database column.
There are many other reasons why you need to validate the value of a piece of data, not just its data type. Preventing duplicate records in a database or making sure the length of a string value is within acceptable minimum and maximum lengths are additional examples of when the value of the data must be validated. The bottom line is that (just like data types) you can't make any assumptions about the value of data received from a client.
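To make the date-range point concrete, here is a minimal sketch that checks a submitted date against the range of a SQL Server "smalldatetime" column (January 1, 1900, through June 6, 2079). The field name "form.publish_date" is hypothetical.

```cfml
<cfparam name="form.publish_date" default="">

<!--- Bounds of the smalldatetime data type --->
<cfset minDate = CreateDate(1900, 1, 1)>
<cfset maxDate = CreateDate(2079, 6, 6)>

<cfset dateIsValid = false>
<!--- First confirm the data type, then confirm the value falls inside the allowed range --->
<cfif IsDate(form.publish_date)>
    <cfif DateCompare(form.publish_date, minDate) gte 0>
        <cfif DateCompare(form.publish_date, maxDate) lte 0>
            <cfset dateIsValid = true>
        </cfif>
    </cfif>
</cfif>

<cfif not dateIsValid>
    <cfoutput>Please enter a date between January 1, 1900, and June 6, 2079.</cfoutput>
    <cfabort>
</cfif>
```

The nested CFIF tags keep DateCompare() from ever being called on a value that isn't a date.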
Defining Character Sets
In nearly all cases you'll protect yourself better by first defining which characters to allow and then making sure that the suspect string contains only those characters. This is preferable to defining which characters are illegal (though that may seem easier to implement): with a list of illegal characters it's very easy to forget one, whereas a list of allowed characters also reduces your vulnerability to (as yet) unknown exploits. As an example, let's consider a form field that accepts a username and requires all usernames to contain only letters. It's safer to verify that only letters are submitted as opposed to checking for all possible illegal characters. You can use a regular expression to verify that no other character is allowed:
<cfif refindnocase("[^a-z]",form.username) neq 0>
The validation becomes more complicated for string data that has to be more flexible. Our example of the cross-site scripting exploit was due to a developer allowing <SCRIPT> tags in message board posts. This particular example could have been avoided by defining a character set that excluded the "<" and ">" characters. But what if you want to allow users to use <B>, <I>, and <U> for formatting purposes within their message post? Now you have to allow some tags but not others.
As with defining single character sets, you're far safer determining which HTML tags to allow as opposed to identifying the tags that may potentially cause problems. I take issue with some of the available custom tags that focus on which HTML tags to disallow because future additions to the HTML specification or browser-specific tags may require you to revisit your code. If you want to allow bold, italic, and underline formatting in your message posts, allow only those and disallow all others. The following code will remove all tags except for <B></B>, <I></I>, and <U></U>:
<cfset
This, of course, won't validate the HTML to ensure it's well formed. Limiting the allowable HTML tags will minimize the amount of validation needed since you don't want to have errors caused by a user who failed to provide an end tag or improperly nested certain tags. If the user leaves out a "</B>" tag, you won't have a problem if the message post is encapsulated in a table cell. However, a misplaced <DIV> or </TD> can really wreak havoc on the display of your page or crash the browser.
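One way to approximate the "allow only <B>, <I>, and <U>" policy is sketched below. Note that it takes a different route from the code described above: instead of removing the disallowed tags, it escapes every tag and then restores only the three allowed ones, so a disallowed tag shows up as harmless text. The variable names "form.message" and "safeMessage" are hypothetical.

```cfml
<cfparam name="form.message" default="">

<!--- Step 1: escape everything, so no markup from the user survives as markup --->
<cfset safeMessage = HTMLEditFormat(form.message)>

<!--- Step 2: restore only the tags on the allow-list (opening and closing forms) --->
<cfloop list="b,i,u" index="tagName">
    <cfset safeMessage = ReplaceNoCase(safeMessage, "&lt;#tagName#&gt;", "<#tagName#>", "ALL")>
    <cfset safeMessage = ReplaceNoCase(safeMessage, "&lt;/#tagName#&gt;", "</#tagName#>", "ALL")>
</cfloop>

<!--- safeMessage is now ready to be stored or displayed --->
<cfoutput>#safeMessage#</cfoutput>
```

Only tags written exactly as <B>, <I>, or <U> (in any letter case, with no attributes) are restored, so something like <B onmouseover="..."> stays escaped. Like the original approach, this doesn't check that the allowed tags are properly closed or nested.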
Encryption
http://magazine/display.cfm?article_id=0101J442509E78541
The new value of "article_id" makes it less obvious to the user how "article_id" is used. Curious users may still tamper with the value, but after failing once or twice most will give up.
ColdFusion offers two complementary functions, Encrypt() and Decrypt(), to perform encryption. They both take a string to encrypt or decrypt as well as a key to use in the process. Encrypt()'s XOR-based encryption, though useful, is weak compared to encryption methods (such as that used in 128-bit SSL) considered secure by today's standards. Encrypt() shouldn't be used for sensitive information (credit card numbers or medical information), but it can be used to prevent most attempts to casually change "nonsensitive" values in URLs, form fields, and cookies.
Strings encrypted with Encrypt() can contain many potentially damaging characters (new lines, form feeds, backspaces, greater-than signs, carriage returns, etc.):
encrypt("Macromedia","key") = *=(IX 0B@5M%$V#\!
The last character (which you can't see) is a new line character. Practical use of the Encrypt() function will generally require another encoding step such as URLEncodedFormat() to escape all illegal characters in the string. Your decryption process then becomes a two-step process as well - decode and decrypt.
Another drawback to the Encrypt() function is that there's no data integrity check to ensure that the encrypted string hasn't been tampered with. Using the above example, I changed the number 5 to 2 in the encrypted string and then decrypted it:
decrypt("*=(IX 0B@2M%$V#\! ","key") = Macromydia
As you can see, the string was decrypted without error but resulted in a value different from the original. At the risk of losing some objectivity, I'll mention a tag I wrote called CF_CRYP (available from the Developers Exchange, http://devex.allaire.com/developer/gallery/).
CF_CRYP builds on the default encryption of the Encrypt() function, but encodes the resulting string and adds a checksum. The result is an encrypted string containing only numbers and letters with a checksum. Changing a single character in a CF_CRYP-encrypted string can be detected because the checksum of the decrypted value won't match the checksum of the original unencrypted value. CF_CRYP also provides a return structure with error information so you can detect tampering and, potentially, block that user from further access. Using CF_CRYP to encrypt a value of "14" with a key of "somekey" produces "0101J442509E78541". To access the subscriber-only articles, the user would have to know the encoding scheme and the key, and recalculate the checksum. This is a good solution for this example because if the encryption scheme were broken, the worst that could happen is that an unauthorized user views some subscriber-only articles. If this happens, you should probably give the user a free subscription to your magazine anyway!
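The idea of a checksum that exposes tampering can be illustrated without CF_CRYP. The sketch below is not CF_CRYP's actual scheme and doesn't hide the value; it only appends a keyed checksum built with ColdFusion's Hash() function so that a changed "article_id" can be detected. The parameter name "check" is hypothetical, and the key must be kept on the server.

```cfml
<!--- Shared by both pages (for example, set in Application.cfm); never sent to the client --->
<cfset secretKey = "somekey">

<!--- Page that builds the link: compute a checksum over the value plus the secret key --->
<cfset articleId = 14>
<cfset check = Hash(articleId & secretKey)>
<cfoutput>
<a href="display.cfm?article_id=#articleId#&amp;check=#check#">Read the article</a>
</cfoutput>

<!--- display.cfm: recompute the checksum and compare it with the one received --->
<cfparam name="url.article_id" default="">
<cfparam name="url.check" default="">
<cfif CompareNoCase(url.check, Hash(url.article_id & secretKey)) neq 0>
    <!--- The value or the checksum was altered (or is missing) --->
    <cfoutput>The link you followed is not valid.</cfoutput>
    <cfabort>
</cfif>
```

A user who changes "article_id" to 15 can't produce the matching checksum without knowing the key, so the request is rejected before any article is retrieved.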
Use CFCONTENT
Database Validation
Even if a user manages to get multiple commands through to your server, only those operations authorized for this user would be allowed. You must combine this type of security with the other guidelines in this article since even a benign SELECT operation can allow a user to view unauthorized content. If you use other server-side applications (including COM objects), take advantage of any built-in user authentication. Most databases will also allow you to perform data validation at the database level. This is a good solution when you have numerous applications inserting data into a database. By moving some data validation to the database, you can remove redundant validation routines from each application.
Conclusion
Taken together, the techniques presented in this article will help you protect your application from any malicious data sent by a user. Whether the data is tampered with accidentally or on purpose, it's your job as the developer to close these code-level holes.