|By Charlie Arehart||
|February 22, 2001 12:00 AM EST||
What database do you use for your ColdFusion applications? Are you fully satisfied with your choice? Did you even have a choice? Would you be interested in knowing what other alternatives you may have? In this article I present some of the discussions that inevitably arise when considering database choices available to ColdFusion developers.
Not a Simple Matter
When this special database issue was planned, the editors asked if I'd write an article on this subject. I was at once intrigued and concerned. Any discussion of databases is likely to touch off hot buttons. There are so many issues to be considered, including:
Still, I'll share some thoughts from my perspective as a developer, instructor, and consultant, as well as my observations of such discussions in various forums. I don't claim this article to be a definitive resource on the subject by any stretch. Instead, I hope it stimulates debate and brings some people to a better understanding of the issues and alternatives.
Much of the discussion refers to Microsoft alternatives. As most CF developers are familiar with them, I can make some references to put the discussion in context. At the end, I'll discuss other choices as well, including different platforms and vendors.
The Big Picture Issues
Regardless of the preferences you may have, the constraints you face, or the choice you ultimately make, you'll have certain big picture issues to consider when selecting a DBMS. Probably the biggest factor for most people is cost.
Cost is a complex variable in the equation of determining DBMS choice. While the up-front price (or lack thereof) to purchase a DBMS may be the most obvious aspect of cost, there are actually multiple facets, including:
- Licensing: Purchasing the product, in-cluding the cost for developer connections and connectivity from Internet-connected clients
- Platform requirements: Implementing the physical infrastructure needed to support the DBMS, which may include server hardware, networking support, and more
- Connectivity: Purchasing drivers to connect ColdFusion to the DBMS, if needed
- Administration: Installing and administering the DBMS, including ongoing care and feeding of backup and recovery, performance tuning, database administration, and much more
- Development: Creating applications against the chosen DBMS, including cost of modifications or migration if moving from one DBMS to another
- Training: Training developers and administrators, and in some cases end users, who may be provided query and reporting tools by the DBMS
As you can see, there's much more to the "cost" of a DBMS alternative than the simple up-front purchase price. This point doesn't necessarily favor one DBMS over another. Some may use it to argue that it points up the "false economy" of choosing a "free" open-source DBMS (if the other costs exceed comparable totals for another DBMS), but it could also be argued that a "high-priced" DBMS may be favorable (because the sum of those costs is less in the long run than a free or inexpensive alternative).
Even then, there's so much more than cost to consider when choosing a DBMS.
By availability I mean the choice of DBMSs already installed and available to you (in-house if you're hosting your Web application yourself, or on your Web service provider if your site is hosted).
Of course, this again presupposes that the "simplest" choice is the best (let's use what we've got), when in fact a careful analysis may show that other alternatives are indeed preferable. If you're not the one who decides what software is installed, though, you're left to choose among what you have.
You may even be specifically precluded from installing an alternative, even a "free" one, because of organizational restrictions on installing new software.
In all these cases, the simple fact is that for some people availability is more important than true cost. As such, the question becomes what choices are available to you? Don't be so quick to presume you know. In an organization of more than a few people, it's possible there are DBMSs installed that you may not know about. Be sure to thoroughly investigate what's available.
Further, as will be discussed later, you may unknowingly have alternatives available at no "cost" (if that's an issue) that have not yet been installed. (Specifically, I'm referring to MSDE, which is available to users of Microsoft Access 2000 and other MS development tools.) You may find you can greatly improve many facets of your DBMS platform by taking advantage of such "undiscovered" alternatives.
If you've developed a database application of any size, you've certainly learned the importance of database performance. By many estimates, 90% of the cause of poor performance in these applications, especially Web applications such as those enabled by ColdFusion, can be attributed to poor database performance.
The question is: What's the cause of the performance problem? Is it the inability of the DBMS to effectively process the database requests issued against it? The poor quality of the program performing the database requests? Bad database design?
There's no denying some DBMS platforms make much better choices for high-performance, demanding applications. If you need to serve hundreds, thousands, or indeed millions of users a day, your database better be up to the task. Fundamental architectural capabilities are built into large-scale DBMSs to make them more suitable for such high-performance applications.
It's also true that sometimes the very infrastructure required of higher-end DBMSs is a key to their greater capability (this is the "platform requirements" cost mentioned previously). As such, even a lower-cost (lower-end) DBMS may run well with the right combination of platform and programming skills.
Regardless, it's often the case that your current DBMS could handle the job better in your current platform. As pointed out in other CFDJ articles (see Resources at the end of this article), it's quite possible to stretch a good performance out of even MS Access. Many regard Access as absolutely unsuitable for production use (arguments are made in some of those articles on that very point), but there are situations in which it has proven to work satisfactorily for some applications.
The real question is: What's "production use"? What are the performance requirements of the application? For low-volume requirements, it's possible that low-end DBMS solutions may indeed be capable of performing well.
A more important question, though, is what are the expected performance requirements over the long term? This is a question of scalability: the ability of the database to handle a growing volume of use.
Even if you find that a given DBMS choice (or database and application design) is adequately meeting your needs, as your site or application grows in popularity (by good marketing, word of mouth, growing acceptance of it as a technology among your user base, etc.), it may not in the long run.
As the volume increases, is your database capable of handling that higher volume?
A more insidious facet of this issue comes into play when moving a project from the development to the deployment stage. Many organizations or individuals have faced public humiliation when an application that "worked" all through the development cycle suddenly crumbled to its knees when exposed to access on a larger scale.
And "larger scale" is also relative. Some applications fall to pieces when someone other than the developer, even just one other user, uses the application. More realistically, challenges often come when several people (or perhaps not until hundreds or thousands of them) use the application at once. There may be issues of concurrency (problems when several users try to access the same data at once) or integrity (the application may not have been designed to handle all the possible inputs that a broad base of users would provide).
Higher-end DBMSs have better built-in support for larger volumes and these matters of concurrency (row-level versus page-level locking strategies, for instance) and integrity ("declarative referential integrity" and "triggers," to name two).
Then there's the matter of the DBMS's ability to store a given volume of data. Even on a site with relatively sparse traffic (and no concurrency problems), the amount of data generated by the number of visitors trickling through may still eventually overwhelm the DBMS's ability to store that data.
Related (in a way) to the matter of increasing traffic/volume is ensuring that intended users only access data they're permitted to see. There are actually two parts to that:
- Authentication: Who are they?
- Authorization or permissions: What are they allowed to do?
The question is: How is the database secured? In too many instances, the database (especially in the case of Access databases) is sitting on the server with no real security. Anyone who can access the file (whether a hacker or another developer on the same server when directories are not adequately protected) could grab the database, open it, and have full access to the data. The sidebar provides information on how to protect your database from public access.
There are two issues involved with fundamental authentication. Is the database protected so it's not open to anyone who can get hold of it? If it's secured with a password, at least it's a step in the right direction.
If you want to have user IDs for each user in a real multiuser application, the question is: Where are the user IDs (and passwords) stored? Are they stored in the DBMS (as in SQL Server's native logins)? In a separate file (as in Access's "workgroup information" file)? Or in the operating system authentication repository (such as SQL Server's ability to leverage NT user IDs).
Further, you want a way to associate users with their respective permissions. That's the authorization component. A more complete DBMS security implementation would allow the assignment of a broad range of rights (from the types of SQL statements a user can execute, to changes that can be made to the database, and lots more). Better still would be a means to associate user IDs with groups, so you can assign permissions to groups and place people into or remove them from groups.
Fortunately, even Access can support this kind of security structure (see the Microsoft documentation for more information). Of course, using this sort of security raises the question of whether and how to integrate an application's login security with the database's login security, which is beyond the scope of this article.
The issue of database administration is often given short shrift by ColdFusion developers. They figure once the database is created and working, the job is done. There are indeed those aforementioned issues of managing the performance, watching for file growth, and ensuring adequate security. Each of these is more effectively managed if the database provides management tools.
On the one hand, an effective interface for implementing such features is valuable. On the other, a means to report on (and act upon) management information is also vital in larger applications. If you're not actively watching the database, you better hope your DBMS (or a tool you can set up) is doing it for you and can notify you (or take corrective action itself). SQL Server 7, for instance, added substantial improvements in the self-diagnosis and repair of a database. It's always been replete with an interface for managing databases (Enterprise Manager), which has only gotten better in Release 7.
Other administration factors include database design and modification, database backup/restore, recovery in the event of a crash, and support for the administration of remote databases, to name a few.
Some Other Issues
We really can't cover every issue in the consideration of DBMS alternatives, though those above are the key factors. Other important ones include:
- Portability: Support of open standards and compliance with SQL standards
- Integrity: Includes support for referential integrity, transactions, and triggers
- Stored procedures: A means to store and easily reuse a library of precompiled, parameter-driven queries
- Replication: Keeps multiple, perhaps geographically segregated, databases synchronized
- XML support: The ability to present query results in XML format, or to accept XML to drive database queries and updates
Typically, these features are found only in higher-end databases, but they're also offered - or can be made to mimic them - at the lower end (see my article, "Stored Procedures in Access? Yes, Indeed!" [CFDJ, Vol.1, issue 5]).
Things also change with new releases. For instance, Access 2000 has dramatically improved its ability to support SQL that's compliant with SQL Server (which conforms closely to the ANSI-92 standard). From the SQL DDL for defining and altering database objects to the terms used to describe different data types, such as Varchar(20), in addition to Text(20), you can now create applications against Access that will more easily migrate to SQL Server (or any other ANSI-92-compliant DBMS).
Some DBMS Alternatives
I'll now briefly look at some of the major DBMS alternatives. This is also a sticky proposition - besides the matter of different vendors, there's the matter of platforms. If you work in a Linux environment, for instance, you don't have quite the same choices as a site hosted on a Microsoft platform. You may even be interested in creating a ColdFusion application against a mainframe DBMS (ODBC drivers from mainframe DBMS vendors may enable this).
Naturally, this article can't evaluate all the available DBMS choices. And frankly, if you have a unique platform (by which I'll infer a non-Microsoft platform or your preference to use open source products), you probably have a clear picture of your options. I'll leave it to you to investigate those choices.
However, I think it's fair to assert that most CF developers who can select commercially available DBMSs will seriously consider Microsoft alternatives because of their ubiquity. Still, these people may not realize all the Microsoft options available, or get the chance to consider them carefully. There are three major Microsoft alternatives: Access, MSDE (which may be entirely new to some readers), and SQL Server.
Microsoft Access, in both Access 97 and Access 2000 flavors, is a natural choice for many people since it's relatively inexpensive (or is included in some tools and suites). It's a low-cost alternative (on all those cost factors mentioned). It has its detractors, in some cases deserved, as discussed; however, it has its place (see Resources section). For more information visit www.microsoft.com/office/access/default.htm.
Microsoft SQL Server
Microsoft SQL Server, available in SQL Server 7 (circa 1999) and SQL Server 2000 versions, is regarded as the "expensive," enterprise alternative. While it does have higher costs on nearly all counts, the price brings substantial benefit, for some of the reasons cited earlier. You can learn more about it at www.microsoft.com/sql/.
CF developers should note that there's a SQL Server Desktop edition, available in some suites, that's great for single-user development situations. And SQL Server 7 allows the product to run in Windows 95 and 98 platforms (though those aren't good choices for a production platform).
There's also a simple-to-use Upsizing Wizard available in Access that supports upgrading a database to SQL Server (or MSDE). See "Using the Microsoft Access 2000 Upsizing Tools with SQL Server 7.0 or MSDE" at http://msdn.microsoft.com/ library/techart/upsize00.htm.
A relative newcomer to the debate over which Microsoft platform to choose is Microsoft Data Engine, or MSDE. It's basically SQL Server without an administration interface (it doesn't include Enterprise Manager). Other than that, it looks, smells, and tastes like SQL Server. Seriously, it's SQL Server, just an unusual packaging of it.
It seems to have been created as a bridge that enables Access developers to easily move to SQL Server. Believe it or not, you don't have to buy it; in fact, you can't. It's bundled inside other tools, and here's a surprise: it's bundled with Access 2000 and Office 2000 Developer edition. Install it from the respective CDs of those products.
There's no single resource I can recommend, though a couple of useful ones are worthwhile reading if you're new to MSDE:
- Microsoft SQL Server: "Microsoft Access 2000 Data Engine Options" - http://msdn.microsoft.com/library/backgrnd/html/acmsdeop.htm
- Microsoft Data Engine (MSDE) for Microsoft Visual Studio 6.0: "An Alternative to Jet for Building Desktop and Shared Solutions" - http://msdn.microsoft.com/library/backgrnd/html/msdeforvs.htm. (Don't let the Visual Studio reference throw you. It's perfectly applicable to CF developers and has no references to Visual Studio as a tool.)
MSDE is tuned for best optimum performance at five concurrent users or less. MSDE can support more than five concurrent users both from a technical and licensing standpoint, but Microsoft strongly suggests that the number of concurrent users be five or less in order to enjoy the performance levels available with MSDE.
I've also heard that MSDE was not licensed for use with Internet applications, but as I haven't seen the specifics of that, I leave it to you to investigate further.
Finally, though Enterprise Manager is not provided with the tool, Access 2000 has been redesigned to support design, administration, and management of MSDE databases via the concept of "projects." See the product for more information on that capability. The bottom line is that if for some reason you haven't yet moved to SQL Server, MSDE provides a great alternative for getting started.
Other DBMS Alternatives
My apologies to those who feel I'm leaving out their preferences. Remember, the premise of this article was to outline the issues and features to consider in your DBMS choices, not to be a comprehensive review.
I've run out of space and, of course, there are a number of other DBMS alternatives to consider, including Oracle, Informix, DB2, and Sybase on the high end and MySQL, SQLDB, and others in the middle to low end. There are still others, and again your mainframe database may provide an ODBC driver to enable access via tools such as CF (as does CCA's Model 204, where I got my start in databases nearly 20 years ago).
I hope this overview of database alternatives and critical issues has helped in your selection of databases as a CF developer.
- Thompson, R. (2000). "DB Conversion and CF - A User's Tale." ColdFusion Developer's Journal. May. http://www.sys-con.com/coldfusion/article.cfm?id=105
- Forta, B. (1999). "Take Your Database Out of Retirement." ColdFusion Developer's Journal. May/June. http://www.sys-con.com/coldfusion/article.cfm?id=27
- Van Horn, B. (2000). "In Defense of MS Access." ColdFusion Developer's Journal. April. http://www.sys-con.com/coldfusion/article.cfm?id=97
- Forta, B. (2000). "Access Denied." ColdFusion Developer's Journal. June. http://www.sys-con.com/coldfusion/article.cfm?id=115
PROTECT YOUR DATABASE FROM PUBLIC ACCESS
You may ask, "Are you saying the database could be grabbed by just anyone?" Yes, if you haven't been careful and have instead placed your database file (for example, an Access .mdb or a SQL Server 7 .mdf file) in the same directory as your Web pages. That's asking for trouble. Most likely nothing will stop a user from downloading the file via the browser using a request such as http://yourservername/yourcodedir/dbfilename.mdb.
Where would they learn your database file name? How often is it the same as the data source name? "Users can't see my code to know my data source name, can they?" No, but what if they get (or can cause) an error? Have you enabled the administrator feature to prevent showing the data source name in error messages? (See my article, "Toward Better Error Handling, Part 1" at www.syscon.com/coldfusion-archives/0210/arehartp42/index.html.) Forewarned is forearmed.
|Sheila Handler 08/20/03 01:14:52 PM EDT|
When I go to the archives, I don't have a problem. It asks for my access code, I enter it and can then select a volume an read its articles.
But when I go to the "search box" and enter something, I receive a search results list. When I click on one of the articles in the search results, I only get to see the first page. When I continue it takes me to the sys-con home page. At not time does it ask for my access code.
- Where Are RIA Technologies Headed in 2008?
- The Next Programming Models, RIAs and Composite Applications
- AJAX World RIA Conference & Expo Kicks Off in New York City
- Constructing an Application with Flash Forms from the Ground Up
- Building a Zip Code Proximity Search with ColdFusion
- Personal Branding Checklist
- CFEclipse: The Developer's IDE, Eclipse For ColdFusion
- Has the Technology Bounceback Begun?
- Adobe Flex 2: Advanced DataGrid
- i-Technology Viewpoint: We Need Not More Frameworks, But Better Programmers
- Web Services Using ColdFusion and Apache CXF
- Passing Parameters to Flex That Works