Maintaining Live Verity Collections in a Clustered Environment

The task seemed monumental: "Come up with a way to keep live Verity data indexed and accessible to multiple Web servers in a clustered environment." The scope of the data was huge - hundreds of thousands of pieces of content - with new additions made 24 hours a day, every day of the year. All known methods to accomplish this task just did not seem to do the job adequately. It was time to get creative.

While analyzing the inner workings of the stock Verity 97 engine that ships with ColdFusion 4.5, we identified the following keys to success:

  1. Verity searches are CPU-intensive, so rather than forcing every server in the cluster to search on a single server, it is best to make each server responsible for its own searches and spread out the workload.
  2. Verity indexing is even more CPU-intensive than searching, so any box given the task of continual indexing would spend most of its time pegged at 100% CPU usage and would be unsuitable for any other task.

The theory started to take shape. We would dedicate one server, called the Utility Server, to continually updating the indexes. As soon as this server finished a round of updates, it would start the next. Because it would spend most of its time at 100% CPU usage, it would be used only for the index updates.

The next piece of the puzzle was to determine the best way to share file access to these Verity collections. With ever-changing data and a CPU operating at 100%, the Utility Server would not be an option. So we decided to dedicate another box to simply hold the most current and searchable version of the indexes. This would be called the Index Server.

The final component was the Web servers. This was easy, as all they had to do was map Verity collections to the Index Server.

The basic setup made sense, but some of the details were still missing:

  1. If the Utility Server was copying updated index files to the Index Server, what would happen to any ongoing searches that were currently using the collections on the Index Server?
  2. How would we clean old index data from the Index Server without impacting ongoing searches?

One idea was to work with multiple collection groups on the Index Server, say collection "A" and collection "B." While collection "A" was being indexed and copied over, collection "B" would handle all of the searches from the Web servers. This sounded like a possible solution but added an extra degree of complexity. Was the extra complication necessary?

What if we could simply use CFFILE to copy new index data to, and delete old index data from, the live set of collections on the Index Server? That would go directly against what we had been told was a best practice (locking all Verity collections during update transactions). Before we gave up on the idea, we contacted Verity representatives to see if they could offer any advice. We were pleasantly surprised when they informed us that the Verity 97 engine was designed to handle ongoing "file transactions" while live searches are being performed. It was time to run some tests and see what we could find out.

For our tests we set up a small clustered environment of our own (see Figure 1). On a separate set of eight machines (two machines per Web server), we opened 20 browser instances against each of the four Web servers and ran test code (see Listing 1) that looped over a basic CFSEARCH on a Verity collection.

While the searches were running, we updated the Verity collection on the Utility Server, copied the collection to the Index Server, and deleted old Verity files from the Index Server. Building the indexes with CFINDEX and copying the files with CFDIRECTORY and CFFILE was straightforward (see Listing 2), but deleting the files required a little more thought. Because Verity collections store many files in pairs, we needed logic that deleted all but the latest pair of files in each Verity subfolder. So we wrote a custom tag called tag_delete (see Listing 3).
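The copy step can be pictured with a short sketch. This is not the authors' Listing 2 (which uses CFDIRECTORY and CFFILE); it is a Python illustration under stated assumptions: the function name `copy_newer_files` and both directory arguments are hypothetical, and "new" is taken to mean "missing at the destination or modified more recently than the destination copy."

```python
import shutil
from pathlib import Path


def copy_newer_files(source_root: Path, dest_root: Path) -> list:
    """Copy files that are missing from, or newer than, their counterpart
    under dest_root, keeping the same relative directory layout.

    Returns the destination paths that were written.
    """
    copied = []
    # sorted() materializes the listing so the order is deterministic.
    for src in sorted(source_root.rglob("*")):
        if not src.is_file():
            continue
        dst = dest_root / src.relative_to(source_root)
        if dst.exists() and dst.stat().st_mtime >= src.stat().st_mtime:
            continue  # destination copy is already current
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also carries over the modification time
        copied.append(dst)
    return copied
```

Because `copy2` preserves modification times, a second run over unchanged files copies nothing, which keeps repeated runs cheap.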

Deleting Old Verity Files
It's important to understand the structure of the Verity files, because that structure explains which files the Delete tag removes and why. To learn the file structure of Verity collections and why the deletion of the old Verity files is important, see www.allaire.com/handlers/index.cfm?id=18429&method=full ("Understanding Verity Collections in ColdFusion").

<!--- call to the tag --->
<cf_tag_Delete source="" fileDateTime="">

The tag is called by passing two parameters. The source parameter is the directory path in which to begin deletion, and fileDateTime is a time stamp used to mark how far in the past the tag can delete (see Listing 3).
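To make the two parameters concrete, here is a rough Python sketch of a cutoff-based cleanup. It is not the authors' tag_delete from Listing 3. The function name `delete_older_than` is hypothetical, and the sketch assumes that every file still needed by the collection was written after the cutoff. Verify that assumption against a real collection before relying on anything like this.

```python
from datetime import datetime
from pathlib import Path


def delete_older_than(source: Path, file_date_time: datetime) -> list:
    """Delete every file under `source` (recursively) whose modification
    time is earlier than `file_date_time`. Directories are left in place.

    Returns the paths that were deleted.
    """
    cutoff = file_date_time.timestamp()
    deleted = []
    # Build the full listing first so deletions do not disturb the walk.
    for path in sorted(source.rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            deleted.append(path)
    return deleted
```

Passing the time at which the current update round started as the cutoff would leave the freshly copied files alone and remove only files from earlier rounds.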

As the code in Listing 1 was executing, the output would begin by returning the old record count. As we updated the indexes, the output would change to reflect the new data:

103
103
103
103
127
127

Over the course of many such tests, we did not experience any collection corruption or errors of any kind.

Interacting with Copies of Index Files
To clarify a major point: when we say we do not lock our Verity index transactions, we mean only the file operations on copies of the index files and the CFSEARCH calls that read them. The actual index creation and modification performed by CFINDEX is a different matter, as that code always runs with single-threaded access on the Utility Server.

Live for Six Months
Our tests were successful. We moved from our test data to an actual clustered environment and started using real data. The final product has been live for six months (at the time of this writing) with no ill effects. We are confident in this theory, but, as always, make sure to run tests in your own environment to guarantee that it will work for you.

The following is a more detailed look at the machines we used and a basic set-up procedure for each machine.

Machines Used

  • One Utility Server
  • One Index Server
  • Four Web servers (multiple)

Utility Server
The Utility Server has a scheduled task set to run every minute. The task uses a lock file (lock_verityCollectionUpdate.txt) so that a new run never overlaps one that is still in progress. When the task begins, it checks for the existence of this lock file. If the file exists, the task ends without attempting an update. If the lock file does not exist, the task runs: it writes the lock_verityCollectionUpdate.txt file, retrieves all data added since the last time it ran, and then updates and optimizes all Verity collections. Next, the task copies all new index files to the Index Server. It then reads over the Index Server's collection files and removes all but the latest files. Finally, the task deletes the lock_verityCollectionUpdate.txt file.
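The lock-file sequence can be sketched as follows. This is a Python illustration, not the authors' Listing 4. Only the lock file name comes from the article. The function `run_update_task` and its `update_steps` callback (standing in for the update, copy, and cleanup work) are hypothetical. The sketch also makes two choices the article does not state: it creates the lock with exclusive-create mode so the check and the write happen in one step, and it removes the lock even when the update fails.

```python
from pathlib import Path
from typing import Callable

LOCK_NAME = "lock_verityCollectionUpdate.txt"


def run_update_task(work_dir: Path, update_steps: Callable[[], None]) -> bool:
    """Run `update_steps` unless another run still holds the lock file.

    Returns True if the update ran, False if it was skipped.
    """
    lock = work_dir / LOCK_NAME
    try:
        # Mode "x" fails if the file already exists, so a run that finds
        # the lock in place skips this round instead of overlapping.
        with open(lock, "x") as handle:
            handle.write("update in progress\n")
    except FileExistsError:
        return False
    try:
        update_steps()  # index update, copy to Index Server, cleanup
    finally:
        lock.unlink()  # release the lock whether or not the update succeeded
    return True
```

One caveat with any lock-file scheme: if the process dies before the lock is removed, every later run will be skipped until someone deletes the file by hand.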

Index Server
The Index Server simply holds the "live" Verity collections.

Web Servers
The Web servers all map their collections to the Index Server. There are no local collections on these machines; thus, with load balancing in place, each Web server supplies the CPU for whatever searches its own visitors require. Because Verity searches can be CPU-intensive, this spreads the search load across the whole cluster.

Setup Procedure

  • Ensure that all ColdFusion tasks are running under accounts that have security privileges to work with all servers. (In Windows NT 4.0 and 2000, go to Services, then check the properties of the CF Services to see what account they're running under.)
  • Make sure to use realistic request timeouts, so that your indexing code will not time out before it's complete.

Utility Server

  • Map a drive to the location in which you'll store the files on the Index Server.
  • Create the collections on this machine. Depending upon how long it takes for your collections to build, it's advisable to build your collections using Netscape 4.X - as IE has been known to time out.
  • Set up a scheduled task that runs task_verityCollection-Update.cfm (see Listing 4) every minute. Enable this task as soon as the Index Server is set up.

Index Server
Copy the collections from the Utility Server to the Index Server. This will prime the pump and needs to be done by hand only once, for initial setup. This step is also important to set up the proper directory structure for the files on the Index Server, so that the file transactions can take place.

Web Servers

  • Map a drive to the Index Server.
  • In the CF Administrator, set up a new mapped Verity collection that points, through the mapped drive, to the collection on the Index Server you just set up. Note: You will need to enter the collection name exactly as it's shown on the Utility Server.

More Stories By Jeremy Petersen

Jeremy Petersen is certified in ColdFusion and has been using it since version 1.5. He is the manager of Web application engineering for TeachStream, Inc., in Salt Lake City, Utah.

More Stories By Dan Kison

Dan Kison has worked with ColdFusion since version 3.0. He created Web sites for the Air Force for four years. Last year, he took a position in Austin, Texas, where he continues to build Web applications.
