Macromedia's Data File Access API Architecture Unleashed

Like its predecessors, Macromedia's most recent installment of the DevNet Resource Kit (DRK 3) is stocked with many excellent utilities for Flash developers. Unlike previous releases, DRK 3 also aims to make the lives of ColdFusion developers easier by including many applications and development tools for use in CFMX applications. One of these is an Application Programming Interface I developed called the Data File Access API (DFA API).

When I set out to design and develop the DFA API, the goal was an API that lets developers store data in text files, as either XML or CSV, and access and manipulate that data as easily as if it were stored in a database. With that core functionality in mind, the primary objective was to devise an architecture that would perform as well as possible.

In addition, two other primary objectives were to make the API very easy to use and to make it flexible enough for developers to extend or implement in any way they might need. Ideally, as developers become more familiar with the API, they will be inspired to use it as the backbone of more creative solutions to meet their applications' needs. Let's examine how the features of ColdFusion MX were used to meet these objectives.

The first thing I needed to consider was how to architect the API not only to define and store data, but also to make this data available for very fast filtering and retrieval. What I decided was to create a ColdFusion Component that houses all of the methods for working with the data and that stores all of the data in memory (as needed) in a proprietary XML DOM format, whether the data came from CSV or XML text. I refer to these as "data tables" and think of them as being analogous to database tables cached in memory.

The API needs to be flexible enough to allow data to be retrieved and filtered using either XPath or SQL, so component methods exist to determine whether the query passed is XPath or SQL. XPath is applied directly to the XML DOM in memory. To use SQL, the XML DOM is first converted to a ColdFusion query object, which is then queried using Query of Queries. Any call to the API to extract data can retrieve that data as XML or as a ColdFusion query.
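The dispatch described above can be sketched roughly as follows. This is an illustration in Python rather than the API's actual CFML: the table layout, the function names, and the "starts with a SQL verb" test are all assumptions, and an in-memory SQLite table stands in for ColdFusion's Query of Queries.

```python
import re
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical in-memory data table; the real DFA API format is not shown in the article.
TABLE_XML = """
<table name="users">
  <row><id>1</id><name>Ann</name><role>admin</role></row>
  <row><id>2</id><name>Bob</name><role>guest</role></row>
</table>
"""

# Assumed heuristic: anything that starts with a SQL verb is SQL, the rest is XPath.
SQL_VERB = re.compile(r"^\s*(select|insert|update|delete)\b", re.IGNORECASE)


def detect_query_language(text):
    """Return "sql" or "xpath" for the given query text."""
    return "sql" if SQL_VERB.match(text) else "xpath"


def rows_as_dicts(rows):
    """Flatten <row> elements into dictionaries (a stand-in for a query object)."""
    return [{cell.tag: cell.text for cell in row} for row in rows]


def run_sql(table, sql):
    """Copy the XML rows into a temporary SQL table, then run the statement."""
    records = rows_as_dicts(table.findall("row"))
    columns = list(records[0])  # sketch only: assumes at least one row
    name = table.get("name")
    conn = sqlite3.connect(":memory:")
    try:
        column_list = ", ".join('"%s"' % c for c in columns)
        conn.execute('CREATE TABLE "{}" ({})'.format(name, column_list))
        placeholders = ", ".join("?" * len(columns))
        conn.executemany(
            'INSERT INTO "{}" VALUES ({})'.format(name, placeholders),
            [tuple(r[c] for c in columns) for r in records],
        )
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


def query(table, text):
    """Apply XPath directly to the XML, or route SQL through the converted table."""
    if detect_query_language(text) == "sql":
        return run_sql(table, text)
    return rows_as_dicts(table.findall(text))


users = ET.fromstring(TABLE_XML.strip())
print(query(users, "row[role='admin']"))
print(query(users, "SELECT name FROM users WHERE role = 'guest'"))
```

The point of the sketch is the asymmetry: the XPath path touches the XML directly, while the SQL path pays for a conversion step before the query runs.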

Another challenge in developing the API was how to physically define and store the data used in applications. I split the data table definition across three XML files: one that holds table definitions (column names, data types, default values, etc.); one that maps the definitions to the actual storage locations (as a relative or absolute path, or a URL) so that the API knows where to find the data; and one that contains the data itself. I chose this architecture so that developers can easily write validation routines (the data type and required properties of data table columns aren't actually used by the API), share data table definitions between applications, etc.
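To make the three-file split concrete, here is a minimal Python sketch of the definition and mapping files and how they might be consulted. The element names, attribute names, and the file path are invented for illustration; the article does not show the DFA API's real schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical definitions file: column names, types, required flags, defaults.
DEFINITIONS_XML = """<definitions>
  <table name="users">
    <column name="id" type="integer" required="true"/>
    <column name="name" type="string" required="true" default=""/>
    <column name="role" type="string" required="false" default="guest"/>
  </table>
</definitions>"""

# Hypothetical mapping file: where each table's data file lives.
MAPPINGS_XML = """<mappings>
  <map table="users" location="data/users.xml"/>
</mappings>"""


def load_definition(table_name):
    """Return the <column> elements that define the named table."""
    root = ET.fromstring(DEFINITIONS_XML)
    for table in root.findall("table"):
        if table.get("name") == table_name:
            return table.findall("column")
    raise KeyError(table_name)


def locate(table_name):
    """Return the storage location mapped to the named table."""
    root = ET.fromstring(MAPPINGS_XML)
    for mapping in root.findall("map"):
        if mapping.get("table") == table_name:
            return mapping.get("location")
    raise KeyError(table_name)


def new_row(table_name, values):
    """Build a row from supplied values, falling back to column defaults."""
    row = {}
    for column in load_definition(table_name):
        name = column.get("name")
        row[name] = values.get(name, column.get("default"))
    return row


print(locate("users"))
print(new_row("users", {"id": "3", "name": "Cy"}))
```

Keeping the definition separate from the mapping is what would let two applications share one definition while pointing at different data files.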

In addition to the ability to retrieve data mentioned above, methods were also written to parse SQL statements that perform INSERT, UPDATE, or DELETE operations on data tables. Unlike when a SELECT statement is passed to the API, converting the XML data table to a ColdFusion query object will do no good for INSERT, UPDATE, and DELETE commands, as ColdFusion does not support these SQL statements in a Query of Queries.

Instead, the SQL is broken into its various components and then executed against the appropriate XML nodes directly. In the case of DELETE commands, working with the data directly as XML proved more efficient than converting the XML to a query object, retrieving the data not being deleted, and converting the new query object back to XML.
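As a rough illustration of executing a DELETE against XML nodes directly, the Python sketch below handles only the narrow form DELETE FROM table WHERE column = 'value'. The regular expression, the function name, and the table layout are assumptions; the real API's SQL parser is not shown in the article and presumably covers far more.

```python
import re
import xml.etree.ElementTree as ET

# Assumed, deliberately narrow grammar: one table, one equality condition.
DELETE_PATTERN = re.compile(
    r"^\s*delete\s+from\s+(?P<table>\w+)"
    r"\s+where\s+(?P<column>\w+)\s*=\s*'(?P<value>[^']*)'\s*$",
    re.IGNORECASE,
)


def delete_rows(table, sql):
    """Remove matching <row> nodes in place and return how many were removed."""
    match = DELETE_PATTERN.match(sql)
    if match is None:
        raise ValueError("unsupported DELETE statement: %r" % sql)
    if match["table"] != table.get("name"):
        raise ValueError("statement targets a different table")
    # Collect first, then remove, so the tree is not mutated while iterating.
    doomed = [
        row for row in table.findall("row")
        if row.findtext(match["column"]) == match["value"]
    ]
    for row in doomed:
        table.remove(row)
    return len(doomed)


users = ET.fromstring(
    '<table name="users">'
    "<row><name>Ann</name><role>admin</role></row>"
    "<row><name>Bob</name><role>guest</role></row>"
    "<row><name>Cy</name><role>guest</role></row>"
    "</table>"
)
print(delete_rows(users, "DELETE FROM users WHERE role = 'guest'"))
print([row.findtext("name") for row in users.findall("row")])
```

No query object is built at any point, which is the saving described above: the rows to delete are located and dropped in the XML itself.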

In addition to working with XML, the API needed to support CSV, so a method was added to parse CSV text and convert it to an XML table in memory. The first row of values in the CSV content is used to create a data table definition, and all other rows populate that definition. Other methods were added to validate the various entities being used, to make debugging easier, etc. Two other major concerns while developing the API were how to make the API easy for all developers to use, and how to handle concurrency issues.
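The CSV conversion described above can be sketched in a few lines of Python. The function name and the table/row element names are assumptions, and the sketch assumes every header value is a valid XML element name.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_table(name, csv_text):
    """Turn CSV text into (column names, XML table), header row first."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)  # the first row defines the columns
    table = ET.Element("table", name=name)
    for record in reader:
        if not record:  # skip blank lines
            continue
        row = ET.SubElement(table, "row")
        for column, value in zip(header, record):
            ET.SubElement(row, column).text = value
    return header, table


columns, users = csv_to_table("users", 'id,name\n1,Ann\n2,"Bob, Jr."\n')
print(columns)
print(ET.tostring(users, encoding="unicode"))
```

Using a real CSV parser rather than splitting on commas matters here: the quoted value in the second record contains a comma and must stay in one cell.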

To deal with concurrency issues for both the data tables in memory and the physical files, all data retrieval is performed within "read-only" named locks. When rows of data are inserted into, updated in, or deleted from a data table, the data table in memory is first manipulated within an "exclusive" named lock. Afterwards, the entire data table is written to file as an XML string, also within an "exclusive" named lock. This approach minimizes locking on the server and spares developers from having to lock API access in their applications, because the API handles all of the locking itself, local to the code blocks that require locks.
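The read-only versus exclusive pattern can be illustrated with a small reader-writer lock. This Python sketch is only an analogy for ColdFusion's named locks: the class, the lock names, and the list standing in for a data table are all hypothetical, and the write-to-file step is omitted.

```python
import threading
from contextlib import contextmanager


class NamedLock:
    """Minimal reader-writer lock: many readers at once, or a single writer."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writing = False

    @contextmanager
    def read_only(self):
        with self._cond:
            while self._writing:
                self._cond.wait()
            self._readers += 1
        try:
            yield
        finally:
            with self._cond:
                self._readers -= 1
                self._cond.notify_all()

    @contextmanager
    def exclusive(self):
        with self._cond:
            while self._writing or self._readers:
                self._cond.wait()
            self._writing = True
        try:
            yield
        finally:
            with self._cond:
                self._writing = False
                self._cond.notify_all()


_locks = {}
_registry_guard = threading.Lock()


def named_lock(name):
    """Return the lock registered under this name, creating it on first use."""
    with _registry_guard:
        if name not in _locks:
            _locks[name] = NamedLock()
        return _locks[name]


rows = []  # stand-in for a data table held in memory


def insert_row(row):
    # Writers take the exclusive lock only around the mutation itself.
    with named_lock("users").exclusive():
        rows.append(row)


def count_rows():
    # Readers share the lock, so retrievals do not block one another.
    with named_lock("users").read_only():
        return len(rows)


workers = [
    threading.Thread(target=lambda: [insert_row({"n": i}) for i in range(100)])
    for _ in range(4)
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
print(count_rows())
```

Because the locking lives inside insert_row and count_rows, calling code never has to lock anything itself, which mirrors the design choice described above. This simple version can starve writers under a constant stream of readers; a production lock would need a fairness policy.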

To make the API easier for developers to use, I wrote a custom tag "wrapper" for the API CFC. The idea behind the custom tag was to give users the ability to use syntax similar to what they are already used to with the <cfquery> tag in order to query the DFA API data for their application. Like <cfquery>, when retrieving data a "name" attribute is passed to assign a name to the result set returned (may be in query or XML format). A "returntype" attribute is used to specify whether to return the results of a query as XML or a ColdFusion query object.

Also similar to <cfquery>, the SQL to SELECT data is passed as the contents of the tag. An XPath attribute is used to select data using an XPath query. Rather than passing a "datasource" name, the tag accepts a "datatable" attribute in order to determine what data table to apply XPath queries to (SQL queries simply name the data table in the SQL). There is also an "XSLT" attribute for performing XSL Transformations on the data (the attribute value is either the location or contents of an XSL stylesheet) and a "CSV" attribute for passing the location of a CSV file (or CSV content) to be parsed into a data table.

Within the tag body a SQL INSERT, UPDATE, SELECT, or DELETE command may be passed, as well as a DROP command to remove a data table from memory, and a SAVE command for committing a data table already in memory to file. The tag itself creates a DFA API instance in the application scope if one doesn't already exist (in start mode), and performs all of its "work" in end mode. The only thing required to use the tag is the existence of three request scope variables that store the locations of the DFA API component, the location of the XML file that defines data table "columns," and the location of the XML file that "maps" these definitions to physical files.

Though the API was never intended for use in large enterprise-level applications, early tests have yielded surprisingly good performance results. The API performs very well; exactly how much data or how many concurrent users is too many is something you'll have to test for yourself. I wouldn't be surprised to find that even large-scale solutions can be delivered, driven by the API rather than by a traditional database. Even if you decide to stick with the more traditional methods of data storage, I highly recommend looking to the DFA API as an example of how to architect an API and of how ColdFusion Components, Custom Tags, Query of Queries, and XML parsing functionality in ColdFusion MX can be combined to achieve amazing results in your applications.

By Simon Horwith

Simon Horwith is the CIO at AboutWeb, LLC, a Washington, DC-based company specializing in staff augmentation, consulting, and training. Simon is a Macromedia Certified Master Instructor and a member of Team Macromedia. He has been using ColdFusion since version 1.5 and specializes in ColdFusion application architecture, including architecting applications that integrate with Java, Flash, Flex, and a myriad of other technologies. In addition to presenting at CFUGs and conferences around the world, he has been a contributing author of several books and technical papers.

