Issues with .NET integration in ColdFusion 9.0.1

If you use ColdFusion .NET integration and are planning to upgrade, or have already upgraded, to ColdFusion 9.0.1, read on. Lately we have seen a lot of people facing issues with .NET integration after upgrading to ColdFusion 9.0.1. You will usually see the error “Dot Net side does not seem to be running”. There might also be an InvalidLicenseException in the logs.

To fix this, you need to uninstall the .NET Integration Service and reinstall it using the “ColdFusion 9.0.1 .NET Integration Service Installer” available at http://www.adobe.com/support/coldfusion/downloads.html.


ColdFusion Survey : We are listening

Here is your chance to shape the next version of ColdFusion. The ColdFusion team has put together a survey that gives you an opportunity to provide your feedback on ColdFusion 9. The survey also lets you indicate what you want to see in the next version of ColdFusion. So, don’t wait any longer, just visit the URL below, and provide your feedback.

http://www.surveymonkey.com/s/ColdFusionServer

CFBuilder tip : TailView can kill server performance

If you use ColdFusion Builder and its TailView, this post is for you! TailView is a great feature of ColdFusion Builder that lets you tail the contents of any log file in real time. It is a real productivity boost because it saves you from opening the log file and reloading it over and over.

TailView keeps watching the file for modifications and, whenever there is one, updates the view with the latest content. That causes a race condition between the TailView process, which is trying to read the file, and the server process, which is writing to the same file. Generally, in such race conditions, preference is almost always given to the reader: while any read operation is in progress (and there can be multiple simultaneous reads), the write operation waits.

For some reason this race condition is barely noticeable when the server is running from the console. However, when the server is running as a service, it becomes so severe that the server is almost unresponsive. The jrunsvc.exe process hogs all the CPU cycles and the server is nearly dead.

A simple way to fix this is to increase the interval at which TailView checks the file for changes. Thankfully ColdFusion Builder provides a preference for that.
Go to Window > Preferences > HTML > TailView and change the value of “Read Timeout” from 100 to 1000 or any other suitably higher value. This gives the server threads enough time to write to the log file.


EntityReload and performance

Today I received a small test application for a bug which said that EntityReload performance was really bad. My first reaction was that there is not much we can do, since that is all Hibernate code. On second thought, I decided to have a look at the application. It had an object where the value of one of the fields was generated by the DB. Since the application never sets a value for this property on insert, the property was defined with the attribute “insert” set to false. When the object is saved, the SQL does not include this property and hence its value is generated by the database. So far so good. However, the application needs the generated value immediately after the object is saved. To get it, the application calls EntityReload() on the object and the generated value for the property gets loaded. Perfect? Not at all.

Here is the simplified component definition:

/**
 * @persistent
 */
Component Group
{
    /**
     * @fieldtype Id
     * @generator native
     */
    property id;

    property name;

    /**
     * @dbdefault "'Y'"
     * @insert false
     */
    property char active;

    /**
     * @fieldtype one-to-many
     * @cfc User
     * @fkcolumn userid
     * @lazy true
     * @cascade all
     * @inverse
     */
    property users;

}

Here ‘active’ is the property that is generated by the DB, and the application calls EntityReload so that all the property values, including ‘active’, are fetched and populated in the object.

There are two issues here:

  • EntityReload is called immediately after the insert, which is completely unnecessary in this case. All we need is the generated value from the database, and you can get that by specifying the ‘generated’ attribute on the property. This tells Hibernate that the ‘active’ property is generated by the database at insert time, so Hibernate fetches its value from the DB right after the row is inserted and populates it back in the object. This saves all the overhead of EntityReload.
  • Notice the value for cascade here: it is ‘all’. This means that all operations, including EntityReload, are cascaded to the association. Thus all the associated objects get reloaded irrespective of whether the association is lazy or not. And that is not all. If the associated objects have further associations with ‘cascade’ set on them, the operation is cascaded to those as well. As a result, the whole object graph gets loaded irrespective of the lazy value on the relationship, and that is very expensive.

The lesson here is not to use EntityReload unless it is necessary. And if you do need it, check whether you really need to cascade this operation to the association. Whenever you specify the cascade value for an association, think about the operations that need to be cascaded and specify only the values that apply. The valid values for cascade are save-update, delete, refresh, delete-orphan, merge, all and all-delete-orphan. This is how they map to the operations:

cascade value    Function
save-update      EntitySave
delete           EntityDelete
refresh          EntityReload
merge            EntityMerge

Here is how the CFC looks after fixing these two issues.

/**
 * @persistent
 */
Component Group
{
    /**
     * @fieldtype Id
     * @generator native
     */
    property id;

    property name;

    /**
     * @dbdefault "'Y'"
     * @generated insert
     */
    property char active;

    /**
     * @fieldtype one-to-many
     * @cfc User
     * @fkcolumn userid
     * @lazy true
     * @cascade "save-update,delete,merge"
     * @inverse
     */
    property users;
}
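
With the ‘generated’ attribute in place, saving the entity should be enough to get the database default back, with no EntityReload call at all. The following is only a minimal sketch: it assumes the Group CFC above is mapped in an ORM-enabled application, and the group name is a made-up value for illustration.

```cfml
<cfscript>
    // "Administrators" is a made-up value for illustration.
    group = EntityNew("Group");
    group.setName("Administrators");

    EntitySave(group);
    // Force the INSERT now instead of waiting for the end of the request.
    ORMFlush();

    // With @generated insert, Hibernate is expected to select the generated
    // column right after the INSERT, so no EntityReload(group) is needed here.
    WriteOutput(group.getActive());
</cfscript>
```

Because ‘refresh’ is no longer part of the cascade list, an EntityReload on a Group, if you ever do need one, would no longer be cascaded to its users.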

ColdFusion ORM : Keeping database in sync with your model

One of the things I love most about ColdFusion ORM is that it lets you build your database automatically from the object model. It generates the tables from the CFC mappings that you have provided, automatically defines all the necessary constraints from the relationships and property definitions, and lets you populate the database with some initial data using a script file. It can also keep your database in sync with your application, which means that if you add a new CFC or add new properties to your components, you don’t have to go to the database and add them. CF-ORM (or rather Hibernate) does that automatically for you.

Though this is really nice, ColdFusion does not do it by default. You need to enable it explicitly if you want it to build the database for you. You can do this by setting ‘ormsettings.dbcreate’ to ‘dropcreate’ or ‘update’ in Application.cfc.

ormsettings.dbcreate can have the following values:

dropcreate : With this setting, CF-ORM drops the table if it exists and then creates it. This starts the application with a clean slate, so be careful with it: all your data is deleted and the tables are created afresh whenever the application starts or ORM is initialized. This setting is great during development. With it, you can also specify a SQL script file to populate the tables once they are created, using ormsettings.sqlscript in Application.cfc.

update : With this setting, CF-ORM creates the table if it does not exist or updates it if it does. This is very convenient because you don’t need to change the database tables yourself whenever the application changes; Hibernate does that for you. However, that is not entirely true all the time (I can see a lot of people complaining and logging bugs about it :-) ). Here is what Hibernate does when you have this setting on:

  • Creates a table if a new CFC mapping is found in the application or if the table name of a CFC is changed. If the table name is changed for a CFC, it does not rename the table in the DB. It simply creates a new table and leaves the old one as it is.
  • For an existing table, adds a column if a new property is added or if the column name for a property is changed. If the column name is changed for a property, it does not rename the column but simply creates another column with the new name.
  • For an existing column, adds a constraint if a new foreign key constraint is required for a relationship. However, none of these are modified in the table: datatype, length, not-null, unique, precision, scale, index, uniqueKey.
  • Changes the id generation strategy if the generator is changed for the id column.

none : This is the default setting, where tables are not created or modified by CF-ORM. It uses the existing tables in the database. You should switch to this setting once the application goes into production.
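
To make the settings concrete, here is a minimal Application.cfc sketch for a development setup. The application name, datasource name and script file name are placeholders I made up; only ormenabled, ormsettings.dbcreate and ormsettings.sqlscript are the settings discussed above.

```cfml
// Application.cfc (development setup)
component {
    this.name = "ormSyncDemo";       // hypothetical application name
    this.datasource = "demoDSN";     // hypothetical datasource name
    this.ormenabled = true;

    this.ormsettings = {
        // "dropcreate" wipes and rebuilds the tables every time ORM is initialized.
        // Use "update" to evolve the tables in place, or "none" in production.
        dbcreate = "dropcreate",
        // Script used to populate the freshly created tables (hypothetical file name).
        sqlscript = "initdata.sql"
    };
}
```

Switching between development and production is then a matter of changing the dbcreate value.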

Advanced ColdFusion ORM @MAX

Thanks to everyone who attended my session at MAX on ‘Advanced ColdFusion ORM’. As promised, here are the presentation slides and demo code snippets. I have also uploaded the slides to SlideSix.

ColdFusion ORM : Using DB Views instead of Table

One of the questions that frequently comes up about ORM is: can I use database views instead of tables? And the answer is “of course”! From the ORM perspective, there is no difference between a database view and a table. Any query that ORM generates works on a view in the same way as it does on a table. So while defining the persistence metadata for your CFC, just use the view name instead of the table name and you should be all set.

Of course, views are meant for querying and not for insert/update/delete. Hence methods like EntitySave and EntityDelete, which try to insert, update or delete through the view, will not change it and will throw an error when the ORM session is flushed.
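
As a rough sketch, a CFC mapped to a view looks just like one mapped to a table. The view name and property names below are hypothetical. Marking the component readonly is my own suggestion rather than something required for views; it simply makes the intent explicit.

```cfml
<!--- ActiveEmployee.cfc: maps to a database view instead of a table --->
<!--- "vw_active_employees" and the property names are hypothetical --->
<cfcomponent persistent="true" table="vw_active_employees" readonly="true">
    <cfproperty name="id" fieldtype="id">
    <cfproperty name="name">
    <cfproperty name="deptName">
</cfcomponent>
```

Loading then works as usual, for example with EntityLoad("ActiveEmployee").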


ColdFusion ORM : What is “N+1 Select problem”

In my last two posts, I mentioned that immediate fetching or lazy fetching can cause the ‘N+1 select problem’. If you are wondering what exactly that is, read on.

Consider the example of Department and Employees. When you call EntityLoad("Department"), the following SQL statements are executed:

SELECT * FROM department;
 
SELECT * FROM employees WHERE deptId = ?

The first query is executed once (1) and the second query is executed as many times as there are departments (N). Thus the above EntityLoad call results in ‘N+1’ SQL executions and can become a performance bottleneck. Because of the N+1 SQL statements, this is known as the ‘N+1 select problem’. It will almost always happen when the fetching is “Immediate” (using fetch="select"), and it can happen with lazy loading.

With immediate fetching it is obvious why this happens. With lazy="true", it happens when the association is accessed immediately on each of the owning objects (each department in this case).

If you think this could be happening in your application, use either of these two options.

  1. Set lazy="false" and use fetch="join" so that the departments and their employees are loaded together. (Eager fetch)
  2. Keep lazy="true" but load the departments using HQL with a join. So instead of using EntityLoad("Department"), use
    ORMExecuteQuery("from Department dept left join fetch dept.employees")
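
To see where the N extra selects come from with lazy="true", consider a loop like the one below. It is only an illustrative sketch: it assumes Department has a ‘name’ property and a lazy one-to-many ‘employees’ relationship, as in the example above.

```cfml
<cfscript>
    // 1 select: loads all the departments, but not their employees.
    departments = EntityLoad("Department");

    for (i = 1; i <= ArrayLen(departments); i++) {
        // Touching the lazy collection triggers one more select per department,
        // so N departments end up costing N additional queries.
        employees = departments[i].getEmployees();
        WriteOutput(departments[i].getName() & ": " & ArrayLen(employees) & "<br>");
    }
</cfscript>
```

Loading the departments with the HQL join fetch from option 2 and keeping the same loop avoids the per-department selects.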
    


ColdFusion ORM : Performance tuning – Lazy loading

In the previous post, we talked about the different fetching strategies and when to use them. In this post, we will go a little deeper into lazy loading, which is the most popular and commonly used fetching strategy.

As we said in the earlier post, with this strategy, when you load an entity, ColdFusion ORM loads the entity’s data but not its relations and mapped collections. They are loaded only when you ask for them, i.e. by calling the getter method and accessing the result. Thus the relations and collection mappings are lazily loaded. To give an example, when a Department is loaded, its employees are not loaded; they are loaded only when getEmployees() is called.

There are three types of lazy loading that ColdFusion ORM provides for relationships.

  • lazy : This is the default lazy loading that applies to collection mappings and to one-to-many and many-to-many relationships. In this case, when you call the accessor for the collection/relation, the collection is fully loaded. Thus when you call EntityLoad() for a particular department, its employees are not loaded at that time. When you call dept.getEmployees(), all the employee objects belonging to the department get loaded. This is achieved by setting lazy="true" on the relationship property definition in the CFC.
    Example : In Department.cfc
    <cfproperty name="employees" fieldtype="one-to-many" cfc="employee" fkcolumn="deptId" lazy="true" >
  • Extra lazy : This applies to one-to-many and many-to-many relationships. It is similar to lazy loading but goes one step further and does not load the associated objects for calls like size() or contains(Object). This means that calls like ArrayLen(dept.getEmployees()), ArrayContains(dept.getEmployees(), anEmployee) or ArrayFind(dept.getEmployees(), anEmployee) do not load any employee object. They just execute the SQL for finding the size or for checking whether the employee belongs to the department. The employee objects are loaded only when an employee is accessed from this collection. This is very useful if the collection is huge. It is achieved by setting lazy="extra" on the relationship property definition in the CFC.
    Example : In Department.cfc
    <cfproperty name="employees" fieldtype="one-to-many" cfc="employee" fkcolumn="deptId" lazy="extra" >
  • proxy : This applies to one-to-one and many-to-one relationships. When an object is loaded, the associated object is not loaded from the database. ColdFusion only creates a proxy object for the related object, and when any method is invoked on the related object, the data for the proxy object is loaded from the database and populated in it. To give an example, if the Employee-Department relation is lazy, then when an Employee is loaded, the department is not loaded, and when you call employee.getDepartment(), you only get a proxy object. When you call any method on the proxy object, a query is executed on the DB to load the department’s data. This is achieved by setting lazy="true" on the relationship property definition in the CFC.
    Example : In Employee.cfc
    <cfproperty name="department" fieldtype="many-to-one" cfc="department" fkcolumn="deptId" lazy="true" >

    An important thing to note here is – An entity is loaded only once in the request (in Hibernate session to be more specific) and there will always be only one copy of it in the request. So for Employee-Department relationship, which is lazy, if the department is already loaded, calling employee.getDepartment() will not create a proxy object and will return the loaded department object.

Lazy loading can be disabled by setting lazy="false" on the relationship property definition in the CFC.

Choosing an appropriate lazy loading option is very important for the performance of your application. Extra lazy means more trips to the database (and each trip to the DB is expensive) but less data in memory, whereas no lazy loading means a huge object graph in memory. So you need to strike a balance depending on what the application needs.

While lazy loading is very useful in reducing the amount of data loaded from the database, and thus the number of objects in memory, overdoing it can have the opposite effect. Let’s say that in your application, whenever you load an object, you always access its associated data. Lazy loading will then cause the ‘N+1 select problem’ again. This means that a huge number of SQL statements are executed, which can be avoided by using eager fetch or HQL with a join (see the query example under “Eager fetching” in the post on fetching strategy).

There are some other important things to remember/note while using lazy loading

  1. A lazy collection (including one-to-many and many-to-many) is not loaded immediately when you call the getter for the relationship. The SQL is executed only when you access anything on the result of the getter (get its size, iterate over it, etc.). lazy="extra" is a little lazier still (see “Extra lazy” above).
  2. The has*** methods generated on the entity for relationships are optimized so that they do not load the associated objects.
  3. You can quite easily hit the famous “LazyInitializationException”. Mark Mandel explains this nicely in his post on “Explaining Hibernate Sessions”. Ray Camden also talks about his experience with it here. So you need to be careful when using detached objects.
  4. If you are retrieving ORM entities in Flex, then even if you set lazy="false", ColdFusion will not send the whole object graph. If you need the relation data to be serialized to Flex, you need to set remotingfetch="true" on the relationship property. More on this later.
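
The first two notes can be put together in a small sketch. It assumes the Department CFC from the examples above, with a lazy ‘employees’ relationship and no singularname, so that the generated check method is hasEmployees(). The id 1001 is just a sample value.

```cfml
<cfscript>
    // Loads only the department row; the employees collection is untouched.
    dept = EntityLoad("Department", 1001, true);

    // Note 2: the has* check is not expected to load the employee objects.
    if (dept.hasEmployees()) {
        // Note 1: calling the getter alone does not run the SQL yet.
        employees = dept.getEmployees();

        // Accessing the result does. With lazy="true" this loads the collection;
        // with lazy="extra" it should only run a count query.
        WriteOutput("Employees: " & ArrayLen(employees));
    }
</cfscript>
```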


ColdFusion ORM : Performance tuning – Fetching Strategy

In any application that needs database interaction, DB operations are the key to application performance. Most performance problems arise because the SQL being executed is not optimized, too many queries are executed, a query loads too much data, columns are not properly indexed, or nothing is cached and the application always hits the DB. In this series, I will try to cover the different strategies you need for a well-performing ORM-based application.

As we all know, the fundamental strategy for tuning application performance is to optimize the SQL queries. As a general practice, you avoid retrieving objects through many round trips to the database and instead fetch all the data required for a particular operation with a single SQL query, using joins to retrieve related entities. You also fetch only the data that is required, so as to reduce the load on the DB. However, this becomes an issue when you use ORM, because you no longer write the SQL queries yourself; they are generated and executed by the underlying ORM engine.

Thankfully, ORM engines like Hibernate provide various hooks to optimize the SQL as well as the number of trips made to the database. The most important of these hooks is the “fetching strategy”, which defines what data is fetched, when and how.

There are four fetching strategies for loading an object and its associations. (We will use the Department-Employee relationship for all the explanations.)

  1. Immediate fetching : In this strategy, the associated object is fetched immediately after the owning entity is fetched, either from the database using a separate SQL query or from the secondary cache. This is usually not an efficient strategy unless the associated object is cached in the secondary cache or separate queries are more efficient than a join query. You can define this strategy by setting lazy="false" and fetch="select" on the relationship property definition in the CFC.
    example :
    <cfproperty name="employees" fieldtype="one-to-many" cfc="employee" fkcolumn="deptId" lazy="false" fetch="select">

    With this strategy, on loading the department object, its employee objects are loaded immediately using a separate SQL query. As a result, this strategy is extremely vulnerable to the ‘N+1 select problem’.

    pros : The association is loaded immediately and hence the associated object can be accessed even after the ORM session is closed.
    cons : A large number of SQL statements get executed, causing more traffic between the application and the database. The association is loaded even if it might not be needed.

    When to use : When the association is almost always read after loading the object and executing a separate SQL statement is more efficient than executing a join query.

  2. Lazy fetching : In this strategy, the associated object or collection is fetched lazily, i.e. only when required. For example, when you load a Department object, its associated employees are not loaded. They are loaded only when you access them. This results in a new request to the database, but it controls how much data is loaded and when. It helps reduce the database load because you fetch only the data that is required, which makes it a good default strategy. We will talk about this in much more detail in the next post. For the time being, let’s just say that this is the most commonly used and the default strategy for obvious reasons. You can define this strategy by setting lazy="true" or lazy="extra".
    example :

    <cfproperty name="employees" fieldtype="one-to-many" cfc="employee" fkcolumn="deptId" lazy="true" >

    pros : Only the minimum required data is loaded. This avoids loading the entire object graph in memory and hence the performance is generally good.
    cons : If the association is always accessed after loading, this results in extra SQL execution. If the loaded object is accessed in another ORM session (i.e. it has become detached), extra care must be taken to avoid errors like ‘LazyInitializationException’ or ‘NonUniqueObjectException’.

    When to use : When the association is not read immediately after loading the object. This is the most commonly used and default strategy.

  3. Eager fetching : In this strategy, the associated object or collection is fetched together with the owning entity using a single SQL join query. Thus, this strategy reduces the number of trips to the database and is a good optimization when you always access the associated object immediately after loading the owning entity. You can define this strategy by setting fetch="join" on the relationship property definition in the CFC.
    example :
    <cfproperty name="employees" fieldtype="one-to-many" cfc="employee" fkcolumn="deptId" fetch="join">

    With this strategy, on loading the department object, both department and employees data will be fetched from the database using a single join query.

    Even if eager fetching is not defined in the CFC metadata, it can be done at runtime using ORMExecuteQuery. This is very powerful in scenarios where you want the association to be lazily loaded in most cases but need it loaded immediately in a few. In those cases, use a join in the HQL and execute it using ORMExecuteQuery.

    Example :

    ORMExecuteQuery("from Department dept left join fetch dept.employees")
    ORMExecuteQuery("from Department dept left join fetch dept.employees where dept.id=1001")

    pros : The association is loaded immediately and hence the associated object can be accessed even after the ORM session is closed. The association is loaded using a single join query which usually is more efficient than executing multiple queries.
    cons : The association is loaded even if it might not be needed. Since the query used is a join query, the resultset returned by the DB typically contains a lot of repetitive data. If used for more than one collection of an entity, this creates a Cartesian product of the collections’ data and thus a huge resultset.

    When to use : When the association is almost always read after loading the object. More suitable for many-to-one and one-to-one association or single collection where the associated objects can be loaded using join query without much overhead.

  4. Batch fetching : This strategy tells Hibernate to optimize the second SQL select in immediate fetching or lazy fetching so that a batch of objects or collections is loaded in a single query. This lets you load a batch of proxied objects or uninitialized collections that are referenced in the current request. It is a blind-guess optimization technique but very useful in nested tree loading.
    The concept of batch fetching is slightly confusing (at least I got confused when I first read about it), so pay careful attention here.
    This is specified using the batch size attribute on the CFC or on the relationship property. (Hibernate calls it “batch-size”; the ColdFusion documentation lists the attribute as “batchsize”.) There are two ways you can tune batch fetching: on the CFC and on the collection.

    • Batch fetching at CFC level : This allows batch fetching of proxied objects and hence applies to one-to-one and many-to-one relationships. To give an example, consider the Employee-Department example where 25 employee instances are loaded in the request (ORM session). Each employee has a reference to its department and the relationship is lazy. Therefore the employee objects contain proxy objects for Department. If you now iterate through all the employees and call getDepartment() on each, by default 25 SELECT statements are executed to retrieve the proxied departments, one for each Department proxy object. This can be batched by specifying the batch size attribute on the Department CFC like

      <cfcomponent table="Department" batchsize="10" …>

      When you call getDepartment() on the first employee object, Hibernate sees that Department should be batch fetched, and hence fetches 10 of the department objects that are proxied in the current request.
      So for 25 employee objects, Hibernate executes at most three queries, in batches of 10, 10 and 5.
      Note that the batch size at component level does not mean that whenever you load a Department object, 10 department objects get loaded in the session. It just means that if there are proxied instances of Department in the session, 10 of those proxied objects get loaded together.

    • Batch fetching at collections : This allows batch fetching of value collections and of one-to-many or many-to-many relationships that are uninitialized. To give an example, consider the Department-Employee one-to-many relationship where 25 departments are loaded and each department has a lazy collection of employees. If you now iterate through the departments and call getEmployees() on each, by default 25 SELECT statements are executed, one for each Department, to load its employee objects. This can be optimized by enabling batch fetching, which is done by specifying the batch size attribute on the relationship property like

      In Department.cfc :

      <cfproperty name="employees" fieldtype="one-to-many" cfc="employee" fkcolumn="deptId" lazy="true" batchsize="10">

      One important thing to understand is that the batch size here does not mean that 10 employees are loaded at a time for a department. It actually means that 10 employee collections (i.e. the employees of 10 department objects) are loaded together.
      When you call getEmployees() on the first department, the employees of 9 other departments are fetched along with the ones that were asked for.
      When you call getEmployees() on the first department, employees for 9 other departments will also be fetched along with the one that was asked for.

    The value for the batch size attribute should be chosen based on the expected number of proxied objects or uninitialized collections in the session.
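
The CFC-level case can be sketched as a loop over employees. This is an illustration under assumptions, not a measured result: it assumes every employee has a department, that the 25 employees point to 25 different departments that are still proxies, and that Department has a ‘name’ property.

```cfml
<cfscript>
    // Loads the employees; each one holds a proxy for its department.
    employees = EntityLoad("Employee");

    for (i = 1; i <= ArrayLen(employees); i++) {
        dept = employees[i].getDepartment();
        // Invoking a method on the proxy initializes it. With a batch size of 10
        // on Department, up to 10 pending proxies are loaded by the same query.
        WriteOutput(dept.getName() & "<br>");
    }

    // Expected number of department queries under the assumptions above.
    proxies = 25;
    batchSize = 10;
    WriteOutput("Queries: " & Ceiling(proxies / batchSize));
</cfscript>
```

The final line is just the arithmetic from the explanation above: 25 proxies in batches of 10 need three queries (10, 10 and 5) instead of 25.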
