In any application that interacts with a database, DB operations are the key to application performance. Most performance problems arise because the SQL being executed is not optimized, too many queries are executed, a query loads too much data, columns are not properly indexed, or nothing is cached and the application always hits the DB. In this series, I will try to cover the strategies you need for a well-performing ORM-based application.
As we all know, the fundamental strategy for tuning application performance is to optimize the SQL queries. As a general practice, you avoid retrieving objects through many round trips to the database; instead you fetch all the data required for a particular operation in a single SQL query, using joins to retrieve related entities. You also fetch only the data that is required, so as to reduce the load on the DB. However, this becomes an issue when you use ORM, because you no longer write the SQL queries yourself: they are generated and executed by the underlying ORM engine.
Thankfully, an ORM engine like Hibernate provides various hooks to optimize the SQL as well as the number of trips made to the database. The most important of these hooks is the “fetching strategy”, which defines what data will be fetched, when, and how.
There are four fetching strategies for loading an object and its associations. (We will use the Department-Employee relationship for all the explanations.)
- Immediate fetching : In this strategy, the associated object is fetched immediately after the owning entity is fetched, either from the database using a separate SQL query or from the secondary cache. This is usually not an efficient strategy unless the associated object is cached in the secondary cache or separate queries are more efficient than a join query. You can define this strategy by setting lazy="false" and fetch="select" on the relationship property definition in the CFC.
example :
<cfproperty name="employees" fieldtype="one-to-many" cfc="employee" fkcolumn="deptId" lazy="false" fetch="select">
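With this strategy, on loading the department object, its employees will be loaded immediately using a separate SQL query. As a result, this strategy is extremely vulnerable to the ‘N+1 select problem’: loading N departments costs one query for the departments plus N more queries, one per department, for their employees.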
pros : The association is loaded immediately and hence the associated object can be accessed even after the ORM session is closed.
cons : A large number of SQL statements get executed, causing higher traffic between the application and the database. The association is loaded even if it might not be needed.
When to use : When the association is almost always read after loading the object and executing separate SQL is more efficient than executing a join query.
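The N+1 pattern described above can be sketched as follows. This is only an illustration under assumptions: ORM is enabled for the application, Department and Employee are persistent CFCs mapped as in the example, and the `name` property (hence `getName()`) is hypothetical.

```cfml
<cfscript>
// 1 query: load all N departments.
departments = EntityLoad("Department");

// With lazy="false" fetch="select", Hibernate has by now also issued
// one extra SELECT per department for its employees: N more queries.
for (i = 1; i <= ArrayLen(departments); i++) {
    dept = departments[i];
    // No further SQL is expected here; the collection is already initialized.
    // getName() is a hypothetical getter used only for display.
    WriteOutput(dept.getName() & ": " & ArrayLen(dept.getEmployees()) & " employees<br>");
}
</cfscript>
```

For N departments that is 1 + N SELECT statements before the loop even starts. Turning on `logSQL` in `this.ormSettings` is one way to count the statements yourself.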
- Lazy fetching : In this strategy, the associated object or collection is fetched lazily, i.e. only when required. For example, when you load a Department object, its associated employees are not loaded at all; they are loaded only when you access them. This results in a new request to the database, but it lets you control how much data is loaded and when. This reduces the database load because you fetch only the data that is required, which makes it a good default strategy. We will talk about this in much more detail in the next post. For the time being, let's just say this is the most commonly used and the default strategy for obvious reasons. You can define this strategy by setting lazy="true" or lazy="extra".
example :
<cfproperty name="employees" fieldtype="one-to-many" cfc="employee" fkcolumn="deptId" lazy="true" >
pros : Only the minimum required data is loaded. This avoids loading the entire object graph into memory, so performance is generally good.
cons : If the association is always accessed after loading, this results in extra SQL execution. If the loaded object is accessed in another ORM session (i.e. it has become detached), extra care must be taken to avoid errors like ‘LazyInitializationException’ or ‘NonUniqueObjectException’.
When to use : When the association is not immediately read after loading the object. This is the most commonly used and default strategy.
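A minimal sketch of where the SQL is issued under lazy fetching. It assumes ORM is enabled, the `employees` property is mapped with lazy="true" as above, and a Department with primary key 1001 exists (the id is borrowed from the HQL example in this post).

```cfml
<cfscript>
// SELECT against the Department table only; no Employee SQL yet.
dept = EntityLoadByPK("Department", 1001);

// ... work with the department's own properties here ...

// First use of the collection: the SELECT for this department's
// employees is issued at about this point, not before.
employees = dept.getEmployees();
WriteOutput(ArrayLen(employees));
</cfscript>
```

If the code never touches `getEmployees()`, the second SELECT is never executed, which is the whole benefit of the strategy.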
- Eager fetching : In this strategy, the associated object or collection is fetched together with the owning entity using a single SQL join query. Thus, this strategy reduces the number of trips to the database and is a good optimization when you always access the associated object immediately after loading the owning entity. You can define this strategy by setting fetch="join" on the relationship property definition in the CFC.
example :
<cfproperty name="employees" fieldtype="one-to-many" cfc="employee" fkcolumn="deptId" fetch="join">
With this strategy, on loading the department object, both department and employees data will be fetched from the database using a single join query.
Even if eager fetching is not defined in the CFC metadata, it can be done at runtime using ORMExecuteQuery. This can be very powerful when you want the association to be lazily loaded in most cases but loaded immediately in a few. In those cases, use a join in the HQL and execute it using ORMExecuteQuery.
Example :
ORMExecuteQuery("from Department dept left join fetch dept.employees")
ORMExecuteQuery("from Department dept left join fetch dept.employees where dept.id=1001")
pros : The association is loaded immediately and hence the associated object can be accessed even after the ORM session is closed. The association is loaded using a single join query, which is usually more efficient than executing multiple queries.
cons : The association is loaded even if it might not be needed. Since the query used is a join query, the result set returned by the DB will typically contain a lot of repetitive data. If used for more than one collection of an entity, this creates a cartesian product of the collections' data and thus a huge result set.
When to use : When the association is almost always read after loading the object. More suitable for many-to-one and one-to-one associations, or for a single collection where the associated objects can be loaded using a join query without much overhead.
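As a hedged variation on the runtime HQL above: a join fetch on a collection returns one row per employee, so the same Department instance can show up several times in the returned array. The sketch below uses `select distinct` to collapse those duplicates and a positional parameter instead of a literal id. It assumes the same Department/Employee mapping; the duplicate-handling behaviour is Hibernate's as I understand it and is worth confirming with SQL logging.

```cfml
<cfscript>
// All departments with their employees in one join query.
// "select distinct" asks Hibernate to return each Department once.
departments = ORMExecuteQuery(
    "select distinct dept from Department dept left join fetch dept.employees"
);

// Same query restricted to one department, with the id passed as a
// positional parameter (1001 is the id used in the example above).
result = ORMExecuteQuery(
    "select distinct dept from Department dept left join fetch dept.employees where dept.id = ?",
    [1001]
);
</cfscript>
```

Both calls return an array of Department objects whose `employees` collections are already initialized.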
- Batch fetching : This strategy tells Hibernate to optimize the second SQL select in immediate fetching or lazy fetching so that a batch of objects or collections is loaded in a single query. This allows you to load a batch of proxied objects or uninitialized collections that are referenced in the current request. This is a blind-guess optimization technique, but it is very useful in nested tree loading.
The concept of batch-fetching is slightly confusing (at least I got confused when I first read about it). So you need to pay careful attention to this.
This can be specified using the “batch-size” attribute on the CFC or on the relationship property. There are two ways you can tune batch fetching: on the CFC and on the collection.
  - Batch fetching at CFC level : This allows batch fetching of proxied objects and hence applies to one-to-one and many-to-one relationships. To give an example, consider the Employee-Department example where 25 employee instances are loaded in the request (ORM session). Each employee has a reference to a department, and the relationship is lazy. Therefore the employee objects will contain proxied objects for Department. If you now iterate through all the employees and call getDepartment() on each, by default up to 25 SELECT statements will be executed to retrieve the proxied departments, one for each distinct Department proxy object. This can be batched by specifying the ‘batch-size’ attribute on the Department CFC like
<cfcomponent table="Department" batch-size="10" …>
When you call getDepartment() on the first employee object, Hibernate will see that Department should be batch fetched, and hence it will fetch up to 10 of the department objects that are proxied in the current request.
So for 25 employee objects, Hibernate will execute at most three queries: in batches of 10, 10 and 5.
You must note that batch-size at the component level does not mean that whenever you load a Department object, 10 department objects will get loaded in the session. It just means that if there are proxied instances of Department in the session, up to 10 of those proxied objects will get loaded together.
  - Batch fetching at collections : This allows batch fetching of uninitialized value collections and one-to-many or many-to-many relationships. To give an example, consider the Department-Employee one-to-many relationship where 25 departments are loaded and each department has a lazy collection of employees. If you now iterate through the departments and call getEmployees() on each, by default 25 SELECT statements will be executed, one for each Department to load its employee objects. This can be optimized by enabling batch fetching, which is done by specifying “batch-size” on the relationship property like
In Department.cfc :
<cfproperty name="employees" fieldtype="one-to-many" cfc="employee" fkcolumn="deptId" lazy="true" batch-size="10">
One important thing to understand is that batch-size here does not mean that 10 employees will be loaded at one time for a department. It actually means that 10 employee collections (i.e. the employees for 10 department objects) will be loaded together.
When you call getEmployees() on the first department, employees for 9 other departments will also be fetched along with the one that was asked for.
The value of the batch-size attribute should be chosen based on the expected number of proxied objects or uninitialized collections in the session.
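The CFC-level case above can be sketched as the loop below. It is an illustration only: it assumes ORM is enabled, every employee has a department, the many-to-one `department` relationship is lazy, batch fetching is configured on the Department CFC as shown in this post, and `getName()` is a hypothetical getter used to force the proxy to load.

```cfml
<cfscript>
// Say this returns 25 employees, each holding a Department proxy.
employees = EntityLoad("Employee");

for (i = 1; i <= ArrayLen(employees); i++) {
    dept = employees[i].getDepartment();

    // Using the proxy forces it to initialize.
    // Without batch fetching: one SELECT per distinct Department proxy.
    // With a batch size of 10: the first use loads up to 10 pending
    // Department proxies in one SELECT, so later iterations that hit an
    // already loaded department cause no SQL at all.
    WriteOutput(dept.getName() & "<br>");
}
</cfscript>
```

The same loop written over departments and `getEmployees()` behaves analogously for the collection-level setting: each batched SELECT initializes the employee collections of several departments at once.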

#1 by Ben Nadel on September 18th, 2009
I am confused about the batching. In the example, I think what you are saying is that there is one department, which has 25 employees in it (loaded lazily). Then, you iterate over the employees and call getDepartment() on them. Why would that call any more queries? Would the getDepartment() request simply return the department instance cached in the current session?
Or, does the fact that you are going more than one level deep trigger a non-lazy approach:
department::getEmployees() — lazy load
department.getEmployees()[ 1 ].getDepartment() — trigger deep load?
If that triggers deep load, then I assume the second getDepartment() is deep-loading the employee proxies contained within it?
Am I way off base here?
FYI – when I entered this form, it said “Welcome back Ben Nadel”… but, when I went to submit it, it told me my name needed to be entered (had to hit Change and fill out form).
#2 by Rupesh Kumar on September 19th, 2009
I will try again. There are two parts and they are completely different.
1. Batch fetching at CFC level, which applies to many-to-one and one-to-one: Let's say you have defined batch fetching on Department.cfc with size 10. And let's say you have loaded a few employees in the session.
emp1 -> dep1
emp2 -> dep2
emp3 -> dep1
emp4 -> dep3
…
When you call getDepartment() on emp1 where dept should be lazily loaded, it will load 9 other departments which are proxied in the session. So even though you called getDepartment on emp1 only, department objects dep2, dep3.. will be loaded if they are proxied (because of other employee objects in the session). This is done in anticipation that you will be accessing getDepartment for other employee objects as well.
2. Similarly for one-to-many and many-to-many, let's say you have specified the batch size on the dept-employees relationship where employees are lazily loaded, and there are 20 departments loaded in the session. When you call dept1.getEmployees(), employees for 9 other departments will also be loaded along with it. Again, this is in anticipation that you will be calling getEmployees on all the department objects that are loaded.
Let me know if I confused you even more
#3 by Sumit Verma on January 22nd, 2010
Does the “fetch all properties” syntax work? I tried it, but it seems to be silently ignored.
e.g.
from Document fetch all properties order by name
As mentioned at the end of section 14.3 here?
http://docs.jboss.org/hibernate/core/3.3/reference/en/html/queryhql.html#queryhql-joins
#4 by Simon Lenoir on March 16th, 2010
Hi Rupesh,
It seems that ColdFusion by default doesn’t use lazy loading. Is there any additional configuration to do? In CF Admin or in the application.ormSettings?
By reproducing your example with lazy loading:
property name="employees" fieldtype="one-to-many" cfc="employee" fkcolumn="deptId" lazy="true";
and then doing a simple dump:
I get the array of employees with all their properties. It shouldn’t be because I didn’t explicitly load the array of employees.
My second concern, and it’s related, is that I would like to use a bi-directional ORM relationship. If I don’t use lazy loading at all, when I get the department with its array of employees, each employee has a department property with its own list of employees, and so on. It looks like an infinite loop of objects. How can I make sure that doesn’t happen (because all this data gets sent to Flex)?
Thanks for your help.
Simon
#5 by Josh on January 18th, 2011
Is it possible to combine lazy loading with fetch=”join”?
If so, what happens?
Thanks!
#8 by Google App Engine PHP on September 21st, 2011
Just recently I’ve realized, that the real bottleneck of web apps is database. Happy to see how its tuned for CF, maybe can apply to PHP / MySQL combo too…
#9 by Sonu Agarwal on August 9th, 2012
Very nice article, Rupesh. It helped me a lot in understanding performance tuning in ORM more precisely and concisely.