Hibernate - Don't forget the first level cache

Many developers ignore or forget about the Hibernate first level cache.
The Hibernate first level cache is part of the session and is not visible to other sessions.
The first level cache cannot be disabled; unlike the second level cache, it is not optional.
Hibernate uses the first level cache to store attached entities. It does this for two reasons:
1. Performance - avoid hitting the db again when the same entity is fetched more than once during the transaction.

2. Avoiding "shadow objects", so that changes to attached entities can be tracked. To explain:
 Well, imagine this code:
c1 = session.get(Customer.class, 42);
c2 = session.get(Customer.class, 42);
(Note: the get could be replaced with any other kind of query: HQL, criteria, *lazy navigation*, etc.)

If Hibernate didn't store the attached entities in the first level cache, then we would have two different instances of the same customer row. So,
assertSame(c1, c2); would fail but
assertEquals(c1, c2); would actually pass (if you have a proper equals/hashCode implementation)

What is wrong with this?
Changing c1's attributes would not affect c2, so Hibernate doesn't allow more than one instance of the same entity to be attached to the session. Hibernate does this by using a central place to control all attached entities. This place is called the first level cache. Any get (actually not any: we can bypass it, for example by using a StatelessSession) and any update of an entity goes through it.
If Hibernate (or any other JPA implementation) did not do this, you would have these “shadow objects” all over the place, and the developer would need to keep track of them manually and ensure that only *one* of these objects is used.
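
To make the idea concrete, here is a minimal sketch of such a "central place" in plain Java. It is a toy model, not Hibernate's implementation: IdentityMapSession, CachedCustomer and the loader function are hypothetical names invented for this illustration, and the loader only simulates a db read.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical entity, used only for this illustration.
class CachedCustomer {
    final long id;
    String name;
    CachedCustomer(long id, String name) { this.id = id; this.name = name; }
}

// Toy stand-in for a session; NOT Hibernate's implementation.
class IdentityMapSession {
    private final Map<Long, CachedCustomer> firstLevelCache = new HashMap<>();
    private final Function<Long, CachedCustomer> loader; // simulates a db read
    int dbHits = 0;

    IdentityMapSession(Function<Long, CachedCustomer> loader) { this.loader = loader; }

    CachedCustomer get(long id) {
        // Return the already attached instance if there is one;
        // otherwise "hit the db" once and remember the result.
        return firstLevelCache.computeIfAbsent(id, key -> {
            dbHits++;
            return loader.apply(key);
        });
    }
}

class IdentityMapDemo {
    public static void main(String[] args) {
        IdentityMapSession session =
                new IdentityMapSession(id -> new CachedCustomer(id, "Alice"));
        CachedCustomer c1 = session.get(42L);
        CachedCustomer c2 = session.get(42L);
        c1.name = "Bob";
        System.out.println(c1 == c2);       // true: one instance per row
        System.out.println(c2.name);        // Bob: the change is visible through c2
        System.out.println(session.dbHits); // 1: the second get did not hit the db
    }
}
```

Both reasons from the list above show up here: the second get costs no db hit, and there is no second copy of the row that could drift out of sync.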

So, this is a built-in Hibernate mechanism that cannot be turned off - why should I care about it?
Good question (this is the reason I wrote this post :) ).
Many developers' complaints about Hibernate come from not understanding Hibernate's mechanisms.
Later on I will link to 2 posts that talk about problems in Hibernate which can be solved by using the Hibernate framework correctly.

The main reason that you should be aware of the first level cache is performance.
An attached entity has a cost:
Cost of using a proxy.
Cost of tracking: Hibernate needs to keep track of the object's state.
Cost of synchronization: on flush, Hibernate needs to check whether it should synchronize the object to the db. The check and the write to the db cost a lot, and in most cases they are not necessary.
Memory: the object lives in memory even if we are not going to use it any more.

In the following post avoiding-caching-to-improve-hibernate-performance the developer describes a common use case scenario. He has an XML file in which each element is a combination of a few Hibernate entities. He reads an element from the XML, runs some db queries to load referenced entities, look up data, do validations, etc., creates entities based on the XML element and saves them to the db. Since he wanted good performance by avoiding loading data from the db for each XML element, he didn't open a new Hibernate session for each XML element.
He was sure that this way he could utilize the Hibernate first level cache and avoid hitting the db for each element.
But he observed that performance decreased and the processing time required per input record in the data set increased linearly.

The reason - the first level cache. The first level cache gets bigger and bigger during processing, and this has a major impact on Hibernate performance. Most of the time is spent in the Hibernate synchronization mechanism (flush).
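
The effect can be sketched with a toy model in plain Java. This is not Hibernate code: GrowingContext and its methods are hypothetical, and the sketch assumes that a flush is triggered once per input record (for example by a query) and that each flush has to check every attached entity.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a long-lived session; NOT Hibernate's implementation.
class GrowingContext {
    private final List<Object> attached = new ArrayList<>();
    long dirtyChecks = 0;

    void persist(Object entity) { attached.add(entity); }

    // Assumption: a flush looks at every attached entity to see if it changed.
    void flush() { dirtyChecks += attached.size(); }

    void clear() { attached.clear(); }

    // Processes `records` inputs, flushing after each one, and clearing
    // the context every `clearEvery` records (0 means never clear).
    static long process(int records, int clearEvery) {
        GrowingContext ctx = new GrowingContext();
        for (int i = 1; i <= records; i++) {
            ctx.persist(new Object());
            ctx.flush();
            if (clearEvery > 0 && i % clearEvery == 0) {
                ctx.clear();
            }
        }
        return ctx.dirtyChecks;
    }
}

class FlushCostDemo {
    public static void main(String[] args) {
        System.out.println(GrowingContext.process(1000, 0));  // 500500
        System.out.println(GrowingContext.process(1000, 50)); // 25500
    }
}
```

Under these assumptions the work per record grows linearly with the number of entities already attached, which matches what the developer observed, while keeping the context small keeps the work per record bounded.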

So - what is the lesson?
You should always be concerned about your first level cache. If you need the objects only once in the session, load them in a minimal scope using a stateless session; this way Hibernate will not need to produce proxies or keep track of the objects, and they will be free to be collected by the GC.
Other options:
Use read-only mode for read-only entities (there are additional options to define read-only).
Define entities as immutable where possible.
Clear unnecessary entities from the first level cache.

How to clear the first level cache?
The JPA 1.0 API doesn't support clearing specific instances; it supports only clearing everything (EntityManager.clear()). JPA 2.0 added EntityManager.detach() for a single instance. The Hibernate API allows you to clear specific instances (Session.evict()). If you made changes to the instances, make sure that you call flush before clear. The entities become detached after clear, which means Hibernate will not keep track of them and their state will not be synchronized with the db.
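
Why the order of flush and clear matters can be shown with a small toy model in plain Java. It is only a sketch, not Hibernate code: FlushClearSession and Item are hypothetical, the "db" is just a map, and the method names merely mirror Session.flush(), Session.evict() and Session.clear().

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical entity, used only for this illustration.
class Item {
    final long id;
    String name;
    Item(long id, String name) { this.id = id; this.name = name; }
}

// Toy stand-in for a session; NOT Hibernate's implementation.
class FlushClearSession {
    private final Map<Long, String> db;                  // simulated table: id -> name
    private final Map<Long, Item> attached = new HashMap<>();

    FlushClearSession(Map<Long, String> db) { this.db = db; }

    Item get(long id) {
        return attached.computeIfAbsent(id, key -> new Item(key, db.get(key)));
    }

    // Writes the state of every attached entity to the simulated db.
    void flush() {
        for (Item item : attached.values()) {
            db.put(item.id, item.name);
        }
    }

    void evict(Item item) { attached.remove(item.id); }  // detach one instance
    void clear() { attached.clear(); }                   // detach everything
}

class FlushBeforeClearDemo {
    public static void main(String[] args) {
        Map<Long, String> db = new HashMap<>();
        db.put(1L, "old");
        db.put(2L, "old");
        FlushClearSession session = new FlushClearSession(db);

        Item first = session.get(1L);
        first.name = "new";
        session.flush();   // change is written
        session.clear();   // then the instance is detached

        Item second = session.get(2L);
        second.name = "new";
        session.clear();   // detached before any flush
        session.flush();   // nothing left to synchronize

        System.out.println(db.get(1L)); // new
        System.out.println(db.get(2L)); // old: the change was lost
    }
}
```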

Solving by design - another design approach, such as using a process context for the read-only data and a session per processed element, can also solve the problem. This way you have short-lived sessions and short db transactions, and you avoid triggering flush during processing.

Further reading:

2 comments:

  1. What about Ebean? It is sessionless and works well.

    1. There are a lot of sessionless solutions. Ebean is one of them. My post is for those who chose a session-based solution like Hibernate, with all its advantages and disadvantages, but who in some cases need to avoid the first level cache due to performance limitations.