Second level cache for multiple databases in cluster

nickonet · **Posted:** Tue Feb 03, 2004 11:07 pm

Hi-
I'm still in the process of evaluating Hibernate for our needs and so far I've been impressed by the features and the quality of the docs. Our web application has several clients and each client has a separate database based on the same schema (possibly on different backend servers). I've read in the docs that I need to use a separate SessionFactory for each database.

I would be interested in the second level cache and I will work in a clustered environment (i.e. several web app servers running the same application and changing the same data).

1- It's stated in the docs that OSCache cannot work in a clustered environment. I understand that, because if each SessionFactory instantiates its own cache and the cache is used in a cluster, then when it receives a flush event all caches will try to remove the same primary key (which could be used in different databases). How would Swarmcache or JBoss TreeCache manage it?

2- According to the docs, Swarmcache cannot be used with QueryCache which leaves me with JBoss TreeCache. I'm a little afraid when I read TreeCache replicates data across the cluster. Can it be disabled (I prefer the clustered invalidation of OSCache/Swarmcache)? Has anybody already used TreeCache in a project? Did it scale well with lots of caches?

Thanks,

emmanuel · **Posted:** Wed Feb 04, 2004 7:24 am

Hmm, actually, TreeCache won't help you.

Quoting Gavin

Quote:

Actually, at the present time, I think the Hibernate docs are wrong and, in fact, the query cache does not work with *either* clustered cache.

The trouble is that SwarmCache is doing clustered eviction instead of clustered replication, and neither of them provides a cluster-safe timestamp.

So we need to get this figured out. (I am on this.)

metula · **Joined:** Mon Feb 02, 2004 8:27 pm **Posts:** 17

Hi.. I'm glad to see others out there have this multi-database design.. It actually works quite well as far as scaling up for large clients goes, but when it comes to J2EE app servers with CMP or O/R mapping, things get complicated / impossible. I'd be interested to hear any other findings you have as to the suitability of Hibernate for this type of schema.

Anyway.. I had a brief discussion last night with Gavin (didn't realise who he was in relation to Hibernate at the time!) and he said you shouldn't need multiple SessionFactory instances; you just need to provide your own connection (see sessionFactory.openSession(Connection)). I still don't get how that works with the primary caching, but I'll find out when I start testing, I guess.

Having looked at the second level caches: first off, I'm not sure they're appropriate to this sort of schema unless you've got a large amount of spare memory... also, I don't think any of them work anyway!!

We have an added complication as outlined in this topic

http://forum.hibernate.org/viewtopic.php?t=927676

-M-

gavin · **Posted:** Wed Feb 04, 2004 1:01 pm

In 2.1.2, the query cache will work with TreeCache, as long as you have synchronized clocks in the cluster.

nickonet · **Posted:** Wed Feb 04, 2004 1:11 pm

Data I want to cache might be extensively used but I don't know upfront what this data will be (can't make a static reference to it).
I didn't really have a problem with custom ConnectionProvider. My problem comes from the fact that Settings.setCacheProvider is not exposed outside Hibernate package and/or that we cannot access (at least I haven't seen how) the Cache outside the package.
Otherwise, I'd just change the Settings to my own cache provider that would create a net.sf.hibernate.cache.Cache that prepends, for instance, the client name/id to the key and uses a single OSCache administrator for notification. It works great ... on paper (I haven't started development yet).

Hibernate is open source so I can certainly recompile it, but I don't want to do that at every new version.

gavin · **Posted:** Wed Feb 04, 2004 1:14 pm

Why not just set the property hibernate.cache.provider_class???

nickonet · **Posted:** Wed Feb 04, 2004 1:19 pm

Yes, I can have my custom cache created by using hibernate.cache.provider_class, but I can't (apparently) access this cache to set a variable after it is created. No Properties seem to be passed to the CacheProvider/Cache either.

I'm back to using different caches listening to the same port for invalidations that could have the same key (I can't define a different multicast port for each client as it could be in the hundreds).

gavin · **Posted:** Wed Feb 04, 2004 1:30 pm

Actually, I don't understand why you want to do all this. What's the point again?

nickonet · **Posted:** Wed Feb 04, 2004 1:49 pm

Ok let me try to restate my problem:
I have a web application hitting several databases that are independent but have the same tables (e.g. employees for client A and B).
I would like to use Hibernate to map my data to some objects (and avoid the pain of writing SQL for each action). I need to create a SessionFactory per client. Each SessionFactory will have its own datasource (no problem) and will instantiate its own second level cache. According to what I've found in the code (I'm a newbie to Hibernate so not sure), the key to the DB data is the key to the cache. But this key is also what is used for clustered invalidation. So if two caches run on the same multicast port, it is possible that a cache will invalidate data that doesn't need to be invalidated (thus removing the advantage of the cache).
If I have a multicast port for each client, I would possibly need hundreds of them. It is not very practical.

I wanted to write my own CacheProvider that would append the client identifier to the key before storing the object in a single cache. For instance, if using OSCache, the Cache for client A would call cacheAdministrator.put("A/" + key) (Similarly get("A/" + key)) where cacheAdministrator would be unique in the JVM.
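The key-prefixing idea described above can be sketched in plain Java. This is only an illustration under stated assumptions: `ClientScopedCache` and its methods are hypothetical names, the shared `Map` stands in for the single JVM-wide cache administrator, and nothing here is Hibernate or OSCache API (the syntax is also newer than the Java in use when this thread was written).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: a per-client view onto one JVM-wide store. Not Hibernate or OSCache API. */
class ClientScopedCache {
    // Stand-in for the single shared cache administrator mentioned in the post.
    private static final Map<String, Object> SHARED = new ConcurrentHashMap<>();

    private final String clientId;

    ClientScopedCache(String clientId) {
        this.clientId = clientId;
    }

    // Prefix the key with the client id so that equal primary keys
    // coming from different databases map to different cache entries.
    private String scoped(Object key) {
        return clientId + "/" + key;
    }

    Object get(Object key) {
        return SHARED.get(scoped(key));
    }

    void put(Object key, Object value) {
        SHARED.put(scoped(key), value);
    }

    void remove(Object key) {
        SHARED.remove(scoped(key));
    }

    public static void main(String[] args) {
        ClientScopedCache a = new ClientScopedCache("A");
        ClientScopedCache b = new ClientScopedCache("B");
        a.put(3, "employee 3 of client A");
        b.put(3, "employee 3 of client B");
        a.remove(3);                  // an invalidation for client A ...
        System.out.println(b.get(3)); // ... leaves client B's entry in place
    }
}
```

In a cluster, the same scoped string would also be what gets broadcast for invalidation, so one multicast port could serve every client without cross-client evictions.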

To do that I need to say to my own Cache that it is the Cache for client A and I didn't find a way to access it after it is created in Settings.

Note: I don't need transaction locking on this data. The data users will be updating is specific to their account (and I use session affinity). For general data, few users will update it and I can still display an error if a conflict occurs.

gavin · **Posted:** Wed Feb 04, 2004 1:51 pm

So your "identical" tables have identical primary keys for *different* data??

Is that kinda - dangerous??!!

nickonet · **Posted:** Wed Feb 04, 2004 1:59 pm

Well, that's the point of different databases. They have the same schema but different data. For instance, "employees" in database A could have an employee 3, and so could "employees" in database B. How would you enforce uniqueness of keys in different databases?

Sorry, I didn't pay more attention to the Configuration.configureCaches() method. It actually has a property that I can use to pass the db identifier. If it is of any interest for the Hibernate project, I can give this part of the code as a contribution (cache for multiple databases with OSCache and possibly any cache supported)... Of course, once it is written and tested.

Thanks again for your time.

jfifield · **Joined:** Tue Aug 26, 2003 3:09 pm **Posts:** 58

We are actually dealing with a similar situation. I thought of the custom cache provider solution, but never looked very far into it. We have been thinking about changing from autoincrement keys to GUID keys to enforce uniqueness across multiple databases. To me, it seems to be the least intrusive solution so far. I'd love to hear any other ideas though.

Joe

gavin · **Posted:** Wed Feb 04, 2004 4:12 pm

I think there is a *really* good argument (completely independent of any Hibernate issues) for using UUIDs here. Would make it MUCH easier to change your data model later (and do replication, etc).
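As a rough illustration of the UUID suggestion, the sketch below generates keys with the JDK's `java.util.UUID` (available since Java 5, so newer than this thread). `UuidKeyDemo` and `newKey` are hypothetical names; how the key is mapped in Hibernate is not shown here.

```java
import java.util.UUID;

/** Hypothetical sketch: primary keys that need no coordination between databases. */
class UuidKeyDemo {
    // A random (version 4) UUID rendered in its canonical 36-character form.
    static String newKey() {
        return UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        // Each database can mint keys on its own; a collision between
        // two random UUIDs is possible in theory but vanishingly unlikely.
        System.out.println("client A employee key: " + newKey());
        System.out.println("client B employee key: " + newKey());
    }
}
```

The trade-off is a wider key column than an integer sequence, in exchange for keys that stay unique if the databases are ever merged or replicated.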

nickonet · **Posted:** Wed Feb 04, 2004 4:52 pm

I am no DB expert but I haven't seen (at least in Postgres) a way to define a sequence valid across different databases (not tables). Also our web app may connect to different backend servers.

As for the solution I plan to use I'd be happy to share some code when I implement it (can't promise it will work until it actually compiles). As a last resort I could use the cache outside of Hibernate but that would remove the cool QueryCache feature.

A side question for the Hibernate experts: What is the key used in cache made of? Does it include the record primary key and the class? The record primary key and the table?

gavin · **Posted:** Wed Feb 04, 2004 4:53 pm

It is easy to have a valid sequence across multiple DBs! Just define a start and increment!
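The start-and-increment approach can be simulated in a few lines of Java. This is a sketch under assumptions: `InterleavedSequence` is a hypothetical class that mimics a database sequence, with each database given a distinct start value and a shared increment at least as large as the number of databases.

```java
/** Hypothetical sketch: simulates a database sequence with a fixed start and increment. */
class InterleavedSequence {
    private long next;
    private final long increment;

    InterleavedSequence(long start, long increment) {
        this.next = start;
        this.increment = increment;
    }

    // Return the current value, then advance by the increment.
    long nextValue() {
        long value = next;
        next += increment;
        return value;
    }

    public static void main(String[] args) {
        // Two databases: same increment (2), different starts (1 and 2).
        InterleavedSequence dbA = new InterleavedSequence(1, 2);
        InterleavedSequence dbB = new InterleavedSequence(2, 2);
        for (int i = 0; i < 3; i++) {
            System.out.println("A=" + dbA.nextValue() + " B=" + dbB.nextValue());
        }
    }
}
```

Database A yields only odd values and database B only even ones, so the two never overlap. With hundreds of client databases, the increment has to be chosen up front to exceed the largest number of databases expected, since raising it later means renumbering or re-basing the sequences.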