-->

All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 4 posts ] 
Author Message
 Post subject: Problem getting Hibernate Search (Lucene) to create indexes
PostPosted: Sat May 05, 2007 7:01 pm 
Newbie

Joined: Fri Apr 06, 2007 1:29 pm
Posts: 4
Hibernate version: version 3.2.2, January 24, 2007
Hibernate search version: hibernate-search-3.0.0.Beta1

Mapping documents:
My hibernate config

Code:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE hibernate-configuration PUBLIC
"-//Hibernate/Hibernate Configuration DTD 3.0//EN"
"http://hibernate.sourceforge.net/hibernate-configuration-3.0.dtd">
<hibernate-configuration>
    <session-factory>
        <property name="hibernate.bytecode.use_reflection_optimizer">false</property>
        <!-- configuration pool via c3p0-->
        <property name="c3p0.acquire_increment">1</property>
        <property name="c3p0.idle_test_period">100</property> <!-- seconds -->
        <property name="c3p0.max_size">3</property>
        <property name="c3p0.max_statements">0</property>
        <property name="c3p0.min_size">1</property>
        <property name="c3p0.timeout">60</property> <!-- seconds -->
        <!-- DEPRECATED very expensive property name="c3p0.validate>-->
        <property name="hibernate.cache.provider_class">org.hibernate.cache.NoCacheProvider</property>
        <property name="hibernate.cache.use_minimal_puts">false</property>
        <property name="hibernate.cache.use_query_cache">false</property>
        <property name="hibernate.connection.driver_class">org.postgresql.Driver</property>
        <property name="hibernate.connection.password"></property>
        <property name="hibernate.connection.url">jdbc:postgresql://localhost:5432/XXXXX?logUnclosedConnections=true&amp;loglevel=1</property>
        <property name="hibernate.connection.username">XXXXXXX</property>
        <property name="hibernate.current_session_context_class">thread</property>
        <property name="hibernate.dialect">org.hibernate.dialect.PostgreSQLDialect</property>
        <property name="hibernate.format_sql">false</property>
        <property name="hibernate.max_fetch_depth">3</property>
        <property name="hibernate.show_sql">false</property>
        <!-- Hibernate Search -->
        <property name="hibernate.search.default.directory_provider">org.hibernate.search.store.FSDirectoryProvider</property>
        <property name="hibernate.search.default.indexBase">/opt/lucine</property>
        <!-- Mapping objet/relationnel -->
        <mapping resource="com/knover/hit/Hit.hbm.xml" />
        <!-- Hibernate Search auto indexing -->
        <event type="post-update">
            <listener class="org.hibernate.search.event.FullTextIndexEventListener"/>
        </event>
        <event type="post-insert">
            <listener class="org.hibernate.search.event.FullTextIndexEventListener"/>
        </event>
        <event type="post-delete">
            <listener class="org.hibernate.search.event.FullTextIndexEventListener"/>
        </event>
    </session-factory>
   
</hibernate-configuration>



The Mapping File

Code:
<?xml version="1.0"?>
<!DOCTYPE hibernate-mapping PUBLIC "-//Hibernate/Hibernate Mapping DTD 3.0//EN"
"http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd">
<hibernate-mapping>
    <class name="com.knover.hit.Hit" table="hit" schema="hits">       
        <id name="id" type="java.lang.Long">
            <column name="hit_id" />
            <!-- <generator class="native"></generator> -->
            <generator class="sequence">
                <param name="sequence">hit_id_seq</param>
            </generator>
        </id>
       
        <timestamp name="timestamp" column="timestamp" />

        <set name="subjects" schema="hits" table="subject" sort="natural" lazy="false" cascade="all" >
            <key column="id"/>
            <element column="subject_id" type="java.lang.Integer"/>
        </set>
       
        <property name="title" type="text">
            <column name="title"/>
        </property>
       
        <property name="content" type="text" not-null="false" update="false">
            <column name="content" />
        </property>
       
        <property name="url" index="url_idx" type="text" not-null="true" update="true">
            <column name="url" />
        </property>
       
        <property name="domain" type="text" not-null="false" update="true">
            <column name="domain"  />
        </property>
       
        <property name="retrievedAt" type="java.util.Date" update="false" >
            <column name="retrival_timestamp" />
        </property>
       
        <property name="isCurrent" type="java.lang.Boolean" not-null="true" update="true">
            <column name="is_current" default="TRUE" />
        </property>
       
        <property name="isHidden" type="java.lang.Boolean" not-null="true" update="true" >
            <column name="is_hidden" default="FALSE" />
        </property>
       
        <property name="views" type="java.lang.Long" not-null="true" update="true">
            <column name="views" default="0" />
        </property>
       
        <property name="rating" type="java.lang.Long" not-null="true" update="true">
            <column name="rating" default="-2" check="rating &lt;= 2 AND rating &gt;=-2" />
        </property>
       
        <property name="numberOfRatings" type="java.lang.Long" not-null="true" update="true">
            <column name="number_of_ratings" default="0" />
        </property>
    </class>
</hibernate-mapping>


PostgreSQL 8*

Here is the code for my Hit Class

Code:
/** Hit.java * Created on April 12, 2007, 5:13 PM */
package com.knover.hit;
import java.io.Serializable;
import java.util.Date;
import java.util.Set;
import org.apache.lucene.search.Searchable;
import org.hibernate.search.annotations.*;

/** @author Brian M. Lima*/
@Indexed(index="hit_idx")
public class Hit implements Serializable{   
    private Long id=null;
    private String content=null;
    private Date timestamp=null;
    private Set<Integer> subjects=null;
    private String title=null;
    private String url=null;
    private Date retrievedAt=null;
    private Boolean isCurrent=true;
    private Long views=null;
    private Long rating=null;
    private Long numberOfRatings=null;
    private String domain=null;
    private Boolean isHidden=false;

    @DocumentId
    public Long getId(){return id;}
    public void setId(Long id){this.id = id;}
   
    public String getContent(){return content;}
    public void setContent(String content){this.content = content;}
   
    public Set<Integer> getSubjects(){return subjects;}
    public void setSubjects(Set<Integer> subjects){this.subjects = subjects;}
   
    public String getTitle(){return title;}
    public void setTitle(String title){this.title = title;}
   
    public String getUrl(){return url;}
    public void setUrl(String url){this.url = url;}
   
    public Date getRetrievedAt(){return retrievedAt;}
    public void setRetrievedAt(Date retrievedAt){this.retrievedAt = retrievedAt;}
   
    public Boolean getIsCurrent(){return isCurrent;}
    public void setIsCurrent(Boolean isCurrent){this.isCurrent = isCurrent;}
   
    public Long getViews(){return views;}
    public void setViews(Long views){this.views = views;}
   
    public Long getRating(){return rating;}
    public void setRating(Long rating){this.rating = rating;}
   
    public Long getNumberOfRatings(){return numberOfRatings;}
    public void setNumberOfRatings(Long numberOfRatings){this.numberOfRatings = numberOfRatings;}
   
    public String getDomain(){return domain;}
    public void setDomain(String domain){this.domain = domain;}
   
    public Date getTimestamp(){return timestamp;}
    public void setTimestamp(Date timestamp) {this.timestamp = timestamp;}
   
    public Boolean getIsHidden() {return isHidden;}
    public void setIsHidden(Boolean isHidden) {this.isHidden = isHidden;}
}


I added some annotations to my class that Hibernate Search complained about not having, and everything seems to be going fine except that no indexes are created in the index folder.

My objects are added to the database perfectly, and I know that Lucene is working because it threw an exception when I had no DocumentId annotation on my class.

I also checked all my permissions on the directory for indexes. I created the directory with the same user I am testing as, so permissions should not be a problem.

I feel as if I am missing something very simple, such as not annotating my class correctly or not initializing Lucene, if it needs to be initialized. Anyway, I have been searching for a few hours now and I guess I am not googling the problem with the correct words.

To recap: my database is empty, I build the tables using SQL, then run my insertion program, which successfully inserts over 2 million rows into the database. All my Hibernate searching test cases pass, but there are no Lucene indexes created in the target directory.

Any help will be greatly appreciated. I know I am missing something simple; I think I have been looking too hard. Perhaps I am missing some initialization? I get no errors or notifications from Hibernate Search.


 Post subject:
PostPosted: Sun May 06, 2007 6:35 pm 
Hibernate Team

Joined: Sun Sep 14, 2003 3:54 am
Posts: 7256
Location: Paris, France
Hi,
This is strange.
Be sure to double-check /opt/lucine (it seems you made a typo: lucine vs lucene).
You need to create this directory;
subdirectories will then be created in it (in your case, hit_idx).

Also, something I found strange: you have no @Field annotation, which makes your mapping pretty useless.

You also need to commit your transaction after the insertions.

PS: one way to check whether data has been indexed after the insertions is to run a query.
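
The directory check suggested above can be sketched in plain Java. This is only an illustration under stated assumptions: the class IndexBaseCheck and its methods are hypothetical and not part of Hibernate Search, and the path passed in is assumed to be the value of hibernate.search.default.indexBase from the configuration. It verifies that the base directory exists, is a directory, and is writable, then lists the sub-directories where a per-index folder such as hit_idx would be expected to appear.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

/** Hypothetical pre-flight helper; not part of Hibernate Search. */
final class IndexBaseCheck {

    /** Returns a description of the problem with indexBase, or null if it looks usable. */
    static String problemWith(Path indexBase) {
        if (!Files.exists(indexBase)) {
            return "missing: " + indexBase;
        }
        if (!Files.isDirectory(indexBase)) {
            return "not a directory: " + indexBase;
        }
        if (!Files.isWritable(indexBase)) {
            return "not writable: " + indexBase;
        }
        return null;
    }

    /** Lists the names of the sub-directories directly under indexBase, sorted. */
    static List<String> indexDirectories(Path indexBase) throws IOException {
        List<String> names = new ArrayList<>();
        try (Stream<Path> children = Files.list(indexBase)) {
            children.filter(Files::isDirectory)
                    .forEach(p -> names.add(p.getFileName().toString()));
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the path used in the configuration posted in this thread.
        Path indexBase = Paths.get(args.length > 0 ? args[0] : "/opt/lucine");
        String problem = problemWith(indexBase);
        if (problem != null) {
            System.out.println(problem);
            return;
        }
        System.out.println("index directories: " + indexDirectories(indexBase));
    }
}
```

Running it after a committed insertion and finding no sub-directory for the index name would point at the configuration or mapping rather than at file permissions.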

_________________
Emmanuel


 Post subject:
PostPosted: Tue May 15, 2007 6:45 pm 
Newbie

Joined: Fri Apr 06, 2007 1:29 pm
Posts: 4
It did turn out to be a typo issue. I also assumed that the annotations would include all fields if I did not designate them, but that turns out to be incorrect and, looking back, not an especially intelligent assumption.

Currently everything works extremely well.

I am amazed at how well Hibernate and Hibernate Search work. I have learned a lot about both since my last posting. I do still have one concern: the possibility of the data in my database being changed by other systems. Currently this is not happening, but I worry a little about being tied to always using Hibernate for any data modification, since changes made elsewhere might skew the Lucene indexes. I would hate to tell my DBA that he cannot change data anymore.

Anyway, I am nearing production much faster and with much more powerful search / retrieval capabilities than I expected. A project I estimated at two months is nearing completion in just over one month, and search / retrieval is the same speed as my previous custom search system but does not take up gigabytes of memory. That reduces my cost of ownership by more than $700.00 USD per month per search server in my cluster, less $200.00 USD for additional fast disks in the RAID array to help out PostgreSQL and Lucene. Of course, most of the memory reduction is due to using PostgreSQL for storage as opposed to a complex in-memory data structure and some fancy caching, but even under heavy load the trio seems able to handle concurrent search / retrieval without significant spikes in memory usage, and especially without spikes that could cause disk thrashing.

All around, I am impressed with the quality of the combination of PostgreSQL, Hibernate3, and Hibernate Search. I do wish that the Schema Exporter worked a little better out of the box with PostgreSQL (the inability to create sequences, for example), but other than that I will be using PostgreSQL, Hibernate3, and Hibernate Search as core infrastructure for all my new projects as soon as I learn more about clustering the architecture.

Does anyone have any suggestions for books on clustering Hibernate3 and Hibernate Search? I currently use PgPool for PostgreSQL, but it does not support true replication. Is this going to be a problem moving forward for clustering the Lucene component? I worry about the DocumentId being different across pooled versions of my database. I think I can lock on a sequence generator across the pool, but even if that were to work, the generators could get out of sync after a failure and recovery, and there would be a limit on data insertion speed. I assume that, with all the usage these products get, someone has a good solution, but I am getting way off topic.

Anyway, thanks for the response, and I made sure to give you credit!


 Post subject:
PostPosted: Thu May 17, 2007 9:08 pm 
Hibernate Team

Joined: Sun Sep 14, 2003 3:54 am
Posts: 7256
Location: Paris, France
Thank you for the compliments, I'm glad Hibernate Search helped you.

You have several questions:
- What if another app changes my database?
There are 2 solutions. If the app raises an event, then you can call fullTextSession.index(thirdPartyUpdatedEntity).
If the app is silent, you will have to reindex periodically: either the whole entity index or, if you have some timestamp, only the updated entities. Once again, it's done through fullTextSession.index().
- Clustering
If you are talking about clustering Lucene through Hibernate Search but targeting one database instance, check the HSearch reference documentation; I describe a clustering architecture that scales fairly well.
If you want to cluster your database, there are 2 cases:
- it's transparent wrt your JDBC driver, so Hibernate and HSearch will be usable out of the box
- it targets different DB instances with no link to one another: Hibernate Shards might be worth a look. Hibernate Search is not yet compatible with Hibernate Shards, but it shouldn't be a major problem to port it.
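
The "reindex only the updated entities" option could look roughly like the following. This is a minimal sketch, not the Hibernate Search API: IncrementalReindexer is a hypothetical class, the Consumer passed in stands in for a call such as fullTextSession.index(entity), and it assumes every writer (including the external system) updates the entity's timestamp when it changes a row.

```java
import java.util.Date;
import java.util.function.Consumer;
import java.util.function.Function;

/** Hypothetical sketch of timestamp-based reindexing; all names are illustrative. */
final class IncrementalReindexer<T> {

    private final Function<T, Date> timestampOf; // e.g. Hit::getTimestamp
    private final Consumer<T> indexer;           // stand-in for fullTextSession.index(entity)
    private Date lastRun;

    IncrementalReindexer(Function<T, Date> timestampOf, Consumer<T> indexer, Date lastRun) {
        this.timestampOf = timestampOf;
        this.indexer = indexer;
        this.lastRun = lastRun;
    }

    /**
     * Indexes every candidate modified at or after the previous run and
     * records `now` as the new cut-off. Returns the number of entities indexed.
     */
    int reindex(Iterable<T> candidates, Date now) {
        int count = 0;
        for (T entity : candidates) {
            Date modified = timestampOf.apply(entity);
            // Include the boundary: indexing the same entity twice is harmless,
            // missing an update is not.
            if (modified != null && !modified.before(lastRun)) {
                indexer.accept(entity);
                count++;
            }
        }
        lastRun = now;
        return count;
    }
}
```

In a real job the candidates would normally come from a query that already filters on the timestamp column, so that unchanged rows are never loaded; the in-memory filter here only keeps the sketch self-contained.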

_________________
Emmanuel


© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.