-->

All times are UTC - 5 hours [ DST ]



 Post subject: FullTextQuery not filling POJO instance with indexed fields
Posted: Mon Nov 23, 2009 1:32 pm
Newbie

Joined: Mon Nov 23, 2009 1:17 pm
Posts: 5
I've tried searching online and through these forums, but I was unable to find a solution to my problem. Please forgive me if this is a trivial question!

I have objects called "Pages" whose metadata is stored in SQL Server and whose payload fields are indexed in Lucene. A "Bucket" is essentially a set of pages, or a topic. The class is annotated as follows:
Code:
@GenericGenerator(
   name="BigIntGenerator",
   strategy="package.BigIntGenerator"
)
@Entity
@Indexed(index="Pages")
@Table(name="Pages")
public class Page implements Cloneable {
   public enum PageType {
      TRAINING, CRAWLED, TRUE_POSITIVE, FALSE_POSITIVE, NO_CONTENT;
      public String getName()  { return name(); }
      public String getLower() { return name().toLowerCase(); }
   }
   
   @Id @GeneratedValue(strategy=GenerationType.AUTO, generator="BigIntGenerator")
   @Column(name="PageID", nullable=false)
   private BigInteger id;
   
   @Column(name="PageType", nullable=false)
   @Enumerated(EnumType.ORDINAL)
   @Field(index=Index.UN_TOKENIZED, store=Store.YES)
   private PageType type = PageType.TRAINING;
   
   @Column(name="URL")
   @Field(index=Index.UN_TOKENIZED, store=Store.YES)
   private String url;
   
   @Column(name="FetchTime")
   @Temporal(TemporalType.TIMESTAMP)
   private Date fetchTime;
   
   @Column(name="FetchResult")
   private Integer fetchResult;
   
   @Column(name="ContentType")
   private String contentType;
   
   @Transient
   @Field(index=Index.UN_TOKENIZED, store=Store.YES)
   @FieldBridge(impl=PrintBridge.class)
   private byte[] print;

   @Column(name="Confidence")
   private float confidence;
   
   @ManyToOne
   @JoinColumn(name="Bucket")
   @Field(index=Index.UN_TOKENIZED, store=Store.YES)
   @FieldBridge(impl=BucketBridge.class)
   private Bucket bucket;

   @Lob
   @Column(name="FeatureVector")
   @Fetch(FetchMode.SELECT)
   private TLongIntHashMap features;

   @Transient
   @Field(index=Index.TOKENIZED, store=Store.COMPRESS)
   private String text;
   
   @Transient
   @Field(index=Index.TOKENIZED, store=Store.YES)
   private String title;
   
   @Transient
   private String[] out;

        ....


I have marked the last 3 fields as "@Transient" so that they exist only in the index. My first question: am I using Hibernate Search improperly? Should I instead store everything in SQL Server and mark every field as store=Store.NO? I set it up this way because about 4 million pages won't store or index plain text and need only a feature vector (the "features" field), whereas the rest of the pages need both features and plain text. I didn't want to add columns to my DB schema that not all pages would use.

I load all the pages in a Bucket with the following code:
Code:
      FullTextSession s = HibernateUtil.newSession();
      
      BooleanQuery q = new BooleanQuery();
      q.add(new TermQuery(new Term("bucket", bucket.getId())), Occur.MUST);

      FullTextQuery ftq = s.createFullTextQuery(q, Page.class);
      List<?> l = ftq.list();
      HibernateUtil.endSession(s);
      
      for (Object o : l) bucket.add((Page) o);


My second question: when I run the above code, the Page instances have only the fields backed by SQL Server initialized; the indexed fields are null. When I use projection, the indexed fields are successfully fetched from the index, but I'm trying to avoid projection in cases where I need the fully initialized POJO. Thanks in advance for your help.


 Post subject: Re: FullTextQuery not filling POJO instance with indexed fields
Posted: Mon Nov 23, 2009 2:32 pm
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Hi dimeo,
you can use a Lucene index to store information, but the recommended way is to store it in a reliable database and always keep the option of rebuilding the index from the database.

Every time you load a managed entity, Search helps identify which primary keys are relevant, but the POJO is created from the database fields, so your @Transient fields will never hold a value: Hibernate is in charge of initializing the object, and it does not read from the index.
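
To make the two phases concrete, here is a toy model of that loading behaviour in plain Java. It is only a sketch under stated assumptions: TwoPhaseLoadSketch, loadByIds, and the trimmed-down Page are hypothetical stand-ins rather than Hibernate Search classes, the list of ids plays the role of the index hits, and the map plays the role of the database table.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins; none of these names come from Hibernate or Hibernate Search.
class TwoPhaseLoadSketch {

    // Trimmed-down entity: 'url' is a mapped column, 'text' plays the role of a @Transient field.
    static class Page {
        final long id;
        final String url;
        String text; // the loader below never assigns this

        Page(long id, String url) {
            this.id = id;
            this.url = url;
        }
    }

    // Phase 1 (not shown) yields the matching primary keys from the index.
    // Phase 2 builds each entity from database columns only, in hit order.
    static List<Page> loadByIds(List<Long> idsFromIndex, Map<Long, String> urlColumnById) {
        List<Page> pages = new ArrayList<>();
        for (Long id : idsFromIndex) {
            pages.add(new Page(id, urlColumnById.get(id)));
        }
        return pages;
    }
}
```

Nothing in phase 2 consults stored index values, which is why a field that exists only in the index comes back null on the loaded object.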

Using projections is a good performance optimization to show previews of your results, but it can't be used to fully initialize a managed entity: this wouldn't be as safe.

In case you really don't want to add a column to the schema, and you trust Lucene's index as a store, you could combine standard object loading with projection and then copy the projected values onto the loaded objects. I wouldn't recommend it, as I'm always afraid of my index getting corrupted or not being backed up properly.
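
A rough sketch of that combination, for illustration only. It assumes the rows come from a projection such as ftq.setProjection(FullTextQuery.THIS, "text", "title"), so that each element of ftq.list() is an Object[] holding the managed entity followed by the stored field values. The Page class here is a trimmed stand-in for the real entity, and ProjectionMergeSketch and applyProjectedValues are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the Page class is a trimmed stand-in for the real entity.
class ProjectionMergeSketch {

    static class Page {
        String text;  // exists only in the index
        String title; // exists only in the index
    }

    // Each row is assumed to be { entity, text, title }, matching the projection order.
    static List<Page> applyProjectedValues(List<?> rows) {
        List<Page> pages = new ArrayList<>();
        for (Object row : rows) {
            Object[] columns = (Object[]) row;
            Page page = (Page) columns[0];    // entity loaded from the database
            page.text = (String) columns[1];  // value read back from the index
            page.title = (String) columns[2]; // value read back from the index
            pages.add(page);
        }
        return pages;
    }
}
```

The column order in each row has to match the order of the names passed to the projection, so the helper and the query need to be kept in step.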

_________________
Sanne
http://in.relation.to/


 Post subject: Re: FullTextQuery not filling POJO instance with indexed fields
Posted: Mon Nov 23, 2009 3:49 pm
Newbie

Joined: Mon Nov 23, 2009 1:17 pm
Posts: 5
Thank you very much for your prompt reply and the "under the hood" explanation. I guess the solution is to just do it the right way and store everything in the DB... hopefully my manager won't be mad at me for yet another architectural change!

