All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 15 posts ] 
Post subject: MassIndexing, example from the book
PostPosted: Tue Feb 23, 2010 11:51 am
Regular

Joined: Thu Nov 26, 2009 8:45 am
Posts: 78
Hi,

I'm currently trying to get my tool ready for indexing large numbers of entries, so I took a look at the book, especially chapter 9, and found the following:

Code:
Criteria query = session.createCriteria( Item.class )
    .setFetchMode("distributor", FetchMode.JOIN)
    .setResultTransformer(CriteriaSpecification.DISTINCT_ROOT_ENTITY)
    .setCacheMode(CacheMode.IGNORE)
    .setFetchSize(FETCH_SIZE)
    .setFlushMode(FlushMode.MANUAL);
ScrollableResults scroll = query.scroll(ScrollMode.FORWARD_ONLY);
int batch = 0;
scroll.beforeFirst();


When I try to execute this, I get the following exception for the last line:

Code:
ERROR org.hibernate.util.JDBCExceptionReporter [RMI TCP Connection(4)-*.*.*.*] - Der angeforderte Vorgang wird für nur vorwärts gerichtete ResultSets nicht unterstützt.

Sorry for the German; it translates to "The requested operation is not supported for forward-only result sets", i.e. the method beforeFirst() is not supported by result sets that are FORWARD_ONLY.
Is this a fault on my side?


Post subject: Re: MassIndexing, example from the book
PostPosted: Wed Feb 24, 2010 4:49 am
Pro

Joined: Wed Oct 03, 2007 2:31 pm
Posts: 205
Hi

You should try the approach from the online documentation (I've used it before with no problems):

http://docs.jboss.org/hibernate/stable/ ... batchindex

The other thing worth mentioning is the MassIndexer API, which is described here:

http://relation.to/Bloggers/Sanne

Cheers
Amin


Post subject: Re: MassIndexing, example from the book
PostPosted: Wed Feb 24, 2010 9:43 am
Regular

Joined: Thu Nov 26, 2009 8:45 am
Posts: 78
Thanks, but I have to use Hibernate Search 3.0.1, so I cannot use the MassIndexer, which, as described in the blog, was introduced with Hibernate Search 3.2.

I'll give the online documentation a try, but here is what I was wondering about:

How can we call a method named beforeFirst() ("Go to a location just before first result (this is the initial location)") on a scrollable result set that only allows scrolling in the forward direction? "before First" means to me, that we HAVE TO go back (which we can't, by definition) if the current position isn't already the initial one. Isn't this a paradox?

(And that's what the exception is trying to tell me as well.)


Post subject: Re: MassIndexing, example from the book
PostPosted: Thu Feb 25, 2010 4:08 am
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Thanks for reporting; this statement was tested and actually used in production, so I guess different JDBC implementations have a different notion of the starting point. Which database and driver are you using?
It's not legal to call beforeFirst() after any next(), but I just verified on MySQL that it's fine to execute it as the first statement.

Quote:
"before First" means to me, that we HAVE TO go back

That depends, of course, on where you consider the cursor to be at the start.

_________________
Sanne
http://in.relation.to/
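
A note on the JDBC side of this exchange: the java.sql.ResultSet documentation says a result set cursor starts before the first row, and that beforeFirst() throws SQLException when the result set type is TYPE_FORWARD_ONLY. Drivers apparently differ in how strictly they enforce that on a fresh cursor. A minimal sketch of a guard in plain JDBC follows; the class and method names are made up for illustration, and how Hibernate's ScrollableResults maps onto this call is an assumption here.

```java
import java.sql.ResultSet;
import java.sql.SQLException;

final class CursorRewind {
    private CursorRewind() {}

    /**
     * Calls beforeFirst() only when the result set is scrollable.
     * Returns true if the cursor was actually repositioned.
     */
    static boolean rewindIfScrollable(ResultSet rs) throws SQLException {
        if (rs.getType() == ResultSet.TYPE_FORWARD_ONLY) {
            // A new forward-only result set already starts before the first row,
            // so the call is skipped and the driver has nothing to reject.
            return false;
        }
        rs.beforeFirst();
        return true;
    }
}
```

With a forward-only scroll, the simpler option is to drop the beforeFirst() call entirely and start straight with next().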


Post subject: Re: MassIndexing, example from the book
PostPosted: Thu Feb 25, 2010 4:47 am
Regular

Joined: Thu Nov 26, 2009 8:45 am
Posts: 78
Hm,

I'm using MS SQL Server with the JDBC driver com.microsoft.sqlserver.jdbc.SQLServerDriver.

It would be sick if it's different in different JDBC implementations.


Post subject: Re: MassIndexing, example from the book
PostPosted: Thu Feb 25, 2010 5:49 am
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Quote:
It would be sick if it's different in different JDBC implementations.

Totally agree, but unfortunately that's often the case.

_________________
Sanne
http://in.relation.to/


 
 Post subject: Re: MassIndexing, example from the book
PostPosted: Tue Apr 20, 2010 12:31 am 
Newbie

Joined: Mon Apr 19, 2010 11:55 pm
Posts: 5
The ScrollMode.FORWARD_ONLY attribute does not allow the result set to call scroll.beforeFirst().

The following code does work for me. I have indexed over 3 million records and did not find any issues other than the time it takes. (Seam project)

Please make sure you put the following line in your persistence.xml file:
<property name="hibernate.search.worker.batch_size" value="1000"/>

The hibernate.search.worker.batch_size value should be the same as the batch size you use in your class.





Code:
import javax.persistence.EntityManager;

import org.dms.epop.crs.birth.normal.model.BirthDeclarations;
import org.hibernate.CacheMode;
import org.hibernate.FlushMode;
import org.hibernate.HibernateException;
import org.hibernate.ScrollMode;
import org.hibernate.ScrollableResults;
import org.hibernate.Session;
import org.hibernate.Transaction;
import org.hibernate.search.FullTextSession;
import org.hibernate.search.Search;
import org.hibernate.search.SearchException;
import org.jboss.seam.annotations.In;
import org.jboss.seam.annotations.Name;

@Name("indexAction")
public class IndexAction {

    @In
    protected EntityManager entityManager;

    private static final int BATCH_SIZE = 1000;
    private static final int FETCH_SIZE = 10; // declared but never used below

    public void index() {
        indexItems();
    }

    private FullTextSession getFullTextSession() {
        Session session = (Session) entityManager.getDelegate();
        return Search.createFullTextSession(session);
    }

    public void indexItems() {
        FullTextSession session = getFullTextSession();
        Transaction tx = null;
        try {
            session.setFlushMode(FlushMode.MANUAL); // disable flush operations
            session.setCacheMode(CacheMode.IGNORE); // disable 2nd level cache operations
            tx = session.beginTransaction();

            // read the data from the database;
            // scrollable results avoid loading too many objects in memory
            ScrollableResults results = session.createCriteria(BirthDeclarations.class)
                    .scroll(ScrollMode.FORWARD_ONLY); // ensure forward-only result set

            int index = 0;
            while (results.next()) {
                index++;
                session.index(results.get(0)); // index each element
                if (index % BATCH_SIZE == 0) {
                    // clear the session, releasing memory
                    // (no need to call flush in Hibernate Search 3.0.1)
                    session.clear();
                    System.out.println("...indexing BirthDeclarations.class: " + index);
                }
            }

            // commit the remaining index changes, then release the cursor
            tx.commit();
            results.close();
        } catch (HibernateException e) {
            rollbackIfNeeded(tx);
            throw e;
        } catch (SearchException e) {
            rollbackIfNeeded(tx);
            throw e;
        } finally {
            session.close();
        }
    }

    private void rollbackIfNeeded(Transaction tx) {
        if (tx != null && tx.isActive()) {
            tx.rollback();
        }
    }
}


Post subject: Re: MassIndexing, example from the book
PostPosted: Tue Apr 20, 2010 8:22 am
Regular

Joined: Thu Nov 26, 2009 8:45 am
Posts: 78
Thanks for replying. I tried that, but my problem now is that I'm not able to create that criteria. I have several hundred thousand entries in my DB, and the JVM has 512 MB of heap space. I always get an out-of-memory error when trying to create that criteria.

By the way, what about that FETCH_SIZE int? Just a copy & paste issue? I didn't see it used in the code.


 
 Post subject: Re: MassIndexing, example from the book
PostPosted: Wed Apr 21, 2010 1:08 am 
Newbie

Joined: Mon Apr 19, 2010 11:55 pm
Posts: 5
Hi,
The scroll method does not load the entire result set into memory. I am using this code to index more than 3 million records. While the program is running, check the size of the index folder. If the folder size stays the same, your indexes are not being updated and the entities remain in memory.
Please double-check whether you have applied the settings below.
Cheers
Indika



%JBOSS_HOME%\bin\run.bat

rem JVM memory allocation pool parameters. Modify as appropriate.
set JAVA_OPTS=%JAVA_OPTS% -Xms256m -Xmx512m -XX:PermSize=256m -XX:MaxPermSize=512m


Also increase the “TransactionTimeout” value in %JBOSS_HOME%\server\production\conf\jboss-service.xml from 300 to 300000.


Also increase the “connectionTimeout” value in
%JBOSS_HOME%\server\production\deploy\jboss-web.deployer\server.xml to 30000 (depending on your data volume).


 
 Post subject: Re: MassIndexing, example from the book
PostPosted: Wed Apr 21, 2010 1:10 am 
Newbie

Joined: Mon Apr 19, 2010 11:55 pm
Posts: 5
And just ignore the FETCH_SIZE value.

Good luck
Indika


Post subject: Re: MassIndexing, example from the book
PostPosted: Wed Apr 21, 2010 2:00 am
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
It seems to me that there's something wrong with the cursor usage. I know that with MySQL you need to add some special parameters to the JDBC connection properties to enable cursors properly; otherwise its driver will fetch all data into memory at once instead of streaming it.
Did you enable cursors in the Hibernate configuration? The MSSQL driver also needs some special care; I'd recommend using the jTDS driver for MS SQL Server.

_________________
Sanne
http://in.relation.to/
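
The connection properties hinted at here can be sketched as JDBC URL suffixes. As far as I know, the Microsoft SQL Server driver accepts selectMethod=cursor, and MySQL Connector/J accepts useCursorFetch=true together with defaultFetchSize. The helper class, host names and database names below are hypothetical, and the right values depend on the driver version, so treat this as a sketch rather than a recommended configuration.

```java
final class CursorUrls {
    private CursorUrls() {}

    /** Appends selectMethod=cursor to a SQL Server URL (properties are ';' separated). */
    static String withSqlServerCursor(String baseUrl) {
        String separator = baseUrl.endsWith(";") ? "" : ";";
        return baseUrl + separator + "selectMethod=cursor";
    }

    /** Appends the cursor-fetch properties to a MySQL URL (properties follow '?', joined by '&'). */
    static String withMySqlCursorFetch(String baseUrl, int fetchSize) {
        if (fetchSize <= 0) {
            throw new IllegalArgumentException("fetchSize must be positive");
        }
        char separator = baseUrl.contains("?") ? '&' : '?';
        return baseUrl + separator + "useCursorFetch=true&defaultFetchSize=" + fetchSize;
    }
}
```

The resulting string would go wherever the connection URL is configured, for example the hibernate.connection.url property.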


Post subject: Re: MassIndexing, example from the book
PostPosted: Wed Apr 21, 2010 3:31 am
Regular

Joined: Thu Nov 26, 2009 8:45 am
Posts: 78
Thanks for your remarks, indikak, but I don't use JBoss here. I think it's about this cursor thing.
I finally managed to keep the memory (on the indexing machine) at around 200 MB for the Java process by doing it like this:
Code:
         int PAGESIZE = 100;
         int currentElement = 0;
         Criteria crit = dbSession.createCriteria(Attribute.class);
         crit.setFirstResult(currentElement);
         crit.setMaxResults(PAGESIZE);
         List<Attribute> results = crit.list();
         while (results.size() > 0) {
            fSession.beginTransaction();
            for (Attribute r : results) {
               fSession.index(r);
            }
            fSession.getTransaction().commit();
            fSession.clear();
            results.clear();
            crit = dbSession.createCriteria(Attribute.class);
            crit.setFirstResult(currentElement + PAGESIZE);
            currentElement += PAGESIZE;
            crit.setMaxResults(PAGESIZE);
            results = crit.list();
         }

I tried to index my test DB with around 700k entries last night with this code, but when I returned to the machine it had indexed just 380k entries in 13 hours. That's totally unacceptable, or is it?
The processor load on the DB host was at 100%, which isn't acceptable either, since other projects use this host.

Is it possible to create a new FullTextSession on every iteration?
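
One possible explanation for the slowdown and the database load is the paging itself. If the database (or the driver) has to walk past every skipped row for each setFirstResult() offset, the total work grows roughly with the square of the table size. That is an assumption about this setup, not something measured in this thread. The small calculation below only illustrates the arithmetic; the class name is made up.

```java
final class OffsetPagingCost {
    private OffsetPagingCost() {}

    /**
     * Sums the offsets of all non-empty pages, i.e. the number of rows that
     * would be skipped in total if every page query walks past its offset.
     */
    static long totalSkippedRows(long totalRows, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        long skipped = 0;
        for (long offset = 0; offset < totalRows; offset += pageSize) {
            skipped += offset; // rows passed over before this page starts
        }
        return skipped;
    }
}
```

For 700,000 rows and a page size of 100, totalSkippedRows(700_000, 100) comes to 2,449,650,000 skipped rows over 7,000 page queries, compared with 700,000 rows read once by a single forward scroll.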


Post subject: Re: MassIndexing, example from the book
PostPosted: Wed Apr 21, 2010 5:56 am
Regular

Joined: Thu Nov 26, 2009 8:45 am
Posts: 78
OK, I got the cursors working (by extending the connection URL in hibernate.cfg with "selectMethod=cursor;"). Now the result set is generated on the server. Nice one. The first attempt seemed to work, but I got another error (which is outside the scope of Hibernate Search), so I had to terminate. Now, when restarting the index method, I get:
Code:
org.hibernate.exception.GenericJDBCException: could not advance using next()


while doing this:
Code:
Criteria crit = dbSession.createCriteria(Attribute.class);
ScrollableResults attributeScroll = crit.scroll(ScrollMode.FORWARD_ONLY);
while (attributeScroll.next()){
   ...


Edit: the first iteration works, the second fails, and I don't know why.


Post subject: Re: MassIndexing, example from the book
PostPosted: Wed Apr 21, 2010 9:57 am
Regular

Joined: Thu Nov 26, 2009 8:45 am
Posts: 78
Is the following code closing the ScrollableResults in any way?

Code:
Criteria crit = dbSession.createCriteria(Attribute.class);
crit.setCacheable(false);
crit.setCacheMode(CacheMode.IGNORE);
ScrollableResults attributeScroll = crit.scroll(ScrollMode.FORWARD_ONLY);
while (attributeScroll.next()) {
  fSession.beginTransaction();             
  Object[] o = attributeScroll.get();
  Attribute a = (Attribute) o[0];
  fSession.index(a);
  fSession.getTransaction().commit();
  dbSession.clear();
}


If I comment out the calls on fSession, the loop works fine until the result set returns no more elements. Otherwise it fails after one iteration, as described before ("could not advance using next()", caused by a closed result set).


Post subject: Re: MassIndexing, example from the book
PostPosted: Wed Apr 21, 2010 10:11 am
Regular

Joined: Thu Nov 26, 2009 8:45 am
Posts: 78
The linked documentation tells me: "Flush the associated Session and end the unit of work."

Unit of work = session?

I don't want the FullTextSession object to grow that big in my local machine's memory. How can I avoid that?
clear() won't help me. Obviously I have to keep the session open until I have iterated to the end of the result set, so I can't open and close the FullTextSession every X iterations inside the loop(?).
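
One way to combine this with the earlier IndexAction example in this thread is to keep a single transaction open for the whole scroll, clear the session every BATCH_SIZE elements, and commit only once the cursor is exhausted. The sketch below models just that control flow in plain Java. IndexingSession is a hypothetical stand-in for the FullTextSession calls, not a Hibernate type, and whether this keeps memory flat in Hibernate Search 3.0.1 depends on the hibernate.search.worker.batch_size setting mentioned earlier.

```java
import java.util.Iterator;

/** Hypothetical stand-in for the FullTextSession calls used in this thread. */
interface IndexingSession {
    void index(Object entity);
    void clear();
    void commit();
}

final class BatchIndexer {
    private BatchIndexer() {}

    /**
     * Indexes every element from the cursor inside one unit of work,
     * clearing the session every batchSize elements. Returns the element count.
     */
    static int indexAll(Iterator<?> cursor, IndexingSession session, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int count = 0;
        while (cursor.hasNext()) {
            session.index(cursor.next());
            count++;
            if (count % batchSize == 0) {
                session.clear(); // release per-batch memory; no commit, so the cursor stays open
            }
        }
        session.commit(); // single commit after the cursor is exhausted
        return count;
    }
}
```

The point of the design is that nothing inside the loop ends the transaction, so nothing gets a chance to close the result set between two next() calls.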

