Gah, I remember the issue now!
If you're just doing a LIMIT with a sort then everything you say is true, and pulling back the contents of the entire table is completely foolish. However, if you're doing a LIMIT with a sort
and setting a value for firstResult then we have no choice but to pull back the entire contents of the table and do the merge in memory.
Suppose you have a PERSON table with a column called FIRST_NAME:
Shard 1
ALEX
BILL
CARL
DAVE
Shard 2
BOB
FRED
GARY
HENRY
Now suppose you want to issue a query that returns all users, ordered by FIRST_NAME, starting with the second result, with a limit of 2 results.
Code:
Criteria crit = session.createCriteria(Person.class).setFirstResult(2).setMaxResults(2).addOrder(Order.asc("firstName"))list();
What answer would you expect? Well, Shards tries to make your data look like it all lives in the same place, and if all this data was in the same place the result of the select and the order by would be this:
ALEX
BILL
BOB
CARL
DAVE
FRED
GARY
HENRY
We'd then start with the result at index 2 (BOB) and return 2 results: BOB and CARL.
Now suppose we apply the first result and max result constraints
before we return data to the application. From the first Shard we would get:
CARL
DAVE
And from the second Shard we would get:
GARY
HENRY
And from there, well, I'm not even sure what we would do to merge those results, but one thing is for certain, we're not going to be able to return BOB and CARL because BOB isn't even part of the result set.
I've thought through this pretty extensively just now and I
think I buy it. Do you? :-)
Max