I'm running HS 5.8 Beta3 pointed at ElasticSearch 5.4.1. I want to be able to search and sort on certain string fields. I noticed that Beta3 introduced the notion of normalizers (http://in.relation.to/2017/06/13/hibern ... 8-0-Beta3/), but I'm not clear on the appropriate way to implement this for a standard string field. While I've seen some success with ElasticSearch 2.4.4, I haven't been able to make sorting work with ES 5.4.1.
"In Elasticsearch version 5.2 and above, a normalizer will be translated to a native Elasticsearch normalizer, and a text field with a normalizer will take the keyword datatype."
That being the case, what is the appropriate way of annotating the field to achieve that result?
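To make the target concrete: as far as I understand the Elasticsearch 5.2+ documentation, a keyword field backed by a native normalizer is declared roughly as in the index definition below. This is only a sketch of the result I would expect to see on the Elasticsearch side. The normalizer name "lowercase_normalizer", the type name "contact", and the choice of filters are placeholders of mine, not something I know Hibernate Search to generate.

```json
{
  "settings": {
    "analysis": {
      "normalizer": {
        "lowercase_normalizer": {
          "type": "custom",
          "filter": ["lowercase", "asciifolding"]
        }
      }
    }
  },
  "mappings": {
    "contact": {
      "properties": {
        "lastName": {
          "type": "keyword",
          "normalizer": "lowercase_normalizer"
        }
      }
    }
  }
}
```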
Background on what I've tried thus far. I've tried the following sorts of annotations:
Code:
@Fields({
    @Field,
    @Field(name = "sort_firstName", analyze = Analyze.NO)
})
@SortableField(forField = "sort_firstName")
private String firstName;

// Not clear on how to create a standard normalizer. I'm pretty sure this is
// incorrect and a normalizer implementation needs to be specified.
@Field(normalizer = @Normalizer)
@SortableField
private String lastName;

@Field
private String email;
When I try to sort on sort_firstName I get the following back from ElasticSearch:
Status: 400 Bad Request
Error message: {"root_cause":[{"type":"illegal_argument_exception","reason":"Fielddata is disabled on text fields by default. Set fielddata=true on [sort_firstName] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"com.rossvideo.inception.model.contact.contact","node":"tMVoMLRNRNGSq0InPhVaFg","reason":{"type":"illegal_argument_exception","reason":"Fielddata is disabled on text fields by default. Set fielddata=true on [sort_firstName] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."}}]}
Cluster name: null
Cluster status: 400
Regardless of what I do, each field looks like this when I show the mappings for the index:
"email": {
    "type": "text",
    "fields": {
        "keyword": {
            "type": "keyword",
            "ignore_above": 256
        }
    }
}
So a "keyword" sub-field is created implicitly (regardless of the annotation I add to the field), presumably for the very purpose of sorting. However, if I try to sort on the keyword field, I get the following:
Cannot automatically determine the field type for field 'lastName.keyword'. Use byField(String, Sort.Type) to provide the sort type explicitly.
at org.hibernate.search.query.dsl.sort.impl.SortFieldStates.getCurrentSortFieldTypeFromMetamodel(SortFieldStates.java:177)
at org.hibernate.search.query.dsl.sort.impl.SortFieldStates.determineCurrentSortFieldTypeAutomaticaly(SortFieldStates.java:150)
I tried using byField("lastName.keyword", Sort.INDEXORDER), but it won't build, apparently because that version of the method is deprecated.