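Although the official site stresses that facets will be removed from Elasticsearch in a future release and recommends aggregations instead, I still use facets at work. While reading "Mastering Elasticsearch" I came across its introduction to facets, which is practical and easy to follow, so I translated part of it here for anyone who needs it.
When using the Elasticsearch faceting mechanism, keep one thing in mind: facet results are computed only on the query results. If you include a filter outside the query object, that filter does not restrict the documents the facets are computed on.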
Let's look at an example.
First, use the following commands to index a few documents into the books index:
curl -XPUT 'localhost:9200/books/book/1' -d '{
"id":"1", "title":"Test book 1", "category":"book",
"price":29.99
}'
curl -XPUT 'localhost:9200/books/book/2' -d '{
"id":"2", "title":"Test book 2", "category":"book",
"price":39.99
}'
curl -XPUT 'localhost:9200/books/book/3' -d '{
"id":"3", "title":"Test comic 1","category":"comic",
"price":11.99
}'
curl -XPUT 'localhost:9200/books/book/4' -d '{
"id":"4", "title":"Test comic 2","category":"comic",
"price":15.99
}'
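Now let's see how faceting works when a query and a filter are used together. We will run a simple query that returns every document in the books index. We also include a filter that restricts the results to the book category, plus a range facet on the price field to see how many documents are priced below 30 and how many at 30 or above. The whole request looks like this: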
{
"query": {
"match_all": {}
},
"filter": {
"term": {
"category": "book"
}
},
"facets": {
"price": {
"range": {
"field": "price",
"ranges": [
{
"to": 30
},
{
"from": 30
}
]
}
}
}
}
Running it returns the following result:
{
…
"hits":{
"total":2,
"max_score": 1.0,
"hits": [
{
"_index": "books",
"_type": "book",
"_id": "1",
"_score": 1.0,
"_source": {
"id": "1",
"title": "Test book 1",
"category": "book",
"price": 29.99
}
},
{
"_index": "books",
"_type": "book",
"_id": "2",
"_score": 1.0,
"_source": {
"id": "2",
"title": "Test book 2",
"category": "book",
"price": 39.99
}
}
]
},
"facets": {
"price": {
"_type": "range",
"ranges": [
{
"to": 30.0,
"count": 3,
"min": 11.99,
"max": 29.99,
"total_count": 3,
"total": 57.97,
"mean": 19.323333333333334
},
{
"from": 30.0,
"count": 1,
"min": 39.99,
"max": 39.99,
"total_count": 1,
"total": 39.99,
"mean": 39.99
}
]
}
}
}
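As the result shows, although the filter keeps only documents whose category field is book, the facet is not computed on just those documents. It is computed on every document in the books index, because the query is match_all. In other words, the faceting mechanism ignores the top-level filter when it calculates. But what if the filter is part of the query, as in a filtered query? Here is the next example: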
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"term": {
"category": "book"
}
}
}
},
"facets": {
"price": {
"range": {
"field": "price",
"ranges": [
{
"to": 30
},
{
"from": 30
}
]
}
}
}
}
The result:
{
...
"hits":{
"total": 2,
"max_score": 1.0,
"hits": [
{
"_index": "books",
"_type": "book",
"_id": "1",
"_score": 1.0,
"_source": {
"id": "1",
"title": "Test book 1",
"category": "book",
"price": 29.99
}
},
{
"_index": "books",
"_type": "book",
"_id": "2",
"_score": 1.0,
"_source": {
"id": "2",
"title": "Test book 2",
"category": "book",
"price": 39.99
}
}
]
},
"facets": {
"price": {
"_type": "range",
"ranges": [
{
"to": 30.0,
"count": 1,
"min": 29.99,
"max": 29.99,
"total_count": 1,
"total": 29.99,
"mean": 29.99
},
{
"from": 30.0,
"count": 1,
"min": 39.99,
"max": 39.99,
"total_count": 1,
"total": 39.99,
"mean": 39.99
}
]
}
}
}
As this result shows, the filter now limits the set of documents the facet is computed on.
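The difference between the two requests can be reproduced with a small in-memory model. This is only a sketch in plain Python, not Elasticsearch client code: the function names (`top_level_filter_search`, `filtered_query_search`, `range_counts`) are made up for illustration, and the range facet is reduced to two counts, assuming `to` is exclusive and `from` is inclusive.

```python
# In-memory stand-ins for the four documents indexed earlier.
DOCS = [
    {"id": "1", "title": "Test book 1", "category": "book", "price": 29.99},
    {"id": "2", "title": "Test book 2", "category": "book", "price": 39.99},
    {"id": "3", "title": "Test comic 1", "category": "comic", "price": 11.99},
    {"id": "4", "title": "Test comic 2", "category": "comic", "price": 15.99},
]


def range_counts(docs, boundary=30):
    """Simplified range facet: count docs below / at or above the boundary."""
    below = sum(1 for d in docs if d["price"] < boundary)
    return {"below": below, "at_or_above": len(docs) - below}


def is_book(doc):
    """Stand-in for the term filter on category."""
    return doc["category"] == "book"


def top_level_filter_search(docs):
    """Filter placed next to the query: it narrows the hits, not the facet."""
    query_scope = list(docs)                       # match_all
    facet = range_counts(query_scope)              # facet sees the query scope
    hits = [d for d in query_scope if is_book(d)]  # filter applied to hits only
    return hits, facet


def filtered_query_search(docs):
    """Filter wrapped inside the query: it narrows hits and facet alike."""
    query_scope = [d for d in docs if is_book(d)]
    return query_scope, range_counts(query_scope)


hits_a, facet_a = top_level_filter_search(DOCS)
hits_b, facet_b = filtered_query_search(DOCS)
print(len(hits_a), facet_a)  # 2 {'below': 3, 'at_or_above': 1}
print(len(hits_b), facet_b)  # 2 {'below': 1, 'at_or_above': 1}
```

Both variants return the same two hits; only the facet counts differ, which matches the two responses shown above.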
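Now imagine we want to compute the facet only for books whose title field contains "2". We could add a second filter to the query, but that would also narrow the query results, which is not what we want. What we need instead is a facet filter.
Place facet_filter at the same level as the facet type definition; it lets us restrict the documents the facet is computed on. For example, to limit the facet calculation to documents whose title field contains "2", the Elasticsearch request can be changed to: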
{
"query": {
"filtered": {
"query": {
"match_all": {
}
},
"filter": {
"term": {
"category": "book"
}
}
}
},
"facets": {
"price": {
"range": {
"field": "price",
"ranges": [
{
"to": 30
},
{
"from": 30
}
]
},
"facet_filter": {
"term": {
"title": "2"
}
}
}
}
}
The result:
{
...
"hits":{
"total":2,
"max_score": 1.0,
"hits": [
{
"_index": "books",
"_type": "book",
"_id": "1",
"_score": 1.0,
"_source": {
"id": "1",
"title": "Test book 1",
"category": "book",
"price": 29.99
}
},
{
"_index": "books",
"_type": "book",
"_id": "2",
"_score": 1.0,
"_source": {
"id": "2",
"title": "Test book 2",
"category": "book",
"price": 39.99
}
}
]
},
"facets": {
"price": {
"_type": "range",
"ranges": [
{
"to": 30.0,
"count": 0,
"total_count": 0,
"total": 0.0,
"mean": 0.0
},
{
"from": 30.0,
"count": 1,
"min": 39.99,
"max": 39.99,
"total_count": 1,
"total": 39.99,
"mean": 39.99
}
]
}
}
}
As shown above, the facet is now restricted to a single document, while the query results are unchanged.
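The effect of facet_filter can be modelled the same way. The sketch below is a toy model in plain Python with hypothetical function names, not real client code. It assumes the title field is analyzed into lowercase, whitespace-separated tokens, which is a rough approximation of what a term filter on an analyzed field matches against.

```python
# In-memory stand-ins for the four documents indexed earlier.
DOCS = [
    {"id": "1", "title": "Test book 1", "category": "book", "price": 29.99},
    {"id": "2", "title": "Test book 2", "category": "book", "price": 39.99},
    {"id": "3", "title": "Test comic 1", "category": "comic", "price": 11.99},
    {"id": "4", "title": "Test comic 2", "category": "comic", "price": 15.99},
]


def range_counts(docs, boundary=30):
    """Simplified range facet: count docs below / at or above the boundary."""
    below = sum(1 for d in docs if d["price"] < boundary)
    return {"below": below, "at_or_above": len(docs) - below}


def title_has_term(doc, term):
    """Approximate a term filter on an analyzed title field (assumption)."""
    return term in doc["title"].lower().split()


def search_with_facet_filter(docs):
    # The filtered query decides the hits.
    hits = [d for d in docs if d["category"] == "book"]
    # facet_filter narrows only the documents that feed the facet.
    facet_docs = [d for d in hits if title_has_term(d, "2")]
    return hits, range_counts(facet_docs)


hits, facet = search_with_facet_filter(DOCS)
print(len(hits), facet)  # 2 {'below': 0, 'at_or_above': 1}
```

The hits still contain both books, while the facet counts only "Test book 2".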
Now suppose we want to query all documents whose category field is "book", but compute the facet over every document in the index. How do we do that?
Here is the request:
{
"query": {
"term": {
"category": "book"
}
},
"facets": {
"price": {
"range": {
"field":"price",
"ranges": [
{
"to": 30
},
{
"from": 30
}
]
},
"global": true
}
}
}
The result:
{
...
"hits":{
"total":2,
"max_score": 0.30685282,
"hits": [
{
"_index": "books",
"_type": "book",
"_id": "1",
"_score": 0.30685282,
"_source": {
"id": "1",
"title": "Test book 1",
"category": "book",
"price": 29.99
}
},
{
"_index": "books",
"_type": "book",
"_id": "2",
"_score": 0.30685282,
"_source": {
"id": "2",
"title": "Test book 2",
"category": "book",
"price": 39.99
}
}
]
},
"facets": {
"price": {
"_type": "range",
"ranges": [
{
"to": 30.0,
"count": 3,
"min": 11.99,
"max": 29.99,
"total_count": 3,
"total": 57.97,
"mean": 19.323333333333334
},
{
"from": 30.0,
"count": 1,
"min": 39.99,
"max": 39.99,
"total_count": 1,
"total": 39.99,
"mean": 39.99
}
]
}
}
}
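That is what global gives a facet: the facet is computed over every document in the index, no matter what the query matches.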
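For completeness, the global case fits the same toy model. Again this is an illustrative plain-Python sketch with a hypothetical function name, not how Elasticsearch is implemented: the query picks the hits, and the facet simply ignores the query scope.

```python
# In-memory stand-ins for the four documents indexed earlier.
DOCS = [
    {"id": "1", "title": "Test book 1", "category": "book", "price": 29.99},
    {"id": "2", "title": "Test book 2", "category": "book", "price": 39.99},
    {"id": "3", "title": "Test comic 1", "category": "comic", "price": 11.99},
    {"id": "4", "title": "Test comic 2", "category": "comic", "price": 15.99},
]


def range_counts(docs, boundary=30):
    """Simplified range facet: count docs below / at or above the boundary."""
    below = sum(1 for d in docs if d["price"] < boundary)
    return {"below": below, "at_or_above": len(docs) - below}


def search_with_global_facet(docs):
    # The term query on category decides the hits.
    hits = [d for d in docs if d["category"] == "book"]
    # "global": true makes the facet run over the whole index instead.
    facet = range_counts(docs)
    return hits, facet


hits, facet = search_with_global_facet(DOCS)
print(len(hits), facet)  # 2 {'below': 3, 'at_or_above': 1}
```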