Elasticsearch Facets

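Although the official documentation stresses that facets will be removed from Elasticsearch in a future release and recommends aggregations instead, I still use facets at work. While reading "Mastering Elasticsearch" I came across its introduction to facets, which is practical and easy to follow, so I translated part of it here for anyone who needs it.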

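When using the Elasticsearch faceting mechanism, keep one thing in mind: facet results are computed only on the query results. If you add a filter outside the query object, that filter does not restrict the documents the facet statistics are computed on.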

Let's look at an example.

First, index a few documents into the books index with the following commands:

curl -XPUT 'localhost:9200/books/book/1' -d '{
  "id": "1", "title": "Test book 1", "category": "book",
  "price": 29.99
}'

curl -XPUT 'localhost:9200/books/book/2' -d '{
  "id": "2", "title": "Test book 2", "category": "book",
  "price": 39.99
}'

curl -XPUT 'localhost:9200/books/book/3' -d '{
  "id": "3", "title": "Test comic 1", "category": "comic",
  "price": 11.99
}'

curl -XPUT 'localhost:9200/books/book/4' -d '{
  "id": "4", "title": "Test comic 2", "category": "comic",
  "price": 15.99
}'

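Now let's see how faceting works when a query and a filter are used together. We will run a simple query that returns every document in the books index. We will also include a filter that narrows the results to documents in the book category, and a range facet on the price field to see how many documents are priced below 30 and how many are priced above it. The whole request looks like this: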

{
  "query": {
    "match_all": {}
  },
  "filter": {
    "term": {
      "category": "book"
    }
  },
  "facets": {
    "price": {
      "range": {
        "field": "price",
        "ranges": [
          { "to": 30 },
          { "from": 30 }
        ]
      }
    }
  }
}

Running it gives the following result:

{
  ...
  "hits": {
    "total": 2,
    "max_score": 1.0,
    "hits": [
      {
        "_index": "books",
        "_type": "book",
        "_id": "1",
        "_score": 1.0,
        "_source": {
          "id": "1",
          "title": "Test book 1",
          "category": "book",
          "price": 29.99
        }
      },
      {
        "_index": "books",
        "_type": "book",
        "_id": "2",
        "_score": 1.0,
        "_source": {
          "id": "2",
          "title": "Test book 2",
          "category": "book",
          "price": 39.99
        }
      }
    ]
  },
  "facets": {
    "price": {
      "_type": "range",
      "ranges": [
        {
          "to": 30.0,
          "count": 3,
          "min": 11.99,
          "max": 29.99,
          "total_count": 3,
          "total": 57.97,
          "mean": 19.323333333333334
        },
        {
          "from": 30.0,
          "count": 1,
          "min": 39.99,
          "max": 39.99,
          "total_count": 1,
          "total": 39.99,
          "mean": 39.99
        }
      ]
    }
  }
}
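Note the mismatch in this response: two hits, yet the two facet ranges together count four documents. The plain-Python sketch below mimics that behaviour. It is only an in-memory illustration, not Elasticsearch code: `DOCS`, `range_facet`, and the list comprehension standing in for the filter are hypothetical names, and the range bounds are assumed to be inclusive at "from" and exclusive at "to".

```python
# In-memory sketch, not Elasticsearch: a top-level filter narrows the hits
# but not the documents the facet is computed on.
DOCS = [
    {"id": "1", "title": "Test book 1", "category": "book", "price": 29.99},
    {"id": "2", "title": "Test book 2", "category": "book", "price": 39.99},
    {"id": "3", "title": "Test comic 1", "category": "comic", "price": 11.99},
    {"id": "4", "title": "Test comic 2", "category": "comic", "price": 15.99},
]


def range_facet(docs, field, ranges):
    """Count docs per range. Assumes "from" is inclusive and "to" is exclusive."""
    counts = []
    for r in ranges:
        lo = r.get("from", float("-inf"))
        hi = r.get("to", float("inf"))
        counts.append(sum(1 for d in docs if lo <= d[field] < hi))
    return counts


query_results = list(DOCS)  # match_all: every document matches
hits = [d for d in query_results if d["category"] == "book"]  # top-level filter
facet_counts = range_facet(query_results, "price", [{"to": 30}, {"from": 30}])

print(len(hits), facet_counts)  # prints: 2 [3, 1]
```

The facet is fed `query_results`, not `hits`, which is why the comics still show up in the counts.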

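As the result shows, although the filter only lets through documents whose category field is book, the facet was not computed on those documents alone. It was computed on every document in the books index, because the query is match_all. In other words, the faceting mechanism ignores a filter placed outside the query. But what if the filter is part of the query, as in a filtered query? Let's continue with the example.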

{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "term": {
          "category": "book"
        }
      }
    }
  },
  "facets": {
    "price": {
      "range": {
        "field": "price",
        "ranges": [
          { "to": 30 },
          { "from": 30 }
        ]
      }
    }
  }
}

The response:

{
  ...
  "hits": {
    "total": 2,
    "max_score": 1.0,
    "hits": [
      {
        "_index": "books",
        "_type": "book",
        "_id": "1",
        "_score": 1.0,
        "_source": {
          "id": "1",
          "title": "Test book 1",
          "category": "book",
          "price": 29.99
        }
      },
      {
        "_index": "books",
        "_type": "book",
        "_id": "2",
        "_score": 1.0,
        "_source": {
          "id": "2",
          "title": "Test book 2",
          "category": "book",
          "price": 39.99
        }
      }
    ]
  },
  "facets": {
    "price": {
      "_type": "range",
      "ranges": [
        {
          "to": 30.0,
          "count": 1,
          "min": 29.99,
          "max": 29.99,
          "total_count": 1,
          "total": 29.99,
          "mean": 29.99
        },
        {
          "from": 30.0,
          "count": 1,
          "min": 39.99,
          "max": 39.99,
          "total_count": 1,
          "total": 39.99,
          "mean": 39.99
        }
      ]
    }
  }
}

As the response shows, this time the filter does restrict the set of documents the facet is computed on.
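The same outcome can be sketched in plain Python, this time also recomputing the per-range statistics. Again this is an in-memory illustration rather than Elasticsearch code; `DOCS` and `range_entry` are hypothetical names, and the inclusive "from" / exclusive "to" bounds are an assumption.

```python
# In-memory sketch, not Elasticsearch: with a filtered query the filter is
# part of the query, so the facet only sees the filtered documents.
DOCS = [
    {"id": "1", "title": "Test book 1", "category": "book", "price": 29.99},
    {"id": "2", "title": "Test book 2", "category": "book", "price": 39.99},
    {"id": "3", "title": "Test comic 1", "category": "comic", "price": 11.99},
    {"id": "4", "title": "Test comic 2", "category": "comic", "price": 15.99},
]


def range_entry(docs, field, lo=float("-inf"), hi=float("inf")):
    """Build one range entry (count plus basic statistics) for lo <= value < hi."""
    values = [d[field] for d in docs if lo <= d[field] < hi]
    entry = {"count": len(values)}
    if values:  # statistics are only meaningful when something matched
        entry.update(min=min(values), max=max(values),
                     total=sum(values), mean=sum(values) / len(values))
    return entry


# Filtered query: match_all narrowed by the category filter.
query_results = [d for d in DOCS if d["category"] == "book"]
below = range_entry(query_results, "price", hi=30)
above = range_entry(query_results, "price", lo=30)

print(below["count"], above["count"])  # prints: 1 1
```

Because the filter now runs before the facet sees any documents, the two comics no longer contribute to the first range.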

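Now imagine that we want to compute the facet only for books whose title field contains the term "2". We could add a second filter to the query, but that would also narrow the query results, which is not what we want. What we need is a facet filter.

Use facet_filter at the same level as the facet type definition; it lets us restrict the documents the facet is computed on. For example, to limit the facet to documents whose title field contains "2", the request can be changed to: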

{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "term": {
          "category": "book"
        }
      }
    }
  },
  "facets": {
    "price": {
      "range": {
        "field": "price",
        "ranges": [
          { "to": 30 },
          { "from": 30 }
        ]
      },
      "facet_filter": {
        "term": {
          "title": "2"
        }
      }
    }
  }
}

The response:

{
  ...
  "hits": {
    "total": 2,
    "max_score": 1.0,
    "hits": [
      {
        "_index": "books",
        "_type": "book",
        "_id": "1",
        "_score": 1.0,
        "_source": {
          "id": "1",
          "title": "Test book 1",
          "category": "book",
          "price": 29.99
        }
      },
      {
        "_index": "books",
        "_type": "book",
        "_id": "2",
        "_score": 1.0,
        "_source": {
          "id": "2",
          "title": "Test book 2",
          "category": "book",
          "price": 39.99
        }
      }
    ]
  },
  "facets": {
    "price": {
      "_type": "range",
      "ranges": [
        {
          "to": 30.0,
          "count": 0,
          "total_count": 0,
          "total": 0.0,
          "mean": 0.0
        },
        {
          "from": 30.0,
          "count": 1,
          "min": 39.99,
          "max": 39.99,
          "total_count": 1,
          "total": 39.99,
          "mean": 39.99
        }
      ]
    }
  }
}

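As shown above, the facet is now limited to a single document, while the query results are unchanged.

Now suppose we want to query all documents whose category field is "book", but compute the facet over every document in the index. How do we do that?

Here is the request: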
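A plain-Python sketch of the facet filter, under the same caveats as any toy model: `DOCS` and `has_term` are hypothetical names, and `has_term` is only a rough stand-in for a term filter on an analyzed text field (it lowercases the value and splits on whitespace).

```python
# In-memory sketch, not Elasticsearch: facet_filter narrows the documents
# seen by one facet without touching the hits.
DOCS = [
    {"id": "1", "title": "Test book 1", "category": "book", "price": 29.99},
    {"id": "2", "title": "Test book 2", "category": "book", "price": 39.99},
    {"id": "3", "title": "Test comic 1", "category": "comic", "price": 11.99},
    {"id": "4", "title": "Test comic 2", "category": "comic", "price": 15.99},
]


def has_term(doc, field, term):
    """Assumed approximation of a term filter: match one whitespace token."""
    return term in doc[field].lower().split()


hits = [d for d in DOCS if d["category"] == "book"]          # filtered query
facet_docs = [d for d in hits if has_term(d, "title", "2")]  # facet_filter
below_30 = sum(1 for d in facet_docs if d["price"] < 30)
from_30 = sum(1 for d in facet_docs if d["price"] >= 30)

print(len(hits), below_30, from_30)  # prints: 2 0 1
```

Only "Test book 2" passes the facet filter, so the first range is empty while both books are still returned as hits.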

{
  "query": {
    "term": {
      "category": "book"
    }
  },
  "facets": {
    "price": {
      "range": {
        "field": "price",
        "ranges": [
          { "to": 30 },
          { "from": 30 }
        ]
      },
      "global": true
    }
  }
}

The response:

{
  ...
  "hits": {
    "total": 2,
    "max_score": 0.30685282,
    "hits": [
      {
        "_index": "books",
        "_type": "book",
        "_id": "1",
        "_score": 0.30685282,
        "_source": {
          "id": "1",
          "title": "Test book 1",
          "category": "book",
          "price": 29.99
        }
      },
      {
        "_index": "books",
        "_type": "book",
        "_id": "2",
        "_score": 0.30685282,
        "_source": {
          "id": "2",
          "title": "Test book 2",
          "category": "book",
          "price": 39.99
        }
      }
    ]
  },
  "facets": {
    "price": {
      "_type": "range",
      "ranges": [
        {
          "to": 30.0,
          "count": 3,
          "min": 11.99,
          "max": 29.99,
          "total_count": 3,
          "total": 57.97,
          "mean": 19.323333333333334
        },
        {
          "from": 30.0,
          "count": 1,
          "min": 39.99,
          "max": 39.99,
          "total_count": 1,
          "total": 39.99,
          "mean": 39.99
        }
      ]
    }
  }
}

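This is what global brings to facets: the facet is computed over every document in the index, regardless of what the query matches.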
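The numbers in the first range can be checked by hand with a short plain-Python sketch. It is an in-memory illustration only, not Elasticsearch code, and `DOCS` is a hypothetical stand-in for the books index.

```python
# In-memory sketch, not Elasticsearch: with "global": true the facet ignores
# the query and is computed over every document.
DOCS = [
    {"id": "1", "title": "Test book 1", "category": "book", "price": 29.99},
    {"id": "2", "title": "Test book 2", "category": "book", "price": 39.99},
    {"id": "3", "title": "Test comic 1", "category": "comic", "price": 11.99},
    {"id": "4", "title": "Test comic 2", "category": "comic", "price": 15.99},
]

hits = [d for d in DOCS if d["category"] == "book"]  # term query
facet_docs = DOCS                                     # global scope: all docs

below_30 = [d["price"] for d in facet_docs if d["price"] < 30]
total = sum(below_30)          # 29.99 + 11.99 + 15.99
mean = total / len(below_30)

print(len(hits), len(below_30), round(total, 2), round(mean, 4))
# prints: 2 3 57.97 19.3233
```

The query still returns only the two books, while the facet statistics include the two comics as well.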

Time: 2024-11-05 22:27:38
