15. Elasticsearch Tutorial: Aggregations
The aggregations framework summarizes the data selected by a search query. It is made up of many building blocks that can be combined to build complex summaries of the data.

The general structure of an aggregation in a request body is as follows:

```text
"aggregations" : {
    "<aggregation_name>" : {
        "<aggregation_type>" : {
            <aggregation_body>
        }
        [,"meta" : { [<meta_data_body>] } ]?
        [,"aggregations" : { [<sub_aggregation>]+ } ]?
    }
}
```

Elasticsearch provides a large number of aggregation types, each serving a different purpose.

### **Metrics aggregations**

------------

These aggregations compute metrics from the field values of the aggregated documents; in some cases the values can also be generated by a script.

A numeric metrics aggregation is either single-value, such as the average aggregation, or multi-value, such as the stats aggregation.

### **Average aggregation ( avg )**

------------

This aggregation computes the average of a numeric field across the aggregated documents. For example:

```text
POST http://localhost:9200/user_admin/_search?pretty
```

Request body:

```json
{
  "aggs": {
    "avg_money": { "avg": { "field": "money" } }
  }
}
```

Response:

```json
{
  "took" : 160,
  "timed_out" : false,
  "_shards" : { "total" : 5, "successful" : 5, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : 3,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "user_admin", "_type" : "user", "_id" : "2", "_score" : 1.0,
        "_source" : {
          "nickname" : "雅少", "description" : "虚怀若谷",
          "street" : "四川大学", "city" : "Chengdu", "state" : "Sichuan", "zip" : "610044",
          "location" : [ 104.094537, 30.640174 ],
          "money" : 68023, "tags" : [ "Python", "HTML" ], "vitality" : "7.8"
        }
      },
      {
        "_index" : "user_admin", "_type" : "user", "_id" : "1", "_score" : 1.0,
        "_source" : {
          "nickname" : "站长", "description" : "DDKK.COM 弟弟快看,程序员编程资料站 ,教程 ",
          "street" : "东四十条", "city" : "Beijing", "state" : "Beijing", "zip" : "100007",
          "location" : [ 116.432727, 39.937732 ],
          "money" : 5201814, "tags" : [ "PHP", "Python" ], "vitality" : "9.0"
        }
      },
      {
        "_index" : "user_admin", "_type" : "user", "_id" : "3", "_score" : 1.0,
        "_source" : {
          "nickname" : "歌者", "description" : "程序设计也是设计,研发新菜也是研发",
          "street" : "五道口", "city" : "Beijing", "state" : "Beijing", "zip" : "100083",
          "location" : [ 116.346346, 39.999333 ],
          "money" : 71128, "tags" : [ "Java", "Scala" ], "vitality" : "6.9"
        }
      }
    ]
  },
  "aggregations" : {
    "avg_money" : { "value" : 1780321.6666666667 }
  }
}
```

If the field is absent from one or more of the aggregated documents, those documents are ignored by default. The `missing` parameter sets a default value to use for them instead:

```json
{
  "aggs": {
    "avg_money": {
      "avg": {
        "field": "money",
        "missing": 0
      }
    }
  }
}
```
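To make the effect of `missing` concrete, here is a minimal Python sketch that reproduces the arithmetic locally on the three `money` values from the `user_admin` response, plus one made-up document that lacks the field. It only imitates the observable behaviour; it is not how Elasticsearch computes the average internally, and the function name `avg_agg` and the fourth document are assumptions introduced for illustration.

```python
from typing import Iterable, Optional


def avg_agg(docs: Iterable[dict], field: str,
            missing: Optional[float] = None) -> Optional[float]:
    """Imitate the avg aggregation: documents lacking `field` are skipped
    unless a `missing` default is supplied."""
    values = []
    for doc in docs:
        if field in doc:
            values.append(doc[field])
        elif missing is not None:
            values.append(missing)
    # With no values collected, Elasticsearch reports null; use None here.
    return sum(values) / len(values) if values else None


docs = [
    {"nickname": "雅少", "money": 68023},
    {"nickname": "站长", "money": 5201814},
    {"nickname": "歌者", "money": 71128},
    {"nickname": "hypothetical"},  # assumed extra document without "money"
]

print(avg_agg(docs, "money"))             # document without the field is ignored
print(avg_agg(docs, "money", missing=0))  # document without the field counts as 0
```

Without `missing`, the sum 5340965 is divided by 3; with `missing` set to 0 it is divided by 4, which lowers the average.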
### **Cardinality aggregation ( cardinality )**

------------

The cardinality aggregation counts the distinct values of a given field. For example:

```text
POST http://localhost:9200/user*/_search?pretty
```

Request body:

```json
{
  "aggs": {
    "distinct_nickname_count": { "cardinality": { "field": "nickname" } }
  }
}
```

Response:

```json
{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "Fielddata is disabled on text fields by default. Set fielddata=true on [nickname] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
      }
    ],
    "type" : "search_phase_execution_exception",
    "reason" : "all shards failed",
    "phase" : "query",
    "grouped" : true,
    "failed_shards" : [
      {
        "shard" : 0,
        "index" : "user",
        "node" : "4zwAMlTzRCaioBeOE9PaNw",
        "reason" : {
          "type" : "illegal_argument_exception",
          "reason" : "Fielddata is disabled on text fields by default. Set fielddata=true on [nickname] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
        }
      }
    ],
    "caused_by" : {
      "type" : "illegal_argument_exception",
      "reason" : "Fielddata is disabled on text fields by default. Set fielddata=true on [nickname] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead.",
      "caused_by" : {
        "type" : "illegal_argument_exception",
        "reason" : "Fielddata is disabled on text fields by default. Set fielddata=true on [nickname] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
      }
    }
  },
  "status" : 400
}
```

The request fails: fielddata is disabled on `text` fields by default, so `nickname` cannot be aggregated as it stands. One way around this is to enable fielddata on the field with the mapping update below. The error message also suggests using a `keyword` field instead, which avoids the memory cost.

```text
PUT http://localhost:9200/user*/_mapping/user
```

Request body:

```json
{
  "properties": {
    "nickname": {
      "type": "text",
      "fielddata": true
    }
  }
}
```

Response:

```json
{"acknowledged":true}
```

Re-sending the request that failed now returns the following response:

```json
{
  "took" : 186,
  "timed_out" : false,
  "_shards" : { "total" : 10, "successful" : 10, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : 5,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "user", "_type" : "user", "_id" : "2", "_score" : 1.0,
        "_source" : {
          "nickname" : "枫晚", "description" : "停车坐爰枫林晚",
          "street" : "苏州大学", "city" : "Suzhou", "state" : "Jiangsu", "zip" : "215006",
          "location" : [ 120.65426, 31.30797 ],
          "money" : 10235, "tags" : [ "Java", "Android" ], "vitality" : "3.5"
        }
      },
      {
        "_index" : "user_admin", "_type" : "user", "_id" : "2", "_score" : 1.0,
        "_source" : {
          "nickname" : "雅少", "description" : "虚怀若谷",
          "street" : "四川大学", "city" : "Chengdu", "state" : "Sichuan", "zip" : "610044",
          "location" : [ 104.094537, 30.640174 ],
          "money" : 68023, "tags" : [ "Python", "HTML" ], "vitality" : "7.8"
        }
      },
      {
        "_index" : "user", "_type" : "user", "_id" : "1", "_score" : 1.0,
        "_source" : {
          "nickname" : "question", "description" : "问题少年也是少年",
          "street" : "张江高科技园区", "city" : "Shanghai", "state" : "Shanghai", "zip" : "201204",
          "location" : [ 121.60632, 31.199305 ],
          "money" : 13648, "tags" : [ "VUE", "HTML" ], "vitality" : "8.8"
        }
      },
      {
        "_index" : "user_admin", "_type" : "user", "_id" : "1", "_score" : 1.0,
        "_source" : {
          "nickname" : "站长", "description" : "DDKK.COM 弟弟快看,程序员编程资料站 ,教程 ",
          "street" : "东四十条", "city" : "Beijing", "state" : "Beijing", "zip" : "100007",
          "location" : [ 116.432727, 39.937732 ],
          "money" : 5201814, "tags" : [ "PHP", "Python" ], "vitality" : "9.0"
        }
      },
      {
        "_index" : "user_admin", "_type" : "user", "_id" : "3", "_score" : 1.0,
        "_source" : {
          "nickname" : "歌者", "description" : "程序设计也是设计,研发新菜也是研发",
          "street" : "五道口", "city" : "Beijing", "state" : "Beijing", "zip" : "100083",
          "location" : [ 116.346346, 39.999333 ],
          "money" : 71128, "tags" : [ "Java", "Scala" ], "vitality" : "6.9"
        }
      }
    ]
  },
  "aggregations" : {
    "distinct_nickname_count" : { "value" : 9 }
  }
}
```

Note that the result is 9 even though only 5 documents matched. With fielddata enabled, the aggregation runs over the analyzed terms of the `text` field rather than over the original strings. The default analyzer most likely splits each two-character Chinese nickname into two single-character terms, which gives 4 × 2 terms plus 1 for `question`. Counting whole nicknames requires a `keyword` field.

### **Extended stats aggregation ( extended_stats )**

------------

This aggregation produces the full set of statistics for a given numeric field across the aggregated documents. For example:

```text
POST http://localhost:9200/user_admin/user/_search?pretty
```

Request body:

```json
{
  "aggs" : {
    "money_stats" : { "extended_stats" : { "field" : "money" } }
  }
}
```

Response:

```json
{
  "took" : 12,
  "timed_out" : false,
  "_shards" : { "total" : 5, "successful" : 5, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : 3,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "user_admin", "_type" : "user", "_id" : "2", "_score" : 1.0,
        "_source" : {
          "nickname" : "雅少", "description" : "虚怀若谷",
          "street" : "四川大学", "city" : "Chengdu", "state" : "Sichuan", "zip" : "610044",
          "location" : [ 104.094537, 30.640174 ],
          "money" : 68023, "tags" : [ "Python", "HTML" ], "vitality" : "7.8"
        }
      },
      {
        "_index" : "user_admin", "_type" : "user", "_id" : "1", "_score" : 1.0,
        "_source" : {
          "nickname" : "站长", "description" : "DDKK.COM 弟弟快看,程序员编程资料站 ,教程 ",
          "street" : "东四十条", "city" : "Beijing", "state" : "Beijing", "zip" : "100007",
          "location" : [ 116.432727, 39.937732 ],
          "money" : 5201814, "tags" : [ "PHP", "Python" ], "vitality" : "9.0"
        }
      },
      {
        "_index" : "user_admin", "_type" : "user", "_id" : "3", "_score" : 1.0,
        "_source" : {
          "nickname" : "歌者", "description" : "程序设计也是设计,研发新菜也是研发",
          "street" : "五道口", "city" : "Beijing", "state" : "Beijing", "zip" : "100083",
          "location" : [ 116.346346, 39.999333 ],
          "money" : 71128, "tags" : [ "Java", "Scala" ], "vitality" : "6.9"
        }
      }
    ]
  },
  "aggregations" : {
    "money_stats" : {
      "count" : 3,
      "min" : 68023.0,
      "max" : 5201814.0,
      "avg" : 1780321.6666666667,
      "sum" : 5340965.0,
      "sum_of_squares" : 2.7068555211509E13,
      "variance" : 5.853306500366889E12,
      "std_deviation" : 2419360.762756743,
      "std_deviation_bounds" : {
        "upper" : 6619043.192180153,
        "lower" : -3058399.858846819
      }
    }
  }
}
```

### **Max aggregation ( max )**

------------
The max aggregation finds the maximum value of a given numeric field across the aggregated documents. For example:

```text
POST http://localhost:9200/user_admin/_search?pretty
```

Request body:

```json
{
  "aggs" : {
    "max_money" : { "max" : { "field" : "money" } }
  }
}
```

Response:

```json
{
  "took" : 22,
  "timed_out" : false,
  "_shards" : { "total" : 5, "successful" : 5, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : 3,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "user_admin", "_type" : "user", "_id" : "2", "_score" : 1.0,
        "_source" : {
          "nickname" : "雅少", "description" : "虚怀若谷",
          "street" : "四川大学", "city" : "Chengdu", "state" : "Sichuan", "zip" : "610044",
          "location" : [ 104.094537, 30.640174 ],
          "money" : 68023, "tags" : [ "Python", "HTML" ], "vitality" : "7.8"
        }
      },
      {
        "_index" : "user_admin", "_type" : "user", "_id" : "1", "_score" : 1.0,
        "_source" : {
          "nickname" : "站长", "description" : "DDKK.COM 弟弟快看,程序员编程资料站 ,教程 ",
          "street" : "东四十条", "city" : "Beijing", "state" : "Beijing", "zip" : "100007",
          "location" : [ 116.432727, 39.937732 ],
          "money" : 5201814, "tags" : [ "PHP", "Python" ], "vitality" : "9.0"
        }
      },
      {
        "_index" : "user_admin", "_type" : "user", "_id" : "3", "_score" : 1.0,
        "_source" : {
          "nickname" : "歌者", "description" : "程序设计也是设计,研发新菜也是研发",
          "street" : "五道口", "city" : "Beijing", "state" : "Beijing", "zip" : "100083",
          "location" : [ 116.346346, 39.999333 ],
          "money" : 71128, "tags" : [ "Java", "Scala" ], "vitality" : "6.9"
        }
      }
    ]
  },
  "aggregations" : {
    "max_money" : { "value" : 5201814.0 }
  }
}
```

### **Min aggregation ( min )**

------------

The min aggregation finds the minimum value of a given numeric field across the aggregated documents. For example:

```text
POST http://localhost:9200/user*/_search?pretty
```

Request body:

```json
{
  "aggs" : {
    "min_money" : { "min" : { "field" : "money" } }
  }
}
```

Response:

```json
{
  "took" : 20,
  "timed_out" : false,
  "_shards" : { "total" : 10, "successful" : 10, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : 5,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "user", "_type" : "user", "_id" : "2", "_score" : 1.0,
        "_source" : {
          "nickname" : "枫晚", "description" : "停车坐爰枫林晚",
          "street" : "苏州大学", "city" : "Suzhou", "state" : "Jiangsu", "zip" : "215006",
          "location" : [ 120.65426, 31.30797 ],
          "money" : 10235, "tags" : [ "Java", "Android" ], "vitality" : "3.5"
        }
      },
      {
        "_index" : "user_admin", "_type" : "user", "_id" : "2", "_score" : 1.0,
        "_source" : {
          "nickname" : "雅少", "description" : "虚怀若谷",
          "street" : "四川大学", "city" : "Chengdu", "state" : "Sichuan", "zip" : "610044",
          "location" : [ 104.094537, 30.640174 ],
          "money" : 68023, "tags" : [ "Python", "HTML" ], "vitality" : "7.8"
        }
      },
      {
        "_index" : "user", "_type" : "user", "_id" : "1", "_score" : 1.0,
        "_source" : {
          "nickname" : "question", "description" : "问题少年也是少年",
          "street" : "张江高科技园区", "city" : "Shanghai", "state" : "Shanghai", "zip" : "201204",
          "location" : [ 121.60632, 31.199305 ],
          "money" : 13648, "tags" : [ "VUE", "HTML" ], "vitality" : "8.8"
        }
      },
      {
        "_index" : "user_admin", "_type" : "user", "_id" : "1", "_score" : 1.0,
        "_source" : {
          "nickname" : "站长", "description" : "DDKK.COM 弟弟快看,程序员编程资料站 ,教程 ",
          "street" : "东四十条", "city" : "Beijing", "state" : "Beijing", "zip" : "100007",
          "location" : [ 116.432727, 39.937732 ],
          "money" : 5201814, "tags" : [ "PHP", "Python" ], "vitality" : "9.0"
        }
      },
      {
        "_index" : "user_admin", "_type" : "user", "_id" : "3", "_score" : 1.0,
        "_source" : {
          "nickname" : "歌者", "description" : "程序设计也是设计,研发新菜也是研发",
          "street" : "五道口", "city" : "Beijing", "state" : "Beijing", "zip" : "100083",
          "location" : [ 116.346346, 39.999333 ],
          "money" : 71128, "tags" : [ "Java", "Scala" ], "vitality" : "6.9"
        }
      }
    ]
  },
  "aggregations" : {
    "min_money" : { "value" : 10235.0 }
  }
}
```

### **Sum aggregation ( sum )**

------------

The sum aggregation computes the total of a given numeric field across the aggregated documents. For example:

```text
POST http://localhost:9200/user*/_search?pretty
```

Request body:

```json
{
  "aggs" : {
    "total_money" : { "sum" : { "field" : "money" } }
  }
}
```

Response:

```json
{
  "took" : 10,
  "timed_out" : false,
  "_shards" : { "total" : 10, "successful" : 10, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : 5,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "user", "_type" : "user", "_id" : "2", "_score" : 1.0,
        "_source" : {
          "nickname" : "枫晚", "description" : "停车坐爰枫林晚",
          "street" : "苏州大学", "city" : "Suzhou", "state" : "Jiangsu", "zip" : "215006",
          "location" : [ 120.65426, 31.30797 ],
          "money" : 10235, "tags" : [ "Java", "Android" ], "vitality" : "3.5"
        }
      },
      {
        "_index" : "user_admin", "_type" : "user", "_id" : "2", "_score" : 1.0,
        "_source" : {
          "nickname" : "雅少", "description" : "虚怀若谷",
          "street" : "四川大学", "city" : "Chengdu", "state" : "Sichuan", "zip" : "610044",
          "location" : [ 104.094537, 30.640174 ],
          "money" : 68023, "tags" : [ "Python", "HTML" ], "vitality" : "7.8"
        }
      },
      {
        "_index" : "user", "_type" : "user", "_id" : "1", "_score" : 1.0,
        "_source" : {
          "nickname" : "question", "description" : "问题少年也是少年",
          "street" : "张江高科技园区", "city" : "Shanghai", "state" : "Shanghai", "zip" : "201204",
          "location" : [ 121.60632, 31.199305 ],
          "money" : 13648, "tags" : [ "VUE", "HTML" ], "vitality" : "8.8"
        }
      },
      {
        "_index" : "user_admin", "_type" : "user", "_id" : "1", "_score" : 1.0,
        "_source" : {
          "nickname" : "站长", "description" : "DDKK.COM 弟弟快看,程序员编程资料站 ,教程 ",
          "street" : "东四十条", "city" : "Beijing", "state" : "Beijing", "zip" : "100007",
          "location" : [ 116.432727, 39.937732 ],
          "money" : 5201814, "tags" : [ "PHP", "Python" ], "vitality" : "9.0"
        }
      },
      {
        "_index" : "user_admin", "_type" : "user", "_id" : "3", "_score" : 1.0,
        "_source" : {
          "nickname" : "歌者", "description" : "程序设计也是设计,研发新菜也是研发",
          "street" : "五道口", "city" : "Beijing", "state" : "Beijing", "zip" : "100083",
          "location" : [ 116.346346, 39.999333 ],
          "money" : 71128, "tags" : [ "Java", "Scala" ], "vitality" : "6.9"
        }
      }
    ]
  },
  "aggregations" : {
    "total_money" : { "value" : 5364848.0 }
  }
}
```

There are also metrics aggregations for geographic data, such as the geo bounds aggregation and the geo centroid aggregation.

### **Bucket aggregations**

------------

Bucket aggregations come in many types, each with its own criterion for deciding whether a document belongs to a given bucket. Some of them are listed below.

### **Children aggregation**

------------

This bucket aggregation collects child documents into a bucket that maps to the parent bucket. In the mapping, the `type` parameter of `_parent` names the parent type. For example, given brands and their different models, the `model` type carries the following `_parent` field:

```json
{
  "model" : {
    "_parent" : { "type" : "brand" }
  }
}
```

The `_parent` field applies to Elasticsearch versions before 6.0; from 6.0 on, parent/child relations are declared with a `join` field.

Many other bucket aggregations are useful in specific situations:

1. Date histogram aggregation
2. Date range aggregation
3. Filter aggregation
4. Filters aggregation
5. Geo distance aggregation
6. GeoHash grid aggregation
7. Global aggregation
8. Histogram aggregation
9. IPv4 range aggregation
10. Missing aggregation
11. Nested aggregation
12. Range aggregation
13. Reverse nested aggregation
14. Sampler aggregation
15. Significant terms aggregation
16. Terms aggregation

### **Aggregation metadata**

------------

The `meta` parameter attaches arbitrary data to an aggregation in the request; the same data is returned with that aggregation in the response.

```text
POST http://localhost:9200/user*/report/_search?pretty
```

Request body:

```json
{
  "aggs" : {
    "min_money" : {
      "min" : { "field" : "money" },
      "meta" : { "dsc" : "Lowest Moneys" }
    }
  }
}
```

Response:

```json
{
  "took" : 30,
  "timed_out" : false,
  "_shards" : { "total" : 10, "successful" : 10, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "min_money" : {
      "meta" : { "dsc" : "Lowest Moneys" },
      "value" : null
    }
  }
}
```

The `meta` object comes back unchanged. The `value` is `null` here because the request targets the `report` type, which matches no documents (`hits.total` is 0).
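The general request structure, including `meta` and nested sub-aggregations, can also be assembled programmatically. The following is a minimal Python sketch that builds a request body from plain dictionaries and serializes it with the standard `json` module. It only constructs the body and does not contact Elasticsearch. The helper name `build_agg`, the aggregation name `by_city`, and the `city.keyword` sub-field are assumptions introduced for illustration; the tutorial does not show the mapping of `city`.

```python
import json
from typing import Optional


def build_agg(name: str, agg_type: str, body: dict,
              meta: Optional[dict] = None,
              sub_aggs: Optional[dict] = None) -> dict:
    """Return {name: {agg_type: body, [meta], [aggs]}}, following the
    general aggregation structure."""
    agg = {agg_type: body}
    if meta is not None:
        agg["meta"] = meta
    if sub_aggs:
        agg["aggs"] = sub_aggs
    return {name: agg}


# A min aggregation carrying the same meta object as the metadata example.
min_money = build_agg("min_money", "min", {"field": "money"},
                      meta={"dsc": "Lowest Moneys"})

# A terms bucket aggregation with a nested avg metrics aggregation.
# "city.keyword" is an assumption: it presumes a keyword sub-field exists.
by_city = build_agg("by_city", "terms", {"field": "city.keyword"},
                    sub_aggs=build_agg("avg_money", "avg", {"field": "money"}))

# "size": 0 asks for aggregation results only, without the matching hits.
request_body = {"size": 0, "aggs": {**min_money, **by_city}}
print(json.dumps(request_body, indent=2))
```

The printed JSON can be sent as the body of a `_search` request. Keeping each aggregation as a small dictionary makes it straightforward to nest a metrics aggregation inside a bucket aggregation, as `by_city` does.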
李智
March 17, 2025, 13:24