8 years ago · 41d63fdc67
--- a/03_Aggregations.asciidoc
+++ b/03_Aggregations.asciidoc
@@ -1,48 +1,32 @@
 
				 ifndef::es_build[= placeholder3]

			
 
				 

			
 
				 [[aggregations]]

			
 
				-= Aggregations

			
 
				+= Aggregations

			
 
				 

			
 
				 [partintro]

			
 
				 --

			
 
				-Until this point, this book has been dedicated to search.((("searching", "search versus aggregations")))((("aggregations")))  With search,

			
 
				-we have a query and we want to find a subset of documents that

			
 
				-match the query.  We are looking for the proverbial needle(s) in the

			
 
				-haystack.

			
 
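				+Until this point, this book has been dedicated to search.((("searching", "search versus aggregations")))((("aggregations")))  With search, we have a query and want to find the set of documents that match it, much like looking for a needle in a haystack.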

			
 
				 

			
 
				-With aggregations, we zoom out to get an overview of our data.  Instead of

			
 
				-looking for individual documents, we want to analyze and summarize our complete

			
 
				-set of data:

			
 
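				+With aggregations, we get an overview of our data. Rather than looking for individual documents, we want to analyze and summarize our complete set of data: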

			
 
				 

			
 
				 // Popular manufacturers? Unusual clumps of needles in the haystack?

			
 
				-- How many needles are in the haystack?

			
 
				-- What is the average length of the needles?

			
 
				-- What is the median length of the needles, broken down by manufacturer?

			
 
				-- How many needles were added to the haystack each month?

			
 
				-

			
 
				-Aggregations can answer more subtle questions too:

			
 
				-

			
 
				-- What are your most popular needle manufacturers?

			
 
				-- Are there any unusual or anomalous clumps of needles?

			
 
				-

			
 
				-Aggregations allow us to ask sophisticated questions of our data.  And yet, while

			
 
				-the functionality is completely different from search, it leverages the

			
 
				-same data-structures.  This means aggregations execute quickly and are

			
 
				-_near real-time_, just like search.

			
 
				-

			
 
				-This is extremely powerful for reporting and dashboards.  Instead of performing

			
 
				-_rollups_ of your data (_that crusty Hadoop job that takes a week to run_),

			
 
				-you can visualize your data in real time, allowing you to respond immediately.

			
 
				-Your report changes as your data changes, rather than being pre-calculated, out of

			
 
				-date and irrelevant.

			
 
				-

			
 
				-Finally, aggregations operate alongside search requests.((("aggregations", "operating alongside search requests"))) This means you can

			
 
				-both search/filter documents _and_ perform analytics at the same time, on the

			
 
				-same data, in a single request.  And because aggregations are calculated in the

			
 
				-context of a user's search, you're not just displaying a count of four-star hotels--you're displaying a count of four-star hotels that _match their search criteria_.

			
 
				-

			
 
				-Aggregations are so powerful that many companies have built large Elasticsearch

			
 
				-clusters solely for analytics.

			
 
				+- How many needles are in the haystack?

			
 
				+- What is the average length of the needles?

			
 
				+- What is the median length of the needles, broken down by manufacturer?

			
 
				+- How many needles are added to the haystack each month?

			
 
				+

			
 
				+Aggregations can answer more subtle questions too:

			
 
				+

			
 
				+- What are your most popular needle manufacturers?

			
 
				+- Are there any anomalous needles in it?

			
 
				+

			
 
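				+Aggregations allow us to ask sophisticated questions of our data. Although the functionality is completely different from search, it uses the same data structures. This means aggregations execute quickly and, just like search, are near real-time.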

			
 
				+

			
 
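				+This is extremely powerful for reporting and dashboards. Instead of performing rollups of your data ( _that Hadoop job that takes a week to run_ ), you can display your data in real time, allowing you to respond immediately. Your report changes as your data changes, rather than being precalculated, out of date, and irrelevant.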

			
 
				+

			
 
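				+Finally, aggregations operate alongside search.((("aggregations", "operating alongside search requests"))) This means you can search/filter and analyze the same data at the same time, in a single request. And because aggregations are calculated in the context of a user's search, you are not just displaying a count of four-star hotels; you are displaying a count of four-star hotels that match the search criteria.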

			
 
				+

			
 
				+Aggregations are so powerful that many companies have built large Elasticsearch clusters solely for data analytics.

			
 
				 --

			
 
				 

			
 
				 include::301_Aggregation_Overview.asciidoc[]