Concepts
Ambari Metrics is the Ambari component responsible for monitoring cluster state. Its main concepts are:
Terminology | Description |
---|---|
Ambari Metrics System (“AMS”) | The built-in metrics collection system for Ambari. |
Metrics Collector | The standalone server that collects metrics, aggregates metrics, serves metrics from the Hadoop service sinks and the Metrics Monitor. |
Metrics Hadoop Sinks | Plugs into the various Hadoop components sinks to send Hadoop metrics to the Metrics Collector. |
Metrics Monitor | Installed on each host in the cluster to collect system-level metrics and forward to the Metrics Collector. |
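Simply put, Ambari gathers two kinds of information into the Collector:
1. System-level metrics for each node
2. Metrics for each Hadoop component
The former are collected by the Metrics Monitor installed on every node (effectively an agent); the latter are collected by Sinks that target specific Hadoop components (conceptually the same as a Flume Sink).
One last point: the Collector stores its metrics data in HBase.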
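Since the Collector also serves the metrics it gathers, a small sketch of how a client might ask for them can make the two categories above concrete. The snippet only builds a query URL and does not contact any server. The endpoint path `/ws/v1/timeline/metrics`, the port `6188`, and the parameter names are assumptions based on the commonly documented Metrics Collector REST API and should be checked against your Ambari version; the host names and metric names are purely illustrative.

```python
from urllib.parse import urlencode


def build_metrics_query(collector_host, metric_names, app_id,
                        hostname=None, start_time=None, end_time=None,
                        port=6188):
    """Build a query URL for the Metrics Collector (assumed REST layout).

    app_id distinguishes the source: "HOST" is assumed here for system-level
    metrics from the Metrics Monitor, while a component name is assumed for
    metrics pushed by a Hadoop sink.
    """
    params = {"metricNames": ",".join(metric_names), "appId": app_id}
    if hostname is not None:
        params["hostname"] = hostname
    if start_time is not None:
        params["startTime"] = start_time
    if end_time is not None:
        params["endTime"] = end_time
    return f"http://{collector_host}:{port}/ws/v1/timeline/metrics?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical hosts and metric names, for illustration only.
    print(build_metrics_query(
        "ams.example.com", ["cpu_user", "mem_free"], "HOST",
        hostname="node1.example.com", start_time=1000, end_time=2000,
    ))
```

Omitting `hostname` leaves the query unscoped to a single node, which under the same assumptions would ask for cluster-wide values instead.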
Architecture
Configuration
Configuring Ambari Metrics for Distributed Mode
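By default, Ambari Metrics is installed in embedded mode, in which all collected data is stored on the local disk of the Collector node, so a large volume of metrics data can consume a lot of local storage. After switching to distributed mode, the metrics data is stored on HDFS instead, so this is usually a necessary step after installing Ambari. For the detailed procedure, see: http://docs.hortonworks.com/HDPDocuments/Ambari-2.1.0.0/bk_ambari_reference_guide/content/_configuring_ambari_metrics_for_distributed_mode.html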
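As a rough illustration of what the switch involves, the sketch below lists the property overrides that are usually associated with distributed mode and checks a configuration against them. The property names (`timeline.metrics.service.operation.mode`, `hbase.cluster.distributed`, `hbase.rootdir`) and the config types (`ams-site`, `ams-hbase-site`) are recalled from the linked guide and are assumptions to verify there; the paths and the NameNode address are hypothetical. The helper `missing_overrides` is an illustrative name, not part of Ambari.

```python
# Assumed distributed-mode requirements; verify against the linked guide.
REQUIRED = {
    "ams-site": {
        "timeline.metrics.service.operation.mode": lambda v: v == "distributed",
    },
    "ams-hbase-site": {
        "hbase.cluster.distributed": lambda v: v == "true",
        # The HBase root directory must live on HDFS, not the local disk.
        "hbase.rootdir": lambda v: isinstance(v, str) and v.startswith("hdfs://"),
    },
}


def missing_overrides(current):
    """Return (config_type, property, actual_value) for every unmet requirement."""
    problems = []
    for config_type, checks in REQUIRED.items():
        actual = current.get(config_type, {})
        for prop, is_ok in checks.items():
            value = actual.get(prop)
            if not is_ok(value):
                problems.append((config_type, prop, value))
    return problems


if __name__ == "__main__":
    # Illustrative embedded-mode values; the local path is hypothetical.
    embedded = {
        "ams-site": {"timeline.metrics.service.operation.mode": "embedded"},
        "ams-hbase-site": {
            "hbase.cluster.distributed": "false",
            "hbase.rootdir": "file:///var/lib/ambari-metrics-collector/hbase",
        },
    }
    for config_type, prop, value in missing_overrides(embedded):
        print(f"{config_type}: {prop} is {value!r}, not suitable for distributed mode")
```

In practice these values are changed through the Ambari Web configuration pages, followed by a restart of the Metrics Collector; the check above is only a way to reason about which settings differ between the two modes.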
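Configuring the Metrics Data Lifecycle
A large volume of metrics takes up a great deal of storage space, so it is important to set a retention time (TTL) for metrics data. The parameters that control retention are in ams-site.xml; the relevant properties are listed below:
Property | Default (seconds) | Description |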
---|---|---|
timeline.metrics.host.aggregator.ttl | 86400 | 1 minute resolution data purge interval. Default is 1 day. |
timeline.metrics.host.aggregator.minute.ttl | 604800 | Host based X minutes resolution data purge interval. Default is 7 days.(X = configurable interval, default interval is 2 minutes) |
timeline.metrics.host.aggregator.hourly.ttl | 2592000 | Host based hourly resolution data purge interval. Default is 30 days. |
timeline.metrics.host.aggregator.daily.ttl | 31536000 | Host based daily resolution data purge interval. Default is 1 year. |
timeline.metrics.cluster.aggregator.minute.ttl | 2592000 | Cluster wide minute resolution data purge interval. Default is 30 days. |
timeline.metrics.cluster.aggregator.hourly.ttl | 31536000 | Cluster wide hourly resolution data purge interval. Default is 1 year. |
timeline.metrics.cluster.aggregator.daily.ttl | 63072000 | Cluster wide daily resolution data purge interval. Default is 2 years. |
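The defaults above are expressed in seconds, which is easy to misread when choosing new values. The following minimal sketch converts between seconds and days using the figures from the table; the helper names `ttl_in_days` and `days_to_ttl` are illustrative and not part of Ambari.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Default TTLs copied from the table above (seconds).
DEFAULT_TTLS = {
    "timeline.metrics.host.aggregator.ttl": 86400,
    "timeline.metrics.host.aggregator.minute.ttl": 604800,
    "timeline.metrics.host.aggregator.hourly.ttl": 2592000,
    "timeline.metrics.host.aggregator.daily.ttl": 31536000,
    "timeline.metrics.cluster.aggregator.minute.ttl": 2592000,
    "timeline.metrics.cluster.aggregator.hourly.ttl": 31536000,
    "timeline.metrics.cluster.aggregator.daily.ttl": 63072000,
}


def ttl_in_days(seconds):
    """Convert a TTL in seconds to days."""
    return seconds / SECONDS_PER_DAY


def days_to_ttl(days):
    """Convert a retention period in days to the seconds value used in ams-site.xml."""
    return days * SECONDS_PER_DAY


if __name__ == "__main__":
    for name, seconds in DEFAULT_TTLS.items():
        print(f"{name}: {seconds} s = {ttl_in_days(seconds):g} days")
```

For example, to keep host-level hourly data for 14 days instead of 30, `days_to_ttl(14)` gives the value to enter for `timeline.metrics.host.aggregator.hourly.ttl`.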
Time: 2024-10-13 07:11:51