OpenTSDB-Writing Data

Writing Data

You may want to jump right in and start throwing data into your TSD, but to really take advantage of OpenTSDB's power and flexibility, you may want to pause and think about your naming schema. After you've done that, you can proceed to pushing data over the Telnet or HTTP APIs, or use an existing tool with OpenTSDB support such as 'tcollector'.

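Both ingestion paths carry the same four fields: a metric name, a Unix timestamp, a value, and at least one tag. A minimal Python sketch of the two write formats (the sample metric, values, and TSD address are illustrative assumptions, not required names):

```python
import json
from urllib import request

# A single data point: metric name, Unix timestamp, value, and at least one tag.
point = {
    "metric": "sys.cpu.user",
    "timestamp": 1356998400,
    "value": 42.5,
    "tags": {"host": "webserver01", "cpu": "0"},
}

# Telnet-style line for the raw socket API (default port 4242):
#   put <metric> <timestamp> <value> <tagk=tagv> [<tagk=tagv> ...]
tag_str = " ".join(f"{k}={v}" for k, v in sorted(point["tags"].items()))
put_line = f"put {point['metric']} {point['timestamp']} {point['value']} {tag_str}"

def send_http(tsd_url: str) -> int:
    """POST the same point as JSON to the HTTP API's /api/put endpoint."""
    req = request.Request(
        tsd_url + "/api/put",
        data=json.dumps(point).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:  # 204 No Content on success
        return resp.status

print(put_line)
```

A collector such as tcollector does essentially this for you, batching lines over a persistent connection instead of opening one per point.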

Naming Schema

Many metrics administrators are used to supplying a single name for their time series. For example, systems administrators used to RRD-style systems may name their time series webserver01.sys.cpu.0.user. The name tells us that the time series is recording the amount of time in user space for cpu 0 on webserver01. This works great if you want to retrieve just the user time for that cpu core on that particular web server later on.


But what if the web server has 64 cores and you want to get the average time across all of them? Some systems allow you to specify a wild card such as webserver01.sys.cpu.*.user that would read all 64 files and aggregate the results. Alternatively, you could record a new time series called webserver01.sys.cpu.user.all that represents the same aggregate, but you must now write '64 + 1' different time series. What if you had a thousand web servers and you wanted the average cpu time for all of your servers? You could craft a wild card query like *.sys.cpu.*.user and the system would open all 64,000 files, aggregate the results, and return the data. Or you could set up a process to pre-aggregate the data and write it to webservers.sys.cpu.user.all.


OpenTSDB handles things a bit differently by introducing the idea of 'tags'. Each time series still has a 'metric' name, but it's much more generic, something that can be shared by many unique time series. Instead, the uniqueness comes from a combination of tag key/value pairs that allows for flexible queries with very fast aggregations.


Note

Every time series in OpenTSDB must have at least one tag.


Take the previous example where the metric was webserver01.sys.cpu.0.user. In OpenTSDB, this may become sys.cpu.user host=webserver01, cpu=0. Now if we want the data for an individual core, we can craft a query like sum:sys.cpu.user{host=webserver01,cpu=42}. If we want all of the cores, we simply drop the cpu tag and ask for sum:sys.cpu.user{host=webserver01}. This will give us the aggregated results for all 64 cores. If we want the results for all 1,000 servers, we simply request sum:sys.cpu.user. The underlying data schema will store all of the sys.cpu.user time series next to each other so that aggregating the individual values is very fast and efficient. OpenTSDB was designed to make these aggregate queries as fast as possible since most users start out at a high level, then drill down for detailed information.

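The three query shapes above (one core, one host, the whole fleet) map directly onto the JSON body of the HTTP /api/query endpoint: dropping a tag widens the aggregation. A sketch (the start time "1h-ago" is an arbitrary example):

```python
import json

def build_query(aggregator, metric, tags=None, start="1h-ago"):
    """Build the JSON body for OpenTSDB's HTTP /api/query endpoint.

    Omitting a tag widens the aggregation: without 'cpu' all cores of a
    host are aggregated, and without 'host' as well, all servers are.
    """
    q = {"aggregator": aggregator, "metric": metric}
    if tags:
        q["tags"] = dict(tags)
    return {"start": start, "queries": [q]}

# One core on one host -> sum:sys.cpu.user{host=webserver01,cpu=42}
one_core = build_query("sum", "sys.cpu.user", {"host": "webserver01", "cpu": "42"})

# All 64 cores on one host -> sum:sys.cpu.user{host=webserver01}
one_host = build_query("sum", "sys.cpu.user", {"host": "webserver01"})

# Every server -> sum:sys.cpu.user
fleet = build_query("sum", "sys.cpu.user")

print(json.dumps(fleet))
```

The drill-down pattern described above is just the reverse direction: start with the tag-less fleet query, then add tags back one at a time.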

Aggregations

While the tagging system is flexible, some problems can arise if you don't understand how the querying side of OpenTSDB works, hence the need for some forethought. Take the example query above: sum:sys.cpu.user{host=webserver01}. We recorded 64 unique time series for webserver01, one time series for each of the CPU cores. When we issued that query, all of the time series for metric sys.cpu.user with the tag host=webserver01 were retrieved, summed, and returned as one series of numbers. Let's say the resulting sum was 50 for timestamp 1356998400. Now say we were migrating from another system to OpenTSDB and had a process that pre-aggregated all 64 cores so that we could quickly get the aggregate value, and simply wrote a new time series sys.cpu.user host=webserver01. If we run the same query, we'll get a value of 100 at 1356998400. What happened? OpenTSDB aggregated all 64 time series and the pre-aggregated time series to get to that 100. In storage, we would have something like this:


sys.cpu.user host=webserver01        1356998400  50
sys.cpu.user host=webserver01,cpu=0  1356998400  1
sys.cpu.user host=webserver01,cpu=1  1356998400  0
sys.cpu.user host=webserver01,cpu=2  1356998400  2
sys.cpu.user host=webserver01,cpu=3  1356998400  0
...
sys.cpu.user host=webserver01,cpu=63 1356998400  1

OpenTSDB will automatically aggregate all of the time series for the metric in a query if no tags are given. If one or more tags are defined, the aggregate will 'include all' time series that match on those tags, regardless of other tags. With the query sum:sys.cpu.user{host=webserver01}, we would include sys.cpu.user host=webserver01,cpu=0 as well as sys.cpu.user host=webserver01,cpu=0,manufacturer=Intel, sys.cpu.user host=webserver01,foo=bar, and sys.cpu.user host=webserver01,cpu=0,datacenter=lax,department=ops.

The moral of this example is: be careful with your naming schema.

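The double-counting can be reproduced with a toy matcher over a hypothetical 4-core host (the per-core values are made up so they sum to 50, mirroring the example above):

```python
# Hypothetical 4-core host instead of 64 cores, to keep the numbers small.
# Each row: (tags, value) for metric sys.cpu.user at one timestamp.
rows = [
    ({"host": "webserver01"}, 50),               # pre-aggregated series
    ({"host": "webserver01", "cpu": "0"}, 12),
    ({"host": "webserver01", "cpu": "1"}, 10),
    ({"host": "webserver01", "cpu": "2"}, 15),
    ({"host": "webserver01", "cpu": "3"}, 13),
]

def query_sum(filter_tags, rows):
    """Sum every series whose tags contain all of filter_tags, regardless
    of any other tags the series carries -- the 'include all' matching
    described above."""
    return sum(
        value
        for tags, value in rows
        if all(tags.get(k) == v for k, v in filter_tags.items())
    )

# The per-core series alone sum to 50, but {host=webserver01} also
# matches the pre-aggregated row, so the result doubles.
print(query_sum({"host": "webserver01"}, rows))              # -> 100
print(query_sum({"host": "webserver01", "cpu": "0"}, rows))  # -> 12
```

Writing the pre-aggregated series under a different metric name (or a distinguishing tag) would keep it out of the per-core query's match set.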

References

1. http://opentsdb.net/docs/build/html/user_guide/writing.html

Date: 2024-10-17 06:22:16
