Visualize real-time data streams with Gnuplot

(September 2008)

For the last couple of years, I've been working on European Space Agency (ESA) projects - writing rather complex code generators. In the ESA project I am currently working on, I am also the technical lead; and I recently faced the need to (quickly) provide real-time plotting of streaming data. Being a firm believer in open source, after a little Googling I found Gnuplot. From my (somewhat limited) viewpoint, Gnuplot appears to be the LaTeX equivalent in the world of graphs: amazing functionality that is also easily accessible. Equally important, Gnuplot follows the powerful paradigm that UNIX established: it comes with an easy-to-use scripting language, thus allowing its users to prescribe actions and "glue" Gnuplot together with other applications - and form powerful combinations.

To that end, I humbly submit a little creation of mine: a Perl script that spawns instances of Gnuplot and plots streaming data in real-time.


Plotting data in real-time

Interfacing over standard input

My coding experience has taught me to strive for minimal and complete interfaces: to that end, the script plots data that will arrive over the standard input, one sample per line. The samples are just numbers (integers / floating point numbers), and must be prefixed with the stream number ("0:", "1:", etc). Each plot window will also be configured to display a specific number of samples.
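To make the input format concrete, here is a minimal sketch of a filter that accepts only well-formed "stream:value" lines. The helper name parse_samples and its validation rules are my own assumptions for illustration; the article does not show how driveGnuPlots.pl itself validates its input.

```shell
# Hypothetical helper (not part of driveGnuPlots.pl): keep only lines of the
# form "<stream index>:<number>" and print them as "<stream> <value>" pairs.
parse_samples() {
    awk -F: '
        # Exactly two fields: a non-negative integer and a decimal number
        # (optionally signed, optionally with an exponent).
        NF == 2 && $1 ~ /^[0-9]+$/ && $2 ~ /^[-+]?([0-9]+\.?[0-9]*|\.[0-9]+)([eE][-+]?[0-9]+)?$/ {
            print $1, $2
        }
    '
}

# Demo: the malformed line "garbage" is silently dropped.
printf '0:0.5\n1:-0.25\ngarbage\n0:1e-3\n' | parse_samples
```

Dropping malformed lines instead of aborting is a design choice that suits a long-running producer, where one stray line should not kill the plot.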

The resulting script is relatively simple - and easy to use:

bash$ ./driveGnuPlots.pl

Usage: ./driveGnuPlots.pl <options>
where options are (in order):

NumberOfStreams                         How many streams to plot (windows)
Stream1_WindowSampleSize <Stream2...>   This many window samples for each stream
Stream1_Title <Stream2_Title> ...       Title used for each stream
(Optional) Stream1_geometry <...>       Sizes and positions in pixels

The last parameters (the optionally provided geometries of the gnuplot windows)
are of the form:
  WIDTHxHEIGHT+XOFF+YOFF
  

Note that the script uses the "autoscale" feature of GnuPlot, to automatically adapt to the incoming value ranges.

An example usage scenario: plotting sine and cosine

Let's say we want to see a sine and a cosine run side-by-side, in real-time. We also want to watch the cosine "zoomed out" by 10x (time-scale wise), i.e. over a window ten times wider. The following code will print our test samples:

#!/usr/bin/perl -w
use strict;

use Time::HiRes qw/sleep/;

# First, set the standard output to auto-flush
select((select(STDOUT), $| = 1)[0]);

# And loop 5000 times, printing values...
my $offset = 0.0;
while(1) {
    print "0:".sin($offset)."\n";
    print "1:".cos($offset)."\n";
    $offset += 0.1;
    if ($offset > 500) {
        last;
    }
    sleep(0.02);
}

We'll use this code to test our plotting script: the data for two streams (sine and cosine) are printed in the expected format, one sample (one number) per line. To distinguish between the two streams, each sample is prefixed with "0:", "1:", etc. Notice that we explicitly set the autoflush flag for our standard output: we need the data output to be unbuffered, otherwise our plotting script will receive data in bursts (whenever the producer's buffer is flushed), and the plots will "jerk" forward.

This is how we test the plotting script (assuming we saved the sample code above in sinuses.pl):

bash$ ./sinuses.pl | ./driveGnuPlots.pl 2 50 500 "Sine" "Cosine"

To stop the plotting, use Ctrl-C on the terminal you spawned from.

The parameters we passed to driveGnuPlots.pl are:

  • 2 is the number of streams
  • The window for the first stream (sine) will be 50 samples wide
  • The window for the second stream (cosine) will be 500 samples wide (hence the different "zoom" factor)
  • The titles of the two streams follow
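The per-stream window width can be read as "keep only the most recent N samples of that stream". The following is a minimal sketch of that idea using a ring buffer in awk; the function name window_tail is hypothetical, and the real script may manage its buffers differently.

```shell
# Hypothetical sketch: print the last N samples of one stream, each with its
# running sample index, i.e. what an N-sample-wide window would be showing
# once the input ends.
window_tail() {   # usage: window_tail STREAM N
    awk -F: -v stream="$1" -v n="$2" '
        # Store each matching sample in a ring buffer of size n.
        $1 == stream { buf[count % n] = $2; count++ }
        END {
            start = (count > n) ? count - n : 0
            for (i = start; i < count; i++) print i, buf[i % n]
        }
    '
}

# Demo: stream 0 receives 1, 2, 3, 4; a 3-sample window keeps 2, 3, 4.
printf '0:1\n1:9\n0:2\n0:3\n0:4\n' | window_tail 0 3
```

A wider window keeps more history on screen, which is why the 500-sample cosine looks compressed along the time axis compared to the 50-sample sine.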

When executed, the script spawns one gnuplot instance per stream, and displays the graphs in a clear, flicker-free manner. If you don't like the Gnuplot settings I used (e.g. the grid, or the colors, or...) feel free to change them: the setup code that defines the plotting parameters starts at line 82 of the script.
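For readers who have not driven gnuplot through a pipe before, here is a rough sketch of the kind of command stream involved. It relies on gnuplot's inline-data feature: plot "-" reads data from the lines that follow, up to a line containing only "e". The function name emit_plot and the particular settings are assumptions for illustration, and are not taken from driveGnuPlots.pl.

```shell
# Hypothetical sketch: turn "x y" pairs on standard input into a block of
# gnuplot commands on standard output.
emit_plot() {   # usage: emit_plot TITLE
    printf 'set grid\n'
    printf 'set autoscale y\n'
    # plot "-" tells gnuplot to read the data inline, up to a lone "e".
    printf 'plot "-" using 1:2 with lines title "%s"\n' "$1"
    cat
    printf 'e\n'
}

# Demo: print the commands instead of sending them to gnuplot.
printf '1 2\n2 3\n3 4\n' | emit_plot "Sine"
```

To see an actual window, the same output can be piped into gnuplot, e.g. by appending "| gnuplot -persist" to the demo line. A real-time plotter would repeat such a block for every new sample, sending it to a gnuplot process that stays open.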

Executive summary: plotting streaming data is now as simple as selecting them out from your "producer" program (filtering its standard output through any means you wish: grep, sed, awk, etc), and outputting them, one number per line. Just remember to prefix each sample with the stream number ("0:", "1:", etc, to allow for multiple streams), and make sure you flush your standard output.

For example, for this kind of output:

    bash$ /path/to/programName
    ...(other stuff)
    Measure:   7987.3
    ...(other stuff)
    Measure:   8364.4
    Measure:   8128.1
    ...

You would do this:

    bash$ /path/to/programName | \
        grep --line-buffered '^Measure:' | \
        awk -F: '{printf("0:%f\n", $2); fflush();}' | \
        driveGnuPlots.pl 1 50 "My data"

In the code above, grep keeps only the lines that start with "Measure:", and awk selects the 2nd column ($2) and prefixes it with "0:" (since this is the 1st - and only, in this example - stream we will display). Notice that we used the proper options to force flushing of the standard output for both grep (--line-buffered) and awk (fflush() called).
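The same approach extends to several streams: map each kind of line to its own stream number. The sketch below assumes a producer that prints hypothetical "Temp:" and "Pressure:" lines; those labels and the function name label_to_streams are invented for illustration.

```shell
# Hypothetical sketch: route two kinds of labelled lines to streams 0 and 1,
# flushing after every sample so the plots do not receive data in bursts.
label_to_streams() {
    awk -F: '
        $1 == "Temp"     { printf("0:%f\n", $2); fflush() }
        $1 == "Pressure" { printf("1:%f\n", $2); fflush() }
    '
}

# Demo: unrelated lines are ignored, labelled ones get a stream prefix.
printf 'Temp: 21.5\nboot ok\nPressure: 1013.2\n' | label_to_streams
```

Its output could then be piped into something like ./driveGnuPlots.pl 2 50 50 "Temperature" "Pressure", following the usage shown earlier.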

Preparing for a demo

You don't want to move the GnuPlot windows after they are shown, do you? So you can just specify their placement, in "WIDTHxHEIGHT+XOFF+YOFF" format (in pixels):

bash$ ./sinuses.pl | ./driveGnuPlots.pl 2 50 50 Sinus Cosinus 512x384+0+0 512x384+512+0

Being able to specify titles and GnuPlot window placement makes the script well-suited for live demonstrations.

P.S. UNIX power in all its glory: it took me 30 minutes to code this, and another 30 to debug it. Using pipes to spawned copies of gnuplot, we are able to do something that would require one or maybe two orders of magnitude more effort in any conventional programming language (yes, even accounting for custom graph libraries - you do have to learn their API and do your windows/interface handling...)

Update, November 30, 2009: Andreas Bernauer has improved the script further, allowing multiple streams to be plotted in the same window. His work is here.

Update, December 20, 2009: Dima Kogan has done his own version, which detects the number of streams dynamically. He placed his code on GitHub.
