Async IO

I was recently reading a series on “Write Sequential Non-Blocking IO Code With Fibers in NodeJS” by Venkatesh.

Venki was essentially trying to emphasize that writing non-blocking code in NodeJS (whether via callbacks or promises) can get hairy really fast. This code demonstrates that aptly:

var express = require('express');
var client = require('./redis-client'); // callback-style redis client
var app = express();

app.get('/users/:id', function(req, res) {
  var id = req.params.id;
  var key = 'user:' + id;
  client.get(key, function(err, reply) {
    if (err !== null) {
      res.send(500);
      return;
    }

    if (reply === null) {
      res.send(404);
      return;
    }

    res.send(200, {id: id, name: reply});
  });
});

The exact code is available on GitHub (so is the promises-driven version, but I won’t bother inlining it.)

What we actually wanted to write (if it were possible) was:

var express = require('express');
var client = require('./redis-client-fiber'); // fiber-aware wrapper, shown below (path is hypothetical)
var app = express();

app.get('/users/:id', function(req, res) {
  var id = req.params.id;
  var key = 'user:' + id;

  try {
    var reply = client.get(key); // looks blocking, yields the fiber underneath
    if (reply === null) {
      res.send(404);
      return;
    }

    res.send(200, {id: id, name: reply});
  }
  catch (err) {
    res.send(500);
  }
});

The magic would happen in the client.get(key) call inside the try block. Instead of having to provide a cascade of callbacks (what if we wanted to do another lookup after we got the value back from the first?), we could just write the calls serially, one after the other.

Well. Apparently we can!

Fibers

A fiber is a particularly lightweight thread of execution. Like threads, fibers share address space. However, fibers use co-operative multitasking while threads use pre-emptive multitasking. Threads often depend on the kernel’s thread scheduler to preempt a busy thread and resume another thread; fibers yield themselves to run another fiber while executing.
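
Concretely, with the node-fibers package (the same one used in the wrapper below), a fiber only runs when someone calls run() on it, and it suspends itself with yield(). A tiny sketch of that handshake:

var Fiber = require('fibers');

var fiber = Fiber(function() {
  console.log('fiber: before yield');
  Fiber.yield();               // suspend; control returns to run()'s caller
  console.log('fiber: after yield');
});

fiber.run();                   // prints 'fiber: before yield', then returns
console.log('main: fiber yielded');
fiber.run();                   // resumes right after the yield()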

Fibers allow exactly this kind of black magic in NodeJS. It is still callbacks internally, but we are exposed to none of that in our application code. Sure, you will end up writing a bunch of wrappers (or have some tool generate them for you), but you get the sweet, sweet pleasure of writing async IO code without having to jump through all the hoops. This is what the wrapper code for the redis client looks like:

var Fiber = require('fibers');
var client = require('./redis-client');

// Must be called from within a running fiber.
exports.get = function(key) {
  var err, reply;
  var fiber = Fiber.current;

  client.get(key, function(_err, _reply) {
    err = _err;
    reply = _reply;
    fiber.run(); // resume the fiber paused at yield() below
  });

  // Suspend this fiber; control returns to whoever called run().
  Fiber.yield();

  if (err != null) {
    throw err;
  }

  return reply;
};

(the real code is here in case you are curious)
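
One important detail: since exports.get above calls Fiber.yield(), it must itself be invoked from inside a running fiber. A minimal sketch of one way to arrange that in Express (my assumption of the wiring; the actual repo may do this differently) is a middleware that wraps every request in a fiber:

var Fiber = require('fibers');

// Run each incoming request inside its own fiber, so that downstream
// handlers can call the yielding wrapper functions as if they were blocking.
app.use(function(req, res, next) {
  Fiber(function() {
    next();
  }).run();
});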

I liked how the code looked. Having survived a ‘promising’ node.js project, I was definitely curious about this new style. Maybe this can be the saving grace (before generators and yield take over the JS world) for real-world server-side JavaScript.

Fibers, you say

But the code (and the underlying technique that makes it tick) sounded very familiar: it reminded me of the technique Go uses to allow writing beautiful async IO code. For example, the same endpoint from above in Go:

m.Get("/users/:id", func(db *DB, params martini.Params) (int, []byte) {
  str := params["id"]
  id, err := strconv.Atoi(str)
  if err != nil {
    return http.StatusBadRequest, []byte{}
  }

  u, err := db.LoadUser(id)
  if err != nil {
    return http.StatusNotFound, []byte{}
  }
  return http.StatusOK, encoder.Must(enc.Encode(u))
})

Sure, there is a little more happening here (Go is statically typed), but it's the exact same thing as the fibers example, without all the manual wrapping. Any call which does IO (like the db.LoadUser(id) call) blocks the currently executing goroutine (which, like a fiber, is a lightweight thread.) The natural question to ask is: if the goroutine gets blocked, how do other requests get processed? It's quite simple, actually. The Go runtime schedules any other goroutine that is ready to run (i.e. whose IO call has completed) on the thread the blocked goroutine was running on.
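
To see that scheduling in action, here is a small self-contained sketch (my own illustration, not from the benchmark repo): ten thousand goroutines each make a "blocking" call, yet the whole batch finishes in roughly the time of a single call, because the runtime parks the waiters and reuses a handful of OS threads:

package main

import (
	"fmt"
	"sync"
	"time"
)

func main() {
	var wg sync.WaitGroup
	start := time.Now()
	for i := 0; i < 10000; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Stand-in for a blocking IO call: parks only this goroutine.
			time.Sleep(100 * time.Millisecond)
		}()
	}
	wg.Wait()
	fmt.Println("10000 blocking calls finished in", time.Since(start))
}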

Since goroutines are lightweight (the stack is just 4 KB in Go 1.3beta1, compared to the much larger ~2 MB of a typical thread stack), it is not unusual to have hundreds of thousands of goroutines actively running in a single process, all humming along together. Best of all, because the OS threads do less context switching (the same physical thread keeps running on the processor core; only the instruction pointer changes as goroutines shuffle in and out, much as in ordinary method calls), we extract a lot more efficiency from the same unit of hardware. Without this, IO calls that cause a thread to block and wait could cripple the system and bring it to its knees. Read this article for more context.

Performance

A fellow ThoughtWorker asked me, “Does performance matter when choosing a framework?”

I know where he was coming from, and how we shouldn’t make decisions purely based on performance (we would all be writing assembly if that were the case.) While it is true that as a startup (or even as a well established player) building the MVP and getting it to users is paramount, you really don’t want to face the situation where you suddenly have a huge influx of users (say it goes viral) and you are caught between a ROCK (scale horizontally by throwing compute units at the problem) and a HARD PLACE (rewrite the solution in a technology more amenable to scaling.) Both of these options are expensive, and either can be a deal breaker.

Therefore, provided everything else is more or less equal, choosing the more performant one is never a bad thing.

With this context, I decided to compare the two solutions for their performance, given that they looked more or less the same. I allowed the systems under test to use as many cores as they wanted, and then hit them with 100 concurrent users, each going full tilt for around 20 seconds (I used the awesome wrk tool for benchmarking.)
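
For reference, a wrk invocation matching that setup would look something like this (the port and URL are my assumptions; the exact command lives in bench.sh in the repo):

wrk -t4 -c100 -d20s http://localhost:3000/users/1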

The results:

Golang
  Stdlib      134566 (3.81 ms)
  Gorilla     125092 (4.28 ms)
  Martini      51330 (9.51 ms)

node.js
  Stdlib       54510 (7.78 ms)
  Callbacks*   36107 (10.84 ms)
  Fibers*      27372 (18.76 ms)
  Promises*    22665 (17.15 ms)

* The Callbacks, Fibers and Promises versions are created using Express. The Stdlib versions use the http support in the corresponding standard libraries.

All the numbers are in req/s as given by wrk (higher is better.) The latency details are in brackets (lower is better.) Clicking the numbers will take you to the corresponding code in the GitHub repo (the README has the detailed numbers.)

The tests were done on an updated Ubuntu 14.04 box with an Intel i7 4770 processor, 16 GB of RAM and an SSD.

As you can see, the fibers method of doing async IO in node.js comes with a perceptible loss in throughput compared to the pure callbacks-based approach, but fares better than the promises version in this micro-benchmark.

At the same time, the default way of doing IO in Golang does very well for itself. More than 134,000 req/s with a 3.81 ms 99th percentile latency. All this without having to go through crazy callbacks/promises hoops. How cool is that?

How the tests were run

Software versions

  • Go 1.3beta1
  • node.js 0.10.28
  • wrk 3.1.0

Command used to run

A more detailed description is available in the README, but here is the short version:

  • Start the program (by running, say, ./start_martini.sh)
  • Run the benchmark (by running ./bench.sh)
  • Record the result
  • Rinse and repeat 3 times; take the best run

Notes

  • All cores on the Intel i7 4770 were set to the performance governor
  • Redis was not tweaked
  • ulimit was not raised

Summary

This is part 1 in a multi-part series looking at how async IO (and async programming in general) is done in various languages/platforms. We will go in depth into one language/platform with every new article in the series. Future parts will look at Scala, Clojure, Java, C#, Python and Ruby based frameworks, and try to present a holistic view of the async world.

But one thing is very clear: async IO is here to stay. Not embracing it would be foolhardy given the need to stay lean. I hope these articles help you understand the gravity of the decision.

Some might argue that what we did in Golang was not really async, as the calls were blocking in nature. But the net result, and the reason Go is still able to provide awesome throughput despite blocking IO calls, is that the Go runtime does the heavy lifting for you. When one goroutine is busy waiting for the results of an IO call, other goroutines take its place and no CPU cycles are wasted. That this mechanism lets us get away with far fewer threads than would otherwise be required is the icing on top.
