actor concurrency

The hardware we rely on is changing rapidly as ever-faster chips are replaced by ever-increasing numbers of cores. As a result, concurrency and parallelism, niche features today, will soon be a basic requirement for most software. Application developers seeking concurrency solutions have discovered that shared-state concurrency (employed by Java and other imperative languages) comes up short when compared to functional models that promise better reliability and scalability. In this two-part article Alex Miller introduces the actor concurrency model, which has been popularized by Erlang and more recently adopted by Scala. In Part 1 you‘ll be introduced to the actor model and its benefits; in Part 2 you‘ll learn how to use it on the JVM. Level: Intermediate

As programmers we know that our software lives on top of hardware, which for decades has relied on chips getting faster at an unrelenting pace. Moore‘s Law famously states that the number of transistors on a chip will double approximately every 18 months. This exponential law has held true for nearly four decades. and has even exceeded that pace. Intel chips had a few thousand transistors in the early 1970s. The chip in the computer you‘re reading this article with probably has hundreds of millions of transistors, and newer quad-core chips have several billion.

Until recently, the increase in chip counts (and reduction in size) has made chips faster, increased the number of pipelines, and dramatically increased the number and size of on-chip caches. But hardware crossed a boundary in the early 2000s: chips got big enough and cycle speed got fast enough that a signal can no longer reach the whole chip in a clock cycle -- a physical and architectural limit that will be difficult to breach. Yet the transistors per chip continue their unrelenting exponential march. Instead of increased speed per core, we are now seeing an increase in the number of cores per chip. CPU speeds are likely to stay relatively flat or possibly even decrease in the near future in the service of reduced heat and power consumption (see Figure 1). Themulticore era is well under way, and it will change the way we write programs -- a reality that many software developers have yet to face.

Figure 1. Transistor and MIPS (millions of instructions per second) trends over time. Click to enlarge.

Right now you probably use a dual-core machine for day-to-day work, or maybe a quad-core if you‘re lucky. You might be deploying your server application to an 8-core machine. The concurrency model that most popular languages use now -- shared-state concurrency -- can make good use of multiple cores and result in fast and efficient software. But if the number of cores doubles every 18 months, in a decade we will be dealing with thousands of cores. (This seems ridiculous, but exponential curves always seem ridiculous to our feeble wetware. Forty years ago, the growth in transistor count to today‘s levels seemed ridiculous and unsustainable.) The prospect of multicore systems of this magnitude threatens the viability of the shared-state concurrency model. This article introduces you to a robust alternative -- the actor concurrency model -- and explains how it is implemented in a 20+-year-old yet increasingly relevant functional language: Erlang.

Shared-state concurrency

The most popular languages today rely on mutable shared state and locks as a model of concurrency. A typical example of multithreaded code is the Counter class in Listing 1.

Listing 1. A typical synchronized Counter class in Java

public class Counter {
  private int count = 0;

  public synchronized int increment() {
    return ++count;
  }
}

Counter is a thread-safe class that maintains a counter (the shared state) and protects access to that state by allowing only one thread at a time to read or modify that state. In Listing 1 that protection is provided by the synchronized block, but it could also be provided by explicit Lock objects.

If you‘ve written programs with shared-state concurrency, you‘ve likely encountered the legions of dragons that can arise. If mutable state is not modified and read under the scope of a lock, you might not see the latest version of the value from another thread. Or threads can obtain multiple locks in differing orders, leading to deadlock. Or the necessity for locking can lead to lock contention.

Perhaps the most difficult aspect of concurrent programming is that it‘s quite easy to combine several properly written concurrent components and thereby produce a system that either is broken or performs poorly. The composite system can be broken if the independent components modify state under different independent locks but the program requires atomic operations that modify state in both components. You can address this problem in a variety of ways with careful architecture, by exposing the locking semantics of the components, or by adding new layers of locks over the components. However, most of the easiest ways to deal with these issues result in poor performance or unmaintainable code.

Whether shared-state concurrency can scale is an equally important question. Will it scale to 16 cores? 32? 1000? Some parts of your program will naturally scale up. If you‘ve got a thread pool working on a set of parallel tasks, it‘s entirely feasible to scale that to a larger number of cores. But in general, much of your program does not consist of discrete tasks that can be easily made parallel.Amdahl‘s Law is commonly used to calculate the maximum benefit from increasing the parallel parts of your program. But the sad fact remains that as you scale, the non-parallel parts of your program are the ones that matter, and you can‘t parallelize them away.

Actor concurrency

So what‘s the alternative? The actor model of concurrency has recently gained prominence, largely thanks to some success achieved by its flagship language,Erlang.

The actor model consists of a few key principles:

  • No shared state
  • Lightweight processes
  • Asynchronous message-passing
  • Mailboxes to buffer incoming messages
  • Mailbox processing with pattern matching

Let‘s look at these principles in more detail. An actor is a process that executes a function. Here a process is a lightweight user-space thread (not to be confused with a typical heavyweight operating-system process). Actors never share state and thus never need to compete for locks for access to shared data. Instead, actors share data by sending messages that are immutable. Immutable data cannot be modified, so reads do not require a lock.

Messages are sent asynchronously and are buffered in an actor‘s mailbox. A mailbox is essentially a queue with multiple producers (other actors) and a single consumer. A particular actor is driven by receiving messages from the mailbox based on pattern matching.

Some key assumptions are built into this model:

  • Processes are cheap (in memory) and fast to create
  • Processes are small and thus large numbers of processes can be created
  • Processes are not bound to kernel threads and can be efficiently scheduled in user space
  • The scheduler requires the ability to pause and continue process execution
  • Messages can be retrieved by efficiently pattern-matching messages in the mailbox

Erlang

The actor model has most famously been associated with Erlang, a functional, dynamically typed language invented in 1986 at Ericsson. Erlang was designed for creating applications (such as telephone switches) that must run nonstop. That requires the code to be hot-swappable, distributed, and robust in the face of errors. Erlang was open-sourced in 1998 and has gained prominence in the last few years as concurrency has become a concern. A complete overview of Erlang is outside this article‘s scope, but I‘ll give you a quick introduction to the language‘s key aspects.

Erlang programs are compiled and run in a virtual machine, but you can also use an interactive shell to explore the basics. Erlang variables, which start with a capital letter or a _ character, can be assigned only once, unlike in most other languages. Listing 2 shows an example trace in the Erlang shell with a Value variable.

Listing 2. Erlang variables

1> Value = 4.
4
2> Value.
4
3> Value = 6.
** exception error: no match of right hand side value 6

In this trace you can see expressions being entered on the lines starting with 1>2>, and 3> and the response from the shell on the following lines. Line 1 assigns the value 4 to the Value variable. Line 2 returns the value of Value. Line 3 demonstrates that Value can‘t be reassigned to a new value because it is already bound to the value 4.

Actually, = is not an assignment operator at all in Erlang. It is really a pattern-matching operator. In the case of an unbound variable on the left-hand side, the pattern is unmatched and will bind a value from the right-hand side. But that‘s just the beginning of what = can do.

Words starting with lowercase letters are atoms -- fixed symbols that always represent a constant, number, or string of the same name, even across machines. Atuple is a fixed-size set of heterogenous values and is specified with { }. Here we match a variable with a tuple of atoms:

Stooges = {larry, curly, moe}.

You can also put tuples on the left side and use variables and the pattern-matching =to extract values from a tuple on the right side:

{Stooge1, Stooge2, Stooge3} = {larry, curly, moe}.

Here the variables Stooge1Stooge2, and Stooge3 match the atoms larrycurly, and moe.

Lists contain a variable number of elements and are represented in [ ]. The | (pipe) is used to separate the head element from the rest of the list (similar to car and cdr in Lisp). In Listing 3 we build a list and then extract the list‘s head and tail with pattern matching.

Listing 3. Lists

1> List=[1,2,3].
[1,2,3]
2> [First | Rest]=List.
[1,2,3]
3> First.
1
4> Rest.
[2,3]

Functions are first-class values, as you would expect in a functional language. They are defined by a function name and the number of parameters, using a simple syntax. Here we match the anonymous function to the Square variable. You can then execute it or even create functions that return functions, like the TimesN function in Listing 4.

Listing 4. Functions

1> Square = fun(X) -> X*X end.
#Fun<erl_eval.6.13229925>
2> Square(5).
25.
3> TimesN = fun(N) -> (fun(X) -> N*X end)
3> end.
#Fun<erl_eval.6.13229925>
4> lists:map(TimesN(10), [1,2,3]).
[10,20,30]

At the end of Listing 4 you can see a call to the function mapin one of the most important Erlang modules: lists.

Normally you define Erlang code in source files with the .erlextension. Each source file can declare a module and which functions are exported or imported to that module. For example, Listing 5 is a definition of the mymath module, which has two functions -- square and fib.

Listing 5. The mymath.erl module

-module(mymath).
-export([square/1,fib/1]).

square(Value) -> Value*Value.

fib(0) -> 0;
fib(1) -> 1;
fib(N) when N>1 -> fib(N-1) + fib(N-2).

In Listing 5, fib is defined by three different patterns, and the runtime will decide which pattern to call as appropriate. The when ... is a guard clause that can constrain when a pattern is applicable.

Listing 6 shows how to compile and load this module in the shell.

Listing 6 Compiling a module and executing its functions

1> c(mymath.erl).
2> mymath:square(5).
25
3> mymath:fib(7).
13

Actors in Erlang

Now that you‘re at least a little bit comfortable with Erlang, we can begin looking at actors. Three basic elements in Erlang form the foundation for concurrency. First, the built-in spawn function creates a new process executing a function and returns the new process‘s process identifier. Second is a syntax for sending a message to a process with the ! operator. And finally, actors can use a receive...end function to pattern-match messages from the actor‘s mailbox.

To see how these pieces fit together, we‘ll create a process that does conversions between Celsius and Fahrenheit temperatures. First we must define the function that the process will run, as shown in Listing 7.

Listing 7. temperature.erl

temperatureConverter() ->
 receive
 {toF, C} ->
 io:format("~p C is ~p F~n", [C, 32+C*9/5]),
 temperatureConverter();
 {toC, F} ->
 io:format("~p F is ~p C~n", [F, (F-32)*5/9]),
 temperatureConverter();
 {stop} ->
 io:format("Stopping~n");
 Other ->
 io:format("Unknown: ~p~n", [Other]),
 temperatureConverter()
 end.

This function receives messages in the form of three possible tuple formats. The first format is {toF, C}. In this tuple, the first element is the toF atom, and the second is the temperature in Celsius. In the code the variable C is matched to the value. On match, it prints the answer and calls back to this function again. This is a tail-recursive call. Erlang actor functions often follow this idiom of matching an incoming message, then making a tail-recursive call back to the same function. State can be maintained in the actor by passing it in a function parameter and modifying it on the recursive call.

The {stop} message is a special message to stop the process. The last pattern is just an Other variable that maps to anything, prints an error message, and restarts.

You‘ve no doubt noticed that depending on the circumstance, clauses and functions are separated by , (comma), ;(semicolon), and . (period). In Erlang, these punctuation marks serve much the same purpose as their use in English. The comma separates phrases (often subsequent expressions to evaluate). The semicolon indicates a related but separate phrase. The period terminates a complete statement.

We can then spawn a process to run this function and send messages to it, as shown in Listing 8.

Listing 8. Running the temperature converter process

1> c(temperature).
{ok,temperature}
2> Pid = spawn(fun temperature:temperatureConverter/0).
<0.128.0>
3> Pid ! {toC, 32}.
32F is 0.0 C
{convertToC,32}
4> Pid ! {toF, 100}.
100C is 212.0 F
{convertToF,100}
5> Pid ! {stop}.
Stopping
{stop}

In this trace in the shell, we load and compile the temperature module, then spawn a process running the temperatureConverter function. In return we get the process ID of the converter process. We can then send it messages using the ! operator.

RPC in Erlang

In the example in Listing 7 we simply printed the answer in the process, but it‘s probably far more useful to get a response to our question. A remote procedure call (RPC) is easy to build by simply including the process identifier of the source as part of the message. We must rework our converter function to expect a source identifier and to respond with a message instead of printing, as shown in Listing 9.

Listing 9. Reworking the converter for RPC

temperatureConverter() ->
 receive
 {From, {toF, C}} ->
 From ! {self(), 32+C*9/5},
 temperatureConverter();
 ...etc

We can then add some functions to hide the spawn and the remote call behind a simpler interface, as shown in Listing 10.

Listing 10. Wrapping the RPC

start() ->
 spawn(fun() -> temperatureConverter() end).

convert(Pid, Request) ->
 Pid ! {self(), Request},
 receive
 {Pid, Response} -> Response
 end.

The start() function spawns the converter process and returns its process identifier. The convert function uses the process identifier to call the converter process, sending the current identifier as the source, then blocks in a receivewaiting for a response on the current‘s process mailbox, which is subsequently returned.

These are just the most basic ways to use actors in Erlang. Erlang also makes it easy to handle errors: instead of having a process manage its own errors, it‘s more common to create a new process (they‘re cheap, right?) to watch the original process and deal with an error. Many patterns, such as propagating the error or restarting the process, are possible. In fact, Erlang provides a whole library that deals with patterns of error handling, hot-swapping code, and other useful functionality.

Does it work?

Erlang wouldn‘t be receiving attention today without real-life success stories. At Ericsson, the AXD 301 switch is the flagship example of Erlang‘s scalability and reliability. This switch handles millions of calls per week and has a reported 99.9999999 percent uptime. Once Erlang was open-sourced, it was picked up by Ericsson‘s competitor Nortel and integrated into the Nortel Alteon SSL Accelerator, which does hundreds of transactions per second.

Because of Erlang‘s focus on messaging, it seems to be a good fit for these kinds of systems. The ejabberd daemon is a highly-regarded Jabber/XMPP implementation that has been used in some large implementations such as jabber.org. Similarly,RabbitMQ is a high-performance AMQP server written in Erlang that can reportedly handle 400,000 messages per second.

SIMILAR ARTICLES

Several so-called Web 2.0 companies have also had success building parts of their systems in Erlang. Facebook Chatuses a combination of C++ and an Erlang Web server to handle 70,000 concurrent users. Delicious 2.0 relaunched in 2008 and is primarily based on C++ but uses Erlang in several subsystems such as data migrations, rolling migrations, and algorithmic analysis. CouchDB, which has gotten a lot of press lately as a document-centric Web database, is written entirely in Erlang. Reportedly the Amazon SimpleDB service also uses Erlang internally.

In short, Erlang can boast a number of high-profile successes, and these users consistently tout Erlang‘s strengths for writing distributed, concurrent, fault-tolerant software. It‘s clear that the actor model is a valid alternative for addressing concurrency without shared state and locks.

You may be thinking, "This sounds cool, but there‘s no way my manager will let me start deploying software in Erlang." Don‘t you wish you could use the actor model on the JVM? Well, stay tuned for Part 2.

Alex Miller is a tech lead at Terracotta Inc, the makers of the open-source Java clustering product Terracotta. Previously, Alex worked at BEA Systems on the AquaLogic product line and was Chief Architect at MetaMatrix. Alex blogs at Pure Danger Tech and speaks at user groups, conferences, and the No Fluff Just Stuff tour. He has an enduring love for his wife and three children, music, and nachos.

时间: 2024-11-08 21:45:06

actor concurrency的相关文章

NHibernate.3.0.Cookbook第一章第六节Handling versioning and concurrency的翻译

NHibernate.3.0.Cookbook第一章第六节Handling versioning and concurrency的翻译 第一章第二节Mapping a class with XML第一章第三节Creating class hierarchy mappings第一章第四节Mapping a one-to-many relationship第一章第五节Setting up a base entity class Handling versioning and concurrency

Concurrency vs. Parallelism

http://getakka.net/docs/concepts/terminology Terminology and Concepts In this chapter we attempt to establish a common terminology to define a solid ground for communicating about concurrent, distributed systems which Akka.NET targets. Please note th

actor

_timcharper1月 20 02:18 @timcharper , so what app the actor model can help, and what app the thread modle can help, could you please give me some advice? say, a heavy traffic app, for example a instant message server is ok using actor model, bussy log

C++多线程开发之actor model

最近想把写过的一个多线程程序整理一下,这个程序主要特点是有一系列的互相之间有依赖关系的task.于是在网上找相关类库 1,一类是简单的线程池了,这也是原本俺的做法.之前使用的是手工调度,代码实现的很蛋疼.外面的lib有poco https://pocoproject.org/slides/130-Threads.pdf2,Intel TBB, MS PPL (Parallel Patterns Library)之类的类库,感觉这里一类本质上和1没有大的分别3,微软的并行库1)MS PPL (Pa

Akka(1):Actor - 靠消息驱动的运算器

Akka是由各种角色和功能的Actor组成的,工作的主要原理是把一项大的计算任务分割成小环节,再按各环节的要求构建相应功能的Actor,然后把各环节的运算托付给相应的Actor去独立完成.Akka是个工具库(Tools-Library),不是一个软件架构(Software-Framework),我们不需要按照Akka的框架格式去编写程序,而是直接按需要构建Actor去异步运算一项完整的功能,这样让用户在不知不觉中自然的实现了多线程并发软件编程(concurrent programming).按这

基于Actor的并发方案

共享可变状态的问题 Actor模型 Actor系统 定义Actor 消息处理 副作用 非类型化 异步和非阻塞 创建一个Actor 发送消息 消息应答 问询Ask机制 有状态的Actor 小结 译者注: 本文原文标题:<The Neophyte's Guide to Scala Part 14: The Actor Approach to Concurrency>,作者:Daniel Westheide, 原文链接:http://danielwestheide.com/blog/2013/02/

Akka 3 Akka actor 与 scala actor的不同

经常有人问: Akka的Actor和Scala的Actor有什么不同?这里的回答是,从actor 模型角度讲,没什么不同,它们都实现了actor model. Akka actors and Scala actors are two implementations of that model. All Actor model says that your concurrency primitives are actors, which can: receive a message and dec

Actor模型

[原文链接] Actor模式是一个解决分布式计算的数学模型,其中Actor是基础,它能回应接收到消息,能够自我决策,创建更多的Actor,发送更多的消息,决定 如何回应下一个接收到的消息.Actor认为一切皆是Actor,类似于面向对象认为一切皆Object一样.OO的执行是顺序的,Actor模型内在设 计就是并行的. Actor是计算实体,它回复接收到的消息,能够并行的: 1. 发生有限的消息给其他Actor 2. 创建有限数目的新Actor 3. 指定小一个消息到达时的行为 这些操作并没有顺

一 Akka学习 - actor

(引用 http://shiyanjun.cn/archives/1168.html) 一: 什么是Akka? Akka是JAVA虚拟机JVM平台上构建高并发.分布式和容错应用的工具包和运行时,是一个框架.Akka用Scala语言写成,同时提供了Scala和JAVA的开发接口. Akka处理并发的方法基于Actor模型.在Akka里,Actor之间通信的唯一机制就是消息传递. 二: 什么是Actor? 维基百科这样定义Actor模型: 在计算科学领域,Actor模型是一个并行计算(Concurr