Atomic operations on the x86 processors

On x86 processors, from Intel and AMD alike, more and more CPU cores or processors run in parallel.

In the old days when there was a single processor, the operation:

++i;

would be thread safe, because it compiled to one machine instruction on a single processor. These days even laptops have several CPU cores, so a single-instruction operation is no longer safe: another core can touch the same memory between the instruction's read and its write. What do you do? Do you need to wrap every operation in a mutex or semaphore? Well, maybe you don't need to.
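To see why the increment is unsafe on more than one core, it helps to split it into the load, add and store that the hardware performs. The sketch below replays one unlucky interleaving by hand, in a single thread, so the outcome is deterministic. The cpu_t type and the helper names are made up for illustration and model nothing beyond this one scenario.

```c
#include <assert.h>

/* Hypothetical model: each "CPU" has one private register and shares an int. */
typedef struct { int reg; } cpu_t;

static void cpu_load(cpu_t *c, const int *mem)  { c->reg = *mem; }
static void cpu_add1(cpu_t *c)                  { c->reg += 1; }
static void cpu_store(const cpu_t *c, int *mem) { *mem = c->reg; }

/* Both CPUs load the shared value before either one stores its result. */
int lost_update_demo(void)
{
    int i = 0;
    cpu_t a = {0}, b = {0};

    cpu_load(&a, &i);   /* A reads 0 */
    cpu_load(&b, &i);   /* B reads 0 */
    cpu_add1(&a);
    cpu_add1(&b);
    cpu_store(&a, &i);  /* i becomes 1 */
    cpu_store(&b, &i);  /* i is still 1: A's increment is lost */
    return i;
}

/* Same work, but each increment finishes before the next one starts. */
int serialized_demo(void)
{
    int i = 0;
    cpu_t a = {0}, b = {0};

    cpu_load(&a, &i); cpu_add1(&a); cpu_store(&a, &i);
    cpu_load(&b, &i); cpu_add1(&b); cpu_store(&b, &i);
    return i;
}
```

lost_update_demo() returns 1 even though two increments ran, while serialized_demo() returns 2. Making the three steps indivisible is what forces the second ordering.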

Fortunately, the x86 has an instruction prefix, lock, that lets a few memory-referencing instructions (inc, dec, add, sub and xchg among them) perform their read-modify-write on a specific memory location exclusively.

A few basic primitives can be built on it.

For the GNU compiler:

void atom_inc(volatile int *num)
{
    /* "+m": the memory operand is both read and written */
    __asm__ __volatile__ ("lock incl %0" : "+m" (*num) : : "cc", "memory");
}

void atom_dec(volatile int *num)
{
    __asm__ __volatile__ ("lock decl %0" : "+m" (*num) : : "cc", "memory");
}

/* Stores inval in *m and returns the previous value of *m. */
int atom_xchg(volatile int *m, int inval)
{
    int val = inval;
    /* xchg with a memory operand is always locked; the prefix is redundant but harmless */
    __asm__ __volatile__ ("lock xchgl %1,%0" : "+m" (*m), "+r" (val) : : "memory");
    return val;
}

void atom_add(volatile int *m, int inval)
{
    /* inval is input only, so it can be a register or an immediate */
    __asm__ __volatile__ ("lock addl %1,%0" : "+m" (*m) : "ir" (inval) : "cc", "memory");
}

void atom_sub(volatile int *m, int inval)
{
    __asm__ __volatile__ ("lock subl %1,%0" : "+m" (*m) : "ir" (inval) : "cc", "memory");
}
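If hand-written assembly is not required, the same five operations can be sketched with the __atomic builtins that GCC and Clang provide. This is an alternative, not the original author's code. It assumes one of those two compilers and uses the strongest memory order throughout. On x86 these calls typically compile to lock-prefixed instructions, although the exact instruction chosen is up to the compiler.

```c
#include <assert.h>

/* Assumes GCC or Clang: the __atomic builtins are compiler extensions. */

void atom_inc(volatile int *num)
{
    (void)__atomic_fetch_add(num, 1, __ATOMIC_SEQ_CST);
}

void atom_dec(volatile int *num)
{
    (void)__atomic_fetch_sub(num, 1, __ATOMIC_SEQ_CST);
}

/* Stores inval in *m and returns the value that was there before. */
int atom_xchg(volatile int *m, int inval)
{
    return __atomic_exchange_n(m, inval, __ATOMIC_SEQ_CST);
}

void atom_add(volatile int *m, int inval)
{
    (void)__atomic_fetch_add(m, inval, __ATOMIC_SEQ_CST);
}

void atom_sub(volatile int *m, int inval)
{
    (void)__atomic_fetch_sub(m, inval, __ATOMIC_SEQ_CST);
}
```

The builtins also keep the code usable on non-x86 targets, where the compiler chooses whatever atomic sequence the hardware offers.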


For the Microsoft compiler (32-bit x86 targets only, since its inline assembler is not available for x64):

void atom_inc(volatile int *num)
{
    __asm
    {
        mov esi, num
        lock inc DWORD PTR [esi]
    }
}

void atom_dec(volatile int *num)
{
    __asm
    {
        mov esi, num
        lock dec DWORD PTR [esi]
    }
}

/* Stores inval in *m and returns the previous value of *m. */
int atom_xchg(volatile int *m, int inval)
{
    __asm
    {
        mov eax, inval
        mov esi, m
        lock xchg eax, DWORD PTR [esi]
        mov inval, eax
    }
    return inval;
}

void atom_add(volatile int *num, int val)
{
    __asm
    {
        mov esi, num
        mov eax, val
        lock add DWORD PTR [esi], eax
    }
}

void atom_sub(volatile int *num, int val)
{
    __asm
    {
        mov esi, num
        mov eax, val
        lock sub DWORD PTR [esi], eax
    }
}


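The lock prefix is not applied universally. It covers only the instruction it is attached to, so it protects a location only if every read-modify-write of that location also uses lock. Even though you use "lock" in one section of code, another section that changes the value without it is not locked out and can still race with yours. Think of it as a mutex that is held for just one instruction.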
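One way to keep an unlocked access from slipping in is to give the variable an atomic type, so that every read and write goes through an atomic operation. The following is a minimal sketch, assuming a C11 compiler that ships <stdatomic.h>. The counter_t type and its helper functions are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical counter: atomic_int makes every access to count atomic. */
typedef struct { atomic_int count; } counter_t;

void counter_init(counter_t *c, int v)
{
    atomic_init(&c->count, v);
}

/* Atomic read-modify-write, the counterpart of a locked increment. */
void counter_inc(counter_t *c)
{
    atomic_fetch_add(&c->count, 1);
}

/* Even a plain "set" goes through an atomic store. */
void counter_set(counter_t *c, int v)
{
    atomic_store(&c->count, v);
}

int counter_get(counter_t *c)
{
    return atomic_load(&c->count);
}

/* Resets the counter to zero and returns the value it held before. */
int counter_reset(counter_t *c)
{
    return atomic_exchange(&c->count, 0);
}
```

Because the atomicity lives in the type rather than at each call site, a code path added later cannot skip it by accident.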

Basic usage:


class poll
{
    int m_pollCount;
    ....
    ....

public:
    void pollAdd()
    {
        atom_inc(&m_pollCount);
    }
};

In the above example, each call to pollAdd() atomically increments the poll object's count by one.

SRC=http://www.mohawksoft.org/?q=node/78

Time: 2024-12-26 04:34:07
