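Preface: What is RDMA? Put simply, RDMA means transferring data over the network without going through the operating system (OS) kernel or its TCP/IP protocol stack, so latency is very low and CPU consumption is very small. What follows is a short introductory article on RDMA (originally presented as an English-Chinese parallel translation).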
Introduction to Remote Direct Memory Access (RDMA)
1. What is RDMA?
Direct memory access (DMA) is the ability of a device to access host memory directly, without the intervention of the CPU(s).
RDMA (Remote DMA) is the ability to access (i.e. read from or write to) memory on a remote machine without interrupting the processing of the CPU(s) on that system.
2. So why is this so good?
Using RDMA has the following major advantages:
- Zero-copy - applications can perform data transfers without involving the network software stack; data is sent from and received into the application buffers directly, without being copied between the network layers.
- Kernel bypass - applications can perform data transfers directly from userspace without the need to perform context switches.
- No CPU involvement - applications can access remote memory without consuming any CPU on the remote machine. The remote machine's memory is read without any intervention by a remote process (or processor), and the caches of the remote CPU(s) are not filled with the accessed memory content.
- Message-based transactions - the data is handled as discrete messages and not as a stream, which eliminates the need for the application to split the stream into separate messages/transactions.
- Scatter/gather entries support - RDMA natively supports working with multiple scatter/gather entries, i.e. reading multiple memory buffers and sending them as one stream, or receiving one stream and writing it to multiple memory buffers.
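The last two points can be illustrated without any RDMA hardware. The sketch below is not RDMA: it uses an ordinary Unix datagram socket pair as a stand-in, and the helper names (`gather_send`, `scatter_recv`, `demo`) are made up for this example. It only shows the two ideas: several buffers are sent as one discrete message (gather), and one received message is spread over several buffers (scatter), with message boundaries preserved.

```python
import socket


def gather_send(sock, buffers):
    """Send several separate buffers as ONE message (gather)."""
    return sock.sendmsg(buffers)


def scatter_recv(sock, sizes):
    """Receive ONE message and spread it over several buffers (scatter)."""
    buffers = [bytearray(n) for n in sizes]
    nbytes, _ancdata, _flags, _addr = sock.recvmsg_into(buffers)
    return nbytes, buffers


def demo():
    # A datagram socket pair keeps message boundaries, unlike a byte stream.
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        # Each call gathers a 4-byte header and a payload into one message.
        gather_send(left, [b"HDR1", b"first payload"])
        gather_send(left, [b"HDR2", b"second"])

        results = []
        for _ in range(2):
            # Scatter: the first 4 bytes land in `header`, the rest in `body`.
            nbytes, (header, body) = scatter_recv(right, [4, 32])
            results.append((bytes(header), bytes(body[: nbytes - 4])))
        return results
    finally:
        left.close()
        right.close()


if __name__ == "__main__":
    for header, body in demo():
        print(header, body)
```

With real RDMA the same pattern is expressed through lists of scatter/gather entries handed to the NIC, so the hardware does the gathering and scattering rather than the kernel.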
3. Where can I find RDMA?
You can find RDMA in industries that need at least one of the following:
- Low latency - For example: HPC, financial services, web 2.0
- High Bandwidth - For example: HPC, medical appliances, storage and backup systems, cloud computing
- Small CPU footprint - For example: HPC, cloud computing
And in many other industries as well.
4. Which network protocols support RDMA?
Today, there are several network protocols which support RDMA:
- InfiniBand (IB) - a new-generation network protocol which has supported RDMA natively from the beginning. Since this is a new network technology, it requires NICs and switches which support it.
- RDMA over Converged Ethernet (RoCE) - a network protocol which allows performing RDMA over an Ethernet network. Its lower network headers are Ethernet headers and its upper network headers (including the data) are InfiniBand headers. This allows using RDMA over standard Ethernet infrastructure (switches). Only the NICs need to be special and support RoCE.
- Internet Wide Area RDMA Protocol (iWARP) - a network protocol which allows performing RDMA over TCP. Some features that exist in IB and RoCE are not supported in iWARP. It also allows using RDMA over standard Ethernet infrastructure (switches). Only the NICs need to be special and support iWARP (if CPU offloads are used); otherwise, the whole iWARP stack can be implemented in software, which loses most of the RDMA performance advantages.
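The layering described in the list above can be summarised in a small table. The sketch below is only a simplified model of what the text says: the class `RdmaProtocol`, its field names, and the layer labels are informal and invented for this example. They are not exact on-the-wire header definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RdmaProtocol:
    name: str
    layers: tuple         # informal layer labels, outermost header first
    special_nic: bool     # needs an RDMA-capable NIC
    special_switch: bool  # needs dedicated (non-Ethernet) switches


PROTOCOLS = (
    RdmaProtocol(
        "InfiniBand",
        ("InfiniBand link layer", "InfiniBand headers", "data"),
        special_nic=True,
        special_switch=True,
    ),
    RdmaProtocol(
        "RoCE",
        ("Ethernet header", "InfiniBand headers", "data"),
        special_nic=True,
        special_switch=False,
    ),
    # For iWARP the special NIC is only needed when offloads are used;
    # a pure software stack works too, at a large performance cost.
    RdmaProtocol(
        "iWARP",
        ("Ethernet header", "IP header", "TCP header", "iWARP headers", "data"),
        special_nic=True,
        special_switch=False,
    ),
)


def runs_on_standard_ethernet(proto):
    """True if the outermost header is a plain Ethernet header."""
    return proto.layers[0] == "Ethernet header"


def describe(proto):
    """Render the layering as a one-line string."""
    return f"{proto.name}: " + " | ".join(proto.layers)


if __name__ == "__main__":
    for proto in PROTOCOLS:
        print(describe(proto))
```

Read this way, the difference between the three is mostly what sits underneath the RDMA semantics: a dedicated fabric for IB, Ethernet for RoCE, and TCP/IP over Ethernet for iWARP.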
5. Does it mean that I need to learn several programming APIs?
No. Luckily, the same API (i.e. verbs) can be used for all the above-mentioned RDMA-enabled network protocols. On *nix it is libibverbs (in userspace) and kernel verbs, and on Windows it is Network Direct (ND).
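As a small taste of the verbs API, the sketch below lists the local RDMA devices by calling three libibverbs functions (`ibv_get_device_list`, `ibv_get_device_name`, `ibv_free_device_list`) through Python's `ctypes`. The wrapper `list_rdma_devices` is a hypothetical helper written for this article, and it assumes that the shared library can be located under the name `ibverbs`. If the library or the hardware is missing, it simply returns an empty list. Real applications are normally written in C against `<infiniband/verbs.h>`.

```python
import ctypes
import ctypes.util


def list_rdma_devices():
    """Return the names of local RDMA devices, or [] if none can be found."""
    path = ctypes.util.find_library("ibverbs")
    if path is None:
        return []  # libibverbs is not installed on this machine
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        return []

    # Declare the C signatures so ctypes converts pointers correctly.
    lib.ibv_get_device_list.argtypes = [ctypes.POINTER(ctypes.c_int)]
    lib.ibv_get_device_list.restype = ctypes.POINTER(ctypes.c_void_p)
    lib.ibv_get_device_name.argtypes = [ctypes.c_void_p]
    lib.ibv_get_device_name.restype = ctypes.c_char_p
    lib.ibv_free_device_list.argtypes = [ctypes.POINTER(ctypes.c_void_p)]
    lib.ibv_free_device_list.restype = None

    count = ctypes.c_int(0)
    devices = lib.ibv_get_device_list(ctypes.byref(count))
    if not devices:
        return []  # NULL: no RDMA support available at the moment
    try:
        names = []
        for i in range(count.value):
            name = lib.ibv_get_device_name(devices[i])
            if name:
                names.append(name.decode())
        return names
    finally:
        lib.ibv_free_device_list(devices)


if __name__ == "__main__":
    print(list_rdma_devices())
```

The same calls are used whether the device behind them speaks InfiniBand, RoCE, or iWARP, which is the point of having one verbs API.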
6. Are those network protocols interoperable?
Since those are different network protocols, their packets are completely different, and they cannot exchange messages directly: a router/gateway is needed between them. However, the same code can support all of them. Since all those network protocols support libibverbs, the same binary can be used without even recompiling the source code.
7. Do I need to download special packages to use RDMA or is it part of the Operating System?
For several operating systems, RDMA support is built into the kernel. Linux, for example, supports RDMA natively, and all major Linux distributions include that support. On other operating systems you may need to download a package (such as OFED) to add RDMA support.
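On Linux, one quick way to see whether the kernel has registered any RDMA device is to look under `/sys/class/infiniband`. As far as I know this directory is used for all verbs devices and not only for InfiniBand ones, but treat that as an assumption to verify on your own system. The helper name `kernel_rdma_devices` is made up for this sketch.

```python
import os


def kernel_rdma_devices(sysfs_dir="/sys/class/infiniband"):
    """Return the sorted RDMA device names the kernel exposes, or [] if none."""
    try:
        return sorted(os.listdir(sysfs_dir))
    except (FileNotFoundError, NotADirectoryError, PermissionError):
        # No RDMA drivers loaded, not a Linux system, or no access.
        return []


if __name__ == "__main__":
    devices = kernel_rdma_devices()
    if devices:
        print("RDMA devices:", ", ".join(devices))
    else:
        print("No RDMA devices registered with the kernel.")
```

An empty result does not necessarily mean the kernel lacks RDMA support. It may only mean that no RDMA-capable NIC is present or that its driver is not loaded.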
Recommended reading:
- InfiniBand: version evolution, basic concepts, throughput and transfer rates
- RDMA : https://en.wikipedia.org/wiki/Remote_direct_memory_access
- IB : https://en.wikipedia.org/wiki/InfiniBand
- iSER : https://en.wikipedia.org/wiki/ISCSI_Extensions_for_RDMA
- iWARP : https://en.wikipedia.org/wiki/IWARP
- RoCE : https://en.wikipedia.org/wiki/RDMA_over_Converged_Ethernet
- SDP : https://en.wikipedia.org/wiki/Sockets_Direct_Protocol
- SRP : https://en.wikipedia.org/wiki/SCSI_RDMA_Protocol