Introduction to RDMA

RDMA

Remote Direct Memory Access (RDMA) is a networking technology that allows one machine to directly read or write the memory of another machine without involving the remote CPU or operating system. By bypassing the kernel and avoiding unnecessary memory copies, RDMA provides:

Benefit	Effect
Very low latency	Data is transferred without kernel involvement
Low CPU overhead	Remote access does not require remote CPU work
Zero-copy transfer	No intermediate buffer copying is needed
High throughput	NIC hardware handles data movement efficiently

Compared to traditional TCP/IP networking, RDMA significantly reduces communication overhead because it avoids the repeated memory copies and kernel transitions of the socket path. This makes RDMA especially valuable in data centers, HPC clusters, and distributed AI training, where systems must exchange large amounts of data efficiently.

What This Tutorial Covers

This section focuses on RDMA fundamentals:

The RDMA Verbs API and how applications interact with RDMA hardware
Memory Registration, which makes a buffer accessible to the NIC and grants permission for remote access
Queue Pairs, the core communication endpoints in RDMA programs
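
To make the idea of memory registration more concrete before the Verbs API is introduced, here is a minimal toy model in Python. It is not the Verbs API and does not touch any RDMA hardware: `ToyNic`, `MemoryRegion`, and the flag names are hypothetical, chosen only for illustration. The sketch assumes a simplified rule: a remote write succeeds only if the caller presents the key issued at registration time, the region was registered with remote-write permission, and the access stays inside the registered buffer.

```python
import secrets
from dataclasses import dataclass

# Hypothetical permission flags, chosen at registration time.
LOCAL_WRITE = 0x1
REMOTE_WRITE = 0x2
REMOTE_READ = 0x4


@dataclass
class MemoryRegion:
    """Toy stand-in for a registered memory region (MR)."""
    buf: bytearray
    access: int
    rkey: int  # key a remote peer must present to touch this region


class ToyNic:
    """Toy model of the checks applied to remote accesses (an assumption,
    not a real RDMA interface)."""

    def __init__(self):
        self._regions = {}  # rkey -> MemoryRegion

    def register(self, buf, access):
        """Register a buffer and return a region carrying a fresh key."""
        rkey = secrets.randbits(32)
        while rkey in self._regions:
            rkey = secrets.randbits(32)
        region = MemoryRegion(buf, access, rkey)
        self._regions[rkey] = region
        return region

    def _check(self, rkey, offset, length, needed):
        region = self._regions.get(rkey)
        if region is None:
            raise PermissionError("unknown rkey")
        if not region.access & needed:
            raise PermissionError("operation not permitted on this region")
        if offset < 0 or offset + length > len(region.buf):
            raise PermissionError("access outside the registered region")
        return region

    def rdma_write(self, rkey, offset, data):
        """One-sided write: no code runs on the side that owns the buffer."""
        region = self._check(rkey, offset, len(data), REMOTE_WRITE)
        region.buf[offset:offset + len(data)] = data

    def rdma_read(self, rkey, offset, length):
        """One-sided read of `length` bytes starting at `offset`."""
        region = self._check(rkey, offset, length, REMOTE_READ)
        return bytes(region.buf[offset:offset + length])
```

In this model, a peer that learns `region.rkey` can call `rdma_write` on a buffer registered with `REMOTE_WRITE`, while a write with an unknown key, without the matching permission, or past the end of the buffer raises `PermissionError`. Real hardware enforces comparable checks, but the details are covered in the Memory Registration part of the tutorial.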

Learning Outcomes

After completing the RDMA Basics section, you will be able to:

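Understand the RDMA programming model and its key objects: Protection Domain (PD), Memory Region (MR), Completion Queue (CQ), and Queue Pair (QP)
Register memory regions and work with their local and remote access keys (lkey and rkey)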
Set up and configure Queue Pairs for communication
Run and modify basic RDMA operations such as Write, Read, and Send/Recv
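
As a preview of how Send/Recv differs from Write and Read, the following toy model sketches the two-sided nature of Send/Recv: the receiver must post a buffer before the sender's message can land, and both sides learn about the outcome by polling for completions. This is a conceptual sketch only. `ToyQueuePair` and its methods are hypothetical names, delivery is modeled as instantaneous, and a send with no posted receive simply raises an error here, whereas real hardware handles that case with its own retry and error mechanisms.

```python
from collections import deque


class ToyQueuePair:
    """Toy model of a connected queue pair (QP); not the Verbs API."""

    def __init__(self):
        self.recv_queue = deque()   # receive buffers posted by the application
        self.completions = deque()  # stands in for a completion queue (CQ)
        self.peer = None

    def connect(self, other):
        """Pair two endpoints so that each one's sends target the other."""
        self.peer = other
        other.peer = self

    def post_recv(self, buf):
        """Make a buffer available for one incoming message."""
        self.recv_queue.append(buf)

    def post_send(self, data):
        """Two-sided send: consumes one receive buffer posted by the peer."""
        if self.peer is None:
            raise RuntimeError("queue pair is not connected")
        if not self.peer.recv_queue:
            raise RuntimeError("receiver not ready: no receive buffer posted")
        if len(data) > len(self.peer.recv_queue[0]):
            raise ValueError("message larger than the posted receive buffer")
        buf = self.peer.recv_queue.popleft()
        buf[:len(data)] = data
        # Both sides get a completion entry describing what happened.
        self.peer.completions.append(("RECV", len(data)))
        self.completions.append(("SEND", len(data)))

    def poll_cq(self):
        """Return the oldest completion, or None if there is none yet."""
        return self.completions.popleft() if self.completions else None
```

A typical sequence in this model is: the receiver calls `post_recv(bytearray(16))`, the sender calls `post_send(b"hello")`, and then each side calls `poll_cq()` to observe its own completion. Write and Read, by contrast, need no posted receive on the target side, which is what makes them one-sided.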

Next, we begin with the RDMA Verbs API.