网络编程系列(一)socket API 基础篇
socket 是什么
参考 Berkeley套接字
Berkeley sockets 也称为 BSD Socket 。伯克利套接字的应用编程接口(API)是采用 C 语言的进程间通信的库,经常用在计算机网络间的通信。 BSD Socket 的应用编程接口已经是网络套接字的事实上的抽象标准。大多数其他程序语言使用一种相似的编程接口。
BSD Socket 作为一种 API,允许不同主机或者同一个计算机上的不同进程之间的通信。它支持多种 I/O 设备和驱动,但是具体的实现是依赖操作系统的。这种接口对于 TCP/IP 是必不可少的,所以是互联网的基础技术之一。它最初是由加州伯克利大学为 Unix 系统开发出来的。所有现代的操作系统都实现了伯克利套接字接口,因为它已经是连接互联网的标准接口了。
socket 流程
UDP socket
TCP socket
TCP 连接管理
TCP 连接管理(三次握手、四次挥手)的流程及状态机如下:
socket API
https://man7.org/linux/man-pages/man7/socket.7.html
Syscall | Description | Blocking or not |
---|---|---|
socket() |
creates an endpoint for communication and returns a file descriptor | NO |
bind() |
bind a name to a socket | NO |
listen() |
listen for connections on a socket | NO |
accept() accept4() |
creates a new file descriptor for an incoming connection | YES or NO (EAGAIN or EWOULDBLOCK ) |
connect() |
initiate a connection on a socket | YES or NO (EINPROGRESS ) |
recv() send() |
operations on a single file descriptor | YES or NO (EAGAIN or EWOULDBLOCK ) |
服务端 API
socket()
https://man7.org/linux/man-pages/man2/socket.2.html
SYNOPSIS
1
2
3
4
5
6
7
// On success, a file descriptor for the new socket is returned.
// On error, -1 is returned, and errno is set appropriately.
int socket(int domain,
int type,
int protocol);DESCRIPTION
socket()
creates an endpoint for communication and returns a file descriptor that refers to that endpoint.
domain
参数常用选项:
Name | Purpose | Man page |
---|---|---|
AF_INET |
IPv4 Internet protocols | ip(7) |
AF_INET6 |
IPv6 Internet protocols | ipv6(7) |
type
参数常用选项:
SOCK_DGRAM
数据报套接字:Supports datagrams (connectionless, unreliable messages of a fixed maximum length).SOCK_STREAM
流式套接字:Provides sequenced, reliable, two-way, connection-based byte streams. An out-of-band data transmission mechanism may be supported.SOCK_RAW
原始套接字:Provides raw network protocol access.
bind()
https://man7.org/linux/man-pages/man2/bind.2.html
SYNOPSIS
1
2
3
4
5
6
7
// On success, 0 is returned.
// On error, -1 is returned, and errno is set appropriately.
int bind(int sockfd,
const struct sockaddr *addr,
socklen_t addrlen);DESCRIPTION
The
bind()
assigns a local protocol address to a socket. With the Internet protocols, the address is the combination of an IPv4 or IPv6 address (32-bit or 128-bit) address along with a 16 bit TCP port number.
一般情况下,只有服务端 socket 需要指定端口号。而客户端 socket 为了避免发生端口号冲突,会由操作系统提供的随机临时端口号来运行,用户无须关心。但在某些情况下:
- 例如遇到防火墙只允许某些端口号出网,此时操作系统向客户端提供的端口号很可能会被客户端防火墙阻止。
- 某些协议(例如 NFS 协议)要求客户端程序仅在某个端口号上运行。
在这种情况下,我们需要显式或强制地使用 bind()
来为客户端 socket 指定特定端口号。
参考:Explicitly assigning port number to client in Socket
listen()
https://man7.org/linux/man-pages/man2/listen.2.html
SYNOPSIS
1
2
3
4
5
6
// On success, 0 is returned.
// On error, -1 is returned, and errno is set appropriately.
int listen(int sockfd,
int backlog);DESCRIPTION
listen()
marks the socket referred to by sockfd as a passive socket, that is, as a socket that will be used to accept incoming connection requests usingaccept()
.The backlog argument defines the maximum length to which the queue of pending connections for sockfd may grow. If a connection request arrives when the queue is full, the client may receive an error with an indication of ECONNREFUSED or, if the underlying protocol supports retransmission, the request may be ignored so that a later reattempt at connection succeeds.
The behavior of the backlog argument on TCP sockets changed with Linux 2.2. Now it specifies the queue length for completely established sockets waiting to be accepted, instead of the number of incomplete connection requests. The maximum length of the queue for incomplete sockets can be set using /proc/sys/net/ipv4/tcp_max_syn_backlog. When syncookies are enabled there is no logical maximum length and this setting is ignored. See tcp(7) for more information.
《深入探索 Linux listen() 函数 backlog 的含义》
accept()
https://man7.org/linux/man-pages/man2/accept.2.html
SYNOPSIS
1
2
3
4
5
6
7
// On success, a nonnegative integer that is a file descriptor for the accepted socket is returned.
// On error, -1 is returned, and errno is set appropriately.
int accept(int sockfd,
struct sockaddr *addr,
socklen_t *addrlen)DESCRIPTION
The
accept()
system call is used with connection-based socket types (SOCK_STREAM
,SOCK_SEQPACKET
). It extracts the first connection request on the queue of pending connections for the listening socket, sockfd, creates a new connected socket, and returns a new file descriptor referring to that socket. The newly created socket is not in the listening state. The original socket sockfd is unaffected by this call.If no pending connections are present on the queue, and the socket is not marked as nonblocking,
accept()
blocks the caller until a connection is present. If the socket is marked nonblocking and no pending connections are present on the queue,accept()
fails with the error EAGAIN or EWOULDBLOCK.
客户端 API
connect()
https://man7.org/linux/man-pages/man2/connect.2.html
SYNOPSIS
1
2
3
4
5
6
7
// If the connection or binding succeeds, 0 is returned.
// On error, -1 is returned, and errno is set appropriately.
int connect(int sockfd,
const struct sockaddr *addr,
socklen_t addrlen);DESCRIPTION
The
connect()
function is used by a TCP client to establish a connection with a TCP server.The function returns
0
if the it succeeds in establishing a connection (i.e., successful TCP three-way handshake,-1
otherwise.The client does not have to call
bind()
in Section before calling this function: the kernel will choose both an ephemeral port and the source IP if necessary.
close()
https://man7.org/linux/man-pages/man2/close.2.html
SYNOPSIS
1
2
3
int close(int fd);
用于完成四次挥手流程:
- 客户端首先调用
close()
主动关闭连接,这时 TCP 发送一个 FIN M; - 服务端接收到 FIN M 之后,执行被动关闭,并发送 ACK M+1 对这个 FIN 进行确认;(它的接收也作为文件结束符
EOF
传递给应用进程(此时recv()
返回值为0
),因为 FIN 的接收意味着应用进程在相应的连接上再也接收不到额外数据) - 服务端之后调用
close()
主动关闭它的 socket,这导致它的 TCP 也发送一个 FIN N; - 客户端接收到 FIN N 之后,发送 ACK N+1 对这个 FIN 进行确认。
这样每个方向上都有一个 FIN 和 ACK。
选项设置 API
SYNOPSIS
1 int setsockopt(int sockfd, int level, int optname, const void *optval, socklen_t optlen);
1 int getsockopt(int sockfd, int level, int optname, void *optval, socklen_t *optlen);DESCRIPTION
getsockopt(2) and setsockopt(2) are used to set or get socket layer or protocol options.
level
常用选项如下:
socket layer
protocol layer
Using the
SOL_IP
socket options level isn’t portable; BSD-based stacks use theIPPROTO_IP
level.
SOL_SOCKET
https://man7.org/linux/man-pages/man7/socket.7.html
超时时间
SO_RCVTIMEO and SO_SNDTIMEO
Specify the receiving or sending timeouts until reporting an error. The argument is a struct timeval.
Return value:
- If an input or output function blocks for this period of time, and data has been sent or received, the return value of that function will be the amount of data transferred;
- if no data has been transferred and the timeout has been reached, then -1 is returned with errno set to EAGAIN or EWOULDBLOCK, or EINPROGRESS (for connect(2)) just as if the socket was specified to be nonblocking.
Timeout setting:
- If the timeout is set to zero (the default) then the operation will never timeout.
- Timeouts only have effect for system calls that perform socket I/O (e.g., read(2), recvmsg(2), send(2), sendmsg(2));
- timeouts have no effect for select(2), poll(2), epoll_wait(2), and so on.
1
2 setsockopt(clientfd, SOL_SOCKET, SO_RCVTIMEO, ...);
setsockopt(clientfd, SOL_SOCKET, SO_SNDTIMEO, ...);
<How to set socket timeout in C when making multiple connections?>
<Why there is no “write timeout” for the socket ?>
缓冲区大小
SO_RCVBUF
Sets or gets the maximum socket receive buffer in bytes. The kernel doubles this value (to allow space for bookkeeping overhead) when it is set using setsockopt(2), and this doubled value is returned by getsockopt(2). The default value is set by the /proc/sys/net/core/rmem_default file, and the maximum allowed value is set by the /proc/sys/net/core/rmem_max file. The minimum (doubled) value for this option is 256.
SO_SNDBUF
Sets or gets the maximum socket send buffer in bytes. The kernel doubles this value (to allow space for bookkeeping overhead) when it is set using setsockopt(2), and this doubled value is returned by getsockopt(2). The default value is set by the /proc/sys/net/core/wmem_default file and the maximum allowed value is set by the /proc/sys/net/core/wmem_max file. The minimum (doubled) value for this option is 2048.
地址重用
SO_REUSEADDR
Indicates that the rules used in validating addresses supplied in a bind(2) call should allow reuse of local addresses. For AF_INET sockets this means that a socket may bind, except when there is an active listening socket bound to the address. When the listening socket is bound to INADDR_ANY with a specific port then it is not possible to bind to this port for any local address. Argument is an integer boolean flag.
1 setsockopt(listenfd, SOL_SOCKET, SO_REUSEADDR, ...);
端口重用
SO_REUSEPORT (since Linux 3.9)
1 setsockopt(listenfd, SOL_SOCKET, SO_REUSEPORT, ...);
《深入理解 Linux 端口重用这一特性——良许 Linux》
IPPROTO_TCP
https://man7.org/linux/man-pages/man7/tcp.7.html
长连接(TCP keepalive)
SO_KEEPALIVE 开启 TCP keepalive
Enable sending of keep-alive messages on connection-oriented sockets. Expects an integer boolean flag.
1 setsockopt(clientfd, SOL_SOCKET, SO_KEEPALIVE, ...);
TCP keepalive 具体参数设置:
TCP_KEEPCNT: overrides tcp_keepalive_probes
TCP_KEEPIDLE: overrides tcp_keepalive_time
TCP_KEEPINTVL: overrides tcp_keepalive_intvl
1
2
3 setsockopt(clientfd, IPPROTO_TCP, TCP_KEEPCNT, ...);
setsockopt(clientfd, IPPROTO_TCP, TCP_KEEPIDLE, ...);
setsockopt(clientfd, IPPROTO_TCP, TCP_KEEPINTVL, ...);
流量控制
TCP_NODELAY
If set, disable the Nagle algorithm. This means that segments are always sent as soon as possible, even if there is only a small amount of data. When not set, data is buffered until there is a sufficient amount to send out, thereby avoiding the frequent sending of small packets, which results in poor utilization of the network.
1 setsockopt(clientfd, IPPROTO_TCP, TCP_NODELAY, ...);
读、写 API
成功建立连接之后,双方开始通过 recv()
和 send()
来读、写数据,就像往一个文件流里面写东西一样。
recv()
https://man7.org/linux/man-pages/man2/recv.2.html
https://man7.org/linux/man-pages/man2/recvfrom.2.html
https://man7.org/linux/man-pages/man2/recvmsg.2.html
receive a message from a socket
- If no messages are available at the socket, the receive calls wait for a message to arrive, unless the socket is nonblocking (see fcntl(2)), in which case the value -1 is returned and the external variable errno is set to EAGAIN or EWOULDBLOCK.
- The select(2) or poll(2) call may be used to determine when more data arrives.
1 |
|
send()
https://man7.org/linux/man-pages/man2/send.2.html
https://man7.org/linux/man-pages/man2/sendto.2.html
https://man7.org/linux/man-pages/man2/sendmsg.2.html
send a message on a socket
- When the message does not fit into the send buffer of the socket, send() normally blocks, unless the socket has been placed in nonblocking I/O mode. In nonblocking mode it would fail with the error EAGAIN or EWOULDBLOCK in this case.
- The select(2) call may be used to determine when it is possible to send more data.
1 |
|
nonblocking I/O on sockets
默认设置下,socket 为同步阻塞 I/O 模型,下列系统调用会引发阻塞:
Call type | Socket state | blocking | Nonblocking |
---|---|---|---|
Types of recv() calls |
Input is available | Immediate return | Immediate return |
No input is available | Wait for input | Immediate return with EWOULDBLOCK error number |
|
Types of send() calls |
Output buffers available | Immediate return | Immediate return |
No output buffers available | Wait for output buffers | Immediate return with EWOULDBLOCK error number |
|
accept() call |
New connection | Immediate return | Immediate return |
No connections queued | Wait for new connection | Immediate return with EWOULDBLOCK error number |
|
connect() call |
Wait | Immediate return with EINPROGRESS error number |
有几种设置方式可以将 socket 设置为同步非阻塞 I/O 模型:
设置方式一
使用 fcntl(2) + O_NONBLOCK
参数。POSIX 标准,版本兼容性最好,推荐使用:
1 | int flags = fcntl(listenfd, F_GETFL, 0); |
使用 ioctl(2) + FIONBIO
参数。版本兼容性较弱,不建议使用:
1 | int on = 1; |
设置方式二
从 Linux 2.6.27 开始,也可以使用 socket(2) 和 accept4(2) 直接创建非阻塞 socket:
1 | int listenfd, clientfd; |
上述几个文件描述符 fd 的基础概念:
- listenfd:服务端创建的 socket,
bind()
地址和端口,调用listen()
函数发起侦听的一端; - clientfd:服务端调用
accept()
函数接受新连接,由accept()
函数返回的 socket。accept()
函数并不参与三次握手过程,accept()
函数从已经完成连接的 backlog 队列中取出连接,并返回 clientfd; - connfd:客户端创建的 socket,调用
connect()
函数主动发起连接(三次握手)的一端。
最后客户端与服务端分别通过 connfd 和 clientfd 进行通信(调用 send()
或者 recv()
函数)。
socket 代码样例
https://cs.dartmouth.edu/~campbell/cs60/socketprogramming.html
https://www.geeksforgeeks.org/socket-programming-cc/
https://github.com/qidawu/cpp-data-structure/tree/main/socket
参考
https://en.wikipedia.org/wiki/Network_socket
https://en.wikipedia.org/wiki/Berkeley_sockets
<Client/server socket programs: Blocking, nonblocking, and asynchronous socket calls | IBM>