内容大纲
1.socke基础
两个程序通过一个双向的通信连接实现数据的交换,这个连接的一端称为一个socket。
建 立网络通信连接至少要一对端口号(socket)。socket本质是编程接口(API),对TCP/IP(或者UDP)的封装,TCP/IP也要提供可供程序员做网络 开发所用的接口,这就是Socket编程接口;HTTP是轿车,提供了封装或者显示数据的具体形式;Socket是发动机,提供了网络通信的能力。
Socket的英文原义是“孔”或“插座”。作为BSD UNIX的进程通信机制,取后一种意思。通常也称作"套接字",用于描述IP地址和端口,是一个通信链的句柄,可以用来实现不同虚拟机或不同计算机之间的通信。在Internet上的主机一 般运行了多个服务软件,同时提供几种服务。每种服务都打开一个Socket,并绑定到一个端口上,不同的端口对应于不同的服务。Socket正如其英文原 意那样,像一个多孔插座。一台主机犹如布满各种插座的房间,每个插座有一个编号,有的插座提供220伏交流电, 有的提供110伏交流电,有的则提供有线电视节目。 客户软件将插头插到不同编号的插座,就可以得到不同的服务。
1.1现象解释
Socket非常类似于电话插座。以一个国家级电话网为例,电话的通话双方相当于相互通信的2个进程,区号是它的网络地址;区内一个单位的交换机相当于一台主机, 主机分配给每个用户的局内号码相当于Socket号。任何用户在通话之前,首先要占有一部电话机,相当于申请一个Socket;同时要知道对方的号码,相 当于对方有一个固定的Socket。然后向对方拨号呼叫,相当于发出连接请求(假如对方不在同一区内,还要拨对方区号,相当于给出网络地址)。假如对方在 场并空闲(相当于通信的另一主机开机且可以接受连接请求),拿起电话话筒,双方就可以正式通话,相当于连接成功。双方通话的过程,是一方向电话机发出信号 和对方从电话机接收信号的过程,相当于向Socket发送数据和从socket接收数据。通话结束后,一方挂起电话机相当于关闭Socket,撤消连接。
在电话系统中,一般用户只能感受到本地电话机和对方电话号码的存在,建立通话的过程,话音传输的过程以及整个电话系统的技术细节对他都是透明的,这也与Socket机制非常相似。Socket利用网间网通信设施实现进程通信,但它对通信设施的细节毫不关心,只要通信设施能提供足够的通信能力,它就满足了。
至此,我们对Socket进行了直观的描述。抽象出来,Socket实质上提供了进程通信的端点。进程通信之前,双方首先必须各自创建一个端点,否则是没有办法建立联系并相互通信的。正如打电话之前,双方必须各自拥有一台电话机一样。
在网间网内部,每一个Socket用一个半相关描述:(协议,本地地址,本地端口)。
一个完整的Socket有一个本地唯一的Socket号,由操作系统分配。
最重要的是,Socket是面向客户/服务器模型而设计的,针对客户和服务器程序提供不同的Socket系统调用。客户随机申请一个Socket(相当于一个想打电话的人可以在任何一台入网电话上拨号呼叫),系统为之分配一个Socket号;服务器拥有全局公认的Socket,任何客户都可以向它发出连接请求和信息请求(相当于一个被呼叫的电话拥有一个呼叫方知道的电话号码)。
Socket利用客户/服务器模式巧妙地解决了进程之间建立通信连接的问 题。服务器Socket半相关为全局所公认非常重要。读者不妨考虑一下,两个完全随机的用户进程之间如何建立通信?假如通信双方没有任何一方的 Socket固定,就好比打电话的双方彼此不知道对方的电话号码,要通话是不可能的。
1.2 python3 socket基础知识
Socket Families(地址簇)
socket.
AF_UNIX unix本机进程间通信
socket.
AF_INET IPV4
socket.
AF_INET6 IPV6
These constants represent the address (and protocol) families, used for the first argument to
socket()
. If the AF_UNIX
constant is not defined then this protocol is unsupported. More constants may be available depending on the system.
Socket Types
socket.
SOCK_STREAM #for tcp
socket.
SOCK_DGRAM #for udp
socket.
SOCK_RAW #原始套接字,普通的套接字无法处理ICMP、IGMP等网络报文,而SOCK_RAW可以;其次,SOCK_RAW也可以处理特殊的IPv4报文;此外,利用原始套接字,可以通过IP_HDRINCL套接字选项由用户构造IP头。
socket.
SOCK_RDM #是一种可靠的UDP形式,即保证交付数据报但不保证顺序。SOCK_RAM用来提供对原始协议的低级访问,在需要执行某些特殊操作时使用,如发送ICMP报文。SOCK_RAM通常仅限于高级用户或管理员运行的程序使用。
socket.
SOCK_SEQPACKET #废弃了
These constants represent the socket types, used for the second argument to socket()
. More constants may be available depending on the system. (Only SOCK_STREAM
and SOCK_DGRAM
appear to be generally useful.)
Socket 方法
socket.
socket
(family=AF_INET, type=SOCK_STREAM, proto=0, fileno=None)
Create a new socket using the given address family, socket type and protocol number. The address family should be AF_INET
(the default), AF_INET6
, AF_UNIX
, AF_CAN
or AF_RDS
. The socket type should beSOCK_STREAM
(the default), SOCK_DGRAM
, SOCK_RAW
or perhaps one of the other SOCK_
constants. The protocol number is usually zero and may be omitted or in the case where the address family is AF_CAN
the protocol should be one of CAN_RAW
or CAN_BCM
. If fileno is specified, the other arguments are ignored, causing the socket with the specified file descriptor to return. Unlike socket.fromfd()
, fileno will return the same socket and not a duplicate. This may help close a detached socket using socket.close()
.
socket.
socketpair
([family[, type[, proto]]])
Build a pair of connected socket objects using the given address family, socket type, and protocol number. Address family, socket type, and protocol number are as for the socket()
function above. The default family is AF_UNIX
if defined on the platform; otherwise, the default is AF_INET
.
socket.
create_connection
(address[, timeout[, source_address]])
Connect to a TCP service listening on the Internet address (a 2-tuple (host, port)
), and return the socket object. This is a higher-level function than socket.connect()
: if host is a non-numeric hostname, it will try to resolve it for both AF_INET
and AF_INET6
, and then try to connect to all possible addresses in turn until a connection succeeds. This makes it easy to write clients that are compatible to both IPv4 and IPv6.
Passing the optional timeout parameter will set the timeout on the socket instance before attempting to connect. If no timeout is supplied, the global default timeout setting returned by getdefaulttimeout()
is used.
If supplied, source_address must be a 2-tuple (host, port)
for the socket to bind to as its source address before connecting. If host or port are ‘’ or 0 respectively the OS default behavior will be used.
socket.
getaddrinfo
(host, port, family=0, type=0, proto=0, flags=0) #获取要连接的对端主机地址
sk.bind(address)
s.bind(address) 将套接字绑定到地址。address地址的格式取决于地址族。在AF_INET下,以元组(host,port)的形式表示地址。
sk.listen(backlog)
开始监听传入连接。backlog指定在拒绝连接之前,可以挂起的最大连接数量。
backlog等于5,表示内核已经接到了连接请求,但服务器还没有调用accept进行处理的连接个数最大为5
这个值不能无限大,因为要在内核中维护连接队列
sk.setblocking(bool)
是否阻塞(默认True),如果设置False,那么accept和recv时一旦无数据,则报错。
sk.accept()
接受连接并返回(conn,address),其中conn是新的套接字对象,可以用来接收和发送数据。address是连接客户端的地址。
接收TCP 客户的连接(阻塞式)等待连接的到来
sk.connect(address)
连接到address处的套接字。一般,address的格式为元组(hostname,port),如果连接出错,返回socket.error错误。
sk.connect_ex(address)
同上,只不过会有返回值,连接成功时返回 0 ,连接失败时候返回编码,例如:10061
sk.close()
关闭套接字
sk.recv(bufsize[,flag])
接受套接字的数据。数据以字符串形式返回,bufsize指定最多可以接收的数量。flag提供有关消息的其他信息,通常可以忽略。
sk.recvfrom(bufsize[.flag])
与recv()类似,但返回值是(data,address)。其中data是包含接收数据的字符串,address是发送数据的套接字地址。
sk.send(string[,flag])
将string中的数据发送到连接的套接字。返回值是要发送的字节数量,该数量可能小于string的字节大小。即:可能未将指定内容全部发送。
sk.sendall(string[,flag])
将string中的数据发送到连接的套接字,但在返回之前会尝试发送所有数据。成功返回None,失败则抛出异常。
内部通过递归调用send,将所有内容发送出去。
sk.sendto(string[,flag],address)
将数据发送到套接字,address是形式为(ipaddr,port)的元组,指定远程地址。返回值是发送的字节数。该函数主要用于UDP协议。
sk.settimeout(timeout)
设置套接字操作的超时期,timeout是一个浮点数,单位是秒。值为None表示没有超时期。一般,超时期应该在刚创建套接字时设置,因为它们可能用于连接的操作(如 client 连接最多等待5s )
sk.getpeername()
返回连接套接字的远程地址。返回值通常是元组(ipaddr,port)。
sk.getsockname()
返回套接字自己的地址。通常是一个元组(ipaddr,port)
sk.fileno()
套接字的文件描述符
socket.
sendfile
(file, offset=0, count=None)
发送文件 ,但目前多数情况下并无什么卵用。
SocketServer
The socketserver
module simplifies the task of writing network servers.
There are four basic concrete server classes:
- class
socketserver.
TCPServer
(server_address, RequestHandlerClass, bind_and_activate=True) -
This uses the Internet TCP protocol, which provides for continuous streams of data between the client and server. If bind_and_activate is true, the constructor automatically attempts to invokeserver_bind()
andserver_activate()
. The other parameters are passed to theBaseServer
base class.
- class
socketserver.
UDPServer
(server_address, RequestHandlerClass, bind_and_activate=True) -
This uses datagrams, which are discrete packets of information that may arrive out of order or be lost while in transit. The parameters are the same as forTCPServer
.
- class
socketserver.
UnixStreamServer
(server_address, RequestHandlerClass, bind_and_activate=True) - class
socketserver.
UnixDatagramServer
(server_address, RequestHandlerClass,bind_and_activate=True) -
These more infrequently used classes are similar to the TCP and UDP classes, but use Unix domain sockets; they’re not available on non-Unix platforms. The parameters are the same as forTCPServer
.
These four classes process requests synchronously; each request must be completed before the next request can be started. This isn’t suitable if each request takes a long time to complete, because it requires a lot of computation, or because it returns a lot of data which the client is slow to process. The solution is to create a separate process or thread to handle each request; the ForkingMixIn
and ThreadingMixIn
mix-in classes can be used to support asynchronous behaviour.
There are five classes in an inheritance diagram, four of which represent synchronous servers of four types:
+------------+ | BaseServer | +------------+ | v +-----------+ +------------------+ | TCPServer |------->| UnixStreamServer | +-----------+ +------------------+ | v +-----------+ +--------------------+ | UDPServer |------->| UnixDatagramServer | +-----------+ +--------------------+
Note that UnixDatagramServer
derives from UDPServer
, not from UnixStreamServer
— the only difference between an IP and a Unix stream server is the address family, which is simply repeated in both Unix server classes.
- class
socketserver.
ForkingMixIn
- class
socketserver.
ThreadingMixIn
-
Forking and threading versions of each type of server can be created using these mix-in classes. For instance,ThreadingUDPServer
is created as follows:class ThreadingUDPServer(ThreadingMixIn, UDPServer): pass
The mix-in class comes first, since it overrides a method defined in
UDPServer
. Setting the various attributes also changes the behavior of the underlying server mechanism.
- class
socketserver.
ForkingTCPServer
- class
socketserver.
ForkingUDPServer
- class
socketserver.
ThreadingTCPServer
- class
socketserver.
ThreadingUDPServer
-
These classes are pre-defined using the mix-in classes.Request Handler Objects
class
socketserver.
BaseRequestHandler
This is the superclass of all request handler objects. It defines the interface, given below. A concrete request handler subclass must define a new
handle()
method, and can override any of the other methods. A new instance of the subclass is created for each request.setup
()-
Called before thehandle()
method to perform any initialization actions required. The default implementation does nothing.
handle
()-
This function must do all the work required to service a request. The default implementation does nothing. Several instance attributes are available to it; the request is available asself.request
; the client address asself.client_address
; and the server instance asself.server
, in case it needs access to per-server information.The type of
self.request
is different for datagram or stream services. For stream services,self.request
is a socket object; for datagram services,self.request
is a pair of string and socket.
finish
()-
Called after thehandle()
method to perform any clean-up actions required. The default implementation does nothing. Ifsetup()
raises an exception, this function will not be called.1 # _*_ coding:utf-8 _*_ 2 # __author__ = "ZingP" 3 4 import socketserver 5 6 class MyTCPHandler(socketserver.BaseRequestHandler): 7 def handle(self): 8 while True: 9 try: 10 self.data = self.request.recv(1024).strip() 11 print("{}wrote:".format(self.client_address[0])) 12 print(self.data) 13 self.request.send(self.data.upper()) 14 except ConnectionResetError as e: 15 print("error:", e) 16 break 17 18 if __name__ == "__main__": 19 HOST, PORT = "localhost", 9000 20 # server = socketserver.TCPServer((HOST, PORT), MyTCPHandler) 21 # server = socketserver.ForkingTCPServer((HOST, PORT), MyTCPHandler) # 多进程 Windows上不好使,Linux好使。 22 server = socketserver.ThreadingTCPServer((HOST, PORT), MyTCPHandler) # 多线程 23 server.serve_forever() 24 25
2.socket进阶
2.1模拟ssh
2.1.1server端 1 # _*_ coding:utf-8 _*_ 2 __author__ = "ZingP" 3 4 import socket,os 6 server = socket.socket()
7 server.bind(("localhost", 9000)) 8 server.listen() 9 10 while True: 11 conn, addr = server.accept() 12 while True: 13 print("等待新指令。") 14 data = conn.recv(1024) 15 if not data: break 16 cmd_res = os.popen(data.decode()).read() 17 print("before send:", len(cmd_res)) 18 if len(cmd_res) == 0: 19 cmd_res = "cmd has no output..." # cmd 没有返回的数据 20 conn.send(str(len(cmd_res.encode())).encode("utf-8")) # 把cmd返回的数据大小发给客户端(告诉客户端,我要给你发这么大的数据) # 这里有个坑,就是必须对cmd_res进行encode(),否则会因为中文编码问题,导致发送的数据和接收的数据打印出的大小不一致 21 conn.send(cmd_res.encode("utf-8")) # cmd返回的数据 22 print("send done") 23 24 server.close()
2.1.2client端
# _*_ coding:utf-8 _*_ __author__ = "ZingP" import socket client = socket.socket() client.connect(("localhost", 9000)) while True: cmd = input(">>:").strip() if len(cmd) == 0: continue client.send(cmd.encode("utf-8")) cmd_res_size = client.recv(1024) # 第一次先接收即将要接收的数据大小 print("命令结果大小:", cmd_res_size) received_size = 0 # 初始收到的数据大小为零 received_data = b‘‘ # 初始收到的数据为空 while received_size < int(cmd_res_size.decode()): data = client.recv(1024) received_size += len(data) # 累加每次收到的数据大小 received_data += data # 累加每次收到的数据 else: print("收到数据大小:", received_size) print(received_data.decode()) client.close()