Docker flannel network deployment and route analysis

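1. Introduction to flannel

flannel is a container networking solution developed by CoreOS. flannel assigns each host a subnet, and containers get their IPs from that subnet. These IPs are routable between hosts, so containers can communicate across hosts without NAT or port mapping.

Each subnet is carved out of a larger IP pool. flannel runs an agent called flanneld on every host, whose job is to allocate a subnet from the pool. To share information between hosts, flannel stores the network configuration, the allocated subnets, and the hosts' IPs in etcd.

Packets are forwarded between hosts by a backend.
flannel provides several backends; the most commonly used are vxlan and host-gw.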

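2. Setting up the lab environment

Three virtual machines:

docker1 docker2 docker3

etcd is installed on docker1.
flanneld runs on docker1, docker2, and docker3.
Note: flannel is also installed on docker1 to make it easier to verify flannel and etcd together; strictly speaking, docker1 does not need it.

CentOS 7 ships the packages, so they can be installed directly with yum.

2.1
Install and configure etcd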

yum -y install etcd
[[email protected] ~]# systemctl start etcd && systemctl enable etcd
[[email protected] ~]#

Test it:

[[email protected] ~]# etcd --version
etcd Version: 3.2.18
Git SHA: eddf599
Go Version: go1.9.4
Go OS/Arch: linux/amd64
[[email protected] ~]#

[[email protected] ~]# etcdctl set test "a"
a
[[email protected] ~]# etcdctl get test
a
[[email protected] ~]#

2.2

Install and configure flannel

[[email protected] ~]# yum -y install flannel

Start it:

[[email protected] ~]# systemctl start flanneld

It fails:

[[email protected] ~]# systemctl status flanneld -l
● flanneld.service - Flanneld overlay address etcd agent
   Loaded: loaded (/usr/lib/systemd/system/flanneld.service; disabled; vendor preset: disabled)
   Active: activating (start) since Thu 2018-06-14 02:22:26 EDT; 1min 1s ago
 Main PID: 2950 (flanneld)
   Memory: 16.6M
   CGroup: /system.slice/flanneld.service
           └─2950 /usr/bin/flanneld -etcd-endpoints=http://127.0.0.1:2379 -etcd-prefix=/atomic.io/network

Jun 14 02:23:18 docker1 flanneld-start[2950]: E0614 02:23:18.974351    2950 network.go:102] failed to retrieve network config: 100: Key not found (/atomic.io) [11]
Jun 14 02:23:19 docker1 flanneld-start[2950]: E0614 02:23:19.977497    2950 network.go:102] failed to retrieve network config: 100: Key not found (/atomic.io) [11]
Jun 14 02:23:20 docker1 flanneld-start[2950]: E0614 02:23:20.980721    2950 network.go:102] failed to retrieve network config: 100: Key not found (/atomic.io) [11]
Jun 14 02:23:21 docker1 flanneld-start[2950]: E0614 02:23:21.983553    2950 network.go:102] failed to retrieve network config: 100: Key not found (/atomic.io) [11]
Jun 14 02:23:22 docker1 flanneld-start[2950]: E0614 02:23:22.988446    2950 network.go:102] failed to retrieve network config: 100: Key not found (/atomic.io) [11]
Jun 14 02:23:23 docker1 flanneld-start[2950]: E0614 02:23:23.992106    2950 network.go:102] failed to retrieve network config: 100: Key not found (/atomic.io) [11]
Jun 14 02:23:24 docker1 flanneld-start[2950]: E0614 02:23:24.994719    2950 network.go:102] failed to retrieve network config: 100: Key not found (/atomic.io) [11]
Jun 14 02:23:25 docker1 flanneld-start[2950]: E0614 02:23:25.998629    2950 network.go:102] failed to retrieve network config: 100: Key not found (/atomic.io) [11]
Jun 14 02:23:27 docker1 flanneld-start[2950]: E0614 02:23:27.002486    2950 network.go:102] failed to retrieve network config: 100: Key not found (/atomic.io) [11]
Jun 14 02:23:28 docker1 flanneld-start[2950]: E0614 02:23:28.006185    2950 network.go:102] failed to retrieve network config: 100: Key not found (/atomic.io) [11]

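Note the option -etcd-prefix=/atomic.io/network.
This is the etcd key prefix from which flannel reads its network configuration. The prefix is set in /etc/sysconfig/flanneld, which the systemd unit loads through EnvironmentFile: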

[[email protected] ~]# cat /usr/lib/systemd/system/flanneld.service
[Unit]
Description=Flanneld overlay address etcd agent
After=network.target
After=network-online.target
Wants=network-online.target
After=etcd.service
Before=docker.service

[Service]
Type=notify
EnvironmentFile=/etc/sysconfig/flanneld
EnvironmentFile=-/etc/sysconfig/docker-network
ExecStart=/usr/bin/flanneld-start $FLANNEL_OPTIONS
ExecStartPost=/usr/libexec/flannel/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/docker
Restart=on-failure

[Install]
WantedBy=multi-user.target
RequiredBy=docker.service
[[email protected] sysconfig]# cat flanneld
# Flanneld configuration options

# etcd url location.  Point this to the server where etcd runs
FLANNEL_ETCD_ENDPOINTS="http://127.0.0.1:2379"

# etcd config key.  This is the configuration key that flannel queries
# For address range assignment
FLANNEL_ETCD_PREFIX="/atomic.io/network"

# Any additional options that you want to pass
#FLANNEL_OPTIONS=""

Note:

FLANNEL_ETCD_PREFIX="/atomic.io/network"

The config key under this FLANNEL_ETCD_PREFIX has to be created manually with etcdctl:

[[email protected] ~]# etcdctl mk /atomic.io/network/config '{"Network":"172.17.0.0/16", "SubnetMin": "172.17.1.0", "SubnetMax": "172.17.254.0", "Backend":{"Type":"vxlan"}}'
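
The config value is plain JSON, so it can be sanity-checked before it is written to etcd. The following is a minimal Python sketch, not flannel's own code. It assumes a per-host subnet length of /24 (flannel's default SubnetLen, which matches the leases shown later in this article) and lists the candidate subnets that lie between SubnetMin and SubnetMax. The names CONFIG and candidate_subnets are made up for this illustration.

```python
import ipaddress
import json

# The same JSON that is written to /atomic.io/network/config.
CONFIG = (
    '{"Network":"172.17.0.0/16", "SubnetMin": "172.17.1.0", '
    '"SubnetMax": "172.17.254.0", "Backend":{"Type":"vxlan"}}'
)


def candidate_subnets(config_json, subnet_len=24):
    """Return the per-host subnets between SubnetMin and SubnetMax.

    subnet_len=24 is an assumption (flannel's default SubnetLen).
    """
    config = json.loads(config_json)
    network = ipaddress.ip_network(config["Network"])
    low = ipaddress.ip_address(config["SubnetMin"])
    high = ipaddress.ip_address(config["SubnetMax"])
    return [
        subnet
        for subnet in network.subnets(new_prefix=subnet_len)
        if low <= subnet.network_address <= high
    ]


subnets = candidate_subnets(CONFIG)
print(len(subnets), subnets[0], subnets[-1])
```

Each flanneld then leases one of these subnets for its host, which is why the leases seen below all have the form 172.17.x.0/24.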

Start flannel again; this time it starts normally:

[[email protected] ~]# systemctl start flanneld && systemctl enable flanneld
Created symlink from /etc/systemd/system/multi-user.target.wants/flanneld.service to /usr/lib/systemd/system/flanneld.service.
Created symlink from /etc/systemd/system/docker.service.requires/flanneld.service to /usr/lib/systemd/system/flanneld.service.
[[email protected] ~]#
[[email protected] ~]# systemctl status flanneld
● flanneld.service - Flanneld overlay address etcd agent
   Loaded: loaded (/usr/lib/systemd/system/flanneld.service; disabled; vendor preset: disabled)
   Active: active (running) since Thu 2018-06-14 02:47:58 EDT; 11s ago
  Process: 3513 ExecStartPost=/usr/libexec/flannel/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/docker (code=exited, status=0/SUCCESS)
 Main PID: 3475 (flanneld)
   Memory: 18.5M
   CGroup: /system.slice/flanneld.service
           └─3475 /usr/bin/flanneld -etcd-endpoints=http://127.0.0.1:2379 -etcd-prefix=/atomic.io/network

Jun 14 02:47:52 docker1 flanneld-start[3475]: E0614 02:47:52.150129    3475 network.go:102] failed to retrieve network co...) [14]
Jun 14 02:47:53 docker1 flanneld-start[3475]: E0614 02:47:53.152602    3475 network.go:102] failed to retrieve network co...) [14]
Jun 14 02:47:54 docker1 flanneld-start[3475]: E0614 02:47:54.155402    3475 network.go:102] failed to retrieve network co...) [14]
Jun 14 02:47:55 docker1 flanneld-start[3475]: E0614 02:47:55.158612    3475 network.go:102] failed to retrieve network co...) [14]
Jun 14 02:47:56 docker1 flanneld-start[3475]: E0614 02:47:56.164481    3475 network.go:102] failed to retrieve network co...) [14]
Jun 14 02:47:57 docker1 flanneld-start[3475]: E0614 02:47:57.168282    3475 network.go:102] failed to retrieve network co...) [14]
Jun 14 02:47:58 docker1 flanneld-start[3475]: I0614 02:47:58.179298    3475 local_manager.go:179] Picking subnet in range....254.0
Jun 14 02:47:58 docker1 flanneld-start[3475]: I0614 02:47:58.261220    3475 manager.go:250] Lease acquired: 172.17.21.0/24
Jun 14 02:47:58 docker1 flanneld-start[3475]: I0614 02:47:58.261993    3475 network.go:98] Watching for new subnet leases
Jun 14 02:47:58 docker1 systemd[1]: Started Flanneld overlay address etcd agent.
Hint: Some lines were ellipsized, use -l to show in full.

Take a look at this script:

Process: 3513 ExecStartPost=/usr/libexec/flannel/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/docker (code=exited, status=0/SUCCESS)

flannel_env="/run/flannel/subnet.env"
docker_env="/run/docker_opts.env"
combined_opts_key="DOCKER_OPTS"
indiv_opts=false
combined_opts=false
ipmasq=true

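Check the contents of the file the script reads by default (flannel_env). My impression is that the network settings passed to Docker are generated from this file, but I have not confirmed it.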

[[email protected] flannel]# cat /run/flannel/subnet.env
FLANNEL_NETWORK=172.17.0.0/16
FLANNEL_SUBNET=172.17.21.1/24
FLANNEL_MTU=1472
FLANNEL_IPMASQ=false

Check the IP range:

[[email protected] ~]# ip a |grep flannel
11: flannel0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1472 qdisc pfifo_fast state UNKNOWN qlen 500
    inet 172.17.21.0/16 scope global flannel0
[[email protected] ~]#

Everything above was done on docker1.

2.3

The steps on docker2 and docker3 are identical; I record the ones on docker2 here.

[[email protected] ~]# yum -y install flannel

Start flannel:

[[email protected] ~]# flanneld -etcd-endpoints=http://192.168.211.140:2379 -iface=ens33 -etcd-prefix=/atomic.io/network
I0614 04:28:55.785204    2767 main.go:132] Installing signal handlers
I0614 04:28:55.785764    2767 manager.go:149] Using interface with name ens33 and address 192.168.211.154
I0614 04:28:55.785784    2767 manager.go:166] Defaulting external address to interface address (192.168.211.154)
E0614 04:28:55.786742    2767 network.go:102] failed to retrieve network config: client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 192.168.211.140:2379: getsockopt: no route to host
E0614 04:28:57.788671    2767 network.go:102] failed to retrieve network config: client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 192.168.211.140:2379: i/o timeout
E0614 04:28:59.791359    2767 network.go:102] failed to retrieve network config: client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 192.168.211.140:2379: i/o timeout

It fails.
This error occurs because etcd by default listens for client connections only on localhost, port 2379:

[[email protected] ~]# cat /etc/etcd/etcd.conf
#[Member]
#ETCD_CORS=""
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
#ETCD_WAL_DIR=""
#ETCD_LISTEN_PEER_URLS="http://localhost:2380"
ETCD_LISTEN_CLIENT_URLS="http://localhost:2379"
#ETCD_MAX_SNAPSHOTS="5"
#ETCD_MAX_WALS="5"
ETCD_NAME="default"
#ETCD_SNAPSHOT_COUNT="100000"

Change ETCD_LISTEN_CLIENT_URLS="http://localhost:2379" to ETCD_LISTEN_CLIENT_URLS="http://0.0.0.0:2379".

Restart etcd:

[[email protected] ~]# systemctl restart etcd
[[email protected] ~]# systemctl status etcd
● etcd.service - Etcd Server
   Loaded: loaded (/usr/lib/systemd/system/etcd.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2018-06-14 04:38:49 EDT; 1min 11s ago
 Main PID: 3401 (etcd)
   Memory: 21.5M
   CGroup: /system.slice/etcd.service
           └─3401 /usr/bin/etcd --name=default --data-dir=/var/lib/etcd/default.etcd --listen-client-urls=http://0.0.0.0:23...

Jun 14 04:38:48 docker1 etcd[3401]: enabled capabilities for version 3.2
Jun 14 04:38:49 docker1 etcd[3401]: 8e9e05c52164694d is starting a new election at term 9
Jun 14 04:38:49 docker1 etcd[3401]: 8e9e05c52164694d became candidate at term 10
Jun 14 04:38:49 docker1 etcd[3401]: 8e9e05c52164694d received MsgVoteResp from 8e9e05c52164694d at term 10
Jun 14 04:38:49 docker1 etcd[3401]: 8e9e05c52164694d became leader at term 10
Jun 14 04:38:49 docker1 etcd[3401]: raft.node: 8e9e05c52164694d elected leader 8e9e05c52164694d at term 10
Jun 14 04:38:49 docker1 etcd[3401]: published {Name:default ClientURLs:[http://192.168.211.140:2379]} to cluster cdf8...3a8c32
Jun 14 04:38:49 docker1 etcd[3401]: ready to serve client requests
Jun 14 04:38:49 docker1 systemd[1]: Started Etcd Server.
Jun 14 04:38:49 docker1 etcd[3401]: serving insecure client requests on [::]:2379, this is strongly discouraged!
Hint: Some lines were ellipsized, use -l to show in full.
[[email protected] ~]#

Starting flannel again still fails:

[[email protected] ~]# systemctl status flanneld -l
● flanneld.service - Flanneld overlay address etcd agent
   Loaded: loaded (/usr/lib/systemd/system/flanneld.service; disabled; vendor preset: disabled)
   Active: inactive (dead)

Jun 14 04:21:53 docker2 flanneld-start[2706]: E0614 04:21:53.879476    2706 network.go:102] failed to retrieve network config: client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 127.0.0.1:2379: getsockopt: connection refused
Jun 14 04:21:54 docker2 flanneld-start[2706]: E0614 04:21:54.880962    2706 network.go:102] failed to retrieve network config: client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 127.0.0.1:2379: getsockopt: connection refused
Jun 14 04:21:55 docker2 flanneld-start[2706]: E0614 04:21:55.882332    2706 network.go:102] failed to retrieve network config: client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 127.0.0.1:2379: getsockopt: connection refused
Jun 14 04:21:56 docker2 flanneld-start[2706]: E0614 04:21:56.887002    2706 network.go:102] failed to retrieve network config: client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 127.0.0.1:2379: getsockopt: connection refused
Jun 14 04:21:57 docker2 flanneld-start[2706]: E0614 04:21:57.888246    2706 network.go:102] failed to retrieve network config: client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 127.0.0.1:2379: getsockopt: connection refused
Jun 14 04:21:58 docker2 flanneld-start[2706]: E0614 04:21:58.889903    2706 network.go:102] failed to retrieve network config: client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 127.0.0.1:2379: getsockopt: connection refused
Jun 14 04:21:59 docker2 flanneld-start[2706]: E0614 04:21:59.891323    2706 network.go:102] failed to retrieve network config: client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 127.0.0.1:2379: getsockopt: connection refused
Jun 14 04:22:00 docker2 flanneld-start[2706]: E0614 04:22:00.892229    2706 network.go:102] failed to retrieve network config: client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 127.0.0.1:2379: getsockopt: connection refused
Jun 14 04:22:01 docker2 systemd[1]: Stopped Flanneld overlay address etcd agent.
Jun 14 04:22:01 docker2 flanneld-start[2706]: I0614 04:22:01.105679    2706 main.go:172] Exiting...
[[email protected] ~]#

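The connection is refused. Note that this systemd log is from the packaged unit on docker2, which still points at 127.0.0.1:2379, where no etcd is running. For the manual run against 192.168.211.140, the earlier "no route to host" error suggests that the firewall on docker1 is the problem.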
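
To tell these failure modes apart from a client host, a small TCP probe can help. This is only a sketch in Python using the standard socket module; the function name probe and the result strings are made up for this illustration. A "refused" result usually means nothing is listening on that address, while a timeout or "No route to host" more often points to a firewall or routing problem.

```python
import socket


def probe(host, port, timeout=2.0):
    """Classify a TCP connection attempt to host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Nothing is listening there, or something actively rejected us.
        return "refused"
    except socket.timeout:
        # Packets were silently dropped somewhere on the way.
        return "timeout"
    except OSError as exc:
        # For example "No route to host".
        return "error: %s" % exc
```

For the setup in this article, the call would be probe("192.168.211.140", 2379), run on docker2 or docker3.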

Disable the firewall on docker1, then start flanneld again on docker2:

[[email protected] ~]# flanneld -etcd-endpoints=http://192.168.211.140:2379 -iface=ens33 -etcd-prefix=/atomic.io/network &
[1] 2938
[[email protected] ~]# I0614 04:44:03.522494    2938 main.go:132] Installing signal handlers
I0614 04:44:03.523151    2938 manager.go:149] Using interface with name ens33 and address 192.168.211.154
I0614 04:44:03.523174    2938 manager.go:166] Defaulting external address to interface address (192.168.211.154)
I0614 04:44:03.530498    2938 local_manager.go:134] Found lease (172.17.41.0/24) for current IP (192.168.211.154), reusing
I0614 04:44:03.546625    2938 manager.go:250] Lease acquired: 172.17.41.0/24
I0614 04:44:03.547228    2938 network.go:98] Watching for new subnet leases
I0614 04:44:03.558669    2938 network.go:191] Subnet added: 172.17.21.0/24

[[email protected] ~]# ip a |grep flannel
8: flannel0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1472 qdisc pfifo_fast state UNKNOWN group default qlen 500
    inet 172.17.41.0/16 scope global flannel0
[[email protected] ~]#

It started.

Oddly, after docker1 was rebooted the connection worked, without adding any rule to open port 2379.

2.4
Do the same on docker3.

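3. Analyzing the flannel network

3.1
The basic architecture is now deployed: etcd is installed on docker1, and flannel is installed on docker1, docker2, and docker3.

On docker1, look at the configured address range and at the subnets already allocated.

The configured range:

[[email protected] ~]# etcdctl get atomic.io/network/config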
{"Network":"172.17.0.0/16", "SubnetMin": "172.17.1.0", "SubnetMax": "172.17.254.0", "Backend":{"Type":"vxlan"}}

The subnets already allocated:

[[email protected] ~]# etcdctl ls atomic.io/network/subnets
/atomic.io/network/subnets/172.17.21.0-24
/atomic.io/network/subnets/172.17.41.0-24
/atomic.io/network/subnets/172.17.95.0-24
[[email protected] ~]#

3.2
Using the flannel network in Docker

Configure Docker to connect to flannel; here I use docker2 and docker3.

Docker is connected to flannel by editing the Docker unit file
/etc/systemd/system/docker.service
and setting --bip and --mtu.

The values for --bip and --mtu come from /run/flannel/subnet.env:

[[email protected] ~]# cat /run/flannel/subnet.env
FLANNEL_NETWORK=172.17.0.0/16
FLANNEL_SUBNET=172.17.65.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=false

--bip is the value of FLANNEL_SUBNET
--mtu is the value of FLANNEL_MTU
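
The mapping from subnet.env to the two dockerd options can be sketched as follows. This is an illustrative Python snippet, not the packaged mk-docker-opts.sh script; the function name docker_opts_from_subnet_env is hypothetical, and the sample text is the subnet.env content shown above.

```python
def docker_opts_from_subnet_env(text):
    """Build dockerd options from the contents of /run/flannel/subnet.env."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        env[key] = value
    return ["--bip=" + env["FLANNEL_SUBNET"], "--mtu=" + env["FLANNEL_MTU"]]


SUBNET_ENV = """\
FLANNEL_NETWORK=172.17.0.0/16
FLANNEL_SUBNET=172.17.65.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=false
"""

opts = docker_opts_from_subnet_env(SUBNET_ENV)
print(" ".join(opts))
```

The resulting options are the ones appended to the ExecStart line of docker.service.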

Add these two values to docker.service:

[[email protected] ~]# cat /etc/systemd/system/docker.service
[Unit]
Description=Docker Application Container Engine
After=network.target

[Service]
Type=notify
ExecStart=/usr/bin/dockerd -H tcp://0.0.0.0:2376 -H unix:///var/run/docker.sock --storage-driver devicemapper --tlsverify --tlscacert /etc/docker/ca.pem --tlscert /etc/docker/server.pem --tlskey /etc/docker/server-key.pem --label provider=generic --bip=172.17.65.1/24 --mtu=1450
ExecReload=/bin/kill -s HUP
MountFlags=slave

Restart docker:

[[email protected] ~]# systemctl daemon-reload
[[email protected] ~]# systemctl restart docker.service

3.2
A brief analysis

After Docker is connected to flannel, the host's routes and bridges look like this:

[[email protected] ~]# ip r
default via 192.168.211.2 dev ens33 proto dhcp metric 100
172.17.0.0/16 dev flannel.1
172.17.65.0/24 dev docker0 proto kernel scope link src 172.17.65.1
172.18.0.0/16 dev docker_gwbridge proto kernel scope link src 172.18.0.1
192.168.211.0/24 dev ens33 proto kernel scope link src 192.168.211.154 metric 100
[[email protected] ~]# brctl show
bridge name        bridge id            STP enabled    interfaces
docker0            8000.02427f17e635    no             veth7680282
docker_gwbridge    8000.024266f01344    no
[[email protected] ~]#

No new bridge is created; the default docker0 is used.

172.17.65.0/24: containers on the same Docker host are connected through docker0.
172.17.0.0/16: traffic to containers on other Docker hosts is forwarded through flannel.1.

3.3
Connecting containers to the flannel network (bbox1 on docker2, bbox2 on docker3)

[[email protected] ~]# docker run -itd --name bbox1 busybox
dc344e4b30e48bbc5b914edc3724e43d56fa0ee7abf97246dc35ce57b6cf872c
[[email protected] ~]# docker exec bbox1 ip r
default via 172.17.65.1 dev eth0
172.17.65.0/24 dev eth0 scope link  src 172.17.65.2
[[email protected] ~]#

[[email protected] ~]# docker run -itd --name bbox2 busybox
e9e824f93e8dedb498f436b169ed8dcb85bc951f4d05ae4f254aad1e3d538a8a
[[email protected] ~]# docker exec bbox2 ip r
default via 172.17.16.1 dev eth0
172.17.16.0/24 dev eth0 scope link  src 172.17.16.2
[[email protected] ~]#

Ping in both directions:

[[email protected] ~]# docker exec bbox1 ping -c 5 172.17.16.2
PING 172.17.16.2 (172.17.16.2): 56 data bytes
64 bytes from 172.17.16.2: seq=0 ttl=60 time=3.274 ms
64 bytes from 172.17.16.2: seq=1 ttl=60 time=1.356 ms
^C
[[email protected] ~]#

[[email protected] ~]# docker exec bbox2 ping -c 5 172.17.65.2
PING 172.17.65.2 (172.17.65.2): 56 data bytes
64 bytes from 172.17.65.2: seq=0 ttl=60 time=2.995 ms
64 bytes from 172.17.65.2: seq=1 ttl=60 time=1.396 ms
64 bytes from 172.17.65.2: seq=2 ttl=60 time=1.650 ms
^C
[[email protected] ~]#

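Analyzing the data flow

Ping 172.17.16.2 from bbox1.

First look at its default route:

[[email protected] ~]# docker exec bbox1 ip r
default via 172.17.65.1 dev eth0
172.17.65.0/24 dev eth0 scope link  src 172.17.65.2
[[email protected] ~]#

The destination 172.17.16.2 is not on a directly connected network, so the packet leaves via the default route. The default gateway is 172.17.65.1, which is the address of docker0: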

[[email protected] ~]# ip a |grep docker0
4: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
    inet 172.17.65.1/24 brd 172.17.65.255 scope global docker0
24: [email protected]: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master docker0 state UP group default
[[email protected] ~]#

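When the packet reaches docker0, the host sees that the destination, 172.17.16.2, is not one of its own addresses and looks for the next hop.
Look at the host's routing table: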

[[email protected] ~]# ip r
default via 192.168.211.2 dev ens33 proto dhcp metric 100
172.17.0.0/16 dev flannel.1
172.17.65.0/24 dev docker0 proto kernel scope link src 172.17.65.1
172.18.0.0/16 dev docker_gwbridge proto kernel scope link src 172.18.0.1
192.168.211.0/24 dev ens33 proto kernel scope link src 192.168.211.154 metric 100
[[email protected] ~]#

The packet matches the 172.17.0.0/16 route. This is a directly connected route, so the packet is handed to flannel.1.
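
The route selection above follows the longest-prefix-match rule: among all routes that contain the destination, the most specific one wins. Below is a simplified Python sketch of that rule, using the routes from the `ip r` output above. It ignores metrics and policy routing, and the names ROUTES and lookup are made up for this illustration.

```python
import ipaddress

# Routes copied from the `ip r` output on the host: (prefix, device).
ROUTES = [
    ("0.0.0.0/0", "ens33"),  # default via 192.168.211.2
    ("172.17.0.0/16", "flannel.1"),
    ("172.17.65.0/24", "docker0"),
    ("172.18.0.0/16", "docker_gwbridge"),
    ("192.168.211.0/24", "ens33"),
]


def lookup(destination, routes=ROUTES):
    """Return the device of the most specific route containing destination."""
    address = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), device)
        for prefix, device in routes
        if address in ipaddress.ip_network(prefix)
    ]
    # The default route always matches, so matches is never empty here.
    network, device = max(matches, key=lambda item: item[0].prefixlen)
    return device


for target in ("172.17.65.2", "172.17.16.2"):
    print(target, "->", lookup(target))
```

A local container address such as 172.17.65.2 is more specific under 172.17.65.0/24 and stays on docker0, while 172.17.16.2 only matches the /16 and goes to flannel.1.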

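flannel.1 receives the packet. Since it is not the destination itself, it has to send the packet on: the packet moves down the protocol stack and is wrapped in an Ethernet frame at layer 2, which requires the destination MAC address.
At this point an ARP query "who is 172.17.16.2" is needed.

flannel.1 is a vxlan device. vxlan does not send the ARP packet out at layer 2; instead, the Linux kernel raises an "L3 MISS" event, which hands the ARP request to user space, to the flanneld process.

The kernel's "L3 MISS" behavior is controlled by the following parameter: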
[[email protected] ~]# cat /proc/sys/net/ipv4/neigh/flannel.1/app_solicit
3
[[email protected] ~]#

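On receiving the "L3 MISS" kernel event and the ARP request, flanneld does not send an ARP request to the external network. Instead it looks up in etcd the VTEP information of the subnet that contains the address.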
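
Finding "the subnet that contains the address" amounts to a containment check against the lease keys, which are written as address-prefixlen under /atomic.io/network/subnets. The following Python sketch only illustrates that idea and is not flanneld's implementation; LEASE_KEYS holds illustrative keys for the two subnets used in this section, and the function names are hypothetical.

```python
import ipaddress

# Illustrative lease keys for the two subnets used in this section.
LEASE_KEYS = ["172.17.16.0-24", "172.17.65.0-24"]


def key_to_network(key):
    """Turn a lease key such as "172.17.16.0-24" into an IPv4Network."""
    address, prefix_len = key.rsplit("-", 1)
    return ipaddress.ip_network(address + "/" + prefix_len)


def find_lease(target_ip, keys=LEASE_KEYS):
    """Return the lease key whose subnet contains target_ip, or None."""
    address = ipaddress.ip_address(target_ip)
    for key in keys:
        if address in key_to_network(key):
            return key
    return None


print(find_lease("172.17.16.2"))
```

The key found this way is the one queried from etcd in the next step.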

[[email protected] ~]# curl -L http://192.168.211.140:2379/v2/keys/atomic.io/network/subnets/172.17.16.0-24
{"action":"get","node":{"key":"/atomic.io/network/subnets/172.17.16.0-24","value":"{\"PublicIP\":\"192.168.211.153\",\"BackendType\":\"vxlan\",\"BackendData\":{\"VtepMAC\":\"d2:72:c5:90:ff:8c\"}}","expiration":"2018-06-20T06:49:24.673562899Z","ttl":83016,"modifiedIndex":98,"createdIndex":98}}
[[email protected] ~]#

flanneld finds the answer in etcd:

subnets:172.17.16.0-24
PublicIP:192.168.211.153
VtepMAC:d2:72:c5:90:ff:8c
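
The three fields above can be pulled out of the etcd v2 response with two rounds of JSON decoding, because the lease is stored as a JSON string inside the node's value. A minimal Python sketch, using the exact response shown above; the name parse_lease and the keys of the returned dictionary are made up for this illustration.

```python
import json

# The response body returned by the curl call above, copied verbatim.
RESPONSE = r'''{"action":"get","node":{"key":"/atomic.io/network/subnets/172.17.16.0-24","value":"{\"PublicIP\":\"192.168.211.153\",\"BackendType\":\"vxlan\",\"BackendData\":{\"VtepMAC\":\"d2:72:c5:90:ff:8c\"}}","expiration":"2018-06-20T06:49:24.673562899Z","ttl":83016,"modifiedIndex":98,"createdIndex":98}}'''


def parse_lease(response_text):
    """Extract subnet key, public IP, and VTEP MAC from an etcd v2 response."""
    node = json.loads(response_text)["node"]
    lease = json.loads(node["value"])  # the value is itself a JSON string
    return {
        "subnet_key": node["key"].rsplit("/", 1)[-1],
        "public_ip": lease["PublicIP"],
        "vtep_mac": lease["BackendData"]["VtepMAC"],
    }


lease = parse_lease(RESPONSE)
print(lease)
```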

Whose address is VtepMAC d2:72:c5:90:ff:8c?

Check on 192.168.211.153, which is docker3 here:

[[email protected] ~]# ip -d link show

11: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN mode DEFAULT group default
    link/ether d2:72:c5:90:ff:8c brd ff:ff:ff:ff:ff:ff promiscuity 0
    vxlan id 1 local 192.168.211.153 dev ens33 srcport 0 0 dstport 8472 nolearning ageing 300 noudpcsum noudp6zerocsumtx noudp6zerocsumrx

This is exactly the MAC address of flannel.1 on docker3.

Once the destination is found, flanneld writes the result into the ARP cache:

[[email protected] ~]# ip n
192.168.211.254 dev ens33 lladdr 00:50:56:ea:e1:f4 STALE
172.17.16.2 dev flannel.1 lladdr d2:72:c5:90:ff:8c STALE
192.168.211.2 dev ens33 lladdr 00:50:56:fd:b3:89 STALE
192.168.211.153 dev ens33 lladdr 00:0c:29:93:2c:89 STALE
192.168.211.1 dev ens33 lladdr 00:50:56:c0:00:08 DELAY
172.17.65.2 dev docker0 lladdr 02:42:ac:11:41:02 STALE
192.168.211.140 dev ens33 lladdr 00:0c:29:f9:b7:d2 DELAY
[[email protected] ~]#

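Finally the packet is encapsulated as a vxlan packet and sent to the destination.

4.
flannel assigns each host its own subnet, and flannel.1 connects these subnets so that they can route to each other. In essence, flannel joins the separate docker0 container networks on each host into one large interconnected network, which gives containers cross-host communication. flannel does not provide isolation.

5.
host-gw

flannel supports several backends; host-gw is another one of them.
Unlike vxlan, host-gw does not encapsulate packets. Instead it creates routes in each host's routing table to the other hosts' subnets, which gives containers cross-host communication.

Setting the backend
Modify the configuration set earlier and change the backend to host-gw: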

[[email protected] ~]# etcdctl set atomic.io/network/config '{"Network":"172.17.0.0/16", "SubnetMin": "172.17.1.0", "SubnetMax": "172.17.254.0", "Backend":{"Type":"host-gw"}}'
{"Network":"172.17.0.0/16", "SubnetMin": "172.17.1.0", "SubnetMax": "172.17.254.0", "Backend":{"Type":"host-gw"}}
[[email protected] ~]# etcdctl get atomic.io/network/config
{"Network":"172.17.0.0/16", "SubnetMin": "172.17.1.0", "SubnetMax": "172.17.254.0", "Backend":{"Type":"host-gw"}}
[[email protected] ~]#

Restart the flanneld process on docker2 and docker3.

[[email protected] ~]# flanneld -etcd-endpoints=http://192.168.211.140:2379 -iface=ens33 -etcd-prefix=/atomic.io/network &

Check the routing tables (docker2 first, then docker3):

[[email protected] ~]# ip r
default via 192.168.211.2 dev ens33 proto dhcp metric 100
169.254.0.0/16 dev ens33 scope link metric 1002
172.17.16.0/24 via 192.168.211.153 dev ens33
172.17.65.0/24 dev docker0 proto kernel scope link src 172.17.65.1
172.18.0.0/16 dev docker_gwbridge proto kernel scope link src 172.18.0.1
192.168.211.0/24 dev ens33 proto kernel scope link src 192.168.211.154 metric 100
[[email protected] ~]#

[[email protected] ~]# ip r
default via 192.168.211.2 dev ens33 proto dhcp metric 100
169.254.0.0/16 dev ens33 scope link metric 1002
172.17.16.0/24 dev docker0 proto kernel scope link src 172.17.16.1
172.17.65.0/24 via 192.168.211.154 dev ens33
172.18.0.0/16 dev docker_gwbridge proto kernel scope link src 172.18.0.1
192.168.211.0/24 dev ens33 proto kernel scope link src 192.168.211.153 metric 100
[[email protected] ~]#

172.17.16.0/24 via 192.168.211.153 dev ens33    (on docker2)
172.17.65.0/24 via 192.168.211.154 dev ens33    (on docker3)

The corresponding route entries have appeared.
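
These entries can be derived mechanically from the leases: for every subnet owned by another host, add a route via that host's public IP. The following Python sketch illustrates the idea only and is not flannel's code; host_gw_routes and LEASES are hypothetical names, and the addresses are the ones used in this article.

```python
def host_gw_routes(leases, local_ip, device="ens33"):
    """Return `ip route` style entries for all subnets owned by other hosts.

    leases maps a subnet to the public IP of the host that owns it.
    """
    return [
        "%s via %s dev %s" % (subnet, public_ip, device)
        for subnet, public_ip in sorted(leases.items())
        if public_ip != local_ip
    ]


LEASES = {
    "172.17.16.0/24": "192.168.211.153",  # docker3
    "172.17.65.0/24": "192.168.211.154",  # docker2
}

print(host_gw_routes(LEASES, "192.168.211.154"))  # routes on docker2
print(host_gw_routes(LEASES, "192.168.211.153"))  # routes on docker3
```

Because the next hop is the other host's own address, this scheme relies on the hosts being able to reach each other directly.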

The MTU has to be changed and Docker restarted.
The MTU setting was explained earlier; change it according to the value below:

[[email protected] ~]# cat /run/flannel/subnet.env
FLANNEL_NETWORK=172.17.0.0/16
FLANNEL_SUBNET=172.17.16.1/24
FLANNEL_MTU=1500
FLANNEL_IPMASQ=false
[[email protected] ~]#

[[email protected] ~]# sed -i 's/1450/1500/g' /etc/systemd/system/docker.service

Restart docker:

[[email protected] ~]# systemctl daemon-reload
[[email protected] ~]# systemctl restart docker
[[email protected] ~]#

Trace the path of a packet from a container:

[[email protected] ~]# docker exec bbox1 ping 172.17.16.2
PING 172.17.16.2 (172.17.16.2): 56 data bytes
64 bytes from 172.17.16.2: seq=0 ttl=62 time=1.851 ms
64 bytes from 172.17.16.2: seq=1 ttl=62 time=0.946 ms
64 bytes from 172.17.16.2: seq=2 ttl=62 time=0.972 ms
^C
[[email protected] ~]# docker exec bbox1 ip r
default via 172.17.65.1 dev eth0
172.17.65.0/24 dev eth0 scope link  src 172.17.65.2
[[email protected] ~]#

The packet takes the default route via 172.17.65.1.

[[email protected] ~]# ip a |grep docker0
4: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    inet 172.17.65.1/24 brd 172.17.65.255 scope global docker0
7: [email protected]: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP group default
[[email protected] ~]#

The packet reaches docker0.

[[email protected] ~]# ip r
default via 192.168.211.2 dev ens33 proto dhcp metric 100
169.254.0.0/16 dev ens33 scope link metric 1002
172.17.16.0/24 via 192.168.211.153 dev ens33
172.17.65.0/24 dev docker0 proto kernel scope link src 172.17.65.1
172.18.0.0/16 dev docker_gwbridge proto kernel scope link src 172.18.0.1
192.168.211.0/24 dev ens33 proto kernel scope link src 192.168.211.154 metric 100
[[email protected] ~]#

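The packet matches 172.17.16.0/24 via 192.168.211.153 dev ens33

and is sent there directly.

6.
A brief comparison of vxlan and host-gw

host-gw configures every host as a gateway; each host knows the other hosts' subnets and forwarding addresses.
vxlan builds tunnels between the hosts, so that the different hosts sit inside one large network.

vxlan has to encapsulate and decapsulate every packet, so its performance is lower than that of host-gw.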
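
The encapsulation cost is also visible in the MTU values seen earlier: 1450 with vxlan and 1500 with host-gw. The arithmetic can be sketched in Python as follows, assuming an IPv4 underlay without IP options and a 1500-byte link MTU; the constant and function names are made up for this illustration.

```python
# Header sizes in bytes (IPv4 underlay, no IP options).
OUTER_IP = 20
OUTER_UDP = 8
VXLAN_HEADER = 8
INNER_ETHERNET = 14


def overlay_mtu(link_mtu, backend):
    """Return the MTU left for container traffic under the given backend."""
    if backend == "vxlan":
        overhead = OUTER_IP + OUTER_UDP + VXLAN_HEADER + INNER_ETHERNET
        return link_mtu - overhead
    if backend == "host-gw":
        return link_mtu  # plain routing, nothing is added to the packet
    raise ValueError("backend not covered by this sketch: %r" % backend)


for backend in ("vxlan", "host-gw"):
    print(backend, overlay_mtu(1500, backend))
```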

Original article: http://blog.51cto.com/goome/2155643

Time: 2024-10-11 10:38:52
