Jumbo Frames
The IEEE 802.3 Ethernet standard only specifies support for a 1500-byte frame MTU, i.e. a total frame size of 1518 bytes (1522 bytes when an IEEE 802.1Q VLAN/QoS tag is present). Jumbo frames typically use a 9000-byte frame MTU, for a total frame size of 9018/9022 bytes.
Jumbo frames are not yet part of the official IEEE 802.3 Ethernet standard, so the level of support may differ between hardware vendors.
With jumbo frames, the larger payload per frame improves bandwidth efficiency. At the same time, longer frames increase transmission latency, so jumbo frames are not a good fit for latency-sensitive traffic.
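The efficiency gain is easy to quantify. Per frame, Ethernet adds roughly 18 bytes of header/FCS plus 20 bytes of preamble and inter-frame gap (the 38-byte figure below is an illustrative assumption, not from the original post):

```python
# Payload bytes divided by total wire bytes per frame.
def efficiency(mtu, per_frame_overhead=38):  # 18 B header/FCS + 20 B preamble/IFG
    return mtu / (mtu + per_frame_overhead)

print(round(efficiency(1500), 4))  # 0.9753
print(round(efficiency(9000), 4))  # 0.9958
```

A 9000-byte MTU pushes the payload fraction from about 97.5% to about 99.6%, and also means one interrupt/frame-processing cycle per 9000 bytes instead of per 1500.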
MTU configuration options in Neutron
Summarizing the option descriptions: global_physnet_mtu and physical_network_mtus together define the MTU of the underlay physical networks, while path_mtu defines the MTU of overlay networks.
**Three use cases for adjusting the MTU**
**Single-MTU physical network**
In neutron.conf:

```ini
[DEFAULT]
global_physnet_mtu = 9000
```

In ml2.ini:

```ini
[ml2]
path_mtu = 9000
```

This configuration sets the MTU of all underlay networks (flat, vlan) and all overlay networks (vxlan, gre) to 9000.
**Multi-MTU physical network**
In neutron.conf:

```ini
[DEFAULT]
global_physnet_mtu = 9000
```

In ml2.ini:

```ini
[ovs]
bridge_mappings = provider1:eth1,provider2:eth2,provider3:eth3

[ml2]
physical_network_mtus = provider2:4000,provider3:1500
path_mtu = 9000
```

This configuration sets the MTU of underlay network provider2 to 4000 and provider3 to 1500, while other physical networks such as provider1 use 9000. Overlay networks use an MTU of 9000.
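The selection logic above can be sketched as a small standalone function (a simplification for illustration, not Neutron's own code):

```python
# Pick the MTU for a flat/vlan segment: the smaller of global_physnet_mtu
# and any per-physnet override from physical_network_mtus.
def physnet_mtu(physical_network, global_physnet_mtu, physnet_mtus):
    candidates = []
    if global_physnet_mtu > 0:
        candidates.append(global_physnet_mtu)
    if physical_network in physnet_mtus:
        candidates.append(int(physnet_mtus[physical_network]))
    return min(candidates) if candidates else 0

overrides = {'provider2': 4000, 'provider3': 1500}
print(physnet_mtu('provider1', 9000, overrides))  # 9000 (no override)
print(physnet_mtu('provider2', 9000, overrides))  # 4000 (override wins)
```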
**Overlay network MTU**
In neutron.conf:

```ini
[DEFAULT]
global_physnet_mtu = 9000
```

In ml2.ini:

```ini
[ml2]
path_mtu = 4000
```

This configuration sets the MTU of all underlay networks to 9000, while overlay networks are limited to 4000 (less the per-type encapsulation overhead).
A brief look at the code
MTU handling when creating a network resource
For flat and vlan networks, the smallest usable MTU is derived from the actual physical-network mappings together with physical_network_mtus and global_physnet_mtu.
```python
def get_deployment_physnet_mtu():
    return cfg.CONF.global_physnet_mtu


class BaseTypeDriver(api.ML2TypeDriver):
    def __init__(self):
        try:
            self.physnet_mtus = helpers.parse_mappings(
                cfg.CONF.ml2.physical_network_mtus, unique_values=False
            )
        except Exception as e:
            LOG.error("Failed to parse physical_network_mtus: %s", e)
            self.physnet_mtus = []

    def get_mtu(self, physical_network=None):
        return p_utils.get_deployment_physnet_mtu()


class FlatTypeDriver(helpers.BaseTypeDriver):
    ...
    def get_mtu(self, physical_network):
        seg_mtu = super(FlatTypeDriver, self).get_mtu()
        mtu = []
        if seg_mtu > 0:
            mtu.append(seg_mtu)
        if physical_network in self.physnet_mtus:
            mtu.append(int(self.physnet_mtus[physical_network]))
        return min(mtu) if mtu else 0


class VlanTypeDriver(helpers.SegmentTypeDriver):
    ...
    def get_mtu(self, physical_network):
        seg_mtu = super(VlanTypeDriver, self).get_mtu()
        mtu = []
        if seg_mtu > 0:
            mtu.append(seg_mtu)
        if physical_network in self.physnet_mtus:
            mtu.append(int(self.physnet_mtus[physical_network]))
        return min(mtu) if mtu else 0
```
For Geneve, GRE, and VXLAN networks, the smaller of global_physnet_mtu and path_mtu is selected, and the header overhead of each tunnel type is then subtracted to obtain the actually usable MTU.
```python
class _TunnelTypeDriverBase(helpers.SegmentTypeDriver):
    ...
    def get_mtu(self, physical_network=None):
        seg_mtu = super(_TunnelTypeDriverBase, self).get_mtu()
        mtu = []
        if seg_mtu > 0:
            mtu.append(seg_mtu)
        if cfg.CONF.ml2.path_mtu > 0:
            mtu.append(cfg.CONF.ml2.path_mtu)
        version = cfg.CONF.ml2.overlay_ip_version
        ip_header_length = p_const.IP_HEADER_LENGTH[version]
        return min(mtu) - ip_header_length if mtu else 0


class GeneveTypeDriver(type_tunnel.EndpointTunnelTypeDriver):
    ...
    def get_mtu(self, physical_network=None):
        mtu = super(GeneveTypeDriver, self).get_mtu()
        return mtu - self.max_encap_size if mtu else 0


class GreTypeDriver(type_tunnel.EndpointTunnelTypeDriver):
    ...
    def get_mtu(self, physical_network=None):
        mtu = super(GreTypeDriver, self).get_mtu(physical_network)
        return mtu - p_const.GRE_ENCAP_OVERHEAD if mtu else 0


class VxlanTypeDriver(type_tunnel.EndpointTunnelTypeDriver):
    ...
    def get_mtu(self, physical_network=None):
        mtu = super(VxlanTypeDriver, self).get_mtu()
        return mtu - p_const.VXLAN_ENCAP_OVERHEAD if mtu else 0
```
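Putting numbers to this logic: below is a standalone sketch of the tunnel MTU math. The overhead constants are assumptions chosen to match the values Neutron's constants module commonly uses (20-byte IPv4 header, 30-byte VXLAN encapsulation, 22-byte GRE encapsulation); verify against your Neutron version.

```python
# Effective MTU of a tunnelled tenant network: min(global_physnet_mtu,
# path_mtu) minus the outer IP header and the tunnel encapsulation.
IP_HEADER_LENGTH = {4: 20, 6: 40}   # assumed, as in Neutron's constants
VXLAN_ENCAP_OVERHEAD = 30           # assumed: outer Ethernet 14 + UDP 8 + VXLAN 8
GRE_ENCAP_OVERHEAD = 22             # assumed: outer Ethernet 14 + GRE 8

def tunnel_mtu(global_physnet_mtu, path_mtu, encap_overhead, ip_version=4):
    candidates = [m for m in (global_physnet_mtu, path_mtu) if m > 0]
    if not candidates:
        return 0
    return min(candidates) - IP_HEADER_LENGTH[ip_version] - encap_overhead

# Third use case above (global_physnet_mtu=9000, path_mtu=4000), VXLAN:
print(tunnel_mtu(9000, 4000, VXLAN_ENCAP_OVERHEAD))  # 3950
```

With path_mtu unset (0), a VXLAN network over a 9000-byte underlay would get 9000 - 20 - 30 = 8950.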
When a user creates a network resource without explicitly specifying an MTU, the maximum usable MTU computed for that network type is used. If an MTU is specified, Neutron checks that the requested value does not exceed that maximum.
```python
def _get_network_mtu(self, network_db, validate=True):
    mtus = []
    ...
    for s in segments:
        segment_type = s.get('network_type')
        if segment_type is None:
            ...
        else:
            mtu = type_driver.get_mtu(s['physical_network'])
        # Some drivers, like 'local', may return None; the assumption
        # then is that for the segment type, MTU has no meaning or
        # is unlimited, and so we should then ignore those values.
        if mtu:
            mtus.append(mtu)

    max_mtu = min(mtus) if mtus else p_utils.get_deployment_physnet_mtu()
    net_mtu = network_db.get('mtu')

    if validate:
        # validate that requested mtu conforms to allocated segments
        if net_mtu and max_mtu and max_mtu < net_mtu:
            msg = _("Requested MTU is too big, maximum is %d") % max_mtu
            raise exc.InvalidInput(error_message=msg)

    # if mtu is not set in database, use the maximum possible
    return net_mtu or max_mtu
```
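The validation in _get_network_mtu condenses to a minimal sketch (a hypothetical helper for illustration, not Neutron code):

```python
# A user-supplied MTU must not exceed the smallest MTU allowed by the
# network's segments; if no MTU was requested, the maximum is used.
def effective_network_mtu(segment_mtus, requested_mtu=None, default_mtu=1500):
    max_mtu = min(segment_mtus) if segment_mtus else default_mtu
    if requested_mtu and max_mtu and requested_mtu > max_mtu:
        raise ValueError("Requested MTU is too big, maximum is %d" % max_mtu)
    return requested_mtu or max_mtu

print(effective_network_mtu([8950, 9000]))  # 8950: smallest segment wins
print(effective_network_mtu([8950], 1450))  # 1450: explicit request honored
```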
Setting the MTU of VM tap devices
In Neutron networks implemented with Linux Bridge, when the Linux Bridge agent detects a new device, it performs an ip link set operation to apply the network's MTU to the VM tap device attached to the bridge. The Open vSwitch implementation has no such logic: in practice, you must set the tap device's MTU manually with ovs-vsctl set Interface <tap name> mtu_request=<MTU value>.
```python
class LinuxBridgeManager(amb.CommonAgentManagerBase):
    def plug_interface(self, network_id, network_segment, tap_name,
                       device_owner):
        return self.add_tap_interface(network_id,
                                      network_segment.network_type,
                                      network_segment.physical_network,
                                      network_segment.segmentation_id,
                                      tap_name, device_owner,
                                      network_segment.mtu)

    def _set_tap_mtu(self, tap_device_name, mtu):
        ip_lib.IPDevice(tap_device_name).link.set_mtu(mtu)
```
Setting the MTU of network-service tap devices
When the tap devices for DHCP and router ports are plugged, Neutron runs "ip link set <tap name> mtu <MTU value>" inside each tap device's namespace, using the network's MTU.
```python
class OVSInterfaceDriver(LinuxInterfaceDriver):
    def plug_new(self, network_id, port_id, device_name, mac_address,
                 bridge=None, namespace=None, prefix=None, mtu=None):
        ...
        # NOTE(ihrachys): the order here is significant: we must set MTU after
        # the device is moved into a namespace, otherwise OVS bridge does not
        # allow to set MTU that is higher than the least of all device MTUs on
        # the bridge
        if mtu:
            self.set_mtu(device_name, mtu, namespace=namespace, prefix=prefix)
        else:
            LOG.warning("No MTU configured for port %s", port_id)
        ...

    def set_mtu(self, device_name, mtu, namespace=None, prefix=None):
        if self.conf.ovs_use_veth:
            tap_name = self._get_tap_name(device_name, prefix)
            root_dev, ns_dev = _get_veth(
                tap_name, device_name, namespace2=namespace)
            root_dev.link.set_mtu(mtu)
        else:
            ns_dev = ip_lib.IPWrapper(namespace=namespace).device(device_name)
        ns_dev.link.set_mtu(mtu)


class IpLinkCommand(IpDeviceCommandBase):
    COMMAND = 'link'
    ...
    def set_mtu(self, mtu_size):
        self._as_root([], ('set', self.name, 'mtu', mtu_size))
```
Setting the MTU of veths between bridges
Since the Juno release, Neutron has used OVS patch ports instead of Linux veth pairs to connect OVS bridges (for better performance), but the veth option is still available: setting use_veth_interconnection=true in openvswitch_agent.ini enables veth-based bridge interconnection. With this option enabled, the default veth_mtu is 9000. If the link MTU is configured above 9000, veth_mtu in openvswitch_agent.ini must be raised accordingly so the veths do not become a bottleneck.
```python
class OVSNeutronAgent(l2population_rpc.L2populationRpcCallBackTunnelMixin,
                      dvr_rpc.DVRAgentRpcCallbackMixin):
    def __init__(self, bridge_classes, ext_manager, conf=None):
        ...
        self.use_veth_interconnection = ovs_conf.use_veth_interconnection
        self.veth_mtu = agent_conf.veth_mtu
        ...

    def setup_physical_bridges(self, bridge_mappings):
        '''Setup the physical network bridges.

        Creates physical network bridges and links them to the
        integration bridge using veths or patch ports.

        :param bridge_mappings: map physical network names to bridge names.
        '''
        self.phys_brs = {}
        self.int_ofports = {}
        self.phys_ofports = {}
        ip_wrapper = ip_lib.IPWrapper()
        ovs = ovs_lib.BaseOVS()
        ovs_bridges = ovs.get_bridges()
        for physical_network, bridge in bridge_mappings.items():
            ...
            if self.use_veth_interconnection:
                # enable veth to pass traffic
                int_veth.link.set_up()
                phys_veth.link.set_up()
                if self.veth_mtu:
                    # set up mtu size for veth interfaces
                    int_veth.link.set_mtu(self.veth_mtu)
                    phys_veth.link.set_mtu(self.veth_mtu)
            else:
                # associate patch ports to pass traffic
                self.int_br.set_db_attribute('Interface', int_if_name,
                                             'options', {'peer': phys_if_name})
                br.set_db_attribute('Interface', phys_if_name,
                                    'options', {'peer': int_if_name})
```
How the VM NIC gets its MTU
The NIC inside the VM learns its MTU during the DHCP exchange, when the VM requests an IP address. RFC 2132 (DHCP Options and BOOTP Vendor Extensions) defines the Interface MTU option: DHCP option code 26 carries the interface MTU as a two-byte value.
In the DHCP agent, dnsmasq's spawn_process adjusts its own command-line arguments according to the network's MTU, so that the VM configures the correct NIC MTU during DHCP.
```python
class Dnsmasq(DhcpLocalProcess):
    def _build_cmdline_callback(self, pid_file):
        # We ignore local resolv.conf if dns servers are specified
        # or if local resolution is explicitly disabled.
        ...
        mtu = getattr(self.network, 'mtu', 0)
        # Do not advertise unknown mtu
        if mtu > 0:
            cmd.append('--dhcp-option-force=option:mtu,%d' % mtu)
        ...
        return cmd
```
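For illustration, the flag construction can be isolated into a hypothetical helper mirroring the snippet above:

```python
# Build the dnsmasq argument that advertises DHCP option 26 (interface MTU).
def dnsmasq_mtu_arg(mtu):
    # do not advertise an unknown MTU
    return '--dhcp-option-force=option:mtu,%d' % mtu if mtu > 0 else None

print(dnsmasq_mtu_arg(1450))  # --dhcp-option-force=option:mtu,1450
```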
Probing the MTU
You can verify that the MTU is set correctly by sending ICMP packets of a specified payload size with the don't-fragment bit set. Note that the size here is the ICMP data size: it does not include the 8-byte ICMP header or the 20-byte IP header.
On Windows:

```shell
ping -f -l <size> <target_name/target_ip>
```

On Linux:

```shell
ping -M do -s <size> <target_name/target_ip>
```
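The largest payload that still fits in a single unfragmented packet follows directly from the header math (assuming IPv4 without options):

```python
# Largest ICMP data size for a given link MTU: subtract the 20-byte IPv4
# header and the 8-byte ICMP header.
def icmp_payload_size(mtu, ip_header=20, icmp_header=8):
    return mtu - ip_header - icmp_header

print(icmp_payload_size(1500))  # 1472
print(icmp_payload_size(9000))  # 8972
```

So on a 9000-MTU path, `ping -M do -s 8972 <target>` should succeed while a size of 8973 should fail with a fragmentation-needed error.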
Original article (in Chinese): https://blog.51cto.com/99cloud/2403060