Data Center Handbook (4): Design

Infrastructure

Topology

Switching Path

L3 routing at aggregation layer

L2 switching at access layer

An L3 switch combines three functions:

RP (Route Processor): handles routing protocols

SP (Switch Processor): handles L2 protocols

ASIC (application-specific integrated circuit): rewrites packet headers in hardware

There are several methods of traffic forwarding:

  • Process switching: every packet is handled by the CPU through the IP input process and requires a full routing-table lookup, making this the slowest method.
  • Fast switching: the route lookup result for the first packet of a flow is cached; subsequent packets need only a cache lookup.
  • CEF: the fastest method. The routing table is precomputed into a FIB (Forwarding Information Base) that can be searched quickly, so the first packet and all subsequent packets are forwarded at the same speed, and the forwarding is performed by dedicated ASIC hardware.
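
To make the FIB idea concrete, here is a minimal longest-prefix-match sketch in Python; the prefixes and next hops are invented for illustration, and a real CEF FIB is a hardware trie/TCAM structure, not a linear scan.

```python
import ipaddress

# Hypothetical FIB: prefix -> next hop (a real CEF FIB lives in ASIC/TCAM hardware).
FIB = {
    ipaddress.ip_network("10.0.0.0/24"): "10.1.1.1",
    ipaddress.ip_network("10.0.0.0/8"): "10.1.1.2",
    ipaddress.ip_network("0.0.0.0/0"): "10.1.1.254",   # default route
}

def fib_lookup(dst: str) -> str:
    """Return the next hop for the longest matching prefix."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in FIB if addr in net]
    best = max(matches, key=lambda net: net.prefixlen)  # longest prefix wins
    return FIB[best]

print(fib_lookup("10.0.0.42"))   # 10.1.1.1 (matches the /24)
print(fib_lookup("172.16.0.1"))  # 10.1.1.254 (falls through to the default route)
```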

Use VLAN

VLANs provide effective Layer 2 isolation.

An L3 switch allows different VLANs to communicate through a Layer 3 interface called an SVI (Switched Virtual Interface): a virtual interface on the VLAN with no corresponding physical port, used only for inter-VLAN communication.

When a VLAN no longer has any active physical port, its SVI is also brought down so that no more packets are routed into that VLAN; this behavior is called Autostate.

Link Redundancy and Load Distribution

Fault tolerance and load distribution

Use EtherChannels to increase bandwidth: multiple links are bundled together and appear to STP as a single link; negotiation is handled by LACP.

For L2 load distribution, only the loop-free case is considered here.

HSRP, VRRP, and GLBP are the key protocols to provide redundancy when working with a static routing environment. HSRP is a Cisco proprietary protocol (RFC 2281, informational), VRRP is an Internet Engineering Task Force (IETF)–proposed standard (RFC 2338), and GLBP is a Cisco proprietary protocol.

With HSRP, only one of the two routers (the active router) is responsible for routing the servers’ traffic; the standby router assumes responsibility for the task when the active router fails.

Aggregation1 and Aggregation2 both have an interface on VLAN 10: 10.0.0.253 and 10.0.0.254.

Together, they provide the default gateway to the servers: 10.0.0.1.

Aggregation1 is the active HSRP router: when the server sends an ARP request for 10.0.0.1, Aggregation1 responds with the MAC address 0000.0c07.ac01, which is a virtual MAC (vMAC) address; the burned-in MAC address (BIA) for Aggregation1 is 0003.6c43.8c0a.

If Aggregation1's interface on VLAN 10 goes down, Aggregation2 takes over 10.0.0.1 and the MAC address 0000.0c07.ac01.
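
A minimal sketch of the HSRP behavior just described; the addresses follow the example above, Aggregation2's BIA is invented, and the hello-based election and priorities of real HSRP are omitted.

```python
# Toy HSRP group: two routers share virtual IP 10.0.0.1 and vMAC 0000.0c07.ac01.
VIRTUAL_IP = "10.0.0.1"
VMAC = "0000.0c07.ac01"

routers = [
    {"name": "Aggregation1", "ip": "10.0.0.253", "bia": "0003.6c43.8c0a", "alive": True},
    {"name": "Aggregation2", "ip": "10.0.0.254", "bia": "0003.6c43.8c0b", "alive": True},  # BIA invented
]

def active_router():
    """The first healthy router is active; the other is standby."""
    return next(r for r in routers if r["alive"])

def arp_reply(target_ip: str) -> str:
    """Servers ARP for the default gateway; the active router answers with the vMAC."""
    if target_ip == VIRTUAL_IP:
        return VMAC            # always the virtual MAC, never the BIA
    raise LookupError("not the virtual IP")

print(active_router()["name"], arp_reply("10.0.0.1"))  # Aggregation1 0000.0c07.ac01
routers[0]["alive"] = False                             # Aggregation1 loses its VLAN 10 interface
print(active_router()["name"], arp_reply("10.0.0.1"))  # Aggregation2 takes over the same vIP/vMAC
```

Because the servers keep resolving 10.0.0.1 to the same vMAC, the failover is transparent to them.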

HSRP Group

One VLAN segment can have multiple HSRP groups, which allows multiple virtual IP addresses to be used concurrently.

One single router interface can belong to multiple groups and be active for one group and standby for another one.

You assign half of the servers to use the HSRP IP address of group 1 (10.0.0.1) as the default gateway and the other half to use the HSRP IP address of group 2 (10.0.0.2).

VRRP is conceptually similar to HSRP.

In the presence of multiple routers on a VLAN segment, VRRP elects a router as master and the other routers as backup for a given virtual router (equivalent to an HSRP group).

VRRP has preemption enabled by default. You can use the command no vrrp group preempt to disable preemption.

The master router sends hello packets to the multicast IP address 224.0.0.18 (MAC 0100.5e00.0012) every 1 sec, and the backup detects the failure of the master after three hello packets are lost.

GLBP makes it possible for the peer routers providing redundancy to the servers to be active concurrently on the VLAN segment.

All ARP requests for the default gateway from the servers are directed to the virtual IP address (vIP) 10.0.0.1.

Only one of the routers is authorized to respond to the ARP request, the active virtual gateway (AVG).

This router answers to the ARP requests by performing a round-robin among a number of vMAC addresses (for example, two MACs).

Each vMAC address identifies a router in the GLBP group; for example, 0007.B400.0101 is the vMAC for Aggregation1 and 0007.B400.0102 is the vMAC for Aggregation2.

By answering with different vMACs to different servers, the AVG achieves load distribution: half of the servers use Aggregation1 as their default gateway, and the other half uses Aggregation2.

Each router is an active virtual forwarder (AVF) for a given virtual MAC: Aggregation1 is the AVF for 0007.B400.0101, and Aggregation2 is the AVF for 0007.B400.0102. Should Aggregation1 fail, Aggregation2 becomes the AVF for both vMACs.
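
A minimal sketch of the AVG's round-robin vMAC assignment, using the vMACs from the example above; hello timers, weighting, and the AVF takeover signaling of real GLBP are omitted.

```python
from itertools import cycle

# GLBP group: each router is the AVF for one vMAC (values from the example above).
avfs = {
    "0007.B400.0101": "Aggregation1",
    "0007.B400.0102": "Aggregation2",
}
vmac_ring = cycle(list(avfs))   # the AVG hands out vMACs round-robin

def avg_arp_reply(server: str) -> str:
    """The AVG answers each server's ARP for 10.0.0.1 with the next vMAC in the ring."""
    vmac = next(vmac_ring)
    print(f"{server} -> default gateway vMAC {vmac} (forwarded by {avfs[vmac]})")
    return vmac

for srv in ["server1", "server2", "server3", "server4"]:
    avg_arp_reply(srv)
# server1/server3 use Aggregation1, server2/server4 use Aggregation2: traffic is split.

# If Aggregation1 fails, Aggregation2 becomes the AVF for both vMACs:
avfs["0007.B400.0101"] = "Aggregation2"
```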

L3 load distribution

The links between the aggregation switches and the core are typically Layer 3 links, and it is desirable to take advantage of the bandwidth provided by all of them.

OSPF allows four equal-cost routes by default, which you can extend to eight routes with the maximum-paths command under the router ospf configuration.

EIGRP allows load balancing across four equal-cost routes by default. You can modify this parameter with the maximum-paths command. Unlike OSPF, EIGRP can also load-balance across unequal-cost routes if you use the variance command.

There are several ways of load-balancing across routes:

Per-packet: Each packet is treated independently, and the router round-robins packets across all the available equal-cost routes. Packets may arrive out of order.

Per-destination: Traffic destined to a specific host always takes the same next hop; packets from different clients for the same destination take the same next hop.

Per-source-and-destination: Load balancing on both the source IP address and the destination IP address allows better load distribution without breaking the packet sequence of a specific flow.

Process switching uses per-packet load balancing.

Fast switching uses per-destination load balancing.

CEF uses either per-packet or per-source-and-destination load balancing.

Flow-based MLS typically uses no load balancing by default. You can configure it to per-source-and-destination load balancing by changing the flowmask to source-destination.

CEF-based MLS typically uses per-source-and-destination load balancing (source and destination IP address) by default.
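
A minimal sketch of per-source-and-destination load balancing; the hash function and the list of next hops are invented for illustration, and CEF uses its own hardware hash rather than anything like this.

```python
import zlib

# Hypothetical set of equal-cost next hops toward the core.
next_hops = ["core1", "core2", "core3", "core4"]

def pick_next_hop(src_ip: str, dst_ip: str) -> str:
    """Hash source+destination so every packet of a flow takes the same path."""
    key = f"{src_ip}->{dst_ip}".encode()
    return next_hops[zlib.crc32(key) % len(next_hops)]

# All packets of the same src/dst pair stay on one path (no reordering) ...
print(pick_next_hop("10.0.0.11", "192.0.2.10"))
print(pick_next_hop("10.0.0.11", "192.0.2.10"))  # same result every time
# ... while different clients spread across the available paths.
print(pick_next_hop("10.0.0.12", "192.0.2.10"))
```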

Dual-Attached Servers

Attach dual-NIC servers to a Layer 2 infrastructure for a loop-free design.

Security

The areas that need protection

Internet Edge

You can provide security at the Internet Edge using the following methods:

Deploying antispoofing filtering to prevent DoS attacks by limiting IP spoofing

RFC 1918 filtering:

RFC 1918 filtering makes sure that no packets using source IP addresses from the private address space are sent to or received from the Internet.

RFC 2827 filtering:

RFC 2827 filtering prevents the spoofing of the enterprise address space by blocking incoming packets with source IP addresses belonging to the public address space reserved for the enterprise’s public services.

Using uRPF (Unicast Reverse Path Forwarding), which also prevents DoS attacks by limiting IP spoofing

When uRPF is enabled, each packet is checked against the routing table not only for its destination IP address but also for its source IP address.

uRPF verifies that there is a routing-table entry matching the source IP address of the packet and that the route is associated with the interface the packet arrived on.
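
A minimal sketch of a strict-mode uRPF check, assuming a toy routing table keyed by prefix and invented interface names; real uRPF consults the CEF FIB directly.

```python
import ipaddress

# Toy routing table: prefix -> interface the prefix is reachable through.
routes = {
    ipaddress.ip_network("10.0.0.0/8"): "inside",
    ipaddress.ip_network("0.0.0.0/0"): "outside",
}

def urpf_check(src_ip: str, in_interface: str) -> bool:
    """Strict uRPF: the best route back to the source must point out the arrival interface."""
    addr = ipaddress.ip_address(src_ip)
    best = max((n for n in routes if addr in n), key=lambda n: n.prefixlen)
    return routes[best] == in_interface

print(urpf_check("10.2.3.4", "inside"))    # True: legitimate internal source
print(urpf_check("10.2.3.4", "outside"))   # False: spoofed internal address arriving from the Internet
```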

ACL

ACLs allow access only to and from the public services provided by the enterprise.

These filters permit the typical services used in a Data Center, such as DNS, HTTP, Simple Mail Transfer Protocol (SMTP), ICMP, and Network Time Protocol (NTP).
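
A minimal sketch of such a permit-only filter with an invented rule set covering some of the services listed above; a real ACL would also match source/destination addresses and ICMP types.

```python
# Hypothetical ACL for the public server farm: (protocol, destination port) pairs that are allowed.
PERMITTED = {
    ("udp", 53), ("tcp", 53),   # DNS
    ("tcp", 80),                # HTTP
    ("tcp", 25),                # SMTP
    ("udp", 123),               # NTP
}   # ICMP is matched by type rather than port and is omitted from this sketch.

def acl_permits(protocol: str, dst_port: int) -> bool:
    """Implicit deny: anything not explicitly permitted is dropped."""
    return (protocol, dst_port) in PERMITTED

print(acl_permits("tcp", 80))    # True  - HTTP to the web farm
print(acl_permits("tcp", 3389))  # False - not a published service
```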

Implementing traffic rate limiting to reduce the effect of DoS and DDoS attacks

Traffic rate limiting consists of implementing queuing mechanisms that control the volume of traffic forwarded through a router.

The traffic is usually classified based on protocol, source and destination IP address, and port numbers.

Each defined traffic type is assigned a threshold, after which packets are processed at a lower priority or are simply discarded.

You can use traffic rate limiting to reduce the effects of DoS attacks and their large volumes of data
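
A minimal token-bucket sketch of the rate-limiting idea; the rate and burst values are invented, and real routers implement this per traffic class in hardware.

```python
import time

class TokenBucket:
    """Allow up to `rate` packets per second with a burst of `burst` packets."""
    def __init__(self, rate: float, burst: float):
        self.rate, self.burst = rate, burst
        self.tokens = burst
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True        # forward the packet
        return False           # threshold exceeded: drop or deprioritize

limiter = TokenBucket(rate=1000, burst=200)   # e.g. limit ICMP to roughly 1000 pps
dropped = sum(not limiter.allow() for _ in range(5000))
print(f"dropped {dropped} of 5000 packets offered in a burst")
```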

Drawbacks:

Fixed thresholds

Legitimate packets often cannot be distinguished from DoS packets

Securing routing protocols to avoid trust exploitation and routing disruptions

When you use dynamic routing, you implement Border Gateway Protocol (BGP) between the ISP and the Internet Edge routers, and you deploy an Interior Gateway Protocol (IGP) such as Open Shortest Path First (OSPF) or Enhanced Interior Gateway Routing Protocol (EIGRP) to propagate routing information to the interior of the enterprise network.

Attackers may inject illegitimate routing updates.

Protocols such as BGP, OSPF, Intermediate System-to-Intermediate System (IS-IS), EIGRP, and Routing Information Protocol Version 2 (RIPv2) provide mechanisms to ensure that routing updates are valid and are received from legitimate routing peers. They achieve this goal by using route filters and neighbor router authentication.

Route filters are typically deployed at the ISP router to ensure that only the public networks assigned to the enterprise are externally advertised.

Internet Edge routers should use neighbor router authentication to ensure that routing updates are valid and are received only from legitimate peers.

1. The routers are configured with a shared secret key that is used to sign and validate each routing update.

2. Every time a router has to send a routing update, the routing update is processed with a hash function that uses the secret key to produce a digest.

3. The resulting digest is appended to the routing update. In this way, the routing update message contains the actual routing update plus its corresponding digest.

4. Once the message is sent, the receiving router processes the routing update with the same hash function and secret key.

5. The receiving router compares the result with the digest in the routing update message. A match means that the sender has signed the update using the same secret key and hashing algorithm and that the message has not changed while in transit.
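
A minimal sketch of steps 1 through 5 using HMAC-MD5 over the update with a shared key; the key and update contents are invented, and each routing protocol defines its own exact keyed-hash format.

```python
import hmac, hashlib

SECRET_KEY = b"shared-secret"            # step 1: configured on both routers (invented value)

def sign(update: bytes) -> bytes:
    """Steps 2-3: hash the update with the secret key and append the digest."""
    digest = hmac.new(SECRET_KEY, update, hashlib.md5).digest()
    return update + digest

def verify(message: bytes) -> bool:
    """Steps 4-5: recompute the digest and compare it with the one received."""
    update, received = message[:-16], message[-16:]
    expected = hmac.new(SECRET_KEY, update, hashlib.md5).digest()
    return hmac.compare_digest(expected, received)

msg = sign(b"route 10.10.0.0/16 via 10.0.0.253")
print(verify(msg))                                 # True: valid peer, message unchanged
print(verify(bytes([msg[0] ^ 0x01]) + msg[1:]))    # False: the update was tampered with in transit
```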

Deploying stateful firewalls to prevent unauthorized access

The use of stateful firewalls has two main goals: protecting the Internet server farm and controlling the traffic between the Internet and the rest of the enterprise network.

Implementing intrusion detection to detect network reconnaissance activities and to identify threats and intruders

When you deploy the network-based sensor in a switched infrastructure, you must use features such as Switched Port Analyzer (SPAN) or capture to forward traffic to the monitoring interface of the IDS sensor.

DNS signatures: Examples are 6050 - DNS HINFO Request, 6051 - DNS Zone Transfer, 6052 - DNS Zone Transfer from High Port, 6053 - DNS Request for All Records, 6054 - DNS Version Request, 6055 - DNS Inverse Query Buffer Overflow, and 6056 - DNS NXT Buffer Overflow.

HTTP signatures: Examples are 5188 - HTTP Tunneling, 5055 - HTTP Basic Authentication Overflow, 3200 - WWW Phf Attack, 3202 - WWW .url File Requested, 3203 - WWW .lnk File Requested, 3204 - WWW .bat File Requested, 3212 - WWW NPH-TEST-CGI Attack, and 3213 - WWW TEST-CGI Attack.

FTP signatures: Examples are 3150 - FTP Remote Command Execution, 3151 - FTP SYST Command Attempt, 3152 - FTP CWD ~root, 3153 - FTP Improper Address Specified, 3154 - FTP Improper Port Specified, 3155 - FTP RETR Pipe Filename Command Execution, 3156 - FTP STOR Pipe Filename Command Execution, 3157 - FTP PASV Port Spoof, 3158 - FTP SITE EXEC Format String, 3159 - FTP PASS Suspicious Length, and 3160 - Cesar FTP Buffer Overflow.

E-mail signatures: Examples are 3100 - Smail Attack, 3101 - Sendmail Invalid Recipient, 3102 - Sendmail Invalid Sender, 3103 - Sendmail Reconnaissance, 3104 - Archaic Sendmail Attacks, 3105 - Sendmail Decode Alias, 3106 - Mail Spam, and 3107 - Majordomo Execute Attack.

Host-based IDSs specifically target host vulnerabilities, including the following:

  • Protection against e-mail worm attacks such as GONER or NIMDA
  • Protection against application hijacking using a dynamic link libraries (DLLs) control hook
  • Protection against downloading files using instant-messenger applications
  • Protection against known buffer-overflow attacks
  • Control of application execution in the system

Campus Core

Disable any unnecessary services and harden the configuration of the switches and routers that build the campus core.

The second recommendation is to secure the exchange of routing updates with routing-update authentication, route filters, and neighbor definitions.

Use secure protocols such as Secure Shell (SSH) and Simple Network Management Protocol Version 3 (SNMPv3), and avoid insecure protocols that do not protect usernames and passwords

Intranet Server Farms

Management Isolation

Performance

Traffic Patterns

Internet Traffic Patterns

Several organizations conduct research in this area:

San Diego Supercomputer Center (SDSC) http://www.sdsc.edu/

The Cooperative Association for Internet Data Analysis (CAIDA) http://www.caida.org/

The National Laboratory for Applied Network Research, Measurement Network Analysis Group (NLANR) http://www.nlanr.net/

Wide-Area Internet Traffic Patterns and Characteristics

TCP averages 95 percent of bytes, 90 percent of packets, and at least 75 percent of flows on the link.

User Datagram Protocol (UDP) averages 5 percent of bytes, 10 percent of packets, and 20 percent of flows.

Web traffic makes up 75 percent of bytes, 70 percent of packets, and 75 percent of flows in the TCP category.

In addition to Web traffic, Domain Name System (DNS), Simple Mail Transfer Protocol (SMTP), FTP data, Network News Transfer Protocol (NNTP), and Telnet are identified as contributing a visible percentage.

DNS represents 18 percent of flows but only 3 percent of total packets and 1 percent of total bytes.

SMTP makes up 5 percent of bytes, 5 percent of packets, and 2 percent of flows.

FTP data produces 5 percent of bytes, 3 percent of packets, and less than 1 percent of flows.

NNTP contributes 2 percent of bytes and less than 1 percent of packets and flows.

Intranet Traffic Pattern

A good source of information for measuring the performance of IP networks is the paper “Measuring IP Network Performance” by Geoff Huston in The Internet Protocol Journal at http://www.cisco.com/warp/customer/759/ipj_6-1/ipj_6-1_measuring_ip_networks.html.

Common performance metrics:

Throughput: The maximum rate at which none of the offered frames are dropped by the device.

Frame loss: Percentage of frames that should have been forwarded by a network device under steady state (constant) load that were not forwarded due to lack of resources.

Latency for store and forward devices: The time interval starting when the last bit of the input frame reaches the input port and ending when the first bit of the output frame is seen on the output port.

Latency for bit-forwarding devices: The time interval starting when the end of the first bit of the input frame reaches the input port and ending when the start of the first bit of the output frame is seen on the output port.

Connection processing rate: The maximum rate of new connections the device is able to process.

CC (concurrent connections): The number of simultaneous connections the device is able to track and process.

Multilayer Switch Metrics

Throughput:

Throughput is measured in bits per second (bps) or packets per second (PPS). The bps figure gives the absolute throughput, but PPS multiplied by the packet size (in bits) yields the same information, so the two are interchangeable for a given packet size.

Multilayer switches process frames or packets regardless of their size, so PPS better characterizes their forwarding capacity.

You obtain the maximum throughput values (in bps) by testing with frames of the maximum transmission unit (MTU) size.
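
A small worked example of the bps/PPS relationship for Gigabit Ethernet; the 20 bytes of per-frame overhead account for preamble, SFD, and the inter-frame gap.

```python
LINE_RATE_BPS = 1_000_000_000          # Gigabit Ethernet
OVERHEAD = 20                          # preamble (7) + SFD (1) + inter-frame gap (12) bytes

def max_pps(frame_bytes: int) -> float:
    """Line-rate packets per second for a given frame size."""
    return LINE_RATE_BPS / ((frame_bytes + OVERHEAD) * 8)

def bps(pps: float, frame_bytes: int) -> float:
    """PPS times the frame size gives the throughput back in bits per second."""
    return pps * frame_bytes * 8

print(f"{max_pps(64):,.0f} pps at 64-byte frames")      # ~1,488,095 pps
print(f"{max_pps(1518):,.0f} pps at MTU-size frames")   # ~81,274 pps
print(f"{bps(max_pps(1518), 1518) / 1e6:,.1f} Mbps of frame data at MTU size")
```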

Frame and Packet Loss:

Frame and packet loss measurements reveal the actual processing limits of the DUT (device under test) under a constant load.

Latency:

Latency generally increases as the depth of packet inspection increases.

Firewall Metrics

The DoS handling tests determine how the firewall deals with a high rate of TCP connection requests (SYN packets). This maximum rate indicates how well the firewall would fare under such an attack (a SYN flood attack).

HTTP transfer rate refers to how the firewall handles entire HTTP transactions that include the TCP connection request, the transfer of the objects associated with the URL in the request, and the final connection teardown.

HTTP transaction rate refers to the number of HTTP transactions per unit of time that the firewall is able to support.

Illegal traffic handling refers to the capability of the firewall to handle both legal and illegal traffic concurrently.

IP fragmentation handling refers to the capability of the firewall to process fragments that might require re-assembly before a rule could be applied.

Load Balancer Performance Metrics

CPS describes how many new connection requests per second a load balancer can process.

The term processing implies the successful completion of the connection handshake and connection teardown.

CC refers to the number of simultaneous connections a load balancer can support.

PPS describes how many packets per second a load balancer can process.

A load balancer has the potential to add more latency than other devices because it can execute tasks deeper in the payload of packets.

At Layer 4, the load balancer must perform the following tasks (see the sketch after this list):

  • 5-tuple lookup
  • Lookup of content policy information on TCP/IP headers
  • Rewrite of MAC header information
  • Rewrite of IP header information
  • Checksum calculations for TCP
  • Calculation and rewrite of other TCP/UDP header information
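
A minimal sketch of the Layer 4 5-tuple lookup and destination rewrite, with an invented VIP and real-server addresses; checksum updates and the other tasks listed above are omitted.

```python
# Connection table keyed by the 5-tuple; the value is the chosen real server.
connections = {}
real_servers = ["10.0.0.11", "10.0.0.12"]

def l4_forward(src_ip, src_port, dst_ip, dst_port, proto):
    """Look up the 5-tuple; new connections are assigned a real server round-robin."""
    five_tuple = (src_ip, src_port, dst_ip, dst_port, proto)
    if five_tuple not in connections:
        connections[five_tuple] = real_servers[len(connections) % len(real_servers)]
    server = connections[five_tuple]
    # Header rewrite: the VIP in the destination field is replaced with the real server.
    return {"src": src_ip, "dst": server, "dst_port": dst_port}

print(l4_forward("192.0.2.7", 41000, "10.0.0.1", 80, "tcp"))   # assigned to 10.0.0.11
print(l4_forward("192.0.2.7", 41000, "10.0.0.1", 80, "tcp"))   # same connection, same server
print(l4_forward("192.0.2.8", 52000, "10.0.0.1", 80, "tcp"))   # next connection -> 10.0.0.12
```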

At Layer 5, the load balancer performs all Layer 4 tasks in addition to the following:

  • Spoofing TCP connections toward the client side
  • Lookup of content policy information on packet payload
  • Initiating new TCP connections with the server
  • Maintaining both client and server connection synchronization, which requires SEQ and checksum calculation, in addition to other header rewrite operations for both connections

Response time is loosely defined as the elapsed time between the end of an application-layer request (the user presses the Enter key) and the end of the response (the data is displayed on the user’s screen).

SSL Offloaders Performance Metrics

The CPS rate that you should measure for SSL offloaders is related to the number of SSL handshakes it can complete. This metric is often called transactions per second (TPS) or sessions per second

Concurrent connections or rather concurrent SSL sessions are mostly related to long-lived sessions and therefore indicate the memory capacity to hold them.

As with load balancers, measuring PPS requires real traffic or at least real SSL connections.

Latency on an SSL offloader indicates the time it would take the device to process the data, which in this case is the SSL handshake and subsequent encryption/decryption of packets.

Testing Tools

First are directories of web load-testing tools:

http://www.testingfaqs.org/t-load.html lists a number of tools.

http://www.softwareqatest.com/qatweb1.html lists a number of tools under the category of load and performance tools.

http://www.aptest.com/resources.html lists a number of testing tools under the category of web test tools.

The next list outlines specific testing tools:

http_load from ACME Laboratories is a tool for HTTP load testing, at http://www.acme.com/software/http_load/.

The Web Application Stress Tool from Microsoft is at http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnw2kmag00/html/StressTool.asp.

WebStone from Mindcraft for benchmarking Web servers is at http://www.mindcraft.com/webstone/.

WebBench from Ziff Davis is at http://www.etestinglabs.com/benchmarks/webbench/webbench.asp.

SPECweb99 from Standard Performance Evaluation Corporation is at http://www.spec.org/osg/web99/.


摘要: 当今的数据中心为成千上万台计算机的群集提供了巨大的聚合带宽, 但是即使在最高端的交换机中,端口密度也受到限制,因此数据中心拓扑通常由多根树组成,这些树在任何给定的主机对之间都具有许多等价路径. 现有的IP多路径协议通常依赖于每流静态哈希,并且由于长期冲突而可能导致大量带宽损失. 在本文中我们介绍了Hedera,这是一种可伸缩的动态流调度系统,可自适应地调度多级交换结构以有效利用聚集的网络资源. 我们描述了使用商用交换机和未修改主机的实施方式,对于模拟的8,192个主机数据中心,Heder