Linux and the storage ecosystem

Linux is the Swiss Army knife of file systems, and it offers a wide variety of storage technologies for both desktops and servers. Beyond the file system, Linux incorporates world-class NAS and SAN technologies, data protection, storage management, support for clouds, and solid-state storage. Learn more about the Linux storage ecosystem and why it's number one in server market share.

Linux is many things, and its power lies in its ability to flexibly support vastly different usage models. But one of Linux's most important strengths is serving as the workhorse of the storage domain. Thinking about Linux and storage commonly conjures an image of direct-attached disks or the latest file system, but there's much more to storage in Linux than meets the eye. Elements of the Linux storage stack are not only stable but also cutting-edge.

This article explores the various storage technologies that keep Linux at the center of the storage universe. Let's start at the bottom—namely, storage architectures—and then work up the stack to features, file systems, and futures (see Figure 1).

Figure 1. Storage stack for exploration in this article

Storage architecture

How the storage attaches to the platform is key to the overall storage architecture. Three general architectures cover the vast majority of models:

  • Direct-attached storage (DAS)
  • Storage area networks (SAN)
  • Network-attached storage (NAS)

Of course, Linux supports all three and has evolved with the changes that are occurring with these models.

Other storage media

This article focuses mainly on disk storage, but it's difficult to ignore the breadth of other devices supported in Linux. From quickly disappearing floppy drives to CD-ROM and DVD readers and writers and even enterprise tape systems, Linux can't be beat when it comes to mass storage device support.

Figure 2 illustrates the models, with a focus on the location of the file system and storage. The DAS model covers the direct attachment of storage to the platform and represents the vast majority of storage use. The SAN separates the storage from the platform and makes it accessible over one of a number of block storage protocols. Finally, NAS provides a similar architecture to the SAN but operates at the file level.

Figure 2. Major storage architectures

Direct-attached storage

Linux supports a large variety of DAS interfaces, including older standards like parallel Advanced Technology Attachment (ATA, also known as Integrated Drive Electronics [IDE]), parallel SCSI, and Fibre Channel, as well as newer storage interfaces like serial attached SCSI (SAS), serial ATA (SATA), and external SATA (eSATA). You'll also find advanced storage technologies such as USB 3.0 (via the Extensible Host Controller Interface [xHCI]) and FireWire (IEEE 1394).

Storage area network

The SAN provides consolidation of block-level storage so that it can be shared among a number of servers. The storage appears local to the servers, where the endpoint storage device may implement additional services for the client devices (such as backup and replication).

Protocols and interfaces for SANs are many and varied. You can find the typical SAN protocols in Linux, such as Fibre Channel, as well as its extension over IP (iFCP). Newer protocols, such as SAS, Fibre Channel over Ethernet (FCoE), and Internet SCSI (iSCSI), are also present, as are more domain-specific protocols like iSCSI Extensions for RDMA (iSER) and the SCSI RDMA Protocol (SRP), both of which carry SCSI over remote direct memory access (RDMA) transports such as InfiniBand.

The emergence of Ethernet as a storage transport is fully realized in Linux and illustrates the power and flexibility of these approaches. Further, 10-gigabit Ethernet (10GbE) is fully supported in Linux, permitting construction of high-performance SANs. You can also find protocols like ATA over Ethernet (ATAoE), which carries the ATA protocol over ubiquitous Ethernet.

Network-attached storage

Last but not least is NAS. NAS is a consolidation of storage over a network for access by heterogeneous clients at the file level. Two of the most popular protocols, which are fully supported in Linux, are Network File System (NFS) and Server Message Block/Common Internet File System (SMB/CIFS).

Although the original SMB implementation was proprietary, it was reverse-engineered to be supported in Linux. The later SMB revisions were openly documented to allow simpler development in Linux.

Linux has continued to evolve with the various enhancements and extensions made to NFS. With version 4, NFS is now a stateful protocol, and it includes optimizations for data and metadata separation as well as data-access parallelism. As with Ethernet-based SANs, 10GbE support in Linux enables high-performance NAS repositories.

Other storage architectures

Not all storage architectures fit cleanly into the DAS, SAN, and NAS buckets. Because Linux is open, it is easy to develop new storage technologies within it, which is why you can find the newest bleeding-edge technologies in Linux.

One interesting storage architecture, which is not new but is worthwhile to mention, is the object storage architecture. Object storage architectures split a file from its metadata and store the two independently (on their respective data and metadata servers). This split provides certain advantages, such as minimizing the metadata bottleneck (because interactions with the metadata server are required only to locate and open a file). Performance can also be enhanced by striping the data over multiple data servers for parallel access. Object storage is implemented in a variety of ways within Linux, including support for the Object Storage Device (OSD) specification as well as within the Lustre file system (whose name combines Linux and cluster) and the extended object file system (exofs).
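
To make the data/metadata split concrete, here is a minimal Python sketch of the idea. All of the names (MetadataServer, DataServer, the stripe size) are hypothetical and chosen for illustration; this is not how Lustre or exofs are implemented, only a demonstration that the metadata server holds layout information while the stripes live elsewhere.

```python
# Minimal sketch of the object-storage split: a metadata server records
# *where* each stripe of a file lives; data servers hold the stripes.

STRIPE_SIZE = 4  # bytes per stripe; unrealistically small, for illustration only


class DataServer:
    def __init__(self):
        self.objects = {}                  # object id -> bytes

    def put(self, oid, data):
        self.objects[oid] = data

    def get(self, oid):
        return self.objects[oid]


class MetadataServer:
    def __init__(self, data_servers):
        self.data_servers = data_servers
        self.layout = {}                   # filename -> list of (server index, object id)

    def write(self, name, data):
        stripes = [data[i:i + STRIPE_SIZE] for i in range(0, len(data), STRIPE_SIZE)]
        placement = []
        for n, stripe in enumerate(stripes):
            server = n % len(self.data_servers)    # round-robin striping
            oid = f"{name}:{n}"
            self.data_servers[server].put(oid, stripe)
            placement.append((server, oid))
        self.layout[name] = placement              # only layout metadata kept here

    def read(self, name):
        # The metadata server is consulted once for the layout; the stripes are
        # then fetched (potentially in parallel) from the data servers.
        return b"".join(self.data_servers[s].get(oid) for s, oid in self.layout[name])


mds = MetadataServer([DataServer(), DataServer(), DataServer()])
mds.write("report.txt", b"hello object storage")
assert mds.read("report.txt") == b"hello object storage"
```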

A similar technology, called content-addressable storage (CAS), uses a hash of the data as its name and address. This technology, also known as fixed-content storage (FCS), is advantageous because it makes duplicate data easy to identify: if the hash is strong enough, identical blocks produce the same address, permitting simple de-duplication. The Venti architecture supports this approach and exists within Linux (in addition to Bell Labs' Plan 9 distribution).
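
A toy sketch of the content-addressable idea follows, using SHA-256 as the (assumed strong enough) hash: identical content maps to the same address, so a second write of the same block is de-duplicated for free. This is illustrative only and does not reflect the Venti on-disk format.

```python
import hashlib

store = {}  # address (hash of content) -> data block

def cas_put(block: bytes) -> str:
    """Store a block under the hash of its content; duplicates collapse."""
    address = hashlib.sha256(block).hexdigest()
    store.setdefault(address, block)    # an identical block adds nothing new
    return address

def cas_get(address: str) -> bytes:
    return store[address]

a1 = cas_put(b"same bytes")
a2 = cas_put(b"same bytes")             # duplicate content...
assert a1 == a2 and len(store) == 1     # ...same address: de-duplicated
```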

Storage services: logical volume management

Storage virtualization was once a feature unique to high-end storage systems, but it is now a standard feature of Linux. One of the most important services available in Linux is the Logical Volume Manager (LVM). The LVM is a thin layer that sits above physical storage available in the underlying storage architecture (with accompanying user-space tools) and abstracts it to one or more logical volumes that are simpler to manage. For example, while a physical disk cannot be resized, a logical volume can be resized to add or remove space from it.
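
As a concrete illustration of that flexibility, the sketch below drives the standard LVM2 command-line tools from Python to grow a logical volume and the file system on it. The volume path and size are placeholders, and the example assumes an ext3/ext4 file system (hence resize2fs); treat it as a sketch rather than a complete administration procedure.

```python
import subprocess

def grow_logical_volume(lv_path="/dev/vg0/data", extra="10G"):
    """Grow a logical volume by `extra`, then grow the file system on it.

    Assumes the LVM2 user-space tools are installed and the caller is root;
    the device path and size are placeholders for illustration.
    """
    # Extend the logical volume itself (the underlying physical disks are untouched).
    subprocess.run(["lvextend", "-L", f"+{extra}", lv_path], check=True)
    # Grow the file system to use the new space (resize2fs handles ext3/ext4).
    subprocess.run(["resize2fs", lv_path], check=True)

# grow_logical_volume()   # requires root and an existing vg0/data logical volume
```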

With the ability to abstract physical devices into logical devices, LVM enables a number of other storage capabilities, such as read-only and read-write snapshots of volumes, data striping across volumes for performance (redundant array of independent disks [RAID] level 0), data mirroring across volumes (RAID-1), and migration of volumes (even while online) between physical devices.

For data protection beyond mirroring, Linux includes md (which stands for multiple devices), which provides a rich set of software RAID functionality. It supports RAID-4 (striped data with a dedicated parity disk), RAID-5 (striped data with a distributed parity block), RAID-6 (striped data with two distributed parity blocks), and RAID-10 (striped and mirrored data).
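
The parity at the heart of the single-parity RAID levels is easy to demonstrate: the parity block is the byte-wise XOR of the data blocks in a stripe, and any one lost block can be rebuilt by XORing the survivors with the parity. The sketch below shows only that property; it is not md's implementation, which also handles stripe layout, rotation of the parity block, and the dual parity used by RAID-6.

```python
def xor_blocks(blocks):
    """XOR byte-wise across equal-sized blocks (single-parity calculation)."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)

# A stripe of three data blocks plus one parity block (RAID-5 style).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the middle block: rebuild it from the survivors plus parity.
rebuilt = xor_blocks([data[0], data[2], parity])
assert rebuilt == data[1]
```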

The LVM relies on another storage component called the Device-mapper, which provides (among other features) multipath support. For example, in a SAN environment, there are commonly multiple storage interfaces into the SAN fabric. Multipathing protects against the failure of a given path, ensuring that storage remains available as long as some path to the endpoint still exists.

Storage features

In the past few years, two relatively simple features have been added to the storage stack that address the evolution of the storage ecosystem:

  • Data integrity
  • Support for solid-state disks (SSDs)

Data integrity

The first change addresses the use of commodity drives in enterprise storage settings. Although enterprise-class drives (such as SAS drives) are reliable, SATA drives are built to different requirements, with cost as a major factor. For this reason, SATA drives can suffer from a problem known as silent data corruption, where errors are introduced but not detected when the data is later read from the disk. To solve this problem and support SATA drives in enterprise settings, data integrity codes are added to blocks on the disk (where the disk uses 520-byte sectors instead of the traditional 512-byte sectors). In addition, the drive itself can validate data as it is written, checking that the integrity code matches the data. In this way, errors can be caught as they're written to the disk, instead of being detected later, when nothing can be done about them.

This mechanism is called the data integrity field (DIF), as shown in Figure 3, and consists of an 8-byte trailer that includes a cyclic redundancy check (CRC) over the block of data, a reference tag (typically a portion of the logical block address [LBA]), and an application tag that the application defines. The reference tag is useful for catching mis-writes of data to an incorrect block, while the application tag can be used to catch other errors in the software stack. For example, if a PDF document is written, the application tag could be set to a value indicating a special PDF tag. When the PDF is read, each block's application tag can be inspected to ensure that all blocks carry the PDF tag. DIF is supported within Linux as of kernel version 2.6.27.

Figure 3. DIF structure for a 512-byte sector
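
To make the trailer layout of Figure 3 concrete, here is a minimal Python sketch that packs the three fields for one 512-byte sector: a 2-byte guard CRC, a 2-byte application tag, and a 4-byte reference tag. It uses CRC-16-CCITT (binascii.crc_hqx) as a stand-in for the T10-defined guard CRC, so the guard value is illustrative rather than wire-compatible with real DIF-capable hardware.

```python
import binascii
import struct

def dif_trailer(sector: bytes, lba: int, app_tag: int = 0x0000) -> bytes:
    """Build an 8-byte DIF-style trailer: guard CRC, application tag, reference tag.

    The guard is CRC-16-CCITT as a stand-in for the T10 guard CRC, and the
    reference tag is the low 32 bits of the LBA the sector is meant for.
    """
    assert len(sector) == 512
    guard = binascii.crc_hqx(sector, 0)      # 16-bit CRC over the data block
    ref_tag = lba & 0xFFFFFFFF               # catches writes landing on the wrong block
    return struct.pack(">HHI", guard, app_tag, ref_tag)

def check_sector(sector: bytes, trailer: bytes, lba: int) -> bool:
    guard, app_tag, ref_tag = struct.unpack(">HHI", trailer)
    return guard == binascii.crc_hqx(sector, 0) and ref_tag == (lba & 0xFFFFFFFF)

block = bytes(512)                           # one 512-byte sector of zeros
t = dif_trailer(block, lba=1234)
assert check_sector(block, t, lba=1234)
assert not check_sector(block, t, lba=1235)  # mis-directed write is detected
```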

Growing support for SSDs

The introduction of SSDs is changing the storage ecosystem in a number of ways. These drives remove some of the large latencies found in spinning disks and therefore provide a way to maintain data flow to and from the CPU. But SSDs differ from hard disk drives (HDDs) in that they are consumable: the storage within an SSD can be written a finite number of times (depending on the technology), so it's important to be as efficient as possible when writing data. To make matters worse, the SSD must internally move data to spread writes evenly and avoid errors, in processes called wear leveling and garbage collection. These processes themselves generate writes to the consumable storage and should therefore be minimized.

Another issue with SSDs relative to traditional storage is that an HDD does not care whether the data it holds is still valid. If the file system invalidates the data, the data can simply remain on disk without any downside. That is not true of SSDs, because of the wear-leveling requirement. For this reason, Linux now supports the ability of the file system to communicate discarded blocks to the SSD (as of kernel version 2.6.29). This ability allows the SSD to exclude those blocks from wear-leveling processes and helps to increase the endurance of the drive.
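
One user-space view of this mechanism is the FITRIM ioctl, which asks the kernel to report a mounted file system's free blocks to the underlying device as discardable (the fstrim utility is essentially a wrapper around it). The sketch below issues that ioctl from Python; the FITRIM request number is the value commonly seen on x86-64 and the mount point is a placeholder, so treat this as an assumption-laden illustration and run it only as root on a discard-capable device.

```python
import fcntl
import os
import struct

FITRIM = 0xC0185879   # ioctl request number for FITRIM as commonly defined on x86-64

def trim_filesystem(mount_point="/mnt/ssd"):
    """Ask the kernel to discard all unused blocks of a mounted file system.

    Mirrors what `fstrim` does: struct fstrim_range is {start, length, minlen},
    each a 64-bit value; a length of ~0 means "to the end of the file system".
    Requires root and a discard-capable device; the mount point is a placeholder.
    """
    rng = struct.pack("QQQ", 0, 0xFFFFFFFFFFFFFFFF, 0)
    fd = os.open(mount_point, os.O_RDONLY)
    try:
        result = fcntl.ioctl(fd, FITRIM, rng)      # kernel fills in bytes trimmed
        _, trimmed, _ = struct.unpack("QQQ", result)
        print(f"trimmed {trimmed} bytes")
    finally:
        os.close(fd)

# trim_filesystem()   # run as root on an SSD-backed, discard-capable file system
```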

File systems

What truly sets Linux apart from other operating systems is its vast library of file systems. In Linux, you can find traditional client file systems like the third extended file system (ext3) and the fourth extended file system (ext4), but you'll also find the state of the art in distributed file systems, cluster file systems, and parallel file systems. You can find new, cutting-edge file systems built around new ideas and addressing new problems in the storage domain, as well.

In terms of cutting-edge file systems today, Linux supports both ZFS and the B-tree file system (Btrfs, informally "Butter FS"). These two file systems compete with one another and share the distinction of copy-on-write semantics (blocks are never overwritten in place). In addition, both file systems support data de-duplication, internal data protection (RAID-like protection), data and metadata checksums, and other storage features (like snapshots).

Linux is home to many distributed file systems, as well. One example is Lustre, a massively parallel distributed file system that supports tens of thousands of nodes and scales to petabytes of storage capacity. Ceph provides similar functionality and was recently introduced into the mainline Linux kernel. Other examples in Linux include GlusterFS and the General Parallel File System (GPFS).

You can find specialized file systems in Linux, as well, including log-structured file systems like the New Implementation of a Log-structured File System (NILFS2) and object-based file systems like exofs. Because Linux finds itself in many usage models, you'll also find file systems for resource-constrained environments (such as embedded systems) as well as for low-latency applications such as high-performance computing (HPC). File systems in the embedded area include Yet Another Flash File System version 2 (YAFFS2), the Journaling Flash File System version 2 (JFFS2), and the Unsorted Block Image File System (UBIFS). File systems in the HPC space include Parallel NFS (pNFS), Lustre, and GPFS.

Linux storage ahead

Linux is, and will continue to be, a prime target for file system and general storage research because of its openness and large community of developers.

One of the latest changes in storage is the use of remote services for cost-efficient storage of archive data. Known today as cloud storage, these services give efficient and transparent access to remote, centralized storage with varying service-level agreements (covering capabilities like protection and bandwidth). Two examples are Ubuntu One and Dropbox. Another service, SpiderOak, can back up your local user directories to the cloud for a small fee.

What other features might be on the horizon for Linux? Support for large sector sizes (moving beyond 512-byte sectors), thin provisioning to avoid reserved but unused capacity (where advertised storage exceeds the physical capacity), storage de-duplication (to maximize storage availability), and an even more efficient storage stack to exploit new speeds and efficiencies of drives like SSDs, perhaps? Whatever is coming in storage ecosystem evolution, Linux will be there first.
