Readings in Databases

A list of papers essential to understanding databases and building new data systems. The list is curated and maintained by Reynold Xin (@rxin). If you think a paper should be part of this list, please submit a pull request. It might take a while since I need to go over the paper.

If you are reading this and taking the effort to understand these papers, we would love to talk to you about opportunities at Databricks.

Table of Contents

  1. Basics and Algorithms
  2. Essentials of Relational Databases
  3. Classic System Design
  4. Columnar Databases
  5. Data-Parallel Computation
  6. Consensus and Consistency
  7. Trends (Cloud Computing, Warehouse-scale Computing, New Hardware)
  8. Miscellaneous
  9. External Reading Lists

Basics and Algorithms

  • The Five-Minute Rule Ten Years Later, and Other Computer Storage Rules of Thumb (1997): This paper (and the original one, published 10 years earlier) presents a quantitative formula for deciding whether a data page should be cached in memory. It is a delight to read Jim Gray's approach to an array of related problems, e.g. how big a page should be. (A back-of-the-envelope sketch of the formula follows this list.)
  • AlphaSort: A Cache-Sensitive Parallel External Sort (1995): Sorting is one of the most essential algorithms in databases, used for joins, aggregations, and ordering. In an algorithms 101 class, CS students are asked to reason about big-O complexity and ignore constant factors. In practice, however, the constant factor due to L2 cache effects can be as big as two or three orders of magnitude. This is a good paper to learn about all the tricks fast sorting implementations use. (A sketch of one such trick follows this list.)
  • Patience is a Virtue: Revisiting Merge and Sort on Modern Processors (2014): Sorting revisited. It also serves as a good survey of sorting algorithms used in practice and their trade-offs.
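
To make the five-minute rule concrete, here is a back-of-the-envelope sketch of its break-even formula in Python. The price and throughput numbers are illustrative stand-ins for 1997-era hardware, not figures taken from the paper:

    # Break-even reference interval: cache a page in RAM if it is re-read
    # more often than this. All numbers below are illustrative, not the paper's.
    pages_per_mb_ram = 128           # assuming 8 KB pages
    accesses_per_sec_per_disk = 64   # random I/Os one disk can serve per second
    price_per_disk = 2000.0          # dollars (hypothetical)
    price_per_mb_ram = 15.0          # dollars (hypothetical)

    break_even_sec = (pages_per_mb_ram / accesses_per_sec_per_disk) \
                   * (price_per_disk / price_per_mb_ram)
    print(f"break-even interval: {break_even_sec:.0f} s (~{break_even_sec/60:.1f} min)")

With these inputs the interval comes out around four to five minutes, which is where the rule gets its name.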
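One of AlphaSort's central tricks is to sort small (key-prefix, pointer) pairs that fit in cache rather than shuffling whole records. A minimal Python illustration of the idea (not AlphaSort itself, which is a carefully engineered native implementation):

    # Sort compact (key-prefix, index) entries instead of the full records;
    # the comparisons then touch only small, cache-resident data.
    records = [b"delta: payload...", b"alpha: payload...", b"charlie: pay..."]
    entries = [(rec[:8], i) for i, rec in enumerate(records)]
    entries.sort()   # ties on the 8-byte prefix would need a full-key compare
    sorted_records = [records[i] for _, i in entries]  # one final pass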

Essentials of Relational Databases

  • Architecture of a Database System (2007): Joe Hellerstein's great overview of relational database systems. This essay walks readers through all components essential to relational database systems.
  • A Relational Model of Data for Large Shared Data Banks (1970): Codd's argument for data independence (from 1970), a fundamental concept in relational databases. Despite the current NoSQL trend, I believe ideas from this paper are becoming increasingly important in massively parallel data systems.
  • ARIES: A Transaction Recovery Method Supporting Fine-Granularity Locking and Partial Rollbacks Using Write-Ahead Logging (1992): The first algorithm that actually works: it supports concurrent execution of transactions without losing data even in the presence of failures. This paper is very hard to read because it mixes many low-level details into the explanation of the high-level algorithm. Perhaps try to understand ARIES (log-based recovery) from a database textbook before attempting this paper.
  • Efficient Locking for Concurrent Operations on B-Trees (1981) and The R*-tree: An Efficient and Robust Access Method for Points and Rectangles (1990): The B-tree is a core data structure in databases (not just relational ones). It is optimized for random lookups of on-disk data, with a low read-amplification factor. The R-tree is an extension of the B-tree that supports lookups of multi-dimensional data, e.g. geodata.
  • Improved Query Performance with Variant Indexes (1997): Analytical databases and OLTP databases require different trade-offs. These are reflected in the choices of indexing data structures. This paper talks about a number of index data structures more suitable for analytical databases.
  • On Optimistic Methods for Concurrency Control (1981): There are two ways to support concurrency. The first is the pessimistic way, i.e. locking shared data preemptively. This paper explains an alternative to locking called Optimistic Concurrency Control. Optimistic approaches assume conflicts are rare and execute transactions without acquiring locks. Before committing a transaction, the database system checks for conflicts and aborts/restarts the transaction if conflicts arise. (A toy validate-then-commit sketch follows this list.)
  • Access Path Selection in a Relational Database Management System (1979): The basics of query optimization. SQL is declarative, i.e. you specify what data you want using a query language, not how to compute it. There are usually multiple ways (query plans) of executing a query. The database system examines multiple plans and decides on an optimal one (best-effort); this process is called query optimization. The traditional approach is to have a cost model for different access methods and query plans. This paper explains the cost model and a dynamic programming algorithm that picks the best plan. (A minimal sketch of the dynamic program follows this list.)
  • Eddies: Continuously Adaptive Query Processing (2000): Traditional query optimization (and the cost model it relies on) is static. There are two problems with the static model. First, it is hard to build a cost model in the absence of data statistics. Second, the execution environment might change during a long-running query, and a static approach cannot react to the change. Analogous to fluid dynamics, this paper proposes a set of techniques that optimize query execution dynamically. I don't think ideas in Eddies have made their way into commercial systems yet, but the paper is very refreshing to read and might become more important now.
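
As a companion to the optimistic concurrency control paper above, here is a toy validate-then-commit sketch in Python. It shows the shape of backward validation only; the names and data structures are invented for illustration:

    committed = []  # write sets of committed transactions, in commit order

    class Txn:
        def __init__(self):
            self.start_ts = len(committed)  # commits visible when we began
            self.read_set, self.write_set = set(), set()

    def try_commit(txn):
        # Backward validation: if any transaction that committed while we
        # were running wrote a key we read, our reads may be stale -> abort.
        for write_set in committed[txn.start_ts:]:
            if write_set & txn.read_set:
                return False               # caller aborts and restarts
        committed.append(txn.write_set)    # then apply txn's writes here
        return True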
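And for the access path selection paper, a minimal sketch of Selinger-style dynamic programming over left-deep join orders. The cost model and cardinalities are made up; a real optimizer costs I/O, CPU, and alternative access paths:

    from itertools import combinations

    card = {"A": 1000, "B": 100, "C": 10}  # hypothetical table cardinalities
    sel = 0.01                             # assumed selectivity of every join

    def best_plan(names):
        tables = frozenset(names)
        # best[S] = (cost, output cardinality, plan) for joining the set S
        best = {frozenset([t]): (card[t], card[t], t) for t in tables}
        for size in range(2, len(tables) + 1):
            for subset in map(frozenset, combinations(tables, size)):
                for t in subset:                  # t joins last (left-deep)
                    rest = subset - {t}
                    rcost, rcard, rplan = best[rest]
                    out = rcard * card[t] * sel   # toy join-size estimate
                    cost = rcost + card[t] + out  # toy cost: inputs + output
                    if subset not in best or cost < best[subset][0]:
                        best[subset] = (cost, out, (rplan, "JOIN", t))
        return best[tables]

    print(best_plan(["A", "B", "C"]))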

Classic System Design

  • A History and Evaluation of System R (1981): System R from IBM and Ingres from Berkeley were the two systems that showed relational databases were feasible. This paper describes System R. It is impressive and scary to note that the internals of relational database systems in 2012 look a lot like System R in 1981.
  • The Google File System (2003) and Bigtable: A Distributed Storage System for Structured Data (2006): Two core components of Google's data infrastructure. GFS is an append-only distributed file system for large sequential reads (data-intensive applications). BigTable is a high-performance distributed data store built on GFS. One way to think about it is that GFS is optimized for high throughput, and BigTable explains how to build a low-latency data store on top of it. Some of these might have been replaced by newer proprietary technologies internal to Google, but the ideas stand.
  • Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications (2001) and Dynamo: Amazon's Highly Available Key-value Store (2007): Chord was born in the days when distributed hash tables were a hot research topic. It does one thing, and does it really well: look up the location of a key in a completely distributed setting (peer-to-peer) using consistent hashing. The Dynamo paper explains how to build a distributed key-value store using Chord. Note that some design decisions changed from Chord to Dynamo, e.g. finger-table routing state of O(log N) vs. O(N), because Amazon has control over the nodes in its data centers, while Chord assumes peer-to-peer nodes on wide-area networks. (A consistent-hashing sketch follows below.)
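
A minimal consistent-hashing lookup in the spirit of Chord's key placement. The simple ring walk here stands in for Chord's O(log N) finger-table routing, and Dynamo layers virtual nodes on top of the same idea:

    import bisect, hashlib

    def h(s: str) -> int:
        return int(hashlib.sha1(s.encode()).hexdigest(), 16)

    nodes = ["node-a", "node-b", "node-c"]
    ring = sorted((h(n), n) for n in nodes)  # each node owns a point on the ring
    points = [p for p, _ in ring]

    def lookup(key: str) -> str:
        # Return the first node clockwise from the key's hash position.
        i = bisect.bisect_right(points, h(key)) % len(ring)
        return ring[i][1]

    print(lookup("user:42"))  # adding/removing a node remaps only one arc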

Columnar Databases

Columnar storage and column-oriented query engines are critical to analytical workloads, e.g. OLAP. It's been almost 15 years since the idea first came out (the MonetDB paper in 1999), and by now almost every commercial warehouse database has a columnar engine.

  • C-Store: A Column-oriented DBMS (2005) and The Vertica Analytic Database: C-Store 7 Years Later (2012): C-Store is an influential, academic system done by the folks in New England. Vertica is the commercial incarnation of C-Store.
  • Column-Stores vs. Row-Stores: How Different Are They Really? (2008): Discusses the importance of both columnar storage and the column-oriented query engine.
  • Dremel: Interactive Analysis of Web-Scale Datasets (2010): A jaw-dropping paper when Google published it. Dremel is a massively parallel analytical database used at Google for ad-hoc queries. The system runs on thousands of nodes to process terabytes of data in seconds. It applies columnar storage to complex, nested data structures. The paper talks a lot about the nested data structure support, and is a bit light on the details of the query execution. Note that a number of open source projects claim to be building "Dremel". The Dremel system achieves low latency through massive parallelism and columnar storage, so the model doesn't necessarily make sense outside Google, since very few companies in the world can afford thousands of nodes for ad-hoc queries.

Data-Parallel Computation

  • MapReduce: Simplified Data Processing on Large Clusters (2004): MapReduce is both a programming model (borrowed from an old concept in functional programming) and a system at Google for distributed data-intensive computation. The programming model is simple yet expressive enough to capture a wide range of programming needs. The system, coupled with the model, is fault-tolerant and scalable. It is probably fair to say that half of academia is now working on problems heavily influenced by MapReduce. (A word-count sketch of the model follows this list.)
  • Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing (2012): This is the research paper behind the Spark cluster computing project at Berkeley. Spark exposes a distributed memory abstraction called RDD, which is an immutable collection of records distributed across a cluster's memory. RDDs can be transformed using MapReduce style computations. The RDD abstraction can be orders of magnitude more efficient for workloads that exhibit strong temporal locality, e.g. query processing and iterative machine learning. Spark is an example of why it is important to separate the MapReduce programming model from its execution engine.
  • Shark: SQL and Rich Analytics at Scale (2013): Describes the Shark system, which is the SQL engine built on top of Spark. More importantly, the paper discusses why previous SQL on Hadoop/MapReduce query engines were slow.
  • Spanner (2012): Spanner is "a scalable, multi-version, globally distributed, and synchronously replicated database". The linchpin that allows all this functionality is the TrueTime API, which lets Spanner order events between nodes without having them communicate. There is some speculation that the TrueTime API is very similar to a vector clock, but each node has to store less data. Sadly, a paper on TrueTime has been promised but hasn't yet been released.
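
To make the MapReduce programming model concrete, here is a single-process word-count sketch in Python; the paper's real contribution is the runtime that executes these two functions across thousands of machines with fault tolerance:

    from collections import defaultdict

    def map_fn(doc):              # map: emit intermediate (key, value) pairs
        for word in doc.split():
            yield word, 1

    def reduce_fn(word, counts):  # reduce: combine all values for one key
        return word, sum(counts)

    docs = ["the quick brown fox", "the lazy dog", "the fox"]

    groups = defaultdict(list)    # "shuffle": group intermediate pairs by key
    for doc in docs:
        for k, v in map_fn(doc):
            groups[k].append(v)

    print(dict(reduce_fn(k, vs) for k, vs in groups.items()))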

Consensus and Consistency

  • Paxos Made Simple (2001): Paxos is a fault-tolerant distributed consensus protocol. It forms the basis of a wide variety of distributed systems. The idea is simple, but notoriously difficult to understand (perhaps due to the way the original Paxos paper was written).
  • The Raft Consensus Algorithm (2014): Raft is a consensus algorithm designed as an alternative to Paxos. It was meant to be more understandable than Paxos through decomposition of its logic, and it is also formally proven safe and offers some additional features. Raft provides a generic way to distribute a state machine across a cluster of computing systems, ensuring that each node in the cluster agrees upon the same series of state transitions.
  • CAP Twelve Years Later: How the "Rules" Have Changed (2012): The CAP theorem, proposed by Eric Brewer, asserts that any networked shared-data system can have only two of three desirable properties: Consistency, Availability, and Partition-Tolerance. A number of NoSQL stores reference CAP to justify their decision to sacrifice consistency. This is Eric Brewer's writeup on CAP in retrospect, explaining that the "'2 of 3' formulation was always misleading because it tended to oversimplify the tensions among properties."

Trends (Cloud Computing, Warehouse-scale Computing, New Hardware)

  • A View of Cloud Computing (2010): This is THE paper on Cloud Computing. This paper discusses the economics and obstacles of cloud computing (referring to the elasticity of resources, not the consumer-facing "cloud") from a technical perspective. The obstacles presented in this paper will impact design decisions for systems running in the cloud.
  • The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines: Google's Luiz André Barroso and Urs Hölzle explain the basics of data center hardware and software for warehouse-scale computing. There is an accompanying video. The video talks about the importance of cutting long-tail latency in massively parallel systems. The other key idea is the disaggregation of resources. Technologies such as GFS/HDFS already disaggregate disks because of high network bandwidth, but we have yet to see the same trend apply to DRAM, because that would require low-latency networking.

Miscellaneous

  • Reflections on Trusting Trust (1984): Ken Thompson's Turing Award acceptance speech in 1984, describing black box backdoor issues and pointing out trust is not absolute.
  • What Goes Around Comes Around: Michael Stonebraker and Joseph M. Hellerstein provide a summary of 35 years of data model proposals, grouped into 9 different eras. The paper discusses the proposals of each era and shows that there are only a few basic data modeling ideas, most of which have been around for a long time. Later proposals inevitably bear a strong resemblance to certain earlier proposals.

External Reading Lists

A number of schools have their own reading lists for graduate students in databases.
