Facebook architecture

The web front end is written in PHP. Facebook's HipHop compiler then converts it to C++ and compiles it with g++, providing a high-performance templating and web logic execution layer. Because of the limitations of relying entirely on static compilation, Facebook started work on a HipHop interpreter as well as a HipHop Virtual Machine, which translates PHP code to HipHop bytecode.

HipHop for PHP (HPHPc) is a PHP transpiler created by Facebook. Used as a source-to-source compiler, HPHPc translates PHP code into C++, which is compiled into a binary and run as an executable, as opposed to PHP's usual execution path, in which PHP code is transformed into opcodes and interpreted. The original motivation behind HipHop was to save resources on Facebook servers, given the large PHP codebase of facebook.com. Increases in web page generation throughput by factors of up to six have been observed over Zend PHP.

The HipHop Virtual Machine (HHVM) is a just-in-time (JIT) compilation-based execution engine for PHP, also developed by Facebook. Facebook had several reasons for moving from HPHPc to HHVM:

  1. HPHPc's curve for further performance improvements had flattened.
  2. HPHPc did not fully support the PHP language, including the create_function() and eval() constructs.
  3. It involved a time- and resource-consuming deployment process that required a binary larger than 1 GB to be compiled and distributed to many servers in short order.
  4. Maintaining HPHPc and HPHPi in parallel (as they needed to be, for consistency between production and development environments) was becoming cumbersome.
  5. HPHPc was not a drop-in replacement for Zend, requiring external customers to change their whole development and deployment processes to use it.

Business logic is exposed as services using Thrift. The Apache Thrift software framework, for scalable cross-language services development, combines a software stack with a code generation engine to build services that work efficiently and seamlessly between C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, JavaScript, Node.js, Smalltalk, OCaml, Delphi and other languages.

Services implemented in Java don't use a typical enterprise application server but rather Facebook's custom application server. At first this can look like reinventing the wheel, but since these services are exposed and consumed only (or mostly) through Thrift, the overhead of Tomcat, or even Jetty, was probably too high, with no significant added value for their needs.

Persistence is done using MySQL, Memcached and Hadoop's HBase. Memcached is used as a cache for MySQL as well as a general-purpose cache.
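Using Memcached as a cache in front of MySQL is commonly done with a look-aside (cache-aside) read path. The sketch below is a minimal illustration under assumptions: plain dictionaries stand in for Memcached and MySQL, and the names LookAsideCache, get_user and update_user are hypothetical rather than taken from Facebook's code.

```python
class LookAsideCache:
    """Look-aside cache in front of a slower backing store.

    `cache` and `database` are plain dicts standing in for Memcached and
    MySQL; both are assumptions made for illustration only.
    """

    def __init__(self, database):
        self.database = database
        self.cache = {}
        self.db_reads = 0  # counts how often the database is actually hit

    def get_user(self, user_id):
        # 1. Try the cache first.
        if user_id in self.cache:
            return self.cache[user_id]
        # 2. On a miss, read from the database...
        self.db_reads += 1
        row = self.database.get(user_id)
        # 3. ...and populate the cache for later readers.
        if row is not None:
            self.cache[user_id] = row
        return row

    def update_user(self, user_id, row):
        # Write to the database, then invalidate the cached copy so the
        # next read fetches the fresh row.
        self.database[user_id] = row
        self.cache.pop(user_id, None)


store = LookAsideCache({1: {"name": "alice"}})
first = store.get_user(1)    # miss: served from the database
second = store.get_user(1)   # hit: served from the cache
store.update_user(1, {"name": "alice b"})
third = store.get_user(1)    # miss again after invalidation
```

Invalidating on write, rather than updating the cached value in place, keeps the write path simple at the cost of one extra database read afterwards.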

Offline processing is done using Hadoop and Hive. The Apache Hive data warehouse software facilitates querying and managing large datasets residing in distributed storage. Hive provides a mechanism to project structure onto this data and query it using a SQL-like language called HiveQL. HiveQL also allows traditional map/reduce programmers to plug in their custom mappers and reducers when it is inconvenient or inefficient to express this logic in HiveQL.

Data such as logs, clicks and feeds is transported using Scribe, then aggregated and stored in HDFS using Scribe-HDFS, allowing extended analysis using MapReduce. Scribe is a server for aggregating log data streamed in real time from a large number of servers.
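To make the MapReduce analysis step concrete, here is a small in-process sketch of the map, shuffle and reduce phases applied to aggregated log lines. The tab-separated log format and the function names are assumptions for illustration; a real job would run distributed on Hadoop rather than in a single Python process.

```python
from collections import defaultdict


def map_phase(log_lines):
    """Emit (event_type, 1) for each line of the form 'event_type<TAB>payload'."""
    for line in log_lines:
        event_type, _, _payload = line.partition("\t")
        yield event_type, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the values for each key."""
    return {key: sum(values) for key, values in groups.items()}


# Hypothetical log lines, as they might look once collected in one place.
logs = ["click\t/home", "view\t/home", "click\t/photos"]
counts = reduce_phase(shuffle(map_phase(logs)))
```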

BigPipe is their custom technology for accelerating page rendering using pipelining logic.
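The pipelining idea can be sketched as a response that is streamed in chunks: the page skeleton goes out first, and each independent section (a "pagelet") is flushed as soon as it has been generated, so the browser can start rendering before the whole page is ready. This is a simplified illustration, not BigPipe's implementation; the function render_page and the client-side fill() call are hypothetical.

```python
def render_page(pagelets):
    """Yield the response in chunks: skeleton first, then one chunk per pagelet.

    `pagelets` maps a pagelet name to a zero-argument function that builds
    its content. The client-side `fill` function is assumed to exist.
    """
    # Send the skeleton with empty placeholders right away.
    placeholders = "".join(f'<div id="{name}"></div>' for name in pagelets)
    yield f"<html><body>{placeholders}"
    # Generate each pagelet and flush it as soon as it is ready.
    for name, build in pagelets.items():
        content = build()
        yield f'<script>fill("{name}", "{content}")</script>'
    yield "</body></html>"


pagelets = {
    "feed": lambda: "news feed",
    "chat": lambda: "chat sidebar",
}
chunks = list(render_page(pagelets))
```

In a real server each chunk would be written to the socket as it is produced, instead of being collected into a list.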

Varnish Cache is used for HTTP proxying. They preferred it for its high performance and efficiency.

The billions of photos posted by users are stored in Haystack, an ad-hoc storage solution developed by Facebook that relies on low-level optimizations and append-only writes.
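As a rough sketch of what append-only writes mean in practice: new data is always written at the end of a large volume, and a separate index records where each item lives. The class below is illustrative only; the in-memory index, the use of io.BytesIO as the volume and the name AppendOnlyStore are assumptions, not a description of Haystack's actual on-disk format.

```python
import io


class AppendOnlyStore:
    """Blobs are only ever appended; an index maps id -> (offset, size)."""

    def __init__(self):
        self.volume = io.BytesIO()  # stands in for one large file on disk
        self.index = {}             # photo_id -> (offset, size)

    def put(self, photo_id, data):
        # Always write at the end; existing bytes are never modified.
        offset = self.volume.seek(0, io.SEEK_END)
        self.volume.write(data)
        # Re-uploading an id just points the index at the newest copy.
        self.index[photo_id] = (offset, len(data))

    def get(self, photo_id):
        offset, size = self.index[photo_id]
        self.volume.seek(offset)
        return self.volume.read(size)


photos = AppendOnlyStore()
photos.put("a", b"cat")
photos.put("b", b"sunset")
photos.put("a", b"dog")  # the old bytes for "a" stay in the volume
```

A read needs one index lookup and one seek, and stale copies are left in place until some later compaction step reclaims them.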

Facebook Messages uses its own architecture, which is notably based on infrastructure sharding and dynamic cluster management. Business logic and persistence are encapsulated in so-called "Cells". Each Cell handles a subset of users; new Cells can be added as popularity grows. Persistence is achieved using HBase.
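One way to picture Cell-based sharding is a directory that pins each user to a Cell and sends new users to the least-loaded one, so that adding a Cell absorbs growth without moving existing users. This is a hypothetical sketch: the class CellDirectory and its assignment policy are assumptions, not Facebook's actual placement logic.

```python
class CellDirectory:
    """Maps each user to exactly one Cell; new Cells can be added later."""

    def __init__(self, cell_names):
        self.cells = {name: set() for name in cell_names}
        self.user_to_cell = {}

    def add_cell(self, name):
        self.cells[name] = set()

    def cell_for(self, user_id):
        if user_id not in self.user_to_cell:
            # Assumed policy: place new users on the least-loaded Cell,
            # breaking ties by name so the result is deterministic.
            name = min(self.cells, key=lambda c: (len(self.cells[c]), c))
            self.cells[name].add(user_id)
            self.user_to_cell[user_id] = name
        return self.user_to_cell[user_id]


directory = CellDirectory(["cell-1"])
first_cell = directory.cell_for("u1")
directory.cell_for("u2")
directory.add_cell("cell-2")          # capacity added as usage grows
third_cell = directory.cell_for("u3")
```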

The Facebook Messages search engine is built with an inverted index stored in HBase.
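An inverted index maps each word to the set of messages containing it, so a query becomes a set intersection instead of a scan over every message. The sketch below keeps the index in a Python dictionary purely for illustration (in the architecture described here it is stored in HBase), and the function names are hypothetical.

```python
from collections import defaultdict


def build_index(messages):
    """Map each lowercase word to the set of message ids that contain it."""
    index = defaultdict(set)
    for message_id, text in messages.items():
        for word in text.lower().split():
            index[word].add(message_id)
    return index


def search(index, query):
    """Return ids of messages containing every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


messages = {
    1: "lunch at noon",
    2: "Lunch tomorrow",
    3: "see you at noon",
}
index = build_index(messages)
```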

The typeahead search uses a custom storage and retrieval logic.

Chat is based on an epoll server developed in Erlang and accessed using Thrift. Erlang is a programming language used to build massively scalable soft real-time systems with high-availability requirements. Some of its uses are in telecoms, banking, e-commerce, computer telephony and instant messaging. Erlang's runtime system has built-in support for concurrency, distribution and fault tolerance.

They've built an automated system that responds to monitoring alerts by launching the appropriate repair workflow, or by escalating to humans if the outage can't be resolved automatically.
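The alert-handling loop can be sketched as a dispatch table from alert type to repair workflow, with escalation as the fallback. Everything here is hypothetical (the alert fields, the workflow functions and handle_alert itself); it only illustrates the control flow described above.

```python
def handle_alert(alert, workflows, escalate):
    """Run the repair workflow for this alert type; escalate if it fails
    or if no workflow is registered for the alert type."""
    workflow = workflows.get(alert["type"])
    if workflow is not None and workflow(alert):
        return "repaired"
    escalate(alert)
    return "escalated"


# Hypothetical workflows: each returns True when the repair succeeded.
def restart_service(alert):
    return True


def replace_disk(alert):
    return False  # pretend the automated repair did not work


WORKFLOWS = {"service_down": restart_service, "disk_failure": replace_disk}
escalated = []  # stands in for paging a human

results = [
    handle_alert({"type": kind, "host": "web-1"}, WORKFLOWS, escalated.append)
    for kind in ("service_down", "disk_failure", "unknown")
]
```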

Some information and numbers are known about the resources provisioned for each of these components:

  • Facebook is estimated to own more than 60,000 servers. Their recent datacenter in Prineville, Oregon is based entirely on self-designed hardware that was recently unveiled as the Open Compute Project.
  • 300 TB of data is stored in Memcached processes.
  • Their Hadoop and Hive cluster is made up of 3,000 servers, each with 8 cores, 32 GB of RAM and 12 TB of disk, for a total of 24k cores, 96 TB of RAM and 36 PB of disk.
  • 100 billion hits per day, 50 billion photos, 3 trillion objects cached and 130 TB of logs per day, as of July 2010.
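The Hadoop and Hive cluster totals follow directly from the per-server figures. The short calculation below reproduces them, assuming decimal units (1 TB = 1,000 GB and 1 PB = 1,000 TB); the variable names are illustrative.

```python
servers = 3000
cores_per_server = 8
ram_gb_per_server = 32
disk_tb_per_server = 12

total_cores = servers * cores_per_server             # 24,000 cores
total_ram_tb = servers * ram_gb_per_server / 1000    # 96 TB of RAM
total_disk_pb = servers * disk_tb_per_server / 1000  # 36 PB of disk
```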

Time: 2024-11-03 22:09:47
