How to Design YouTube (Part I)

One of the most common types of system design interview questions is to design an existing popular system. For example, in the past we’ve discussed How to Design Twitter, Design Facebook Chat Function, and so forth.

Part of the reason is that such a question is general enough to leave many areas open for discussion. In addition, candidates who are genuinely curious are more likely to have explored how existing products are designed.

So this week, we’re going to talk about how to design YouTube. It’s a broad question because building YouTube is like building a skyscraper from scratch, and there are just too many things to consider. Therefore, we’ll cover most of the “major” components from an interviewer’s perspective, including the database model, video/image storage, scalability, recommendation, security and so on.

Overview

Facing this question, most people’s minds go blank as the question is just too broad and they don’t know where to start. Just the storage itself is non-trivial as serving videos/images seamlessly to billions of users is extremely complicated.

As suggested in 8 Things You Need to Know Before a System Design Interview, it’s better to start with a high-level overview of the design before digging into the details. This is especially true for problems like this one, which have countless things to consider and where you’ll never be able to clarify everything.

Basically, we can simplify the system into a few major components as follows:

  • Storage. How do you design the database schema? Which database should you use? Videos and images can be a subtopic of their own, as they need special treatment to store.
  • Scalability. When you get millions or even billions of users, how do you scale the storage and the whole system? This can be an extremely complicated problem, but we can at least discuss some high-level ideas.
  • Web server. The most common structure is that front ends (both mobile and web) talk to the web server, which handles logic like user authentication, sessions, fetching and updating users’ data, etc. The server then connects to multiple backends such as video storage, the recommendation server and so forth.
  • Cache. This is another important component. We’ve discussed caching in detail before, but there are still some differences here, e.g. we need caches in multiple layers such as the web server, video serving, etc.
  • Other components. There are a few other important components such as the recommendation system, the security system and so on. As you can see, even a single feature can be used as a stand-alone interview question.

Storage and data model

If you are using a relational database like MySQL, designing the data schema can be straightforward. In reality, YouTube has used MySQL as its main database from the beginning, and it has worked pretty well.

First and foremost, we need to define the user model, which can be stored in a single table including email, name, registration date, profile information and so on. Another common approach is to keep user data in two tables: one for authentication-related information like email, password, name, registration date, etc. and the other for additional profile information like address, age and so forth.

The second major model is video. A video carries a lot of information, including metadata (title, description, size, etc.), the video file, comments, view counts, like counts and so on. Basic video information should be kept in its own table, so we can start with a video table.

The author-video relation will be another table that maps user id to video id, and the user-like-video relation can also be a separate table. The idea here is database normalization: organizing the columns and tables to reduce data redundancy and improve data integrity.
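The tables described above can be sketched as follows. This is only a minimal illustration using Python's built-in sqlite3 module rather than MySQL; every table name, column name and sample row here is an assumption made for the example, not YouTube's actual schema. The `file_url` column assumes the video file itself lives in separate storage.

```python
import sqlite3

# Hypothetical normalized schema: names and columns are illustrative only.
SCHEMA = """
CREATE TABLE users (
    id            INTEGER PRIMARY KEY,
    email         TEXT NOT NULL UNIQUE,
    password_hash TEXT NOT NULL,
    name          TEXT NOT NULL,
    registered_at TEXT NOT NULL
);
CREATE TABLE user_profiles (
    user_id INTEGER PRIMARY KEY REFERENCES users(id),
    address TEXT,
    age     INTEGER
);
CREATE TABLE videos (
    id          INTEGER PRIMARY KEY,
    title       TEXT NOT NULL,
    description TEXT,
    size_bytes  INTEGER,
    file_url    TEXT NOT NULL,  -- pointer to the file in separate storage
    view_count  INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE author_video (
    user_id  INTEGER NOT NULL REFERENCES users(id),
    video_id INTEGER NOT NULL REFERENCES videos(id),
    PRIMARY KEY (user_id, video_id)
);
CREATE TABLE user_like_video (
    user_id  INTEGER NOT NULL REFERENCES users(id),
    video_id INTEGER NOT NULL REFERENCES videos(id),
    PRIMARY KEY (user_id, video_id)  -- one like per user per video
);
"""


def create_db() -> sqlite3.Connection:
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def like_count(conn: sqlite3.Connection, video_id: int) -> int:
    """Derive the like count from the relation table instead of storing it twice."""
    row = conn.execute(
        "SELECT COUNT(*) FROM user_like_video WHERE video_id = ?", (video_id,)
    ).fetchone()
    return row[0]


conn = create_db()
conn.execute(
    "INSERT INTO users (id, email, password_hash, name, registered_at) "
    "VALUES (1, 'alice@example.com', 'not-a-real-hash', 'Alice', '2024-01-01')"
)
conn.execute(
    "INSERT INTO videos (id, title, file_url) VALUES (10, 'Demo', 'videos/10.mp4')"
)
conn.execute("INSERT INTO author_video (user_id, video_id) VALUES (1, 10)")
conn.execute("INSERT INTO user_like_video (user_id, video_id) VALUES (1, 10)")
print(like_count(conn, 10))
```

Because each like is a row keyed by (user id, video id), the same user cannot like a video twice, which is one concrete way normalization helps data integrity. At large scale a system would likely also keep a cached or denormalized counter, but that is outside this sketch.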

Video and image storage

It’s recommended to store large static files like videos and images separately, as this performs better and is much easier to organize and scale. It’s quite counterintuitive that YouTube has more images than videos to serve. Each video has thumbnails of different sizes for different screens, and the result is roughly 4x more images than videos. Therefore, we should never ignore image storage.

One of the most common approaches is to use a CDN (content delivery network). In short, a CDN is a globally distributed network of proxy servers deployed in multiple data centers. The goal of a CDN is to serve content to end users with high availability and high performance. It’s usually a third-party network, and many companies store their static files on a CDN today.

The biggest benefit of using a CDN is that it replicates content in multiple places, so there’s a better chance of the content being closer to the user, with fewer hops, and traveling over a friendlier network. In addition, the CDN takes care of issues like scalability, and you just need to pay for the service.
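The replication idea can be illustrated with a toy model. This is a rough sketch of a pull-through edge cache, not how any particular CDN is implemented; the `EdgeServer` class, the region names and the file path are all made up for the example.

```python
class EdgeServer:
    """Toy edge node: keeps a local copy of anything it fetched from the origin."""

    def __init__(self, origin: dict[str, bytes]) -> None:
        self.origin = origin              # stands in for the origin storage
        self.cache: dict[str, bytes] = {}
        self.origin_fetches = 0           # how often we had to go back to the origin

    def get(self, path: str) -> bytes:
        if path not in self.cache:
            # Cache miss: fetch once from the origin, then keep a replica here.
            self.cache[path] = self.origin[path]
            self.origin_fetches += 1
        return self.cache[path]


# Hypothetical origin content and edge locations.
origin = {"thumbs/10_small.jpg": b"jpeg-bytes"}
edges = {"us-east": EdgeServer(origin), "eu-west": EdgeServer(origin)}


def serve(user_region: str, path: str) -> bytes:
    """Serve a file from the edge assigned to the user's region."""
    return edges[user_region].get(path)


serve("eu-west", "thumbs/10_small.jpg")  # first request: fetched from origin
serve("eu-west", "thumbs/10_small.jpg")  # second request: served from the edge copy
print(edges["eu-west"].origin_fetches)
```

After the first request, users in the same region are served from the nearby replica, and the origin is contacted only once per edge for that file.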

Popular VS long-tailed videos

If you think a CDN is the ultimate solution, you are wrong. Given the number of videos YouTube has today (819,417,600 hours of video), it would be extremely costly to host all of them on a CDN, especially since the majority of the videos are long-tailed, meaning they get only 1-20 views a day.

However, one of the most interesting things about the Internet is that it’s usually the long-tailed content that attracts the majority of users. The reason is simple: popular content can be found everywhere, and only the long-tailed content makes the product special.

Coming back to the storage problem, one straightforward approach is to host popular videos on a CDN and store less popular videos on our own servers, organized by location. This has a couple of advantages:

  • Popular videos are viewed by a huge audience in many different locations, which is what a CDN is good at. It replicates the content in multiple places, so it’s more likely that the video is served from a close and friendly network.
  • Long-tailed videos are usually consumed by a particular group of people, and if you can predict that group in advance, it’s possible to store the content efficiently.
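The split above can be sketched as a small placement rule. This is a minimal sketch rather than YouTube's actual policy: the function name `choose_placement`, the cutoff of 20 views per day (borrowed from the long-tail range mentioned earlier) and the region labels are all assumptions for illustration.

```python
# Assumed cutoff: videos at or below this many daily views count as long-tailed.
CDN_VIEW_THRESHOLD = 20


def choose_placement(daily_views: int, views_by_region: dict[str, int]) -> str:
    """Decide where to host a video.

    Popular videos go to the CDN. Long-tailed videos go to our own servers
    in the region where most of their viewers are.
    """
    if daily_views > CDN_VIEW_THRESHOLD:
        return "cdn"
    if not views_by_region:
        # No viewing history yet, so fall back to a default location.
        return "origin:default"
    top_region = max(views_by_region, key=views_by_region.get)
    return f"origin:{top_region}"


print(choose_placement(50_000, {"us": 30_000, "eu": 20_000}))
print(choose_placement(5, {"eu": 4, "us": 1}))
```

In practice, the prediction of who will watch a long-tailed video would come from richer signals than past view counts; here past views per region simply stand in for that prediction.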

Summary

There are just too many topics we’d like to cover for the question “how to design YouTube”. In our next post, we’ll talk more about scalability, cache, server, security and so on.

By the way, if you want more guidance from experienced interviewers, you can check out Gainlo, which lets you have mock interviews with engineers from Google, Facebook, etc.

The post is written by Gainlo, a platform that allows you to have mock interviews with employees from Google, Amazon, etc.

Original article: https://www.cnblogs.com/vicky-project/p/9177049.html

Time: 2024-08-30 16:00:57
