[Repost] What is a WebRTC Gateway anyway? (Lorenzo Miniero)

https://webrtchacks.com/webrtc-gw/

As I mentioned in my ‘WebRTC meets telecom’ article a couple of weeks ago, at Quobis we’re currently involved in 30+ WebRTC field trials/POCs which involve, in one way or another, a telco network. In most cases service providers are trying to provide WebRTC-based access to their existing/legacy infrastructure and services (fortunately, in some cases it’s not limited to doing only that). To achieve all this, one of the pieces they need to deploy is a WebRTC Gateway. But what is a WebRTC Gateway anyway? A year ago I had the chance to provide a first answer during the Kamailio World Conference 2013 (see my presentation WebRTC and VoIP: bridging the gap) but, since Lorenzo Miniero has recently released an open source, modular and general purpose WebRTC gateway called Janus, I thought it would be great to get him to share his experience here.

I’ve known Lorenzo for some years now. He is the co-founder of a small but great startup called Meetecho. Meetecho is an academic spinoff of the University of Napoli Federico II, where Lorenzo is currently also a Ph.D. student. He has been involved in real-time multimedia applications over the Internet for years, especially from a standardisation point of view. Within the IETF, in particular, he worked in XCON on Centralized Conferencing and in MEDIACTRL on the interactions between Application Servers and Media Servers. He is currently working on WebRTC-related applications, in particular on conferencing and large scale streaming as part of his Ph.D., focusing on the interaction with legacy infrastructures, which is where WebRTC gateways play an interesting role. As part of the Meetecho team he also provides remote participation services on a regular basis to all IETF meetings. Most recently, he also spent some time reviewing Simon P. and Salvatore L.’s new WebRTC book.

{“intro-by” : “victor“}



Lorenzo Miniero

What is a WebRTC Gateway anyway? (by Lorenzo Miniero)

Since day one, WebRTC has been seen as a great opportunity by two different worlds: those who envisaged the chance to create new and innovative applications based on a new paradigm, and those who basically just envisioned a new client for legacy services and applications. Whether you belong to the former or the latter (or anywhere in between, like me), chances are that, sooner or later, you have faced the need for some kind of component to be placed between two or more WebRTC peers, thus going beyond (or simply breaking) the end-to-end approach WebRTC is based upon. I, for one, did, and have devoted my WebRTC-related efforts in that direction since WebRTC first saw the light.

A different kind of peer

As you probably already know (and if you don’t, head here and do your homework!), WebRTC has been conceived as a peer-to-peer solution: that is, while signalling goes through a web server/application, the media flow is peer-to-peer.

Figure 1: WebRTC native peer-to-peer communication

I won’t go into the details of how this paradigm may change, especially considering this has been the subject of a previous blog post. What’s important to point out is that, even in a simple peer-to-peer scenario, one of the two involved parties (or maybe even both) doesn’t need to be a browser, but may very well be an application. There may be several reasons for having such an application: it may be acting as an MCU, a media recorder, an IVR application, a bridge towards a more or less different technology (e.g., SIP, RTMP, or any legacy streaming platform) or something else. Such an application, which should implement most, if not all, of the WebRTC protocols and technologies, is what is usually called a WebRTC Gateway: one side talks WebRTC, while the other talks either WebRTC as well or something entirely different (which may require, e.g., translating signalling protocols and/or transcoding media packets).

Figure 2: One of the peers as a logically decomposed WebRTC gateway (SIP example)

Gateways? Why??

As anticipated, there are several reasons why a gateway can be useful. Technically speaking, MCUs and server-side stacks can be seen as gateways as well, which means that, even when you don’t step outside the WebRTC world and just want to extend the one-to-one/full-mesh paradigm among peers, having such a component can definitely help according to the scenario you want to achieve.

Nevertheless, the main motivation comes from the tons of existing, so-called legacy infrastructures out there that may benefit from a WebRTC-enabled kind of access. In fact, one would assume that the re-use of existing protocols like SDP, RTP and others in WebRTC would make this trivial. Unfortunately, most of the time that is not the case. If, for instance, we refer to existing SIP infrastructures, even when SIP is used as the signalling protocol in WebRTC there are too many differences between the standards WebRTC endpoints implement and those available in currently deployed infrastructures.

Just to make a simple example, most legacy components don’t support media encryption, and when they do they usually only support SDES. On the other hand, for security reasons WebRTC mandates the use of DTLS as the only way to establish a secure media connection, a mechanism that has been around for a while but that has seen little or no deployment in the existing communication frameworks so far. The same incompatibilities between the two worlds emerge in other aspects as well, like the extensive use WebRTC endpoints make of ICE for NAT traversal, RTCP feedback messages for managing the status of a connection, or RTP/RTCP muxing, whereas existing infrastructures usually rely on simpler approaches like Hosted NAT Traversal (HNT) in SBCs, separate even/odd ports for RTP and RTCP, and more or less basic RFC 3550 RTCP statistics and messages. Things get even wilder when we think of the additional stuff, mandatory or not, that is being added to WebRTC right now, such as BUNDLE, Trickle ICE, new codecs the existing media servers will most likely not support and so on, not to mention Data Channels and WebSockets and the way they could be used in a WebRTC environment to transport protocols like BFCP or MSRP, which SBCs or other legacy components would usually expect on TCP and/or UDP and negotiated the old-fashioned way.
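To make the muxing difference a bit more concrete: a legacy endpoint knows what a packet is from the port it arrived on, while a WebRTC-facing gateway typically receives STUN, DTLS, RTP and RTCP on the very same port and has to tell them apart by looking at the bytes. Here is a minimal sketch in Python, assuming the first-byte ranges described in RFC 7983 and the RTCP packet type range from RFC 5761; the function name is just an illustration, not something taken from Janus or any other gateway.

```python
def classify_packet(data: bytes) -> str:
    """Guess which protocol a datagram received on a shared WebRTC port carries.

    Assumed ranges: first byte 0-3 is STUN, 20-63 is DTLS and 128-191 is
    RTP or RTCP (RFC 7983). With rtcp-mux, a second byte in 192-223 marks
    the packet as RTCP (RFC 5761); anything else is treated as RTP.
    """
    if not data:
        return "unknown"
    first = data[0]
    if first <= 3:
        return "stun"
    if 20 <= first <= 63:
        return "dtls"
    if 128 <= first <= 191:
        if len(data) < 2:
            return "unknown"
        # RTP and RTCP share the version bits, so look at the next byte.
        return "rtcp" if 192 <= data[1] <= 223 else "rtp"
    return "unknown"


if __name__ == "__main__":
    print(classify_packet(bytes([0x00, 0x01])))  # STUN binding request header
    print(classify_packet(bytes([0x80, 0xC8])))  # RTCP sender report header
```

On the legacy side none of this is needed, which is exactly why the gateway has to do the work: it demultiplexes on the WebRTC leg and forwards RTP and RTCP to the separate even/odd ports the legacy leg expects.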

Ok, we need a gateway… what now?

Luckily for you (and for us all!), several people have worked on gateways since the first WebRTC browsers were made available. Even just focusing on open source efforts alone, a lot of work has been done on platforms like Asterisk or Kamailio to make the interaction with existing SIP infrastructures easier, and new components like Doubango, Kurento, Licode or the Jitsi stack have been released in recent months. Each application usually addresses different requirements, depending on whether you just need a WebRTC-to-SIP gateway, a conferencing MCU, a WebRTC-compliant streaming server, a more generic stack/media server and so on.

Since I’ve recently worked on an open source WebRTC gateway implementation called Janus myself, and considering its more general purpose approach to gatewaying, I’ll try and guide you through the common requirements and challenges such a WebRTC-driven project can face you with.

Where to start?

When it comes to gateways, the hardest step is always the first one. Where should you start? The easiest way is to start by addressing the functional requirements, which usually are:

  • architectural, as in “should the gateway be monolithic, or somehow decomposed between signalling and media plane”?;
  • protocols, as you’ll need to be able to talk WebRTC and probably something else too, if you’re going to translate to a more or less different technology;
  • media management, depending on whether you’re only going to relay media around or handle it directly (e.g., transcoding, mixing, recording, etc.);
  • signalling, that is how you’re going to setup and manage media sessions on either side;
  • putting this all together, as, especially in WebRTC, all current implementations have expectations on how the involved technologies should behave, and may not work if those expectations are not met.

The first point in particular is quite important, as it will obviously impact the way the gateway is subsequently going to be designed and implemented. In fact, while a monolithic approach (where signalling and media planes are handled together) might be easier to design, a decomposed gateway (with signalling and media planes handled separately, and the two interacting somehow) would allow for a separate management of scalability concerns. There is a middle ground, if for instance one relies on a more hybrid modular architecture. That said, all of them have pros and cons, and if properly designed each of them can be scaled as needed.

Apart from that, at least from a superficial point of view there’s nothing in the requirements that is very different from a WebRTC-compliant endpoint in general. Of course, there are differences to take into account: for one, a gateway is most likely going to handle many more sessions than a single endpoint; besides, no media needs to be played locally, which makes things easier on one side, but presents different complications when it comes to what must happen to the media itself. The following paragraphs try to go a bit deeper into the genesis of a gateway.

Protocols

The first thing you need to ask when choosing or implementing a WebRTC gateway is: can I avoid re-inventing the wheel? This is a very common question we ask ourselves every day in several different contexts. Yet, it is even more important when talking about WebRTC, as it does partly re-use existing technologies and protocols, even if “on steroids” as I explained before.

The answer, luckily, is “mostly”. You may, of course, just take the Chrome stack and start it all from there. As I anticipated, a gateway is, after all, a compliant WebRTC implementation, and so a complete codebase like that can definitely help. For several different reasons, I chose a different approach, that is, trying to write something new from scratch. Whatever the programming language, there are several open source libraries you can re-use for the purpose, like openssl (C/C++) or BouncyCastle (Java) for DTLS-SRTP, libnice (C/C++), pjnath (C/C++) or ice4j (Java) for everything related to ICE/STUN/TURN, libsrtp for SRTP and so on. Of course, a stack is only half of the solution: you’ll need to prepare yourself for every situation, e.g., acting as either a DTLS server or client, handling heterogeneous NAT traversal scenarios, and basically being able to interact with all compliant implementations according to the WebRTC specs.
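As an example of what “acting as either a DTLS server or client” means in practice, the role is negotiated in SDP through the a=setup attribute (RFC 4145, as used for DTLS-SRTP in RFC 5763): whoever ends up “active” starts the handshake, i.e., acts as the DTLS client. The following is only a sketch of that decision when the gateway is the one answering; the function names and the prefer_active knob are my own assumptions for illustration.

```python
def answer_setup_role(offered: str, prefer_active: bool = True) -> str:
    """Pick the a=setup value for an SDP answer, given the value in the offer.

    'actpass' leaves the choice to the answerer, while 'active' and
    'passive' force the opposite role. Other values (e.g. 'holdconn')
    are rejected here to keep the sketch small.
    """
    offered = offered.strip().lower()
    if offered == "actpass":
        return "active" if prefer_active else "passive"
    if offered == "active":
        return "passive"
    if offered == "passive":
        return "active"
    raise ValueError(f"unsupported a=setup value: {offered!r}")


def is_dtls_client(local_setup: str) -> bool:
    """The side whose negotiated role is 'active' sends the DTLS ClientHello."""
    return local_setup == "active"


if __name__ == "__main__":
    role = answer_setup_role("actpass")
    print(role, "-> DTLS client" if is_dtls_client(role) else "-> DTLS server")
```

Browsers typically offer actpass, so a gateway that only ever answers could get away with one role; as soon as it also originates offers, or talks to less flexible peers, it needs to be ready for both.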

As you may imagine (especially if you read Tim’s rant), things do get a bit harder when it comes to SDP: while there are libraries that allow you to parse, manipulate and generate SDP, the several attributes and features that are needed for WebRTC are quite likely not supported, unless you work on the library a lot. For instance, for Janus I personally chose a relatively lightweight approach: I used Sofia-SDP as a stack for parsing session descriptions, while manually generating them instead of relying on a library for the purpose. Considering the mangling we already all do in JavaScript, until a WebRTC-specific SDP library comes out it looked like the safest course of action. What’s important to point out is that, since the gateway is going to terminate the media connections somehow, the session descriptions must be prepared correctly, and in a way that all compliant implementations are able to process. The reverse holds too: be prepared to handle whatever you may receive, as your gateway will need to understand it!
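To give an idea of what a lightweight approach to SDP can look like, here is a small Python sketch that splits a session description into session-level and per-media attributes, keeps attributes it doesn’t know about instead of choking on them, and reports which WebRTC-specific features a media section carries. This is not how Janus or Sofia-SDP do it; the class and function names are hypothetical and only a tiny part of SDP is covered.

```python
from dataclasses import dataclass, field


@dataclass
class MediaSection:
    kind: str          # "audio", "video", "application", ...
    port: int
    proto: str         # e.g. "UDP/TLS/RTP/SAVPF" or "RTP/AVP"
    formats: list
    attributes: list = field(default_factory=list)  # (name, value) pairs, in order


def parse_sdp(sdp: str):
    """Return (session_attributes, media_sections) for an SDP blob."""
    session_attrs, media = [], []
    for raw in sdp.splitlines():
        line = raw.strip()
        if len(line) < 2 or line[1] != "=":
            continue  # tolerate blank or malformed lines
        key, value = line[0], line[2:]
        if key == "m":
            parts = value.split()
            port = int(parts[1].split("/")[0])
            media.append(MediaSection(parts[0], port, parts[2], parts[3:]))
        elif key == "a":
            name, _, val = value.partition(":")
            # Unknown attributes are stored as-is rather than rejected.
            (media[-1].attributes if media else session_attrs).append((name, val))
    return session_attrs, media


def webrtc_features(session_attrs, section: MediaSection) -> dict:
    """Report which WebRTC-related mechanisms apply to a media section."""
    names = {n for n, _ in session_attrs} | {n for n, _ in section.attributes}
    return {
        "dtls": "fingerprint" in names,
        "sdes": "crypto" in names,
        "ice": "ice-ufrag" in names and "ice-pwd" in names,
        "rtcp_mux": "rtcp-mux" in names,
    }
```

A check like webrtc_features is the kind of thing a gateway can use to decide which leg it is talking to: a section with a=crypto and no a=fingerprint most likely comes from the legacy side, and the answer towards it has to be generated accordingly.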

Media

Once you deal with the protocols, you’re left with the media, and again, there are tons of RTP/RTCP libraries you can re-use for the purpose. Once you’re at the media level, you can do what you want: you may want to record the frames a peer is sending, reflect them around for a webinar/conference, transcode them and send them somewhere else, translate RTP and the transported media to and from a different protocol/format, receive some from an external source and send them to a WebRTC endpoint, and so on.

Figure 3: Bridging to different technologies

RTCP in particular, though, needs special care, especially if you’re bridging WebRTC peers through the gateway: in fact, RTCP messages are tightly coupled with the RTP session they’re related to, which means you have to translate the messages going back and forth if you want them to keep their meaning. Considering a gateway is a WebRTC-compliant endpoint, you may also want to take care of the RTCP messages yourself: e.g., retransmitting RTP packets when you get a NACK, adapting the bandwidth on reception of a REMB, or keeping the WebRTC peer up-to-date on the status of the connection by sending proper feedback. Some more details are available in draft-ietf-straw-b2bua-rtcp, which is currently under discussion in the IETF.
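As a small illustration of handling a NACK yourself, the sketch below decodes the feedback control information of a Generic NACK (RFC 4585: a 16-bit packet ID followed by a 16-bit bitmask of further losses) and looks the missing packets up in a bounded cache of recently sent RTP packets. It assumes the RTCP header and SSRC fields have already been stripped; the RetransmissionBuffer class and its default capacity are hypothetical names and values, not taken from any existing gateway.

```python
import struct
from collections import OrderedDict


def decode_nack_fci(fci: bytes) -> list:
    """Turn Generic NACK FCI entries into a list of lost RTP sequence numbers."""
    lost = []
    usable = len(fci) - len(fci) % 4  # ignore a trailing partial entry
    for offset in range(0, usable, 4):
        pid, blp = struct.unpack_from("!HH", fci, offset)
        lost.append(pid)
        for bit in range(16):
            if blp & (1 << bit):
                # The least significant bit refers to PID + 1, modulo 2^16.
                lost.append((pid + bit + 1) & 0xFFFF)
    return lost


class RetransmissionBuffer:
    """Keeps the most recently sent RTP packets, indexed by sequence number."""

    def __init__(self, capacity: int = 512):
        self.capacity = capacity
        self._packets = OrderedDict()

    def store(self, seq: int, packet: bytes) -> None:
        seq &= 0xFFFF
        self._packets[seq] = packet
        self._packets.move_to_end(seq)
        while len(self._packets) > self.capacity:
            self._packets.popitem(last=False)  # drop the oldest packet

    def lookup(self, seqs) -> list:
        """Return the cached packets among the requested sequence numbers."""
        return [self._packets[s] for s in seqs if s in self._packets]
```

Keep in mind that when the gateway is bridging two peers, the sequence numbers and SSRCs in the NACK refer to the RTP session on the side that sent it, so they need to be mapped before being used on, or forwarded to, the other side.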

Signalling

Last, but not least: what kind of signalling should your gateway employ? WebRTC doesn’t mandate any, which means you’re free to choose the one that fits your requirements. Several implementations rely on SIP, which looks like the natural choice when bridging to existing SIP infrastructures. Others make use of alternative protocols like XMPP/Jingle.

That said, there is no perfect candidate, as it mostly depends on what you want your gateway to do and what you’re most comfortable with in the first place. If you want it to be as generic as possible, as I did, an alternative approach may be relying on an ad-hoc protocol, e.g., based on JSON or XML, which leaves you the greatest freedom when it comes to designing a bridge to other technologies.
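Just to show how little is needed to get started with an ad-hoc JSON protocol, here is a minimal sketch of a message envelope with a type, a transaction identifier to match replies to requests, an optional session identifier and a free-form body. Everything in it (field names, message types, helper functions) is made up for the sake of the example and should not be read as the actual Janus protocol.

```python
import json
import uuid

# Hypothetical message types for this sketch.
ALLOWED_TYPES = {"create", "offer", "answer", "candidate", "hangup", "ack", "error"}


def make_message(msg_type: str, session_id=None, transaction=None, **body) -> str:
    """Serialize a signalling message; a new transaction id is generated if none is given."""
    if msg_type not in ALLOWED_TYPES:
        raise ValueError(f"unknown message type: {msg_type!r}")
    message = {"type": msg_type, "transaction": transaction or uuid.uuid4().hex}
    if session_id is not None:
        message["session"] = session_id
    if body:
        message["body"] = body
    return json.dumps(message)


def parse_message(raw: str) -> dict:
    """Parse and validate an incoming message, raising ValueError if it is malformed."""
    message = json.loads(raw)
    if not isinstance(message, dict):
        raise ValueError("message must be a JSON object")
    if message.get("type") not in ALLOWED_TYPES:
        raise ValueError("missing or unknown message type")
    if not isinstance(message.get("transaction"), str):
        raise ValueError("missing transaction identifier")
    return message


def make_reply(request: dict, msg_type: str, **body) -> str:
    """Build a reply that reuses the transaction (and session) of the request."""
    return make_message(msg_type, session_id=request.get("session"),
                        transaction=request["transaction"], **body)


if __name__ == "__main__":
    offer = parse_message(make_message("offer", session_id=1, sdp="v=0"))
    print(make_reply(offer, "ack"))
```

The nice thing about an envelope like this is that the body can carry an SDP, a trickled candidate or something that has nothing to do with WebRTC at all, which is what makes it easy to map onto SIP, XMPP or whatever sits on the other side of the gateway.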

Long story short…

As you might have guessed, writing a gateway is not easy. You need to implement all the protocols and use them in a way that allows you to seamlessly interact with all compliant implementations, maybe even fixing what may cause them not to interact with each other as they are, while at the same time taking into account the requirements on the legacy side. You need to tame the SDP beast, be careful with RTCP, and take care of any possible issue that may arise when bridging WebRTC to a different technology. Besides, as you know, WebRTC is a moving target, and so what works today in the gateways world may not work tomorrow: which means that keeping up to date is of paramount importance.

Anyway, this doesn’t need to scare you. Several good implementations are already available that address different scenarios, so if all you need is an MCU or a way to simply talk to well-known legacy technologies, chances are that one or more of the existing platforms can do it for you. Some implementations, like Janus itself, are even conceived as more or less extensible, which means that, in case no gateway currently supports what you need, you probably don’t need to write a new one from scratch anyway. And besides, as time goes by the so-called legacy implementations will hopefully start aligning with the stuff WebRTC is mandating right now, so that gateways won’t be needed anymore for bridging technologies but only to allow for more complex WebRTC scenarios.

That said, make sure you follow the Server-oriented stack topic on discuss-webrtc for more information!

{“author” : “Lorenzo Miniero“}

