WebRTC: A few tips & some advice on getting started

You’ve probably used WebRTC even if you don’t know what it is. And if you’re a business owner with an application (web, phone, etc.) that is geared around communication, you’ve probably thought about adding video chat to it.

The technology WhatsApp uses is WebRTC. It’s an open source project, led largely by Google, that lets 2 devices talk directly to each other without a middle man. It’s free and well maintained.

If you’re an application developer, or a business owner who has given somebody the task of doing this, here are a few things I wish I’d known earlier.

1. WebRTC moves fast, the documentation doesn’t

There is a huge amount of documentation, public git repositories, blog posts and StackOverflow questions out there. Many of them were correct when they were written but no longer are.

There was once a time when PeerConnectionFactory didn’t exist. Be careful you’re not looking at old information.

2. There is no WebRTC “signalling server”

This one is fun. You have 2 clients, A and B. Before you can have your video call, they need to send each other a few signals.

WebRTC would be half as complex if it came with a signalling server but it doesn’t so here we are (in some respects, it would be twice as complex if it did).

You need a way to help 2 people connect. Somebody needs a place to connect and shout “Hey, Adam are you there?” and Adam needs to be able to respond “Yes, I’m here”. You then start to swap offers and answers. This is done via the signalling server, but the transport can be any method you want: HTTP calls, socket.io (my personal favourite), text messages, literally anything!

So when documentation mentions a “signalling server”, read it as a “signalling protocol”. How you implement this communication is up to you. There is no off the shelf product here; you build it around your technology, requirements and needs.
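To make that concrete, here is a minimal sketch of a signalling "protocol" with no server product in sight. Everything in it is made up for illustration: the SignalRelay class, the message types ("hello", "here", "offer", "answer") and the placeholder payloads are my assumptions, not part of WebRTC. In a real app the relay would be replaced by socket.io, HTTP calls or whatever you pick, and the payloads would be the real offer and answer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory relay. In a real app this role is played by
// socket.io, HTTP calls, or whatever transport you choose.
class SignalRelay {
    // Callback a client registers to receive signals.
    interface Handler {
        void onSignal(String from, String type, String payload);
    }

    private final Map<String, Handler> clients = new HashMap<>();

    // A client "connects" by registering its handler under an id.
    void join(String clientId, Handler handler) {
        clients.put(clientId, handler);
    }

    // Pass a signal to a connected client; returns false if nobody is there.
    boolean send(String from, String to, String type, String payload) {
        Handler target = clients.get(to);
        if (target == null) {
            return false;
        }
        target.onSignal(from, type, payload);
        return true;
    }
}

class SignalRelayDemo {
    // Plays out "are you there?" / "I'm here" / offer / answer and logs each delivery.
    static List<String> run() {
        SignalRelay relay = new SignalRelay();
        List<String> log = new ArrayList<>();

        relay.join("adam", (from, type, payload) -> {
            log.add("adam got " + type + " from " + from);
            if (type.equals("hello")) {
                relay.send("adam", from, "here", "");
            } else if (type.equals("offer")) {
                relay.send("adam", from, "answer", "<answer would go here>");
            }
        });
        relay.join("beth", (from, type, payload) -> {
            log.add("beth got " + type + " from " + from);
            if (type.equals("here")) {
                relay.send("beth", from, "offer", "<offer would go here>");
            }
        });

        // Beth shouts "Hey, Adam are you there?" and the rest follows.
        relay.send("beth", "adam", "hello", "");
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The relay knows nothing about WebRTC. It only moves opaque messages between two ids, which is all a signalling layer has to do.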

3. PeerConnection is where the magic happens

Most of the work you do will revolve around PeerConnection, but don’t neglect these:

PeerConnectionFactory
PeerConnection.IceServer
DefaultVideoEncoderFactory and DefaultVideoDecoderFactory
MediaConstraints
PeerConnection.RTCConfiguration

4. WebRTC does not work on every platform

iOS, Android and most (but not all) web browsers, that’s about your lot. There are also native C++ APIs, so it should work on anything you can compile them for. If you’re working with things like Xamarin, I would imagine there are 3rd party libraries, but nothing “native” or official.

5. Many WebRTC examples focus on web browsers (but that’s fine)

The properties, methods, class names etc. are mostly the same in any language, so if you see a piece of documentation in JavaScript, it will likely be 95% of what you need in Java. One variation is that in a browser you need RTCPeerConnection, but in Java it’s just PeerConnection.

6. STUN and TURN are not twins. You likely do not need TURN in development, even though WebRTC configures them the same way

Hansel and Gretel left breadcrumbs; STUN helps the other person find the breadcrumbs. I won’t explain STUN further than that. Its use is simple: connect, get data, profit.

TURN is a middle man. It’s not always possible to connect directly to your callee (restrictive NATs and firewalls are the usual culprits), but if you can both make an outbound connection, you can both talk via TURN. TURN servers are expensive to run because all the media passes through them.

As far as WebRTC configuration goes they look the same (both are ICE servers), but they do different jobs. Stick with STUN if you’re starting out, and if you run into video or connectivity problems, look at setting up a TURN server.
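Here is a rough sketch of that "STUN first, TURN later" idea. IceEntry and IceConfig are hypothetical stand-ins I made up so the example runs on its own. On Android I’d expect each entry to map onto a PeerConnection.IceServer, but treat that mapping as an assumption. The STUN URL is a widely used public Google server, and the TURN host and credentials are placeholders.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for one ICE server entry (URL plus optional credentials).
final class IceEntry {
    final String url;
    final String username;   // null for STUN
    final String credential; // null for STUN

    IceEntry(String url, String username, String credential) {
        this.url = url;
        this.username = username;
        this.credential = credential;
    }

    // TURN URLs use the "turn:" or "turns:" scheme; STUN URLs use "stun:".
    boolean isTurn() {
        return url.startsWith("turn:") || url.startsWith("turns:");
    }
}

final class IceConfig {
    // A widely used public STUN server; swap in your own if you prefer.
    static final String STUN_URL = "stun:stun.l.google.com:19302";

    // STUN is always included. TURN is added only when a URL is supplied.
    static List<IceEntry> build(String turnUrl, String turnUser, String turnPass) {
        List<IceEntry> servers = new ArrayList<>();
        servers.add(new IceEntry(STUN_URL, null, null));
        if (turnUrl != null && !turnUrl.isEmpty()) {
            servers.add(new IceEntry(turnUrl, turnUser, turnPass));
        }
        return servers;
    }

    public static void main(String[] args) {
        // Development: STUN only.
        System.out.println(build(null, null, null).size());
        // Later, if calls fail on some networks: add TURN (placeholder host and credentials).
        System.out.println(build("turn:turn.example.com:3478", "user", "secret").size());
    }
}
```

Both kinds of server end up in the same list. The practical difference is that TURN needs credentials and costs you bandwidth.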

7. Learn the flow

WebRTC is quite asynchronous. You ask it to generate some data, wait for the callback, send the result via the signalling server, it has to go somewhere, and then you’re kinda waiting until you get an event back saying somebody responded.

Learn the transaction flow of WebRTC: which messages are needed, and when they’re generated and sent. A Google Image search for “webrtc flow” will help. Here is a nice one and here are the official ones.
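If a diagram doesn’t stick, it can help to write the ordering down as a tiny state machine. The sketch below is a simplified model, not the real API: CallFlow and its method names are my own invention, the state names follow the WebRTC signalling states (stable, have-local-offer, have-remote-offer), and it ignores provisional answers, rollback and ICE candidates entirely.

```java
// Simplified model of offer/answer ordering. State names follow the
// WebRTC signalling states; provisional answers and rollback are left out.
enum FlowState { STABLE, HAVE_LOCAL_OFFER, HAVE_REMOTE_OFFER }

class CallFlow {
    private FlowState state = FlowState.STABLE;
    private boolean negotiated = false;

    FlowState state() { return state; }
    boolean negotiated() { return negotiated; }

    // Caller: created an offer and applied it locally; next, send it via signalling.
    void localOffer() { move(FlowState.STABLE, FlowState.HAVE_LOCAL_OFFER); }

    // Callee: an offer arrived via signalling and was applied as the remote description.
    void remoteOffer() { move(FlowState.STABLE, FlowState.HAVE_REMOTE_OFFER); }

    // Callee: created an answer and applied it locally; next, send it back.
    void localAnswer() {
        move(FlowState.HAVE_REMOTE_OFFER, FlowState.STABLE);
        negotiated = true;
    }

    // Caller: the answer arrived and was applied as the remote description.
    void remoteAnswer() {
        move(FlowState.HAVE_LOCAL_OFFER, FlowState.STABLE);
        negotiated = true;
    }

    // Reject any step that arrives out of order.
    private void move(FlowState expected, FlowState next) {
        if (state != expected) {
            throw new IllegalStateException("expected " + expected + " but was " + state);
        }
        state = next;
    }
}

class CallFlowDemo {
    public static void main(String[] args) {
        CallFlow caller = new CallFlow();
        CallFlow callee = new CallFlow();

        caller.localOffer();    // caller -> signalling -> callee
        callee.remoteOffer();
        callee.localAnswer();   // callee -> signalling -> caller
        caller.remoteAnswer();

        System.out.println(caller.negotiated() && callee.negotiated());
    }
}
```

Calling a step out of order (an answer before any offer, say) throws. That is roughly the kind of mistake you will be debugging until the flow is second nature.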

8. Don’t cache or keep any data

WebRTC clients generate a bunch of messages (offers, answers, candidates). Don’t cache these or keep them; they belong to one negotiation and go stale quickly. It’s tempting to think “I’ll keep these messages on the signalling server and push them when the next client connects”. Don’t do it. Have a signal to say “Hey, you’re here, I’m there, let’s talk!” and start the transaction.
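As a sketch of what that looks like on the signalling side, here is a hypothetical room that forwards messages only to clients who are connected right now and drops everything else. LiveRoom and its methods are names I made up for illustration, and the actual pushing of a message down a socket is left as a comment.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical signalling room that forwards messages but never stores them.
class LiveRoom {
    private final Set<String> present = new LinkedHashSet<>();

    // Returns the peers already in the room, so each can be told
    // "you're here, I'm there, let's talk" and start a fresh transaction.
    List<String> join(String id) {
        List<String> others = new ArrayList<>(present);
        present.add(id);
        return others;
    }

    void leave(String id) {
        present.remove(id);
    }

    // Forward only if the recipient is connected right now.
    boolean forward(String to, String message) {
        if (!present.contains(to)) {
            return false; // dropped, not queued for later
        }
        // A real server would push `message` down the recipient's connection here.
        return true;
    }
}

class LiveRoomDemo {
    public static void main(String[] args) {
        LiveRoom room = new LiveRoom();
        room.join("adam");

        // Beth is not here yet, so Adam's early offer is simply dropped.
        System.out.println(room.forward("beth", "offer"));

        // Beth arrives and learns Adam is present; now a fresh offer can flow.
        List<String> peers = room.join("beth");
        System.out.println(peers);
        System.out.println(room.forward("adam", "offer"));
    }
}
```

The only state the room keeps is who is present. Offers, answers and candidates pass straight through or get thrown away.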

I hope this helped you get some clarity on WebRTC. I recently worked on a project with it, so find me on social media or email me if you think I can help with yours.