Building a WebRTC app – LIVE!

TIM PANTON: I’m Tim Panton. And this is Alex. I do stuff. You know, WebRTC stuff mostly,
but I started out on the web. And I’m really liking
the idea of getting back into web development and
escaping from telephony. So that’s the kind of
sub-theme for this talk. And Alex will tell
you who he is. ALEX THOMAS: I’m Alex Thomas. I first got swept
into the WebRTC world by creating my own startup. I wanted to create audio
chat for gamers and teams in the web browser. And like earlier speakers said,
it was virtually impossible. The price cap was way too huge. And I stumbled upon a company
called XirSys and started using them. And it was a very
natural fit for me to actually get a job there. And we provide STUN and TURN
servers, globally scalable. You can bring any API into us. And it’s great if you think
the STUN and TURN server area is just a land of dragons. TIM PANTON: So I
want to talk really about how to build a WebRTC app. And that comes into two chunks. The first one I want
to absolutely emphasize is what not to do. I can’t say this enough, right? Do not– do not– replicate
the PSTN functionality. There’s really no point. And specifically,
because it devalues both. So the value in WebRTC
is around its context. It’s around the finely
tuned UX that you’re providing to your web users
that you know a lot about. You know where they are. You know their friends. You know all of this stuff. And you tailor a UX to meet
their goals of the thing that they’re trying to achieve. And the PSTN is
exactly the other way. It’s a generic mechanism
that is nonspecific, but it’s ubiquitous. And the great
thing about PSTN is you can still charge
by the minute. Now, what happens
when you combine the two is you lose
all of those things. WebRTC stops being specific. It stops being a tailored UI. PSTN, you can’t
charge by the minute anymore, because
people don’t expect to be charged on the web. And hey, also, you lose
the ubiquitousness. So in general, it’s like
the worst of both worlds. And I can say this, because
I ran a startup doing it. And it wasn’t a huge success. And actually there
was– So having said all that,
just kind of for laughs, we’re going to do it anyway. So what we have here
is the first demo. Now let me just make
sure we have audio. Yes, OK. So what we have here is
a framework I worked on and I still kind of
have warm feelings for that allows you
to basically build– and I could do with a slightly
bigger font on that one. I’ll come back to it. It allows you to build
apps that call out into the PSTN and some
other things using WebRTC. And just to give you a
sense of how easy it is, I’m running those four
bits of JavaScript. And now, we can
make a phone call. Hopefully. [DIALING] PHONE OPERATOR: Hold please
while we connect your call. Thank you for calling
1-800-PetMeds, America’s largest pet pharmacy. If you’ve placed an order with
us before, please press 1. All other callers,
please press 2. TIM PANTON: Then you can
press 2 and all that stuff. But the interesting thing
is if we hang in there for a little longer, she will
tell you to go to the website. ALEX THOMAS: I would like
to speak to a person. TIM PANTON: So my point there
is that her thing is actually to drive you back
to the web, right? So even if you do make a PSTN
call with your WebRTC thing, the first thing you’re going
to hit is somebody saying, hey, go back to the web. So that is a
serious issue, and I think you can’t
underestimate it. And now I need to
get my screen back. There we go. So having annoyed all the
PSTN people who are watching, let’s build something to
actually delight somebody. Let’s build something
that actually does do something kind
of fun and useful. And so what we
thought we’d do was to build something that
would at least delight a pet. And pets are kind of–
you know, a lot of them are really social animals. And I must tell
you, so this parrot that we may call later
if the gods are with us. First time I met this
parrot, it gave me a really seriously hard time. I came into the
room with my friend, and this parrot started dive
bombing me and squawking at me, right? And so my friend
says, shake hands. And I’m thinking, well, I know
this guy from like five years. We’ve been on a desert
island together. We know each other pretty well. We’re past the
shaking hands thing. So I look at him really puzzled. And he says, no,
really, shake hands. So we shake hands, and
the parrot watches. And immediately, the
parrot’s seen the social cue. The parrot knows
that I’m now welcome, and therefore, it calms down. It still keeps a bit
of a beady eye on me, but really, it
stops hassling me. So they are social animals, and
they understand visual cues. So hey, video for
them sounds fine. And actually, they do also
do– they like phone calls. Parrots will talk to
you over the phone. So we thought we’d build YoPet. So having said that, we thought
we’d probably better show you how you build an app. AUDIENCE: [INAUDIBLE]. ALEX THOMAS: Sorry? AUDIENCE: Watch it. Someone’s going
to want to invest. TIM PANTON: Yeah, a seed round. [LAUGHTER] TIM PANTON: So just to
give you a sense of how easy this stuff is,
really, kind of ish, this is all the app
is going to do, right? All we have there is a
couple of video elements. And well, that’s almost it. The only thing we do then
is to add some scripting. jQuery because I’m lazy. Hoodie, Hoodie’s a
really cute framework. Hoodie’s a really nice
framework for throwing up apps really quickly. It does all of the kind
of sign in, sign out, consistency across
multiple devices for you, synchronization across multiple
devices, all that stuff. And I’m going to ruthlessly
abuse that synchronization as a way of doing
signaling in WebRTC. And adapter.js just gets
rid of some stuff that is slightly different
between Firefox and Chrome. So that’s a thing that
the Google guys support. And then there’s assets.js. But first I’ll just show you
the pet side of the thing. It’s essentially the same
thing, except that, in general, pets don’t like
seeing themselves. Like most of them
are really pretty territorial about other
animals of their same species. So there is only
one video element, and that’s me as the owner. We don’t show the
pet themselves. So here we go. Pretty
much the same code. And then we get into the
kind of complicated stuff. This is how you move
stuff around in WebRTC. We’re initializing
Hoodie down here. We’re adding some
signing in as test, and that allows lazy
stuff and status stuff. And it’s pretty much
a standard web app. There’s nothing exciting there. A little bit of cleaning up here
that we remove anything that’s loafing around the
database when we start up. We set an image just so that we
know that the thing’s loaded. You know that we’ve
got to that point. And now here’s the
real magic here. We run a WebRTC
PeerConnection. So we fire one of those up. And then once we’ve fired one
up, we then run getUserMedia. And getUserMedia
accesses the camera. Having been given
permission to the camera, we then get given a stream. We connect the stream to the
video element we saw earlier. We then notify the
PeerConnection of the stream. Now, I have to take
some responsibility for the complexity
of this because there was an argument about whether
PeerConnection and getUserMedia should know about
stuff under the hood or whether the JavaScript
developer should be responsible for
passing things from A to B and that there should
be– that should be the only kind of
connection point. And my view was that it was
the JavaScript developer’s responsibility, and
we– well, that’s how it turned out
in the standard. Whether that was
right or not, I think there’s still people
who may disagree. But hey, it’s fine. It works. So having done that, we
create an offer. And then once the offer is
created, this is the magic. We push it into
the Hoodie store. Now, Hoodie does this wonderful
thing, which on the pet side, you can say that I’m
listening to offers. And when Hoodie sees a
new offer in the store, you get this callback called. And we can then do
something with it. We create a remote description. We stuff it into
the PeerConnection, and we clean up the
Hoodie database. So this whole thing
kind of goes that way. And then we do the same thing
the other way with the answer. And then we do the whole
thing with the candidates. So we move around. I think in this case
we probably move around 20 messages between Hoodie
to get this call set up. And in the end, with any luck,
you get to talk to your pet. So let’s see if we can try that. Now, I need to get
back to my browser. So I have Hoodie running, yes. Now Alex is going to be
the pet for the moment. There’s– no, actually, no. Let’s not go there. And I’m going to be the
owner, with any luck. So here’s the owner. The user experience
here is really simple. That’s a graphical
representation of me on a bad day. And this is a graphical
representation of the pet. And then with any luck,
I get to call the pet. Does it come up? There we go. And we get to meet the pet. So it’s no big deal. But hey, we’ve done
that in 150 lines of JavaScript
maximum, complete app. So we looked at that. Oh! But actually, pets do get
lonely at home, right? If you leave your pet all day,
it’s good to call your pet. It’s good to make
sure– and we figured what you could do is you could
take an old Android phone, like maybe you were given
at Google I/O two years ago, strap it to your pet’s cage, and
then leave it in answer mode. And you could just call the
pet any time you feel like it. And I should point out that
the user experience is not reversible. The pet cannot call you at work. [LAUGHTER] So we’ve built YoPet. But actually, it turns out that
it’s a bit of a pain in the neck, all that. It doesn’t scale very well. We’re using Hoodie
really in an ugly way. And as somebody else
said, the timing gets messed up if you’re not
just right next to each other on the same LAN. So actually, it turns
out to be easier to use a framework
in a lot of cases. And there are a lot
of them out there. And they simplify the coding. They help you keep up with
changing browser versions. And some of them
provide nice features. So we rewrote it. Can you just ping Harvin and
make sure we’ve got a live pet? ALEX THOMAS: He’s ready. TIM PANTON: OK, cool. So having said that–
and we’re going to get rid of that pet call. And I knew what the agenda is. So if we look at the code
for using a framework, so we basically rewrote
this thing slightly shorter. So we used HTM instead of HTML,
because it’s more compact. And so we now have actually
pretty much the same app. But I folded the JavaScript
in, because by this point there was barely any
point in having two files. And the difference being
that I’ve loaded this. Now, this is a framework that’s
in closed beta at the moment. But my friends were
kind enough to allow me to use it for this demo. And I kind of like it. It’s got some interesting
little minor features, which I don’t know if they’re unique
or not, but I quite like them. So user stuff for the framework,
you sign in with an app ID, create yourself a client. And you get notified
when you get connected. But the most interesting
aspect of this is that within an
application, their API allows you to
assert an identity. So I can say, in this
case, I’m the pet. And in the owner matching file,
we say that we’re the owner. And that allows us to
just assert an identity. And it could be Fred
or Bill or whatever you want to– your
pet’s name or whatever. But basically you can
assert that identity without having to do any kind
of other infrastructure, which I’m finding actually
pretty convenient. And there’s pretty
much nothing else. It’s pretty simple code. All we’re doing is
calling the thing. So in fact, let’s try that. Assuming I can get my– this
is the thing I always– There we go. So now we need to
find a browser. So I was taking this
really seriously, actually. I bought myself
the domain as well. So I’m the owner here. And there’s got to be a business
model here, so we need adverts. Hey, we’re in Google. So we have now a
cruddy user interface. And is Harvin– he’s ready. OK, so he’s running it? ALEX THOMAS: I believe so. TIM PANTON: Cool. Well, with any luck– now,
I must ask you to be quiet. I don’t want to
scare this parrot. I’ve only just got to the
point where it trusts me, and I don’t want to scare it. So if we do connect
to the parrot– Hi Gallagher. Oh, it needs some audio. So somewhere in there,
there’s a parrot. Where’s he gone? Hey Gallagher! How you doing? Hey, Gallagher. How’s it? Do dee, do dee, do dee do. Do dee, do dee, do dee do. Hi Gallagher, how you doing? Yay! [CHIRPING] TIM PANTON: Do dee,
do dee, do dee do. Do dee, do dee, do dee do. Normally I can get Gallagher
to dance, but not today. Bye! Bye, bye, Gallagher. Right. So there you go. [APPLAUSE] Thank Harvin for me. So actually, the next
version of this app will have the ability to
push tunes to the far end, so that you can get
your pet to dance. So that’s an app of sorts. But that might not be
what you really want. You might not be
building a new app. You might not be seed
funding a startup. You might be wanting to build
it into an existing app. And you may be in
an organization where it’s not that
easy to get at the code base of the existing app that
you’re trying to support. Well, let me just say,
what you need to do is make sure the identity
stuff lines up correctly. You want to pick
a framework that fits the app, that uses the
same kind of things like jQuery and whatever. And add some magic. Do something interesting
for goodness sake. So what we’re going to
do now– and this is just like stupidly risky. Actually, so stupidly risky I’ve
forgotten how to– there we go. And I must say, I
have to give credit. Luis– I saw this demo
Luis did in Atlanta, and it was just
fantastically cool. So I talked to him
afterwards, and I asked him if I could steal a
couple of ideas from it. He said absolutely,
but give me credit. And I said, I am doing. And he said, I’m after a job. He’s just graduated in Chicago. He’s looking for a job. So if you like what I do
here, give Luis a job. What we’re doing
here is we’re going to add voice to Atlassian. I’m sorry, guys. I know you’ve got
HipChat, but this is something slightly different. So what we do with
this is– now, I’m going to live
code this into Chrome. But actually it’s equivalent
to running a Chrome extension. So everything I’m
doing here is something you can do as a
Chrome extension. They’re a little messier
to set up and deploy. So it didn’t seem like
that would be good theater. But it’s essentially
the same process. So what we’re going to do here
is you’re going to add– now, this is cute. I didn’t know you could do
this until the other day, but you can actually inject
JavaScript files with a Chrome console into a live running
page, which is sweet. So we’re now loading
jQuery and our library. And having done that, we will
sign up with that library. Now, actually one
of us should mute, because this is going
to get noisy, OK? And now, if I– Alex is doing
the same thing on the screen you can’t see on the same page. So the concept here
is, if you’re both talking about an
issue, like you’re both worried about an issue
and you need to discuss it, why not use the issue
page as the place where you have the, effectively,
conference call? So oops. That doesn’t look good. What have I done wrong? What have I done wrong? So one person say that louder. AUDIENCE: [INAUDIBLE]. TIM PANTON: Oh. OK. Thank you. You’re right. Because it’s up here. I’m really, really
glad somebody’s watching what I’m doing. Thank you. So if we set by name, my name. And then with any luck– yes. Good. Excellent. Now the important thing
there is, and I should– I missed that. And that was a
really good point. Because what I’ve done is,
up here, somewhere deep in the DOM, it
knows my name, OK? So the app already
knows my name. So what I’ve done is I’ve
pulled it out with that query and pushed it into
the framework. So now the framework
knows who I am. Now, Alex on the
other side is going to pull that out
and make a call. And we are now in a call. So we’re now having a chat. But you know, actually that’s
a bit unsatisfactory, really, because the whole
point about issues is that you’re recording
what it is that you’re doing and keeping
it as a record. So you know, hey,
why don’t we do that? So if I speak now, with any luck
it goes into my GitHub issue whilst I’m talking to Alex. So I’m now recording
this conversation. And it’s going into the issue. And Alex is– if he spoke,
it would go into his. And then we can combine them. Now there’s a whole
lot of stuff you would add to make
this a real app. But my point is that you
can do this in the browser. This is all happening
in the browser. This isn’t something
the framework’s doing. As you saw, this is whatever
that is– six lines of code to kick off the Google Chrome
recognizer with a web API. So there you go. We’re inserting text into a
web page that I don’t own. Somebody from Atlassian is
responsible for this page, not me. So I suppose the point I’m
trying to get over here is we can do stuff to existing
web properties and existing contexts. And Serge mentioned
WebRTC for dogs. I did actually
whilst– after you said that, I hacked
something together. It turns out that some
dogs are colorblind. So I thought we could rewrite
the pet experience so that we’d be much more in
sympathy with the dogs and do that in black and white. So I did that. We’ll just very briefly
show you that, and then I’m out of your hair. And I’m conscious
that I’m standing between you and the questions
and also between you and beer, neither of which I
really want to do. So what am I doing? I’m going to YoPet. So I’m going to go to YoPet dog. So I’m now going to
log in as dog on YoPet. And now from my– and I
want to point this out. This is a Firefox
OS phone, right? Now if this works, I’m just
going to be really happy. Because I hacked it together
in the– there we go. So we have a sepia-toned
[INAUDIBLE]. [APPLAUSE] And just as a final
thing, that was done using the Seriously.js
library, which applies a WebGL filter to the
incoming WebRTC call. So again, it’s a few
lines of JavaScript to do something
really– well, that used to be big news in the TV world. Anyway, but that’s us. Alex has been an
immensely good sport with putting up
with this nonsense. And I want to thank
him very much for that. [APPLAUSE]

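
Editor's sketch 2: signaling through a shared store. The talk pushes the offer into the Hoodie store, has the pet side listen for new offers, stuffs each one into its PeerConnection, cleans up the database, and then does the same in the other direction for the answer and for the candidates. The sketch below shows that message flow under loud assumptions: `SignalStore` is a hypothetical in-memory stand-in and is not Hoodie's API, and the type names such as `'offer'` and `'owner-candidate'` are invented here. The peer connection calls (`createOffer`, `setLocalDescription`, `setRemoteDescription`, `createAnswer`, `addIceCandidate`, `onicecandidate`) are the standard ones.

```javascript
// Hypothetical stand-in for a synchronised store: adding an item of some
// type calls every listener registered for that type.
class SignalStore {
  constructor() {
    this.items = [];
    this.listeners = new Map();
    this.nextId = 1;
  }
  on(type, callback) {
    const list = this.listeners.get(type) || [];
    list.push(callback);
    this.listeners.set(type, list);
  }
  add(type, payload) {
    const item = { id: this.nextId++, type, payload };
    this.items.push(item);
    for (const callback of this.listeners.get(type) || []) callback(item);
    return item;
  }
  remove(id) {
    this.items = this.items.filter((item) => item.id !== id);
  }
}

// Owner side: create an offer, keep it as the local description, publish it.
async function sendOffer(pc, store) {
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  store.add('offer', { type: offer.type, sdp: offer.sdp });
}

// Pet side: apply each incoming offer, reply with an answer, clean up.
function listenForOffers(pc, store) {
  store.on('offer', async (item) => {
    await pc.setRemoteDescription(item.payload);
    const answer = await pc.createAnswer();
    await pc.setLocalDescription(answer);
    store.remove(item.id);
    store.add('answer', { type: answer.type, sdp: answer.sdp });
  });
}

// Owner side: apply the answer when it arrives, then clean up.
function listenForAnswers(pc, store) {
  store.on('answer', async (item) => {
    await pc.setRemoteDescription(item.payload);
    store.remove(item.id);
  });
}

// Both sides: publish our ICE candidates and apply the other side's.
function relayCandidates(pc, store, sendType, receiveType) {
  pc.onicecandidate = (event) => {
    // A null candidate marks the end of gathering, so there is nothing to send.
    if (event.candidate) store.add(sendType, event.candidate.toJSON());
  };
  store.on(receiveType, async (item) => {
    await pc.addIceCandidate(item.payload);
    store.remove(item.id);
  });
}
```

The owner would call `relayCandidates(pc, store, 'owner-candidate', 'pet-candidate')` and the pet the mirror image. A real app would also need to hold back candidates that arrive before the remote description has been set, which is one place where the timing problems mentioned in the talk can show up once the two ends are not on the same LAN.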

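
Editor's sketch 3: kicking off the browser's speech recognizer. The talk says it takes about six lines of code to start the Chrome recognizer through a web API and that the recognized text is inserted into the issue page. The sketch below is one way that could look, not the code from the demo. It assumes Chrome's `webkitSpeechRecognition` constructor is what gets passed in as `Recognition`; the function name `startDictation` and the `onPhrase` callback are hypothetical.

```javascript
// Start a speech recognizer and hand each finished phrase to a callback.
// `Recognition` is the recognizer constructor (in Chrome, assumed to be
// window.webkitSpeechRecognition). `onPhrase` receives the recognized text.
function startDictation(Recognition, onPhrase) {
  const recognizer = new Recognition();
  recognizer.continuous = true;      // keep listening across pauses
  recognizer.interimResults = false; // do not ask for partial guesses
  recognizer.onresult = (event) => {
    // Only results from resultIndex onward are new in this event.
    for (let i = event.resultIndex; i < event.results.length; i++) {
      const result = event.results[i];
      // result[0] is the top alternative; skip anything not yet final.
      if (result.isFinal) onPhrase(result[0].transcript.trim());
    }
  };
  recognizer.start();
  return recognizer;
}
```

In the console of a page this might be used as `startDictation(window.webkitSpeechRecognition, (text) => { box.value += text + ' '; })`, where `box` is whichever comment field the page already has. Finding that field is specific to the page being modified and is left out here.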
  1. source code or it didn't happen. also if you didn't type it live, you didn't make it live. Impressive though.

  2. Why all this developers have such poor presentation skill? They should organize the presentation well, so the audience can follow the idea better. For technical detail, I would say please show it in the step-by-step way.
