We wanted to test the limits of chat integration and do something a little bit different. The teams at Pusher Chatkit and Chirp came together to create a progressive web app that uses data-over-sound to localize many devices to the same chat room with a single audible interaction.
Check out the app for yourself at https://chat.chirp.io.
A product collaboration between Chirp and Pusher Chatkit
The app uses Pusher Chatkit and the Chirp WebAssembly SDK in the browser to create chat rooms that can be shared exclusively via sound. Once a room has been created you can broadcast its address to other devices by means of an audible chirp. Any device within earshot will be immediately notified and invited into the chat.
Once an invite has been accepted, a socket connection is established between the receiving device and all other members of the room. Chatkit maintains these connections and handles all data transfer between members, including text, files, and typing and presence indicators.
This is an experimental application made with beta products. It has been tested on Safari, Firefox, Chrome, and Opera on the desktop, Safari on iOS, and Chrome on Android. Please report any bugs you find as issues on the repository.
Why did we build this?
Humans maintain thousands of threads of communication over the internet, but to connect one or more devices to each other we rely on some service to broker the connection; this is called signaling. For peers of any network to congregate, one peer needs to signal to the others where to be and when, much like me saying to the rest of my team at work “Meet at the Kings Head Pub around 6 pm”.
The challenge for us was to replicate this interaction digitally, to allow devices (that might have had no prior connection or knowledge of each other’s existence) to invite each other to congregate at a specific address with the intention of interacting in some way.
The traditional way of encoding an address like this (intended for syncing devices) is either a series of digits that a human enters into the device by hand, or a QR code that the device detects and decodes into those digits automatically. Either method can point devices to a shared location where they can interact freely.
But we weren’t satisfied with these solutions! We set out to use both of our employers’ products to solve this problem in a fun and novel way: in the browser, using data-over-sound and web sockets.
Using the Application
Once logged in with GitHub, a user is given an identity, which Chatkit uses to maintain a collection of room addresses. When microphone permissions have been granted, audio processing begins and the app listens continuously for any broadcasts.
Creating New Rooms
A room represents all activity that has happened or is happening at a given address. There are no room names; instead, each address is represented as a hex color in the UI. Once a room has been created, its address is chirped out loud for any other devices that might be listening.
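One way to picture the address-to-color mapping is to treat the low 24 bits of a numeric room ID as an RGB value. This is only a sketch under that assumption: `roomColor` is a hypothetical helper, and the app itself may derive its colors differently.

```javascript
// Hypothetical helper: map a numeric room ID to a CSS hex color.
// Assumption: only the low 24 bits are used, one byte per RGB channel.
const roomColor = id =>
  '#' + (id & 0xffffff).toString(16).padStart(6, '0')

console.log(roomColor(1352719)) // "#14a40f"
```

Because the color is a pure function of the ID, every device that hears the same chirp would render the room identically without exchanging any extra data.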
Broadcasting Existing Rooms
For others to join the room, they need the room address. To share it, you broadcast the address via a chirp. Any devices that are in audible range will be joined to the room automatically or prompted to join via the UI. The signal is designed to be robust over several meters, even in noisy environments!
All messages received are multiplexed into a single feed. Clicking on a message filters messages from all other rooms out of the feed and reveals the message input, from which the user can send text messages and attachments to room members.
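The feed behavior described above can be sketched with two small functions. The message shape (`roomId`, `text`, `createdAt`) and the names `multiplex` and `focusRoom` are assumptions made for illustration, not the app's actual data model.

```javascript
// Hypothetical message shape: { roomId, text, createdAt }.
// Merge the messages of every joined room into one feed, oldest first.
const multiplex = rooms =>
  rooms.flatMap(room => room.messages).sort((a, b) => a.createdAt - b.createdAt)

// Selecting a message narrows the feed to that message's room.
const focusRoom = (feed, selected) =>
  feed.filter(message => message.roomId === selected.roomId)

const rooms = [
  { messages: [
    { roomId: 1, text: 'hi', createdAt: 10 },
    { roomId: 1, text: 'bye', createdAt: 30 },
  ] },
  { messages: [{ roomId: 2, text: 'yo', createdAt: 20 }] },
]

const feed = multiplex(rooms)
console.log(feed.map(m => m.text))                     // [ 'hi', 'yo', 'bye' ]
console.log(focusRoom(feed, feed[1]).map(m => m.text)) // [ 'yo' ]
```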
We had a good idea of what needed to be built but that is not to say we didn’t come across hurdles on the way. The core components included:
Below we explain a little bit about some of the steps we took to build the app.
Porting Chirp Core to WebAssembly
Typically, software written for native and embedded platforms cannot be run in the browser, but recent advancements in browser support for WebAssembly have made it more realistic to port and run existing low-level codebases on the web. The Chirp core is written in C and already has many native SDKs (Python, Obj-C, Swift, and Java), so it stood out as a candidate for running in the browser.
Chirp core requires audio input and output to function. Previously this kind of data was not readily available in the browser, but now that the MediaStream and Web Audio APIs are widely supported, microphone data is more accessible and easier to process client-side on most mobile and desktop browsers.
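To give a feel for how microphone data becomes available to client-side code, here is a minimal sketch using the standard `getUserMedia` and Web Audio `AnalyserNode` APIs. It only measures the input level; it is not the Chirp SDK's own audio pipeline, and `listen`, `rms` and `onLevel` are hypothetical names. A real decoder would consume the same sample blocks instead of computing a level.

```javascript
// Root-mean-square level of one block of audio samples (values in [-1, 1]).
const rms = samples =>
  Math.sqrt(samples.reduce((sum, s) => sum + s * s, 0) / samples.length)

// Browser-only: ask for the microphone and report its level on every frame.
// `onLevel` is a hypothetical callback supplied by the caller.
async function listen(onLevel) {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true })
  const context = new AudioContext()
  const analyser = context.createAnalyser()
  context.createMediaStreamSource(stream).connect(analyser)

  const samples = new Float32Array(analyser.fftSize)
  const tick = () => {
    analyser.getFloatTimeDomainData(samples) // copy the latest waveform
    onLevel(rms(samples))
    requestAnimationFrame(tick)
  }
  tick()
}
```

In a page, calling `listen(level => console.log(level))` from a click handler would trigger the permission prompt; most browsers only start an `AudioContext` after a user gesture.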
Creating Local Connections
The expectation for proximity-based connectivity is growing, but there is still no universal solution. Pairing your phone with another device remains an unpredictable process that depends on the hardware sharing a common language, a protocol, and a low-level medium to communicate over. Traditionally this is done using a wireless standard such as Bluetooth or infrared.
Chirp uses sound to transfer data, which means it can be transmitted by any device with a speaker and received by any device with a microphone, with no device pairing necessary. Encoding a message in sound allows one-to-many connections to be made in a single interaction because sound, by its very nature, is omnidirectional.
[ header, message, error ]
There are a plethora of use cases here but we liked the idea of being able to instantly invite members of a team, classroom or social event to tune in to the same digital frequency.
Once a set of peers share knowledge of a common location, they can start communicating. You could do this over sound too in theory, but it would be somewhat impractical. The transfer rate of a standard chirp is ~130 bps, which is around 16 characters a second. Communicating anything more than a short text message at this rate would take a long time, not to mention how noisy it would become if everyone was doing it!
A lot of the heavy lifting here is done by Chatkit. The Node SDK is used to create users, triggered by a GitHub authentication success callback. Once a user exists on the system they can start joining rooms, and their joined rooms are stored and can be retrieved the next time the user loads the app.
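The arithmetic behind that estimate is easy to check. The sketch below assumes one byte per character and ignores any header or error-correction overhead, so real transmissions would be somewhat slower; the constant and function names are illustrative only.

```javascript
const BITS_PER_SECOND = 130 // approximate rate quoted above
const BITS_PER_CHAR = 8     // assumption: one byte per character

const charsPerSecond = BITS_PER_SECOND / BITS_PER_CHAR
const secondsToSend = text => (text.length * BITS_PER_CHAR) / BITS_PER_SECOND

console.log(charsPerSecond)                 // 16.25
console.log(secondsToSend('a'.repeat(280))) // roughly 17.2 seconds
```

A 4-byte room address, by contrast, needs only about a quarter of a second of payload at this rate, which is why sound suits the invitation and not the conversation.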
Optimized Message Payload
To send the data in as little time as possible, the room identifier is sent as a 32-bit integer rather than as a string. As a chirp payload is just an array of bytes, bit shifting and masking can be used to pack the identifier into 4 bytes, as opposed to the 7 bytes its decimal string needs in the example below. A similar technique is used when subnetting IPv4 networks.
const roomId = 1352719 // 7 bytes when sent as the string "1352719"
// roomId === 0b00000000 00010100 10100100 00001111
// Pack a 32-bit unsigned integer into 4 bytes, most significant byte first
const encode = v => new Uint8Array([v >>> 24, (v >> 16) & 0xff, (v >> 8) & 0xff, v & 0xff])
// Rebuild the integer from the 4 bytes; >>> 0 keeps the result unsigned
const decode = a => ((a[0] << 24) | (a[1] << 16) | (a[2] << 8) | a[3]) >>> 0
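To see the saving concretely, the sketch below compares the byte length of the room ID sent as text with the packed form, and checks that the value survives a round trip. `packRoomId` and `unpackRoomId` are hypothetical helpers written for this comparison, so the block runs on its own.

```javascript
// Hypothetical helpers: 32-bit unsigned integer <-> 4 bytes, big-endian.
const packRoomId = id =>
  Uint8Array.of(id >>> 24, (id >>> 16) & 0xff, (id >>> 8) & 0xff, id & 0xff)
const unpackRoomId = bytes =>
  ((bytes[0] << 24) | (bytes[1] << 16) | (bytes[2] << 8) | bytes[3]) >>> 0

const id = 1352719
const asText = new TextEncoder().encode(String(id)) // one byte per digit
const packed = packRoomId(id)

console.log(asText.length, packed.length) // 7 4
console.log(Array.from(packed))           // [ 0, 20, 164, 15 ]
console.log(unpackRoomId(packed) === id)  // true
```

The final `>>> 0` matters for IDs at or above 2^31: without it, JavaScript's signed 32-bit bitwise operators would return a negative number.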
We created a nice-looking app that we think demonstrates a novel way of connecting peers that previously had no knowledge of each other, by directing them to a common address where they can interact in real time with all other connected peers.
The implementation is not without its flaws (we are taking advantage of some bleeding-edge betas) but we will try to iron out inconsistencies over time. Please do let us know if you notice anything broken or, even better, make a pull request on the repo https://github.com/lukejacksonn/chirpchat.
Chirp is free to use for personal and commercial projects up to 10k monthly active users. Documentation for the SDK can be found here. To use the Chirp WebAssembly SDK in your own web app, you can use the hosted version.