Learning About Dat Protocol and Decentralization

July 18, 2018


This past Sunday, July 15, I got to attend a skillshare hosted by the Distributed Web of Care initiative. This session included a workshop on Black Feminist art criticism, spearheaded by Jessica Lynne, and a workshop on accessibility, spearheaded by Shannon Finnegan. For the last portion, Callil Capuozzo gave a brief 20-minute introduction to using the Dat protocol, so I'm going to try to quickly summarize what I've begun learning from that workshop here.

So the (very) quick and dirty summary of the web in its current form is that most information is shared in a top-down, hierarchical fashion from server to client. Things like applications are hosted on remote servers (the cloud), which lets the browser act as just an interface to data that is held elsewhere. And we generally access this data using the HTTP protocol. This works well for a lot of things and is generally a successful model, but it is not without its downsides. Among them is the potential lack of ownership of, or privacy over, data that is held elsewhere. And for things that are inherently user to user, like sending an email or sending a file, we involve a third party (the server).

A Slide from Paul Frazee's Peer2Peer Web Talk

The Dat protocol can be thought of as an alternative to the HTTP protocol that works on a decentralized web to circumvent this problem of server centralization. Decentralization pops up in many forms, notably WebRTC, blockchain, and the IPFS protocol.

Dat, in layman's terms, is like if Git and BitTorrent had a child.

The abstract of the whitepaper for Dat has the following:

Dat is a protocol designed for syncing folders of data, even if they are large or changing constantly. Dat uses a cryptographically secure register of changes to prove that the requested data version is distributed. A byte range of any file's version can be efficiently streamed from a Dat repository over a network connection. Consumers can choose to fully or partially replicate the contents of a remote Dat repository, and can also subscribe to live changes. To ensure writer and reader privacy, Dat uses public key cryptography to encrypt network traffic. A group of Dat clients can connect to each other to form a public or private decentralized network to exchange data between each other. A reference implementation is provided in JavaScript.

Dat files are distributed peer to peer: networked users contribute bandwidth by rehosting files, and anyone can publish from their local machine. Like BitTorrent, the more people who are networked to host a file, the larger the network swarm and the more the bandwidth load is shared among the swarm, which gives anyone downloading more sources (in effect, more servers) to receive bits of the same dataset from.
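To put rough numbers on that load sharing, here is a back-of-the-envelope sketch. It assumes an idealized swarm where every host uploads an equal share, which real swarms won't do exactly, and the `share_per_host` helper is a made-up name for illustration:

```shell
# Idealized assumption: every host uploads an equal share of the data.
# share_per_host is a hypothetical helper, not part of dat.
share_per_host() {
  total_bytes=$1
  hosts=$2
  # Round up so the shares always cover the whole dataset.
  echo $(( (total_bytes + hosts - 1) / hosts ))
}

# A 62-byte dataset served by 1, 2, and 3 hosts.
for n in 1 2 3; do
  echo "$n host(s): about $(share_per_host 62 "$n") B each"
done
```

With one host, that host serves all 62 bytes; with three, each serves roughly a third.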

Anyone can get set up with dat very quickly:

If you have node and npm already installed, installing dat is fairly simple - run the below in your Terminal to install dat on your system:
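A minimal sketch of that install step, assuming the global npm install described in the Dat command line tool's own documentation. The `install_dat` wrapper is a hypothetical name, there only so the check for npm runs first:

```shell
# install_dat is a hypothetical wrapper; the actual install is the npm line.
install_dat() {
  # Stop early if npm is not available on this machine.
  if ! command -v npm >/dev/null 2>&1; then
    echo "npm not found; install Node.js first" >&2
    return 1
  fi
  # Install the dat command line tool globally.
  npm install -g dat
}
```

Run `install_dat` (or just the `npm install -g dat` line on its own) and the `dat` command should then be available in your Terminal.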

Then you'll want to share a file. So let's say I want to share a folder of all my poems with my friend Brad. Really quickly, from my Terminal let me make a folder of .txt files.

$ mkdir great_poetry
$ cd great_poetry
$ touch deeppoem.txt profoundpoem.txt

And let's pretend I've populated the text files with deep and profound text, and I want to share them now. With dat, all I have to do is enter 'dat share' from within the directory.

$ dat share
dat v13.11.3
Created new dat in /Users/jarretbryan/great_poetry/.dat
dat://71978b78f44efe013e3d34642bd8e4efb7b40e1a83202408ecaa06d5fef4357e
Sharing dat: 2 files (62 B)
0 connections | Download 0 B/s Upload 0 B/s
Watching for file updates
Ctrl+C to Exit

And that will initialize a dat repository for me. Essentially, dat share has created a set of hidden metadata files, inside the .dat folder named in the output, which maintain the key and the version history - essentially a record of whether the files have changed. It will simultaneously have my computer act as a mini server to host the files.

If files are changed at the source, the data updates for everyone rehosting this particular document or set of documents, but the previous versions of the data are not maintained - only the fact that the data was changed - which keeps dat dynamic without overburdening the users. Only the source user can edit the files, but any networked user can copy the files, edit them, and then rehost them under a different key.

The important bit here now is the line

dat://71978b78f44efe013e3d34642bd8e4efb7b40e1a83202408ecaa06d5fef4357e 

This is the key (a public key, written out in hex) that points directly to the files. If Brad wants to see my poetry, all he has to do is run in his terminal -

dat clone dat://71978b78f44efe013e3d34642bd8e4efb7b40e1a83202408ecaa06d5fef4357e 

and he will receive the files directly from my hosting computer, and also simultaneously begin rehosting them. Now we share bandwidth. Now let's say our friend Alex has heard all the fuss about these profound poems and also wants to receive the files. All Alex has to do is run the same 'dat clone' command, and she will receive the files from both Brad and me, as long as we both are hosting. She then will also begin rehosting the files. And if our friend Kurt wants access to the files as well, the process repeats, and he will receive bits of the data from Brad, Alex, and me. So the process scales with the users - the larger the network swarm, the more the bandwidth is shared.
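Since Brad, Alex, and Kurt all need the exact same link, it can be worth sanity-checking a pasted link before cloning. This is just a sketch: it assumes a link looks like the one above ('dat://' followed by 64 lowercase hex characters), and `is_dat_link` is a made-up helper name, not a dat command. Real clients may accept other forms too.

```shell
# is_dat_link is a hypothetical helper, not part of the dat CLI.
# Succeeds only for "dat://" followed by exactly 64 lowercase hex characters.
is_dat_link() {
  printf '%s\n' "$1" | grep -Eq '^dat://[0-9a-f]{64}$'
}

link="dat://71978b78f44efe013e3d34642bd8e4efb7b40e1a83202408ecaa06d5fef4357e"
if is_dat_link "$link"; then
  echo "looks like a dat link"
else
  echo "not a dat link"
fi
```

A link that got cut off in a chat message or lost a character while copying would fail this check before anyone tries to clone it.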

We can see in my block of output above, which reflects the hosting, that there are 0 connections, but that number will reflect the number of users currently connected.

At no point in the process are the files hosted on a centralized server - they are distributed among the networked users. And as long as someone is hosting the files, they will be accessible - so I, as the original host, can stop hosting for a bit, and in theory the networked files stay available, even if they don't reflect changes that I've since made locally.

The example I've given is fairly simple, using just .txt files, but imagine if I instead hosted .html files. I could host an entire website, or conceivably an entire application, via a decentralized web.
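As a rough sketch of what that could look like, the snippet below sets up a one-page site in a folder. The folder name `my_dat_site` and the page contents are made up for illustration, and viewing the result over dat:// assumes a dat-capable browser such as Beaker:

```shell
# Hypothetical folder name and page contents, for illustration only.
site_dir="my_dat_site"
mkdir -p "$site_dir"

# Write a minimal home page into the folder.
cat > "$site_dir/index.html" <<'EOF'
<!DOCTYPE html>
<html>
  <head><title>Great Poetry</title></head>
  <body><h1>Deep and profound poems</h1></body>
</html>
EOF

# Sharing then works the same way as for the .txt files:
#   cd my_dat_site
#   dat share
ls "$site_dir"
```

The sharing step is left as a comment because 'dat share' keeps running until you exit with Ctrl+C.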

