Exploring Data with Serverless and Vue: Automatically Update GitHub Files With Serverless Functions

October 10, 2017 0 Comments

Exploring Data with Serverless and Vue: Automatically Update GitHub Files With Serverless Functions



I work on a large team with amazing people like Simona Cotin, John Papa, Jessie Frazelle, Burke Holland, and Paige Bailey. We all speak a lot, as it's part of a developer advocate's job, and we're also frequently asked where we'll be speaking. For the most part, we each manage our own sites where we list all of this speaking, but that's not a very good experience for people trying to explore, so I made a demo that makes it easy to see who's speaking, at which conferences, when, with links to all of this information. Just for fun, I made use of three.js so that you can quickly visualize how many places we're all visiting.

You can check out the live demo here, or explore the code on GitHub.

In this tutorial, I'll run through how we set up the globe by making use of a Serverless function that gets geolocation data from Google for all of our speaking locations. I'll also run through how we're going to use Vuex (which is basically Vue's version of Redux) to store all of this data and output it to the table and globe, and how we'll use computed properties in Vue to make sorting through that table super performant and slick.

Article Series:

Recently I tweeted that "Serverless is an actually interesting thing with the most clickbaity title." I'm going to stand by that here and say that the first thing anyone will tell you is that serverless is a misnomer because you're actually still using servers. This is true. So why call it serverless? The promise of serverless is to spend less time setting up and maintaining a server. You're essentially letting the service handle maintenance and scaling for you, and you boil what you need down to functions that state: when this request comes in, run this code. For this reason, sometimes people refer to them as functions as a service, or FaaS.

Is this useful? You bet! I love not having to babysit a server when it's unnecessary, and the payment scales automatically as well, which means you're not paying for anything you're not using.

Is FaaS the right thing to use all the time? Eh, not exactly. It's really useful if you'd like to manage small executions. Serverless functions can retrieve data, they can send email notifications, they can even do things like crop images on the fly. But for anything where you have processes that might hold up resources or a ton of computation, being able to communicate with a server as you normally do might actually be more efficient.

Our demo here is a good example of something we'd want to use serverless for, though. We're mostly just maintaining and updating a single JSON file. We'll have all of our initial speaker data, and we need to get geolocation data from Google to create our globe. We can have it all work triggered with GitHub commits, too. Let's dig in.

Creating the Serverless Function

We're going to start with a big JSON file that I outputted from a spreadsheet of my coworker's speaking engagements. That file has everything I need in order to make the table, but for the globe I'm going to use this webgl-globe from Google data arts that I'll modify. You can see in the readme that eventually I'll format my data to extract the years, but I'll also need the latitude and longitude of every location we're visiting

var data = [ [ 'seriesA', [ latitude, longitude, magnitude, latitude, longitude, magnitude, ... ] ], [ 'seriesB', [ latitude, longitude, magnitude, latitude, longitude, magnitude, ... ] ] ];

Eventually, I'll also have to reduce the duplicated instances per year to make the magnitude, but we'll tackle that modification of our data within Vue in the second part of this series.

To get started, if you haven't already, create a free Azure trial account. Then go to the portal: preview.portal.azure.com

Inside, you'll see a sidebar that has a lot of options. At the top it will say new. Click that.

Showing the sidebar in Azure where you can create a new service

Next, we'll select function app from the list and fill in the new name of our function. This will give us some options. You can see that it will already pick up our resource group, subscription, and create a storage account. It will also use the location data from the resource group so, happily, it's pretty easy to populate, as you can see in the GIF below.

Create a new function

The defaults are probably pretty good for your needs. As you can see in the GIF above, it will autofill most of the fields just from the App name. You may want to change your location based on where most of your traffic is coming from, or from a midpoint (i.e. if you have a lot of traffic both in San Francisco and New York), it might be best to choose a location in the middle of the United States.

The hosting plan can be Consumption (the default) or App Service Plan. I choose Consumption because resources are added or subtracted dynamically, which the magic of this whole serverless thing. If you'd like a higher level of control or detail, you'd probably want the App Service plan, but keep in mind that this means you'll be manually scaling and adding resources, so it's extra work on your part.

Modify the function once it's created

You'll be taken to a screen that shows you a lot of information about your function. Check to see that everything is in order, and then click the functions plus sign on the sidebar.

Add in the actual function

From there you'll be able to pick a template, we're going to page down a bit and pick GitHub Webhook - JavaScript from the options given.

choose the github javascript template from the templates in the portal

Selecting this will bring you to a page with an `index.js` file. You'll be able to enter code if you like, but they give us some default code to run an initial test to see everything's working properly. Before we create our function, let's first test it out to see that everything looks ok.

New default template

We'll hit the save and run buttons at the top, and here's what we get back. You can see the output gives us a comment, we get a status of 200 OK in green, and we get some logs that validate our GitHub webhook successfully triggered.

testing the default output

Pretty nice! Now here's the fun part: let's write our own function.

Writing our First Serverless Function

In our case, we have the location data for all of the speeches, which we need for our table, but in order to make the JSON for our globe, we will need one more bit of data: we need latitude and longitude for all of the speaking events. The JSON file will be read by our Vuex central store, and we can pass out the parts that need to be read to each component.

The file that I used for the serverless function is stored in my github repo, you can explore the whole file here, but let's also walk through it a bit:

The first thing I'll mention is that I've populated these variables with config options for the purposes of this tutorial because I don't want to give you all my private info. I mean, it's great, we're friends and all, but I just met you.

// GitHub configuration is read from process.env let GH_USER = process.env.GH_USER; let GH_KEY = process.env.GH_KEY; let GH_REPO = process.env.GH_REPO; let GH_FILE = process.env.GH_FILE;

In a real world scenario, I could just drop in the data:

// GitHub configuration is read from process.env let GH_USER = sdras;

… and so on. In order to use these environment variables (in case you'd also like to store them and keep them private), you can use them like I did above, and go to your function in the dashboard. There you will see an area called Configured Features. Click application settings and you'll be taken to a page with a table where you can enter this information.

Working with our dataset

First, we'll retrieve the original JSON file from GitHub and decode/parse it. We're going to use a method that gets the file from a GitHub response and base64 encodes it (more information on that here).

module.exports = function(context, data) { // Make the context available globally gContext = context; getGithubJson(githubFilename(), (data, err) => { if (!err) { // No error; base64 decode and JSON parse the data from the Github response let content = JSON.parse( new Buffer(data.content, 'base64').toString('ascii') );

Then we'll retrieve the geo-information for each item in the original data, if it went well, we'll push it back up to GitHub, otherwise, it will error. We'll have two errors: one for a general error, and another for if we get a correct response but there is a geo error, so we can tell them apart. You'll note that we're using gContext.log to output to our portal console.

getGeo(makeIterator(content), (updatedContent, err) => { if (!err) { // we need to base64 encode the JSON to embed it into the PUT (dear god, why) let updatedContentB64 = new Buffer( JSON.stringify(updatedContent, null, 2) ).toString('base64'); let pushData = { path: GH_FILE, message: 'Looked up locations, beep boop.', content: updatedContentB64, sha: data.sha }; putGithubJson(githubFilename(), pushData, err => { context.log('All done!'); context.done(); }); } else { gContext.log('All done with get Geo error: ' + err); context.done(); } }); } else { gContext.log('All done with error: ' + err); context.done(); } }); };

Great! Now, given an array of entries (wrapped in an iterator), we'll walk over each of them and populate the latitude and longitude, using Google Maps API. Note that we also cache locations to try and save some API calls.

function getGeo(itr, cb) { let curr = itr.next(); if (curr.done) { // All done processing- pass the (now-populated) entries to the next callback cb(curr.data); return; } let location = curr.value.Location;

Now let's check the cache to see if we've already looked up this location:

 if (location in GEO_CACHE) { gContext.log( 'Cached ' + location + ' -> ' + GEO_CACHE[location].lat + ' ' + GEO_CACHE[location].long ); curr.value.Latitude = GEO_CACHE[location].lat; curr.value.Longitude = GEO_CACHE[location].long; getGeo(itr, cb); return; }

Then if there's nothing found in cache, we'll do a lookup and cache the result, or let ourselves know that we didn't find anything:

 getGoogleJson(location, (data, err) => { if (err) { gContext.log('Error on ' + location + ' :' + err); } else { if (data.results.length > 0) { let info = { lat: data.results[0].geometry.location.lat, long: data.results[0].geometry.location.lng }; GEO_CACHE[location] = info; curr.value.Latitude = info.lat; curr.value.Longitude = info.long; gContext.log(location + ' -> ' + info.lat + ' ' + info.long); } else { gContext.log( "Didn't find anything for " + location + ' ::' + JSON.stringify(data) ); } } setTimeout(() => getGeo(itr, cb), 1000); }); }

We've made use of some helper functions along the way that help get Google JSON, and get and put GitHub JSON.

Now if we run this function in the portal, we'll see our output:

Running the function and seeing the console: status 200 ok

It works! Our serverless function updates our JSON file with all of the new data. I really like that I can work with backend services without stepping outside of JavaScript, which is familiar to me. We need only git pull and we can use this file as the state in our Vuex central store. This will allow us to populate the table, which we'll tackle the next part of our series, and we'll also use that to update our globe. If you'd like to play around with a serverless function and see it in action for yourself, you can create one with a free trial account.

Article Series:

Tag cloud