Taking Javascript into the field:

a case study

FieldDB is a 100% Javascript stack for collaboratively collecting data. It was designed to run offline on multiple platforms (Chrome extension and Android). In this talk we will discuss the hurdles along the way to building a data-heavy offline app.

Road Map

We will briefly go over our architecture, which is all open sourced on GitHub. Then we go more deeply into two key libraries and how they fit into our stack: PouchDB alpha, a Javascript implementation of the NoSQL database CouchDB, for storing large amounts of data offline, and Require.js, for simplifying development and deploying to different targets.

Case study: FieldDB

What do we mean by Offline?

Most Javascript apps are not offline; they're online, in constant contact with a web server. Others are online and can go offline for a period of time by living out of a cache. In our case we are talking about completely offline. Files are delivered in the first download of the app, and they persist (along with their storage) even after the user has uninstalled the app (in the case of a Chrome extension, or Android). Sure, the app has a few buttons which make REST calls to a web service, but only when the user wants to connect.

Our app is designed to run on a couple of clients: a Chrome extension, an Android tablet and, in a limited capacity, a website. On the server side there are four web services (Authentication, AudioSplicing/Transcription, Search/MachineLearning and Database) plus a web server; four of the five are Node.js and one is CouchDB itself.

We are using Backbone, Handlebars, Require.js to manage dependencies, and PouchDB, which wraps IndexedDB. In this talk we are going to look at two key libraries and their pitfalls, which we think might help you guys out if you have a similar stack.

Apache CouchDB is a database that uses JSON for documents, JavaScript for MapReduce queries and regular HTTP for an API.

Why CouchDB?

NoSQL
Not all NoSQL is created equal. We love the flexibility of Couch; we were using MongoDB for our users and found changing the User schema frustrating...

Why CouchDB?

MapReduce
Search functions are written in Javascript using the MapReduce programming paradigm. That's handy for semi-structured data, and handy for extracting training sets or running machine learning algorithms on the data :)
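
As a rough illustration of the paradigm (not one of FieldDB's actual views), a map function emits a key and a value for each document it cares about, and a reduce function combines the values emitted under each key. The document fields and the `runView` helper below are made up for this sketch, and the reduce signature is simplified compared to CouchDB's.

```javascript
// Hypothetical documents; the field names are invented for this sketch.
var docs = [
  { _id: "1", utterance: "hello world", tags: ["greeting"] },
  { _id: "2", utterance: "goodbye", tags: ["farewell"] },
  { _id: "3", utterance: "hi there", tags: ["greeting"] }
];

// Minimal in-memory stand-in for a view engine: it calls map on each doc,
// groups the emitted values by key, then hands each group to reduce.
function runView(docs, map, reduce) {
  var groups = {};
  var emit = function (key, value) {
    (groups[key] = groups[key] || []).push(value);
  };
  docs.forEach(function (doc) {
    map(doc, emit);
  });
  var result = {};
  Object.keys(groups).forEach(function (key) {
    result[key] = reduce(groups[key]);
  });
  return result;
}

// Map: one row per tag. In CouchDB emit is provided by the database;
// here it is passed in explicitly to keep the sketch self-contained.
function mapByTag(doc, emit) {
  (doc.tags || []).forEach(function (tag) {
    emit(tag, 1);
  });
}

// Reduce: count the rows emitted for each key.
function count(values) {
  return values.length;
}

var tagCounts = runView(docs, mapByTag, count);
console.log(tagCounts);
```

Because documents without a `tags` field are simply skipped by the map function, the same view keeps working when the data is only semi-structured.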

Why CouchDB?

Versioned
CouchDB is versioned, which is super handy when you have collaborative editing, as you can all imagine...
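
To make the collaborative editing point concrete, here is a minimal in-memory sketch of revision checking. It only mimics the idea that an update must carry the revision it was based on, otherwise it is rejected as a conflict. The `VersionedStore` class is hypothetical, and the plain counter used for `_rev` is a simplification of CouchDB's real revision strings.

```javascript
// In-memory stand-in for a versioned store; not CouchDB's real implementation.
function VersionedStore() {
  this.docs = {};
}

// A save succeeds only if the caller's _rev matches the stored one,
// so an edit based on a stale copy is reported as a conflict.
VersionedStore.prototype.put = function (doc) {
  var current = this.docs[doc._id];
  if (current && current._rev !== doc._rev) {
    return { ok: false, error: "conflict" };
  }
  var generation = current ? parseInt(current._rev, 10) + 1 : 1;
  var saved = Object.assign({}, doc, { _rev: String(generation) });
  this.docs[doc._id] = saved;
  return { ok: true, rev: saved._rev };
};

var store = new VersionedStore();
var first = store.put({ _id: "datum1", utterance: "hello" });

// Two collaborators both start from the first revision.
var alice = store.put({ _id: "datum1", _rev: first.rev, utterance: "hello!" });
var bob = store.put({ _id: "datum1", _rev: first.rev, utterance: "hullo" });

console.log(alice.ok, bob.ok, bob.error);
```

The second collaborator's save is refused rather than silently overwriting the first, which is the behaviour you want when several people edit the same record.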

Why CouchDB?

After about a month and a half we realized how pre-alpha Pouch was. We spent a week on something, and came back on Monday to find it suddenly working because the Pouch guys had seen us starting to figure out their code.

PouchDB Modes

IndexedDB came out last December in Chrome, so it's fresh. There was a point in June when we knew our app wouldn't work in September because the APIs were changing.
IndexedDB
A tip: You can see your docs in IndexedDB/PouchDB by looking in the Resources tab of Chrome dev tools.

PouchDB Modes

We needed HTTPS because Chrome manifest v2 requires contacting only HTTPS services (in their fight to get rid of malicious Chrome extensions etc). So we added it and then Dale brought it over to Pouch.

Vanilla Backbone Models (url or urlRoot property)

We expect most of you have done or seen some Backbone or a similar framework, so we won't bore you with the details. Suffice it to say that you add a property and it says which table/API your data is headed to.

Using backbone-pouchdb (pouch property)

To bring in pouch you simply call a function, which returns a function that uses the IndexedDB you want.
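
The "function that returns a function" idea can be sketched as below. This is not the backbone-pouchdb API: `makeSync` and the in-memory `databases` object are hypothetical stand-ins, and only the `sync(method, model, options)` shape is borrowed from Backbone.

```javascript
// In-memory stand-ins for named databases; a real app would use IndexedDB via PouchDB.
var databases = {};

// Hypothetical factory: given a database name, return a Backbone-style
// sync(method, model, options) function bound to that database.
function makeSync(dbName) {
  var db = databases[dbName] = databases[dbName] || {};
  return function sync(method, model, options) {
    options = options || {};
    var result;
    if (method === "create" || method === "update") {
      db[model.id] = model.attributes;
      result = model.attributes;
    } else if (method === "read") {
      result = db[model.id];
    } else if (method === "delete") {
      delete db[model.id];
      result = null;
    }
    if (options.success) {
      options.success(result);
    }
    return result;
  };
}

// A model-like object only needs an id and attributes for this sketch.
var datum = { id: "datum1", attributes: { utterance: "hello" } };
var syncToCorpus = makeSync("corpus-a");
syncToCorpus("create", datum);

var loaded = syncToCorpus("read", { id: "datum1" });
console.log(loaded.utterance);
```

Each call to the factory closes over one database, so different models can point at different local databases without sharing any global state.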

The eval problem

By far the most exciting hurdle in getting an alpha library to run in a Chrome extension was the eval problem. Eval is evil, but sometimes a necessary evil. This is a line in Pouch which takes in your Map function (so you can search in your database), which is stored as a doc (i.e. a string) in your database, and evaluates it into code so it can be called. The best part is that Chrome extensions can no longer call eval. So what good is a database if you can't search it? And because the eval has to be called in the scope where the emit function is defined, avoiding the eval requires a redesign of PouchDB. So here is our hack...

Our hack

  1. Hardcode the Map and Reduce functions where emit is in scope
  2. Compare the desired function against the hardcoded versions
  3. Assign the hardcoded version to fun.map
We put together an object containing the functions needed in our database on the client side (we have some that are just for web services, training etc.). We do a string match between these functions and the function in the doc, and call the one that matches. No eval problem, but it's still a hack and it requires putting code into the PouchDB library.
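
The three steps above can be sketched roughly as follows. This is not the code we put into PouchDB: the view names, the document fields and the helpers `makeKnownViews`, `normalize` and `resolveMap` are all invented for illustration.

```javascript
// Step 1: hardcode the map functions in a scope where emit is available.
// The view names and document fields are invented for this sketch.
function makeKnownViews(emit) {
  return {
    byUtterance: function (doc) {
      if (doc.utterance) {
        emit(doc.utterance, doc._id);
      }
    },
    byDate: function (doc) {
      if (doc.dateEntered) {
        emit(doc.dateEntered, doc._id);
      }
    }
  };
}

// Ignore whitespace differences when comparing function source text.
function normalize(source) {
  return String(source).replace(/\s+/g, "");
}

// Step 2: compare the stored string against each hardcoded function.
// Returns null when the requested view is not one that was shipped.
function resolveMap(storedSource, knownViews) {
  var wanted = normalize(storedSource);
  for (var name in knownViews) {
    if (normalize(knownViews[name]) === wanted) {
      return knownViews[name];
    }
  }
  return null;
}

var rows = [];
var emit = function (key, value) {
  rows.push({ key: key, value: value });
};
var views = makeKnownViews(emit);

// The map function as it would be stored in a doc, i.e. as a string.
var stored = "function(doc) { if (doc.utterance) { emit(doc.utterance, doc._id); } }";

// Step 3: assign the matching hardcoded function instead of calling eval.
var fun = { map: resolveMap(stored, views) };
fun.map({ _id: "datum1", utterance: "hello" });
console.log(rows);
```

The obvious limitation is that any map function not in the hardcoded set resolves to nothing, so every new client-side view means shipping a new build.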

Our hack

Here are the 10 lines of code we had to bury deep on line 2308 of pouch.alpha.js

RequireJS is a JavaScript file and module loader. It is optimized for in-browser use, but it can be used in other JavaScript environments, like Rhino and Node.
We now turn to modularity and code reuse. Using a script loader like RequireJS drastically reduced the complexity of developing and deploying our codebase for multiple clients.

Why Require.js?

Require.js makes for cleaner classes. If you're used to developing in Java, having Require makes you feel right at home. Require.js calculates the files your class depends on and includes them for you. Require can also 'build' your classes into a minified JS file for shipping down the wire, whether on page load for a website or when the user downloads your app from the Android market. Similarly, Require helps with building a subset of the codebase, e.g. an activity feed widget which your users can embed in their WordPress, website etc.
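
To show what "calculates the files your class depends on" means in practice, here is a tiny stand-in loader. It is not RequireJS: `defineModule` and `loadModule` are hypothetical names that only mirror the pattern of declaring a module with its dependency list and a factory, and the module names are invented.

```javascript
// Tiny stand-in for a module loader, so the sketch runs without RequireJS.
var registry = {};
var cache = {};

// Register a module: its name, the names it depends on, and a factory.
function defineModule(name, deps, factory) {
  registry[name] = { deps: deps, factory: factory };
}

// Resolve a module by first resolving everything it depends on.
// Each factory runs once; later requests get the cached result.
function loadModule(name) {
  if (!(name in cache)) {
    var entry = registry[name];
    if (!entry) {
      throw new Error("Unknown module: " + name);
    }
    var args = entry.deps.map(function (dep) {
      return loadModule(dep);
    });
    cache[name] = entry.factory.apply(null, args);
  }
  return cache[name];
}

defineModule("datum/DatumField", [], function () {
  return function DatumField(label, value) {
    this.label = label;
    this.value = value;
  };
});

// Datum declares its dependency by name and receives it as an argument.
defineModule("datum/Datum", ["datum/DatumField"], function (DatumField) {
  return function Datum(utterance) {
    this.fields = [new DatumField("utterance", utterance)];
  };
});

var Datum = loadModule("datum/Datum");
var datum = new Datum("hello");
console.log(datum.fields[0].label + ": " + datum.fields[0].value);
```

Asking for one class pulls in everything it needs, which is the part that feels like Java imports. The sketch does not handle circular dependencies.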

Getting libraries (Backbone, Underscore, jQuery) to play nice with Require.js ...

As far as we can see, there's a clear political divide between two camps of Javascript developers. Here's what we can take away from it: one camp thinks you should always build, and therefore don't need a module loader in development, and the other camp doesn't. Libraries from one camp don't (or didn't) play nice with libraries from the other. Specifically, getting Underscore, Backbone and jQuery to play nice with Require.js has been a give-up point for a lot of devs. We don't know much about the politics, but there was a release or two where Underscore and Backbone were AMD compliant, and then support was removed for some reason: "As of version 1.3.0 Underscore removed AMD (RequireJS) support." http://stackoverflow.com/questions/8131265/loading-backbone-and-underscore-using-requirejs

Enter: shims (May 28, 2012)

The solution is in fact simple, at least if you were Googling after May 28th, when James Burke, the maintainer of Require, added shims. Since then we've had no trouble building fancy new build and testing configurations. We are now at over 14 libraries and 70 classes with no loading problems. https://github.com/jrburke/requirejs/wiki/Upgrading-to-RequireJS-2.0
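
For anyone who hasn't seen one, a shim configuration looks roughly like this. It is a minimal sketch rather than our real config: the `libs/...` paths are placeholders, and the guard at the end is only there so the snippet does nothing when RequireJS isn't loaded.

```javascript
// Shim configuration for libraries that do not register themselves as AMD modules.
// The paths are placeholders; point them at wherever the libraries live.
var config = {
  paths: {
    jquery: "libs/jquery",
    underscore: "libs/underscore",
    backbone: "libs/backbone"
  },
  shim: {
    // Underscore attaches itself to the global "_".
    underscore: { exports: "_" },
    // Backbone needs Underscore and jQuery loaded first, then exposes "Backbone".
    backbone: {
      deps: ["underscore", "jquery"],
      exports: "Backbone"
    }
  }
};

// Only apply the config when RequireJS is actually present.
if (typeof requirejs !== "undefined" && typeof requirejs.config === "function") {
  requirejs.config(config);
}
```

The `deps` list is what fixes the load-order problems: the loader knows it must fetch Underscore and jQuery before it evaluates Backbone.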

What it really looks like...

Don't like toy examples? Me neither... here is what our require really looks like...

Build for Hackability

No, seriously. Deploying with a built-in Javascript console is a feature for our power users. They can automate data cleaning, iterate through their data, use breakpoints and more...
Hackability
We are power-user friendly, and because write permissions are double-checked on the server side, we let users have full access to their data...
// Print the "mask" of the utterance field for every datum in the current data list.
var datums = appView.currentPaginatedDataListDatumsView.collection.models;

datums.forEach(function (datum) {
  console.log(datum.get("datumFields").where({label: "utterance"})[0].get("mask"));
});

Build for Speed

Build for Android

Super small, for download from the market over 3G

Build Read Only Models

Reduce the hackability of your models...

Build Embeddable Widgets

auth, 3user, corpus->datalist, datum

Offline is tricky, but possible

Here's our take-home. We are glad we built with offline in mind from the beginning: we developed in a Chrome extension, with no server to hit. Debugging in Chrome extensions gives us almost everything we need: breakpoints, running code in a context and more. Finally, if a library is in active development, take the time to read the code. You'll be glad you did, and you can contribute back to the library.
Acknowledgements
  • PouchDB
  • Our clients
  • Our interns
We'd like to thank Dale and Joe, the PouchDB guys, without whom we would have had to do the whole db logic client side ourselves. We'd also like to thank our clients for letting us keep our codebase open source, and finally our interns, who went from logic class to BDD without a drop of fear...
Questions?
Thanks! The client codebase and web services are all on GitHub. We also have a dev blog where our interns created screencasts and board casts of what they learned while developing the app.