Archive for December, 2008

CouchDB: Improving the interval API

Sunday, December 28th, 2008

I posted this to the couchdb-dev mailinglist but so far it didn’t arrive. So I store it here

While writing something about using CouchDB I came across the issue of “slice indexes” (called startkey and endkey in CouchDB lingo).

I found no exact definition of startkey and endkey anywhere in the documentation. Testing reveals that access on _all_docs and on views documents are retuned in the interval

[startkey, endkey] = (startkey <= k <= endkey).

I don’t know if this was a conscious design decision. But I like to promote a slightly different interpretation (and thus API change):

[startkey, endkey[ = (startkey <= k < endkey).

Both approaches are valid and used in the real world. Ruby uses the inclusive ("right-closed" in math speak) first approach:

>> l = [1,2,3,4]
>> l.slice(1,2)
=> [2, 3]

Python uses the exclusive (”right-open” in math speak) second approach:

>>> l = [1,2,3,4]
>>> l[1:2]
[2]

For array indices both work fine and which one to prefer is mostly an issue of habit. In spoken language both approaches are used: “Have the Software done until saturday” probably means right-open to the client and right-closed to the coder.

But if you are working with keys that are more than array indexes, then right-open is much easier to handle. That is because you have to *guess* the biggest value you want to get. The Wiki at http://wiki.apache.org/couchdb/View_collation contains an example of that problem:

It is suggested that you use
startkey=”_design/”&endkey=”_design/ZZZZZZZZZ”
or
startkey=”_design/”&endkey=”_design/\u9999″
to get a list of all design documents

This breaks if a design document is named “ZZZZZZZZZTop” or “\9999Iñtërnâtiônàlizætiøn”. Such names might be unlikely but we are computer scientists; “unlikely” is a bad approach to software engineering.

The think what we really want to ask CouchDB is to “get all documents with keys starting with ‘_design/’”.

This is basically impossible to do with right-closed intervals. We could use startkey=”_design/”&endkey=”_design0″ (’0′ is the ASCII character after ‘/’) and this will work fine … until there is actually a document with the key “_design0″ in the system. Unlikely, but …

To make selection by intervals reliable currently clients have to guess the last key (the ZZZZ approach) or use the fist key not to include (the _design0 approach) and then post process the result to remove the last element returned if it exactly matches the given endkey value.

If couchdb would change to a right-open interval approach post processing would go away in most cases. See http://blogs.23.nu/c0re/2008/12/building-a-track-and-trace-application-with-couchdb/ for two real world examples.

At least for string keys and float keys changing the meaning to [startkey, endkey[ would allow selections like

* “all strings starting with ‘abc’”
* all numbers between 10.5 and 11

It also would hopefully break not to much existing code. Since the notion of endkey seems to be already considered “fishy” (see the ZZZZZ approach) most code seems to try to avoid that issue. For example ’startkey=”_design/”&endkey=”_design/ZZZZZZZZZ”‘ still would work unless you have a design document being named exactly “ZZZZZZZZZ”.

Building a Track and Trace Application with CouchDB

Sunday, December 28th, 2008

Background

In my compay we use a logistics application called huLOG. Part of huLOGs (award winning) functionality is to aggregate track and trace events from about two dozen sources. The events are schema free in there nature: some might contain a ZIP code, some may geographic coordinates attached, some relate to a certain packet or pallet (called “movable unit” in huLOG-spreak) some relate to a certain shipment (which is a group of movable units). Some have file attachments, e.g. Images of the packages, signatures proofing delivery or pictures of the number plates of the trucks.

My first approach was an SQL database. Later I coupled it with a self-designed Document store called DoDoStorage. For background on that project see here, here and here.

In summer 2007 we experienced severe performance problems with DoDoStorage. I started a rewrite in Erlang and came across Damien Katz and his then obscure CouchDB project. It was just in the transition from XML to JSON and Damion assured me that “it will not be production ready for an other two years”.

So we decided to solve the DoDoStorage speed issues with more hardware for the time being.

In Fall 2008 the landscape for CouchDB hat changed: it was now the hot thing in database technology and everybody’s darling. CouchDB 0.9 came around and it started to look usable for serious use. We hired Jan Lenhardt, one of the CouchDB core team to work with us on moving huLOG from PostgreSQL and DoDoStorage to CouchDB.

In December we started migrating services and data over to a CouchDB based system. While CouchDB has its wards we are very happy with it so far.

Data Model

As stated above tracking data comes in many different flavors ad colors. It might reference an MUI (”movable unit ID”), a shipment or both. Let’s concentrate on events referencing a MUI. Some typical tracking events might look like this:

{
   "_id": "01420000000378-20061005T064500.000000",
   "mui": "01420000000378",
   "shipment": 572,
   "message": "Processing, 0132-Vlotho, Route 0132, Code 101",
   "code": "410",
   "timestamp": "20061005T064500.000000",
   "facility": "DPD Depot 0132",
   "ort": "Vlotho (DE)",
   "plz": "32657"
},
{
   "_id": "01420000000378-20061005T085400.000000",
   "mui": "01420000000378",
   "shipment": 572,
   "message": "proof of delivery",
   "code": "421",
   "timestamp": "20061005T085400.000000",
   "plz": "32634",
   "_attachments": {
       "POD.pdf": {
           "stub": true,
           "content_type": "application/pdf",
           "length": 136446
       }
   }
}

One thing about my choice of keys (the _id field). I have choosen a “meaningful” ID over traditional “random” UUIDs. The IDs we use consist of MUI-Timestamp The Timestamp is the time the event was generated (e.g. the pallet was loaded). The good news is that this automatically keeps my database free of duplicates. If I import the same file with events twice, the events will have the same IDs during both imports and thus the second import will overwrite the first one: exactly what I want.

The bad news is that while physically there can’t really happen two events at once due to clock drift etc. I might get two different events for a MUI with exactly the same timestamp. This would result in the same key being generated for both events and thus the second event would overwrite the first one. With the relatively sparse populated space of timestamps I expect this to happen once in 10.000 events or so.

For huLOG it is acceptable to use one in 10.000 events – especially if it is the earlier one. We only get about 95% of the events we should get due to problems in the track and trace infrastructure of the freight companies. As long as the all important “has been handed to the customer” messages arrive, our system has to be able to handle missing messages.

Data Access by MUI

With that we just need a few view functions to work with the data. Most obvious we want to get all documents for a certain MUI.

This is easy since our document IDs already contain the MUI. We only have to get a list of all IDs starting with MUI. The CouchDB Document API already provides that functionality:

$ curl -s 'http://localhost:5984/hulog_events/_all_docs
  ?startkey=%22094147562251-0%22
  &endkey=%2209445147562253-9%22
  &include_docs=true' 

{"total_rows":184708,"offset":184705,"rows":[
{"id":"09445147562251-20081114T165600.000000",
 "key":"09445147562251-20081114T165600.000000",
 "value":{"rev":"463392510"},
 "doc": {..., "message":"Einrollung, 0147, Route 0280, Code 461 105",
        "mui":"09445147562251", "shipment":124524}},
{"id":"09445147562253-20081114T165700.000000",
 "key":"09445147562253-20081114T165700.000000",
 "value":{"rev":"2436256439"},
 "doc":  {..., "message":"Einrollung, 0280, Route 0234, Code 461 105",
          "mui":"09445147562253", "shipment":124524}}
]}

Some things to note: startkey and endkey must be set to valid JSON objects. We want a string containing the mui so we have to put “quotes” arround it. After URL-encoding we end up with %22094147562251%22.

The returned documents are within the interval [startkey:endkey]. If we choose a startkey which is guaranteed same or smaller than the key we want to retrieve. That’s easy: 09445147562251 is smaller than any string which has more characters and starts with 09445147562251. Endkey is somewhat more tricky. What is the biggest value? 09445147562252? This might catch one record to much, because there actually might be a key 09445147562252.

A common idiom used in the Erlang community is to use something like 09445147562251Z as the endkey. But then what is if there actually is a key 09445147562251Zabc? So use 09445147562251ZZZZZ, but …

An other suggestion is using a “high unicode character” like \u9999. But at least there is no highest unicode code point and you are well of in considering the sorting rules for obscure unicode characters as indetermistic. See the CouchDB Wiki for further information.

In our case all this is no problem. The MUI is followed by a ISO 8601 timestamp. So
startkey=094147562251-0, endkey=09445147562253-9 should work fine until the end of year 8999.

Data Access by Shipment

That was easy. Now we want to access data based on something which is not encoded in the document ID. Say the “shipment” number. For that we need a map function which gives us shipment numbers.

map: function(doc) {
    if(doc.shipment) {
        emit(doc.shipment, null);
    }
}

We permanently save this function in the server to allow CouchDB to play its clever optimization tricks. See the CouchDB wiki for more information on the HTTP View API. You could use the Futon GUI at http://localhost:5984/_utils/ for that or curl:

 curl -X PUT -H 'Content-Type: applicatioe": "javascript",
  "views": {"all":{"map":"function(doc){if(doc.shipment)
                          {emit(doc.shipment, null);}}"}}}
  ' http://localhost:5984/hulog_events/_design%2fsendung

{"ok":true,"id":"_design/sendung","rev":"23237918"}

Now we can query the view and with the startkey and endkey parameters. Here we get again into trouble for choosing the endkey. The hack we used for the IDs did work because there was a timestamp of well known format in the ID so we construct a “slightly bigger” value. Shipments are numeric. So what is bigger than 128996? 128997 is obviously bigger but would get us documents for two shipments (128996 and 128997). Since the shipments are numeric, a trick like “128996Z” does not work out.

But numeric comparing of numeric values in JavaScript work nicely with different numeric types. Shipments are integers. If we use a float key we nicely can place it to be bigger than 128996 but smaller than 128997. We use 128996.1 as our endkey.

$ curl 'http://localhost:5984/hulog_events/_view/shipment/all
  ?startkey=128996&endkey=128996.1'

{"total_rows":184707,"offset":184674,"rows":[
{"id":"09445122481336-20081223T105212.491567","key":128996,"value":null},
{"id":"09445122481336-20081223T155200.000000","key":128996,"value":null},
{"id":"09445122481336-20081223T202600.000000","key":128996,"value":null},
{"id":"09445122481337-20081223T105216.377305","key":128996,"value":null},
{"id":"09445122481337-20081223T155300.000000","key":128996,"value":null},
{"id":"09445122481337-20081223T202500.000000","key":128996,"value":null}
]}

You can also add &include_docs=true to the query – this will get you the complete documents, not only the IDs.

At this point we have a scalable, elegant datastore for our Track & trace related events ad files.

Usage

Currently we are running CouchDB with subset of our tracking archives. This subset is about a quarter million documents of wich 20% or so have attachments resulting in a database size of about 5 GB. No complains so far.

How map/reduce works in CouchDB

Saturday, December 27th, 2008

I have huge trouble how CouchDBs system of views actually works.

By experimenting and reading the source I came up with thisdescription in pseudo Python:

def mapstep(alldata):
    # the map is applied to every document
    # and the result is collected in two lists of rows
    k_rows = []
    v_rows = []
    for _id, doc in alldata:
       k, v = mapfun(doc) # actually mapfunc uses emit() not return()
       k_rows.append([k, _id])
       v_rows.append(v)
    return k_rows, v_rows

def reducestep(keys, values):
    # now several reduce steps follow. For this example
    # we randomly chose two
    # all even elements
    tmp1 = reducefun(k_rows[::2], v_rows u[::2], False)
    # all uneven elements
    tmp2 = reducefun(k_rows[1::2], v_rows u[1::2], False) 

    # finally several rereduce steps follow.
    # For this example we use only one.
    return reducefun(None, [tmp1, tmp2], True)

result = reducestep(mapstep(alldocs()))

If you call the view with group=true the map step stays the same, but the server applies grouping and calls the reduce step for each group. It looks like this:

def reduce_with_grouping(keys, values):
    gdict = {}
    # create dictionary mapping values to keys
    for k, v in zip(keys, values):
        gdict.setdefault(k, []).append(v)
    ret = []
    for k, values in gdict.items():
        ret.append([k, reducestep(k*len(values), values])
    return ret

result = reduce_with_grouping(mapstep(alldocs()))

If you experiment with views keep in mind that the the Futon Web-Client silently adds group=true to your views and that group=true is ignored if you don’t provide a reduce function.

CouchDB broke my Box (not)

Saturday, December 20th, 2008

I tried to see how much beating CouchDB can take. So I installed it on a modest box (1.8GHz, 512 MB RAM, Debian) and started at 12:00h pouring data in it. At about 22:00h I asked for the computation of a simple view while still dumping data into it. At that Time it contained about 400.000 Documents of with about 10 % contained an Attatchment. DB size was on Disk was about 8 GB.

And an hour later (now) I can’t reach the box anymore.