Archive for the ‘couchdb’ Category

Running a CouchDB cluster on Amazon EC2

Sunday, December 20th, 2009

CouchDB is a nearly zero-configuration multi-master document oriented database. It is a awsome product build by an awsome team.

So far I have been using CouchDB like we would have used any other modern Document Datastore: in a centraized fashion. One Server at our premises. For backup purposes we replicated on a second couchdb instance running on our backup server.

Hosting about 300 GB of data a small 2.6 GHz Server with consumer-grade disks we started seeing preformance issues. Also we see latency issues since we are hosting some application at Amazon EC2 “in the cloud” which
results an an addiotional 40 ms delay for all queries to our locally hosted server.

So this is the right time to use some more of CouchDBs capabilities and spin up additional instances on demand at Amazon EC2. I assume you have already set up an Amazon EC2 account and are comfortable with the general concepts.

There are some tutorials out there which threat EC2 like a regular hosting provider. This is a seriously misguided approach. If you don’t use EC2 in a way that you always can loose one or two instances, you are using it wrong. If you are not spinning up servers in a way that it takes the same time to set up one instance than it takes to set up 10 instances you are using it wrong.

To use EC2 as it meant to be used, we need automation. We will use puppet in this example.

I assume that you have installed a “puppetmaster” on a machine called puppet.example.com. I also assume the puppet configuration on the puppetmaster is at /etc/puppet. On BSD ist might be located instead at /usr/local/etc/puppet. Place the following content at /etc/puppet/files/etc/couchdb/local.ini:

[couchdb]
database_dir = /mnt/couchdb
view_index_dir = /mnt/couchdb

[httpd]
bind_address = 0.0.0.0

[couch_httpd_auth]
require_valid_user=true

[admins]
admin = sekrit

This ensures that only clients which authenticate as user “admin” with the password “sekrit” are allowed to access the server. You might want to change “sekrit” to something more suble.

Add /etc/puppet/fileserver.conf to make sure the local.ini file can be moved the clients:

[files]
  path /etc/puppet/files
  allow *

Then add /etc/puppet/manifests/site.pp to allow automatic installation and configuration:

class couchserver {
  package { "couchdb": ensure => latest }
  package { "python-couchdb": ensure => installed }
  group { "couchdb": ensure => present }
  user { "couchdb": ensure => present, groups => "couchdb",
    comment => "CouchDB Administrator",
    home => "/mnt/couchdb" }
  file { "/etc/couchdb": ensure => directory,
    owner  => couchdb, group  => couchdb,
    mode   => 755 }
  file { "/mnt/couchdb": ensure => directory,
    owner  => couchdb, group  => couchdb,
    mode   => 700 }
  file {"local.ini":
    mode => 774,
    owner => couchdb, group => couchdb,
    path => "/etc/couchdb/local.ini",
    source =>
      "puppet://puppet.exmple.com/files/etc/couchdb/local.ini"
  }
  service { couchdb:
    ensure    => running,
    subscribe => [Package[couchdb],
                  File["local.ini"],
                  File["/mnt/couchdb"]]
}}

node "PLACEHOLDER" {
    include couchserver
}

Now we have to create a Amazon “security group” to firewall our CouchDB servers. Since I like the belt and suspenders way of doing things we not only will use HTTP-Auth in CouchDB but also firewall rules. You have to have the EC2 commandline tools installed. I assume your comapny has a public IP range at 17.18.19.0/24.

$ ec2-add-group couchserver -d 'couchdb server'
$ ec2-authorize couchserver -P tcp -p 5984 -s 17.18.19.0/24

Next step is starting a EC2 instance. We use a Small Ubuntu 9.10 AMI since it comes with a decent version of CouchDB. We then log in and install Puppet.

$ ec2-run-instances ami-a62a01d2 --key YOUR_EC2_SSH_KEY \
  --instance-type m1.small --region eu-west-1 \
  --group default --group couchdb
INSTANCE       i-ec985e9b   ...
$ sleep 120
# get the id from the output of ec2-run-instances
$ ec2-describe-instances i-ec985e9b
INSTANCE       i-ec985e9b      79.125.56.43   10.227.94.80
# get the ip from the output of ec2-describe-instances
$ ssh -i ~/.ssh/YOUR_EC2_SSH_KEY ubuntu@79.125.56.43
# on the EC2 instance:
$ sudo apt-get update -y
$ sudo apt-get install -y puppet
$ puppetd --test --server puppet.example.com

This will result in a Error message about certificates. The puppet client requested a certificate and you have to sign this certificate at the puppet server. There is still some room for automatation. Log into the puppetmaster and list the signature requests with puppetca -l. You’ see the name of your newly created instance. Sign that name by using puppetca -s:

root@puppet:~# puppetca -l
ip-10-20-30-40.eu-west-1.compute.internal
root@puppet:~# puppetca -s ip-10-20-30-40.compute.internal
Signed ip-10-20-30-40.eu-west-1.compute.internal
root@puppet:~# perl -npe 's//ip-10-20-30-40.compute.internal/;' \
   -i.bak /etc/puppet/manifests/site.pp

The last line automatically edits /etc/puppet/manifests/site.pp to contian configuration information for the new instance. That’s all there is to do on the puppet master.

Now back on the new instance you can make puppet configure your CouchDB by typing puppetd --test --server puppet.example.com. This should install CouchDB and configure it to use the “big” 140 GB disk of your instance and to require password authentication.

You can test if CouchDB is up, running and secured by using cURL:

$ curl http://127.0.0.1:5984
{"error":"unauthorized","reason":"Authentication required."}

This is the point in time where we can start replication from our internal, behind-the-firewall CouchDB to the new box running at Amazon. Since there are some issues regarding commandline tools and authentication I created a patched version of python-couchdb at GitHub. Download it from here to a machine in your internal network, untar it and change in the couchddb-python directory. Then initiate replication:

$ PYTHONPATH=. python ./couchdb/tools/manual_replication.py \
  --source=http://couchdb.internal.example.com:5984 \
  --target=http://admin:sekrit@79.125.56.43:5984/ --push \
  --continuous

After this ran, set up permanent two-way replication between the two Servers:

$ PYTHONPATH=. python ./couchdb/tools/manual_replication.py \
  --source=http://couchdb.internal.example.com:5984 \
  --target=http://admin:sekrit@79.125.56.43:5984/ --push
$ PYTHONPATH=. python ./couchdb/tools/manual_replication.py \
  --source=http://admin:sekrit@79.125.56.43:5984/ \
  --target=http://couchdb.internal.example.com:5984 \
  --continuous

Basicaly that’s it. We are still missing a few bit’s and pices to get full automation, but we are nearly there. and for a cluster you probably want more than one CouchDB instance running at Amazon.

Simple “full text” search with CouchDB

Monday, January 19th, 2009

To have a shiny application you need domain specific search. E.g. if our call center wants to enter a new order, they might not have the customer number ready. So they need a snappy way to get the customer number based on name, city or whatever.

We did experiment with lot’s of LIKE queries to our In our legacy ERP database system. This didn’t feel good, had some SQL injection vulnerabilities and required lot’s of full table scans.

Looking for alternatives we decided to use CouchDB. There is some work on full text indexing for CouchDB you can build something much more simple yourself.

Once a day we copy all customer data from the legacy system into CouchDB. Then we use the map function to emit a line for each word in each data field of each document. It looks like this:

function(doc) {
    function output(value) {
        // Split into search terms
        if(value && (value != "-") && (value.length > 2)) {
            emit(value, 1);
            for(var word in value.split(" ")) {
                if(word && (word != "-") && (word.length > 2)) {
                    emit(word, 1);
                }
            }
        }
    }
    output(doc.kundennr);
    output(doc.name1);
    output(doc.name2);
    output(doc.ort);
    output(doc.land + "-" + doc.plz);
}

This basically generates a view (index) containing every word and the document it occurs in.

You now can use that for a prefix based search in a function like this:

from couchdb.client import *

def finde_kundendaten(searchstring):
    server = Server('http://couchdb.local.hudora.biz:5984/')
    db = server['kunden']
    rows = []
    while len(rows) < 1 and len(searchstring) > 2:
        rows = db.view('suche/alle_felder', startkey=searchstring, limit=25)
        if rows:
            break
        searchsting = searchstring[:-1]
    return [(x.id, x.key) for x in rows]
>>> finde_kundendaten("Sport Dornseif") # no exact match in the DB
[(u'51320', u'Sport Alm SysIntersport'),
 (u'27094', u'Sport Freizeit'),
 (u'31071', u'SPORT FREIZEIT  TREFF'),
...]

Nifty!

CouchDB: Improving the interval API

Sunday, December 28th, 2008

I posted this to the couchdb-dev mailinglist but so far it didn’t arrive. So I store it here

While writing something about using CouchDB I came across the issue of “slice indexes” (called startkey and endkey in CouchDB lingo).

I found no exact definition of startkey and endkey anywhere in the documentation. Testing reveals that access on _all_docs and on views documents are retuned in the interval

[startkey, endkey] = (startkey <= k <= endkey).

I don’t know if this was a conscious design decision. But I like to promote a slightly different interpretation (and thus API change):

[startkey, endkey[ = (startkey <= k < endkey).

Both approaches are valid and used in the real world. Ruby uses the inclusive ("right-closed" in math speak) first approach:

>> l = [1,2,3,4]
>> l.slice(1,2)
=> [2, 3]

Python uses the exclusive (”right-open” in math speak) second approach:

>>> l = [1,2,3,4]
>>> l[1:2]
[2]

For array indices both work fine and which one to prefer is mostly an issue of habit. In spoken language both approaches are used: “Have the Software done until saturday” probably means right-open to the client and right-closed to the coder.

But if you are working with keys that are more than array indexes, then right-open is much easier to handle. That is because you have to *guess* the biggest value you want to get. The Wiki at http://wiki.apache.org/couchdb/View_collation contains an example of that problem:

It is suggested that you use
startkey=”_design/”&endkey=”_design/ZZZZZZZZZ”
or
startkey=”_design/”&endkey=”_design/\u9999″
to get a list of all design documents

This breaks if a design document is named “ZZZZZZZZZTop” or “\9999Iñtërnâtiônàlizætiøn”. Such names might be unlikely but we are computer scientists; “unlikely” is a bad approach to software engineering.

The think what we really want to ask CouchDB is to “get all documents with keys starting with ‘_design/’”.

This is basically impossible to do with right-closed intervals. We could use startkey=”_design/”&endkey=”_design0″ (’0′ is the ASCII character after ‘/’) and this will work fine … until there is actually a document with the key “_design0″ in the system. Unlikely, but …

To make selection by intervals reliable currently clients have to guess the last key (the ZZZZ approach) or use the fist key not to include (the _design0 approach) and then post process the result to remove the last element returned if it exactly matches the given endkey value.

If couchdb would change to a right-open interval approach post processing would go away in most cases. See http://blogs.23.nu/c0re/2008/12/building-a-track-and-trace-application-with-couchdb/ for two real world examples.

At least for string keys and float keys changing the meaning to [startkey, endkey[ would allow selections like

* “all strings starting with ‘abc’”
* all numbers between 10.5 and 11

It also would hopefully break not to much existing code. Since the notion of endkey seems to be already considered “fishy” (see the ZZZZZ approach) most code seems to try to avoid that issue. For example ’startkey=”_design/”&endkey=”_design/ZZZZZZZZZ”‘ still would work unless you have a design document being named exactly “ZZZZZZZZZ”.

Building a Track and Trace Application with CouchDB

Sunday, December 28th, 2008

Background

In my compay we use a logistics application called huLOG. Part of huLOGs (award winning) functionality is to aggregate track and trace events from about two dozen sources. The events are schema free in there nature: some might contain a ZIP code, some may geographic coordinates attached, some relate to a certain packet or pallet (called “movable unit” in huLOG-spreak) some relate to a certain shipment (which is a group of movable units). Some have file attachments, e.g. Images of the packages, signatures proofing delivery or pictures of the number plates of the trucks.

My first approach was an SQL database. Later I coupled it with a self-designed Document store called DoDoStorage. For background on that project see here, here and here.

In summer 2007 we experienced severe performance problems with DoDoStorage. I started a rewrite in Erlang and came across Damien Katz and his then obscure CouchDB project. It was just in the transition from XML to JSON and Damion assured me that “it will not be production ready for an other two years”.

So we decided to solve the DoDoStorage speed issues with more hardware for the time being.

In Fall 2008 the landscape for CouchDB hat changed: it was now the hot thing in database technology and everybody’s darling. CouchDB 0.9 came around and it started to look usable for serious use. We hired Jan Lenhardt, one of the CouchDB core team to work with us on moving huLOG from PostgreSQL and DoDoStorage to CouchDB.

In December we started migrating services and data over to a CouchDB based system. While CouchDB has its wards we are very happy with it so far.

Data Model

As stated above tracking data comes in many different flavors ad colors. It might reference an MUI (”movable unit ID”), a shipment or both. Let’s concentrate on events referencing a MUI. Some typical tracking events might look like this:

{
   "_id": "01420000000378-20061005T064500.000000",
   "mui": "01420000000378",
   "shipment": 572,
   "message": "Processing, 0132-Vlotho, Route 0132, Code 101",
   "code": "410",
   "timestamp": "20061005T064500.000000",
   "facility": "DPD Depot 0132",
   "ort": "Vlotho (DE)",
   "plz": "32657"
},
{
   "_id": "01420000000378-20061005T085400.000000",
   "mui": "01420000000378",
   "shipment": 572,
   "message": "proof of delivery",
   "code": "421",
   "timestamp": "20061005T085400.000000",
   "plz": "32634",
   "_attachments": {
       "POD.pdf": {
           "stub": true,
           "content_type": "application/pdf",
           "length": 136446
       }
   }
}

One thing about my choice of keys (the _id field). I have choosen a “meaningful” ID over traditional “random” UUIDs. The IDs we use consist of MUI-Timestamp The Timestamp is the time the event was generated (e.g. the pallet was loaded). The good news is that this automatically keeps my database free of duplicates. If I import the same file with events twice, the events will have the same IDs during both imports and thus the second import will overwrite the first one: exactly what I want.

The bad news is that while physically there can’t really happen two events at once due to clock drift etc. I might get two different events for a MUI with exactly the same timestamp. This would result in the same key being generated for both events and thus the second event would overwrite the first one. With the relatively sparse populated space of timestamps I expect this to happen once in 10.000 events or so.

For huLOG it is acceptable to use one in 10.000 events – especially if it is the earlier one. We only get about 95% of the events we should get due to problems in the track and trace infrastructure of the freight companies. As long as the all important “has been handed to the customer” messages arrive, our system has to be able to handle missing messages.

Data Access by MUI

With that we just need a few view functions to work with the data. Most obvious we want to get all documents for a certain MUI.

This is easy since our document IDs already contain the MUI. We only have to get a list of all IDs starting with MUI. The CouchDB Document API already provides that functionality:

$ curl -s 'http://localhost:5984/hulog_events/_all_docs
  ?startkey=%22094147562251-0%22
  &endkey=%2209445147562253-9%22
  &include_docs=true' 

{"total_rows":184708,"offset":184705,"rows":[
{"id":"09445147562251-20081114T165600.000000",
 "key":"09445147562251-20081114T165600.000000",
 "value":{"rev":"463392510"},
 "doc": {..., "message":"Einrollung, 0147, Route 0280, Code 461 105",
        "mui":"09445147562251", "shipment":124524}},
{"id":"09445147562253-20081114T165700.000000",
 "key":"09445147562253-20081114T165700.000000",
 "value":{"rev":"2436256439"},
 "doc":  {..., "message":"Einrollung, 0280, Route 0234, Code 461 105",
          "mui":"09445147562253", "shipment":124524}}
]}

Some things to note: startkey and endkey must be set to valid JSON objects. We want a string containing the mui so we have to put “quotes” arround it. After URL-encoding we end up with %22094147562251%22.

The returned documents are within the interval [startkey:endkey]. If we choose a startkey which is guaranteed same or smaller than the key we want to retrieve. That’s easy: 09445147562251 is smaller than any string which has more characters and starts with 09445147562251. Endkey is somewhat more tricky. What is the biggest value? 09445147562252? This might catch one record to much, because there actually might be a key 09445147562252.

A common idiom used in the Erlang community is to use something like 09445147562251Z as the endkey. But then what is if there actually is a key 09445147562251Zabc? So use 09445147562251ZZZZZ, but …

An other suggestion is using a “high unicode character” like \u9999. But at least there is no highest unicode code point and you are well of in considering the sorting rules for obscure unicode characters as indetermistic. See the CouchDB Wiki for further information.

In our case all this is no problem. The MUI is followed by a ISO 8601 timestamp. So
startkey=094147562251-0, endkey=09445147562253-9 should work fine until the end of year 8999.

Data Access by Shipment

That was easy. Now we want to access data based on something which is not encoded in the document ID. Say the “shipment” number. For that we need a map function which gives us shipment numbers.

map: function(doc) {
    if(doc.shipment) {
        emit(doc.shipment, null);
    }
}

We permanently save this function in the server to allow CouchDB to play its clever optimization tricks. See the CouchDB wiki for more information on the HTTP View API. You could use the Futon GUI at http://localhost:5984/_utils/ for that or curl:

 curl -X PUT -H 'Content-Type: applicatioe": "javascript",
  "views": {"all":{"map":"function(doc){if(doc.shipment)
                          {emit(doc.shipment, null);}}"}}}
  ' http://localhost:5984/hulog_events/_design%2fsendung

{"ok":true,"id":"_design/sendung","rev":"23237918"}

Now we can query the view and with the startkey and endkey parameters. Here we get again into trouble for choosing the endkey. The hack we used for the IDs did work because there was a timestamp of well known format in the ID so we construct a “slightly bigger” value. Shipments are numeric. So what is bigger than 128996? 128997 is obviously bigger but would get us documents for two shipments (128996 and 128997). Since the shipments are numeric, a trick like “128996Z” does not work out.

But numeric comparing of numeric values in JavaScript work nicely with different numeric types. Shipments are integers. If we use a float key we nicely can place it to be bigger than 128996 but smaller than 128997. We use 128996.1 as our endkey.

$ curl 'http://localhost:5984/hulog_events/_view/shipment/all
  ?startkey=128996&endkey=128996.1'

{"total_rows":184707,"offset":184674,"rows":[
{"id":"09445122481336-20081223T105212.491567","key":128996,"value":null},
{"id":"09445122481336-20081223T155200.000000","key":128996,"value":null},
{"id":"09445122481336-20081223T202600.000000","key":128996,"value":null},
{"id":"09445122481337-20081223T105216.377305","key":128996,"value":null},
{"id":"09445122481337-20081223T155300.000000","key":128996,"value":null},
{"id":"09445122481337-20081223T202500.000000","key":128996,"value":null}
]}

You can also add &include_docs=true to the query – this will get you the complete documents, not only the IDs.

At this point we have a scalable, elegant datastore for our Track & trace related events ad files.

Usage

Currently we are running CouchDB with subset of our tracking archives. This subset is about a quarter million documents of wich 20% or so have attachments resulting in a database size of about 5 GB. No complains so far.

How map/reduce works in CouchDB

Saturday, December 27th, 2008

I have huge trouble how CouchDBs system of views actually works.

By experimenting and reading the source I came up with thisdescription in pseudo Python:

def mapstep(alldata):
    # the map is applied to every document
    # and the result is collected in two lists of rows
    k_rows = []
    v_rows = []
    for _id, doc in alldata:
       k, v = mapfun(doc) # actually mapfunc uses emit() not return()
       k_rows.append([k, _id])
       v_rows.append(v)
    return k_rows, v_rows

def reducestep(keys, values):
    # now several reduce steps follow. For this example
    # we randomly chose two
    # all even elements
    tmp1 = reducefun(k_rows[::2], v_rows u[::2], False)
    # all uneven elements
    tmp2 = reducefun(k_rows[1::2], v_rows u[1::2], False) 

    # finally several rereduce steps follow.
    # For this example we use only one.
    return reducefun(None, [tmp1, tmp2], True)

result = reducestep(mapstep(alldocs()))

If you call the view with group=true the map step stays the same, but the server applies grouping and calls the reduce step for each group. It looks like this:

def reduce_with_grouping(keys, values):
    gdict = {}
    # create dictionary mapping values to keys
    for k, v in zip(keys, values):
        gdict.setdefault(k, []).append(v)
    ret = []
    for k, values in gdict.items():
        ret.append([k, reducestep(k*len(values), values])
    return ret

result = reduce_with_grouping(mapstep(alldocs()))

If you experiment with views keep in mind that the the Futon Web-Client silently adds group=true to your views and that group=true is ignored if you don’t provide a reduce function.

CouchDB broke my Box (not)

Saturday, December 20th, 2008

I tried to see how much beating CouchDB can take. So I installed it on a modest box (1.8GHz, 512 MB RAM, Debian) and started at 12:00h pouring data in it. At about 22:00h I asked for the computation of a simple view while still dumping data into it. At that Time it contained about 400.000 Documents of with about 10 % contained an Attatchment. DB size was on Disk was about 8 GB.

And an hour later (now) I can’t reach the box anymore.