How map/reduce works in CouchDB

I had a lot of trouble understanding how CouchDB's system of views actually works.

By experimenting and reading the source I came up with this description in pseudo Python:

def mapstep(alldata):
    # the map is applied to every document
    # and the result is collected in two lists of rows
    k_rows = []
    v_rows = []
    for _id, doc in alldata:
        k, v = mapfun(doc)  # the real map function calls emit(k, v) instead of returning
        k_rows.append([k, _id])
        v_rows.append(v)
    return k_rows, v_rows

def reducestep(keys, values):
    # Several reduce steps follow. For this example
    # we arbitrarily use two.
    # all elements at even positions
    tmp1 = reducefun(keys[::2], values[::2], False)
    # all elements at odd positions
    tmp2 = reducefun(keys[1::2], values[1::2], False)

    # finally several rereduce steps follow.
    # For this example we use only one.
    return reducefun(None, [tmp1, tmp2], True)

result = reducestep(*mapstep(alldocs()))
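
To make the rereduce flag concrete, here is a minimal runnable sketch of the same flow. The documents, the two reduce functions and the extra `fn` parameter on `reducestep` are assumptions made up for illustration; they are not taken from CouchDB itself. It shows that a sum can ignore the flag, while a count has to add up the partial counts when rereduce is true.

```python
# Hypothetical documents; ids and fields are invented for this example.
DOCS = {
    "a": {"type": "fruit", "price": 3},
    "b": {"type": "fruit", "price": 5},
    "c": {"type": "veg", "price": 2},
    "d": {"type": "veg", "price": 4},
}

def mapfun(doc):
    # Stands in for emit(doc.type, doc.price) in a JavaScript map function.
    return doc["type"], doc["price"]

def sumfun(keys, values, rereduce):
    # Summing works the same on emitted values and on partial sums.
    return sum(values)

def countfun(keys, values, rereduce):
    # On rereduce the values are partial counts, so they must be added up.
    return sum(values) if rereduce else len(values)

def mapstep(alldata):
    k_rows, v_rows = [], []
    for _id, doc in alldata:
        k, v = mapfun(doc)
        k_rows.append([k, _id])
        v_rows.append(v)
    return k_rows, v_rows

def reducestep(keys, values, fn):
    # Two partial reduces over arbitrary slices, then one rereduce.
    tmp1 = fn(keys[::2], values[::2], False)
    tmp2 = fn(keys[1::2], values[1::2], False)
    return fn(None, [tmp1, tmp2], True)

keys, values = mapstep(sorted(DOCS.items()))
total = reducestep(keys, values, sumfun)
count = reducestep(keys, values, countfun)
print(total, count)  # 14 4
```

A count function that returned len(values) in both phases would report 2 here, the number of partial results, instead of 4.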

If you query the view with group=true, the map step stays the same, but the server groups the rows by key and calls the reduce step once for each group. It looks like this:

def reduce_with_grouping(keys, values):
    gdict = {}
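    # group the rows by emitted key (keys holds [key, _id] pairs);
    # pseudo Python: a real key can be any JSON value, including a list
    for (k, _id), v in zip(keys, values):
        gdict.setdefault(k, []).append(([k, _id], v))
    ret = []
    for k, rows in gdict.items():
        g_keys = [key for key, v in rows]
        g_values = [v for key, v in rows]
        ret.append([k, reducestep(g_keys, g_values)])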
    return ret

result = reduce_with_grouping(*mapstep(alldocs()))
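
The grouping can also be tried out as a small runnable sketch. The rows and the sum reduce below are hypothetical, and Python's built-in sort only approximates CouchDB's key collation, so treat this as an illustration of the idea rather than of the server's actual code. Because view rows are stored sorted by key, rows with equal keys sit next to each other and can be reduced group by group.

```python
from itertools import groupby

# Hypothetical (key, doc id, value) rows as a map step might produce them.
ROWS = [
    ("fruit", "a", 3),
    ("veg", "c", 2),
    ("fruit", "b", 5),
    ("veg", "d", 4),
]

def reducefun(keys, values, rereduce):
    # A plain sum, used here as the example reduce function.
    return sum(values)

def reduce_with_grouping(rows):
    ret = []
    # Sort first so that equal keys are adjacent, then reduce each run.
    for key, group in groupby(sorted(rows), key=lambda row: row[0]):
        group = list(group)
        keys = [[k, _id] for k, _id, _ in group]
        values = [v for _, _, v in group]
        ret.append([key, reducefun(keys, values, False)])
    return ret

grouped = reduce_with_grouping(ROWS)
print(grouped)  # [['fruit', 8], ['veg', 6]]
```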

If you experiment with views, keep in mind that the Futon web client silently adds group=true to your view queries, and that group=true is ignored if you don’t provide a reduce function.
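
One way to avoid surprises from client defaults is to query the view over HTTP yourself and spell out the parameters. The helper below only builds the URL. Its name, the host, and the database, design document and view names are made up, and the path layout is the one used by current CouchDB releases, which is an assumption here since older versions used a different path.

```python
from urllib.parse import urlencode

def view_url(base, db, ddoc, view, **params):
    # Booleans are written as true/false in the query string.
    query = urlencode({
        name: str(value).lower() if isinstance(value, bool) else value
        for name, value in params.items()
    })
    # Assumed path layout: /{db}/_design/{ddoc}/_view/{view}
    path = f"{base}/{db}/_design/{ddoc}/_view/{view}"
    return f"{path}?{query}" if query else path

# Hypothetical names throughout; only group and reduce are real view options.
grouped_url = view_url("http://localhost:5984", "shop", "stats", "by_type", group=True)
plain_url = view_url("http://localhost:5984", "shop", "stats", "by_type")
print(grouped_url)
print(plain_url)
```

Fetching both URLs and comparing the results shows the difference between the grouped and the fully reduced output.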

  1. Have a look at main.js (http://svn.apache.org/viewvc/couchdb/trunk/share/server/main.js?view=markup). It implements the receiving end of the map and reduce functions and might give you some more insight.
