CouchDB: Extending SharpCouch with group, skip and limit
The major benefit of CouchDB over relational databases is that it forces you to use Map/Reduce, which offers, “out of the box”, an easy path to scalability in terms of raw data storage, performance and redundancy.
Ten CouchDB instances may well be quicker than ten SQL Server boxes, but a single instance is terribly slow on ~100K records totalling ~1 GB. In fact, it’s unusably slow 😦
Here’s an example I put live (http://couchdb.webtropy.com/), which returns 100 records at random from Delicious data stored on a CouchDB instance at couchone.com. However, on that many records it was too slow to run a group-by query, or even a simple filtered select.
I modified SharpCouch to allow for limiting, grouping and skip functionality, as follows:
/// <summary>
/// Execute a temporary view and return the results.
/// </summary>
/// <param name="server">The server URL</param>
/// <param name="db">The database name</param>
/// <param name="map">The javascript map function</param>
/// <param name="reduce">The javascript reduce function or
/// null if not required</param>
/// <param name="startkey">The startkey or null not to use</param>
/// <param name="endkey">The endkey or null not to use</param>
/// <param name="Group">True to group the reduce output by key</param>
/// <param name="Limit">Maximum number of rows to return, or 0 for no limit</param>
/// <param name="Skip">Number of rows to skip, or 0 to skip none</param>
/// <returns>The result (JSON format)</returns>
public string ExecTempView(string server, string db, string map, string reduce, string startkey, string endkey, bool Group, int Limit, int Skip)
{
    // Generate the JSON view definition from the supplied
    // map and optional reduce functions...
    string viewdef = "{ \"map\":\"" + map + "\"";
    if (reduce != null)
        viewdef += ",\"reduce\":\"" + reduce + "\"";
    viewdef += "}";

    // Collect whichever query parameters were supplied, then join them so
    // that "?" appears exactly once and "&" separates the rest.
    // (Requires: using System.Collections.Generic;)
    List<string> args = new List<string>();
    if (startkey != null)
        args.Add("startkey=" + HttpUtility.UrlEncode(startkey));
    if (endkey != null)
        args.Add("endkey=" + HttpUtility.UrlEncode(endkey));
    if (Group)
        args.Add("group=true");
    if (Limit > 0)
        args.Add("limit=" + Limit);
    if (Skip > 0)
        args.Add("skip=" + Skip);

    string url = server + "/" + db + "/_temp_view";
    if (args.Count > 0)
        url += "?" + string.Join("&", args.ToArray());

    return DoRequest(url, "POST", viewdef, "application/json");
}
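The fiddly part of a method like this is the separator: the first query parameter present needs a "?", and every later one needs an "&", whichever combination of options the caller passes. Here is a minimal sketch of that rule in JavaScript. The helper name buildTempViewUrl, the options object, and the server and database names are all made up for illustration; none of them are part of SharpCouch.

```javascript
// Hypothetical helper, not part of SharpCouch: builds the _temp_view URL
// from whichever optional query parameters were supplied.
function buildTempViewUrl(server, db, opts) {
  const args = [];
  if (opts.startkey != null) args.push("startkey=" + encodeURIComponent(opts.startkey));
  if (opts.endkey != null) args.push("endkey=" + encodeURIComponent(opts.endkey));
  if (opts.group) args.push("group=true");
  if (opts.limit > 0) args.push("limit=" + opts.limit);
  if (opts.skip > 0) args.push("skip=" + opts.skip);

  const base = server + "/" + db + "/_temp_view";
  // "?" appears once, before the first parameter; "&" joins the rest.
  return args.length > 0 ? base + "?" + args.join("&") : base;
}

// Example with made-up server and database names.
console.log(buildTempViewUrl("http://localhost:5984", "delicious", { group: true, limit: 100 }));
```

Collecting the parameters in a list and joining them once means no parameter needs to know which others were set.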
That at least gave me basic functionality.
I was hoping to do a group by author as follows:
function(doc) { emit(doc.author,1); }
function(keys, values, rereduce) {
return sum(values);
}
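To make the intent of that map/reduce pair concrete, here is a small local simulation in plain JavaScript of what a grouped query should return: one row per author, with a count. This is only a sketch of the idea, not how CouchDB runs views internally; the sample documents and the names mapDoc, reduceValues and groupedView are invented for the example.

```javascript
// Invented sample documents; only the author field matters here.
const docs = [
  { author: "morgand", url: "http://example.com/a" },
  { author: "alice", url: "http://example.com/b" },
  { author: "morgand", url: "http://example.com/c" },
];

// Map step: same shape as the map function above, but emit() is passed in
// so the rows can be collected locally.
function mapDoc(doc, emit) {
  emit(doc.author, 1);
}

// Reduce step: sum the emitted values for one key.
function reduceValues(values) {
  return values.reduce((total, v) => total + v, 0);
}

// Local stand-in for a grouped view query: bucket rows by key,
// then reduce each bucket to a single value.
function groupedView(documents) {
  const buckets = new Map();
  for (const doc of documents) {
    mapDoc(doc, (key, value) => {
      if (!buckets.has(key)) buckets.set(key, []);
      buckets.get(key).push(value);
    });
  }
  return [...buckets.keys()]
    .sort()
    .map((key) => ({ key, value: reduceValues(buckets.get(key)) }));
}

console.log(groupedView(docs));
```

With these three documents the result has two rows: alice with a count of 1 and morgand with a count of 2.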
but that was too slow, and so too was a filter:
function(doc) {
if (doc.author == \"morgand\")
emit(doc,1);
}
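The backslash-escaped quotes in that filter are there because ExecTempView pastes the map function straight into a JSON string, so any quote inside the function has to be escaped by hand. As a sketch of an alternative, a JSON serializer can do the escaping instead. The example below uses JavaScript's JSON.stringify; the helper name buildViewDefinition is hypothetical and not part of SharpCouch.

```javascript
// Map function written with ordinary, unescaped quotes.
const map = 'function(doc) { if (doc.author == "morgand") emit(doc, 1); }';

// Hypothetical helper: JSON.stringify escapes the embedded quotes,
// so the result is a valid JSON body for the temporary view.
function buildViewDefinition(mapSource, reduceSource) {
  const def = { map: mapSource };
  if (reduceSource != null) def.reduce = reduceSource;
  return JSON.stringify(def);
}

console.log(buildViewDefinition(map, null));
```

Parsing the result back with JSON.parse gives the original function text, which is a quick check that nothing was mangled on the way.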
Overall, I was initially excited by CouchDB, and glad I could extend SharpCouch, but disappointed that its speed was awful.