Playing with CouchDB

by Zef Hemel

I spent some more time with CouchDB yesterday. Something useful to do while my work project is compiling ;) I’m considering building some simple applications with CouchDB to figure out what it is good for and where its limitations lie. Maybe a blog application or something like that. For now I’m just dabbling, trying to put together some reasonable databases. I wrote a little Python script that imports an Outlook CSV file (coming from Gmail, don’t worry) into an address book database. Now I run simple queries on this data set, such as:

function(doc) {
  if(doc.lastname == 'Hemel') {
    map(doc.firstname, {'name': doc.firstname + ' ' + doc.lastname, 'email': doc.email});
  }
}

This gives me back the full names and email addresses of everyone in my family. Not particularly useful, but hey, it’s something. Currently I have a script running that pulls the last 20 messages from Twitter every minute and dumps them in a database. This is easy to do because CouchDB’s documents are free-form: you can dump any simple hashtable into it, including the one that the Python Twitter API returns. Here’s my script:
import time

import couchdb
import twitter

server = couchdb.Server('http://localhost:5984')
db = server['twitter']  # use the 'twitter' database
api = twitter.Api()
while True:
    for status in api.GetPublicTimeline():
        message = status.AsDict()
        doc_id = str(message['id'])  # document ids must be strings
        try:  # Hacky way to check if message is already in the database
            db[doc_id]
        except Exception:  # lookup failed, so assume it is not stored yet
            db[doc_id] = message
            print('Added new message: %d' % message['id'])
    time.sleep(60)
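
The check-then-store step in that loop can be pulled out into a small helper, which makes it easy to try without a running server. This is only a sketch: `store_if_new` is a hypothetical name, and a plain dict stands in for the CouchDB database. It assumes the database object supports the `in` operator the way a dict does; if it doesn’t, the try/except lookup from the script does the same job.

```python
def store_if_new(db, message):
    """Store message under its string id unless that id already exists.

    `db` is anything that supports `in` and item assignment; a plain dict
    is used below as a stand-in for a CouchDB database object.
    Returns True if the message was stored, False if it was a duplicate.
    """
    doc_id = str(message['id'])  # same string conversion as in the script
    if doc_id in db:
        return False
    db[doc_id] = message
    return True


# Made-up messages for illustration; the second one repeats id 1.
fake_db = {}
messages = [
    {'id': 1, 'text': 'hello'},
    {'id': 1, 'text': 'hello'},
    {'id': 2, 'text': 'bye'},
]
added = [store_if_new(fake_db, m) for m in messages]
```

Here `added` records which messages were new, so the duplicate is skipped and `fake_db` ends up with one document per message id.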

Why am I doing this? Dunno, I need some data to play with, and I want to see what interesting information I can extract from it. For instance, to figure out which twitterers are from the Netherlands, I have the following query:

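function(doc) {
  // Skip documents without a user location instead of throwing on them
  if(doc.user && doc.user.location &&
     doc.user.location.toLowerCase().indexOf('netherlands') != -1) {
    map(doc.user.screen_name, doc);
  }
}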
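
To try out a query like this before putting it in the database, the same logic can be mimicked in plain Python over a handful of documents. This is a rough sketch, not how CouchDB evaluates views: `map_netherlands` and `run_view` are hypothetical names, and the sample documents are made up, with only the fields the query touches.

```python
def map_netherlands(doc):
    """Yield (key, value) pairs the way the view's map function emits rows."""
    # Treat a missing user or a missing/None location as an empty string.
    location = (doc.get('user') or {}).get('location') or ''
    if 'netherlands' in location.lower():
        yield doc['user']['screen_name'], doc


def run_view(map_fn, docs):
    """Apply map_fn to every document and return the rows sorted by key."""
    rows = [row for doc in docs for row in map_fn(doc)]
    return sorted(rows, key=lambda row: row[0])


# Made-up sample documents for illustration.
sample_docs = [
    {'id': 1, 'user': {'screen_name': 'alice', 'location': 'Groningen, The Netherlands'}},
    {'id': 2, 'user': {'screen_name': 'bob', 'location': 'Berlin'}},
    {'id': 3, 'user': {'screen_name': 'carol', 'location': None}},
]
rows = run_view(map_netherlands, sample_docs)
```

Only the first document matches, and the one with no location is skipped quietly rather than raising an error.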

One application of CouchDB I have already found is simply as a database to play with data in, without putting too much thought into it. Dabbling.
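
As an illustration of how little thought the import side needs, the address book import mentioned earlier could look roughly like the following. It is a sketch under assumptions, not the script I actually ran: `contacts_from_csv` is a hypothetical name, the column headers are assumed to be the usual Outlook-style ones, and the field names are the ones the address book query uses.

```python
import csv
import io

# Assumed Outlook-style column headers, mapped to the field names the
# address book query relies on (firstname, lastname, email).
COLUMNS = {
    'First Name': 'firstname',
    'Last Name': 'lastname',
    'E-mail Address': 'email',
}


def contacts_from_csv(csv_file):
    """Turn each non-empty CSV row into a free-form document (a dict)."""
    docs = []
    for row in csv.DictReader(csv_file):
        doc = {field: (row.get(column) or '').strip()
               for column, field in COLUMNS.items()}
        if any(doc.values()):  # skip rows where every field is blank
            docs.append(doc)
    return docs


# Made-up sample data; the second row is blank and gets dropped.
sample = io.StringIO(
    'First Name,Last Name,E-mail Address\n'
    'Ada,Example,ada@example.org\n'
    ',,\n'
)
docs = contacts_from_csv(sample)
```

Each resulting dict can then be stored as a document in the same way the Twitter script stores its messages.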


1 Comment

  1. Heya Zef,
    good to see you playing with CouchDB. I did set up the exact same thing, but shut it down due to time-reasons :)

    I scraped the public timeline as well, stored it into CouchDB and let users define their own views. It worked pretty well given CouchDB’s current limitations, but I didn’t have the time to maintain it.

    I’m interested in what else you come up with :)

    Cheers
    Jan

    #1 Jan
