I had a little time to test out SlideRocket yesterday. I did not have high expectations, having already tried other presentation web apps such as Google’s. However, SlideRocket is really cool. It’s implemented with Adobe’s Flex (which I think is Flash on the client side plus some server-side components). It eats up quite a few CPU cycles, but it looks very nice. Much like Apple’s Keynote, the presentations look stylish too. Here are two sample slides:

SlideRocket is still in private beta right now and very much under construction, but it looks extremely promising. Who would’ve thought that the way to make the most beautiful presentations under Linux would be a web app?

Links for 2008-03-22

by Zef Hemel

Centralwings

by Zef Hemel

In May, Justyna and I were going to fly out to Gdansk, Poland to visit her family. We would fly with Centralwings. Justyna flew with Centralwings before; her flight was canceled and she flew a few days later. Now it turns out that Centralwings has huge debts and is closing down lines left and right. In fact, it is currently not possible to book a flight from Amsterdam to any destination in Poland other than Krakow. A friend from Poland, who was going to come here in April, called to say that her flight from Warsaw was canceled. We are trying to figure out if ours in May is also canceled, but it’s difficult: if you check the website you always get a “database is too busy” page, if you call the international number it is never picked up, and if you call the local numbers they either don’t answer or there are huge queues.

Their slogan:

Anyway, my point is: only fly with Centralwings if you don’t really care whether you’re going or not.

First Paper Accepted

by Zef Hemel

This morning we got an e-mail saying that our paper entitled “Code Generation by Model Transformation. A Case Study” has been accepted to the International Conference on Model Transformation ‘08. Which means that the first paper I co-wrote will be published! The paper is about the implementation of WebDSL and the two dimensions of modularity that we applied to organize the WebDSL generator and make it extensible.

The paper will be presented in Zurich, Switzerland this July. I’m not sure which of the three of us will be there. We’ll see.

Update: Abstract: The realization of model-driven software development requires effective techniques for implementing code generators. In this paper, we present a case study of code generation by model transformation with Stratego, a high-level transformation language based on the paradigm of rewrite rules with programmable strategies that integrates model-to-model, model-to-code, and code-to-code transformations. The use of concrete object syntax guarantees syntactic correctness of code patterns, and supports the subsequent transformation of generated code. The composability of strategies supports two dimensions of transformation modularity. Vertical modularity is achieved by designing a generator as a pipeline of model-to-model transformations that gradually transforms a high-level input model to an implementation. Horizontal modularity is achieved by supporting the definition of plugins which implement all aspects of a language feature. We discuss the application of these techniques in the implementation of WebDSL, a domain-specific language for dynamic web applications with a rich data model.

Superlanguages

by Zef Hemel

Imagine a language that can be both generic and domain specific. A language that is extensible in every way imaginable. You can define your own syntax. You can extend the type system. It would be a kind of, a kind of… superlanguage!

In short, this is what XMF is. And yes, as I typed that name I mistyped it as XML, but do not worry, it has nothing to do with XML, thank god. You can download a free book on the XMF superlanguage. I’m reading it now. It’s not just for fun, it’s work: it’s an alternative approach to building DSLs that needs to be investigated. That’s what we scientists do.

That, and make the world a better place. With superlanguages, for instance.

Links for 2008-03-15

by Zef Hemel

For the past few days I’ve been dabbling with CouchDB, trying to figure out what it can do and how it differs from traditional relational databases. According to the site:

CouchDB is designed for document-oriented applications. A typical real-world document oriented activity, if it weren’t computerized, would consist mostly of physical paper documents. These documents would need to get sent around, edited, photocopied, approved, denied, pinned to the wall, filed away, buried in soft peat for six months, etc. They could be simple yellow sticky notes or 10,000 page legal documents. Not all document-oriented applications have real world counterparts.

Some examples of document-oriented applications:

  • CRM
  • Contact Address/Phone Book
  • Forum/Discussion
  • Bug Tracking
  • Document Collaboration/Wiki
  • Customer Call Tracking
  • Expense Reporting
  • To-Dos
  • Time Sheets
  • E-mail
  • Help/Reference Desk

Looking at this list I’m like, what application is not document-oriented? It seems that the applications I use (address books, email, blogs, Twitter, calendar and so on) are all document-oriented applications. So, I decided to look a bit deeper. What makes CouchDB different from, say, MySQL? This presentation gave me the best answer to that question:

SQL | CouchDB
Predefined, explicit schema | Dynamic, implicit schema
Uniform tables of data | Collection of named documents with varying structure
Normalized. Objects spread across tables. Duplication reduced. | Denormalized. Docs usually self contained. Data often duplicated.
Must know schema to read/write a complete object | Must know only document name
Dynamic queries of static schemas | Static queries of dynamic schemas

CouchDB seems to tell you: forget everything you learned about database design and be pragmatic. Don’t normalize — aggregate, don’t plan ahead — evolve. To play around with these ideas I decided to port my blog to CouchDB. It’s not really done yet, but I moved over most of the data and have a basic index page view now. Let me tell you how the approach I took with CouchDB differs from the one taken for SQL databases in WordPress.

Unlike MySQL and other SQL databases, you access CouchDB through a web service API, a RESTful one in fact. The protocol is extremely simple, and that’s also why there are client “libraries” in just about every language imaginable. It also comes with a convenient browser admin interface that allows you to create new databases, create documents, edit them, remove them and so forth. This is what my blog database looks like:

As you can see, every document has a Document ID; this is similar to a primary key in SQL. When you click on a document you will see the content of that document:

As you can see, a document consists of a number of fields with associated values. Which field names you use is entirely up to you, but a couple have a special meaning. Every document has at least two fields: “_id” (which contains the document’s ID) and “_rev” (which contains the document’s revision number). Indeed, CouchDB keeps old revisions of all your documents, which is really cool and useful. It means that it’s almost trivial to produce a revision history of your blog posts, and if you implement a wiki system with CouchDB, a revision history of pages is very simple to obtain.

A field’s value can be of any JSON type, so: a number, a string, a list or a hash table.
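Because documents are JSON and the API is plain HTTP, storing and fetching a document boils down to a couple of requests. The following is only a sketch that describes those requests without sending them. It assumes CouchDB on its default port 5984; the database name "blog", the document ID and the `request` helper are made up for illustration.

```javascript
// Sketch only: describes the HTTP requests, it does not send them.
// Assumptions: CouchDB on its default port 5984, a hypothetical database
// named "blog", and a hypothetical document ID.
const COUCH = "http://localhost:5984";

// Hypothetical helper that builds a plain description of one request.
function request(method, path, body) {
  const req = { method: method, url: COUCH + path };
  if (body !== undefined) {
    req.headers = { "Content-Type": "application/json" };
    req.body = JSON.stringify(body);
  }
  return req;
}

// Create the database, store a document under an ID, then read it back.
const createDb = request("PUT", "/blog");
const savePost = request("PUT", "/blog/first-paper-accepted", {
  type: "post",
  title: "First Paper Accepted"
});
const readPost = request("GET", "/blog/first-paper-accepted");

console.log(createDb.method, createDb.url);
console.log(savePost.method, savePost.url, savePost.body);
console.log(readPost.method, readPost.url);
```

Each of these objects could be handed to whatever HTTP client you prefer; nothing here depends on a particular client library.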

In my blog application I defined a number of other fields, they are:

  • author (containing the post’s author name)
  • tags (a list of tags associated with the post; note that this is a list)
  • comments (a list of comments, each of which is a hash table containing the comment’s date of posting, the content of the comment and the name, URL and email address of the poster)
  • content_parser (this is a string saying what format the post’s content is in, for example wordpress’s HTML with newline preservation)
  • content (the actual content of the post)
  • date (the date of the posting)
  • title (the title of the post)
  • type (for a post always “post”)
  • slug (the post slug)
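To make the field list above concrete, here is a hypothetical post document written as a JavaScript object. All values are invented, the date format and the content_parser label are assumptions, and the comment field names url and email are my guess at sensible names. The “_rev” field is left out because CouchDB assigns it.

```javascript
// Hypothetical example of one post document; all values are made up.
const post = {
  _id: "couchdb-first-impressions",   // document ID, chosen for this example
  type: "post",
  title: "CouchDB first impressions",
  slug: "couchdb-first-impressions",
  author: "Zef Hemel",
  date: "2008-03-15T10:00:00Z",       // assumption: dates stored as ISO 8601 strings
  content_parser: "wordpress-html",   // assumption: a label for the content format
  content: "<p>For the past few days...</p>",
  tags: ["couchdb", "databases"],     // a plain list, stored inside the post
  comments: [                         // a list of hash tables, also inside the post
    {
      date: "2008-03-16T08:30:00Z",
      author: "Example Reader",
      url: "http://example.com/",
      email: "reader@example.com",
      content: "Nice write-up."
    }
  ]
};

// The whole document round-trips through JSON, nested lists included.
const copy = JSON.parse(JSON.stringify(post));
console.log(copy.tags.length, copy.comments[0].author);
```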

Anyone who has ever designed a schema for a relational database will look at the tags and comments fields and think: what the? In SQL databases this is simply not done, partly because you don’t usually have list types, let alone lists of hash tables, but also because it’s not normalized and very hard to query.

Now I will admit, this denormalization has its problems. For instance, if I change the name of a tag I would have to run through every single post and change its name — very inefficient, whereas in a SQL database there would be a “tag” table with a tag ID and tag name, and I would simply change the name. Also, it’s not very efficient to store data in this manner, because you have data duplication all over the place. However, querying this nested data is not a problem, because CouchDB has views.
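The tag renaming problem can be sketched as follows. This works on plain objects in memory rather than on a real database, and `renameTag` is a hypothetical helper, not something CouchDB offers. The point is only that every post carrying the tag has to be touched and saved again.

```javascript
// Sketch: renaming a tag when tags are stored inside every post document.
// renameTag is a hypothetical helper; it works on plain objects in memory.
function renameTag(docs, oldName, newName) {
  const changed = [];
  for (const doc of docs) {
    if (doc.type !== "post" || !Array.isArray(doc.tags)) continue;
    if (!doc.tags.includes(oldName)) continue;
    // Copy the document with the tag renamed, leaving the original intact.
    changed.push({
      ...doc,
      tags: doc.tags.map(tag => (tag === oldName ? newName : tag))
    });
  }
  return changed; // each of these would have to be written back to the database
}

const docs = [
  { _id: "a", type: "post", tags: ["couch", "db"] },
  { _id: "b", type: "post", tags: ["linux"] },
  { _id: "c", type: "post", tags: ["couch"] }
];
const toSave = renameTag(docs, "couch", "couchdb");
console.log(toSave.map(d => d._id)); // the documents that need rewriting
```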

A view in CouchDB is yet another document that adheres to a couple of conventions: its Document ID should start with “_design/”, and the document should have a “views” field with a hashmap that maps view names to Javascript functions.
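As a rough sketch, such a design document could look like the object below. The name "blog" and the shortened function bodies are placeholders. Since documents are JSON, the functions are stored as strings of source code. Newer CouchDB releases nest each function one level deeper under a "map" key, so treat the exact shape as dependent on the version you run.

```javascript
// Sketch of a design document as described above: an ID starting with
// "_design/" and a "views" field mapping view names to function source.
// The name "blog" and the shortened function bodies are placeholders.
const designDoc = {
  _id: "_design/blog",
  views: {
    latest_posts:
      "function(doc) { if (doc.type == 'post') { map(doc.date, doc.title); } }",
    latest_comments:
      "function(doc) { /* one entry per comment, keyed by comment date */ }"
  }
};

// The ID prefix is what marks this document as a design document.
const isDesignDoc = designDoc._id.indexOf("_design/") === 0;
console.log(isDesignDoc, Object.keys(designDoc.views));
```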

Now let’s have a look at the two views that I defined. The first one is the simplest one: “latest_posts”. Here is the Javascript code:

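function(doc) {
  // Only post documents belong in this view.
  if (doc.type == 'post') {
    // Key: the post date, so that entries can be sorted by date.
    map(doc.date, {'title': doc.title, 'author': doc.author,
                   'content': doc.content, 'tags': doc.tags,
                   'comment_count': doc.comments.length});
  }
}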

The idea with views is simple. You provide CouchDB with a function in some language (Javascript out of the box, but other languages can easily be supported; there are means to write those functions in Python, for instance). This function takes one argument: the document. The function decides whether or not this document will be in the view, and if so, what the key will be and what the contents of the entry for this document shall be. The key can be used for sorting and filtering. The content can be any JSON type, but typically it’s a hash map (as in this case).

The function to call to tell CouchDB what to put in the view is called map. Now, if you have done some functional programming this will seem odd to you, because typically map is a function that applies some function to a list of values. In this context, however, map refers to the map step of the map/reduce approach that CouchDB views are built on. Personally I’m not a big fan of map as a name: users don’t really care about map/reduce, they care about what map does. The Python support uses Python’s yield keyword instead, which, in my view, is more descriptive of what it does.
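To make the mechanics a bit more tangible, here is a toy model of how a view could be built from a set of documents. This is not how CouchDB is implemented; `buildView` is a hypothetical helper, and for simplicity the map collector is passed in as a second parameter instead of being provided globally.

```javascript
// Toy model of how a view gets built. This is NOT CouchDB's implementation;
// buildView is a hypothetical helper that only mimics the behaviour described.
function buildView(docs, viewFn) {
  const rows = [];
  for (const doc of docs) {
    // Hand the view function a collector that records key/value pairs.
    viewFn(doc, (key, value) => rows.push({ id: doc._id, key: key, value: value }));
  }
  // View entries are ordered by key.
  rows.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
  return rows;
}

// Same idea as "latest_posts", with the collector passed in as "map".
function latestPosts(doc, map) {
  if (doc.type === "post") {
    map(doc.date, { title: doc.title, comment_count: doc.comments.length });
  }
}

// Made-up sample documents; the "page" document is skipped by the view.
const docs = [
  { _id: "p2", type: "post", date: "2008-03-22", title: "SlideRocket", comments: [] },
  { _id: "x1", type: "page", date: "2008-01-01", title: "About" },
  { _id: "p1", type: "post", date: "2008-03-15", title: "CouchDB", comments: [{}, {}] }
];

const rows = buildView(docs, latestPosts);
console.log(rows.map(r => r.key + " " + r.value.title));
```

The view function alone decides which documents appear and under which key; the surrounding machinery only collects and sorts.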

Now you might wonder: why use something like map or yield, why not simply use return? Well, the interesting thing is that a view function doesn’t have to produce zero or one view entry; it can produce any number of them. In the “latest_posts” example this made no difference. But let’s have a look at the view function for “latest_comments”. As you will remember, comments are not separate documents in my blog model; they are contained in a field of a post document. Now how would you retrieve a list of the latest comments? Obviously the answer is a view, and this is what it looks like:

function(doc) {
  if (doc.type == 'post') {
    // One view entry per comment, keyed by the comment's date.
    for (var i = 0; i < doc.comments.length; i++) {
      var comment = doc.comments[i];
      map(comment.date, {'post_title': doc.title, 'post_id': doc._id,
                         'author': comment.author,
                         'content': comment.content});
    }
  }
}

What happens here is that for every comment in every post document, map is called returning information about the comment and using comment.date as the key so that it is possible to sort based on that.
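A small in-memory sketch shows the effect: a single post document contributes several entries, and sorting on the key gives you the latest comments. The `collectComments` and `latest` helpers and the sample data are made up, and the local map function is a stand-in for the one CouchDB provides.

```javascript
// Sketch: one post document can contribute several view entries.
// The map collector is a local stand-in for the function CouchDB provides.
function collectComments(docs) {
  const rows = [];
  const map = (key, value) => rows.push({ key: key, value: value });
  for (const doc of docs) {
    if (doc.type !== "post") continue;
    for (const comment of doc.comments) {
      map(comment.date, { post_id: doc._id, author: comment.author });
    }
  }
  return rows;
}

// Newest first, limited to n entries, as a sidebar would want.
function latest(rows, n) {
  return rows
    .slice()
    .sort((a, b) => (a.key < b.key ? 1 : a.key > b.key ? -1 : 0))
    .slice(0, n);
}

// Made-up sample data: two posts with three comments in total.
const posts = [
  { _id: "p1", type: "post", comments: [
    { date: "2008-03-16", author: "Ann" },
    { date: "2008-03-20", author: "Bob" }
  ] },
  { _id: "p2", type: "post", comments: [{ date: "2008-03-18", author: "Cy" }] }
];
const sidebar = latest(collectComments(posts), 2);
console.log(sidebar.map(r => r.value.author));
```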

With these two views you have already implemented quite a bit of the application. To create a post, simply add a document. For the front page, simply use the “latest_posts” view; for the individual post page, simply retrieve the document, which contains all the information you need. If you want a list of latest comments in the sidebar, you can use the “latest_comments” view. Put a simple frontend on it, and there you go, a CouchDB powered blog:
