PyCon

by Zef Hemel

PyCon, the Python conference, took place I think a couple of days ago. A couple of guys at Google “have maintained a blog”:http://pycon.blogspot.com which contains some notes on the different talks. Scattered around on this blog are some links to a couple of interesting Python projects:

* “MindRetrieve”:http://www.mindretrieve.net: personal web history searcher, which is written using
* “PyLucene”:http://pylucene.osafoundation.org: a GCJ-compiled version of one of the best free search libraries: “Lucene”:http://lucene.sf.net. “GCJ”:http://gcc.gnu.org/java/ is, as you may know, a compiler that turns Java into native machine code. The Python bindings for PyLucene were created using
* “SWIG”:http://www.swig.org: software that makes it easier to write extension modules for various languages, such as Python, Perl and Ruby
* “PyPI”:http://www.python.org/pypi: the Python Package Index
* “Chandler”:http://www.osafoundation.org/Chandler_Compelling_Vision.htm: PIM software written in Python.

The Graham Digest

by Zef Hemel

I have my computer-law test today, so once again I don’t have much time for an extensive post. Several people have been sending me suggestions on what to write about over the past few weeks. I want to thank them for that; I really appreciate it. The trouble with those suggestions is that they need more than a couple of minutes of research, so I’ll postpone them until I have a little more time. Just be patient: I haven’t forgotten about them, so keep sending me those suggestions!

For today I’d like to point you to Paul Graham’s latest four essays. I haven’t had time to read any of them yet, but that shouldn’t stop you; you may have a little more time on your hands.

“A Unified Theory of VC Suckage”:http://www.paulgraham.com/venturecapital.html:

A couple months ago I got an email from a recruiter asking if I was interested in being a “technologist in residence” at a new venture capital fund. I think the idea was to play Karl Rove to the VCs’ George Bush.

I considered it for about four seconds. Work for a VC fund?
Ick.

“More Advice for Undergrads”:http://www.paulgraham.com/undergrad2.html:

I asked several friends who were professors and/or eminent hackers what they thought of “Undergraduation”:http://www.paulgraham.com/college.html. Their comments were so good that I thought I’d just give them directly to you.

“Writing, Briefly”:http://www.paulgraham.com/writing44.html:

A lot of people ask for advice about writing. How important is it to write well, and how can one write better? In the process of answering one, I accidentally wrote a tiny essay on the subject.

I usually spend weeks on an essay. This one took 67 minutes– 23 of writing, and 44 of rewriting. But as an experiment I’ll put it online. It is at least extremely dense.

“Return of the Mac”:http://www.paulgraham.com/mac.html:

All the best hackers I know are gradually switching to Macs. My friend Robert said his whole research group at MIT recently bought themselves Powerbooks. These guys are not the graphic designers and grandmas who were buying Macs at Apple’s low point in the mid 1990s. They’re about as hardcore OS hackers as you can get.

The reason, of course, is OS X. Powerbooks are beautifully designed and run FreeBSD. What more do you need to know?

As you may know, I’ve been using nothing but Ubuntu Linux on my PC for the last week or two. I’m pretty happy with it, but there was one major problem: Word documents.

Of course there’s a great project called “OpenOffice.org”:http://www.openoffice.org that can work with Word documents, and it works fairly well. However, it still messes up more complex Word documents, so I can’t use it, as I don’t want to be responsible for messed-up documents that others still have to work with. So… what to do?

At our university we use CrossOver Office to run Word 2000, PowerPoint etc. under Linux. It works pretty well. The problem is that CrossOver Office isn’t free. However, it is based on a free project called “Wine”:http://www.winehq.com (a recursive acronym for Wine Is Not an Emulator). What Wine does is implement the Windows API under Linux. Normal Windows system calls are translated to Linux system calls. Using this technique it is possible to run Windows applications on Linux. The problem is that reimplementing all those Windows DLLs is a lot of work, and not all of Wine’s DLLs work very well, so getting a big beast like Word to work isn’t easy. That is also the attraction of CrossOver Office: it runs Word instantly.

However, there’s another option: “WineTools”:http://www.von-thadden.de/Joachim/WineTools/. After you’ve installed a Wine version (and don’t necessarily use the newest one, version 20041019 seems to be best), you can install this free tool. What WineTools will do is configure Wine for you and offer you a convenient installer from which you can install all kinds of Windows software.

WineTools

The fun thing is that when you run software under Wine, it’s really like you’re just running them under Windows. With installers and everything:

Installing IE

And voila, there’s IE6 running under Linux:

Running IE

But then, the reason I started all this: will Word run? Wine only supports Word 97, 2000 and maybe XP. Currently I only have Word 97 and 2003 CDs lying around, so I installed Word 97. And lo and behold, it runs perfectly! Arguably it even runs faster than under Windows:

Word on Wine

And here’s what I really like. The trouble with backing up everything in Windows is that your files are all over the place. There are registry entries, DLLs in the Windows directory, files in Program Files and who knows where. What Wine does is simply create a directory called .wine in your home directory in which everything is stored:

zef@ubuntu:~/.wine$ ls
c                   quiet-installed-software
config              system.reg
dosdevices          system.reg.preIE6install
drive_c             userdef.reg
fake_windows        user.reg
installed-software  winetools.log

Yep, the registry is stored in plain text files. And if we look in the fake_windows directory:

zef@ubuntu:~/.wine/fake_windows$ ls
autoexec.bat    Mijn documenten  Program Files  tmp
config.sys      My Documents     Programme      windows
Local Settings  My Music         temp

It’s just like a normal Windows installation, all in one directory. This means that if I’m happy with my installation I can just zip up the whole .wine directory and I’m done (my zipped version, including Office 97 and IE6, is just over 100MB). If I mess something up, I just remove the old .wine and extract my backed-up one. Great, isn’t it?
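The backup-and-restore routine described above can be sketched in a few lines of Python. This is only a sketch under assumptions: the function names `backup_wine_prefix` and `restore_wine_prefix` are made up for illustration, and I use a gzipped tar archive rather than a zip file because tar also keeps symbolic links.

```python
import shutil
import tarfile
from pathlib import Path


def backup_wine_prefix(prefix, archive):
    """Pack the whole Wine directory into one compressed archive."""
    prefix = Path(prefix)
    with tarfile.open(archive, "w:gz") as tar:
        # arcname keeps the paths relative, so the archive unpacks as ".wine/..."
        tar.add(prefix, arcname=prefix.name)
    return Path(archive)


def restore_wine_prefix(archive, home, name=".wine"):
    """Remove the current Wine directory and unpack the backup in its place."""
    target = Path(home) / name
    if target.exists():
        shutil.rmtree(target)  # throw away the messed-up installation
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(home)
```

Typical use would be `backup_wine_prefix(Path.home() / ".wine", "wine-backup.tar.gz")` once everything works, and `restore_wine_prefix("wine-backup.tar.gz", Path.home())` after something breaks.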

The cool thing is that I even got the “Allofmp3.com Explorer”:http://www.allofmp3.com to work, so I can even keep on downloading cool music from Linux.

Happy Easter

by Zef Hemel

Happy Easter everybody!

Nope, no images of happy bunnies or nicely coloured Easter eggs here. Instead, an image of what Easter really means:

Jesus' resurrection

“The resurrection of Jesus Christ”:http://en.wikipedia.org/wiki/Easter. Hardly anybody knows that these days.

And if you insist on having eggs, find them here:
* “The Easter Egg Archive”:http://www.eeggs.com
* “Easter Egg Heaven 2000”:http://www.eggheaven2000.com

On this last day of the interregional software development week I’d like to talk about editing text files with other people, simultaneously. I mentioned earlier that version control systems can merge changes to text files, but what I’ll talk about is even cooler.

Imagine this: you and a couple of others are on a Skype call and all have an editor in front of you. In this editor you can see the other people’s cursors moving and you can see them type. You’re all working on the same file simultaneously and you can see and talk about what you’re editing. Wouldn’t that be cool?

Well, it’s possible. There are two software products I know of that do this: “SubEthaEdit”:http://www.codingmonkeys.de/subethaedit/ for the Mac and “MoonEdit”:http://me.sphere.pl/indexen.htm for Windows and several Unix platforms. Both are free for non-commercial use. For commercial use SubEthaEdit costs $35. I don’t know MoonEdit’s commercial price; you’d have to contact the author.

*SubEthaEdit*
“SubEthaEdit”:http://www.codingmonkeys.de/subethaedit/ is the pretty boy of the two and also has quite a few more features than MoonEdit:

SubEthaEdit

In order to use SubEthaEdit all you need is a Mac for each team member and an internet connection. Those who are on the same network can connect to each other using Rendezvous (Apple’s auto-discovery networking module); others probably have to type in the IP address of one of the other users (but I’m not really sure).

On its own SubEthaEdit has quite a lot of neat editing features, such as colour coding, code auto-completion and regular expression search.

The trouble is that it’s Mac only, which leaves you (if you’re not lucky enough to own a Mac) with MoonEdit.

*MoonEdit*
“MoonEdit”:http://me.sphere.pl/indexen.htm is a much simpler editor, written using a very weird-looking UI kit:

MoonEdit

MoonEdit functions either through a shared file (for example if you’re on the same network, using NFS shares) or through a MoonEdit server. One of the users starts a MoonEdit server and the others connect to it; usually there is no configuration to be done.

I’ve used MoonEdit myself a couple of times and it works fine. It’s a bare-bones editor, so don’t expect fancy features like colour coding or code completion. But the good thing is that it works, and works on many platforms.

There’s something that I like to call the documentation paradox: if there’s something that developers don’t like doing it’s documentation. Yet, if there’s something that developers need, it’s exactly that: documentation.

This is, or at least sounds like, a paradox because it reasons from the developers seen as a group, and that’s exactly where the problem lies: we’re dealing with a group of individuals. An individual developer has no interest in documenting his or her own piece of code; they understand it, they know their code reads like literature, so why document it? Others have to document their piles of crappy code for it to make any sense, but not them. It would’ve been nice to, en passant, give a solution to this problem, but the margin is just too small to contain, as “Fermat would say”:http://en.wikipedia.org/wiki/Fermat%27s_last_theorem.

Instead I’ll focus on a tool that lowers the barrier to actually documenting something as much as possible. This tool is a wiki. Wikis are relatively new and their applications still have to be explored, but I think software documentation may very well be one of them.

What is a wiki again? To start off, officially they’re called wiki wikis, which is Hawaiian for “quick quick” if I’m correct. But for convenience, and because “wiki wiki” is impossible to market, people usually abbreviate it to just wiki. A wiki is a bunch of linked, unstructured webpages that everyone (who has access to it) can edit. On each page there’s an “Edit” button with which you can change the content of that particular page. Wiki pages are written in wiki codes, which I briefly discussed “a while ago”:http://www.zefhemel.com/archives/2004/09/20/post-formatting, so you don’t need to know HTML to contribute. It is extremely simple to add pages and to link to them.
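To give an idea of how little markup that involves, here is a minimal Python sketch that turns two wiki-style codes into HTML. The syntax (a Textile-like "text":url link and *bold*) and the `render` function are assumptions for illustration, not the markup of any particular wiki engine.

```python
import html
import re

# Hypothetical wiki codes: "link text":http://example.org and *bold text*
LINK = re.compile(r'"([^"]+)":(\S+)')
BOLD = re.compile(r'\*([^*\n]+)\*')


def render(line):
    """Convert one line of wiki text to HTML."""
    # Escape <, > and & first, so contributors cannot inject raw HTML
    escaped = html.escape(line, quote=False)
    with_links = LINK.sub(r'<a href="\2">\1</a>', escaped)
    return BOLD.sub(r'<b>\1</b>', with_links)


print(render('See "the wiki":http://c2.com/cgi/wiki for *more*'))
```

A real wiki engine supports far more codes than this, but the principle is the same: the contributor types plain text with a few conventions and the software produces the HTML.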

Wikis also have version control, just like the version control systems that I talked about yesterday, so you can see exactly who changed which pages and revert to older versions if necessary. Most wikis can also merge changes if pages are edited simultaneously.

There are many wikis around already. The biggest one (with now over 1 million pages) is “Wikipedia”:http://www.wikipedia.org, a free encyclopedia which anybody can make changes to. The very first one, started by the wiki’s inventor (Ward Cunningham, now working for Microsoft), is quite big as well: “c2.com wiki”:http://c2.com/cgi/wiki. There is some kind of wiki software in nearly every language these days, ranging from Ruby to C# and from Java to PHP. I have some experience with “PhpWiki”:http://phpwiki.sf.net and “MediaWiki”:http://www.mediawiki.org (which is the one Wikipedia uses). “Here’s a list of other ones”:http://c2.com/cgi/wiki?WikiEngines.

That all sounds very cool, but what should it be used for? I think it can best be used as a central information dump. I think wikis would be a good place to store information such as
* design decisions (in the architecture phase, “we had to choose between single-threaded and multi-threaded and we chose … because …”)
* programming problems and their solutions (“In several places we needed this very weird sort function. I implemented this function and put it in this and this library”)

If you have to do some research, for example to figure out which architectural pattern you should use to solve some problem, you can just dump any information you find on the subject in there. The formatting and structure don’t matter at first; you (or anybody else for that matter) can tidy things up later. The important thing is that you write down the problems that you found and how you solved them. If forum discussions took place about certain problems, just copy those discussions into the wiki as well. It’s a central store of thoughts, problems and decisions in which every team member can find and store information.

Personally I haven’t used wikis in interregional software projects yet, but I still think it’s an interesting area to explore. If you start your project, just think about what wikis have to offer and how you can use them in your project.

The possibility to exchange files is an absolute necessity when developing software. There are many ways to do it of which version control systems are probably the most helpful.

A couple of ways to exchange files:
* Send them through e-mail: works, but you end up with a big mess of e-mails and attachments with different versions of files. It’s not very friendly to you as a user. Also, only the people to whom you send the file have access to it.
* FTP: set up an FTP account somewhere and let everybody store his/her files there: works, but older versions of files disappear if you’re not making backups.
* Forum attachments: some forum systems have the ability to attach files to messages. This has the same disadvantages as e-mail, but the advantage is that everybody (that can access the forum) has access to it.
* Version control systems: the option I want to talk about today.

Version control systems are not used nearly enough. They’re not only useful in big-ass million-people projects, but even if you’re working on something alone. How often has it happened to you that you removed a piece of code and saved the file, only to remember that you needed that piece of code for another purpose? If you use a version control system it’s easy to retrieve an older version of the file with the removed code still in it.

Within interregional projects, version control systems are not only useful for versioning purposes, but also for distribution purposes. But before I get into that, I’ll first explain how a normal client-server version control system works. Let’s start with a picture:

Version Control System

In the middle is the server. A version control server can serve multiple so-called repositories. A repository is just a tree of directories and files. Usually you use one repository per project, but there are reasons to use more. In each repository all the current versions of files are stored, but also previous versions, so you can always request an older version.

Clients have a copy of a repository stored locally. This copy can be obtained with a so-called check-out. A check-out is an initial download of all directories and files to your local disk. After that you can keep your repository copy up-to-date with the update command. It’s important to realize that this is a local copy; you’re not editing files directly on the version control server.
Once you have a copy of the repository you can edit the files, add new ones or remove some. When you’re done, or you think it’s a good idea to store the changes in a safe place, you synchronize the changes you made with the version control server. This is called a check-in (or commit). The version control client can see which files have changed and will submit new revisions of those files to the server.
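To make the check-out, update and check-in cycle concrete, here is a toy in-memory model in Python. It is a sketch under assumptions: the `Repository` and `WorkingCopy` classes are hypothetical names, every revision is stored as a full snapshot, and nothing here reflects how a real version control server stores or transfers its data.

```python
class Repository:
    """Toy server side: keeps every revision as a snapshot of all files."""

    def __init__(self):
        self.revisions = [{}]  # revision 0 is an empty tree

    @property
    def head(self):
        return len(self.revisions) - 1

    def checkout(self, revision=None):
        """Hand out a local copy of the given revision (default: the newest)."""
        if revision is None:
            revision = self.head
        return WorkingCopy(self, dict(self.revisions[revision]), revision)


class WorkingCopy:
    """Toy client side: a local copy that is edited away from the server."""

    def __init__(self, repo, files, base_revision):
        self.repo = repo
        self.files = files  # maps file name to content
        self.base_revision = base_revision

    def update(self):
        """Fetch the newest revision (this toy version drops local edits)."""
        self.files = dict(self.repo.revisions[self.repo.head])
        self.base_revision = self.repo.head

    def commit(self):
        """Check in the local files as a new revision on the server."""
        if self.base_revision != self.repo.head:
            raise RuntimeError("working copy is out of date, update first")
        self.repo.revisions.append(dict(self.files))
        self.base_revision = self.repo.head
        return self.base_revision
```

Because older snapshots are never thrown away, `repo.checkout(1)` still returns the files as they were at revision 1, no matter how many check-ins followed.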

When multiple people are working on the same file, problems can occur. If you’re working with text files, many of those problems can be fixed automatically. For example, if person A is working on a subroutine and person B is working on another subroutine in the same file, and both check in their changes, these changes can often be merged. If the changes don’t conflict, both can be applied. Note, however, that in the systems I know this only works on text files. It doesn’t work on images, UML diagrams or Word documents.
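The idea of merging non-conflicting changes can be sketched with Python’s standard `difflib` module. This is a simplified, line-based illustration, not the algorithm Subversion or CVS actually uses; the function names `changes` and `merge` are made up, and the sketch is deliberately conservative in that edits which merely touch each other already count as a conflict.

```python
from difflib import SequenceMatcher


def changes(base, edited):
    """List (start, end, new_lines) for every region of base that was edited."""
    matcher = SequenceMatcher(None, base, edited, autojunk=False)
    return [(i1, i2, edited[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]


def merge(base, ours, theirs):
    """Apply both people's edits to base, or fail if they overlap."""
    mine, yours = changes(base, ours), changes(base, theirs)
    for a in mine:
        for b in yours:
            # Identical edits are harmless; overlapping or touching ones are not
            if a != b and a[0] <= b[1] and b[0] <= a[1]:
                raise ValueError("conflict near line %d" % (a[0] + 1))
    merged = list(base)
    combined = mine + [c for c in yours if c not in mine]
    # Apply from the bottom of the file upwards so earlier indexes stay valid
    for start, end, new_lines in sorted(combined, key=lambda c: c[0], reverse=True):
        merged[start:end] = new_lines
    return merged
```

If A changes the body of the first subroutine and B the body of the second, `merge` returns a file containing both changes; if both change the same line differently, it raises `ValueError` and a human has to sort it out.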

Now, for which files should version control be used and for which shouldn’t it be used? Personally I’m in favour of using it for all kinds of files, both source code and (Word) documents. People argue that, because Word documents can’t be merged, it’s not very useful, but I beg to differ. Indeed, Word documents can’t be merged, so you have to figure out a mechanism to prevent two people from working on one at the same time, but it still has many advantages:
* Version control: that’s why we were considering this in the first place, wasn’t it? Old versions of documents should still be retrievable, even if you can’t see the differences between the versions in as pretty a fashion as with text files.
* It’s a convenient way for the distribution of files. Version control systems are easier to use than e-mail, forum attachments or FTP.

The best way I’ve seen yet to prevent two people from working on a document simultaneously is just to have a “What are you working on?” topic on your forum. If someone’s going to work on a document, let him or her post a message stating the status of this work. When you want to edit a document, you first check whether somebody else is already working on it. It’s not ideal, but it works.
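What such a forum topic boils down to is a small table of who has claimed which document. A minimal Python sketch of that convention, where the `EditBoard` class and its method names are purely hypothetical:

```python
class EditBoard:
    """Toy 'What are you working on?' board: one claim per document."""

    def __init__(self):
        self._claims = {}  # maps document name to the person editing it

    def claim(self, document, person):
        """Announce that person starts editing; False if someone else has it."""
        holder = self._claims.setdefault(document, person)
        return holder == person

    def release(self, document, person):
        """Announce that person is done; only the holder can release."""
        if self._claims.get(document) == person:
            del self._claims[document]

    def who(self, document):
        """Return who is working on the document, or None if it is free."""
        return self._claims.get(document)
```

Like the forum topic, this only works as long as everybody actually checks the board before opening a document; nothing stops someone from ignoring it.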

*Software to use*
OK, you’ve decided to use version control. Great choice! Now you still need software to accomplish this. Personally, and I’m not at all alone in this, I’m very fond of “Subversion”:http://subversion.tigris.org. Subversion is a successor to the well-known CVS (Concurrent Versions System) with some issues fixed. There are both servers and clients available for most platforms. For Windows there’s a very easy-to-use client, called “TortoiseSVN”:http://tortoisesvn.tigris.org, that integrates nicely into Windows Explorer.
