Saturday, June 6, 2009

SVN+SSH and the post-commit hook

To aid my Python programming, I recently set up a Subversion (SVN) repository on one of the spare computers in the office. SVN is basically a way of maintaining versions (or revisions) of a project. Messages can be attached to each update that is "committed" to the repository, and differences between revisions can be viewed, which makes it particularly suited to programming projects. I also set SVN up to use SSH as a secure method of accessing the code on that computer.


On its own, SVN isn't terribly exciting, so to jazz things up I added a Trac interface to the repository. Trac maintains tickets (for correcting or adding parts of the project) and project components (useful for modularising a project), and has a built-in wiki which can hold documentation for the code. Trac can also close tickets and add references to them by reading the commit messages when a user commits code to the repository. This is done through hooks, which are part of the SVN repository and are executed on particular events, such as before or after files are committed. I enabled the post-commit shell script, copied in the documented Trac scripts and made them available to run on the server, but the Trac database would never update with the tickets that were closed or referenced.
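To give a feel for what reading ticket references out of a commit message involves, here is a minimal sketch in Python. It is not the Trac script itself: the keyword lists, the function name parse_commit_message and the rule of one ticket number per keyword are all my own assumptions for illustration.

```python
import re

# Hypothetical keyword lists for illustration; the real hook defines its own.
CLOSE_WORDS = ("close", "closes", "closed", "fix", "fixes", "fixed")
REFERENCE_WORDS = ("refs", "references", "see", "addresses", "re")

# A keyword followed by a single ticket number, e.g. "fixes #12".
_PATTERN = re.compile(
    r"\b(?P<word>%s)\s+#(?P<ticket>\d+)" % "|".join(CLOSE_WORDS + REFERENCE_WORDS),
    re.IGNORECASE,
)


def parse_commit_message(message):
    """Return (tickets_to_close, tickets_to_reference) as sorted lists."""
    to_close, to_reference = set(), set()
    for match in _PATTERN.finditer(message):
        ticket = int(match.group("ticket"))
        if match.group("word").lower() in CLOSE_WORDS:
            to_close.add(ticket)
        else:
            to_reference.add(ticket)
    return sorted(to_close), sorted(to_reference)
```

With this sketch, a message such as "Fixes #12 and refs #7" would yield ticket 12 to close and ticket 7 to reference. A post-commit hook would then pass those numbers on to the ticket database, which is the step that was silently failing for me.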


My SVN process was always running as the root user, so permissions shouldn't have been a problem for it. The post-commit file had execute permissions enabled for the user and group (set to apache and a group of users actively using SVN). I had even added a line giving it somewhere to temporarily store the Python eggs it uses, but still it wouldn't update the Trac database. To top things off, running the script as the Apache user to 'check' revision 100 (sudo -u apache env - /svn/repo/MyRepo/hooks/post-commit /svn/repo/MyRepo/ 100) returned no errors and completed the database update successfully.


I was about to give up when I remembered that most of the people I had seen demonstrate the post-commit hook were relying on an SVN username or were going through Apache's WebDAV (and so the Apache user). As we were using SVN+SSH as the protocol of choice, we were logging in through our own user accounts. It is under those accounts that the post-commit script is run, meaning that the group needs sufficient permissions to run the file. As stated earlier, the post-commit script already had sufficient privileges to execute, but since it calls the Python script "trac-post-commit-hook", that too needed the appropriate group read and execute privileges. I amended this, but the Trac database still didn't update.
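A quick way to check the group bits on both scripts is sketched below. The helper name group_can_read_and_execute is hypothetical, and it only inspects the file's permission bits; it does not confirm that a given user actually belongs to the file's group.

```python
import os
import stat


def group_can_read_and_execute(path):
    """True if the file's group permission bits include both read and execute."""
    mode = os.stat(path).st_mode
    needed = stat.S_IRGRP | stat.S_IXGRP
    return (mode & needed) == needed
```

Running this against the post-commit script and against trac-post-commit-hook would show whether either one is missing the group bits.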


I then ran the post-commit script under my own username, where I received a few error messages. It was here that I saw the offending error: the Trac database couldn't be written to. The solution was to make the Trac database (trac.db by default) and the directory containing it writable by all users that use SVN. Upon making this change, the Trac database now updates.
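In hindsight, a small check run under my own account would have found this sooner. The sketch below is only an illustration: the function name trac_db_write_problems is made up, and the path passed to it would be wherever your own trac.db lives. It checks the directory as well as the file because, as far as I understand, SQLite creates its journal file alongside the database.

```python
import os


def trac_db_write_problems(db_path):
    """List the reasons the current user could not write to the database."""
    problems = []
    directory = os.path.dirname(os.path.abspath(db_path))
    # The database file itself must be writable (a missing file also fails here).
    if not os.access(db_path, os.W_OK):
        problems.append("database file is not writable: " + db_path)
    # The containing directory must be writable too.
    if not os.access(directory, os.W_OK):
        problems.append("directory is not writable: " + directory)
    return problems
```

An empty list means that the user running the check should be able to update the database; anything else points at the file or directory whose permissions need changing.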



Contents of Objects in Python

Lately, I've been programming a lot in Python, which has proven to be quite enlightening for me. My method of learning a new programming language has been to solve a problem through software and pick up the syntax, functions and layout along the way. Normally, I'd look at pre-existing libraries or even source code in that language to get an idea of what is available. Python was no different, but to aid my learning, I chose to use an interactive interface to Python: IPython.


One thing always puzzled me: if I created an instance of a class, say A = myObject(), and then typed A at the prompt, how could I control the output so that, instead of something like <instance 'myObject' object at 0xlocation>, it showed something more useful?


Nowhere could I find this explicitly written, although now that I know what to look for, I do see documentation on it. When A is evaluated at the prompt in either IPython or Python, the interpreter calls A.__repr__() to obtain something to display. I took the opportunity to display a summary of the data contained in the object. This is done by defining __repr__() in the class so that it returns a string (str in Python). Likewise, if I were to write print A, Python would first look for A.__str__() and, if it is undefined, fall back to A.__repr__(). Thus, it is sometimes useful to display more information in __repr__() and leave __str__() for minimalist information (although this will depend on the application). For example:


class myObject:
    def __init__(self, contents):
        self.contents = contents

    # Shown when A is evaluated at the interactive prompt
    def __repr__(self):
        return "Contents of container: " + str(self.contents)

    # Used by print A
    def __str__(self):
        return str(self.contents)


Indeed, one may notice the difference between the output of __repr__() and __str__() for a NumPy array: the former starts with the word "array" and encloses, in parentheses, much the same output as the latter (with commas added between the elements).
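The fallback from __str__() to __repr__() described above can be checked with a small experiment. The class names OnlyRepr and Both are made up for this sketch, and it calls repr() and str() directly rather than using print, so that it behaves the same regardless of Python version.

```python
class OnlyRepr:
    """Defines __repr__() but not __str__()."""

    def __init__(self, contents):
        self.contents = contents

    def __repr__(self):
        return "Contents of container: " + str(self.contents)


class Both(OnlyRepr):
    """Adds a shorter __str__() on top of the inherited __repr__()."""

    def __str__(self):
        return str(self.contents)


a = OnlyRepr([1, 2, 3])
b = Both([1, 2, 3])

# With no __str__() defined, str(a) falls back to __repr__().
repr_a, str_a = repr(a), str(a)
# With both defined, the two calls give different strings.
repr_b, str_b = repr(b), str(b)
```

Here str_a and repr_a are the same string, while str_b is just the list and repr_b keeps the longer summary.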

