Friday, March 1, 2013

Importing 16-bit TIFFs into NumPy

A colleague approached me today asking whether I'd tried importing TIFFs into Python for image processing with SciPy. My image processing experience has mainly been within a MATLAB environment, so we set about going through this together. Unfortunately, none of the images we loaded with imread and displayed with imshow came out properly: each one showed up as a plain white image. Inspecting the imread output showed that it had returned a matrix of 255s, despite the source TIFF containing a greyscale image of something more interesting. Further investigation was thus warranted.

I first looked at the properties of the greyscale TIFF image and saw it was saved in 16 bits. I initially overlooked this, thinking it was the default. To cut a long story short, I realise now that this was the key fact, as a copy of the file converted down to 8 bits imported without any problem. So we were faced with a dilemma. Do we: 1. convert an entire library of TIFF files from 16 bits to 8 bits, 2. revert to using MATLAB, or 3. try to get imread to open 16-bit TIFFs? Option 1 was immediately rejected as the library was enormous, leaving options 2 and 3. We decided to continue with Python and investigate a solution.

A quick search on the Internet revealed that others had experienced this problem, but few offered workable solutions. One who did was another blogger, Philipp Klaus, who listed several methods to overcome it. Installing some of the packages these required proved too challenging on the Mac with its limited Python library, so I stopped. Fortunately, my colleague picked up where I left off and found an interesting comment on this site, posted by Mathieu Leocmach. It included a snippet of code that my colleague had tried on the data, but which did not work as expected. Here's the code in full:


import numpy as np
import Image

def readTIFF16(path):
    """Read 16bits TIFF"""
    im = Image.open(path)
    out = np.fromstring(
        im.tostring(),
        np.uint8
        ).reshape(tuple(list(im.size)+[2]))
    return (np.array(out[:,:,0], np.uint16)<<8)+out[:,:,1]



Looking over the code, I noticed that the variable out has three dimensions, the third selecting which of the two bytes of each pixel to use. According to the code, the most significant byte is at index [:, :, 0] and the least significant at [:, :, 1]. When we tried it out, the resulting sum was mostly noise, although certain elements appeared to have some order. Further analysis revealed that the more interesting aspects of the image were contained within [:, :, 0], and the noise within [:, :, 1]. However, [:, :, 1] also showed some order despite being mostly noise, so we couldn't simply reject this byte. This indicated a possible swap in byte order, which may be the result of endianness. Changing the code to the following allowed us to see the image:


import numpy as np
import Image

def readTIFF16(path):
    """Read 16bits TIFF"""
    im = Image.open(path)
    out = np.fromstring(
        im.tostring(),
        np.uint8
        ).reshape(tuple(list(im.size)+[2]))
    return (np.array(out[:,:,1], np.uint16)<<8)+out[:,:,0]
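The effect of swapping the two bytes can be reproduced without any image at all. The following is a minimal sketch that uses only Python 3's standard struct module rather than NumPy or PIL; the pixel values and the combine helper are made up for illustration. It packs a few 16-bit values least significant byte first and then rebuilds them with both byte orders:

```python
import struct

# Hypothetical 16-bit pixel values, stored least significant byte first
pixels = [0, 255, 256, 65535]
raw = struct.pack("<4H", *pixels)

def combine(data, msb_first):
    """Rebuild 16-bit values from consecutive byte pairs (illustrative helper)."""
    values = []
    for i in range(0, len(data), 2):
        first, second = data[i], data[i + 1]
        if msb_first:
            # Treat the first byte of each pair as the most significant
            values.append((first << 8) + second)
        else:
            # Treat the second byte of each pair as the most significant
            values.append((second << 8) + first)
    return values

wrong = combine(raw, msb_first=True)   # byte order assumed incorrectly
right = combine(raw, msb_first=False)  # byte order matches the data
```

With the wrong assumption, small values such as 255 are blown up to 65280 while 256 collapses to 1, which is the kind of scrambling that makes an image look like noise with traces of structure.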



But if we're writing code like this, there's no reason to split the original image into two byte sequences at all. We can accomplish the same task by reading the data as np.uint16 straight away instead of np.uint8, which removes the need for a third dimension in the output. I settled on the following bit of code, which worked well for us. Note that this code could, of course, be tidied up further, but I won't do that here.


import numpy as np
import Image

def readTIFF16(path):
    """Read 16bits TIFF"""
    im = Image.open(path)
    out = np.fromstring(
        im.tostring(),
        np.uint16
        ).reshape(tuple(list(im.size)))
    return out



Now we're able to use 16-bit greyscale images within Python without having to resort to converting entire image libraries or using other packages or software. It's possible that the original code will work for some people and that our version will look like white noise to them. I believe this would be due to endian differences between the data and the computer. Hopefully future versions of Python will address this issue and make the necessary checks on TIFF source files so that we don't have to find workarounds. Not that we don't enjoy finding and creating these solutions!
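One check that can be made by hand: a TIFF file declares its own byte order in its first two bytes, "II" for little-endian and "MM" for big-endian. Below is a rough sketch of using that marker to decode 16-bit samples explicitly, again with only the standard library. The function names and the sample bytes are hypothetical, and a real reader would also have to locate the pixel data inside the file, which this does not attempt.

```python
import struct

def tiff_byte_order(header):
    """Map the first two bytes of a TIFF file to a struct byte-order character."""
    marker = bytes(header[:2])
    if marker == b"II":   # "Intel" order: least significant byte first
        return "<"
    if marker == b"MM":   # "Motorola" order: most significant byte first
        return ">"
    raise ValueError("not a TIFF byte-order marker: {!r}".format(marker))

def decode_uint16(raw, order):
    """Decode a byte string as unsigned 16-bit integers in the given order."""
    count = len(raw) // 2
    return list(struct.unpack("{}{}H".format(order, count), raw[:2 * count]))

# The same four bytes give different pixel values under each byte order
sample = b"\x01\x00\x00\x01"
little = decode_uint16(sample, tiff_byte_order(b"II*\x00"))
big = decode_uint16(sample, tiff_byte_order(b"MM\x00*"))
```

Being explicit about the order means the result no longer depends on the endianness of the computer doing the reading.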

Wednesday, February 24, 2010

A Problem when Importing in Python

Last night, I had some problems importing a Cython-created object within Python. The silly thing is that I remembered encountering this problem before in a Linux-based setting but hadn't made a note of what I had done to overcome it. It didn't help that Python's error message claimed one of the functions in the generated C file was problematic, and, worse still, one computer worked fine whilst this one didn't, despite my following the same procedure.

It was time to delve into the potential problems: permissions, distribution incompatibility, some bug with the script or something else. The first I had encountered when using my Ubuntu machine – it was particularly unfriendly in that I had to ensure everyone had execution privileges to run my library. I didn't have to sudo on my Mac, so privilege issues would be surprising more than anything. The second was equally unlikely as I was using the same Python "kitchen sink included" distribution (Enthought's EPD) on my other Mac which worked unhindered. Implicitly, it ruled out the third option which only left "something else" which didn't really help.

After getting frustrated and replacing the compiler (Apple's Xcode tools), I was still in the same mess. I attempted to copy a pre-built library from one Mac onto the other (they're virtually identical machines, so this shouldn't have been a problem), yet I still encountered the error message mentioning a problem with the C code, despite the library having been built successfully.

It was only after using iPython that I began to notice that the first time I imported my Cython object I would encounter the error, while the second time would always work unhindered. This suggested to me that there was perhaps a conflict in the libraries somewhere. Sure enough, in the directory where I was executing the code there was a misplaced (and ancient) compiled .so library file, rather than in the build directory where it should have been. Upon deleting it from the base directory, importing the library worked first time.
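A quick way to spot this sort of shadowing is to ask Python which file it would load for a given module name before importing it. This is only a sketch, assuming a Python 3 interpreter (importlib.util.find_spec did not exist when I hit this problem), and module_origin is a name I've made up:

```python
import importlib.util

def module_origin(name):
    """Return the path a top-level module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin

# A standard library module resolves to a file inside the Python installation
json_origin = module_origin("json")
# A name that does not exist anywhere on the search path resolves to nothing
missing_origin = module_origin("surely_missing_module_xyz")
```

If the path reported for your own extension module points at the current directory rather than the build directory, a stray copy is being picked up first.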

I'm mainly writing this so that when I encounter this problem again (and I'm very likely to), I can hopefully at least remember that I wrote about it somewhere. It should have been one of the first things I looked for, but I failed to, as everything worked perfectly on one computer with the same suite of applications. Poor excuse, I know!

Saturday, June 6, 2009

SVN+SSH and the post-commit hook

To aid my Python programming, I have recently set up a Subversion (SVN) repository on one of the spare computers in the office. SVN is basically a way of maintaining versions (or revisions) of a particular project. Messages or comments can be attached to the files or updates that are "committed" to the repository, and the differences between revisions can be viewed, making it particularly suited to programming projects. I also set up SVN to use SSH as a secure method of accessing the code on that computer.

However, on its own, SVN isn't terribly exciting, so to jazz things up I added a Trac interface to the repository. Trac maintains tickets (for correcting or adding parts to the project) and project components (useful for modularising a project), and has a built-in wiki which can hold documentation for the code. Trac can also close and create references to tickets by reading the commit message when a user commits code to the repository. This is done through hooks, which form part of the SVN repository and are executed on particular events, such as before and after files are committed. I enabled the post-commit shell script, copied in the documented Trac scripts and made them available to run on the server, but the Trac database would never update with the tickets that were closed or referenced.
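To give a flavour of what the hook does with a commit message, here is a much simplified sketch of picking out ticket commands. It is not the actual Trac script: the word lists, the pattern and the ticket_actions name are my own assumptions about the general idea, and it only understands a single "word #number" form.

```python
import re

# Assumed command words; the real hook's vocabulary may differ
CLOSE_WORDS = {"close", "closed", "closes", "fix", "fixed", "fixes"}
REFERENCE_WORDS = {"addresses", "re", "references", "refs", "see"}

# A word followed by whitespace and a ticket number such as "#12"
COMMAND = re.compile(r"\b([A-Za-z]+)\s+#(\d+)")

def ticket_actions(message):
    """Return (action, ticket number) pairs found in a commit message."""
    actions = []
    for word, number in COMMAND.findall(message):
        word = word.lower()
        if word in CLOSE_WORDS:
            actions.append(("close", int(number)))
        elif word in REFERENCE_WORDS:
            actions.append(("reference", int(number)))
    return actions

found = ticket_actions("Fixes #12 and refs #34; see also #5")
```

In this example "also #5" is ignored because "also" is not a recognised command word, so only tickets 12 and 34 are acted upon.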

My SVN process was always running as the root user, so permissions weren't exactly a problem for it. The post-commit file had execute permissions enabled for the user and group (set to Apache and a user group actively using SVN). I had even added a line to temporarily store the Python egg scripts it uses, but still it wouldn't update the Trac database. To top things off, running the hook as the Apache user to 'check' revision 100 (sudo -u apache env - /svn/repo/MyRepo/hooks/post-commit /svn/repo/MyRepo/ 100) would return no errors and complete the database update successfully.

I was about to give up when I remembered that most people I had seen demonstrate the post-commit hook were relying on an SVN username or were going through Apache's WebDAV (and so the Apache user). As we were using SVN+SSH as the protocol of choice, we were logging in through our own user accounts. It is through those accounts that the post-commit script is activated, meaning that the group would need sufficient permissions to run this file. As stated earlier, the post-commit script already had sufficient privileges to execute, but as it called the Python script "trac-post-commit-hook", that too needed the appropriate group read and execute privileges. This was amended, but the Trac database still didn't update.

I thus ran the post-commit script under my own username, whereupon I received a few error messages. It was here I saw the offending error: the Trac database couldn't be written to. The solution was to make the Trac database (trac.db by default) and its working directory writable by all users of the SVN repository. Upon making this change, the Trac database now updates.
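A check along these lines, run under each user's own account, would have found the problem much sooner. It is only a sketch: write_access_report is a made-up name, and the temporary file below merely stands in for the real trac.db so that the example can run anywhere.

```python
import os
import tempfile

def write_access_report(db_path):
    """Report whether the current user may write to a database file and its directory."""
    directory = os.path.dirname(os.path.abspath(db_path))
    return {
        "database": os.access(db_path, os.W_OK),
        "directory": os.access(directory, os.W_OK),
    }

# Demonstration on a throwaway file standing in for trac.db
with tempfile.TemporaryDirectory() as folder:
    demo_db = os.path.join(folder, "trac.db")
    open(demo_db, "w").close()
    demo_report = write_access_report(demo_db)
```

Both entries need to be True for the user running the hook; the directory matters as well as the file because that is where the database needs write access, as described above.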

Contents of Objects in Python

Lately, I've been programming a lot in Python, which has proven to be quite enlightening for me. My method of learning a new programming language has been to solve a problem through software and learn the syntax, functions and layout on my journey. Normally, I'd look at pre-existing libraries or even source code in that language to get an idea of the things that are available. Python was no different, but to aid my learning, I chose to use an interactive interface to Python: iPython.

One thing always puzzled me: if I created an object, say A as an instance of myObject (that is to say, A = myObject()), and then typed in A, how could I control the output so that it showed something more useful than <instance 'myObject' object at 0xlocation>?

Nowhere could I find this explicitly written, although now that I know what to look for, I do see documentation on it. When A is entered at the prompt in either iPython or Python, the interpreter calls A.__repr__() to display some useful information. I took the opportunity to display a summary of the data contained by the object. This is done by defining a __repr__() method in the class that returns a string object (str in Python). Likewise, if I were to write print A, Python would first look for A.__str__() and, if that is undefined, fall back to A.__repr__(). Thus, it is sometimes useful to display more information in the __repr__() call and leave __str__() for minimalist information (although this will depend on the application). For example:

class myObject:
    def __init__(self, contents):
        self.contents = contents

    def __repr__(self):
        return "Contents of container: " + str(self.contents)

    def __str__(self):
        return str(self.contents)

Indeed, one may notice the same difference between the output of __repr__() and __str__() for a NumPy array: the former includes the word "array" at the start and encloses the output of the __str__() call.
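The fallback from __str__() to __repr__() is easy to see by comparing two small classes. This is a sketch in Python 3 syntax, and the class names Verbose and ReprOnly are invented for the purpose:

```python
class Verbose:
    """Defines both methods: a detailed __repr__ and a minimal __str__."""
    def __init__(self, contents):
        self.contents = contents

    def __repr__(self):
        return "Contents of container: " + repr(self.contents)

    def __str__(self):
        return str(self.contents)

class ReprOnly:
    """Defines only __repr__, so str() has to fall back to it."""
    def __init__(self, contents):
        self.contents = contents

    def __repr__(self):
        return "ReprOnly(" + repr(self.contents) + ")"

a = Verbose([1, 2, 3])
b = ReprOnly([1, 2, 3])

shown_at_prompt = repr(a)  # what typing a at the prompt displays
shown_by_print = str(a)    # what print uses
fallback = str(b)          # no __str__ defined, so __repr__ is used instead
```

Here shown_at_prompt is "Contents of container: [1, 2, 3]", shown_by_print is "[1, 2, 3]", and fallback is identical to repr(b).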

Saturday, May 24, 2008

External commands in MATLAB

For some time now, I have often wondered why MATLAB highlighted text starting with a ! in a different colour. I thought nothing of it until recently when I was, as always, randomly entering code into MATLAB and had accidentally put an exclamation mark in the command window (I can't remember the exact reason why I was using exclamation marks). The result was quite interesting: it sent the command to the command interpreter outside of MATLAB which, of course, flagged this as an error. In seeing this, I stopped what I was doing and started to play about with this newly discovered shortcut.

My first command was !bash which, being in Linux, should run the BASH shell... which it did. I could then use MATLAB as a glorified terminal window. Whilst in BASH, I decided to check for updates for the machine, which required me to use sudo followed by the update command (as I use Ubuntu, that turns out to be simply sudo apt-get update). As always (with the exception of doing this in MATLAB), I was prompted for my password. Interestingly, MATLAB behaved like the terminal window and didn't display my password, unlike what normally happens when one Telnets to a POP3 server (a hobby of mine). I guess this would be due to a so-called seamless "pipe" between MATLAB and, in this case, BASH.

From here, one can go a step further and build m-files that run external commands. Of course, there's the issue of platform dependency here (mainly for the UNIX group), but one can work around this. A nice feature (at least in the UNIX version) is that ! commands run in the current working directory (try !pwd and await a response), which means navigation isn't a problem. One minor niggle that I haven't been able to resolve just yet is how to take a result from the external command without resorting to capturing the output manually using the command > output.log method.

So let's create something that works out which platform we're running on from within MATLAB and, on UNIX, obtains the flavour of UNIX using the uname command:

if isunix
    % In Unix, determine OS type
    !uname -s > unixname.txt
    % Now read this file:
    fid = fopen('unixname.txt', 'rt');
    if feof(fid) == 0 % Determine if at end of file
        osname = fgetl(fid); % Get line
    else
        disp('File ended early');
    end
    fclose(fid); % Close the file once we're done with it
elseif ispc
    disp('I''m in MS Windows');
else
    disp('Impossible: You''re not running either Windows or UNIX?');
end

Hopefully you are able to see the potential here in executing external applications based on what operating system environment is currently being used. Both Windows and UNIX support the command > output.log method (which you may remember from the Queue system). In fact, I frequently create script files using a combination of echo and the "append" verb >> where I can get away with it.
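For comparison, since most of my recent scripting has been in Python, the same platform check there can capture the command's output directly, without the intermediate file. This is a sketch assuming Python 3.5 or later, and the function name detect_os is made up:

```python
import os
import subprocess

def detect_os():
    """Return the operating system name, asking uname on UNIX-like systems."""
    if os.name == "posix":
        # Capture the output of uname -s directly instead of redirecting to a file
        result = subprocess.run(
            ["uname", "-s"],
            stdout=subprocess.PIPE,
            universal_newlines=True,
            check=True,
        )
        return result.stdout.strip()
    if os.name == "nt":
        return "Windows"
    return "unknown"

detected = detect_os()
```

On my Ubuntu machine I would expect this to give Linux, but the exact string depends on what uname reports for your system.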

Sunday, May 11, 2008

LaTeX in MATLAB

Not too long ago, I was entering random commands into MATLAB (read: I was bored, so wondered if certain commands existed). I finally came across a few "games" which would have a practical side to them:

• xpbombs for a Minesweeper equivalent

• fifteen for a puzzle game

• sf_tictacflow for naughts and crosses/Tic-Tac-Toe

• sf_tetris for Tetris

It was after this search that I wandered into the LaTeX command, which was cunningly called latex. This command makes it possible to port equations and matrices from the MATLAB workspace to your LaTeX document.

The latex command makes use of the Symbolic Toolbox which is available with MATLAB (providing you have the appropriate licence). The Symbolic Toolbox can easily calculate integrals and Taylor expansions algebraically (however it doesn't show workings if you're looking for the intermediate steps). It can even convert matrices and vectors into LaTeX providing you convert the matrix or vector into a symbolic object first.

To convert an ordinary double object into a symbolic object, one simply has to wrap the double object in a sym call, i.e. output = sym(x). The resulting object can then be processed for LaTeX conversion using latex(output). In the case of matrices, the output even includes the dynamic brackets that resize depending on their contents (i.e. the \left[ and \right]) which, of course, can be changed to whatever you want once it's imported into your document.
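To show the general shape of what comes out for a matrix, here is a small Python sketch that builds a comparable string by hand. It is purely illustrative: matrix_to_latex is a hypothetical helper, and the exact spacing and environment that MATLAB's latex produces will differ.

```python
def matrix_to_latex(rows):
    """Format a list of rows as a LaTeX array wrapped in resizing brackets."""
    # Entries in a row are separated by &, rows by a double backslash
    body = " \\\\ ".join(" & ".join(str(value) for value in row) for row in rows)
    # One centred column per entry in the first row
    columns = "c" * len(rows[0])
    return "\\left[\\begin{array}{" + columns + "} " + body + " \\end{array}\\right]"

example = matrix_to_latex([[1, 2], [3, 4]])
```

The string in example can be pasted straight into an equation environment, and the \left[ and \right] can be swapped for other delimiters as described above.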

Taylor expansions? No problem. Simply assign a variable like x to be symbolic by either typing syms x or x = sym('x'), then output = taylor(exp(x)) should you wish to see the Taylor expansion of e^x. As before, LaTeX-ify the output by using latex(output) which you can then copy into your document.
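For reference, the expansion of e^x about zero is the familiar series below. I'm assuming here that the toolbox truncates after the fifth-order term by default, so check how many terms your version actually returns:

```latex
e^{x} = \sum_{n=0}^{\infty} \frac{x^{n}}{n!}
\approx 1 + x + \frac{x^{2}}{2} + \frac{x^{3}}{6} + \frac{x^{4}}{24} + \frac{x^{5}}{120}
```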

If copying and pasting is getting a little too much, you could use the built-in clipboard command, which will move the output onto the clipboard for you. All you'll have to do then is paste the contents of the clipboard into the application of your choice. Easy. Here's the code: clipboard('copy', latex(output)).

What obviously comes next is a quick script that does this all for you depending on the input. Like many of my scripts, I simply append the word "it" to the original function name; so in this case I created a latexit.m file:

function output = latexit(input, copyToClipboard)
output = [];
if nargin == 1
    copyToClipboard = true;
elseif ~islogical(copyToClipboard)
    warning('Second argument should be logical. Assuming value is false');
    copyToClipboard = false;
end
if isa(input, 'sym') % Symbolic object
    output = latex(input);
else % Another type of object
    try
        output = latex(sym(input));
    catch
        error('Could not process input. Try converting object to a symbolic form manually and then reprocess');
    end
end
if copyToClipboard
    clipboard('copy', output);
end

That should be it. A script file that could convert matrices and output them as LaTeX in addition to moving the LaTeX code into the clipboard (which can be disabled by supplying false as the second argument). If you were really lazy, you could wrap the LaTeX code in an equation environment too, but there's little point in doing that here.

Further investigation into how this works reveals that it goes deeper than the Symbolic Toolbox as it turns out it uses the Maple engine to turn the symbolic object into LaTeX output. Why reinvent the wheel when someone's already done a fine job?

Tuesday, May 6, 2008

Linux, Compiz and MATLAB

Recently, a new version of Ubuntu was released and, being an Ubuntu user, I installed this latest version. For some time now, I've not been able to run MATLAB with the fancy window graphics switched on without displaying an almost blank (grey) screen. That is until finding a helpful post on the Ubuntu Forums.

For those that don't know, the "fancy window graphics" within Ubuntu and, from what I gather, other distributions of Linux are produced by a component known as Compiz which is capable of producing many interesting effects such as water ripples, a desktop cube plus many more effects.

The reason why MATLAB displays a grey box when Compiz is enabled is down to the way Java attempts to interact with window managers. According to one post I read, the problem arises because the Java toolkit code assumes that all window managers re-parent windows. If a window manager is running, the Java toolkit waits for new windows to be re-parented before it starts handling certain events on them. Since Compiz, Compiz Fusion and some other window managers do not re-parent windows, Java-based applications end up waiting forever.

There was a time, a year or two ago, when the fix consisted of altering the Java source, which was awfully messy and didn't work for every distribution of Linux. This time, there's a better workaround which doesn't involve the AWT_TOOLKIT patch (which I, personally, could never get working). This one is simple and makes use of the latest version of Sun's Java (as Sun eventually got around to patching this problem). MATLAB, by default, ships with its own Java Virtual Machine (JVM) and so is independent of the version that exists (or doesn't exist) on your workstation. We therefore have to override this by typing the following into the Terminal:

export MATLAB_JAVA=/usr/lib/jvm/java-6-sun/jre/

This simply exports the variable MATLAB_JAVA, which then exists for the lifespan of the Terminal window. Note that you should replace the path I've used with the path to the latest Sun version of the Java Runtime suite on your machine.

From here, one can type in matlab -desktop or, simply, matlab (I am the proud owner of a broken installation which requires me to enter the -desktop argument). Typing this in every time you want to open up MATLAB could be described as arduous, so the creation of a script file that could do this for you sounds quite enticing.

If, like me, you enjoy creating random scripts that do numerous things (like connect via SSH with key authentication), you probably have a folder where you store all your scripts. If not, it's not too late to start: create a new folder within your home folder by typing mkdir bin. If you're lucky, this will be added automatically to your $PATH variable for you (as it's defined within the .profile in your home directory). Next, type in cd bin, where bin is the location of your script files. Off the top of your head, think of a name for the command that will load MATLAB with everything set up for you; I'll pick matlabgo. Once you've decided, continue and replace matlabgo with the name of your choice. Type:

echo "export MATLAB_JAVA=/usr/lib/jvm/java-6-sun/jre/" > matlabgo
echo "matlab -desktop" >> matlabgo
chmod u+x matlabgo

So we've basically created the file (without use of any editor, mind you) and then changed the permissions of our script so that we can execute (and, hence, use) it. You should now be able to type matlabgo into the Terminal and even link up shortcuts that use this script file.

For those that are interested in my source, head to: http://ubuntuforums.org/showthread.php?t=635142 for more information.