Document thread safety!

I recently wasted almost an entire day tracking down memory corruption caused by a thread safety issue.  I was working with libMemcached, whose website says right on the front page that the library is thread safe.  Their documentation also says that the library is thread safe.  However, halfway down the page, it says “memcached_st structures are thread-safe, but *each thread must contain its own structure*” (emphasis mine).  First of all, what does the author think thread safety means, if threads can’t share data?  Second, this whole experience highlighted an important fact: properly documenting thread safety (or lack thereof) is critical.

A responsible developer can verify documentation.  If the docs say that a function returns -1 when passed an invalid argument, the developer can simply try it and see.  Thread safety, however, is harder to check.  First, even thread-safe multithreaded programs are not necessarily deterministic, so a simple deterministic test can’t always be written.  Second, thread safety issues are not always obvious: a problem may occur in only 1 out of 100 runs, and only when using many threads.  This is why it is vitally important to document thread safety.  Not just whether a function (or class, or whatever) is thread safe, but what the assumptions are and what the guarantees are.  This is a good idea for any aspect of the code, but it is absolutely necessary for properties (like thread safety) that are difficult to test empirically, where the developer must rely on the documentation.
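To make that concrete, here is the kind of documentation I mean, on a hypothetical Java class (the class and its contract are invented for this example):

/**
 * A counter that may be shared freely between threads.
 *
 * Thread safety: all methods may be called concurrently from any number
 * of threads without external synchronization.
 * Guarantee: increment() is atomic; get() returns a value that was
 * current at some instant during the call.
 * Assumptions: none; a single instance may be shared across threads.
 */
public class SharedCounter {
	private final java.util.concurrent.atomic.AtomicLong value =
			new java.util.concurrent.atomic.AtomicLong();

	public void increment() { value.incrementAndGet(); }

	public long get() { return value.get(); }
}

An equally blunt statement about memcached_st (“not safe to share; create one per thread”) would have saved me a day.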

MCT now open source

The software that I worked on at NASA, Mission Control Technologies (MCT), is now open source and available on GitHub.  It’s a data monitoring platform built in Java using OSGi.  Check it out!

Finding an object file defining a symbol

I was working on a project that needed a tiny amount of functionality from another, much larger, project.  Rather than linking in all of its object files, I wanted to pull in only the ones I actually needed.  Unfortunately, some of the functions I called had very generic names, and I had no idea which object files they came from.  Shell scripting to the rescue:

#!/bin/bash
for I in *.o; do
    objdump -twC "$I" | awk '{if($2=="g") print}' | grep "$@" | sed "s/^/$I\t/"
done

To break it down:

  • objdump, as you might guess, dumps the contents of an object file.  The -t argument tells it to dump the symbol table, the -w (wide) tells it not to truncate lines, and the -C tells it to demangle C++ symbols.
  • The awk part searches for a “g” in the second field, which flags the symbol as a global.
  • grep searches for the symbol you want.  Note that you can use regexes, and you can pass grep options such as --color=always, -i (for case-insensitive matching), or -E (for extended regexes).
  • Finally, sed sticks the filename in front of the line so you know which file it came from.

This works for both functions and global variables.  Now when I get “undefined reference to `foo'”, I know where foo is.
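For example, assuming the script is saved as findsym.sh (the name is mine) and made executable:

./findsym.sh 'foo$'                  # which object file defines foo?
./findsym.sh --color=always -i foo   # case-insensitive, with highlighting

One caveat: grep sees the entire symbol-table line, not just the symbol name, so anchor patterns at the end of the line ('foo$') rather than the beginning.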

RTI Viewer on GitHub

I’ve ported an RTI viewer from C++/Qt to Java/Android.  It’s available on GitHub, so check it out.  It’s still in an alpha state, so expect bugs.  So far it has only been tested on a Kindle Fire, and it certainly requires a decent graphics processor.

Reflectance Transformation Imaging (RTI) allows the user to change the lighting of an image interactively.  Others have written desktop software for the same purpose.  I’d suggest trying it out first to see what RTI really is.

SVN GNOME keyring issues

I recently had issues checking out an SVN repo.  The problem went something like this:

adam@nynaeve:~$ svn co https://server/path
Password for 'default' GNOME keyring:
svn: OPTIONS of 'https://server/path': authorization failed: Could not authenticate to server: rejected Basic challenge (https://server)

This was a little puzzling, not least because I don’t use GNOME.  The problem seems to be a bug that the SVN guys say is a GNOME issue, and the GNOME guys say is an SVN issue.  To make a long story short, the answer was to delete ~/.gnome2/keyrings/default.keyring.  The GNOME keyring left me alone, SVN prompted for my credentials, and the day was saved.
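In concrete terms (the path is the one from my machine; it may differ on other setups):

rm ~/.gnome2/keyrings/default.keyring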

Wedding page added

Megan and I are getting married on December 17th.  I put up a page with details.

Viewing deleted-but-open files on Linux

On Linux, a file may be deleted (removed/unlinked) while a process still has it open.  When this happens, the file is essentially invisible to other processes, but it still takes up physical space on the drive.  To find out how much space is consumed by these files, run:

sudo lsof | awk '/deleted/ {sum+=$7} END {print sum}'

lsof lists all open files.  awk then matches the entries marked as deleted and sums up the file sizes (the SIZE/OFF column, field 7, in bytes).

It turns out that you can use lsof and the /proc filesystem to recover deleted files, as long as some process has them open.
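A minimal sketch of that trick (the PID and file descriptor below are made up; read the real ones from lsof's output):

# Find the process still holding the deleted file; note the PID (second
# column) and the FD column (e.g. '5r' means file descriptor 5).
sudo lsof | grep deleted
# The contents stay reachable through /proc until the process exits.
sudo cp /proc/1234/fd/5 /tmp/recovered-file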

Unit testing graphics code

I recently worked on a project that involved heavy use of custom graphics.  Specifically, lots of lines.  The code was sufficiently important and complex that it needed to be unit tested.  However, I had never unit tested drawing code before.

A simple strategy would be to paint to an image, then compare the result with a target image.  However, this has a few disadvantages:

  • It’s impossible to tell if unnecessary graphics calls were made (outside the clipping region, for example).
  • Graphics may render slightly differently on different platforms.
  • This can tell you where the output image is wrong, but not which drawing call was at fault.

I decided to mock a Graphics object and check that the drawing calls made against it were correct.  To avoid writing a lot of boilerplate code to keep track of color, line style, clipping, etc., I created a GraphicsProxy class, added a base field, and used Eclipse to generate methods that pass each call on to the delegate.  (This would have been a perfect job for java.lang.reflect.Proxy, but that only works with interfaces, and Graphics is an abstract class.  Poor foresight on the part of the original writers, but I digress.)
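The shape of that class, sketched (abstract here only to keep the sketch short; the real class forwards every Graphics2D method, and Eclipse’s “Generate Delegate Methods” writes those for you):

import java.awt.Graphics2D;
import java.awt.Shape;

public abstract class GraphicsProxy extends Graphics2D {
	protected final Graphics2D base;  // the real graphics context

	protected GraphicsProxy(Graphics2D base) {
		this.base = base;
	}

	// Two representative forwards; every other method looks the same.
	@Override
	public void drawLine(int x1, int y1, int x2, int y2) {
		base.drawLine(x1, y1, x2, y2);
	}

	@Override
	public void draw(Shape s) {
		base.draw(s);
	}
}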

This proxy class is then subclassed to count calls to drawLine and drawPolyline, and to keep track of the line segments drawn.  I also wrote a LineChecker class to compare the expected line segments with the actual line segments.  This is slightly trickier than it might sound, because the segments may be drawn in a different order, and the endpoints of individual segments may be switched.  All in all, a test method looks like this:

public void testPaint() {
	BufferedImage image = new BufferedImage(200, 200, BufferedImage.TYPE_3BYTE_BGR);
	Graphics2D imageG = image.createGraphics();
	CountingGraphics g = new CountingGraphics(imageG);

	// draw here

	LineChecker c = new LineChecker();
	c.require(19, 180, 19, 160);
	c.require(19, 160, 39, 160);
	c.require(159, 40, 159, 20);
	c.require(159, 20, 179, 20);
	c.check(g.lines);
}
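For what it’s worth, the order- and endpoint-insensitive matching can be sketched like this (a minimal version, not the LineChecker from the tarball below; it assumes segments are stored as int[4] arrays of exact integer coordinates):

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class LineChecker {
	private final List<int[]> expected = new ArrayList<int[]>();

	public void require(int x1, int y1, int x2, int y2) {
		expected.add(normalize(x1, y1, x2, y2));
	}

	// Check that every required segment was drawn, regardless of the
	// order the segments were drawn in or which endpoint came first.
	public void check(List<int[]> actual) {
		List<int[]> remaining = new ArrayList<int[]>();
		for (int[] s : actual) {
			remaining.add(normalize(s[0], s[1], s[2], s[3]));
		}
		for (int[] want : expected) {
			boolean found = false;
			for (Iterator<int[]> it = remaining.iterator(); it.hasNext();) {
				if (Arrays.equals(it.next(), want)) {
					it.remove();
					found = true;
					break;
				}
			}
			if (!found) {
				throw new AssertionError("missing segment: " + Arrays.toString(want));
			}
		}
	}

	// Store endpoints in a canonical order so (a, b) and (b, a) compare equal.
	private static int[] normalize(int x1, int y1, int x2, int y2) {
		if (x1 < x2 || (x1 == x2 && y1 <= y2)) {
			return new int[] {x1, y1, x2, y2};
		}
		return new int[] {x2, y2, x1, y1};
	}
}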

Here’s the code:
unit-testing-graphics.tar.gz

Block unwanted sites in Google searches

Google is adding a feature to let you block sites from your searches.  Yay, Google!  Goodbye, Experts Exchange.

Forcing extreme supersampling with POV-Ray

I recently worked on a project that involved rendering images with the POV-Ray raytracer.  In the particular scene I was rendering, every pixel was expected to be mostly black with a very small, very bright white spot.  I needed very high supersampling for each output pixel to come out the correct shade of gray.

POV-Ray has built-in support for anti-aliasing, but only via adaptive supersampling.  Unfortunately, that doesn’t work for my scene: most of the initial rays come back the same color (black), so the adaptive algorithm decides that further sampling is unnecessary.

Another way to approach the problem is to render at a larger resolution and reduce the final image.  However, that does not work either because of clamping.  As I said, most values would be black, but a few would be very bright.  Let’s say we have 8 rays that return black and one ray that returns a value of 1000.  When that gets saved to the image, the white value gets clamped to 255.  The reduced image would then have a value of 255/9 or 28 for that pixel.  Using the correct value of 1000 would give 1000/9 or 111.  I tried getting around the clamping issue by saving to a high dynamic range format such as EXR.  Either POV-Ray or ImageMagick (which I used to reduce the images) was still clamping values, though.

In the end, I used a new feature in the POV-Ray beta called a mesh camera.  The basic idea is to create a mesh in the scene with each polygon mapping to a pixel in the output image.  To supersample, I used multiple meshes, each offset by a small amount.  The values from the corresponding polygons in each mesh can then be averaged before being output.  Without further ado, here is the code I used:

// Our camera is orthographic
#declare ca_mesh =
  mesh {
    #local px_inc = 1; // Distance between vertices in the mesh
    #local py_inc = 1;
    #local row_count = 0;
    #while (row_count < image_height)
      #local col_count = 0;
      #local d_y = row_count * py_inc;
      #while (col_count < image_width)
        #local d_x = col_count * px_inc;
        triangle {
          <d_x, d_y, -1>
          <d_x + px_inc, d_y + py_inc, -1>
          <d_x + px_inc, d_y, -1>
        }
        #local col_count = col_count + 1;
      #end
      #local row_count = row_count + 1;
    #end
  }

// Our sample grid per pixel is sampleCount by sampleCount.
// In other words, sampleCount is the square root of the number of samples per pixel.
camera {
  mesh_camera {
    sampleCount * sampleCount
    0 // distribution #0 averages values from multiple meshes as described
    #local i = 0;
    #while(i < sampleCount)
      #local j = 0;
      #while(j < sampleCount)
        mesh {
          ca_mesh
          translate <i / sampleCount - .5, j / sampleCount - .5, 0>
        }
        #local j = j + 1;
      #end
      #local i = i + 1;
    #end
  }
}