OBriens tower
Musings on software development, Linux and business

Gone next door …

August 22nd, 2008 by stephen mulcahy

Hi. If you’re looking for my blogs on all things Linux, I’ve moved them to our other blog - http://www.atlanticlinux.ie/blog/ - please join me over there!

Quiet Noise and a 60 Minute Challenge

June 18th, 2008 by Robert Fuller

Developing good software takes time and much thought. It’s brain work. Building complex applications well is difficult at the best of times, but I imagine it would be virtually impossible in, say, the middle of a busy shopping centre or on a construction site. Noise and activity create concentration-breaking distractions. A quiet room is most conducive to developing software well.

We encourage a quiet office environment for our software engineers. We’ve separated the noisy management/support/sales types from the quiet area reserved for hard-working software engineers and I find it refreshing to walk into a respectfully quiet room of people ‘hard at work’.

But wait, I can see that it is not as quiet as it used to be. It sounds quiet but I can SEE that it’s not so quiet. 20 applications open. Popup messages showing new incoming mail. Three simultaneous conversations in instant messaging. Banner ads flashing. How can you concentrate with all that quiet noise?

If you are a software engineer with 20 applications running on your desktop including email and instant messaging, I offer you the sixty minute challenge the next time you have a software problem to solve.

Here’s the challenge:
Move to a clean desktop with no email, no instant messaging, no ‘latest news’, no programming buddy, no mp3; just you and your IDE. Put a ‘do not disturb’ sign on the side of your desk. For one hour focus all your concentration on the problem at hand.

If you take the challenge I’d be interested in your observations, so post a comment afterwards.

Quantifying simplicity in code

May 14th, 2008 by ndunne

One of the key values here at Applepie is simplicity. As a member of Applepie I strive whenever possible to deliver lean, readable, manageable code that meets its requirements and is a pleasure for other developers to use. It’s not easy. To help ‘keep it simple’ we follow a number of practices, such as code peer reviews, which at their heart have the question: “Is this simple?”. These practices lead to better code that we have qualified as manageable and elegant. Code we feel is simple.

However, qualifying or feeling something to be simple is often not enough. We want hard facts; we want to quantify how well a body of code exhibits simplicity. One answer is to analyse the code base and generate metrics that can support us in our quest to simplify. One of the best metrics I’ve found for this is Cyclomatic Complexity.

Cyclomatic Complexity (CC) measures the complexity of your code by counting the number of independent paths through it. The number of paths is the CC number. The bigger the CC number, the more likely the code is difficult to conceptualize and the less likely you can unit test it effectively. It’s a bit like navigating a road: a Y junction is fine, a crossroads is OK, but a four-way interchange is (well, for me anyway) reaching my cognitive limit.
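To make the counting concrete, here is a rough sketch of my own (it is not how PMD or any real tool works): start at 1 for the straight-line path and add 1 for every branching keyword or operator found in the source text. The class name RoughComplexity and the sample method are made up for illustration, and a text search like this will miscount keywords that appear inside strings or comments.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RoughComplexity {
    // Each of these keywords or operators adds one more path through the code.
    private static final Pattern DECISION = Pattern.compile(
            "\\b(if|for|while|case|catch)\\b|&&|\\|\\||\\?");

    /** Rough estimate: 1 for the straight-line path plus 1 per decision point. */
    static int estimate(String methodSource) {
        Matcher m = DECISION.matcher(methodSource);
        int count = 1;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical method: two ifs, one ||, one for loop.
        String source =
                "int grade(int score) {\n"
              + "  if (score < 0 || score > 100) throw new IllegalArgumentException();\n"
              + "  if (score >= 70) return 1;\n"
              + "  for (int i = 0; i < 3; i++) { }\n"
              + "  return 0;\n"
              + "}";
        System.out.println(estimate(source)); // prints 5
    }
}
```

Four decision points plus the base path gives 5 for the sample method, which is comfortably simple.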

As a rule of thumb, Cyclomatic Complexity Numbers are as follows:

  • Simple - 11 or less is optimal.
  • Manageable - 12-21 may be problematic and will require large unit tests.
  • Complex - 22-50 will be problematic and certainly will not be easy to test.
  • Forget it - over 50 cannot be tested and requires a self-actualized yogi chess grandmaster to understand.

So how do you profile your code for CC? I’m going to focus on Java, though similar tools exist for C#, PHP, Ruby and Python. The tool I’ve used for Java is PMD (works with the IDE of your choice). PMD generates many static metrics on your code base, including Cyclomatic Complexity.

To install PMD follow the instructions on this PMD onJava article, or if you only have 5 minutes here’s a quick install guide for Eclipse. Pop the exploded zip downloaded from PMD into your plugins folder. After you have installed the plugin you need to activate the metrics for your project. Go to the project properties, select the Metrics option and then select Enable metrics. That’s it! The metrics are then calculated and a metrics view is presented.

McCabe Cyclomatic Complexity (CC) is the PMD metric we are interested in. PMD generates these CC statistics for the entire project. It allows you to drill down a tree of statistics and quickly ascertain the highest CC for each package, class and method. It even highlights items over a threshold in red (defaults to 11, but is configurable). Now at last I have the CC count for every piece of code I work on.

These generated statistics give a great overview of where code needs to be simplified and (hopefully) help a bit more in answering “Is this simple?”, allowing me to back up my gut feel with some quantifiable statistics.

If you want to know more on Cyclomatic Complexity see here. If you’re interested in other metrics supporting simplicity see Operands on an operation.

Java job vacancies in Galway, Ireland

May 10th, 2008 by Robert Fuller

Galway is a great place to live and work.

When I arrived in Galway from British Columbia in 1993 both Linux and Java were in their infancy. I had heard of neither - and why should I have - I was a carpenter.
Engaged to a Galway girl (there ain’t nothin’ like them, lads!), I was at that time entitled to a work permit, and went down to the Mill Street Garda station to get one. The conversation with the garda went something like this:

me: Hi, I am engaged to a Galway girl and I’d like to get a work permit.
garda: Where are you from?
me: Canada.
garda: What do you do?
me: I’m a carpenter.
garda: Go back to Canada, there’s no work here.

The following year I traded hardwood and softwood for hardware and software. Wow, that was almost fifteen years ago.

The job situation is different here now. There are jobs for Java developers in Galway, and yes foreigners are welcome (permission to work in Europe is required).

Some of the companies I am aware of who have recently been hiring Java developers in Galway include:
Applepie Solutions (us)
ATFM Solutions
Celtrak
Cisco Systems
Duolog
Fisc Ireland (Fidelity Investments)
Nortel Networks

Know of other companies looking for Java developers in Galway? Let me know and I’ll add a link here.

What’s wrong with this java code?

April 16th, 2008 by Robert Fuller

Can you spot the bug in this code?

01   Connection conn=null;
02   Statement st = null;
03   ResultSet rs = null;
04   try{
05    conn = getConnection();
06    st = conn.createStatement();
07    rs = st.executeQuery("select foo from bar");
08    ...
09  }finally{
10    if(rs!=null) rs.close();
11    if(st!=null) st.close();
12    if(conn!=null) conn.close();
13  }

Answer: If an SQLException is thrown at line 10 or 11, line 12 will not be executed. If line 12 is not executed some resources may be lost.

Yes, the likelihood of an exception being thrown at line 10 or 11 is low, but good Java programmers will avoid leaking resources by defensive use of try-finally blocks. Use the pattern of starting the try block immediately after allocating a resource. Here’s a better way to write the same block of code:

  Connection conn = getConnection();
  try {
    Statement st = conn.createStatement();
    try {
      ResultSet rs = st.executeQuery("select foo from bar");
      try {
        ...
      } finally {
        rs.close();
      }
    } finally {
      st.close();
    }
  } finally {
    conn.close();
  }
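
On Java 7 and later (newer than this post), the try-with-resources statement gives the same guarantee with less nesting: resources are closed in reverse order, and a failure in one close() does not stop the others from being closed. Here is a minimal sketch that runs without a database. FakeResource is a made-up stand-in for Connection, Statement and ResultSet, and the "rs" resource is rigged to fail on close so you can see that "st" and "conn" are still released.

```java
import java.util.ArrayList;
import java.util.List;

class CloseOrderDemo {
    /** Hypothetical stand-in for a JDBC resource; records when it is closed. */
    static final class FakeResource implements AutoCloseable {
        private final String name;
        private final boolean failOnClose;
        private final List<String> log;

        FakeResource(String name, boolean failOnClose, List<String> log) {
            this.name = name;
            this.failOnClose = failOnClose;
            this.log = log;
        }

        @Override
        public void close() {
            log.add("close " + name);
            if (failOnClose) {
                throw new IllegalStateException(name + " failed to close");
            }
        }
    }

    /** Opens three resources, does some work, and returns the sequence of events. */
    static List<String> run() {
        List<String> log = new ArrayList<String>();
        try (FakeResource conn = new FakeResource("conn", false, log);
             FakeResource st = new FakeResource("st", false, log);
             FakeResource rs = new FakeResource("rs", true, log)) {
            log.add("work");
        } catch (IllegalStateException e) {
            // Reached only after all three resources have been closed.
            log.add("caught: " + e.getMessage());
        }
        return log;
    }

    public static void main(String[] args) {
        for (String event : run()) {
            System.out.println(event);
        }
    }
}
```

With real JDBC objects the shape is the same, since Connection, Statement and ResultSet all implement AutoCloseable from Java 7 onwards.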

Getting a thread dump from Tomcat running as a Windows service

April 14th, 2008 by Albert MacSweeny

We’re supporting a Java application deployed on Tomcat, which is running as a Windows service. On Friday the logs showed that parts of the application were frequently timing out while trying to acquire a DB connection from a pool. We wanted to get a thread dump to see if any threads holding connections were deadlocked. If Tomcat had been started from a console this would be straightforward; unfortunately it wasn’t, and we didn’t have the option to restart it on the production server.

One useful tool is the free web start version of stack trace. We had no joy with this either though. Our remote desktop session was not the account from which the service was started. Stack trace helpfully suggests using Start->run->”mstsc /console” to start the remote desktop session in this case, but this would have terminated other sessions that were open to the server, and therefore wasn’t an option for us.

Cue a moment of inspiration from Rob, which resulted in a simple JSP that will output a thread dump. Note that your application must be running on at least Java 5.0 for this to work. Just make a simple JSP with the following snippet as the body of the page, and drop it in the web root of your application. Then fire up a browser, navigate to the JSP and view the dump without even having to restart Tomcat!


<%@ page import="java.util.Map" %>
<body>
<center><h1>Thread Dump</h1></center>
<pre>
<%
  // Thread.getAllStackTraces() requires Java 5.0 or later.
  StringBuffer sb = new StringBuffer();
  Map<Thread, StackTraceElement[]> st = Thread.getAllStackTraces();
  for (Map.Entry<Thread, StackTraceElement[]> e : st.entrySet()) {
    StackTraceElement[] el = e.getValue();
    Thread t = e.getKey();
    sb.append("\"").append(t.getName()).append("\" ");
    sb.append(t.isDaemon() ? "daemon" : "").append(" prio=").append(t.getPriority());
    sb.append(" Thread id=").append(t.getId()).append(" ").append(t.getState());
    sb.append("\n");
    for (StackTraceElement line : el) {
      sb.append("\t").append(line).append("\n");
    }
    sb.append("\n");
  }
  // Escape markup characters so frames such as <init> still show up in the browser.
  String dump = sb.toString().replace("&", "&amp;").replace("<", "&lt;");
%>
<%= dump %>
</pre>
</body>
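
Since the aim was to spot deadlocked threads, it may also be worth asking the JVM directly rather than reading the dump by eye. The sketch below is my own addition, not part of Rob's JSP: it uses the standard java.lang.management API, and findDeadlockedThreads() needs Java 6 or later. The class name DeadlockCheck is made up, and the same code could be called from a JSP or servlet.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

class DeadlockCheck {
    /** Returns a short report of deadlocked threads, or a message saying there are none. */
    static String describeDeadlocks() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads(); // null when nothing is deadlocked
        if (ids == null) {
            return "No deadlocked threads found.";
        }
        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : threads.getThreadInfo(ids, Integer.MAX_VALUE)) {
            if (info == null) {
                continue; // the thread died between the two calls
            }
            sb.append('"').append(info.getThreadName()).append("\" waiting on ")
              .append(info.getLockName()).append(" held by \"")
              .append(info.getLockOwnerName()).append("\"\n");
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append('\t').append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describeDeadlocks());
    }
}
```

This only reports true lock cycles. Threads that are merely slow, or waiting on a pool that never frees up, will not appear, so the full dump is still the better first look.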

Feeling better after a good stack dump

March 27th, 2008 by Robert Fuller

Earlier this week we encountered performance problems on one of the production systems we developed and help support. After having migrated several hundred clients onto the high-throughput Java-based system, the users began to notice some strange slowness appearing.

Log files are great. By studying the production logs we determined that the output side of the application was no longer keeping up with input. Having already developed some performance enhancements for a future release, we backported some of these and generated a patch which we tested then deployed into the production system. The system was fast again.

Or so we thought. In fact it was much faster for two days, until the strange slowness suddenly reappeared. The logs revealed that things had slowed down, but gave no indication why. I generated a stack dump (Java on Linux) using kill -3. I created a three-column spreadsheet then, skipping idle threads belonging to the web application container, created one row for each application thread. The columns are:

  1. Thread name
  2. Kind of thread (input, output, etc.)
  3. What the thread is doing

This took a little time as the application has more than 100 threads, but as I did it a pattern began to emerge… I could see many threads waiting to lock a statically synchronized method of a date parsing utility component. I investigated and found that the method had been statically synchronized because it relies on java.text.SimpleDateFormat, a class which is not synchronized and relatively expensive to create.

Studying what others have written about this problem, the development team is now reworking the implementation to use ThreadLocal instances of the SimpleDateFormat rather than statically shared instances. The stack dump was very useful in helping to find the blockage. I hope the fix resolves the problem!
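
For anyone facing the same bottleneck, here is a minimal sketch of the ThreadLocal approach. It is not our production code: the class name DateParser and the date pattern are assumptions for illustration. Each thread lazily gets its own SimpleDateFormat, so no synchronization is needed and the formatter is created once per thread rather than once per call.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

class DateParser {
    // One SimpleDateFormat per thread; the pattern is an assumed example.
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
            new ThreadLocal<SimpleDateFormat>() {
                @Override
                protected SimpleDateFormat initialValue() {
                    return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
                }
            };

    /** Parses using the calling thread's own formatter; no lock is taken. */
    static Date parse(String text) throws ParseException {
        return FORMAT.get().parse(text);
    }

    /** Formats using the calling thread's own formatter. */
    static String format(Date date) {
        return FORMAT.get().format(date);
    }
}
```

The trade-off is one formatter per thread kept alive for the life of that thread, which is usually a small price with a thread pool of a hundred or so.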

Setting priority of software development tasks

March 21st, 2008 by Robert Fuller

I was asked yesterday by a software engineer how to decide what task to tackle next on a project. I was happy to hear the question; it told me that the developer was taking ownership and responsibility for the project.

The general principle I like to follow in setting priority is this: do first what is important and easy, do last what is unimportant and difficult.

Here’s how to determine the priority of the tasks to be done:

  1. Write down the list of tasks
  2. Rate the relative importance of each task using a number between 1 and 5 (1=most important)
  3. Rate the relative difficulty of each task using a number between 1 and 5 (1=easiest)
  4. Calculate priority as importance multiplied by difficulty

The calculation will give you low numbers for those tasks which are important and easy and high numbers for those which are unimportant and difficult. Tackle first the items with priority 1.
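
The steps above can be sketched in a few lines of Java. The class name TaskPriority and the task names are made up for illustration; the only rule taken from the post is priority = importance x difficulty, with the lowest number tackled first.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class TaskPriority {
    static final class Task {
        final String name;
        final int importance; // 1 = most important, 5 = least important
        final int difficulty; // 1 = easiest, 5 = hardest

        Task(String name, int importance, int difficulty) {
            this.name = name;
            this.importance = importance;
            this.difficulty = difficulty;
        }

        /** Low numbers are important and easy; high numbers are unimportant and difficult. */
        int priority() {
            return importance * difficulty;
        }
    }

    /** Returns a copy of the tasks sorted so that the lowest priority number comes first. */
    static List<Task> inPriorityOrder(List<Task> tasks) {
        List<Task> ordered = new ArrayList<Task>(tasks);
        ordered.sort(Comparator.comparingInt(Task::priority));
        return ordered;
    }

    public static void main(String[] args) {
        List<Task> tasks = new ArrayList<Task>();
        tasks.add(new Task("Rewrite report engine", 4, 5)); // priority 20
        tasks.add(new Task("Fix login bug", 1, 1));         // priority 1
        tasks.add(new Task("Add CSV export", 2, 3));        // priority 6
        for (Task t : inPriorityOrder(tasks)) {
            System.out.println(t.priority() + "  " + t.name);
        }
    }
}
```

Ties (say, 2 x 3 and 3 x 2) keep their original order here, so you still get to use your own judgement between them.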

Cheap and cheerful java object persistence using Lucene

March 18th, 2008 by Robert Fuller

I took advantage of the St. Patrick’s long weekend to experiment with using Lucene as a simple Java object store. The context of the research was to determine whether it is feasible to create with Lucene a simple persistence layer to be used in a project currently holding an increasing number of disconnected Java objects in an in-memory map.

I came to considering Lucene as an object store having already investigated using persistent maps and caching components such as jcs and ehcache. One of the main issues I encountered with these was that searching for objects based on some criteria other than the key required either indexing the sought objects at an application level, or putting up with a lot of I/O when iterating through a large volume of stored objects. I deemed hibernate to be an option, but avoided it primarily due to concerns about increasing the complexity of an already-complex-enough project.

While the practice of indexing Java objects with Lucene has been around for a while, the option of easily persisting the objects themselves in Lucene is newer. A recently added feature provides the ability to store fields containing binary content - perhaps a suitable place for storing Java objects? Grant Ingersoll, one of the committers on the Lucene project, recently blogged,

I even use it in things that 5 years ago I would never have thought I would use it for (object stores, etc.)

There are several features about my java objects which make them suitable for indexing and storing in lucene:

  • They already implement java.io.Serializable.
  • They are essentially data holders.
  • They are disconnected - they do not hold references to other objects which will also be in the repository.
  • They have get* methods which can be used for accessing most anything I will want to search on.
  • Each object already has a unique identifier.
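
The first point is what makes binary storage straightforward: a Serializable object can be turned into a byte array and back using only the JDK. The sketch below shows that round trip. It is an illustration of the idea rather than the Lucos source, and the class name SerializedCopy is made up. It also shows why an object read back from a store is an equal copy rather than the same instance.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializedCopy {
    /** Serializes the value to bytes, suitable for a binary field or any other byte store. */
    static byte[] toBytes(Serializable value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bytes);
        try {
            out.writeObject(value);
        } finally {
            out.close(); // also flushes buffered data into the byte array
        }
        return bytes.toByteArray();
    }

    /** Rebuilds an object from bytes produced by toBytes; the result is a new instance. */
    static Object fromBytes(byte[] data) throws IOException, ClassNotFoundException {
        ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data));
        try {
            return in.readObject();
        } finally {
            in.close();
        }
    }
}
```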

The result of the weekend’s work was a single java class which implements persistence in lucene. I called it Lucos - Lucene object store. It is available for download here.

The basic functionality is to put/get an object in/out of the store in a manner similar to how an object is stored in a map. Here is an example:

Person fred = new Person("Fred Flinstone");
Lucos lucos = new Lucos();
lucos.put("fflinstone",fred);
Person x = (Person) lucos.get("fflinstone");
//NB: x is a COPY of fred
assertEquals(fred,x);

Putting an object in the store using the put(String id, Object value) method creates indexed fields for all of the no-arg get* methods on the value class. It also indexes the value class itself and all the classes it extends or interfaces it implements. Put changes are committed immediately to the index. Subsequent gets (or searches) reload the index (if necessary) to retrieve the latest changes.
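
To give a feel for the get* discovery step, here is a rough sketch using plain reflection. It is an assumption about how such a step could be written, not the Lucos code itself: the class name GetterFields, the nested Person class and the field naming rule (getName becomes "name") are all illustrative.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

class GetterFields {
    /** Hypothetical value class standing in for a stored object. */
    public static class Person {
        private final String name;

        public Person(String name) {
            this.name = name;
        }

        public String getName() {
            return name;
        }
    }

    /** Calls every public no-arg get* method and maps a field name to the returned value. */
    static Map<String, String> fieldsOf(Object value) throws ReflectiveOperationException {
        Map<String, String> fields = new TreeMap<String, String>();
        for (Method m : value.getClass().getMethods()) {
            String n = m.getName();
            boolean getter = n.startsWith("get") && n.length() > 3
                    && m.getParameterTypes().length == 0
                    && m.getReturnType() != void.class
                    && !n.equals("getClass");
            if (!getter) {
                continue;
            }
            // getName -> "name"
            String field = Character.toLowerCase(n.charAt(3)) + n.substring(4);
            fields.put(field, String.valueOf(m.invoke(value)));
        }
        return fields;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(fieldsOf(new Person("Fred Flinstone")));
    }
}
```

Each entry in the returned map is the kind of name/value pair that could then be added to a Lucene document as an indexed field.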

To find all the instances of person in the repository:

EntryIterator it = lucos.findInstances(Person.class);
System.out.println("Found " + it.length + " persons");
while (it.hasNext()) {
    String id = it.getKey();
    Person person = (Person) it.getValue();
    ...
}

Providing search functionality was one of the features I required in order to overcome the issues already identified with searching a persistent map. One of the difficulties I encountered in doing this was that where fields were stored tokenized an exact match did not seem possible, and where stored untokenized, a partial match did not. To overcome this difficulty, I indexed fields in both tokenized and untokenized format, appending ‘.exact’ to the name of the untokenized field. Given that my Person has method String getName(), I can search my objects with any of these:

// find all persons named fred using a TermQuery
lucos.findInstances(Person.class, "name", "fred");
// find all persons named fred using lucene syntax query and the installed Analyzer
lucos.findInstances(Person.class, "name:fred");


// find all persons named Fred Flinstone using a TermQuery
lucos.findInstances(Person.class, "name.exact", "Fred Flinstone");

If you want to use a query not parsed using the Lucos analyzer, parse the query first, then pass it to findInstances:

QueryParser parser =
new QueryParser("name.exact", new KeywordAnalyzer());
Query query = parser.parse("\"Fred Flinstone\"");
it = lucos.findInstances(Person.class,query);

Here’s how to create a Lucos instance which uses file persistent storage:

String folder = "{path to folder}";
Directory directory = FSDirectory.getDirectory(folder);
Lucos lucos = new Lucos(directory);

Finally, don’t forget to close() lucos when finished with it. This will release the lucene write lock:

lucos.close();

I still need to do volume and load testing with some production data to verify that the solution provides an acceptable memory/performance trade-off when reducing the size of my in-memory map. For the moment I’m satisfied that it is feasible to use Lucene as a Java object store. The solution adds minimal complexity to the project, introducing only one additional (Lucene) jar file. For a future iteration it might be worth considering adding a dependency on xstream, removing the requirement that objects placed into the repository implement the Serializable interface, and also possibly making them more generally searchable.

If you would like to add cheap and cheerful Java object persistence into your project, I hope that Lucos might provide you with some code for thought and perhaps the basis for a solution. The code and a test class for Lucos are available for download here.

Comments are welcome!

Toolbox for a Java craftsman

July 26th, 2007 by Robert Fuller

Back in the 80’s, the “olden days”, before I was a software guy I was a builder guy. I drove around in a rusty Ford Econoline van with scaffolds and ladders on the roof; tools and materials in the back of the van, techniques in the back of the head. Tools and materials and techniques. Most jobs required all three, and like all tradesmen I accumulated some of each over the years. In those days I would arrive on a project ready to hit the ground running - I brought the basics with me.

Now with more than a decade of industrial software engineering behind me, I’m pleased with the collection of tools and techniques I carry with me. I don’t mean the phone and the laptop. I mean tools like putty and firefox and scite and cvs, mstsc, thunderbird, skype, gaim, open office, gimp, password safe, igal and vi to name a few.

The contents of the toolbox change over time. I no longer use cygwin, and though emacs is still in there it doesn’t come out so often. Subversion is in there now, though I haven’t used it enough to make it my own.

Some tools are best not left behind. To any java project I always bring eclipse, ant and junit, and usually log4j.

Re-usable suitably-licensed open-source software components are a big part of my toolbox. Here are some of the most tried and trusted components that I have included in various java projects over the years and am happy to recommend:

  • Logging: log4j (Notice I’m mentioning it for the second time)
  • Working with xml: dom4j provides useful functionality.
  • Text indexing: lucene is an extremely well done component.
  • Templating: velocity is proven.
  • Scheduling: quartz is very robust and reliable.
  • Working with pdf: PDFbox is worthwhile. I’d like to try using pdfbox with velocity sometime, for templating pdf documents.
  • File identification: ffident provides mime type identification. I added some (what I thought were useful) new features which I sent to the author, but he doesn’t seem to have folded them in.
  • Http requests: http client is reliable.
  • Http file uploads: Commons file upload is helpful for handling files uploaded to your servlet or jsp page.
  • Cleaning up html: nekohtml is very useful if you want to take html pages from the wild and convert them to xml in order to run xpath expressions against them.
  • Working with excel spreadsheets: poi is handy for reading and writing them.
  • Charting: JFreeChart is very useful.
  • Embedded sql: hsqldb is a useful sql database written in java which can be embedded into your application to run in-memory or persisting to the file system.
  • Scripting: BeanShell, Rhino, and BSF if you can’t decide between them ;-)
  • JNDI: Commons naming provides jndi setup for your application using tomcat 4 style configuration. (I don’t know why this is not more readily available, if you do, please let me know!)

Maybe some of these tools will be useful to you. If you have some more tried and trusted components for java applications, maybe post a comment.

When I moved from the West of Canada to the West of Ireland in the early 90’s I left most of the tools behind me, the saws and the ladders and the compressors and the welder. I still have my old hammer. I’m a bit rusty on some of the techniques, but I’ll probably never forget how to dry a paint brush, taught to me by my mentor Lorenzo Quarenghi, son of Walter: after cleaning the brush in water or thinner (as appropriate), go outside wearing your old shoes or boots. With your heel on the ground and your toe pointing up, hold the handle of the brush and tap the metal edge on the top of your toe. The spray goes onto the ground and onto the sole of your shoe, the brush becomes clean and dry!