Many repeated lines of code in Java are a sign that there is a more elegant way to solve the problem.  In the following example, the developer needed to convert any letter of the alphabet from lower to upper case:

if (theChar == 'a') theChar = 'A';
if (theChar == 'b') theChar = 'B';
if (theChar == 'c') theChar = 'C';
:
// 22 lines removed from here
:
if (theChar == 'z') theChar = 'Z';

Repetitive code like this works, but it should never be written, for the same reason you don't drive your car to the toilet – there is usually a simpler way.

After the second line the developer should have felt the impending tedium of twenty-four more lines and asked a colleague if there was a better approach, which there was:

theChar = Character.toUpperCase(theChar);
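To check that the single library call really covers the whole chain of if statements, here is a small sketch. The class name UpperCaseDemo and its toUpper helper are made up for illustration; only Character.toUpperCase comes from the example above.

```java
class UpperCaseDemo {
    // Hypothetical helper: the 26-line if chain collapsed into one library call.
    static char toUpper(char theChar) {
        return Character.toUpperCase(theChar);
    }

    public static void main(String[] args) {
        StringBuilder alphabet = new StringBuilder();
        for (char c = 'a'; c <= 'z'; c++) {
            alphabet.append(toUpper(c));
        }
        System.out.println(alphabet); // prints ABCDEFGHIJKLMNOPQRSTUVWXYZ

        // Characters with no upper-case form, such as digits, come back unchanged.
        System.out.println(toUpper('7')); // prints 7
    }
}
```

Unlike the if chain, the library call also maps letters outside a to z, such as accented characters, which is usually what you want.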

There is no shame in asking for advice, but show that you spent some time researching solutions first, and remember the answer for next time.

“Brian had the job of writing importers for each of the XML data feeds to load third-party information into our application. Without consulting other more experienced developers, he decided immediately that parsing the XML by hand was the best approach. After all, he could write it using his extensive knowledge of String.indexOf() and tokenization.

But Brian should have used one of the many open source XML parsing libraries that were available.

Parsing XML using String tokenization is not easy.  By the tenth call to String.indexOf() Brian may have started to wonder if he had the right plan of attack.  But he soldiered on and created a thousand lines of code for each feed. The parsers worked acceptably until the XML supplier started manually editing the streams and introducing errors. Although the XML no longer validated, Brian added fault tolerance to his parsers by manually fixing up the offending lines, replacing Strings he knew had the wrong text in them. Next the XML supplier added invalid characters to the feeds, so Brian did a find/replace of the characters with new ones. It was not for Brian to wonder why the odd characters were coming in or to realize that the encoding of the XML was not UTF-8.

Thus changes were made in reaction to each new invalid character or broken tag that was found.  A few months later the XML supplier found problems with their feed and fixed them all at once so the XML validated correctly. Brian's parsing libraries would not accept the valid XML. By then Brian was off the project and nobody could maintain his bespoke solution.”

Brian's solution

  • Manual code written for parsing all XML. Large methods containing lots of Java code.

  • Did not support different character encodings and used String replace functions to filter unexpected characters.

  • No validation of incoming XML; attempted to fix badly formed XML that should have been rejected.

  • Written in a month; needs optimization due to poor performance.

  • Difficult to change – developed for Brian to understand only.  The wiki page for the XML parser still reads “Brian to fill in this section”.

The Open Source solution

  • Uses open source XML parsing libraries.  Handlers are written only for the XML leaf elements – very little actual code required.

  • Readily supports different character encodings.

  • Validates incoming XML against a schema and raises errors when it fails.

  • Libraries were written and refined over a period of years, so they are optimized for speed and a small memory footprint.  Far quicker to implement than Brian's solution.

  • Maintainable – uses well documented industry standard libraries, so understood by any Java developer who has coded for XML before.
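As a rough illustration of the library-based approach, here is a minimal sketch using the SAX and schema validation APIs that ship with the JDK. The original feeds are not shown, so the FeedImporter class, the feed and item element names, and the inline schema are all hypothetical stand-ins.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class FeedImporter {
    // Hypothetical schema: a <feed> holding any number of <item> text elements.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='feed'><xs:complexType><xs:sequence>"
      + "<xs:element name='item' type='xs:string' minOccurs='0' maxOccurs='unbounded'/>"
      + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    // Returns the text of every <item>; throws IllegalArgumentException if the
    // feed is malformed or does not match the schema.
    static List<String> readItems(byte[] xml) {
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setSchema(schema); // validate while parsing
            SAXParser parser = factory.newSAXParser();

            List<String> items = new ArrayList<>();
            parser.parse(new ByteArrayInputStream(xml), new DefaultHandler() {
                private final StringBuilder text = new StringBuilder();

                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    text.setLength(0); // start collecting text for this element
                }

                @Override
                public void characters(char[] ch, int start, int length) {
                    text.append(ch, start, length);
                }

                @Override
                public void endElement(String uri, String localName, String qName) {
                    // The only feed-specific code: handle the leaf element.
                    if ("item".equals(localName)) {
                        items.add(text.toString());
                    }
                }

                @Override
                public void error(SAXParseException e) throws SAXException {
                    throw e; // reject schema violations instead of patching them up
                }
            });
            return items;
        } catch (SAXException e) {
            throw new IllegalArgumentException("Feed rejected: " + e.getMessage(), e);
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The parser reads the encoding from the XML declaration, so a
        // non-UTF-8 feed needs no find/replace of "odd" characters.
        byte[] latin1Feed =
            "<?xml version='1.0' encoding='ISO-8859-1'?><feed><item>caf\u00e9</item></feed>"
                .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(readItems(latin1Feed).size() + " item(s) read");
    }
}
```

Passing the raw bytes to the parser, rather than a String decoded in advance, is what lets it honour the declared encoding. A feed that fails validation surfaces as an exception the caller can report back to the supplier.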
