<?xml version='1.0' encoding='UTF-8'?><rss xmlns:atom='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' version='2.0'><channel><atom:id>tag:blogger.com,1999:blog-9531034</atom:id><lastBuildDate>Fri, 18 Apr 2008 10:04:42 +0000</lastBuildDate><title>Adept Software Development</title><description/><link>http://marringtons.com/Adept/blog/Software.Development/</link><managingEditor>Paul Marrington</managingEditor><generator>Blogger</generator><openSearch:totalResults>31</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-9531034.post-6716332785354237126</guid><pubDate>Mon, 27 Aug 2007 05:52:00 +0000</pubDate><atom:updated>2007-08-27T16:53:34.140+10:00</atom:updated><title>Disk Caches and Notebooks</title><description>Now that I am working in a research job again I will be reviving my blogging. As Software Manager I just did not have time or enough excess energy. While I am thinking up a list of topics I will start off with a complaint session.

With the new contract I bought a Macbook Pro with wireless broadband so I could access things like Safari Books Online without worrying that a net-nanny had decided that programming was bad language.

I am not a mac-o-phile, but when I did the analysis this time the top of the line Macbook turned out to be the best value for money. My notebook before that was an IBM running Windows 2000 and then XP. With 384MB of RAM I expected a fair bit of hard disk activity.

The Mac has 4GB, so why does the hard disk run all day? Fortunately "Activity Monitor" has a disk activity pane. With the system idling I see a write every 15 to 30 seconds. No reads, as expected. Unfortunately the monitor did not tell me who was writing to disk. So, for fun (?) I started killing off tasks one at a time. I hoped to find a culprit. No such luck. The gap between writes became wider. Once I had removed most applications it was as long as 4 minutes. I had the drive wind-down set at 5 minutes and it powered down once.

The truth is that in this modern day of multi-threaded programs it makes sense to have threads running in the background to monitor the context and save it in case the computer closes down unexpectedly. Or to do housekeeping that either writes to disk directly or makes changes to the virtual memory balance. It could be something as simple as a log being updated to say that no activity has taken place.

My gut feeling is that most hard disks spin most of the time. I wonder how many tons of carbon a day that equates to for the world's desktops? Or closer to home, how much longer could I run on battery without the hard disk spinning all the time?

I was going to wait for a solid state drive, but they weren't quite mainstream. Others are thinking about the subject too: reduced spinning is touted as one of the benefits of hybrid drives.

I don't think we need fancy hardware to improve things today. I don't even think it needs much of a software change. I think all modern operating systems have write-behind caches. How about giving me a power saving option that does something like:

&lt;pre style="font-style: italic; font-family: monospace; font-size: small;"&gt;
if no user activity in 5 minutes
if no program using significant constant amounts of CPU time
if write-behind cache is less than 100MB (or possibly even 10MB)
if battery power is not low (notebook or UPS)
then
  Turn off drive and cache writes in memory until one of the conditions above changes.
&lt;/pre&gt;
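
As a rough sketch only, the rule above could be written as a single decision function. The field names (idleSeconds, cpuBusy, dirtyCacheBytes, batteryLow) and the thresholds are my assumptions for illustration; they do not correspond to any real operating system API.

```javascript
// Hypothetical sketch of the power-saving rule above.
// All field names and thresholds are assumptions for illustration.
var MB = 1024 * 1024;

function shouldSpinDown( state)
  {
    // No user activity in 5 minutes (300 seconds).
    if (300 > state.idleSeconds) return false;
    // No program using significant constant amounts of CPU time.
    if (state.cpuBusy) return false;
    // Write-behind cache must still be small.
    if (state.dirtyCacheBytes >= 100 * MB) return false;
    // Battery (notebook or UPS) must not be low.
    if (state.batteryLow) return false;
    return true;
  }

var idle = { idleSeconds : 600, cpuBusy : false,
             dirtyCacheBytes : 5 * MB, batteryLow : false };
var spinDown = shouldSpinDown( idle);
```

The moment any one condition fails, the function says no, which matches the "until one of the conditions above changes" part of the rule: the system would simply re-evaluate it whenever the state changes.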

It would have to be a power setting in the control panel. There are probably some systems out there where these background tasks are critical and the risk too high. I probably wouldn't use it on a desktop without a UPS. Given the low cost of a UPS these days I would probably buy one just to have my hard disk powered down for a large portion of the day.</description><link>http://marringtons.com/Adept/blog/Software.Development/2007/08/disk-caches-and-notebooks.html</link><author>Paul Marrington</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-9531034.post-112720497662620240</guid><pubDate>Tue, 20 Sep 2005 08:28:00 +0000</pubDate><atom:updated>2005-09-20T18:29:36.633+10:00</atom:updated><title>JavaScript Events - Part 4 - Event Library Source</title><description>As promised, here it is. Refer to the earlier articles if you want to know the hows and whys.

&lt;pre style="FONT-STYLE: italic; FONT-FAMILY: monospace; font-size: small;"&gt;
/*
 * Created on 5/11/4
 *
 * Copyright 2004 Paul Marrington
 * All rights reserved - http://marringtons.com
 * PROPRIETARY/CONFIDENTIAL
 * Use is subject to license terms - paul@marrington.net
 */

/**
 * Add all the events in an event object
 * inside the group given.
 */
Events.addEvents = function( element, group)
  {
    for (var name in group.events)
      Events.add(
        element, name, group.events[name]);
  }

Events.addSameEvent = function(
  name, action, elements)
  {
    for (var i = 2;  i &lt; arguments.length;  i++)
      Events.add( arguments[i], name, action);
  }

Events.add = function( element, name, action)
  {
    name = name.toLowerCase();
    var on = "on" + name;

    if (! element.events)
      element.events = new Object();
    
    var list = element.events[name];
    if (! list)
      {
        list = element.events[name] = new Array();
        if (element[on]) list.push( element[on]);
        /*
         * This is an interesting bit of obscure code.
         * If there were any other way, I would not do
         * it, but IE makes it essential. The event handler
         * is inline so that it keeps a handle to the surrounding
         * function/object. This way we can set event.owner
         * to the same as the calling element when add
         * function was called - even though the event
         * handler is called asynchronously later,
         * long after the add function is dead and buried.
         * Nasty. It also means we are hanging on to an
         * unknown amount of stuff. The W3C DOM uses
         * evt.currentTarget, but this is the ONLY way
         * we can get IE to know the owner. This is
         * because evt.target is the actual element
         * clicked on or whatever, while the owner 
         * of the event can be further up the DOM tree.
         */
        element[on] = function( evt)
          {
            /*
             * Normalise the event object between
             * browsers and retrieve the list
             * of actions to take.
             */
            evt = evt || event;
            if (! evt.target)
              evt.target = evt.srcElement;
            /*
             * We get element from the owner object.
             */
            evt.owner = element;

            var list = this.events[evt.type];
            /*
             * Run each event in the list until
             * one returns false. This breaks the chain.
             * Events are run from the most
             * recently added to the oldest.
             */
            for (var i = list.length - 1;  i &gt;= 0;  i--)
              {
                evt.action = list[i];
                if (! evt.action( evt))
                  return false;
              }
            return true;
          }
      }
    
    list.push( action);
  }

Events.newList = function( name)
  {
    var newList = function( evt)
      {
        for (var j = evt.action.events.length - 1; 
          j &gt;= 0;  j--)
            if (! evt.action.events[j]( evt))
              return false;
        return true;
      }
    newList.eventName = name;
    newList.events = new Array();
    newList.add = function( event)
      {
        this.events.push( event);
      }
    return newList;
  }
&lt;/pre&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;</description><link>http://marringtons.com/Adept/blog/Software.Development/2005/09/javascript-events-part-4-event-library.html</link><author>Paul Marrington</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-9531034.post-112659445357609939</guid><pubDate>Tue, 13 Sep 2005 06:53:00 +0000</pubDate><atom:updated>2005-09-13T16:54:13.586+10:00</atom:updated><title>JavaScript Events - Part 3 - An Event Library</title><description>My first attempt at an event library allowed a single event to be attached to multiple elements, as well as multiple event methods to a single event on a single element. This is very useful when you want to track the mouse since you need to register the same event for multiple windows or frames. However, it did require the creation of a special event object, which then had to be attached. Consequently - except for the special mouse tracking case - most of my events were set in the time honoured way:

&lt;pre style="FONT-STYLE: italic; FONT-FAMILY: monospace; font-size: small;"&gt;
element.onclick = function( evt)
  {
    evt = evt || event;
    if (! evt.target)
      evt.target = evt.srcElement;
    alert( "clicking " + evt.target);
  }
&lt;/pre&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;

The problems grew:
&lt;ol&gt;
&lt;li&gt;Every event function had to have the first two lines to ensure basic data conformity between browsers.
&lt;li&gt;If more than one piece of independent code wanted to set the same event, they had to know about each other or the event call was lost. 
&lt;li&gt;The &lt;i&gt;target&lt;/i&gt; field pointed to the element that triggered the event, not necessarily the element that owned the event. The latter isn't available from the event object for IE.
&lt;/ol&gt;

I wanted to create a library that:
&lt;ol&gt;
&lt;li&gt;Could be called to set an event on an element that already had other functions attached - either by HTML or due to earlier calls. Events needed to be called from the most recently set to the oldest.
&lt;li&gt;Would provide an event object already normalised for browser differences.
&lt;li&gt;Would provide an owner field that worked across all browsers.
&lt;li&gt;Could ease coding by setting multiple events for a single element, and multiple elements with the same event.
&lt;/ol&gt;

&lt;h4&gt;Usage&lt;/h4&gt;

The core method is called, surprisingly enough, Events.add(). It requires a reference to the element, the event name and a function to run when the event is triggered.
&lt;ul&gt;
&lt;li&gt;The event name, as with the DOM equivalent, does not start with "on". In other words, use "click", not "onclick".
&lt;li&gt;It will call multiple action methods for a single event.
&lt;li&gt;If the HTML had an event set, this event will be added as well.
&lt;li&gt;The action function has a normalised event object as its one parameter. 
&lt;li&gt;It includes valid &lt;i&gt;target&lt;/i&gt; and &lt;I&gt;owner&lt;/I&gt; fields.
&lt;/ul&gt;

&lt;pre style="FONT-STYLE: italic; FONT-FAMILY: monospace; font-size: small;"&gt;
Events.add( menu, "click",
  function( event)
    {
      Panel.menuHover = false;
      menu.style.display = "none";
      Panel.hideShield( menu.panel);
      return true;
    });
&lt;/pre&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;
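
To make the stacking rules in the list above concrete, here is a stripped-down stand-in for the library. It is only a sketch: MiniEvents is a name I have made up for this example, it skips the browser normalisation, and the "element" is a plain object so the behaviour can be seen without a page.

```javascript
// MiniEvents is a hypothetical stand-in, not the real Events library.
var MiniEvents = {};

MiniEvents.add = function( element, name, action)
  {
    var on = "on" + name;
    if (! element.events) element.events = {};
    var list = element.events[name];
    if (! list)
      {
        list = element.events[name] = [];
        // Keep any handler that was already set, for example from HTML.
        if (element[on]) list.push( element[on]);
        element[on] = function( evt)
          {
            evt.owner = element;
            // Newest first; an action returning false breaks the chain.
            for (var i = list.length - 1;  i >= 0;  i--)
              if (! list[i]( evt)) return false;
            return true;
          };
      }
    list.push( action);
  };

var calls = [];
var button =
  { onclick : function() { calls.push( "html"); return true; } };

MiniEvents.add( button, "click",
  function() { calls.push( "first"); return true; });
MiniEvents.add( button, "click",
  function() { calls.push( "second"); return true; });

button.onclick( { type : "click" });
// calls now holds "second", "first", "html" in that order
```

Note that every action returns true. An action that returns nothing is treated as returning false and stops the older actions from running.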

This can still cause a tedious amount of coding when a lot of events need to be set for a single element. I use an &lt;i&gt;addEvents()&lt;/i&gt; method for these cases. It still sets events on a single element, but now it picks up these events from the methods of an object called &lt;i&gt;events&lt;/i&gt;. This makes for some clear self-documenting code.

&lt;pre style="FONT-STYLE: italic; FONT-FAMILY: monospace; font-size: small;"&gt;
Events.addEvents( tabBarTextNode, Panel.tabActions);
...
Panel.tabActions = {events : {}};

Panel.tabActions.events.click =
  function( evt)
    {
      var panel
        = Elements.getPanel( evt.target).panel;
      Panel.setFocus( panel);
    }

Panel.tabActions.events.contextmenu = 
  function( evt)
  {
    var panel = Elements.getPanel(
                evt.target).panel;

    var menu = Panel._contextMenu(
                panel, Mouse.location( evt));

    menu.style.top =
      Panel.frame.height - menu.offsetHeight;
      
    return false;
  }
&lt;/pre&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;

Less common - but just as important - are the situations where the same event and action need to be set on many elements. The following method takes its parameters in a different order, with the elements at the end, so that we can pass as many as are needed.

&lt;pre style="FONT-STYLE: italic; FONT-FAMILY: monospace; font-size: small;"&gt;
Events.addSameEvent( "contextmenu",
  function() {alert('not allowed');},
  panel.titleBar, panel.resizeBox,
  panel.shadowRight, panel.shadowBottom);
&lt;/pre&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;

This last method is going to take some describing - as is always the case when objects are used. It's used if you need to add an event to multiple elements that can be used to call multiple actions. None of the methods above will do the job. We can add multiple events to a single element, but we can't add a new event and have it be available to more than one element. The object &lt;i&gt;newList&lt;/i&gt; resolves this deficiency: it creates a method that's also an object, so it can have state as well as function. The state is used to point to a list of events, rather than attach them to the element as we do above.

&lt;pre style="FONT-STYLE: italic; FONT-FAMILY: monospace; font-size: small;"&gt;
// Called once to create event function objects.
Mouse.events = {};
Mouse.events.mousemove
  = Events.newList( "mousemove");
  
Mouse.events.mouseup
  = Events.newList( "mouseup");

Mouse.events.blur
  = Events.newList( "blur");
...
// Called for each object we want
// to trigger the event list for.
// Note that the events added are
// function objects, not just functions.
Mouse.initialise = function( w)
  { Events.addEvents( w, Mouse); }
...
// Elsewhere we add new events that will be
// called for each registered element
Mouse.events.mousemove.add( Panel.onmousemove);
Mouse.events.mouseup.add( Panel.onmouseup);
&lt;/pre&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;

The mouse object here is the only occasion so far when I've used this more complex structure. Every window registers itself for Mouse events by calling &lt;i&gt;Mouse.initialise()&lt;/i&gt;. Other modules that need to do something special on these actions register themselves with the action function objects using add(). To put it simply, &lt;i&gt;Panel.onmousemove()&lt;/i&gt; will be called for each registered window.
&lt;p&gt;
Now, I suppose you want the library code. Well you can't have it yet. This article is already too long. I'll release it next week, so nyah.</description><link>http://marringtons.com/Adept/blog/Software.Development/2005/09/javascript-events-part-3-event-library.html</link><author>Paul Marrington</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-9531034.post-112588147534486474</guid><pubDate>Mon, 05 Sep 2005 00:50:00 +0000</pubDate><atom:updated>2005-09-05T10:51:15.353+10:00</atom:updated><title>JavaScript Events - Part 2 - The Event Object</title><description>I've no intention to describe the event object in detail. After all, the subject could easily fill a book (and it has). Given the depth to which I use this object, I will describe the browser differences I've found and how I've overcome them.

&lt;h4&gt;Retrieving The Event Object&lt;/h4&gt;
In Mozilla the event object is passed to the event function as the one and only parameter. IE, however, has a nasty global variable called event. Does that make IE single threaded? Probably. I don't want to even think of the havoc multiple threads running the same JavaScript code would cause. I think Mozilla must handle it. I can set a breakpoint in an event handler and have setTimeout() JavaScript trigger while still stopped in the debugger.
&lt;p&gt;
When an event is set from a HTML tag, Mozilla hides the difference between the two schemes.

&lt;pre style="FONT-STYLE: italic; FONT-FAMILY: monospace; font-size: small;"&gt;
&amp;lt;a id="test" href="" onclick="runOnClick(event)"&gt;
&amp;lt;script&gt;
alert( document.getElementById( "test").onclick);
&amp;lt;/script&gt;
&lt;/pre&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;

generates different implicit JavaScript depending on the platform. Mozilla will create:

&lt;pre style="FONT-STYLE: italic; FONT-FAMILY: monospace; font-size: small;"&gt;
function( event) { runOnClick(event); }
&lt;/pre&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;

While IE will generate:

&lt;pre style="FONT-STYLE: italic; FONT-FAMILY: monospace; font-size: small;"&gt;
function() { runOnClick(event); }
&lt;/pre&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;

This is quite clever. This means that events set in HTML tags can all get the event in the same way.
&lt;p&gt;
Events set on an element by JavaScript are not so lucky. They have to account for the different platforms:

&lt;pre style="FONT-STYLE: italic; FONT-FAMILY: monospace; font-size: small;"&gt;
document.getElementById( "test").onclick = function( evt)
  {
    evt = evt || event;
    ...
  }
&lt;/pre&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;

This is the simplest working method for retrieving the event object. We try and pass it in. If the event caller does not have an argument we use the global event object used by IE. For this reason we have to call the argument &lt;i&gt;evt&lt;/i&gt; so as not to hide the global one called &lt;i&gt;event&lt;/i&gt;.

&lt;h4&gt;The Event Target&lt;/h4&gt;

The event target is the HTML element that captures the event. This isn't necessarily the element the event was set on. It's more often a child of that element. You can, for example, set an onclick event for a table. The target returned could be the TD cell element. IE does not have a &lt;i&gt;target&lt;/i&gt; field in its event object. It uses &lt;i&gt;srcElement&lt;/i&gt; instead. The easiest way to normalise the object is to always have code like:

&lt;pre style="FONT-STYLE: italic; FONT-FAMILY: monospace; font-size: small;"&gt;
evt.target = evt.target || evt.srcElement;
&lt;/pre&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;
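
The two normalising lines seen so far can be folded into one small helper. This is a sketch under my own assumptions: the name normalise and the fallback parameter are invented for the example. In a browser the fallback would be the global event object that IE provides; here it is passed in explicitly so the function can be tried with plain objects.

```javascript
// Hypothetical helper; "normalise" and "fallback" are names made up here.
function normalise( evt, fallback)
  {
    // Mozilla passes the event in; IE leaves it in a global.
    evt = evt || fallback;
    // IE calls the target srcElement.
    evt.target = evt.target || evt.srcElement;
    return evt;
  }

var cell = { tagName : "TD" };

// Mozilla style: event passed as the argument, target already set.
var mozilla = normalise( { type : "click", target : cell }, null);

// IE style: no argument, srcElement instead of target.
var ie = normalise( undefined, { type : "click", srcElement : cell });
```

Either way the caller ends up with an object whose target field points at the cell.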

&lt;h4&gt;The Event Owner&lt;/h4&gt;

The owner is the element that has the triggered event as an attribute. This is more often of use than the target, because it is the element we most associate with the event in our code. The problem is that IE does not provide any direct way to get this item. In Mozilla &lt;i&gt;this&lt;/i&gt; refers to it, as does the field &lt;i&gt;currentTarget&lt;/i&gt;.
&lt;p&gt;
There are a number of ways around this. If the event is set in the HTML you can call a function giving the ID of the calling element. More general code could walk up from the target element until it finds an element with the correct event. The most general solution uses some of the more obscure properties of JavaScript closures to set the owner. If the event is set functionally, you can have an inner function that refers to the element being set, where that element is an argument to the outer function. This becomes a dirty circular reference - so that the event data is not released when the outer function returns - but it does the job. This last method isn't obvious, so perhaps an example will clear it up. The code calls an action method with a normalised event object. Note that the &lt;i&gt;element&lt;/i&gt; argument for the add function is used inside an event function that can be called at any arbitrary time later. It's magic.

&lt;pre style="FONT-STYLE: italic; FONT-FAMILY: monospace; font-size: small;"&gt;
Events.add = function( element, name, action)
  {
    ...
    element["on"+name] = function( evt)
      {
        evt = evt || event;
        evt.target = evt.target || evt.srcElement;
        evt.owner = element;
        action( evt);
      }
  }
&lt;/pre&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;</description><link>http://marringtons.com/Adept/blog/Software.Development/2005/09/javascript-events-part-2-event-object.html</link><author>Paul Marrington</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-9531034.post-112529217834272717</guid><pubDate>Mon, 29 Aug 2005 05:09:00 +0000</pubDate><atom:updated>2005-08-29T15:09:38.350+10:00</atom:updated><title>JavaScript Events - Part 1 - Setting Events</title><description>This is part 1 of a 3 part series.
&lt;li&gt;Part 1: Setting Events
&lt;li&gt;Part 2: The Event Object.
&lt;li&gt;Part 3: A General Usable Event Manager.

  I've just refactored the JavaScript events system for Adept. I initially chose to implement the menu system by generating HTML rather than by building objects. Bad choice. It highlighted the operational inconsistencies between 3 completely different ways to add an event to an element.

  &lt;h4&gt;Three Event Addition Methods&lt;/h4&gt;
&lt;ol&gt;
&lt;li&gt;&lt;b&gt;In HTML&lt;/b&gt; as in &lt;i&gt;&amp;lt;a onclick="myfunc()"&gt;&lt;/i&gt;.
&lt;li&gt;&lt;b&gt;As an Attribute&lt;/b&gt; as in &lt;i&gt;element.onclick = myfunc;&lt;/i&gt;
&lt;li&gt;&lt;b&gt;Using the DOM Model&lt;/b&gt; as in &lt;i&gt;element.addEventListener( "click", myfunc, false);&lt;/i&gt;.
&lt;/ol&gt;

&lt;h4&gt;The HTML Event&lt;/h4&gt;

You can set an event for an element directly in the HTML using the event name preceded by "on". The browser generates a function and compiles the attribute value text as the function body. This means that you are not running in the same context as when you programmatically attach an event, since your code is running inside a method body. One browser inconsistency is overcome by this method: IE generates &lt;i&gt;function(){yourcode}&lt;/i&gt; while Mozilla generates
&lt;em&gt;function(event){yourcode}&lt;/em&gt;. This allows both event types to access the event object as &lt;i&gt;event&lt;/i&gt; even though it is passed in as an argument in Mozilla and is a global for IE. &lt;b&gt;Warning&lt;/b&gt;: Don't use &lt;i&gt;this&lt;/i&gt; to reference the element firing the event. It works for Mozilla, but not for IE.

&lt;h4&gt;Event Attribute&lt;/h4&gt;
In the pre-DOM world, attributes were just fields on the HTML element. For backwards compatibility they still are and always will be (for HTML anyway). So,
&lt;pre style="FONT-STYLE: italic; FONT-FAMILY: monospace; font-size: small;"&gt;
element.onclick = function() { alert( "click"); }
&lt;/pre&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;

will work. The problem is that a single element can only have one event function per event type. This makes it difficult to produce library type code for JavaScript.

&lt;h4&gt;DOM Events&lt;/h4&gt;

If events are set using the &lt;i&gt;addEventListener()&lt;/i&gt; method they are stacked and all events added are fired. Perfect, except that IE (as of 6) does not support this part of the DOM.
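
For completeness, the usual way to paper over this difference is feature detection. The sketch below is not what Adept uses; the function name listen is made up for the example. It relies on IE's own attachEvent() method, which takes the event name with the "on" prefix. As far as I know attachEvent() does not set &lt;i&gt;this&lt;/i&gt; to the element and makes no promise about call order, so treat it as a partial fix at best.

```javascript
// Hypothetical helper; "listen" is a name invented for this sketch.
function listen( element, name, action)
  {
    if (element.addEventListener)
      // W3C DOM browsers such as Mozilla.
      element.addEventListener( name, action, false);
    else if (element.attachEvent)
      // IE: note the "on" prefix.
      element.attachEvent( "on" + name, action);
    else
      // Last resort: the attribute, which holds only one function.
      element["on" + name] = action;
  }
```

Because it only tests for the presence of a method, the function can be exercised with plain objects that have an addEventListener or attachEvent function on them.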

&lt;h4&gt;Summary&lt;/h4&gt;

So, what are the problems?
&lt;ol&gt;
&lt;li&gt;Event setting that's portable across browsers does not allow more than one event per type per element.
&lt;li&gt;addEventListener() is not platform independent.
&lt;/ol&gt;</description><link>http://marringtons.com/Adept/blog/Software.Development/2005/08/javascript-events-part-1-setting.html</link><author>Paul Marrington</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-9531034.post-112486599453213058</guid><pubDate>Wed, 24 Aug 2005 06:45:00 +0000</pubDate><atom:updated>2005-08-24T16:46:34.543+10:00</atom:updated><title>Coding Standards and Breaking the Rules - Part 3, Exceptions</title><description>While we are talking about going overboard while following a standard, let's talk about exceptions. Exceptions were introduced late into the C++ standard and so for many years were virtually ignored. Existing code had other mechanisms, libraries did not use them and few developers understood them. Java was developed after that. The development team decided to implement and use (or dare I say overuse) exceptions.
&lt;p&gt;
&lt;h4&gt;What is an Exception?&lt;/h4&gt;
What do you do if some code deep in the bowels of a system comes across a problem it can't handle? What, for example, should a routine to convert a string to a number do if given a non-numeric input string? I saw many approaches in the days prior to exceptions. Some libraries set a global error number - a definite problem if you have multiple threads. I even worked on a C project not that long ago (1999) that passed an error structure around to each and every call in the system.
&lt;p&gt;
Then came exceptions. Just create an appropriate Throwable instance and &lt;b&gt;throw&lt;/b&gt; it. None of the standard contiguous code that follows will be executed until some other code further up the calling graph catches exceptions based on the type you threw. That is not strictly true. Each method up the calling graph can have a &lt;b&gt;finally&lt;/b&gt; block that will execute no matter what exceptions have been thrown. This is especially good for closing resources.
&lt;p&gt;
The theory is that code that does not know what to do with an error condition throws a named exception. Somewhere further up the call graph the code will know what to do or correct and will catch and deal with the problem.

&lt;pre style="FONT-STYLE: italic; FONT-FAMILY: monospace; font-size: small;"&gt;
public int convert( String string)
  {
    if (! aNumber( string))
      throw new NumberFormatException( string+" cannot be made into a number");
    return Integer.parseInt( string);
  }
...
public void process()
  {
    ...
    int result = 0;
    try
      { result = convert( entry); }
    catch (NumberFormatException numberFormatException)
      {
        /* if we can't get a conversion, make it large enough to work. */
        result = 1000;
      }
  }
&lt;/pre&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;

Notice that in the example, &lt;i&gt;convert()&lt;/i&gt; does not know what to do if the conversion cannot be done. The calling method, however, knows what sort of default to apply - and does so. In the real world it should at least write the problem to a log for later evaluation.
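
The example above does not show the &lt;b&gt;finally&lt;/b&gt; block mentioned earlier. Here is a small sketch of the same flow with logging and a finally block added. I have written it in JavaScript, the language used elsewhere on this blog, rather than Java; the try/catch/finally flow is the same in both. The log array stands in for a real log, and all names are for illustration only.

```javascript
var log = [];   // stand-in for a real log file

function convert( text)
  {
    var value = parseInt( text, 10);
    if (isNaN( value))
      throw new Error( text + " cannot be made into a number");
    return value;
  }

function process( entry)
  {
    var result = 0;
    try
      { result = convert( entry); }
    catch (problem)
      {
        // Record the problem, then apply a default large enough to work.
        log.push( problem.message);
        result = 1000;
      }
    finally
      {
        // Runs whether or not an exception was thrown.
        log.push( "processed " + entry);
      }
    return result;
  }

var good = process( "42");
var bad = process( "abc");
```

After both calls the log holds an entry for each input plus the one problem report, so nothing is silently swallowed.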

&lt;h4&gt;Problems with Exceptions&lt;/h4&gt;
&lt;ol&gt;
&lt;li&gt;&lt;B&gt;Overuse:&lt;/B&gt; Current thinking is along the lines of 'if in doubt, throw an exception'. An exception is for an unexpected situation that cannot be dealt with where the problem occurred. First check whether you can handle the problem immediately. If you can, avoid an exception. Then consider how the calling code will handle it. In application code it is often the invoker who will deal with the problem. In this case there are often simpler methods than throwing exceptions to process the situation. If so, once again, no exception.
&lt;li&gt;&lt;B&gt;Overuse:&lt;/B&gt; Yes, I know it is technically the same problem - but it's so prevalent that I have decided to repeat it. So, please read the item above again.
&lt;li&gt;&lt;b&gt;Dangerous Program Flow&lt;/b&gt;: An exception causes program flow to jump out of line with no visual indication in the code. The lack of visual cues makes it difficult to be aware of when important operations will not happen.
&lt;pre style="FONT-STYLE: italic; FONT-FAMILY: monospace; font-size: small;"&gt;
file = new AppFile();
file.write( info);
file.close();
&lt;/pre&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;
In the example above, if the write throws an exception then the file is not closed. If the code is called again it will fail because the file is open and locked.
&lt;li&gt;&lt;b&gt;Insufficient Clarity&lt;/b&gt;: Exception pundits are quick to point out that the code immediately above should be written as:
&lt;pre style="FONT-STYLE: italic; FONT-FAMILY: monospace; font-size: small;"&gt;
file = new AppFile();
try
  { file.write( info); }
finally
  { file.close(); }
&lt;/pre&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;
That's immediately more difficult to read. The natural program flow has been interrupted by the implementation
need for an exception. I prefer the earlier version.
&lt;/ol&gt;

&lt;h4&gt;Types of Exceptions&lt;/h4&gt;
To be accurate, this section describes the types of errors exceptions are designed to address.
&lt;ol&gt;
&lt;li&gt;&lt;b&gt;Fatal errors.&lt;/b&gt; Have you noticed that when Windows 'blue screens' you can't even reboot from the keyboard? The machine needs a hard reset. This is because the system did something that is totally unrecoverable. The code that caught the error cannot even trust the keyboard, so it displays as much as it can then freezes. In client server systems the situation can be even worse. If the server cannot trust the connection to the client, the client can't even be told how or why before the freeze. Needless to say, fatal errors should be rare, since they indicate fragility in critical single-point-of-failure code. An out-of-memory exception is the most common among application servers. While a disk-full error can be handled, it is so rarely something the application can correct that it is usually treated as fatal also.
&lt;li&gt;&lt;b&gt;System errors.&lt;/b&gt; These are the errors that are out of our control. They include disk I/O failures, unexpected closed connections and the like. Many are so unusual that it is unlikely that the code attempts to deal with them. What does your code do if a disk write throws an IOException? The conservative approach would be to treat a disk write IOException as fatal since further writing is likely to corrupt data. Most systems just write such errors to the log and report a generic system error to the user. The question is whether you attempt to write again or save the data elsewhere. Again the problem is rare enough that accounting for it in the code is often overkill. Unexpected connection closures (database, socket, etc) are also system errors. In most cases they are dealt with by logging the surprise and grabbing a new connection. The worst the user will experience is a slight delay. Of course if the problem isn't monitored there is a risk of an epidemic of closures and a slow and unresponsive application.
&lt;li&gt;&lt;b&gt;Inability errors.&lt;/b&gt; These are errors where a piece of code is unable to proceed with the data provided. &lt;i&gt;NumberFormatException&lt;/i&gt; is one of these. They are almost always checked exceptions (although Java's own &lt;i&gt;NumberFormatException&lt;/i&gt; happens to be unchecked), thrown with the expectation that somewhere up the calling tree code will recognise the problem and know what to do. Never rethrow an exception or pass it up the call graph without considering if the code at this point knows what to do. GUI support will recognise &lt;i&gt;NumberFormatException&lt;/i&gt; and know to return an error message to the user. If thrown during a file read the decision may be harder. The best we can often do is log the problem and provide a reasonable default value. Too many developers treat this as a system error.
&lt;li&gt;&lt;b&gt;Programming errors.&lt;/b&gt; These should always be unchecked. They should not have happened and the code is not built to handle them. The best solution is to undo all actions and give the user enough information that they can give support a full picture. I usually return an error code that is unique in the log so that support can find logging around the time of the error. &lt;i&gt;NullPointerException&lt;/i&gt; is the way-too-common example of this class of error.
&lt;/ol&gt;
&lt;h4&gt;So How Do I Break the Rules?&lt;/h4&gt;
&lt;ol&gt;
&lt;li&gt;I always use unchecked exceptions for programming errors. Why complicate the code for situations that cannot be handled anyway?
&lt;li&gt;I use unchecked exceptions for problems I need to communicate directly to the user. Why have to explicitly use "throws" up a tree when the target is always the top?
&lt;li&gt;I only throw/catch exceptions when it is clearly the best or only way. In most cases an error condition will be more useful. If you look at my XML code in the Adept library you will see that it populates and returns a Messages object. If there are no problems, the list will be empty. This is more useful than an exception in this situation because the code can accumulate problems rather than give up on only one.
&lt;li&gt;An expected condition is not an exception. I prefer:
&lt;pre&gt;
if (! open( file))
  tryAnother();
&lt;/pre&gt;
to
&lt;pre&gt;
try
  { open( file); }
catch (FileNotFoundException fileNotFoundException)
  { tryAnother(); }
&lt;/pre&gt;
It's a lot more readable. My XML code works both ways to handle both sorts of clients. After processing you can ask it to throw an exception if errors were found.
&lt;li&gt;I do not consider exceptions part of the tiering system. I do not catch exceptions and rethrow them just to keep exception types within one tier.
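&lt;p&gt;
The accumulate-and-return approach from point 3 can be sketched roughly as follows. This is a minimal illustration only: the &lt;i&gt;Messages&lt;/i&gt; class here is a hypothetical stand-in written for this example and is not the Adept library class, and &lt;i&gt;FieldChecker&lt;/i&gt; and its method names are likewise assumptions.

```java
// Hypothetical stand-in for an accumulating error list; NOT the Adept API.
class Messages
  {
    private final StringBuilder text = new StringBuilder();
    private int count = 0;

    void add( String message)
      {
        text.append( message).append( '\n');
        count++;
      }

    boolean isEmpty() { return count == 0; }

    int size() { return count; }

    /** For clients that prefer exceptions: throws only if problems exist. */
    void throwIfAny()
      {
        if (count > 0)
          throw new IllegalStateException( text.toString());
      }
  }

// Hypothetical client: checks every field instead of stopping at the first.
class FieldChecker
  {
    static Messages validate( String[] fields)
      {
        Messages messages = new Messages();
        for (String field : fields)
          {
            try
              { Integer.parseInt( field.trim()); }
            catch (NumberFormatException e)
              { messages.add( "Not a number: '" + field + "'"); }
          }
        return messages;
      }

    public static void main( String[] args)
      {
        Messages messages
            = validate( new String[] { "12", "x", "7", "y1" });
        System.out.println( messages.size() + " problem(s) found");
      }
  }
```

A caller that wants the list simply inspects it; a caller that prefers exceptions calls throwIfAny() once all the checks have run.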
&lt;/ol&gt;</description><link>http://marringtons.com/Adept/blog/Software.Development/2005/08/coding-standards-and-breaking-rules_24.html</link><author>Paul Marrington</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-9531034.post-112287792606171159</guid><pubDate>Mon, 01 Aug 2005 06:30:00 +0000</pubDate><atom:updated>2007-08-23T12:47:32.113+10:00</atom:updated><title>Coding Standards and Breaking the Rules - Part 2, Beans</title><description>Standardisation isn't just about styles. Very soon after the advent of Java, Sun introduced the concept of the bean. The original idea was to define components in Java. A component is a reusable building block. In its purest form you should not know or care even which computer a bean is running on.
&lt;p&gt;
Computer independence implies an interface such as that defined for CORBA or COM/DCOM. This means methods only, not direct data access. If a calling class asks for data it is a method call that could be serviced locally or sent down the wire to another computer.
&lt;p&gt;
The bean provides the same facilities by convention, not by language restriction. The convention requires that data always be private and accessed by getter and setter methods. Groovy, a Java spin-off that creates Java class files, generates getter and setter calls whenever the data is accessed.
&lt;p&gt;
Originally beans were created for reusable components - often Swing GUI - such as buttons. Later the concept was extended to server code and the EJB (Enterprise JavaBean).
&lt;p&gt;
The militant took the concept of the bean and made it law:
&lt;ol&gt;
&lt;li&gt;Thou shalt not allow outside access to data. Always make it private.&lt;/li&gt;
&lt;li&gt;Thou shalt always access data with getters and setters. No exceptions, heathen!&lt;/li&gt;
&lt;li&gt;Thou shalt burn in oil anyone who does not follow rules one and two at all times. Toss in some fries will you? I do so love those. &lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;
The claimed benefits are almost universally believed today, and yet hold less water than the average funnel.
&lt;ol&gt;
&lt;li&gt;&lt;b&gt;Making data public is a security violation. Anyone could change it&lt;/b&gt;. Yep, that's true. Then why does everyone add setters to all data as a matter of course? And if you choose not to have public setters, how do you set the fields? You could load them in the constructor, but this quickly becomes unwieldy unless there are six or fewer fields to set. You could make the setters package private, but this is limiting in the wrong way. The C++ concept of 'friend' would be useful here.
&lt;li&gt;&lt;b&gt;You can provide a consistent interface that will not change with implementation changes&lt;/b&gt;. Again there is a kernel of truth. A getter that retrieves data could do so from a field, make a call to a database, or cache data from an external source. Very cool, except that everyone assumes that a getter is retrieving from a field with minimal overhead. I don't know how often I have seen code that calls a getter multiple times in a method rather than calling it once and saving the result in a local variable. Yes it's bad practice, but it's commonplace and has some real risks. If the getter is converted to a more complex retrieval (or one that goes down a wire as in COM), the calling method that calls it multiple times becomes very inefficient. I prefer a system that separates simple retrievals from the more complicated. See below.
&lt;/ol&gt;
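&lt;p&gt;
To make the second point concrete, here is a small sketch of what happens when an innocent-looking getter becomes expensive. All names (&lt;i&gt;RateSource&lt;/i&gt;, &lt;i&gt;GetterCost&lt;/i&gt;) are invented for the illustration, and the counter merely stands in for a database or remote call.

```java
// Hypothetical class: the getter looks cheap but does "expensive" work.
class RateSource
  {
    int retrievals = 0;   // counts how often the expensive work runs

    /** Imagine a database query or a call down the wire here. */
    double getRate()
      {
        retrievals++;
        return 0.25;
      }
  }

class GetterCost
  {
    /** Calls the getter on every pass through the loop. */
    static double totalCallingEachTime( RateSource source, double[] amounts)
      {
        double total = 0;
        for (double amount : amounts)
          total += amount * source.getRate();
        return total;
      }

    /** Calls the getter once and keeps the value in a local variable. */
    static double totalCallingOnce( RateSource source, double[] amounts)
      {
        double rate = source.getRate();
        double total = 0;
        for (double amount : amounts)
          total += amount * rate;
        return total;
      }

    public static void main( String[] args)
      {
        double[] amounts = { 10, 20, 30 };
        RateSource eachTime = new RateSource();
        RateSource once = new RateSource();
        totalCallingEachTime( eachTime, amounts);
        totalCallingOnce( once, amounts);
        System.out.println( eachTime.retrievals
            + " retrievals vs " + once.retrievals);
      }
  }
```

Both methods return the same total; the difference only shows up in how many times the retrieval runs, which is exactly why it goes unnoticed while the getter is still a plain field read.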

&lt;h4&gt;Class Differentiation&lt;/h4&gt;

I don't treat all classes the same. At a gross level, I divide classes into groups:
&lt;ol&gt;
&lt;li&gt;&lt;B&gt;Beans&lt;/B&gt;. These are classes that interact with other packages, libraries or frameworks that expect that standard bean protocol. Giving credit where it is due - the bean protocol is clear and works well.
&lt;li&gt;&lt;B&gt;Data Transfer Objects&lt;/B&gt;. Commonly called DTOs (and sometimes, loosely, DAOs, although a Data Access Object is strictly a different pattern). When talking between sub-systems, tiers or layers, it's almost always necessary to pass specific information back and forth. Because this information is passed between disparate groups they neither want nor need processing internals. Sometimes a single DTO can be used for multiple interfaces. Mostly this will lead to DTO classes that are only partially filled or relevant to the receiving class. I commonly make the DTO a static inner class of the owning logic class and use deep copy routines where possible to ease data transfer.
&lt;li&gt;&lt;B&gt;Implementation Objects&lt;/B&gt;. These are the objects that do the work - be it define business logic or provide validation. For these items, being an object is just a convenience thing - allowing for working data encapsulation.
&lt;li&gt;&lt;B&gt;Object Objects&lt;/B&gt;. Early OO design documentation talked about an object that represented a car, with objects that extended it for each type of car. In the 'real' world of application development, it is not all that common to create objects that truly represent something.
&lt;/ol&gt;
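&lt;p&gt;
As an illustration of the Data Transfer Object arrangement described above, the sketch below nests a data-only class inside the logic class that owns it and hands out deep copies. &lt;i&gt;OrderLogic&lt;/i&gt;, &lt;i&gt;OrderData&lt;/i&gt; and their fields are assumptions made up for this example.

```java
// Hypothetical logic class that owns its transfer object.
class OrderLogic
  {
    /** Transfer object: data only, nested in the class that owns it. */
    static class OrderData
      {
        public String customer;
        public int[] quantities;

        /** Deep copy, so a receiver cannot alter the owner's array. */
        OrderData copy()
          {
            OrderData duplicate = new OrderData();
            duplicate.customer = customer;   // String is immutable
            duplicate.quantities = quantities.clone();
            return duplicate;
          }
      }

    private final OrderData current = new OrderData();

    OrderLogic( String customer, int[] quantities)
      {
        current.customer = customer;
        current.quantities = quantities.clone();
      }

    /** Hands a detached copy to another tier or sub-system. */
    OrderData export() { return current.copy(); }

    int totalQuantity()
      {
        int total = 0;
        for (int quantity : current.quantities)
          total += quantity;
        return total;
      }

    public static void main( String[] args)
      {
        OrderLogic logic = new OrderLogic( "Acme", new int[] { 1, 2, 3 });
        OrderData sent = logic.export();
        sent.quantities[0] = 100;   // receiver changes its own copy only
        System.out.println( logic.totalQuantity());
      }
  }
```

Because the receiver only ever sees a copy, the logic class keeps control of its working data without needing any getters or setters on the transfer object.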

&lt;h4&gt;How I Break the Rules&lt;/h4&gt;
&lt;ol&gt;
&lt;li&gt;&lt;B&gt;Beans&lt;/B&gt;. I don't, obviously. A bean has a rigid interface definition that must be adhered to if it's to work with outside components. I would go so far as to say that when I use beans, they are only beans. They're an interface object that does just that and only that - interface. Beans should use implementation and object objects to do the 'real' work. A clear separation of functionality is a clear and very useful architectural principle to apply. It would be acceptable for the bean to consolidate information and implementation from various parts of the system to provide the results required. A bean that responds to the press of a GUI button may validate and then act.
&lt;li&gt;&lt;B&gt;Data Transfer Objects&lt;/B&gt;. The clear architectural imperative to separate functionality applies double to DTOs. This is because you may want to send a DTO down the wire or write it to a file or database. To help this, minimise the use of sub-objects and make sure they are also DTOs. If they are to be persisted or used remotely, make very sure they do not contain time and space sensitive data such as handles or security packets. One of the funniest I've seen was a system that kept a reference to a session in a DTO sent to a JMS queue. Everything was fine until the system was restarted with items still on the queue. Then, kaboom. Back to breaking the rules... For me, a pure DTO has no methods whatsoever - just public data. I can hear the pundits screaming from here! But what benefit getters and setters? If you have both then you might as well have public fields. If you only provide getters and set in the constructor, then this is the same as using final public fields.
&lt;li&gt;&lt;B&gt;Implementation Objects&lt;/B&gt;. In simpler cases they are often just a collection of static methods, such as the library &lt;I&gt;Collections&lt;/I&gt; class. Often an implementation object is more complex where the client needs to call multiple methods on common data to produce the correct results. In these cases a new instance is created - but only for convenience. It provides a container for working information. If you accept the definition of an Implementation Object, then I do not break the rules. Methods are an action and often return a reference to the object so that calls can be chained:
&lt;pre&gt;
new ManufactureProcess()
  .rawMaterial( m).cut( c)
  .bend( b).paint( colour);
&lt;/pre&gt;
  More often they will be separate statements to account for optional operations.
&lt;li&gt;&lt;B&gt;Object Objects&lt;/B&gt;. Remember your early training in OO. I treat data storage and retrieval the same for Implementation and Object objects. Firstly I make it private when I can, package private when it is only needed in that sphere and only public when I must. This isn't for security reasons - it just makes the class easy to use. Private and package private are implementation artifacts. When writing a client you need not have them cluttering the documentation nor feel you can use them inappropriately. So, what happens when setting data is a little more complex than copying its reference? It may be cached or it may require unit conversion. I'll make the field private and create a setter and getter, but using the same name as the field without prefixing it with 'set' or 'get'. This way I know that it's not a bean and that working with this field takes work. Oh dear, I can hear those pundits screaming again. What, they say, now every client that uses my class will need to be changed. That is exactly what I want. The writer of the client will have to investigate how the change affects their code. It all depends on the data lifetime that the client requires. They may be able to use the getter to retrieve one copy in the running of the program. More likely they will take a local reference with a single call outside all loops and passes. But only the client writer can be sure of the best valid lifetime. It may be that they do have to retrieve it directly every time because the returned result is just that volatile. By using the same name for the data and the getter and setter methods, the changes made to the client are minimal and obvious.
&lt;/ol&gt;
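&lt;p&gt;
The chained calls shown for Implementation Objects rely on each action method returning the object itself. A rough sketch of how that might look follows; the step names come from the snippet above, but the parameter types and the recorded summary are assumptions added purely for illustration.

```java
// Illustrative only: parameter types and summary() are assumptions.
class ManufactureProcess
  {
    private final StringBuilder steps = new StringBuilder();

    ManufactureProcess rawMaterial( String material)
      { return step( "material " + material); }

    ManufactureProcess cut( int lengthInMm)
      { return step( "cut " + lengthInMm + "mm"); }

    ManufactureProcess bend( int degrees)
      { return step( "bend " + degrees + " degrees"); }

    ManufactureProcess paint( String colour)
      { return step( "paint " + colour); }

    /** Records one step and returns this object so calls can chain. */
    private ManufactureProcess step( String description)
      {
        if (steps.length() > 0)
          steps.append( ", ");
        steps.append( description);
        return this;
      }

    String summary() { return steps.toString(); }

    public static void main( String[] args)
      {
        // Chained form.
        ManufactureProcess chained = new ManufactureProcess()
          .rawMaterial( "steel").cut( 500)
          .bend( 90).paint( "red");

        // Separate statements allow for optional operations.
        boolean needsPaint = false;
        ManufactureProcess separate = new ManufactureProcess();
        separate.rawMaterial( "steel");
        separate.cut( 500);
        if (needsPaint)
          separate.paint( "red");

        System.out.println( chained.summary());
        System.out.println( separate.summary());
      }
  }
```

The instance exists only to hold working information between the calls, which is the sense in which it is created "for convenience".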

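&lt;p&gt;
The field-named getter and setter described under Object Objects look like this:
&lt;pre style="FONT-STYLE: italic; FONT-FAMILY: monospace; font-size: small;"&gt;
// -1 means "not loaded yet"
private int idleTimeInSeconds = -1;

// Getter: same name as the field, loads lazily on first use.
public int idleTimeInSeconds()
  {
    if (idleTimeInSeconds &lt; 0)
      idleTimeInSeconds(
        Properties.get( "idleTimeInSeconds"));
    return idleTimeInSeconds;
  }

// Setter: same name as the field, validates before storing.
// The exception is unchecked so no "throws" clause is needed.
public void idleTimeInSeconds( int newIdleTime)
  {
    if (newIdleTime &lt;= 0  ||  newIdleTime &gt; 1000)
      throw new IllegalArgumentException(
        "idleTimeInSeconds out of range: " + newIdleTime);
    idleTimeInSeconds = newIdleTime;
  }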
&lt;/pre&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;</description><link>http://marringtons.com/Adept/blog/Software.Development/2005/08/coding-standards-and-breaking-rules.html</link><author>Paul Marrington</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-9531034.post-112202898645155243</guid><pubDate>Fri, 22 Jul 2005 10:37:00 +0000</pubDate><atom:updated>2005-07-22T20:43:06.456+10:00</atom:updated><title>Coding Standards and Breaking the Rules - Part 1, Layout</title><description>&lt;h1&gt;Coding Standards and Breaking the Rules - Part 1, Layout&lt;/h1&gt;

&lt;h4&gt;Code Layout&lt;/h4&gt;
I have always been a bit of a heretic when it comes to coding standards. I like,
for example, to have the braces on their own lines with a double-indent before and after.

&lt;pre style="FONT-STYLE: italic; FONT-FAMILY: monospace; font-size: small;"&gt;
    if (instanceData == null)
      {
        instanceData = new InstanceData();
        /* more code goes here */
      }

    /* Code outside the if */
&lt;/pre&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;

I have never had any trouble reading code using different layout styles - except that some styles are intrinsically a little harder to scan.

&lt;pre style="FONT-STYLE: italic; FONT-FAMILY: monospace; font-size: small;"&gt;
    if (instanceData == null) {
        instanceData = new InstanceData();
        /* more code goes here */
    }

    /* Code outside the if */
&lt;/pre&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;

I do concede, however, that a single style should be used in a file or at least method body. Otherwise it can be very confusing.
&lt;p&gt;
Having said that, I recognise that many people are militant about enforcing layout styles, and it is really not so important to me that I will make an issue for them.
&lt;p&gt;
Of course if you want a particular style, all the IDEs will reformat a file to your specifications. Don't do it to others. Many people get annoyed when their file is changed. More importantly, don't do it during the release phase of an iteration. Tracking changes is much harder when the whole file has changed.</description><link>http://marringtons.com/Adept/blog/Software.Development/2005/07/coding-standards-and-breaking-rules.html</link><author>Paul Marrington</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-9531034.post-112116379159324583</guid><pubDate>Tue, 12 Jul 2005 10:21:00 +0000</pubDate><atom:updated>2005-07-12T20:27:11.206+10:00</atom:updated><title>What and where to document</title><description>Software developers believe in the holy grail of self-documenting code. Sorry, Sir Gawain - but it doesn't exist. Any developer who has gone back to maintain code written a month ago will often have as much difficulty understanding it as if it had been written by another. And yet still the developer will go to any length to write code but not comments. I know, I was one of the worst 'self-documenters' - and most developers who have worked for me still are.
&lt;p&gt;
My journey to code documentation nirvana started when I decided to port a 10 year old C++ library to Java. The names I had thought so descriptive then just baffled me now. They told me 'what', and clear code structure told me 'how' - but nothing told me 'why'.
&lt;p&gt;
Every now and then over the years I find myself fixing code written by my team. It's a good productive form of code review. I used to just put a comment against the end of changed lines with my initials, date and a comment on what was changed in one short sentence. I felt that this way I was not confusing the flow of the code. Silly. Why a problem occurred and how it was fixed provide important information that must be placed in the code. Otherwise the next person in line will reverse your changes to 'fix' another problem.
&lt;p&gt;
So I love JavaDoc. I have always believed that good documentation should be in the code. It's the only place of any use to maintenance staff - and later users of the class. Have a look at my Adept library at &lt;a href="http://library.marringtons.com/doc/javadoc"&gt;http://library.marringtons.com/doc/javadoc&lt;/a&gt;. I hope you can see the difference between my documentation and the standard comments you see out there for open source projects. And internal commercial code documentation is not as good! Every good IDE uses the generated JavaDoc. When you pass the mouse over a method, you can get a description of why and how it's used - assuming it doesn't simply pop up with 'TODO'.
&lt;p&gt;
Good JavaDoc is great for 'how', but it's still not enough for 'why'. 'Why' is the result multiplied by 1.1512? 'Why' are we looping through a list to get a single value rather than asking the database for it directly? For almost every active line of code in an application there is a story. And next time you need to maintain that piece of code, the story will be of immense value.
&lt;p&gt;
Don't try and go back to document all that old code. Whenever you make a change, however, give us the gossip. You won't regret it.
&lt;p&gt;
I'll leave you with an example of how I write code now.

&lt;pre style="FONT-STYLE: italic; FONT-FAMILY: monospace; font-size: small;"&gt;
/**
 * Collect common instanceData for all Panel commands.
 * Called before any command is executed.
 * @see com.marringtons.adept.action.Action#setup()
 */
protected void setup()
  {
    /*
     * First we retrieve the session scope panel data -
     * including a list of panels (opened and closed).
     */
    sessionData
        = (SessionData) request.session.get( panelCacheKey);
    /*
     * The panel ID can be from target or id HTML tag
     * parameters.
     */
    String id = parameters.get( "target", parameters.get( "id"));
    /*
     * If the command is not related to a particular
     * window then we cannot continue.
     */
    if (id == null)
      return;
    /*
     * We cannot be sure whether the id starts with
     * 'panel.' or not. It depends on where it was
     * generated in the JavaScript.
     */
    if (id.startsWith( "panel."))
      id = id.substring( 6);
    /*
     * We need to remove the ID from the parameters
     * so it does not add 'panel.' back.
     */
    parameters.remove( "id");
    /*
     * Given the id of a panel, retrieve it from a
     * panel instanceData structure in the session.
     * This instanceData can be updated and persists
     * between program runs.
     */
    instanceData = (InstanceData) sessionData.portals.get( id);
    if (instanceData == null)
      {
        instanceData = new InstanceData();
        instanceData.id = instanceData.title = id;
        instanceData.url = "";
        instanceData.showTab =
            instanceData.showBorder =
            instanceData.showShadows =
            instanceData.allowResize = true;
        instanceData.x = (nextX += 50);
        if (nextX &gt; 600) nextX = 0;
        instanceData.y = (nextY += 50);
        if (nextY &gt; 400) nextY = 0;
        instanceData.width = instanceData.height = 200;

        sessionData.portals.put( id, instanceData);
        instanceData.focusOrder = instanceData.openOrder
            = ++sessionData.focusCounter;
      }

    /*
     * Move parameters from the command line into
     * the instance data.
     */
    ObjectScraper.fromProperties( instanceData, parameters, null);
  }
&lt;/pre&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;</description><link>http://marringtons.com/Adept/blog/Software.Development/2005/07/what-and-where-to-document.html</link><author>Paul Marrington</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-9531034.post-112026845318857338</guid><pubDate>Sat, 02 Jul 2005 01:39:00 +0000</pubDate><atom:updated>2005-07-02T11:41:34.406+10:00</atom:updated><title>Application Servers vs CGI - why there is no clear winner</title><description>With CGI becoming redundant as time passes and technology moves on, it may seem a little dated to be comparing it with application servers. Bear with me - because now is the perfect time to look at the differences and to clearly understand what we are leaving behind. A retrospective will help us make best use of current and future technologies. For the sake of completeness I'll describe the technologies - from my point of view, of course.
&lt;h4&gt;What is CGI?&lt;/h4&gt;
When the web was new and we still printed out on stone tablets, the first HTML web browsers and servers were created (and Al Gore looked onto all that he thought he had created, and it was good). The first servers just sent files directly to the browser for interpretation. This worked fine while the web was full of static content such as documents and images. Information dissemination is still the primary purpose of the internet. However, the original web was created and used by technophiles who soon wanted their web servers to do something dynamic - like show the time of day, or list information from a database.
&lt;p&gt;
What would be the simplest and most flexible way to do this? Ah yes! To treat the browser as just another terminal client to the Unix server. If the command that the browser gave the web server was of a registered 'CGI' type, it ran as a standard operating system command, just as if it were typed in at a command prompt. The output that would normally be sent to the terminal was fired right back to the browser. The benefits were immediately apparent:
&lt;ol&gt;
&lt;li&gt;The creator could use whatever computer language they were comfortable with: shell scripts, C, C++, Basic, Perl or any other program that can be run from a command line.
&lt;li&gt;Debugging was easy - one could simply run the script in the command line and look at the output.
&lt;li&gt;It worked the same way as other programs and scripts these early implementors were used to.
&lt;/ol&gt;
The only down-side was that the CGI program had to be bright enough to format its responses in a way that the browser would understand. This meant it needed to send a correct HTTP header as well as the HTML content. This was (and still is) simple stuff, easily solved with trivial library routines.
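&lt;p&gt;
As a minimal sketch of that formatting requirement, the program below writes a header, a blank line and then the content to standard output, which is all a web server expects from a CGI program. Java is used only to match the other examples here, the class name is invented, and plain text is sent rather than HTML to keep the sketch short (an HTML page would declare text/html instead).

```java
// Hypothetical CGI-style program: everything goes to standard output.
class TimeOfDayCgi
  {
    /** Builds a CGI response: header line, blank line, then the body. */
    static String response( String body)
      {
        return "Content-Type: text/plain\r\n"
            + "\r\n"
            + body + "\r\n";
      }

    public static void main( String[] args)
      {
        // The web server relays whatever is printed back to the browser.
        System.out.print(
            response( "The time is " + java.time.LocalTime.now()));
      }
  }
```

Running it from a command prompt shows exactly what the browser would receive, which is the debugging convenience mentioned in the list above.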

&lt;h4&gt;What is an Application Server?&lt;/h4&gt;
Well, technically even the first web server was an application server. It knew how to interpret a browser command, read files from disk, add an appropriate HTTP header and send it back to the browser. The one thing missing was programmability - the ability to give it instructions that will change the data to be sent to the browser.
&lt;p&gt;
The first attempt at programmability was with CGI as described above. There were real and perceived problems with CGI that led to the invention of the first true application servers.
&lt;p&gt;
An application server is a program that runs continually, in parallel with the web server. The web server knows how to interpret requests for itself and any application servers attached to it. The application server interprets string commands sent to it into a call to specific methods, functions or code with attached parameters.
&lt;p&gt;
The term application server was coined by the corporate world. Microsoft sport an ASP (read embedded VB) front end with C# for computational work in the .NET framework. IBM, Sun and others sport J2EE, being embedded Java (JSP) and Java on a J2EE-compliant application server for the heavy stuff.
&lt;p&gt;
The smaller Internet sites prefer a simpler approach. PHP is an embedded-only system that is still, by definition, an application server. While Perl is normally used for CGI, there is a module for the Apache web server called mod_perl that turns it into a basic application server.

&lt;h4&gt;CGI Benefits&lt;/h4&gt;
&lt;ol&gt;
&lt;li&gt;&lt;b&gt;Easier to develop in.&lt;/b&gt; Since each browser/server exchange runs a unique program, it can be implemented and tested in almost complete isolation. This makes it simple to implement a web site page by page.
&lt;li&gt;&lt;b&gt;No garbage to collect.&lt;/b&gt; Each browser/server exchange runs a separate program that does its job and exits in a fraction of a second. There are no opportunities for memory leaks or memory hogging code to clutter up a system and bring it to a grinding halt.
&lt;/ol&gt;

&lt;h4&gt;CGI Disadvantages&lt;/h4&gt;
&lt;ol&gt;
&lt;li&gt;&lt;b&gt;Resource Intensive.&lt;/b&gt; Each exchange runs a new program! For an operating system to run a program, it must execute some expensive operations. Every type of operating system has a latency when a program starts, so for CGI this latency adds to each exchange. However, it's important to note that the latency is minimised because the CGI interpreter will be cached when loaded so regularly.
&lt;li&gt;&lt;b&gt;No Pooling.&lt;/b&gt; Because CGI runs an independent program for each exchange, there is no opportunity to pool resources that are expensive to get. The classic example is database connections. On large commercial relational databases getting a connection can take seconds. Most application servers pool connections and reuse them when needed. CGI provides no such facilities.
&lt;li&gt;&lt;b&gt;Limited Session Data.&lt;/b&gt; The only way to store data to be used between exchanges is in cookies held by the browser and sent as part of the HTTP header in each exchange. The amount and format of information stored this way is limited both by practicality and by the browser. A typical restriction is 20 cookies of no more than 4kb each.
&lt;/ol&gt;

&lt;h4&gt;Application Server Benefits&lt;/h4&gt;
&lt;ol&gt;
&lt;li&gt;&lt;b&gt;Session Data Available.&lt;/b&gt; Most application servers use a single cookie or URL parameter to hold a unique session key. This key can be used to return a session data structure on the server. Size is limited only by the system. If you expect 100 simultaneous sessions to be running on a 2Gb server with 1Gb allocated to the application server, then you will not want sessions to exceed 10Mb. In practice unique session data is far smaller than that. More importantly, because the data is held in server classes (whether Java, C# or whatever), it is not subject to the restrictions placed on text-only cookies.
&lt;li&gt;&lt;b&gt;Pooling is possible.&lt;/b&gt; Outside services such as databases, workflow engines or external services are expensive to connect to, in that a connection can take seconds to establish. Most application servers implement a pool for this situation. Once a session or conversation is finished with a connection it's returned to the pool for someone else to use.
&lt;li&gt;&lt;b&gt;Fast Exchange Startup.&lt;/b&gt; A CGI exchange requires that a program be run every time the browser requests a conversational exchange. For an application server it is merely the interpretation of a command to call an internal method or function. In theory, it should be quite a bit faster.
&lt;li&gt;&lt;b&gt;Scalable.&lt;/b&gt; The additional control provided by an application server allows it to be designed to work across one to many physical servers.
&lt;/ol&gt;
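&lt;p&gt;
The pooling benefit can be sketched in a few lines. This is a deliberately simplified illustration under stated assumptions: &lt;i&gt;PooledConnection&lt;/i&gt; is a hypothetical stand-in for any resource that is slow to open, and real application server pools add timeouts, health checks and size limits that are omitted here.

```java
// Hypothetical stand-in for a resource that takes seconds to open.
class PooledConnection
  {
  }

class ConnectionPool
  {
    private final PooledConnection[] idle;
    private int idleCount = 0;
    private int opened = 0;

    ConnectionPool( int capacity)
      {
        idle = new PooledConnection[capacity];
      }

    /** Reuses an idle connection when one exists, else opens a new one. */
    synchronized PooledConnection acquire()
      {
        if (idleCount > 0)
          {
            PooledConnection reused = idle[--idleCount];
            idle[idleCount] = null;
            return reused;
          }
        opened++;   // the expensive path
        return new PooledConnection();
      }

    /** Hands a connection back for reuse; surplus ones are dropped. */
    synchronized void release( PooledConnection connection)
      {
        if (idle.length > idleCount)
          idle[idleCount++] = connection;
      }

    synchronized int opened() { return opened; }

    public static void main( String[] args)
      {
        ConnectionPool pool = new ConnectionPool( 2);
        PooledConnection first = pool.acquire();
        pool.release( first);
        PooledConnection second = pool.acquire();
        System.out.println( "same connection reused: " + (first == second));
        System.out.println( "connections opened: " + pool.opened());
      }
  }
```

The pool has to outlive individual exchanges to be of any use, which is why a CGI program that starts and exits per request cannot offer the same thing.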

&lt;h4&gt;Application Server Disadvantages&lt;/h4&gt;
&lt;ol&gt;
&lt;li&gt;&lt;b&gt;Memory and Resource Leaks.&lt;/b&gt; Typically an application server is a program that runs for days or even months between restarts. If a program has memory leaks (and, yes this is possible with garbage collection), it can cause a system to run slower as time goes by. Resource leaks can happen occasionally in rarely occurring logic conditions, causing problems that only occur in production.
&lt;li&gt;&lt;b&gt;RAM Fragmentation.&lt;/b&gt; Garbage collectors cause RAM fragmentation. Like memory leaks it causes the program to run slower over time. A good garbage collector will clean itself up, but it's next to impossible to avoid increasing fragmentation over time as long-lived objects break up the physical RAM.
&lt;/ol&gt;
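&lt;p&gt;
For readers wondering how a leak is possible with garbage collection, here is one common shape, sketched with invented names: anything reachable from static data can never be collected, so a structure that only ever grows will hold memory for the life of the server process.

```java
// Hypothetical example of a leak in a garbage-collected program.
class VisitTracker
  {
    // Static, so everything stored here stays reachable, and therefore
    // uncollectable, for as long as the server process runs.
    private static String[] visitors = new String[4];
    private static int count = 0;

    /** Remembers every session ever seen; nothing is ever removed. */
    static synchronized void record( String sessionId)
      {
        if (count == visitors.length)
          visitors = java.util.Arrays.copyOf( visitors, count * 2);
        visitors[count++] = sessionId;
      }

    static synchronized int retained() { return count; }

    public static void main( String[] args)
      {
        for (int request = 0; request != 1000; request++)
          record( "session-" + request);
        System.out.println( retained() + " entries still referenced");
      }
  }
```

In a program that exits after each exchange this would be harmless; in one that runs for months the array simply keeps growing.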

&lt;h4&gt;Which is Faster?&lt;/h4&gt;
Common sense tells us that an application server should be much faster than CGI. Imagine running a program every time we have an exchange between browser and server. Empirical evidence does not support the theory. In practice CGI systems doing similar jobs provide similar levels of performance. Why? The overhead of program start in CGI is offset by the speed of running 'clean' programs. Also, CGI writers tend to write for performance when dealing with data. Simple hash databases such as Berkeley DB are much faster than large relational systems. A CGI writer is also more likely to create static pages offline if they change once a day or less rather than generating them from data on the fly.

&lt;h4&gt;Which is Better?&lt;/h4&gt;
All my architectural training tells me that application servers are the way to go. So, why can I produce a CGI system more quickly that is easier to release and maintain?

&lt;h4&gt;And the Winner is...&lt;/h4&gt;
Application servers &lt;em&gt;do&lt;/em&gt; come out on top, but for the most part because they're politically correct. The availability of session data is a convenience, but resource pooling can sometimes be essential. There are valid work-arounds for both problems in CGI but these cannot overcome the pressures of 'correct' architecture.</description><link>http://marringtons.com/Adept/blog/Software.Development/2005/07/application-servers-vs-cgi-why-there.html</link><author>Paul Marrington</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-9531034.post-111817852294527320</guid><pubDate>Tue, 07 Jun 2005 21:07:00 +0000</pubDate><atom:updated>2005-06-08T07:08:42.953+10:00</atom:updated><title>Synchronized - Dos and Don'ts</title><description>This is not a training session on multi-tasking in Java. I'll leave that hot potato for the more qualified technical writers. This is a starting set of guidelines for things to watch in code that can be run by more than one thread at any one time. In a J2SE environment this can only happen if you create and start threads. In any application server environment, from Tomcat to Weblogic, each client has at least one thread - and they can all be running in common code. So:
&lt;ul&gt;
&lt;li&gt;&lt;b&gt;Do&lt;/b&gt; look closely at each and every piece of static data in the system. Threads can only clash when sharing static data, or data with a static element somewhere in the reference chain.
&lt;li&gt;&lt;b&gt;Don't&lt;/b&gt; put static data in library routines, as you can't be sure when they'll be used by threaded code. A classic problem is a cache, which is usually a synchronised static map. Because any data retrieved from such a cache has a static element in the reference chain, there is a risk of it being accessed by more than one thread. In my code this happened with the session cache. A single browser inside a single session can post two almost simultaneous requests serviced by two separate server threads - both accessing the same session data. 
&lt;li&gt;&lt;b&gt;Do&lt;/b&gt; use &lt;i&gt;synchronized&lt;/i&gt; to wrap any code that can be run by more than one thread and accesses common data. It will cause second and subsequent threads to wait until the first thread to enter a &lt;i&gt;synchronized&lt;/i&gt; block on the same object has left it.
&lt;li&gt;&lt;b&gt;Don't&lt;/b&gt; synchronize everything. It's a common beginner's mistake to synchronize everything in a class to forestall any problems. Firstly, it won't stop problems, as deadlocks are still possible; secondly, it can make a program impossibly slow by creating massive bottlenecks.
&lt;li&gt;&lt;b&gt;Do&lt;/b&gt; keep synchronised code sections to a minimum. I realise that technically this is the same point as the one above - but it's so important that it bears repeating. I can't believe how often I have seen synchronised methods that call logging functions. Logging, like disk or network I/O, can cause delays - not to mention the CPU time required to turn data into human readable form. If it's a popular piece of code, system response drops off while many threads wait for one to complete. Telltale symptoms of this mistake, then, are low CPU usage with long response times.
&lt;li&gt;&lt;b&gt;Don't&lt;/b&gt; use synchronised maps and lists unless they are absolutely necessary. There is no need to synchronise a map that's inside synchronised code in your application. Unlike items synchronised to the same semaphore, doing this will incur a considerable overhead to no advantage. The older Java containers were synchronised by default, whereas the new aren't.
&lt;li&gt;&lt;b&gt;Do&lt;/b&gt; think carefully about any synchronised code - because it's virtually impossible to unit test. Problems usually only manifest themselves in production with heavy varied load. These problems aren't easy to reproduce on request.
&lt;li&gt;&lt;b&gt;Don't&lt;/b&gt; assume that because code always works on a single-CPU desktop or test machine it will also work on a multi-CPU server. Even desktops will be vulnerable once the new multi-core CPUs hit the market in quantity. A single CPU system is still linear; even when it looks like it's multi-tasking it's really just swapping between tasks very quickly. A multi-CPU system can have completely separate processors running the same code and accessing the same data. Certainly &lt;i&gt;synchronized&lt;/i&gt; should work the same way in both cases, but the completely different mechanisms employed mean that your code will be exercised differently. The end result is that code that works perfectly well on a single-CPU system may fail randomly on a 2 or 4 CPU server.
&lt;/ul&gt;
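&lt;p&gt;
The advice above about keeping synchronised sections small, and keeping logging out of them, can be sketched as follows. The class and method names are hypothetical; the point is only that the lock covers the shared update and nothing else, while the slower message formatting and output work on a local copy.

```java
// Hypothetical counter shared by many threads.
class HitCounter
  {
    private int hits = 0;

    /** Only the shared update is locked; formatting and output are not. */
    int recordHit( String page)
      {
        int snapshot;
        synchronized (this)
          {
            snapshot = ++hits;   // shortest possible critical section
          }
        // Slow work happens outside the lock, using local data only.
        log( "hit " + snapshot + " on " + page);
        return snapshot;
      }

    synchronized int hits() { return hits; }

    private void log( String message)
      {
        System.out.println( message);
      }
  }

class HitCounterDemo
  {
    /** Runs several threads against one counter; returns the final count. */
    static int run( int threadCount, int hitsPerThread)
      {
        HitCounter counter = new HitCounter();
        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t != threadCount; t++)
          {
            threads[t] = new Thread( () ->
              {
                for (int i = 0; i != hitsPerThread; i++)
                  counter.recordHit( "/home");
              });
            threads[t].start();
          }
        for (Thread thread : threads)
          {
            try
              { thread.join(); }
            catch (InterruptedException e)
              { Thread.currentThread().interrupt(); }
          }
        return counter.hits();
      }

    public static void main( String[] args)
      {
        System.out.println( "final count: " + run( 4, 100));
      }
  }
```

Had recordHit been declared synchronized as a whole, every thread would queue behind the logging call as well as the increment, which is the low-CPU, long-response-time symptom described above.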

Enough, enough already. I think you get the gist: multi-tasking is a specialist field and any demonstration should come with a 'don't try this at home' tag. Application servers attempt to make the multi-tasking invisible at the expense of performance.
&lt;p&gt;
I always like to give an example. Unfortunately, thread awareness is such a complex issue that it's impossible to give a simple valid example. The best I could find in my code was in a double-hash database index. Don't try and understand the code out of context - download it at &lt;a href="http://library.marringtons.com"&gt;The Adept Library&lt;/a&gt; if you want to do that. This is an extreme case for minimising synchronised sections. If the method were synchronised it would lock out for a long time. Instead, only the bucket change is synchronised. 
&lt;p&gt;Note that the test is made twice - once to see if we need to do it and once inside the synchronised block to check that it has not changed. The odds of two threads deleting the same index at exactly the same time are probably billions to one, but it is not hard to protect against if you think about it. Because only the change is synchronised, care is taken in the rest of the code that reads are consistent even if the data under them changes. This takes thought, but is a lot more efficient than synchronising everything.
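&lt;p&gt;Stripped of the index details, the test / lock / re-test shape described above might look something like the sketch below. The class and method names are hypothetical and are not taken from the Adept Library. It also leans on an assumption that the counts only ever go down, so a stale read outside the lock can do no worse than send a thread into the lock for a second look.

```java
// Hypothetical sketch: names are illustrative, not Adept Library code.
class SharedCounts {
    private final int[] counts;
    private final Object lock = new Object();

    SharedCounts(int slots, int initial) {
        counts = new int[slots];
        java.util.Arrays.fill(counts, initial);
    }

    /** Decrements the slot unless it is already zero; returns true if it changed. */
    boolean release(int slot) {
        if (counts[slot] > 0) {              // cheap test outside the lock
            synchronized (lock) {
                if (counts[slot] > 0) {      // repeat the test now that we own the lock
                    counts[slot]--;
                    return true;
                }
            }
        }
        return false;                        // nothing left to release
    }

    /** Reads a slot under the lock so the caller sees the latest value. */
    int get(int slot) {
        synchronized (lock) {
            return counts[slot];
        }
    }
}
```

&lt;p&gt;Two threads racing to release the last count will both pass the outer test, but only one of them will pass the inner one.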

&lt;pre style="FONT-STYLE: italic; FONT-FAMILY: monospace; font-size: small;"&gt;
boolean delete( int hash, int record) throws IOException
  {
    int bucket = getBucket( hash);
    IntegerStack possibles = findInBucket( bucket, hash);
    int deleted = 0;
    int possible;
    while (! possibles.isEmpty())
      {
        possible = possibles.pop();
        if (index.file.getInt( possible) == record)
          {
            index.file.putInt( possible, -1);
            int secondaryBucketLocation
              = (hash &gt;&gt; primaryShift) &amp; primaryMask;
            
            if (secondaryBucketCounts[secondaryBucketLocation] &gt;= 0)
              &lt;b&gt;synchronized(this)&lt;/b&gt;
                {  // only if sharing secondary bucket hash
                 if (secondaryBucketCounts[secondaryBucketLocation] &gt;= 0)
                  {
                      secondaryBucketCounts[secondaryBucketLocation]--;
                      secondaryBucketLocation
                        = primaryBuckets[secondaryBucketLocation];
                      
                      if (secondaryBucketLocation == -1)
                        // in the middle of a split
                          secondaryBucketLocation = afterSplitLocation;
                      
                      index.file.putInt( secondaryBucketLocation,
                        index.file.getInt( secondaryBucketLocation) - 1);
                    }
                }
            deleted++;
          }
      }
    return deleted &gt; 0;
  }
&lt;/pre&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;</description><link>http://marringtons.com/Adept/blog/Software.Development/2005/06/synchronized-dos-and-donts.html</link><author>Paul Marrington</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-9531034.post-111766939124928652</guid><pubDate>Wed, 01 Jun 2005 23:42:00 +0000</pubDate><atom:updated>2005-06-02T09:43:11.256+10:00</atom:updated><title>Cooperative Multi-Tasking - Yesteryear and Today</title><description>I've been developing for and with multi-tasking systems since what feels like the dawn of time, but in actuality was the late seventies. Those were the last days of total dominance by mainframe or mini-computer on the desktop. Even the smallest computer was too expensive to be dedicated to one user or program. IBM were doing great things with virtual machines - making it appear as if each user had a whole IBM mainframe at their personal disposal. So-called minicomputers used multi-tasking so that individual applications (including the operating system itself) could have a share of CPU and memory.
&lt;p&gt;
Then came the micro-computer and PC. The users loved the freedom from expensive shared computers, but the first ones lacked the grunt for true multi-tasking - although Digital Research gave it a go with MP/M and Concurrent CP/M. This was when Microsoft first won the race by providing DOS to IBM. No multi-tasking. I can remember people starting spreadsheet calculations that would take up to two days - during which time their $5,000 computer could only be used as a paperweight.
&lt;p&gt;
Microsoft 'to the rescue' again with Windows 3 (the first non-geek popular version of Windows). Windows had originally been developed for the 8086 with little or no hardware support and an ability to make very small yet powerful programs. Only when the 80386 chip was available did systems have the memory and hardware support for multiple programs.
&lt;p&gt;
Windows 3 and its even more popular networking offspring 3.11 were still built on the old 16 bit code. They inherited a multi-tasking method called co-operative multi-tasking. In this model a process or program truly does own the CPU until it calls an operating system service to release control and schedule the next task waiting for CPU. Those of us used to working with 'true' multi-tasking were fairly unimpressed with such a primitive system. Yet, it worked amazingly well. This is because most programs spend most of their time waiting for user input. As long as they followed the well documented standard of having a primary loop that released the CPU for others while waiting, they used very little CPU.
&lt;p&gt;
And here's the rub. With the release of Windows 95 and true multi-tasking we see more inconsistent responsiveness and more user delays than in the 'primitive' cooperative multi-tasking days. Even today with machines more than 10 times faster with almost 10 times as much memory, I can suffer noticeable delays waiting for a web page to render while doing a large Java compile.
&lt;p&gt;
Certainly one reason is that we expect a modern computer to do many things at once, but the other reason - and the one I want to talk about here - is the sense of responsibility. With co-operative multi-tasking, the developer was overtly responsible for end-user performance. Even when doing a complex calculation or compile, a good developer would ensure that the system yielded control often enough so that the user interface did not become sticky.
&lt;p&gt;
With a return to the Unix fold of time-sharing multi-tasking, the responsibility goes back to the operating system and the developer is taught to ignore it. The former is good since the operating system can do a much better job of it - and it simplifies the code. The latter is not so good. The operating system attempts to discern user interface code and give it priority. This is why the mouse does not usually stick no matter how hard the computer is working. It's much harder to differentiate user interface calculations within a program from less important batch operations. This is why my browser rendering is delayed by the Java compile.
&lt;p&gt;
The resolution in native code is relatively obvious. Before a processor intensive operation, reduce the priority of the process with an operating system service, and restore it to normal when returning to interactive mode. This way other user interface programs should get priority and be less likely to be delayed.
&lt;p&gt;
This doesn't work if your code is further removed from the operating system. Don't mistake light-weight Java threads for OS level multi-tasking. They are light-weight because they share the JVM's CPU time between them. Since the JVM is just a program to the operating system, it takes its normal slice of computer resources and then divides it up among its active threads.
&lt;p&gt;
The only way I can think of to release CPU resources to other programs during a long calculation or compile would be to sleep for one millisecond fairly regularly. I don't like it. Firstly it reeks of polling, and secondly your process will be unnecessarily delayed if no-one else needs the CPU.
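&lt;p&gt;
For what it's worth, the sleep-a-millisecond idea might be sketched as below. The class name, the toy calculation and the yield interval are all assumptions made for illustration, and the right interval would need tuning for a real workload.

```java
// Hypothetical sketch: PoliteWorker and yieldEvery are illustrative names.
class PoliteWorker {
    /** Sums 1..n, sleeping for one millisecond every yieldEvery iterations. */
    static long sum(int n, int yieldEvery) {
        long total = 0;
        for (int i = n; i > 0; i--) {
            total += i;
            if (i % yieldEvery == 0) {
                try {
                    Thread.sleep(1);   // hand the CPU to anyone else who wants it
                } catch (InterruptedException e) {
                    // Remember the interrupt for the caller and keep calculating.
                    Thread.currentThread().interrupt();
                }
            }
        }
        return total;
    }
}
```

&lt;p&gt;
Both objections above still apply: the sleep costs a millisecond every time, whether or not anyone else was waiting for the CPU.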
&lt;p&gt;
In the case of a compile or build there is another option. They are such large processes that they can spawn their own independent JVM. It should be possible to tell the operating system to run these JVMs at a lower priority.
&lt;p&gt;
In truth modern CPUs are so over-powered that no-one except extreme power users would even notice. The larger bottleneck is around sharing disk access time rather than CPU time. This is a discussion for another day.</description><link>http://marringtons.com/Adept/blog/Software.Development/2005/06/cooperative-multi-tasking-yesteryear.html</link><author>Paul Marrington</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-9531034.post-111620290910797043</guid><pubDate>Mon, 16 May 2005 00:20:00 +0000</pubDate><atom:updated>2005-05-16T10:21:49.113+10:00</atom:updated><title>Threads and Multitasking</title><description>Integrating thread support in the base Object class and in the language syntax wasn't what I'd call a great technological achievement. We've been doing the same for years with very similar procedures in C++. No, the real magic was in making multi-tasking available to everyone. Without it Java would not have been suitable for application servers and probably would never have survived.
&lt;p&gt;
While we are discussing historical imperatives, threading wouldn't have been as useful or popular without garbage collection. While creating a threaded system is still in the domain of the specialist, all programmers who develop for application servers are developing code to be run in a thread. Without garbage collection, minor memory leaks in a stand-alone application become much worse on an application with many threads (all quite possibly running the same code).
&lt;p&gt;
Did I mention that multi-tasking is still a rare thing only used by a few specialists in the field? This isn't only because it's harder than normal development, but also because there aren't many occasions where the additional expense of task switching is worth the benefits of appearing to do two things at once. The operative word here is 'appearing'. On a single processor desktop, only one thread can be doing anything at one time. Only on a multi-processor server can you hope that multiple threads will execute in parallel. Even then this isn't guaranteed as the CPU time must be shared between multiple programs, not just threads.
&lt;p&gt;
So, when is multi-tasking of value?
&lt;ul&gt;
&lt;li&gt;&lt;b&gt;Application Servers&lt;/b&gt;: where one server services multiple clients or browsers.
&lt;li&gt;&lt;b&gt;Communications&lt;/b&gt;: where messages need to be received when they arrive even if an earlier one is still processing.
&lt;li&gt;&lt;b&gt;Spawning&lt;/b&gt;: Running external programs and not having to wait until they complete.
&lt;li&gt;&lt;b&gt;Housekeeping&lt;/b&gt;: Just like at home, collections of data get messier over time. Faster response times can be had by quickly changing and throwing your discarded clothes on the floor. You can then come back and clean up when you have spare time (and I certainly hope my wife isn't reading this). This is why garbage collection with all its complex overheads can provide more responsive systems than C/C++ with malloc/free. Don't clean up memory as you finish with it, leave it and wait until you have free time. Application software can be the same. I have a cache with an ageing feature. It holds data like a HashMap, but if it is more than X minutes old it is discarded. It is, of course, checked on retrieval, but what of the case where the data is never retrieved again? I have a separate housecleaning thread that walks through the list when the system is not doing anything else, looking for out-of-date data and removing it. Such data is closed if it has a Closeable interface. This is very useful for caching connections since you don't want them held open indefinitely if they are not being used.
&lt;li&gt;&lt;b&gt;Prioritised Processing&lt;/b&gt;: is a close relative to housekeeping. My object database system has threads for deleted records, indexing and reorganisation. When you ask to delete a record it is flagged and the call returns. A background thread actually deletes the record and frees the space. The same goes for indexing. This can be expensive with multiple indexes, but why delay the calling program?
&lt;li&gt;&lt;b&gt;Scheduled Tasks&lt;/b&gt;: You don't really want everything to freeze waiting for a specific time or time interval, do you? A classic example is displaying a clock on a GUI application. Start a thread that waits 1 second then redisplays the clock before waiting again and you have a timepiece with a minimum of system impact.
&lt;/ul&gt;
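&lt;p&gt;
The clock from the scheduled tasks item could be sketched roughly as follows. Ticker and its methods are hypothetical names invented for this example, and the task passed in stands for whatever redraws the clock.

```java
// Hypothetical sketch: Ticker is an illustrative name, not a library class.
class Ticker {
    private final Thread thread;
    private volatile boolean running = true;

    Ticker(long intervalMillis, Runnable task) {
        thread = new Thread(() -> {
            while (running) {
                try {
                    Thread.sleep(intervalMillis);  // costs no CPU while waiting
                } catch (InterruptedException e) {
                    return;                        // stop() interrupts the sleep
                }
                if (running) {
                    task.run();                    // e.g. redisplay the clock
                }
            }
        });
        thread.setDaemon(true);  // never keep the JVM alive just for the clock
    }

    void start() {
        thread.start();
    }

    void stop() {
        running = false;
        thread.interrupt();
    }
}
```

&lt;p&gt;
A one second clock would then be something like new Ticker(1000, () -> System.out.println(new java.util.Date())).start(). The standard java.util.Timer class covers similar ground if you would rather not manage the thread yourself.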
&lt;p&gt;
The use of threads can be divided into two diametrically opposite groups. Application servers and communications are examples of making a single service appear committed while actually serving many masters. Spawning, housekeeping and prioritised processing are all examples of providing more responsiveness by deferring tasks to a less busy time.
&lt;p&gt;
The reason these two use groups both work well with Java lightweight threads is that they both involve more waiting than working. While one thread is waiting, others can be busy without appearing to slow anything down.
&lt;p&gt;There's that word again: On desktop systems, it appears that appearance is everything.</description><link>http://marringtons.com/Adept/blog/Software.Development/2005/05/threads-and-multitasking.html</link><author>Paul Marrington</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-9531034.post-111515763634661904</guid><pubDate>Tue, 03 May 2005 22:00:00 +0000</pubDate><atom:updated>2005-05-12T18:09:13.613+10:00</atom:updated><title>Mainframe to Stand-Alone to Client/Server to Web - The Full Circle</title><description>This is an observation that may seem so obvious that it does not need writing. Then again, I have not read it elsewhere, so here goes...
&lt;p&gt;
In the sixties and seventies the corporate computing environment was dominated by mainframes. These big monsters required regular feeding of funds and staff. As business began to rely on the information processing they provided, the emerging IT departments gained more and more power. Is this reading like a bad fantasy novel yet? 
&lt;p&gt;
Then came 1980 (It was hard not to notice, the hair was so bad). IBM had looked at the geeky micro-computer market and seen a cheaper and more flexible terminal to connect to their mainframes. Those of us using micro-computers at the time were not impressed. Their hybrid 8/16 bit 8088 was much slower than the second generation Z80s we were using - and way more expensive. 
&lt;p&gt;But we were stupid. The thing was, it wasn't about the technology - that can always be improved - it was about the culture. The guys with suits bought them by the dozen: they saw freedom from the control the IT department had been exerting. By the time Lotus had taken the VisiCalc idea and made it work with big sheets of data, the market was set. Rather than the intelligent mainframe workstation with a bit of word processing thrown in that IBM had envisaged, office workers were running their own programs to get the work done locally.

&lt;p&gt;
The two armies faced off, and the battle raged. Desktops sprouted databases (dBASE) and 4GL solutions. The mainframers fought back with client/server applications, a sort of compromise that would leave IT departments back in charge. Client/server was a failure. It was a lot more expensive to develop than mainframe-only and had severe reliability problems. While it had better looks it was as slow as we were used to for mainframe applications.
&lt;p&gt;
And then the popular front did themselves in. They got all excited about an information presentation technology called the Internet. This is not a criticism of the Internet, but where we saw information at our fingertips, the IT Department saw servers under their control sending information to terminals under their control (called browsers this time).
&lt;p&gt;
Almost every enterprise application in the last 7 years has been web-based - meaning application servers with nothing but a browser on your super-computer of a desktop.
&lt;p&gt;
Is it just me, or have we come full circle in the last 25 years? Sure, the browser has replaced the dumb terminal and the Sun Solaris server has replaced the IBM mainframe, but what else has really changed? It looks a lot prettier, but we are waiting just as long for a page to load now as we did then. We work on enterprise systems where 4 seconds between pressing a button and getting a display is acceptable. How would you like it if Word or Excel behaved this way?
&lt;p&gt;
Whatever happened to distributed processing?</description><link>http://marringtons.com/Adept/blog/Software.Development/2005/05/mainframe-to-stand-alone-to.html</link><author>Paul Marrington</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-9531034.post-111456072203056842</guid><pubDate>Wed, 27 Apr 2005 00:11:00 +0000</pubDate><atom:updated>2005-05-12T18:17:37.576+10:00</atom:updated><title>The Semaphore</title><description>Java makes multi-tasking easy with the &lt;i&gt;synchronized&lt;/i&gt; keyword. This can be applied to a method as part of the signature or used to wrap a code block by providing an instance to synchronise against. If a second thread attempts to enter a synchronised section, it's blocked until the first one completes.
&lt;p&gt;
This is all well and good, but Java prior to 1.5 did not provide semaphores. A semaphore is a flag that will cause a thread to wait until another thread tells it to continue. I use semaphores as much as, if not more often than, synchronised blocks.
&lt;p&gt;
I use semaphores in CGI so that the HTTP server thread can wait until the external program completes. The HTTP server itself has a &lt;i&gt;waitUntilClosed&lt;/i&gt; method that can be used to block the main thread until the server itself signals completion. Semaphores are also invaluable in the database code so that index generation, physical writes and housekeeping can be done in the background, yet concerned threads can be put on hold when necessary.
&lt;p&gt;
Before semaphores it was common, although always considered bad practice, to write a polling loop. This is a loop that just keeps checking for a desired result until it comes out positive. Polling loops are always a bad idea because they eat CPU time like it's going out of style, and do very little by comparison. 
&lt;p&gt;
All Java objects have &lt;i&gt;wait&lt;/i&gt; and &lt;i&gt;notify&lt;/i&gt; methods that provide all the functionality. In the &lt;a href="http://library.marringtons.com"&gt;Adept Library&lt;/a&gt; I have chosen to wrap these in a &lt;i&gt;Semaphore&lt;/i&gt; object for convenience - and, as always, to make the code easier to read. This object handles interrupts and allows for cases where &lt;i&gt;resume()&lt;/i&gt; can be called before &lt;i&gt;pause()&lt;/i&gt;.
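&lt;p&gt;
As a rough idea of what such a wrapper involves, here is a minimal sketch built on wait and notifyAll. It is not the Adept Library class: SimpleSemaphore is a made-up name, only pause() and resume() echo the names above, and the timed variant and the choice to pass interrupts up to the caller are assumptions of this example.

```java
// Hypothetical sketch of a wait/notify wrapper; not the Adept Library class.
class SimpleSemaphore {
    private boolean signalled = false;

    /** Blocks until resume() has been called, then clears the signal. */
    synchronized void pause() throws InterruptedException {
        while (!signalled) {   // loop guards against spurious wake-ups
            wait();
        }
        signalled = false;
    }

    /** As pause(), but gives up after the timeout; returns false if it timed out. */
    synchronized boolean pause(long timeoutMillis) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!signalled) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining > 0) {
                wait(remaining);
            } else {
                return false;
            }
        }
        signalled = false;
        return true;
    }

    /** Lets a waiting thread continue, or lets the next pause() return at once. */
    synchronized void resume() {
        signalled = true;
        notifyAll();
    }
}
```

&lt;p&gt;
Because the signal is remembered in a field, a resume() that arrives before the matching pause() is not lost.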
&lt;p&gt;
If a thread is core to the interactive components of an application, it's a good idea for it to not block forever. Perhaps the thread that was to resume the process has died or become lost - in multi-threading, anything can happen. Of course this generates code similar to a polling loop, but if the wait period is long enough there will not be a serious performance cost.
&lt;p&gt;
In summary, if you need to wait for something to happen, use semaphores rather than polling. For a pack of good examples, head over and download the &lt;a href="http://library.marringtons.com/marringtons.com.library.zip"&gt;Adept Library&lt;/a&gt;.</description><link>http://marringtons.com/Adept/blog/Software.Development/2005/04/semaphore.html</link><author>Paul Marrington</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-9531034.post-111379892713299451</guid><pubDate>Mon, 18 Apr 2005 04:34:00 +0000</pubDate><atom:updated>2005-04-18T14:35:27.136+10:00</atom:updated><title>Exceptions and Tiers</title><description>I recently worked on a project where each tier had its own exception class - messy. To make things worse, each call between tiers was wrapped in a try-catch to translate it into the exception class of the calling tier. This made the code bloated and harder to read.
&lt;p&gt;
The only valid reason to catch and translate an exception is when you have additional important information to add. For example: converting a string to an integer can throw a library exception saying what happened, but only the calling function has enough information to tell which field failed. Of course if it's a programming error, the first exception is adequate as the stack trace will tell where it comes from. If, however, the data is from an untrusted source (user, XML stream, etc) - the user will require the field name and source. In this case the calling method will trap the conversion exception and add the additional information.
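&lt;p&gt;
The string-to-integer case might look something like the sketch below. ValidationException and FieldParser are hypothetical names invented for the example, and making the exception unchecked is an assumption rather than a recommendation.

```java
// Hypothetical sketch: ValidationException and FieldParser are illustrative names.
class ValidationException extends RuntimeException {
    ValidationException(String message, Throwable cause) {
        super(message, cause);
    }
}

class FieldParser {
    /** Converts untrusted text to an int, naming the field if it fails. */
    static int parseInt(String fieldName, String text) {
        try {
            return Integer.parseInt(text.trim());
        } catch (NumberFormatException e) {
            // The library knows what went wrong; only the caller knows where.
            throw new ValidationException(
                "Field '" + fieldName + "' must be a whole number, but was '"
                    + text + "'", e);
        }
    }
}
```

&lt;p&gt;
The original exception is kept as the cause, so nothing the stack trace could tell a developer is thrown away.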
&lt;p&gt;
Exceptions can be divided into a limited set of groups.
&lt;ol&gt;
&lt;li&gt;&lt;b&gt;Development Errors&lt;/b&gt; (aka bugs): Occur when the code fails because of faulty coding (or possibly design). It is hoped that all of these will be eliminated before a system goes to production. Realistically, a few will always hang around. In Java they are most commonly of type NullPointerException. Where they are explicitly created they should be unchecked exceptions since they need not be dealt with in the normal course of events. It's not uncommon for developers to throw these all the way back to the GUI as unsightly error pages. This isn't a good solution as the user loses their place and gets a very negative impression about the stability of the application. Equally unsuitably, some commercial applications just silently swallow these sorts of error - at best just logging them. While this is less scary for the user, it can often mean that they will not get the results they expect - and that the problem will never get dealt with. I've found that the best way of reporting them is to treat them as special validation messages. The user will see the form with a "sorry, we seem to have a problem" message in one of the validation message fields. At least this way they can try for a work-around.
&lt;li&gt;&lt;b&gt;Validation Errors&lt;/b&gt;: At the other end of the spectrum are errors caused by the interface user. These can be as simple as a number out of range or as complex as not being able to kill the dragon because you forgot to pick up the fire wand 5 levels higher in the dungeon. From a development viewpoint the application should not break and the user should be informed of their fault in a clear, accurate and polite way. (Alas! We told you not to sell the 'Wand of Dragon Slaying'. But did you listen to us? No! The name of this game is Dragon Slayer, for Thor's sake.)
&lt;li&gt;&lt;b&gt;Internal Exceptions&lt;/b&gt;: Attempting to turn a string into a number with the Java Integer class can cause an exception if the string contains non-digits. You have 2 choices; either pre-parse the string or catch the exception and translate it into a validation. Most libraries will throw exceptions inappropriate for your application. Fortunately they are normally checked, so the compiler will nag you. Catch them as early as possible and translate them into either development error or validation exceptions - or deal with them if by some miracle that is possible. Do this in implementation layers and don't sully business logic with such confusing code.
&lt;/ol&gt;</description><link>http://marringtons.com/Adept/blog/Software.Development/2005/04/exceptions-and-tiers.html</link><author>Paul Marrington</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-9531034.post-111318593706170109</guid><pubDate>Mon, 11 Apr 2005 02:18:00 +0000</pubDate><atom:updated>2005-04-11T12:18:57.066+10:00</atom:updated><title>Transferring Data Between Tiers</title><description>Tiers and slices are simply mechanisms for visualising the age-old principle of divide and conquer. For the ultimate in maintainability and reuse, it is best to keep the divisions as clear as possible. 
&lt;p&gt;
A software object emulates objects in the real world by being a combination of information and functionality. To keep code regions separate and independent, however, it's best if they don't know anything about the internal functionality of their neighbours. You may ask an object to do a job or provide a result, but separation means you should not know about or rely on the methods the object uses to achieve that. You should know that you can use your calculator to add up, but you shouldn't be weighing up the variable resistance methods within the microchip that are used to represent binary data. The key to tiered design is to be creatively stupid.
&lt;p&gt;
So, when calling a service in another tier we should provide and receive pure information without attached functionality. When the said information is larger than a single primitive it's often called a data transfer object or DTO.
&lt;p&gt;
There's another good reason for a DTO. If your application is J2EE EJB or otherwise designed so that it can be distributed, information will be passed by value rather than reference. 
&lt;p&gt;
For the separation to occur it is important that the DTO be clean. It should contain mostly primitives, other DTOs or well known classes that do not cause too much interdependence. Where possible, the latter should be immutable. There's nothing worse than receiving a DTO by RMI from a remote server that includes the whole session or security structure as a field. Not only is it massive to transfer over the wire, but you cannot rebuild it locally unless almost all the remote code also exists locally in an up-to-date form. On the other hand a Map of String is quite acceptable since both sides will be using a common library.
&lt;p&gt;
I prefer DTOs to be very specific. I intensely dislike DTOs floating around with partially filled fields depending on what was asked for. It's also bad to have DTOs with information extraneous to requirements. Not only is it confusing when maintenance is required, but it also means you are retrieving information that is not required - often at considerable expense in resources. This can happen if you attempt to pass a DTO through more than one tier. 
&lt;p&gt;
The persistence tier, for example, will probably have DTOs that match the database tables. There's a temptation when providing a service at the service layer to create a DTO that has persistence tier DTOs as fields. Resist. Firstly you are exposing too much of your database structure, secondly your service is presenting a complex graph to its client in a form not logical for that view and thirdly it's not uncommon for data to require different formatting from different viewpoints. In an extreme case, the persistence layer may be using java.sql.Date while the service layer uses Calendar and the GUI tiers deal with a string including a formatted date.
&lt;p&gt;
The biggest valid complaint against using DTOs in a clean compartmentalised manner is the need to be continually copying the contents at each tier interface - in both directions. I have seen tier interface services that are just masses of copy statements. Updating a DTO without changing all the copy code is a common source of subtle bugs.
&lt;p&gt;
The Adept library object package has a lot of support for DTOs. Specifically there is a DTO helper class with static methods for transferring data between DTOs and POJOs, both in bulk and given a list of required fields. It is a deep copy operation. There are also classes to convert DTOs to/from XML streams for data transfer and to/from name-value pairs for screen or form population and retrieval.
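&lt;p&gt;
To give a flavour of copying by a list of required fields, here is a small reflection-based sketch. It is not the Adept helper: DtoCopier, CustomerRow and CustomerDto are hypothetical names, the copy is only shallow, and it assumes plain public fields of matching name and type on both sides.

```java
// Hypothetical sketch: not the Adept Library helper, and only a shallow copy.
class DtoCopier {
    /** Copies the named public fields from one object to another. */
    static void copy(Object from, Object to, String... fieldNames) {
        for (String name : fieldNames) {
            try {
                java.lang.reflect.Field source = from.getClass().getField(name);
                java.lang.reflect.Field target = to.getClass().getField(name);
                target.set(to, source.get(from));
            } catch (ReflectiveOperationException e) {
                // A missing or inaccessible field is a development error.
                throw new IllegalStateException("Cannot copy field " + name, e);
            }
        }
    }
}

// Persistence-side and service-side views of the same customer (illustrative).
class CustomerRow {
    public int id;
    public String name;
    public String passwordHash;
}

class CustomerDto {
    public int id;
    public String name;
}
```

&lt;p&gt;
A call such as DtoCopier.copy(row, dto, "id", "name") then replaces a block of hand-written assignments, and passwordHash never leaves the persistence tier.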
&lt;p&gt;
Using deep copy methods such as these allows you to remove the interface layers on each side of each tier, since the copy can become a single-line part of the logic layer of the tier.</description><link>http://marringtons.com/Adept/blog/Software.Development/2005/04/transferring-data-between-tiers.html</link><author>Paul Marrington</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-9531034.post-111267376593265357</guid><pubDate>Tue, 05 Apr 2005 04:02:00 +0000</pubDate><atom:updated>2005-04-05T14:02:45.936+10:00</atom:updated><title>Tiers for Logical Application Separation</title><description>This is the first of three articles on tiers in software development. The next two will focus on data transfer and exception processing.
&lt;p&gt;
Current thinking in software architecture is that applications should be designed and implemented in clear tiers. Think of a chocolate layer cake. The icing is the GUI tier - the one you see. Each layer below is a tier providing unique functionality.

&lt;h4&gt;Examples&lt;/h4&gt;

A standard PHP web application is a single tier design. The same code that accesses the database also displays the pages. From another perspective it is a 3-tier design with the browser providing the GUI tier and the database engine the persistence tier. Still, the developer can only change the single central tier.
&lt;p&gt;
A client/server system is a clear 2-tier design. There is a clear division between the 2 parts of the application - to the extent that they are usually running on different systems.
&lt;p&gt;
An N-tier system is harder to see from the outside, as it is a development model rather than a physical separation.

&lt;h4&gt;An N-Tier Pattern&lt;/h4&gt;
I divide an application into tiers and tiers into layers. Really there is no difference except for logical grouping.
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;GUI Tier&lt;/strong&gt;
  &lt;ol&gt;
&lt;li&gt;&lt;b&gt;GUI display layer&lt;/b&gt; - responsible for display and retrieval of data only. It expects information ready to display and passes information back as it comes from the user. In a web application it is code and information passed to the browser for rendering. Because it is the only part of the application that will be operating locally to the user, it may include some code for validation of input and manipulation of display for output.
&lt;li&gt;&lt;b&gt;GUI support layer&lt;/b&gt; - Manipulates information as the program sees it to produce information that the browser renders and the user sees. In traditional web server applications the JSP or ASP is the prime GUI support layer.
&lt;li&gt;&lt;b&gt;GUI transfer layer&lt;/b&gt; - is the primary interface with the next tier. Information to be displayed is consolidated here and formatted into the form to be placed on the screen. A date object, for example, will be turned into a string to be displayed. The same goes for the reverse direction. When the user enters information, this is the second level of validation as strings are translated to internal form.
&lt;/ol&gt;
&lt;li&gt;&lt;strong&gt;Business Logic Tier&lt;/strong&gt;
  &lt;ol&gt;
&lt;li&gt;&lt;b&gt;Service Layer&lt;/b&gt; - The latest buzz-word is SOA (Service Oriented Architecture). This is the layer that supplies the exported services, and can talk with the GUI tier above, a fat client or a Tuxedo interface for a remote request from another server. The service layer should not be involved in the infrastructure required to deliver the service. Its prime responsibility is to respond to requests, receiving and sending messages as simple data structures without code or other ties to the underlying system (DTO). It will need to validate parameters and apply security as required.
&lt;li&gt;&lt;b&gt;Definition Layer&lt;/b&gt; - Here the definitions for business logic - as defined in the design documents - have been translated to code. Code here should be simple, clean and clear - able to be compared one-to-one with the design documents. Don't confuse the definitions with implementation, and don't validate parameters, catch and process exceptions or any other implementation code that can 'muddy the waters' when reviewing business process. Each method should be a clear list of actions with branches and loops.
&lt;li&gt;&lt;b&gt;Implementation Layer&lt;/b&gt; - This is where the business work is done. Each of the actions used in the matching Definitions layer will be implemented with all the nasties of exception processing, data retrieval and consolidation. While the code in this layer will be dirty with detail, if the business instructions in the definition layer are finely grained, then methods should not be too large or hard to follow. Any refactoring to make use of common code should be done at this layer rather than the Definitions layer, for the sake of clarity on the higher tiers.
&lt;/ol&gt;
 &lt;li&gt;&lt;strong&gt;Persistence Tier (Database, mail, IPC and such)
&lt;/strong&gt;
  &lt;ol&gt;
&lt;li&gt;&lt;b&gt;Interface Layer&lt;/b&gt; - This is effectively an internal service layer. For a database package it would encompass the domain model representation, providing enough information in a single method to satisfy service layer requests without loading too much additional information from the tables. The service layer should not know about persistence internals, so this layer provides a level of separation.
&lt;li&gt;&lt;b&gt;Implementation Layer&lt;/b&gt; - The implementation layer is more finely grained than the interface layer. It's typically one-to-one with the database tables or other interface services. It's often provided by external packages (e.g. Hibernate or javax.mail), although it can also include system-local interfaces to external packages.
&lt;li&gt;&lt;b&gt;Helper Layer&lt;/b&gt; - In the persistence tier above all others, the various tables and interfaces referred to in the implementation layer will require common code for processing. In a full OO design these would be part of the super class. Because tables and interfaces often use external packages that cannot always be subclassed, common support code will need to be in separate objects. Put these in a separate Helper layer for clarity.
&lt;/ol&gt;
&lt;/ol&gt;</description><link>http://marringtons.com/Adept/blog/Software.Development/2005/04/tiers-for-logical-application.html</link><author>Paul Marrington</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-9531034.post-111153046833686226</guid><pubDate>Tue, 22 Mar 2005 22:26:00 +0000</pubDate><atom:updated>2005-03-23T08:27:48.340+10:00</atom:updated><title>Data Lifetimes</title><description>Our first experience with data lifetime is usually in a beginner's handbook, supported by a particular language of choice for the text. Out of necessity these books tend to focus on a single use, single threaded application - in a nutshell:
&lt;ol&gt;
    &lt;li&gt;&lt;b&gt;Code Block Local&lt;/b&gt;: C originally allowed local declarations only at the top of a block. C++ allows data to be defined anywhere in the function - but typically allocates it when the function is called. Unlike other forms of data, block local doesn't have a default value, so it must be set before it is accessed. This is because in the "old days" such data was kept on the processor return stack - it was quick to allocate and free, and most processors had specific instructions to read/write such data quickly. Said data goes out of scope when the method/function returns.     &lt;/li&gt;
    &lt;li&gt;&lt;b&gt;Instance Local&lt;/b&gt;: Fields that aren't static are created when an instance of a class/struct is created. They exist until the enclosing structure is freed or garbage collected.     &lt;/li&gt;
    &lt;li&gt;&lt;b&gt;File Local&lt;/b&gt;: Data or classes that aren't inside another structure are file local. In Java, file local is seldom used - and even then can only be a non-public (package-private) class, since a top-level class cannot be declared private. A class is considered data here since any static fields are instantiated when the class file is loaded. The data lives for the life of the program unless explicitly cleared.     &lt;/li&gt;
    &lt;li&gt;&lt;b&gt;Class Local&lt;/b&gt;: Fields set to static are only instantiated once when the class is loaded. This is the way to produce singletons - very useful for caches and constant data. The data lives for the life of the program unless explicitly cleared.     &lt;/li&gt;
&lt;/ol&gt;
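&lt;p&gt;A minimal Java sketch of these lifetimes follows. The class names &lt;i&gt;Visit&lt;/i&gt; and &lt;i&gt;VisitLog&lt;/i&gt; are made up for illustration, and both classes are assumed to sit in the same source file.&lt;/p&gt;

```java
// File local: a non-public top-level class, visible only inside its package.
// (Hypothetical name, used only for this illustration.)
class VisitLog {
    // Class local: a single copy, created when the class is loaded.
    static int totalVisits = 0;
}

// Hypothetical class showing instance and block local data.
class Visit {
    // Instance local: one copy per Visit, lives as long as the object does.
    private final int number;

    Visit() {
        VisitLog.totalVisits++;
        number = VisitLog.totalVisits;
    }

    int number() {
        return number;
    }

    String label() {
        // Block local: no default value, so it is assigned before use,
        // and it goes out of scope when the method returns.
        String prefix = "visit-";
        return prefix + number;
    }
}
```

&lt;p&gt;Each new &lt;i&gt;Visit&lt;/i&gt; gets its own number, while the static counter is shared by every instance for the life of the program.&lt;/p&gt;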
&lt;p&gt;Having said that, this article is really about data lifetimes in more complex environments. Any type of services architecture - including web services - requires that information be kept for conversations and sessions.&lt;/p&gt;
&lt;p&gt;Every interaction with the program can be considered a conversation. This could be a call from client to server, or a post or get command from a web browser. The conversation is complete once the program has responded. For the browser this is when the next page is displayed. A session lasts for as long as a single client interacts with the system - and typically encompasses many varied conversations.&lt;/p&gt;
&lt;p&gt;So think of your program as a chocolate layer cake (the flavour isn't significant - I just happen to like chocolate - so there!). The cake layers are the application tiers - the top layer being GUI (with plenty of icing, I hope), above business logic, above persistence. The cream between the layers represents the interfaces. Objects are just parts of the cake contained completely within one layer - for the sake of the analogy imagine them as almonds. A conversation is like sticking a skewer through the cake to see if it's cooked. If any object wants to tell the user of the skewer it is uncooked it will leave raw cake sticking to the skewer. The session is the person holding the skewer. The session/person notes how many objects respond to the conversation by leaving cake on the skewer. They note the difference between the results of conversations to decide how long the cake has left to bake.&lt;/p&gt;

&lt;h4&gt;Conversation Data&lt;/h4&gt;
&lt;p&gt;A conversation starts with a client making a request on the program/server. Because this is usually a sequential operation, conversation specific data is rarely required, although I personally find an excellent use of it for messages. As each object gets involved in the conversation, there is the opportunity for a problem to arise. Some, such as validation and informational messages, should be displayed to the user at the end of the conversation. I actually use the same system for exceptions and errors. I believe it's better to say to the user "Oops, you've encountered a problem" rather than display an unsightly and uninformative stack dump.&lt;/p&gt;
&lt;p&gt;Traditionally, conversation data is passed to the service methods as a parameter and passed down through the tiers as necessary. This sours the cake because it reduces the independence of said modules and is of no use for methods that don't have the parameter passed.&lt;/p&gt;
&lt;p&gt;A conversation is almost always a sequential set of steps from initiation to reply. By definition this means it will work in a single thread and that thread will do nothing else but service said conversation to completion. So, data keyed on the thread that is cleared at the start of every conversation will work as conversation data. The &lt;a href="http://marringtons.com/Java/Library/"&gt;Adept Java Library&lt;/a&gt; has a class &lt;i&gt;com.marringtons.util.ThreadData&lt;/i&gt; that provides the code necessary to retrieve and update conversation specific data.&lt;/p&gt;

&lt;h4&gt;Session Data&lt;/h4&gt;
&lt;p&gt;A session starts when a client first accesses (anonymous) or logs in to the server. Most web application servers such as Tomcat maintain a session with the client browser. The class/method called to process the request from the browser has easy access to the session, but problems arise when code in the lower tiers needs to keep session related information. Such code often needs access to authorisation or environmental information and sometimes needs historical access to its own usage.&lt;/p&gt;
&lt;p&gt;In both cases a class in a lower tier needs access to a session dictionary (map). Session and environment data need well-known keys, while local 'memory' can use the class name as a key. Examples? A service may use the client's name to customise a message or get user specific data like a history of transactions. Local data for a class that is to live for the life of a session can be used for caching information that does not change regularly for the user. You might, for example, wish to generate a menu tree specific to the current user. By caching this as a private session variable, it only needs to be created once per login. Another equally valid question is "How?". This depends on your application container.&lt;/p&gt;
&lt;p&gt;It can be achieved without any infrastructure by passing a reference to the session data down through the tiers as a parameter to the method calls. This always works - and I have worked on both client/server and J2EE projects where it was used - but I have always thought this method a bit tacky. It always seems that when you need it the most it's for methods that do not have access to it. My preferred method is to provide infrastructure.&lt;/p&gt;
&lt;p&gt;As long as you have access to some connection to the original request you can gain access to session data from a dictionary of dictionaries.&lt;/p&gt;
&lt;p&gt;For Tomcat or similar 'simple' servers, create a dictionary using the thread ID pointing to the session information when a conversation starts. Use an ageing cache since HTTP is a connectionless protocol. Remove session data after a timeout.&lt;/p&gt;
&lt;p&gt;For J2EE servers life can become even more complicated. If you can be sure that all your EJBs are to be run in a single container then a method similar to above will work. If not, then we have a problem. Don't revert to passing all session data as a parameter: RMI calls between servers are expensive and passing additional unused data is inefficient.&lt;/p&gt;
&lt;p&gt;One solution is to have a consolidation layer between your GUI support layer and the EJBs. All EJB calls are fine-grained and are only passed the information they need to do a specific task. The other way is to use stateful session beans. The container manages a connection to a single client, allowing you to keep data between calls. While stateful session beans are not as evil as many developers make out, turning your whole application into a box of stateful session beans may not be the best way to go. Besides, they can only hold session private data for the bean and not common session data. So for my contribution, the &lt;a href="http://marringtons.com/Java/Library/"&gt;Adept Java Library&lt;/a&gt; has a class &lt;i&gt;com.marringtons.util.SessionData&lt;/i&gt; that provides the code necessary to retrieve and update session specific data for single-cpu servers.&lt;/p&gt;</description><link>http://marringtons.com/Adept/blog/Software.Development/2005/03/data-lifetimes.html</link><author>Paul Marrington</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-9531034.post-111101030345944679</guid><pubDate>Wed, 16 Mar 2005 21:56:00 +0000</pubDate><atom:updated>2005-03-17T07:58:23.470+10:00</atom:updated><title>Code Generalisation - The Do's and Don'ts</title><description>&lt;p&gt;Ever since we started developing software there have been calls for code reusability. First there was (and still is) the library. In the 80's we talked about the 'black box', meaning component objects where only the interfaces were published. Later COM extended this principle. Then came object-oriented design and we talked objects. Now we have beans, ActiveX components, EJBs, applets, scriptlets and a myriad of ways to provide code for reuse.&lt;/p&gt;

&lt;p&gt;Even when technologies work together, their view of generalisation is different. For example, an EJB uses objects. Conventionally, objects can have instance and common (static) data. Objects used by EJBs, however can have separate 'common' data - uncommon data. I digress. This article is about when to write specific code and when to generalise.&lt;/p&gt;
&lt;p&gt; Why generalise code? There are two valid reasons:&lt;/p&gt;
  &lt;ol&gt;
    &lt;li&gt;Code reuse.
    &lt;/li&gt;
    &lt;li&gt;Clarity
    &lt;/li&gt;
  &lt;/ol&gt;

&lt;h4&gt;Generalisation for Code Clarity&lt;/h4&gt;
Let's take clarity first because it is easiest. Clarity is paramount. Self documenting code is far easier to maintain than a long stream of unrelated groups of statements.

&lt;pre style="FONT-STYLE: italic; FONT-FAMILY: monospace; font-size: small;"&gt;
public Account getAccount()
  {
    User user = getUser();
    Account account = readAccount( user);
    updateTransactions( account);
    return account;
  }

private Account readAccount( User user)
  {
    // connect to account system, retrieve and translate account details
    // ... lots of technical code ...
  }

private void updateTransactions( Account account)
  {
    // Retrieve recent transactions and update the account details accordingly.
    // ... lots of technical code ...
  }
&lt;/pre&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;

&lt;p&gt;The public getAccount() method clearly tells us functionally what is involved in retrieving account details, and the method names clearly match the business requirement. The private methods readAccount() and updateTransactions() are never used elsewhere, but remove implementation details from the functional code. It makes sense to separate functional code from implementation code by placing them in separate objects, quite possibly in different application tiers.&lt;/p&gt;

&lt;p&gt;In short, code separation for clarity is the main use for generalisation techniques and should be practiced constantly.&lt;/p&gt;

&lt;h4&gt;Generalisation for Code Reuse&lt;/h4&gt;

Everyone leaves university with the belief that every line of code they write is sacred and will be used over and over again (Or was I really that pig-headed?). Unfortunately no-one is taught how the rest of the world will know to use these new pearls of the developer's art. In fact, there is a weighty set of benefits to writing code specific to the task:

&lt;ol&gt;
    &lt;li&gt;It's clearer because the internals are not generalised (accountKey instead of key).&lt;/li&gt;
    &lt;li&gt;It's more concise, because the best generalised code must take into account conditions that in a specific instance would not occur. Why check for a null parameter in a specific method when the one caller cannot - under any conditions - pass a null? For a general method, one must cover for outcomes not obvious for any one caller.&lt;/li&gt;
    &lt;li&gt;For the same reason, it's faster to write - since we can design the internals to match the known user, we don't have to wrap our heads around all the possible uses that our new code could be put to.&lt;/li&gt;
    &lt;li&gt;It's easier to maintain because there is no fear of changing code that will cause other callers to behave differently. How often have we seen code that uses quirks of a known interface rather than just its published uses? How often does this happen by accident?&lt;/li&gt;
    &lt;li&gt;It's easier on system testing since changes to more generalised code are more likely to require broad regression testing.&lt;/li&gt;
&lt;/ol&gt;

For the sake of impartiality, here's the argument for code reuse:
&lt;ol&gt;
    &lt;li&gt;Changes are made in one place - and affect all callers.&lt;/li&gt;
    &lt;li&gt;Smaller code base.&lt;/li&gt;
    &lt;li&gt;Behaviour is consistent across callers.&lt;/li&gt;
&lt;/ol&gt;

Hmm, do we see a trend here? Personally I follow this checklist:

&lt;ol&gt;
    &lt;li&gt;If I do not know of another use for the code I will write it in a way totally specific to the requirement.&lt;/li&gt;
    &lt;li&gt;If I suspect that other parts of the application are likely to use the same or similar code I will take care that the code involved is fairly separate. I will also take care that this does not take extra time. There will be no general interface or other non-specialised code.&lt;/li&gt;
    &lt;li&gt;When a second caller requires nearly or completely identical code I will review the common code and refactor it as required. It should go no higher up the object tree than the common need.&lt;/li&gt;
    &lt;li&gt;If I identify the need for a low level common object I will be tempted to take the time to create it. I do not, however, add more general interface above what I need. Why account for float and double parameters when you only ever use the int ones? Only when the additional functionality is needed will I update the library class.&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;Pitfalls of Early Generalisation&lt;/h4&gt;

&lt;ol&gt;
    &lt;li&gt;You'll spend excessive time adding tests and interfaces that will not be used in case they are needed later.&lt;/li&gt;
    &lt;li&gt;You'll end up with code that has an excessive number of if() statements or similar branches to cater for different clients.&lt;/li&gt;
    &lt;li&gt;You'll have obscure object inheritances making it difficult to find who is doing what.&lt;/li&gt;
&lt;/ol&gt;

Do you want to see a beauty?

&lt;pre style="FONT-STYLE: italic; FONT-FAMILY: monospace; font-size: small;"&gt;
public static boolean isSet(Object o) {
    if (o == null) {
        return false;
    } else if (o instanceof Boolean) {
        return isBooleanSet((Boolean) o);
    } else if (o instanceof String) {
        return isStringSet((String) o);
    } else if (o instanceof Long) {
        return isLongSet((Long) o);
 ...
&lt;/pre&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;

This one is possibly useful if the calling code did not know the type of object, but in all cases in the project that uses this method they do!
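&lt;p&gt;Since every caller in that project knew the type it was passing, a type-specific alternative might look like the sketch below. The class name &lt;i&gt;Flags&lt;/i&gt; and the meaning of 'set' for each type are assumptions for illustration only; the original helper methods are not shown in the excerpt above.&lt;/p&gt;

```java
// Hypothetical replacement for the instanceof chain: one overload per type
// that callers actually pass. What counts as "set" is an assumption here.
final class Flags {
    private Flags() {}

    // Assumed rule: a Boolean is "set" when it is non-null and true.
    static boolean isSet(Boolean b) {
        return Boolean.TRUE.equals(b);
    }

    // Assumed rule: a String is "set" when it is non-null and not blank.
    static boolean isSet(String s) {
        if (s == null) {
            return false;
        }
        return !s.trim().isEmpty();
    }

    // Assumed rule: a Long is "set" when it is non-null and non-zero.
    static boolean isSet(Long n) {
        if (n == null) {
            return false;
        }
        return n.longValue() != 0L;
    }
}
```

&lt;p&gt;The compiler picks the overload from the declared type of the argument, so no run-time type test is needed and each rule can be read on its own.&lt;/p&gt;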

&lt;h4&gt;Code Generalisation Methods&lt;/h4&gt;

The simplest and most common is at the method level internal to an object. As we are creating the class we see use for code elsewhere and refactor it into a private method so that both can call it. This usually also makes it easier to read the calling method.
&lt;p&gt;
Subclassing can be used to place generalised code in the parent class to be used by children when needed. While the code is not as visible as when it is in the working class, it is clearly associated with the object hierarchy. The same method can be used to separate functional from implementation code, with the restriction that Java only allows single inheritance.
&lt;p&gt;
Helpers are separate objects or static class methods in a separate class that provide common code. A modern code library is a collection of helpers. Care must be taken with code helpers to ensure that all developers know of their existence. Because they are not physically connected to a class (as in inheritance) they can often be lost, leading to inconsistencies and code duplication.
&lt;p&gt;
A bean is an independent item with a clear interface that can be used to ask it questions or have it perform actions. A bean is in truth the implementation of the software black box.

&lt;h4&gt;How to Find General Code - The Unanswered Question&lt;/h4&gt;

Code generalisation is a wonderful thing. It attracts designers and developers like moths to a flame. But, to carry on with the metaphors - there is a fly in the ointment. No-one has found an even marginally successful method of documenting common code in a way that potential users know that it exists. Sure, we all familiarise ourselves with the core libraries of the packages we use (do we?). We'll also look for libraries that fill our needs. The problem arises internal to a project. Most developers will develop a component for a complex system by looking for and finding a similar component and duplicating its functionality. Common code may be pushed up the inheritance tree or refactored into helpers, but unless the team is small and tightly knit or the communications are very good, only a small percentage of the developers will make use of the new tools provided. Enforcing clear Javadoc helps - if it is read. What other techniques are useful?</description><link>http://marringtons.com/Adept/blog/Software.Development/2005/03/code-generalisation-dos-and-donts.html</link><author>Paul Marrington</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-9531034.post-111015869852529368</guid><pubDate>Mon, 07 Mar 2005 01:24:00 +0000</pubDate><atom:updated>2005-03-07T11:24:58.593+10:00</atom:updated><title>Unit Testing - Good for All</title><description>I am an avid unit test supporter. For my own projects I write unit
tests at all levels for all classes; I don't consider a class complete
without a matching unit test. But I get the bigger picture: because
it's my code I see it from the perspective of developer, architect,
designer, tester and stake holder.
In the corporate world things are a little different. Unit testing is
starting to see wide acceptance, but at best as a necessary evil.
&lt;ul&gt;
  &lt;li&gt;&lt;b&gt;The Developer&lt;/b&gt; sees it as a waste of development time. A
working reproducible unit test can easily double the testing time.
  &lt;/li&gt;
  &lt;li&gt;&lt;b&gt;The Architect&lt;/b&gt; ignores them as not his problem.
  &lt;/li&gt;
  &lt;li&gt;&lt;b&gt;The Development Manager&lt;/b&gt; has to continually balance
schedules and decide whether there is time to write 'correct' tests.
  &lt;/li&gt;
  &lt;li&gt;&lt;b&gt;The Designers&lt;/b&gt; don't want to know about them.
  &lt;/li&gt;
  &lt;li&gt;&lt;b&gt;The Project Manager&lt;/b&gt; does not want to have to justify the
push-out to the schedule they cause.
  &lt;/li&gt;
  &lt;li&gt;&lt;b&gt;The Test Manager&lt;/b&gt; is only interested in how many tests
there are and that they have all passed, and is generally only
interested in adding to that tally.&lt;br&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;b&gt;The Stake Holders&lt;/b&gt; see no value in them and resent the
potential effects to time and budget.
  &lt;/li&gt;
&lt;/ul&gt;
For unit testing to reach its full potential, each and every group
needs to see the value to themselves and the project as a whole.
&lt;ul&gt;
  &lt;li&gt;&lt;b&gt;The Developer&lt;/b&gt; has the most involvement in creating and
maintaining unit tests. Surprisingly, they often feel that it gives them
the least gain. This could not be further from the truth. Changing
their development style to encompass unit testing provides important
gains:
    &lt;ol&gt;
      &lt;li&gt;
        &lt;b&gt;Perspective&lt;/b&gt;:
The developer gets to
exercise all the functionality of any given class in the way that
they see its clients using it. By writing a unit test for each class
before it is ever called from elsewhere, you get the all important
second perspective on the code design phase &amp;ndash; often revealing
aspects that could be changed or improved.
      &lt;/li&gt;
   &lt;li&gt;&lt;b&gt;Javadoc:&lt;/b&gt; Unit test code works well for examples of use in the Javadoc.&lt;/li&gt;
      &lt;li&gt;&lt;b&gt;Self Documentation:&lt;/b&gt; Developers commonly use classes
and packages by example. The best and least likely to be abused
examples will be in the unit tests attached to the class. If you need
to use the target class in a different way, update the unit test first
- both to test the usage and to provide a valid example for the future.
      &lt;/li&gt;
      &lt;li&gt;&lt;b&gt;Level of Confidence:&lt;/b&gt; The developer can release the
class for use knowing it meets a clear group of tests. Since these
tests run regularly, they can also be confident that changes to the
code or dependencies do not affect the expected uses for the class.
      &lt;/li&gt;
      &lt;li&gt;&lt;b&gt;Issue Resolution:&lt;/b&gt; When problems are reported in
testing, reproducing them can be quite convoluted. By updating the unit
test to reproduce the error, it is easier to debug and prove fixed.
      &lt;/li&gt;
      &lt;li&gt;&lt;b&gt;No Error Deja Vu:&lt;/b&gt; In projects without unit tests it is
common to have a problem fixed then come back on the next release. If
you have added unit tests to isolate problems before resolving them you
can be confident they will not return without being noticed until too
late.
      &lt;/li&gt;
    &lt;/ol&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;b&gt;The Architect&lt;/b&gt; should consider unit tests as part of the
overall technical design. Current practice is to have the developer
totally in control of unit tests. This works fine for smaller systems
and stand-alone classes. It is less than practical in real-world
corporate projects where the information required to run a particular
test is much larger than the test itself. To test a close account
service, the unit test needs to open the account and set conditions to
both stop closure (outstanding invoices) and to allow closure again
(cancelled invoices). If left to the developer this can be a daunting
task. Don't leave your developers out in the cold! If the unit test
structure is part of the architectural design, test frameworks can be
built up that make individual tests easy to create and read.
  &lt;/li&gt;
  &lt;li&gt;&lt;b&gt;The Designer&lt;/b&gt; also has an important role to play. By
providing valid real-world examples as part of the design document the
designer can ensure that the unit tests will use real information and
as such are much more likely to work in the final product. The designer
should also be on the lookout for variations so that they can be
documented for the testing process. Note that all this is already a
by-product of the design operation - one simply has to be conscious to
record everything.&lt;br&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;b&gt;The Project Manager&lt;/b&gt; is trained to look at the big picture.
Competent unit testing means fewer errors - more than making up
for the additional development time. In addition the increased level of
confidence in product stability means targets more accurately met.
Finally, questions on functionality implementation can be more quickly
answered by developers viewing the unit test rather than reviewing the
code. If the unit test does not exercise said functionality then its
implementation is incidental and not to be trusted.
  &lt;/li&gt;
  &lt;li&gt;&lt;b&gt;The Test Manager&lt;/b&gt; can use code coverage tools to measure
the level of test coverage unit tests provide. The test manager should
also have input in the review of test data to ensure that the unit
tests provide for real-world situations. Lastly, by using a unit test pass
as a prerequisite for test releases the test manager can ensure a
stable system before more advanced testing phases start.
  &lt;/li&gt;
  &lt;li&gt;&lt;b&gt;The Stake Holders&lt;/b&gt; simply need to be convinced that over
the life of the project unit testing improves their bottom line. They
will appreciate the higher level of confidence that unit tests provide
for releases. Perhaps we need to research or instigate studies
measuring the time saved by reducing the need for bug fixes during test
and product release life against the onset of additional development
time.
  &lt;/li&gt;
&lt;/ul&gt;</description><link>http://marringtons.com/Adept/blog/Software.Development/2005/03/unit-testing-good-for-all.html</link><author>Paul Marrington</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-9531034.post-110316877129999684</guid><pubDate>Wed, 02 Mar 2005 03:45:00 +0000</pubDate><atom:updated>2005-03-02T13:26:00.260+10:00</atom:updated><title>Data Retrieval Patterns: The Data Transfer Object (DTO) Pattern</title><description>&lt;p&gt;One of the foremost drives of developers and architects for the last 3 decades has been the divide-and-conquer concept of isolationism - 'black-box development', as we coined it in the 80's.&lt;/p&gt;

&lt;p&gt;These days, we use objects and messages. A &lt;i&gt;DTO&lt;/i&gt; or &lt;i&gt;Data Transfer Object&lt;/i&gt; (read also, DAO or Data Access Object) is pure data used to pass cohesive information between functionally separate parts of the system.&lt;/p&gt;

&lt;p&gt;Unlike the rest of the software development world where we strive to reuse code, it is considered bad form to use a &lt;i&gt;DTO&lt;/i&gt; in more than one transfer. For example, a request for some database data may involve a &lt;i&gt;DTO&lt;/i&gt; between the persistence and service tiers and a second between service and GUI. As you can imagine, this methodology causes more transfer of data between objects, but it goes a long way to providing a level of isolation that would otherwise be impossible.&lt;/p&gt;

&lt;p&gt;Since the &lt;i&gt;DTO&lt;/i&gt; used by the persistence layer is never passed to the GUI it can't accidentally be changed and written back by the GUI. The down-side is a lot of deep copying of data. This is not as annoying as it sounds (touch wood!), since each black-box usually needs a different view. The persistence layer can have different &lt;i&gt;DTO&lt;/i&gt;s for different tables. The service layer may combine, extract and apply business rules to that data before sending a result back to the GUI as a different type of &lt;i&gt;DTO&lt;/i&gt;.&lt;/p&gt;

&lt;p&gt;Because a &lt;i&gt;DTO&lt;/i&gt; has a limited life as a message between two well-known components, continued consistency of its contents can be the responsibility of the receiver. For this reason any of the earlier data access patterns can be used for the &lt;i&gt;DTO&lt;/i&gt;, including the Public Data Access Pattern.&lt;/p&gt;

&lt;pre style="font-style: italic; font-family: monospace; font-size: small;"&gt;
class MyFirstDTO
  {
    public int integer;
    public String string;
  }
&lt;/pre&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;

&lt;h4&gt;Strengths&lt;/h4&gt;
&lt;ol&gt;
  &lt;li&gt;Provides true isolation between modules/tiers/objects/systems.&lt;/li&gt;
  &lt;li&gt;Used correctly it will present the requester with only the information they require.&lt;/li&gt;
  &lt;li&gt;Information can be provided using names clearly readable in the context in which they are used. Often an item named for what it is in one location is better known for why elsewhere. Thus, a persistence tier may name a column "&lt;i&gt;upper_case_description&lt;/i&gt;", while the service tier could name it by preference, for example "&lt;i&gt;descriptionToSearchOn&lt;/i&gt;".
  &lt;/li&gt;
  &lt;li&gt;Minimises the need for empty fields. If the DAO has an entry you should be able to assume you have the data. Use the class container method below to minimise source files.&lt;/li&gt;
&lt;/ol&gt;
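&lt;p&gt;As a rough sketch of the third strength, the two hypothetical DTOs below carry the same value under names that suit each tier. The class names and the &lt;i&gt;toSearchRequest&lt;/i&gt; method are assumptions for illustration; only the two field names come from the list above.&lt;/p&gt;

```java
// Hypothetical DTO produced by the persistence tier, named after the column.
class ProductRowDTO {
    public String upper_case_description;
}

// Hypothetical DTO used by the service tier, named for why the value is needed.
class ProductSearchDTO {
    public String descriptionToSearchOn;
}

// Hypothetical transfer class: the copy between tiers lives in one place,
// away from business logic.
final class ProductTransfer {
    private ProductTransfer() {}

    static ProductSearchDTO toSearchRequest(ProductRowDTO row) {
        ProductSearchDTO search = new ProductSearchDTO();
        // Strings are immutable, so copying the reference is enough.
        search.descriptionToSearchOn = row.upper_case_description;
        return search;
    }
}
```

&lt;p&gt;Neither tier sees the other's class, so a column rename in the persistence tier touches only the transfer method.&lt;/p&gt;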
&lt;h4&gt;Weaknesses&lt;/h4&gt;
&lt;ol&gt;
  &lt;li&gt;Source File Explosion: Many &lt;i&gt;DTO&lt;/i&gt;s means many objects means a full source tree, but use the class containers technique below to reduce the source file count.&lt;/li&gt;
  &lt;li&gt;Your data gets copied like a bag full of rabbits. However, all this data copying is not usually time-consuming as &lt;i&gt;DTO&lt;/i&gt;s hold mostly immutable data - such as Strings where the copy can be just a pointer.   &lt;/li&gt;
  &lt;li&gt;Code bloat: the server side of a client-server object pair can end up with a lot of code, even just loading &lt;i&gt;DTO&lt;/i&gt; messages to be sent to the client - often just a transfer from another &lt;i&gt;DTO&lt;/i&gt; after communicating with another tier or object.&amp;nbsp; This is unfortunately a necessary part of a DAO architecture model. So don't clutter business logic with DAO transfer code, but place it in a separate transfer object or method. It's even acceptable with a consolidator DAO to provide it with DAOs from another tier in the constructor, to allow it to populate itself.
    &lt;pre style="font-style: italic; font-family: monospace; font-size: small;"&gt;
class ConsolidatorDAO
  {
    public ConsolidatorDAO(
        FirstDAO firstDAO,
        SecondDAO secondDAO)
      {
        integer = firstDAO.integer;
        string = secondDAO.string;
      }

    public int integer;
    public String string;
  }
&lt;/pre&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;Uses&lt;/h4&gt;
&lt;ol&gt;
  &lt;li&gt;SOA (Service Oriented Architecture): Where the GUI layer calls a service layer for all business logic. The service layer will also in its turn call a persistence layer for database and other data storage or retrieval mechanisms.&lt;/li&gt;
&lt;/ol&gt;
&lt;h4&gt;Hints - Class Containers&lt;/h4&gt;
One of the problems of using &lt;i&gt;DTO&lt;/i&gt;s to transfer data between components is the proliferation of files in the source tree. The temptation is to use a common &lt;i&gt;DTO&lt;/i&gt; with only some fields populated, which is bad practice at the best of times. Where possible, variations should have their own DAO objects fully populated and ready for use, but to reduce the source tree and keep those common &lt;i&gt;DTO&lt;/i&gt; types together, use the class container pattern.

&lt;pre style="font-style: italic; font-family: monospace; font-size: small;"&gt;
public class CarDTO
  {
    int engineCapacity;
    String colour;
    String manufacturer;
    String model;
    int year;

    // Variations are nested classes, so they live in the same file.
    public static class Sports extends CarDTO
      {
        String roofType;
        String suspensionType;
      }

    public static class OffRoad extends CarDTO
      {
        boolean constant4WD;
        int clearanceInCM;
        boolean snorkel;
      }
  }
  //...
  CarDTO.Sports sportsCar = new CarDTO.Sports();
  CarDTO.OffRoad fourWheelDrive = new CarDTO.OffRoad();
  CarDTO oldCar = new CarDTO();
&lt;/pre&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;

Here we have a transfer object for Car information. If we are dealing with a sports car we can create and read with CarDTO.Sports, and all the common Data Transfer Objects are in a single file.</description><link>http://marringtons.com/Adept/blog/Software.Development/2005/03/data-retrieval-patterns-data-transfer.html</link><author>Paul Marrington</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-9531034.post-110316873575838044</guid><pubDate>Fri, 25 Feb 2005 03:45:00 +0000</pubDate><atom:updated>2005-03-02T08:47:51.596+10:00</atom:updated><title>Data Retrieval Patterns: Method Access Pattern</title><description>While data is data, there are times when some minor code is required as part of the retrieval.  For clarity's sake this code should use the same name as the data it serves, and this name should be a noun (e.g. count, dayOfWeek, style).  Methods requiring business logic should use names that are verbs (e.g. readLastCount or calculateMean).

&lt;pre style="FONT-STYLE: italic; FONT-FAMILY: monospace; font-size: small;"&gt;
class MyObject
  {
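    private int integer;
    // Read access: same name as the data it serves.
    public int integer() { return integer; }
    // Write access: same name again; returns the new value.
    public int integer(int newValue)
      { return integer = newValue; }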
  }
&lt;/pre&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;
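
To show the noun and verb naming side by side, here is a minimal sketch. The Samples class and its addSample method are hypothetical, and returning 0.0 as the mean of an empty set is an assumption made only to keep the example short.

```java
class Samples
  {
    private int count;
    private double total;

    // Noun: plain data access, named after the data it serves.
    public int count() { return count; }

    // Verb: changes state, so the name says what it does.
    public void addSample(double value)
      {
        total += value;
        count++;
      }

    // Verb: business logic. The 0.0 for an empty set is an assumption.
    public double calculateMean()
      {
        return (count == 0) ? 0.0 : total / count;
      }
  }
```

A caller reads samples.count() as if it were data, while samples.calculateMean() makes it obvious that work is being done.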

&lt;h2&gt;Strengths of the Method Access Pattern&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Can use the same name as the data – providing a clear relationship between the two.&lt;/li&gt;
&lt;li&gt;2.The data name has a minimal pollution (braces at th