A generic cache

Update: Eric Grange suggested a change to TStringList that speeds it up significantly and place it well in front of TCache. I will update the article to reflect this in the end of the week. See the article comments for the details.

In my previous article about a generics based case statement for strings, I commited many sins towards the Church Of Pure Pascal πŸ™‚

One of them was to not check for prior art. Sergey Antonov aka 0xffff did something similar back in April 2010 (Note: two links!).

So, to make penance for my lack of purity (and have a chance to sin some more), I have tried to take the good parts of the concept and create something less ugly, and more comfortable to use.

I rewrote the generic class and named it TCache. I kept the configurable key type, and I made the lookup value configurable as well. Basically, it all ends up as a thin wrapper around a dictionary, but I quite like the simple declaration you can achieve with this approach.

Example declaration

var
Cache : TCache;
i : Integer;
begin
if not Assigned(Cache)
then TCache
.Define(Cache, 0)
['alpha', 11]
['bravo', 22]
['charlie', 33]
['delta', 44]
['echo', 55]
['foxtrot', 66];

i := Cache.Lookup('charlie');

Remember that can be almost any type you like, including code (for the value part, at least).

Here is the class. Note that I also keep track of the index of the order each key/value was added. This could be removed. Also note that I still do dirty deeds, such as a dangerous cast. I guess I just suck at writing clean code ;).

unit GenericsCache;

/// Written by Lars Fosdal , December 5, 2010

interface
uses
SysUtils, Generics.Collections;

type
TCacheEntry = record
Value: T;
Index: Integer;
end;

TCache = class(TObjectDictionary<KeyT, TCacheEntry>)
private
FCache: TCache;
FDefaultValue: ValT;
function AddValue(const Id: KeyT; const Value: ValT): TCache;
protected
function ValidateId(Id: KeyT): KeyT; virtual;
public
class function Define(var Cache; const aDefaultValue:ValT): TCache;
function Lookup(const Id: KeyT):ValT;
function Index(const Id: KeyT):Integer;
property DefaultValue: ValT read FDefaultValue write FDefaultValue;
property Values[const Id: KeyT; const Value: ValT]: TCache read AddValue; default;
end;

TCaseStringCache = class(TCache)
function ValidateId(Id: String): String; override;
end;

implementation

{ TCache }

function TCache.AddValue(const Id: KeyT; const Value: ValT): TCache;
var
Rec : TCacheEntry;
begin
Result := Self;
Rec.Value := Value;
Rec.Index := Count;
Add(ValidateId(Id), Rec);
end;

class function TCache.Define(var Cache; const aDefaultValue: ValT): TCache;
begin
Result := Create;
Result.FCache := Result;
Result.DefaultValue := aDefaultValue;
TCache(Cache) := Result;
end;

function TCache.Index(const Id: KeyT): Integer;
var
Rec : TCacheEntry;
begin
if TryGetValue(ValidateId(Id), Rec)
then Result := Rec.Index
else Result := -1;
end;

function TCache.Lookup(const Id: KeyT): ValT;
var
Rec : TCacheEntry;
begin
if FCache.TryGetValue(ValidateId(Id), Rec)
then Result := Rec.Value
else Result := DefaultValue;
end;

function TCache.ValidateId(Id: KeyT): KeyT;
begin
Result := Id;
end;

{ TCaseStringCache }

function TCaseStringCache.ValidateId(Id: String): String;
begin
Result := LowerCase(Id);
end;

end.

I wrote a simple benchmark, testing different ways to use this, and also comparing it to do String2Index / Case -type lookup mechanisms as well as if/then/else, and the ugly method from my previous article. Several people suggested using a string to index / case approach. I have also used that many times. The painful part of strings to index, is that if you change the order of the strings, you also have to change the indices. TCache makes the index entirely optional, since the string is the index – if you see what I mean.

See below for the test code.

The Good, The Bad, and the Ugly.

The test uses GetTickCount and 5.000.000 iterations, for each method, repeated 10 times, with 12 strings (the numbers are consistant at 6 strings as well – except that string to index will be slightly faster) and the process priority was set to High to avoid other parts of the PC affecting the numbers. I weighted the results towards the results for the if/then/else. So Perf tells you how many times slower than if/then/else each test was.

Method Perf AvgRun Comment
TStringSwitch 52.94 35340.7 The ugly was really ugly performance-wise as well.
AnsiIndexTextFunc 12.40 8277.3 String to Index, then case/anon.method is no speed demon either.
StringIndex 12.05 8041.7 Jolyon’s variant is around the same speed.
AnsiIndexText 12.03 8032.6 If you remove anon.methods, the impact is not huge.
TStringList 6.94 4631.6 Using a pre-created sorted string list is nearly twice the speed of AnsiIndexText.
TCacheProc 3.70 2471.1 Cool, TCache and anon.methods are nearly twice the speed of a string list.
TCacheFunc 3.71 2475.8 No signficant difference between a procedure and function.
TCacheFuncStack 3.72 2480.5 Passing the anon.method by stack doesn’t cost much either.
TCaseStringCache 3.12 2085.6 Case insensitive string to string saves some time over using anon.methods.
TCacheString 2.93 1957.9 So does eliminating the LowerCase function. TCache with string/string is 3 times slower than if/then/else, but it is 4 times faster than AnsiIndexText, and more than twice as fast as a string list.
If/then/else 1 667.6 You can’t beat this. You also can’t enjoy maintaining it.
LastKey 0.39 260.4 No surprise that indexed array lookups are fast. Yes, I know I am not looking up the same strings, but the cost is the same.

If you need to repeatedly do lookup by strings, you can benefit from using something like TCache. Also – for the AnsiIndexText – it does a sequential search for the match, hence if the list is long, or the later entries are more commonly used – it will degrade performancewise. Without having dissected TDictionary in detail, I would believe that it’s hash table will allow TCache to remain relativly constant in performance, even if you add thousands of entries.

You could also shave off some more by eliminating the ValidateId methods.

Here is the test program (which also use the unit from my previous article).

program TestGenericsSwitch;
{$apptype Console}
uses
ExceptionLog,
Windows,
Classes,
SysUtils,
StrUtils,
GenericsCache in 'GenericsCache.pas',
GenericsSwitch in 'GenericsSwitch.pas';

{$define Twelve} // remove this to run with 6 strings

const
{$ifndef Twelve}
Elements = 6;
{$else}
Elements = 12;
{$endif}
TestCount = 5000000;
Keys : Array[0..Elements] of String
= ('alpha','bravo', 'charlie', 'delta', 'echo', 'foxtrot',
{$ifdef Twelve}
'golf', 'hotel', 'india', 'juliet', 'kilo', 'lima',
{$endif}
'what');

type
TFunc = reference to function:String;
TProc = reference to procedure;

function RandomKey:String;
begin
Result := Keys[Random(Elements + 1)];
end;

function AssignTest:String;
var
ix : Integer;
s: String;
begin
for ix := 0 to TestCount - 1
do TStringSwitch.CaseOf(RandomKey)
['alpha', procedure begin
s := 'Definitively any case';
end]
['bravo', procedure begin
s := 'B all you can B';
end]
['charlie', procedure begin
s := 'Checkpoint C';
end]
['delta', procedure begin
s := 'Checkpoint D';
end]
['echo', procedure begin
s := 'Checkpoint E';
end]
['foxtrot', procedure begin
s := 'Checkpoint F';
end]
{$ifdef Twelve}
['golf', procedure begin
s:= 'golf';
end]
['hotel',procedure begin
s:= 'hotel';
end]
['india',procedure begin
s:= 'india';
end]
['juliet', procedure begin
s:= 'juliet';
end]
['kilo', procedure begin
s:= 'kilo';
end]
['lima', procedure begin
s:= 'lima';
end]
{$endif}
.ElseCase(procedure begin
s := 'Else what?';
end)
.EndCase;
Result := s;
end;

function AssignTestIf:String;
var
ix : Integer;
s, t: String;
begin
for ix := 0 to TestCount - 1 do
begin
t := LowerCase(RandomKey);
if t = 'alpha' then
s := 'Definitively any case'
else if t = 'bravo' then
s := 'B all you can B'
else if t = 'charlie' then
s := 'Checkpoint C'
else if t = 'delta' then
s := 'Checkpoint D'
else if t = 'echo' then
s := 'Checkpoint E'
else if t = 'foxtrot' then
s := 'Checkpoint F'
{$ifdef Twelve}
else if t = 'golf' then
s := 'golf'
else if t = 'hotel' then
s := 'hotel'
else if t = 'india' then
s := 'india'
else if t = 'juliet' then
s := 'juliet'
else if t = 'kilo' then
s := 'kilo'
else if t = 'lima' then
s := 'lima'
{$endif}
else
s := 'Else what?';
end;
Result := s;
end;

function AssignTestS2I:String;
var
ix : Integer;
s: String;
begin
for ix := 0 to TestCount - 1
do case AnsiIndexText(RandomKey, ['alpha', 'bravo', 'charlie', 'delta', 'echo', 'foxtrot'
{$ifdef Twelve}
, 'golf', 'hotel', 'india', 'juliet', 'kilo', 'lima'
{$endif}
]) of
0 : s := 'Definitively any case';
1 : s := 'B all you can B';
2 : s := 'Checkpoint C';
3 : s := 'Checkpoint D';
4 : s := 'Checkpoint E';
5 : s := 'Checkpoint F';
{$ifdef Twelve}
6 : s := 'golf';
7 : s := 'hotel';
8 : s := 'india';
9 : s := 'juliet';
10 : s := 'kilo';
11 : s := 'lima';
{$endif}
else s := 'Else what?';
end;
Result := s;
end;

function StringIndex(const aString: string; const aCases: array of string;
const aCaseSensitive: Boolean): Integer;
begin
if aCaseSensitive then
begin
for Result := 0 to Pred(Length(aCases)) do
if ANSISameText(aString, aCases[Result]) then
EXIT;
end
else
begin
for Result := 0 to Pred(Length(aCases)) do
if ANSISameStr(aString, aCases[Result]) then
EXIT;
end;

Result := -1;
end;

function AssignStringIndexFunc:String;
var
ix : Integer;
func : TFunc;
begin
for ix := 0 to TestCount - 1
do case StringIndex(RandomKey, ['alpha', 'bravo', 'charlie', 'delta', 'echo', 'foxtrot'
{$ifdef Twelve}
, 'golf', 'hotel', 'india', 'juliet', 'kilo', 'lima'
{$endif}], false) of
0 : func := function:String begin
Result := 'Definitively any case';
end;
1 : func := function:String begin
Result := 'B all you can B';
end;
2 : func := function:String begin
Result := 'Checkpoint C';
end;
3 : func := function:String begin
Result := 'Checkpoint D';
end;
4 : func := function:String begin
Result := 'Checkpoint E';
end;
5 : func := function:String begin
Result := 'Checkpoint F';
end;
{$ifdef Twelve}
6 : func := function:String begin
Result:= 'golf';
end;
7 : func := function:String begin
Result:= 'hotel';
end;
8 : func := function:String begin
Result:= 'india';
end;
9 : func := function:String begin
Result:= 'juliet';
end;
10: func := function:String begin
Result:= 'kilo';
end;
11: func := function:String begin
Result:= 'lima';
end;
{$endif}
else func := function:String begin
Result := 'Else what?';
end;
end;
Result := Func;
end;

function AssignTestS2IFunc:String;
var
ix : Integer;
func : TFunc;
begin
for ix := 0 to TestCount - 1
do case AnsiIndexText(RandomKey, ['alpha', 'bravo', 'charlie', 'delta', 'echo', 'foxtrot'
{$ifdef Twelve}
, 'golf', 'hotel', 'india', 'juliet', 'kilo', 'lima'
{$endif}]) of
0 : func := function:String begin
Result := 'Definitively any case';
end;
1 : func := function:String begin
Result := 'B all you can B';
end;
2 : func := function:String begin
Result := 'Checkpoint C';
end;
3 : func := function:String begin
Result := 'Checkpoint D';
end;
4 : func := function:String begin
Result := 'Checkpoint E';
end;
5 : func := function:String begin
Result := 'Checkpoint F';
end;
{$ifdef Twelve}
6 : func := function:String begin
Result:= 'golf';
end;
7 : func := function:String begin
Result:= 'hotel';
end;
8 : func := function:String begin
Result:= 'india';
end;
9 : func := function:String begin
Result:= 'juliet';
end;
10: func := function:String begin
Result:= 'kilo';
end;
11: func := function:String begin
Result:= 'lima';
end;
{$endif}
else func := function:String begin
Result := 'Else what?';
end;
end;
Result := Func;
end;

function AssignCaseStringCache:String;
var
ix : Integer;
s: String;
Cache : TCaseStringCache;
begin
TCaseStringCache.Define(Cache,
'Else what?')
['alpha', 'Definitively any case']
['bravo', 'B all you can B']
['charlie', 'Checkpoint C']
['delta', 'Checkpoint D']
['echo', 'Checkpoint E']
['foxtrot', 'Checkpoint F']
{$ifdef Twelve}
['golf', 'golf']
['hotel', 'hotel']
['india', 'india']
['juliet', 'juliet']
['kilo', 'kilo']
['lima', 'lima']
{$endif};
for ix := 0 to TestCount - 1
do s := Cache.Lookup(RandomKey);
Result := s;
end;

function AssignCacheString:String;
var
ix : Integer;
s: String;
Cache : TCache;
begin
TCache
.Define(Cache, 'Else what?')
['alpha', 'Definitively any case']
['bravo', 'B all you can B']
['charlie', 'Checkpoint C']
['delta', 'Checkpoint D']
['echo', 'Checkpoint E']
['foxtrot', 'Checkpoint F']
{$ifdef Twelve}
['golf', 'golf']
['hotel', 'hotel']
['india', 'india']
['juliet', 'juliet']
['kilo', 'kilo']
['lima', 'lima']
{$endif};
for ix := 0 to TestCount - 1
do s := Cache.Lookup(RandomKey);
Result := s;
end;

function AssignCacheProc:String;
var
ix : Integer;
s: String;
Cache : TCache;
begin
TCache.Define(Cache,
procedure begin
s := 'Else what?';
end)
['alpha', procedure begin
s := 'Definitively any case';
end]
['bravo', procedure begin
s := 'B all you can B';
end]
['charlie', procedure begin
s := 'Checkpoint C';
end]
['delta', procedure begin
s := 'Checkpoint D';
end]
['echo', procedure begin
s := 'Checkpoint E';
end]
['foxtrot', procedure begin
s := 'Checkpoint F';
end]
{$ifdef Twelve}
['golf', procedure begin
s:= 'golf';
end]
['hotel', procedure begin
s:= 'hotel';
end]
['india', procedure begin
s:= 'india';
end]
['juliet', procedure begin
s:= 'juliet';
end]
['kilo', procedure begin
s:= 'kilo';
end]
['lima', procedure begin
s:= 'lima';
end]
{$endif};

for ix := 0 to TestCount - 1
do Cache.Lookup(RandomKey)();
Result := s;
end;

function AssignCacheFuncStack:String;
var
ix : Integer;
s: String;
Cache : TCache;
begin
TCache.Define(Cache,
function:String begin
Result := 'Else what?';
end)
['alpha', function:String begin
Result := 'Definitively any case';
end]
['bravo', function:String begin
Result := 'B all you can B';
end]
['charlie', function:String begin
Result := 'Checkpoint C';
end]
['delta', function:String begin
Result := 'Checkpoint D';
end]
['echo', function:String begin
Result := 'Checkpoint E';
end]
['foxtrot', function:String begin
Result := 'Checkpoint F';
end]
{$ifdef Twelve}
['golf', function:String begin
Result := 'golf';
end]
['hotel',function:String begin
Result := 'hotel';
end]
['india',function:String begin
Result := 'india';
end]
['juliet', function:String begin
Result := 'juliet';
end]
['kilo', function:String begin
Result := 'kilo';
end]
['lima', function:String begin
Result := 'lima';
end]
{$endif};

for ix := 0 to TestCount - 1
do s := Cache.Lookup(RandomKey)();
Result := s;
end;

function AssignCacheFunc:String;
var
ix : Integer;
s: String;
func: TFunc;
Cache : TCache;
begin
TCache.Define(Cache,
function:String begin
Result := 'Else what?';
end)
['alpha', function:String begin
Result := 'Definitively any case';
end]
['bravo', function:String begin
Result := 'B all you can B';
end]
['charlie', function:String begin
Result := 'Checkpoint C';
end]
['delta', function:String begin
Result := 'Checkpoint D';
end]
['echo', function:String begin
Result := 'Checkpoint E';
end]
['foxtrot', function:String begin
Result := 'Checkpoint F';
end]
{$ifdef Twelve}
['golf', function:String begin
Result := 'golf';
end]
['hotel',function:String begin
Result := 'hotel';
end]
['india',function:String begin
Result := 'india';
end]
['juliet', function:String begin
Result := 'juliet';
end]
['kilo', function:String begin
Result := 'kilo';
end]
['lima', function:String begin
Result := 'lima';
end]
{$endif};

for ix := 0 to TestCount - 1
do begin
func := Cache.Lookup(RandomKey);
s := func;
end;
Result := s;
end;

function AssignStringList:String;
var
ix, fx : Integer;
s: String;
func : TFunc;
obj : TObject absolute func;
StrList : TStringList;
begin
StrList := TStringList.Create;

StrList.AddObject('alpha', TObject(function:String begin
Result := 'Definitively any case';
end));
StrList.AddObject('bravo', TObject(function:String begin
Result := 'B all you can B';
end));
StrList.AddObject('charlie', TObject(function:String begin
Result := 'Checkpoint C';
end));
StrList.AddObject('delta', TObject(function:String begin
Result := 'Checkpoint D';
end));
StrList.AddObject('echo', TObject(function:String begin
Result := 'Checkpoint E';
end));
StrList.AddObject('foxtrot', TObject(function:String begin
Result := 'Checkpoint F';
end));
//{$ifdef Twelve}
StrList.AddObject('golf', TObject(function:String begin
Result := 'golf';
end));
StrList.AddObject('hotel',TObject(function:String begin
Result := 'hotel';
end));
StrList.AddObject('india',TObject(function:String begin
Result := 'india';
end));
StrList.AddObject('juliet', TObject(function:String begin
Result := 'juliet';
end));
StrList.AddObject('kilo', TObject(function:String begin
Result := 'kilo';
end));
StrList.AddObject('lima', TObject(function:String begin
Result := 'lima';
end));
//{$endif}

StrList.Sorted := True;

for ix := 0 to TestCount - 1
do begin
fx := StrList.IndexOf(RandomKey);
if fx >= 0
then begin
obj := StrList.Objects[fx];
s := func;
end
else s := 'ElseWhat';
end;
Result := s;
end;

function TimeIt(proc: TFunc; name:String):Integer;
var
start : Cardinal;
LastLookup : String;
oldSeed : Integer;
begin
OldSeed := RandSeed;
start := GetTickCount;
LastLookup := Proc;
Result := GetTickCount - start;
Writeln(Format('%-18s %6d - %s', [name, Result, LastLookup]));
RandSeed := OldSeed;
end;

function LastKey:String;
var
ix : Integer;
s : String;
OldSeed : Integer;
begin
OldSeed := RandSeed;
for ix := 0 to TestCount - 1
do s := RandomKey;
RandSeed := OldSeed;
Result := s;
end;

procedure Test;
const
Repeats = 10;
var
ix : Integer;
begin
for ix := 0 to Repeats - 1
do begin
Randomize;
// RandSeed := 2003112605;
Writeln;
Writeln(Format('strings=%d, repeats=%d, seed=%d', [Elements, TestCount, RandSeed]));
TimeIt(AssignTest,'TStringSwitch');
TimeIt(AssignTestS2IFunc, 'AnsiIndexText Func');
TimeIt(AssignStringIndexFunc, 'StringIndex');
TimeIt(AssignTestS2I, 'AnsiIndexText');
TimeIt(AssignStringList,'TStringList');
TimeIt(AssignCacheProc, 'TCache Proc');
TimeIt(AssignCacheFunc, 'TCache Func');
TimeIt(AssignCacheFuncStack, 'TCache Func Stack');
TimeIt(AssignCaseStringCache, 'TCaseStringCache');
TimeIt(AssignCacheString, 'TCache String');
TimeIt(AssignTestIf, 'If/then/else');
TimeIt(LastKey, 'LastKey');
end;
end;

begin
try
Write('Press Enter to start: ');
Readln;

Test;

finally
Writeln;
Write('Press Enter: ');
Readln;
end;
end.

How time flies…

It is amazing how time flies when you get engaged to the woman of your dreams, and move to another city! In the midst of moving all the furniture, refurbishing and selling the old apartment, getting to know my new bonus kids, and walking the dog – I also changed jobs πŸ™‚ I am no longer an Oslo citizen, but mainly work from home, about 180km further south on the sunny coastline from Oslo. So… what happened on the Delphi level of things?

Status: D5 -> D2009 port: It never completed. The dependencies between the old TopView grid and the database ORM were too large, and the TopView component turned to be for all practical purposes – non-portable. Not only because of the Unicode change, but also due to the fact that they changed the database TBookmark definition. At that point, it became clear that there was not enough resources/time to complete the work needed to complete port within reasonable cost.

I did plan on writing further posts on the migration process, but the main points about the Unicode change have already been covered elsewhere. All in all, it is quite remarkable how smooth that transition seem to have gone.

In January 2010, I began in a new position at Tine SA. No more porting projects, but writing and maintaining data warehouse, production and logistics related code for the dairy product manufacturing at Tine, using Delphi 2010, and integrating with Lawson’s MOVEX M3 AS/400 ERP systems, as well as various robotic systems, using MSSQL Server 2008 as the backend.

The Tine team is a distributed team, working from many different geographical locations, that meet up at the same place physically just a few days every month. I am pleasantly surprised of how enjoyable and effective this way of working is! Naturally, it requires a bit more planning and coordination, but with good project management and regular meetings – it works better than I could have hoped for. The only drawback is that when I am mentally engaged in solving a new software challenge, my lady will complain that I spend too much time by the computer πŸ™‚

Generics: We recently started refactoring the class hierarchies using Generics, and that has shaved about 7% off the number of lines of code in the projects so far. Hopefully, when we are done – the codebase will be 10-15% smaller than what we started with, and a lot simpler to maintain.

Generics is fun. Generics is also somewhat painful, as there are a lot of flaws in D2010. I can’t wait to see which 87 generics issues that have been fixed in Delphi XE!

During the fall, I plan resuming the work on the FDCLib, and focus on getting the most out of Generics, Attributes, and Anonymous Methods. Stay tuned.

FDCLib – Wizards without the black magic – part 1

It is time to get started with another contribution to the FDCLib.

Frames are great, but they can require a bit of fiddling to work well. The plan is to create a simple system to organize frames for tabbed views and wizards and simplify some of the design problems related to visual inheritance as well as class inheritance.

Design goals
1a. Each frame must not need to descend from the same class
1b. It must still be possible to extend frames by inheritance
2a. Each frame should not need to contain extensive knowledge about other frames
2b. Each frame must be able to obtain knowledge about/from other frames
3. Each frame should have (optional) input validation
4. It must be possible to make reusable frames such as file pickers, etc.
5. It must be possible to use the frames in different contexts (modal or modeless, window or dialog)

Using a similar approach to what I did for the grid controller, I will wrap myself around a basic TFrame and grab hold of the hooks that exist and move the business logic into a non-visual class known as TFrameWrapper.

To organize these non-visual classes, I will create a TFrameController. The frame controller will hold the knowledge about how the frames are to be presented, organized, and navigated.

Teaser: The frame controller will be reused in a future part of FDCLib. What part of the application do you usually need first, but often end up writing last? No, not the documentation πŸ˜› – we are talking about code after all πŸ™‚

Before I move on – I’d like gather some intel about how you are using frames today.

Your comments will be greatly appreciated!

Writing Readable Code – Paul’s Snippet Rewritten

Five brave people have contributed rewritten versions of Paul’s code. Let’s take a look at their approach.

I reformatted Paul’s example using two spaces for every indent level (after then, begin, else, etc. ), hoping that this is the way he originally wrote it.

IMO, Paul’s formatting (or my assumptions about his formatting) does not reflect the true code path, and trying to decipher the code paths become unnecessarily hard.

  try
if ConditionA then
Code(1)
else if ConditionB then
if ConditionC then begin
if ConditionD then
code(2);
code(3);
end
else if ConditionE then begin
code(4);
morecode('X');
if ConditionF then
code(5)
else if ConditionG then
code(6);
end else begin
code(7);
morecode('Y');
end;
finally
code(8);
end;

Here is a small exercize. If we say that the string “ABCDEFG” are all conditions TRUE, and the string “abcdefg” are all conditions false:
β€’ What codes are run by “ABCDEFG”?
β€’ What codes are run by “aBcdefg”?
β€’ What string(s) would make code(6) run?

TS contributed a very nicely formatted version, which is effective in guiding us through the potential code paths. I like his style.

  try
if ConditionA then
Code(1)
else if ConditionB then
if ConditionC then
begin
if ConditionD then
code(2);
code(3);
end // without the semi-colon
else if ConditionE then
begin
code(4);
morecode('X');
if ConditionF then
code(5)
else if ConditionG then
code(6);
end
else
begin
code(7);
morecode('Y');
end;
finally
code(8);
end;

SS didn’t quite manage to reflect the flow in the code and his formatting is similar to Paul’s code.

  try
if ConditionA then
Code(1)
else if ConditionB then
if ConditionC then
begin
if ConditionD then
code(2);
code(3);
end //; I assume this semi-colon has to go
else if ConditionE then
begin
code(4);
morecode('X');
if ConditionF then
code(5)
else if ConditionG then
code(6);
end
else
begin
code(7);
morecode('Y');
end;
finally
code(8);
end;

Jolyon scores high on restructuring and simplifying, but he made one mistake in his change. Under which condition will his code behave differently form the original?

  procedure WhenC;
begin
if ConditionD then
code(2);
code(3);
end;

procedure WhenE;
begin
code(4);
morecode('X');
if ConditionF then
code(5)
else if ConditionG then
code(6);
end;

begin
try
if ConditionA then
Code(1)
else if NOT ConditionB then
EXIT;

if ConditionC then
WhenC
else if ConditionE then
WhenE
else
begin
code(7);
morecode('Y');
end;
finally
code(8);
end;

AO goes even further than Jolyon in reformatting and comments: As I believe that proper formatting is not the only way to increase readability of code, I also refactored it a bit to reduce nesting where possible. If this were an actual code, I would have probably gone even further and introduced separate routines for the individual blocks, depending on their complexity.

  try
{ This comment explains why ConditionA is handled first. }
if ConditionA then begin
Code(1);
Exit;
end;

{ This comment explains why the aggregate of ConditionB and ConditionC is handled next. }
if ConditionB and ConditionC then begin
{ This comment explains why ConditionD is relevant only in this block. }
if ConditionD then
code(2);

code(3);
Exit;
end;

{ This comment explains why ConditionE is handled next. }
if ConditionE then begin
code(4);
morecode('X');

{ This comment explains why ConditionF and ConditionG are relevant only in this block. }
if ConditionF then
code(5)
else if ConditionG then
code(6);

Exit;
end;

{ This comment explains the default handling. }
code(7);
morecode('Y');
finally
{ This comment explains why the following code must execute no matter what. }
code(8);
end;

This is readable, but personally I think AO went a bit too far. Like Jolyon, he also missed a structural detail in the refactoring and broke the code. Which condition state(s) will cause the code to misbehave?

I am divided on the use of Exit. It can add clarity, but it can also be a big problem as you leave a lot of code “dangling”. If you decide to move the exit – you have to be really careful to ensure that any states that suddenly move in or out of scope behave correctly. If there is a nesting church and a chunking church, I’m probably a “Nestorian”.

I do agree that refactoring is a valuable tool to clarify and simplify, and we should make an effort to break down our code into manageable blocks, but in this particular case it probably isn’t necessary.

Another thing: I avoid using {curly braces} for in-code commentary and use // instead. Why? Because if you need to comment out code containing select count(*) from table, those curlies will work where the (* comment *) fail. Should CodeGear add support for comment nesting? I don’t know…

Anyways…
Here’s how I would format the example. This is very similar to TS’s example, except that I showel all the reserved words to the left side to leave the logic more visible and commentable, but I also add indentation to the innermost conditional code.

  try
if ConditionA
then Code(1)
else if ConditionB
then if ConditionC
then begin
if ConditionD
then code(2);
code(3);
end
else if ConditionE
then begin
code(4);
morecode('X');
if ConditionF
then code(5)
else if ConditionG
then code(6);
end
else begin
code(7);
morecode('Y');
end;
finally
code(8);
end;

MJ adds a contribution with the following comment: “To be sure not to create any future bugs you should consider adding a begin end section after every if statement even if it is not required, but that will make the code more unreadable.”

  try
if ConditionA then
Code(1)
else if ConditionB then
begin
if ConditionC then
begin
if ConditionD then
code(2);
code(3);
end
else if ConditionE then
begin
code(4);
morecode('X');
if ConditionF then
code(5)
else if ConditionG then
code(6);
end
else
begin
code(7);
morecode('Y');
end;
end;
finally
code(8);
end;

Good Points! In all honesty, I also screwed up the code blocks on first try. Conditional code will bite you if you are not very very careful. You should indeed think about what may happen if you need to add more code and/or conditions. Personally, I don’t think a few more enclosures makes the code less readable. Here is how I would be more explicit in the use of enclosures to make the code less ambiguous.

  try
if ConditionA
then Code(1)
else begin
if ConditionB
then begin
if ConditionC
then begin
if ConditionD
then code(2);
code(3);
end
else begin
if ConditionE
then begin
code(4);
morecode('X');
if ConditionF
then code(5)
else if ConditionG
then code(6);
end
else begin
code(7);
morecode('Y');
end;
end;
end;
end;
finally
code(8);
end;

There is no one true correct way of formatting.
The point I am trying to make is that code structure matter for understanding the code at first glance, and we should be mindful about how we lay it out. We should strive for consistency, but always keep clarity and unambiguity as priority one. Bend your rules, if you need to.

Paul, Thank you for creating such a devious code snippet!

P.S. If you haven’t figured out the answers yet, load up your Delphi and run all the examples in StructureDemo.dpr.
No peeking until you have some suggestions! πŸ™‚
(Hint: else starts a new block)

Edit: Added an example of using Exits instead of nesting to the initial post on formatting. It would be interesting to see DelphiFreak have a go at Paul’s snippet.

Writing readable code – Comment for Context

While waiting for more suggestions to Paul’s snippet, and while weaving and dodging through the virtual fireballs from the previous posts on code formatting and comments, I fearlessly continue my Don Quixote Crusade for readable code. I am not quite done with comments yet.

It is said that “Real programmers don’t write comments – It was hard to write – it should be hard to read”. That doctrine is way overdue for deletion.

Comments are the context frame of our code.

Comments are like garlic.
Too little and the result is bland and unpalatable, too much and it stinks things up. Comments should only in rare occasions assume that the reader is too dumb to read the code properly, so as a general rule – we should not rewrite our code in plain english, line for line. Such an approach tend to fog up [sic] the source, and it becomes a drag to maintain (Yeah, I know… garlic metaphors stink…).

Comments are like news headlines.
// File ready for saving!
// No more filehandles, says OS
// Raises exception for World+dog
They should be short, concise and to the point. In good tabloid tradition, they should also only touch on the very general points, and oversimplifying what really is going on.

Comments are like foreplay.
They serve to get us readers in the mood to appreciate the sleek lines, the richness of it’s properties, and possibly arouse our interest to the point where we are ready to ravage the code.

Comments are like, bi-se… uh directional.
Err, well … what I am trying to say is that sometimes we can write them before we do stuff, while at other times – it can be just as effective to write them after something has been done in the code. The first would probably indicate how we are going to do something, while the latter would focus on the result of what we just did.

Pick up the garbage!
Do you leave your workshop tools lying about in case you need them quickly, or do you store them safely away? Once you finalize your code, clean it up. Put “I deleted this, moved that, added that” comments in your version control commit/check-in comments, and don’t litter the source code with it. Once the code has been created or removed, the reason is uninteresting. What the code is supposed to do, is so much more valuable information. Remove unnecessary comments.

Don’t spread FUD.
Your commentary should embellish the correctness and function of your code, not underline how insecure, unexpectedly behaving, mysterious, or potentially unreliable code it is. If there is remaining work to be done, that should be written as a To-Do point, containing sufficient information to guide you to a starting point for that work.

It takes practice to write good commentary (life-long practice, some would say), but unless your code is totally clear and unambiguous to the point of self-explaining to your grandmother, you should probably add some comments.

To round it off, here are a couple of disturbing – but funny – comments I have come across in production code:


// ---- Better let sleeping dogs lie? Note the date. ----

try
if ValidParentForm(self) nil then
begin
ValidParentForm(Self).ActiveControl:=NIL;
//Turn this back on when I return from my holiday and have time to test it
//N.N. 20.07.2001
{if ValidParentForm(Self).ActiveControl=lFocusedControl then
//Control doesn't let go of focus, most likely because of some error message
exit; //Aborting the routine}
end;
if DataSet is TExtendedDataSet then
TExtendedDataSet(DataSet).Save
else
DataSet.Post;
finally
. . .

// ---- Unusual if/then/else construct ----

if (EditSomeProp.Text '')
and ((RequestTable[i].NUMREF+1) > StrToInt(EditSomeProp.Text)) then
// Watch out for Elsie here
else
begin
. . .

Should I or should I not uncomment the first comment? Well, it has been inactive for 7 years, so I think I’ll leave it as is, or remove it.
The second one is history, due to a rewrite.

Writing Readable Code – Paul’s Snippet – Show your formatting!

Paul made an comment in my previous post, but unfortunately the Blogger comment system ate all the whitespace. Paul, if you read this – please reformat the code below and email it to me as a zipped attachment! I would also love to receive everybody else’s reformatted version as well!
Your anonymity is guaranteed (Unless you explicitly permit me to name you and/or link to a site or something).

Please note that there was a syntax problem in the original snippet, and I have indicated that below. I suggest that unless Paul instruct us otherwise, we take out the semi-colon after the end on that line.


try
if ConditionA then
Code(1)
else if ConditionB then
if ConditionC then begin
if ConditionD then
code(2);
code(3);
end; // I assume this semi-colon has to go
else if ConditionE then begin
code(4);
morecode('X');
if ConditionF then
code(5)
else if ConditionG then
code(6);
end else begin
code(7);
morecode('Y');
end;
finally
code(8);
end;

Writing Readable Code – Formatting and Comments

I guess there are at least as many coding styles as there are programmers times programming languages. Many of us are conformists, and some of us have our highly personal style, but most of us probably fail at being 100% consistant. Code formatting can probably be considered a slightly religious area, so please feel free to disagree with me πŸ™‚

So – why bother to put any effort into the formatting anyways? It is not like the compiler care? I mean, apart from the obvious requirement that things should be reasonable readable and not just a jumble of code?

Layout matters. It is significantly easier to pick up code that is consistant in style than wading through spaghetti (or gnocchi) code where the indentation has not been considered much.

I have noticed that my personal style with regards to block formatting definitively is non-conformist. Whenever there is a do, if/then/else, with or without related begin/end, I will probably annoy the heck out of the conformists.

My formatting goals:
β€’ I want clear indication of flow
β€’ I want to separate conditions from actions
β€’ I want room to comment

In the examples below, I will place comments in the individualist examples to give an indication of why I choose to wrap/indent in this way rather than the “global standard”

Let’s start with the do in with and the for-loop (I do at times, albeit rarely, confess to the occasional use of with).


// Conformist
for i := 1 to 5 do Something;

for i := 1 to 5 do begin
Something;
SomethingMore;
end;

with SomeObject do begin
Something;
SomethingMore;
end;

// Individualist
for i := 1 to 5 // For selected range
do Something(i); // get stuff done

for i := 0 to (count - 1) // For every item
do begin
Something(i); // Step 1
SomethingMore(i); // Step 2
end;

with SomeObject // Focusing on this specific item,
do begin // I can add this comment for clarity
Something;
SomethingMore;
end;

The if/then/else statement have too many combinations to even get close to cover them all, so I am only going to do a handful.


// Conformist
if (SomeVariable = Condition) and (SomeOtherExpression) then
PerformSomeRoutine;

if (SomeVariable = Condition) and (SomeOtherExpression) then begin
Code;
MoreCode;
LotsOfCode;
end else OtherCode;

if Condition then DoFirstAlternative else DoSecondAlternative;

if FirstCondition then DoFirstAlternative
else if SecondCondition then DoSecondAlternative
else DoThirdAlternative;

// Individualist
if (SomeVariable = Condition) // I always put the condition alone
then PerformSomeRoutine;

if (SomeVariable = Condition) // Condition X found
and (SomeOtherExpression) // Requirement Y fulfulled
then begin // we can do our stuff
Code;
MoreCode;
LotsOfCode;
end
else OtherCode; // or we ended up with the alternative

if Condition
then DoFirstAlternative // The Condition compelled us to do this
else DoSecondAlternative; // Optionally, we explain the alternative

// Here I find myself doing many different approaches,
// depending on the complexity of the logic, and the number
// of alternatives, but with multiple nested if's, I tend to indent
if FirstCondition
then DoFirstAlternative // We are doing 1 because ...
else if SecondCondition
then DoSecondAlternative // We are doing 2 because ...
else DoThirdAlternative; // Otherwise, We are doing 3

// I might chose this approach if it is a long chain
if FirstCondition
then DoFirstAlternative // We are doing 1 because ...
else if SecondCondition
then DoSecondAlternative // We are doing 2 because ...
else DoThirdAlternative; // Otherwise, We are doing 3

C++/C# and Java people likes to rant about how ugly and verbose our begin / end‘s are, and I am sure I could rile myself up over their curly brace enclosures and some of the religion connected to those too – but I am not going to go there. However, there are a few things about our enclosures that may be worth thinking over.

Some like to dangle their end‘s at various indentation levels. I prefer to align it with the matching start of the block (do begin, then begin). Why? It makes it is easier to identify the code path in relation to the condition.

More enclosures doesn’t always mean better readability. Here is an example from Indy9(IdTunnelCommon.pas). Kudzu, I love you – but I disagree with your code style in this file πŸ™‚
(I know I am going to get flamed for this…^^)

procedure TReceiver.SetData(const Value: string);
var
CRC16: Word;
begin
Locker.Enter;
try
try
fsData := Value;
fiMsgLen := Length(fsData);
if fiMsgLen > 0 then begin
Move(fsData[1], (pBuffer + fiPrenosLen)^, fiMsgLen);
fiPrenosLen := fiPrenosLen + fiMsgLen;
if (fiPrenosLen >= HeaderLen) then begin
// copy the header
Move(pBuffer^, Header, HeaderLen);
TypeDetected := True;
// do we have enough data for the entire message
if Header.MsgLen <= fiPrenosLen then begin
MsgLen := Header.MsgLen - HeaderLen;
Move((pBuffer+HeaderLen)^, Msg^, MsgLen);
// Calculate the crc code
CRC16 := CRC16Calculator.HashValue(Msg^);
if CRC16 Header.CRC16 then begin
fCRCFailed := True;
end
else begin
fCRCFailed := False;
end;
fbNewMessage := True;
end
else begin
fbNewMessage := False;
end;
end
else begin
TypeDetected := False;
end;
end
else begin
fbNewMessage := False;
TypeDetected := False;
end;
except
raise;
end;

finally
Locker.Leave;
end;
end;

This little nugget very much bear the signs of being written for debugging, so I am not going to hold it to Kudzu for style (much). It includes a few nice examples of code that can be simplified, so it is all good.

So what do I do? β€’ I assume the except block was for debugging, but I’m taking it out. β€’ I might be changing the behaviour by setting two default values up top, but at least there is no doubt about their default value. β€’ But what is with the conditional assignments? Please! BooleanVariable := Expression; !! Don’t go “then true else false” on me! β€’ Reworked the comments.

Personally, I think it is more readable like this.

procedure TReceiver.SetData(const Value: string);
var
CRC16: Word;
begin
fbNewMessage := False;
TypeDetected := False;
Locker.Enter;
try
fsData := Value;
fiMsgLen := Length(fsData);
if fiMsgLen > 0
then begin // Data found, Check for Type
Move(fsData[1], (pBuffer + fiPrenosLen)^, fiMsgLen);
fiPrenosLen := fiPrenosLen + fiMsgLen;
TypeDetected := (fiPrenosLen >= HeaderLen);
if TypeDetected
then begin // We have a type, check header for content
Move(pBuffer^, Header, HeaderLen);
fbNewMessage := Header.MsgLen <= fiPrenosLen;
if fbNewMessage
then begin // we have enough data for the entire message
MsgLen := Header.MsgLen - HeaderLen;
Move((pBuffer+HeaderLen)^, Msg^, MsgLen);
CRC16 := CRC16Calculator.HashValue(Msg^); // Calculate the crc code
fCRCFailed := CRC16 Header.CRC16; // and check if it was correct
end;
end;
end;
finally
Locker.Leave;
end;
end;

Now, where is that flameproof tin foil dress…

Next: I’ll be rambling a little more on effective comments, naming and structure.

Edit – added after Paul’s Snippet Rewritten: Delphifreak likes to exit early. He also likes to initialize up top, and I totally agree – that is a good practice. Exit works well in this example where all the conditions are dependant on the previous one. IMO, things get a lot more hairy if there are multiple conditions with alternatives (if/then/else if). For this example, Exits are not bad.

procedure TReceiver.SetData(const Value: string);
var
CRC16: Word;
begin
fbNewMessage := False;
TypeDetected := False;
Locker.Enter;
try
fsData := Value;
fiMsgLen := Length(fsData);
if fiMsgLen = 0
then Exit;

// Data found, Check for Type
Move(fsData[1], (pBuffer + fiPrenosLen)^, fiMsgLen);
fiPrenosLen := fiPrenosLen + fiMsgLen;
TypeDetected := (fiPrenosLen >= HeaderLen);
if not TypeDetected
then Exit;

// We have a type, check header for content
Move(pBuffer^, Header, HeaderLen);
fbNewMessage := Header.MsgLen <= fiPrenosLen;
if not fbNewMessage
then Exit;

// we have enough data for the entire message
MsgLen := Header.MsgLen - HeaderLen;
Move((pBuffer+HeaderLen)^, Msg^, MsgLen);
CRC16 := CRC16Calculator.HashValue(Msg^); // Calculate the crc code
fCRCFailed := CRC16 Header.CRC16; // and check if it was correct

finally
Locker.Leave;
end;
end;