A generic cache

Update: Eric Grange suggested a change to TStringList that speeds it up significantly and place it well in front of TCache. I will update the article to reflect this in the end of the week. See the article comments for the details.

In my previous article about a generics based case statement for strings, I commited many sins towards the Church Of Pure Pascal 🙂

One of them was to not check for prior art. Sergey Antonov aka 0xffff did something similar back in April 2010 (Note: two links!).

So, to make penance for my lack of purity (and have a chance to sin some more), I have tried to take the good parts of the concept and create something less ugly, and more comfortable to use.

I rewrote the generic class and named it TCache. I kept the configurable key type, and I made the lookup value configurable as well. Basically, it all ends up as a thin wrapper around a dictionary, but I quite like the simple declaration you can achieve with this approach.

Example declaration

var
Cache : TCache;
i : Integer;
begin
if not Assigned(Cache)
then TCache
.Define(Cache, 0)
['alpha', 11]
['bravo', 22]
['charlie', 33]
['delta', 44]
['echo', 55]
['foxtrot', 66];

i := Cache.Lookup('charlie');

Remember that can be almost any type you like, including code (for the value part, at least).

Here is the class. Note that I also keep track of the index of the order each key/value was added. This could be removed. Also note that I still do dirty deeds, such as a dangerous cast. I guess I just suck at writing clean code ;).

unit GenericsCache;

/// Written by Lars Fosdal , December 5, 2010

interface
uses
SysUtils, Generics.Collections;

type
TCacheEntry = record
Value: T;
Index: Integer;
end;

TCache = class(TObjectDictionary<KeyT, TCacheEntry>)
private
FCache: TCache;
FDefaultValue: ValT;
function AddValue(const Id: KeyT; const Value: ValT): TCache;
protected
function ValidateId(Id: KeyT): KeyT; virtual;
public
class function Define(var Cache; const aDefaultValue:ValT): TCache;
function Lookup(const Id: KeyT):ValT;
function Index(const Id: KeyT):Integer;
property DefaultValue: ValT read FDefaultValue write FDefaultValue;
property Values[const Id: KeyT; const Value: ValT]: TCache read AddValue; default;
end;

TCaseStringCache = class(TCache)
function ValidateId(Id: String): String; override;
end;

implementation

{ TCache }

function TCache.AddValue(const Id: KeyT; const Value: ValT): TCache;
var
Rec : TCacheEntry;
begin
Result := Self;
Rec.Value := Value;
Rec.Index := Count;
Add(ValidateId(Id), Rec);
end;

class function TCache.Define(var Cache; const aDefaultValue: ValT): TCache;
begin
Result := Create;
Result.FCache := Result;
Result.DefaultValue := aDefaultValue;
TCache(Cache) := Result;
end;

function TCache.Index(const Id: KeyT): Integer;
var
Rec : TCacheEntry;
begin
if TryGetValue(ValidateId(Id), Rec)
then Result := Rec.Index
else Result := -1;
end;

function TCache.Lookup(const Id: KeyT): ValT;
var
Rec : TCacheEntry;
begin
if FCache.TryGetValue(ValidateId(Id), Rec)
then Result := Rec.Value
else Result := DefaultValue;
end;

function TCache.ValidateId(Id: KeyT): KeyT;
begin
Result := Id;
end;

{ TCaseStringCache }

function TCaseStringCache.ValidateId(Id: String): String;
begin
Result := LowerCase(Id);
end;

end.

I wrote a simple benchmark, testing different ways to use this, and also comparing it to do String2Index / Case -type lookup mechanisms as well as if/then/else, and the ugly method from my previous article. Several people suggested using a string to index / case approach. I have also used that many times. The painful part of strings to index, is that if you change the order of the strings, you also have to change the indices. TCache makes the index entirely optional, since the string is the index – if you see what I mean.

See below for the test code.

The Good, The Bad, and the Ugly.

The test uses GetTickCount and 5.000.000 iterations, for each method, repeated 10 times, with 12 strings (the numbers are consistant at 6 strings as well – except that string to index will be slightly faster) and the process priority was set to High to avoid other parts of the PC affecting the numbers. I weighted the results towards the results for the if/then/else. So Perf tells you how many times slower than if/then/else each test was.

Method Perf AvgRun Comment
TStringSwitch 52.94 35340.7 The ugly was really ugly performance-wise as well.
AnsiIndexTextFunc 12.40 8277.3 String to Index, then case/anon.method is no speed demon either.
StringIndex 12.05 8041.7 Jolyon’s variant is around the same speed.
AnsiIndexText 12.03 8032.6 If you remove anon.methods, the impact is not huge.
TStringList 6.94 4631.6 Using a pre-created sorted string list is nearly twice the speed of AnsiIndexText.
TCacheProc 3.70 2471.1 Cool, TCache and anon.methods are nearly twice the speed of a string list.
TCacheFunc 3.71 2475.8 No signficant difference between a procedure and function.
TCacheFuncStack 3.72 2480.5 Passing the anon.method by stack doesn’t cost much either.
TCaseStringCache 3.12 2085.6 Case insensitive string to string saves some time over using anon.methods.
TCacheString 2.93 1957.9 So does eliminating the LowerCase function. TCache with string/string is 3 times slower than if/then/else, but it is 4 times faster than AnsiIndexText, and more than twice as fast as a string list.
If/then/else 1 667.6 You can’t beat this. You also can’t enjoy maintaining it.
LastKey 0.39 260.4 No surprise that indexed array lookups are fast. Yes, I know I am not looking up the same strings, but the cost is the same.

If you need to repeatedly do lookup by strings, you can benefit from using something like TCache. Also – for the AnsiIndexText – it does a sequential search for the match, hence if the list is long, or the later entries are more commonly used – it will degrade performancewise. Without having dissected TDictionary in detail, I would believe that it’s hash table will allow TCache to remain relativly constant in performance, even if you add thousands of entries.

You could also shave off some more by eliminating the ValidateId methods.

Here is the test program (which also use the unit from my previous article).

program TestGenericsSwitch;
{$apptype Console}
uses
ExceptionLog,
Windows,
Classes,
SysUtils,
StrUtils,
GenericsCache in 'GenericsCache.pas',
GenericsSwitch in 'GenericsSwitch.pas';

{$define Twelve} // remove this to run with 6 strings

const
{$ifndef Twelve}
Elements = 6;
{$else}
Elements = 12;
{$endif}
TestCount = 5000000;
Keys : Array[0..Elements] of String
= ('alpha','bravo', 'charlie', 'delta', 'echo', 'foxtrot',
{$ifdef Twelve}
'golf', 'hotel', 'india', 'juliet', 'kilo', 'lima',
{$endif}
'what');

type
TFunc = reference to function:String;
TProc = reference to procedure;

function RandomKey:String;
begin
Result := Keys[Random(Elements + 1)];
end;

function AssignTest:String;
var
ix : Integer;
s: String;
begin
for ix := 0 to TestCount - 1
do TStringSwitch.CaseOf(RandomKey)
['alpha', procedure begin
s := 'Definitively any case';
end]
['bravo', procedure begin
s := 'B all you can B';
end]
['charlie', procedure begin
s := 'Checkpoint C';
end]
['delta', procedure begin
s := 'Checkpoint D';
end]
['echo', procedure begin
s := 'Checkpoint E';
end]
['foxtrot', procedure begin
s := 'Checkpoint F';
end]
{$ifdef Twelve}
['golf', procedure begin
s:= 'golf';
end]
['hotel',procedure begin
s:= 'hotel';
end]
['india',procedure begin
s:= 'india';
end]
['juliet', procedure begin
s:= 'juliet';
end]
['kilo', procedure begin
s:= 'kilo';
end]
['lima', procedure begin
s:= 'lima';
end]
{$endif}
.ElseCase(procedure begin
s := 'Else what?';
end)
.EndCase;
Result := s;
end;

function AssignTestIf:String;
var
ix : Integer;
s, t: String;
begin
for ix := 0 to TestCount - 1 do
begin
t := LowerCase(RandomKey);
if t = 'alpha' then
s := 'Definitively any case'
else if t = 'bravo' then
s := 'B all you can B'
else if t = 'charlie' then
s := 'Checkpoint C'
else if t = 'delta' then
s := 'Checkpoint D'
else if t = 'echo' then
s := 'Checkpoint E'
else if t = 'foxtrot' then
s := 'Checkpoint F'
{$ifdef Twelve}
else if t = 'golf' then
s := 'golf'
else if t = 'hotel' then
s := 'hotel'
else if t = 'india' then
s := 'india'
else if t = 'juliet' then
s := 'juliet'
else if t = 'kilo' then
s := 'kilo'
else if t = 'lima' then
s := 'lima'
{$endif}
else
s := 'Else what?';
end;
Result := s;
end;

function AssignTestS2I:String;
var
ix : Integer;
s: String;
begin
for ix := 0 to TestCount - 1
do case AnsiIndexText(RandomKey, ['alpha', 'bravo', 'charlie', 'delta', 'echo', 'foxtrot'
{$ifdef Twelve}
, 'golf', 'hotel', 'india', 'juliet', 'kilo', 'lima'
{$endif}
]) of
0 : s := 'Definitively any case';
1 : s := 'B all you can B';
2 : s := 'Checkpoint C';
3 : s := 'Checkpoint D';
4 : s := 'Checkpoint E';
5 : s := 'Checkpoint F';
{$ifdef Twelve}
6 : s := 'golf';
7 : s := 'hotel';
8 : s := 'india';
9 : s := 'juliet';
10 : s := 'kilo';
11 : s := 'lima';
{$endif}
else s := 'Else what?';
end;
Result := s;
end;

function StringIndex(const aString: string; const aCases: array of string;
const aCaseSensitive: Boolean): Integer;
begin
if aCaseSensitive then
begin
for Result := 0 to Pred(Length(aCases)) do
if ANSISameText(aString, aCases[Result]) then
EXIT;
end
else
begin
for Result := 0 to Pred(Length(aCases)) do
if ANSISameStr(aString, aCases[Result]) then
EXIT;
end;

Result := -1;
end;

function AssignStringIndexFunc:String;
var
ix : Integer;
func : TFunc;
begin
for ix := 0 to TestCount - 1
do case StringIndex(RandomKey, ['alpha', 'bravo', 'charlie', 'delta', 'echo', 'foxtrot'
{$ifdef Twelve}
, 'golf', 'hotel', 'india', 'juliet', 'kilo', 'lima'
{$endif}], false) of
0 : func := function:String begin
Result := 'Definitively any case';
end;
1 : func := function:String begin
Result := 'B all you can B';
end;
2 : func := function:String begin
Result := 'Checkpoint C';
end;
3 : func := function:String begin
Result := 'Checkpoint D';
end;
4 : func := function:String begin
Result := 'Checkpoint E';
end;
5 : func := function:String begin
Result := 'Checkpoint F';
end;
{$ifdef Twelve}
6 : func := function:String begin
Result:= 'golf';
end;
7 : func := function:String begin
Result:= 'hotel';
end;
8 : func := function:String begin
Result:= 'india';
end;
9 : func := function:String begin
Result:= 'juliet';
end;
10: func := function:String begin
Result:= 'kilo';
end;
11: func := function:String begin
Result:= 'lima';
end;
{$endif}
else func := function:String begin
Result := 'Else what?';
end;
end;
Result := Func;
end;

function AssignTestS2IFunc:String;
var
ix : Integer;
func : TFunc;
begin
for ix := 0 to TestCount - 1
do case AnsiIndexText(RandomKey, ['alpha', 'bravo', 'charlie', 'delta', 'echo', 'foxtrot'
{$ifdef Twelve}
, 'golf', 'hotel', 'india', 'juliet', 'kilo', 'lima'
{$endif}]) of
0 : func := function:String begin
Result := 'Definitively any case';
end;
1 : func := function:String begin
Result := 'B all you can B';
end;
2 : func := function:String begin
Result := 'Checkpoint C';
end;
3 : func := function:String begin
Result := 'Checkpoint D';
end;
4 : func := function:String begin
Result := 'Checkpoint E';
end;
5 : func := function:String begin
Result := 'Checkpoint F';
end;
{$ifdef Twelve}
6 : func := function:String begin
Result:= 'golf';
end;
7 : func := function:String begin
Result:= 'hotel';
end;
8 : func := function:String begin
Result:= 'india';
end;
9 : func := function:String begin
Result:= 'juliet';
end;
10: func := function:String begin
Result:= 'kilo';
end;
11: func := function:String begin
Result:= 'lima';
end;
{$endif}
else func := function:String begin
Result := 'Else what?';
end;
end;
Result := Func;
end;

function AssignCaseStringCache:String;
var
ix : Integer;
s: String;
Cache : TCaseStringCache;
begin
TCaseStringCache.Define(Cache,
'Else what?')
['alpha', 'Definitively any case']
['bravo', 'B all you can B']
['charlie', 'Checkpoint C']
['delta', 'Checkpoint D']
['echo', 'Checkpoint E']
['foxtrot', 'Checkpoint F']
{$ifdef Twelve}
['golf', 'golf']
['hotel', 'hotel']
['india', 'india']
['juliet', 'juliet']
['kilo', 'kilo']
['lima', 'lima']
{$endif};
for ix := 0 to TestCount - 1
do s := Cache.Lookup(RandomKey);
Result := s;
end;

function AssignCacheString:String;
var
ix : Integer;
s: String;
Cache : TCache;
begin
TCache
.Define(Cache, 'Else what?')
['alpha', 'Definitively any case']
['bravo', 'B all you can B']
['charlie', 'Checkpoint C']
['delta', 'Checkpoint D']
['echo', 'Checkpoint E']
['foxtrot', 'Checkpoint F']
{$ifdef Twelve}
['golf', 'golf']
['hotel', 'hotel']
['india', 'india']
['juliet', 'juliet']
['kilo', 'kilo']
['lima', 'lima']
{$endif};
for ix := 0 to TestCount - 1
do s := Cache.Lookup(RandomKey);
Result := s;
end;

function AssignCacheProc:String;
var
ix : Integer;
s: String;
Cache : TCache;
begin
TCache.Define(Cache,
procedure begin
s := 'Else what?';
end)
['alpha', procedure begin
s := 'Definitively any case';
end]
['bravo', procedure begin
s := 'B all you can B';
end]
['charlie', procedure begin
s := 'Checkpoint C';
end]
['delta', procedure begin
s := 'Checkpoint D';
end]
['echo', procedure begin
s := 'Checkpoint E';
end]
['foxtrot', procedure begin
s := 'Checkpoint F';
end]
{$ifdef Twelve}
['golf', procedure begin
s:= 'golf';
end]
['hotel', procedure begin
s:= 'hotel';
end]
['india', procedure begin
s:= 'india';
end]
['juliet', procedure begin
s:= 'juliet';
end]
['kilo', procedure begin
s:= 'kilo';
end]
['lima', procedure begin
s:= 'lima';
end]
{$endif};

for ix := 0 to TestCount - 1
do Cache.Lookup(RandomKey)();
Result := s;
end;

function AssignCacheFuncStack:String;
var
ix : Integer;
s: String;
Cache : TCache;
begin
TCache.Define(Cache,
function:String begin
Result := 'Else what?';
end)
['alpha', function:String begin
Result := 'Definitively any case';
end]
['bravo', function:String begin
Result := 'B all you can B';
end]
['charlie', function:String begin
Result := 'Checkpoint C';
end]
['delta', function:String begin
Result := 'Checkpoint D';
end]
['echo', function:String begin
Result := 'Checkpoint E';
end]
['foxtrot', function:String begin
Result := 'Checkpoint F';
end]
{$ifdef Twelve}
['golf', function:String begin
Result := 'golf';
end]
['hotel',function:String begin
Result := 'hotel';
end]
['india',function:String begin
Result := 'india';
end]
['juliet', function:String begin
Result := 'juliet';
end]
['kilo', function:String begin
Result := 'kilo';
end]
['lima', function:String begin
Result := 'lima';
end]
{$endif};

for ix := 0 to TestCount - 1
do s := Cache.Lookup(RandomKey)();
Result := s;
end;

function AssignCacheFunc:String;
var
ix : Integer;
s: String;
func: TFunc;
Cache : TCache;
begin
TCache.Define(Cache,
function:String begin
Result := 'Else what?';
end)
['alpha', function:String begin
Result := 'Definitively any case';
end]
['bravo', function:String begin
Result := 'B all you can B';
end]
['charlie', function:String begin
Result := 'Checkpoint C';
end]
['delta', function:String begin
Result := 'Checkpoint D';
end]
['echo', function:String begin
Result := 'Checkpoint E';
end]
['foxtrot', function:String begin
Result := 'Checkpoint F';
end]
{$ifdef Twelve}
['golf', function:String begin
Result := 'golf';
end]
['hotel',function:String begin
Result := 'hotel';
end]
['india',function:String begin
Result := 'india';
end]
['juliet', function:String begin
Result := 'juliet';
end]
['kilo', function:String begin
Result := 'kilo';
end]
['lima', function:String begin
Result := 'lima';
end]
{$endif};

for ix := 0 to TestCount - 1
do begin
func := Cache.Lookup(RandomKey);
s := func;
end;
Result := s;
end;

function AssignStringList:String;
var
ix, fx : Integer;
s: String;
func : TFunc;
obj : TObject absolute func;
StrList : TStringList;
begin
StrList := TStringList.Create;

StrList.AddObject('alpha', TObject(function:String begin
Result := 'Definitively any case';
end));
StrList.AddObject('bravo', TObject(function:String begin
Result := 'B all you can B';
end));
StrList.AddObject('charlie', TObject(function:String begin
Result := 'Checkpoint C';
end));
StrList.AddObject('delta', TObject(function:String begin
Result := 'Checkpoint D';
end));
StrList.AddObject('echo', TObject(function:String begin
Result := 'Checkpoint E';
end));
StrList.AddObject('foxtrot', TObject(function:String begin
Result := 'Checkpoint F';
end));
//{$ifdef Twelve}
StrList.AddObject('golf', TObject(function:String begin
Result := 'golf';
end));
StrList.AddObject('hotel',TObject(function:String begin
Result := 'hotel';
end));
StrList.AddObject('india',TObject(function:String begin
Result := 'india';
end));
StrList.AddObject('juliet', TObject(function:String begin
Result := 'juliet';
end));
StrList.AddObject('kilo', TObject(function:String begin
Result := 'kilo';
end));
StrList.AddObject('lima', TObject(function:String begin
Result := 'lima';
end));
//{$endif}

StrList.Sorted := True;

for ix := 0 to TestCount - 1
do begin
fx := StrList.IndexOf(RandomKey);
if fx >= 0
then begin
obj := StrList.Objects[fx];
s := func;
end
else s := 'ElseWhat';
end;
Result := s;
end;

function TimeIt(proc: TFunc; name:String):Integer;
var
start : Cardinal;
LastLookup : String;
oldSeed : Integer;
begin
OldSeed := RandSeed;
start := GetTickCount;
LastLookup := Proc;
Result := GetTickCount - start;
Writeln(Format('%-18s %6d - %s', [name, Result, LastLookup]));
RandSeed := OldSeed;
end;

function LastKey:String;
var
ix : Integer;
s : String;
OldSeed : Integer;
begin
OldSeed := RandSeed;
for ix := 0 to TestCount - 1
do s := RandomKey;
RandSeed := OldSeed;
Result := s;
end;

procedure Test;
const
Repeats = 10;
var
ix : Integer;
begin
for ix := 0 to Repeats - 1
do begin
Randomize;
// RandSeed := 2003112605;
Writeln;
Writeln(Format('strings=%d, repeats=%d, seed=%d', [Elements, TestCount, RandSeed]));
TimeIt(AssignTest,'TStringSwitch');
TimeIt(AssignTestS2IFunc, 'AnsiIndexText Func');
TimeIt(AssignStringIndexFunc, 'StringIndex');
TimeIt(AssignTestS2I, 'AnsiIndexText');
TimeIt(AssignStringList,'TStringList');
TimeIt(AssignCacheProc, 'TCache Proc');
TimeIt(AssignCacheFunc, 'TCache Func');
TimeIt(AssignCacheFuncStack, 'TCache Func Stack');
TimeIt(AssignCaseStringCache, 'TCaseStringCache');
TimeIt(AssignCacheString, 'TCache String');
TimeIt(AssignTestIf, 'If/then/else');
TimeIt(LastKey, 'LastKey');
end;
end;

begin
try
Write('Press Enter to start: ');
Readln;

Test;

finally
Writeln;
Write('Press Enter: ');
Readln;
end;
end.

5 thoughts on “A generic cache

  1. For TStringList, you forgot to include the fix I mentioned in my article, subclass TStringList, override CompareStrings withfunction TCIStringList.CompareStrings(const S1, S2: string): Integer;begin Result:=CompareStr(S1, S2);end;”CI” for case insensitive, the dictionary classes use a binary hash on the string, so the TStringList equivalent is a case-insensitive comparison.Your code is doing full-blown case-sensitive Unicode comparisons, which as I mentioned in my post are a lot more complex.And merely setting the TStringList to CaseSensitive=True isn't enough, you need the override fix.

    Like

  2. @Eric – You are right, that makes a massive difference! I'll update the article at the end of the week. Too much to do this week. I wonder, is it possible to optimize the hash for the dictionary, further? Hash tables should be faster than binary searches on a list, or what?

    Like

  3. Doubtless the hash could be improved, the TDictionnary uses HashLittle which is probably too complex for such a case. But a simpler hash would increase the risk of collisions, which could cripple the reliability of the dictionary performance in practical use.However in the grand scheme of things, I'm not convinced a hash can “win” in such a case, the strings are too short, and the complexity of a generic hash dictionary too high. It has to face a O(Ln n) binary search, which scales very well for both low and high entry counts.A hash dictionary needs both very complex comparisons and dynamic content, but here the content is static, so all the extra data structures of the dictionary, tailored for “dynamicity”, are just a waste of RAM and CPU.If you were to add/remove strings constantly f.i., the TStringList underlying list structure would make it quite unsuitable. But then, a binary tree could probably be a better choice than a dictionary if the string are still of the same complexity.

    Like

  4. A short note, from just reading the code : If I'm not mistaken, the FCache variable can be removed.And perhaps it's a good idea to let Lookup return a boolean (and return the value via an out variable), so you don't have to check for DefaultValue results (and can thus remove that too, unless you really need that sort of functionality).Oh, and some might call it dirty, but I rather like how you (ab)use the indexed default property to function as a repeatable setter!

    Like

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.