Review: Framework Design Guidelines, 2nd Ed. by Krzysztof Cwalina and Brad Abrams

Subtitle: “Conventions, Idioms, and Patterns for Reusable .NET libraries”. Highly relevant to Delphi Prism users, but you can very well eliminate that .NET part and apply this to writing libraries for Win32 using Delphi 2009 as well (or any prior Delphi version, for that matter).

When you write code with the intent of making it public and reusable, there are certain aspects of designing classes, interfaces, types and so forth, where we really need experience to come up with easy to understand and easy to use code. This book delivers 6 years worth of wisdom and experience from the .NET teams and we can all do well by picking their brains for reusable knowledge.

The book present a sensible set of lessons learned and goals to strive for. Without lecturing, and always exemplifying – we get a step by step journey through the different aspects of designing a framework. It is concisely written and easy to read, and it is not without humor.

The book is richly commented by the developers themselves, embellishing on the guidelines, sometimes including design regrets and mistakes made, and sometimes also hinting towards why you might want to break a guideline under certain conditions.

As Anders Hejlsberg write in the foreword: “The guidelines have served us well through three versions of the .NET Framework and numerous smaller projects, and they are guiding the development of the next generation of APIs for the Microsoft Windows operating system”.

I won’t delve into all the details of the content, but it covers the basics of how to design a framework by being roleplaying the consumer as well as testing it on your peers, good naming, type design, member design, extensibility, exceptions, as well as usage guidelines. It rounds off with a collection of common patterns such as aggregate components, async patterns, dependency properties, dispose pattern, factories, and many more. There is also a brief C# coding style convention section and a tutorial on using FxCop to enforce the framework design guidelines.

The book is not C# centric, but embellish the fact of .NET being a multi-lingual platform, and remind you to avoid certain constructs that may be too tightly bound to a specific language.

“Framework Design Guidelines (2nd Ed.)” by Krzysztof Cwalina and Brad Abrams (Addison-Wesley ISBN-13: 978-0-321-54561-9) falls under the collection of books that I wish I had available to read when I started out coding. Together with “Code Complete (2nd Ed.)” by Steve McConnell, it should be mandatory reading for every developer, regardless of language or development niche.

Highly recommended.

Windows 7 – How it was built

Very interesting read at http://blogs.msdn.com/e7/archive/2008/10/15/engineering-7-a-view-from-the-bottom.aspx, where Larry Osterman describes some of the differences between the Vista project and the Windows 7 project.

A part that made me chuckle, was: “This is where one of the big differences between Vista and Windows 7 occurs: In Windows 7, the feature crew is responsible for the entire feature. The crew together works on the design, the program manager(s) then writes down the functional specification, the developer(s) write the design specification and the tester(s) write the test specification. The feature crew collaborates together on the threat model and other random documents. Unlike Windows Vista where senior management continually gave “input” to the feature crew, for Windows 7, management has pretty much kept their hands off of the development process.

What I find most interesting is how the article ends – where the choice of words indicate that Windows 7 already is feature complete.

repeat Mistake until LessonLearned;

If we indeed do learn from our mistakes, I should have been a genius by now. Instead I feel silly for forgetting to do the most obvious checks and the most trivial safety measures. And I don’t only do it once – I do it all the time!

Coding starts with an idea, some thinking about implications, hopefully some deliberation on requirements, possibly the choice of some pattern for implementation and maybe even some kind of plan for testing that it all work according to plan. Even if we put all of the above into action, we will most likely repeat some common coding mistakes.

“[A]nd then it occurred to me that a computer is a stupid machine with the ability to do incredibly smart things, while computer programmers are smart people with the ability to do incredibly stupid things. They are, in short, a perfect match.”
~Bill Bryson

Here are some of my more typical causes of involuntary visits in the debugger.

I know that many of my mistakes stem from the brain being ahead of the fingers in the typing process. I have the idea all figured out in my head, and now I want to see it in action. In my desire to “save time” – I convince myself that I can add the checks later, but for now I can get away with putting up just the basic scaffolding and leave the safety rails for later. Naturally, when later comes – I am already on a different problem, taking the same shortcuts.

It boils down to bad habits.

Bart Roozendaal wrote about pre and post-conditions yesterday, and although these may save my bacon to a certain degree – it still boils down to the same issue: i.e. my failure to take time to actually implement the safety measures immediately, and instead postponing them until later.

So… what should I do to catch my most common mistakes? Pick up some good habits and stick with them. Time spent early will be saved ten-fold later.

Here are some suggestions:

• Forgetting to check for assignment (NIL pointer) and Invalid typecasts
Assert. Assert. Assert. Make sure that you qualify every reference before you use it.
In the context of Bart’s post, maybe pre/post requirements would be even more useful if they would generate tips (lightweight hints?) at the location where the method (which have the requirements) is called.

• Dangling pointer (Non-NIL, but invalid)
FreeAndNil(ObjectRef); Just do it. Avoid passing and keeping non-volatile copies of the reference. If you really need the reference, implement some sort of common storage that encapsulate the object(s) you need to reference i.e. create a secure reference to a reference. Keep in mind that the such common storage should exist before and after the life of which ever object it references.

• Index/access out of string/array bound
Take a minute and check the boundary cases – what happens at the minimum and maximum index? Will index +/- length exceed the boundary?

• Numeric value or enumeration value out of range, Boolean evaluation mistake, Numeric evalution mistake (implicit cast accuracy problem)
Be explicit. Adding more brackets and explicit casts will not slow down your code. Break down the expression logic into smaller parts for clarity if you need to.

One book that is a veritable goldmine of healthy coding advice is Code Complete (2nd ed.) by Steve McConnell. The book’s advice is generally language agnostic, and you will most likely recognize much of the advice given as common sense – but the real value is to use the book to remind yourself of using that common sense. If I had read this book as a student, I would have saved myself a lot of trouble.

If you don’t wanna shop for books – do as Lars D and explain the code to your favorite Teddybear.

“What I mean is that if you really want to understand something, the best way is to try and explain it to someone else. That forces you to sort it out in your own mind. And the more slow and dim-witted your pupil, the more you have to break things down into more and more simple ideas. And that’s really the essence of programming. By the time you’ve sorted out a complicated idea into little steps that even a stupid machine can deal with, you’ve certainly learned something about it yourself.”
~Douglas Adams

Optimization – Understand your hardware

Paul (of snippet formatting fame) mentioned that he was comfortable with his coding style, but that he would like to learn more about optimizing his code.

The generally recognized and recommended approach to optimization is that you never optimize code until you really must. The reasoning behind this is that you need actual running code before you can pinpoint the real bottlenecks and there is no point in spending time on optimizing code that is run once every blue moon, since you hardly will have any significant gain in performance.

The Delphi compiler is very good at optimizing code. Most of the time, you would be hard pressed to write better assembly code than what the compiler produces.

However, we are the ones that put down the premises for how it will organize and access data. If we tell the compiler to work the rows or the columns, the compiler generally will have to do what we say. We decide data size, data organization, and data access. The best kind of optimization we can do is to think hard about how we organize and access our data.

Once upon a time, before computers became a household appliance, around the time when the computing jungle still reeked of digital dino-dung, I decided to pursue a career in designing microprocessors. This was before I realized that it was – relatively speaking – cheaper to make mistakes in software than in hardware.

I learned some useful lessons on accessing data the hardware way, though. Accessing memory is costly! Very costly! You should avoid it if you can! 🙂

The ridiculously simple example is that if you store your data in rows, it will potentially be very costly to access it column by column. If you need to process large amounts of data in memory (array / matrix / SSD type operations), the size and alignment of your data matter more than you think.

Here is a small project that you can play around with to see effects of data alignment. The demo allocates 10.000 items at the size of 512 bytes as one block, and adds an offset (0 .. 25) from the original block start to the item address. Each block gets a key value from a Random double; Random seed is reset for each offset iteration. Each fill and sort is repeated 5 times to try to avoid random interference from the OS etc. The “read and write” of the data for sorting, doesn’t actually move all the data, just the key (Double) and a simulated move of the block size into a local buffer. A fullblown sort with data move would add a few move moves, which would add more boundary overhead.


This is average fill+sort time (raw TDateTime value). Notice the difference on 8 byte boundaries.

Well known C++ guru, Herb Sutter held a presentation “Machine Architecture: Things Your Programming Language Never Told You” for the North West C++ Users Group. He can convey this knowledge far more entertaining and far more effective than I can. BTW, he also have several interesting articles on concurrency.

To quote the notes on the video: High-level languages insulate the programmer from the machine. That’s a wonderful thing — except when it obscures the answers to the fundamental questions of “What does the program do?” and “How much does it cost?” The C++/C#/Java programmer is less insulated than most, and still we find that programmers are consistently surprised at what simple code actually does and how expensive it can be — not because of any complexity of a language, but because of being unaware of the complexity of the machine on which the program actually runs. This talk examines the “real meanings” and “true costs” of the code we write and run especially on commodity and server systems, by delving into the performance effects of bandwidth vs. latency limitations, the ever-deepening memory hierarchy, the changing costs arising from the hardware concurrency explosion, memory model effects all the way from the compiler to the CPU to the chipset to the cache, and more — and what you can do about them.

Herb Sutter is an excellent speaker, so find a quiet spot, pull out the popcorn, and set aside 1 hour and 56 minutes for a very entertaining and very enlightening session on the true cost of accessing memory.

Video link: http://video.google.co.uk/videoplay?docid=-4714369049736584770

Writing Readable Code – Paul’s Snippet Rewritten

Five brave people have contributed rewritten versions of Paul’s code. Let’s take a look at their approach.

I reformatted Paul’s example using two spaces for every indent level (after then, begin, else, etc. ), hoping that this is the way he originally wrote it.

IMO, Paul’s formatting (or my assumptions about his formatting) does not reflect the true code path, and trying to decipher the code paths become unnecessarily hard.

  try
if ConditionA then
Code(1)
else if ConditionB then
if ConditionC then begin
if ConditionD then
code(2);
code(3);
end
else if ConditionE then begin
code(4);
morecode('X');
if ConditionF then
code(5)
else if ConditionG then
code(6);
end else begin
code(7);
morecode('Y');
end;
finally
code(8);
end;

Here is a small exercize. If we say that the string “ABCDEFG” are all conditions TRUE, and the string “abcdefg” are all conditions false:
• What codes are run by “ABCDEFG”?
• What codes are run by “aBcdefg”?
• What string(s) would make code(6) run?

TS contributed a very nicely formatted version, which is effective in guiding us through the potential code paths. I like his style.

  try
if ConditionA then
Code(1)
else if ConditionB then
if ConditionC then
begin
if ConditionD then
code(2);
code(3);
end // without the semi-colon
else if ConditionE then
begin
code(4);
morecode('X');
if ConditionF then
code(5)
else if ConditionG then
code(6);
end
else
begin
code(7);
morecode('Y');
end;
finally
code(8);
end;

SS didn’t quite manage to reflect the flow in the code and his formatting is similar to Paul’s code.

  try
if ConditionA then
Code(1)
else if ConditionB then
if ConditionC then
begin
if ConditionD then
code(2);
code(3);
end //; I assume this semi-colon has to go
else if ConditionE then
begin
code(4);
morecode('X');
if ConditionF then
code(5)
else if ConditionG then
code(6);
end
else
begin
code(7);
morecode('Y');
end;
finally
code(8);
end;

Jolyon scores high on restructuring and simplifying, but he made one mistake in his change. Under which condition will his code behave differently form the original?

  procedure WhenC;
begin
if ConditionD then
code(2);
code(3);
end;

procedure WhenE;
begin
code(4);
morecode('X');
if ConditionF then
code(5)
else if ConditionG then
code(6);
end;

begin
try
if ConditionA then
Code(1)
else if NOT ConditionB then
EXIT;

if ConditionC then
WhenC
else if ConditionE then
WhenE
else
begin
code(7);
morecode('Y');
end;
finally
code(8);
end;

AO goes even further than Jolyon in reformatting and comments: As I believe that proper formatting is not the only way to increase readability of code, I also refactored it a bit to reduce nesting where possible. If this were an actual code, I would have probably gone even further and introduced separate routines for the individual blocks, depending on their complexity.

  try
{ This comment explains why ConditionA is handled first. }
if ConditionA then begin
Code(1);
Exit;
end;

{ This comment explains why the aggregate of ConditionB and ConditionC is handled next. }
if ConditionB and ConditionC then begin
{ This comment explains why ConditionD is relevant only in this block. }
if ConditionD then
code(2);

code(3);
Exit;
end;

{ This comment explains why ConditionE is handled next. }
if ConditionE then begin
code(4);
morecode('X');

{ This comment explains why ConditionF and ConditionG are relevant only in this block. }
if ConditionF then
code(5)
else if ConditionG then
code(6);

Exit;
end;

{ This comment explains the default handling. }
code(7);
morecode('Y');
finally
{ This comment explains why the following code must execute no matter what. }
code(8);
end;

This is readable, but personally I think AO went a bit too far. Like Jolyon, he also missed a structural detail in the refactoring and broke the code. Which condition state(s) will cause the code to misbehave?

I am divided on the use of Exit. It can add clarity, but it can also be a big problem as you leave a lot of code “dangling”. If you decide to move the exit – you have to be really careful to ensure that any states that suddenly move in or out of scope behave correctly. If there is a nesting church and a chunking church, I’m probably a “Nestorian”.

I do agree that refactoring is a valuable tool to clarify and simplify, and we should make an effort to break down our code into manageable blocks, but in this particular case it probably isn’t necessary.

Another thing: I avoid using {curly braces} for in-code commentary and use // instead. Why? Because if you need to comment out code containing select count(*) from table, those curlies will work where the (* comment *) fail. Should CodeGear add support for comment nesting? I don’t know…

Anyways…
Here’s how I would format the example. This is very similar to TS’s example, except that I showel all the reserved words to the left side to leave the logic more visible and commentable, but I also add indentation to the innermost conditional code.

  try
if ConditionA
then Code(1)
else if ConditionB
then if ConditionC
then begin
if ConditionD
then code(2);
code(3);
end
else if ConditionE
then begin
code(4);
morecode('X');
if ConditionF
then code(5)
else if ConditionG
then code(6);
end
else begin
code(7);
morecode('Y');
end;
finally
code(8);
end;

MJ adds a contribution with the following comment: “To be sure not to create any future bugs you should consider adding a begin end section after every if statement even if it is not required, but that will make the code more unreadable.”

  try
if ConditionA then
Code(1)
else if ConditionB then
begin
if ConditionC then
begin
if ConditionD then
code(2);
code(3);
end
else if ConditionE then
begin
code(4);
morecode('X');
if ConditionF then
code(5)
else if ConditionG then
code(6);
end
else
begin
code(7);
morecode('Y');
end;
end;
finally
code(8);
end;

Good Points! In all honesty, I also screwed up the code blocks on first try. Conditional code will bite you if you are not very very careful. You should indeed think about what may happen if you need to add more code and/or conditions. Personally, I don’t think a few more enclosures makes the code less readable. Here is how I would be more explicit in the use of enclosures to make the code less ambiguous.

  try
if ConditionA
then Code(1)
else begin
if ConditionB
then begin
if ConditionC
then begin
if ConditionD
then code(2);
code(3);
end
else begin
if ConditionE
then begin
code(4);
morecode('X');
if ConditionF
then code(5)
else if ConditionG
then code(6);
end
else begin
code(7);
morecode('Y');
end;
end;
end;
end;
finally
code(8);
end;

There is no one true correct way of formatting.
The point I am trying to make is that code structure matter for understanding the code at first glance, and we should be mindful about how we lay it out. We should strive for consistency, but always keep clarity and unambiguity as priority one. Bend your rules, if you need to.

Paul, Thank you for creating such a devious code snippet!

P.S. If you haven’t figured out the answers yet, load up your Delphi and run all the examples in StructureDemo.dpr.
No peeking until you have some suggestions! 🙂
(Hint: else starts a new block)

Edit: Added an example of using Exits instead of nesting to the initial post on formatting. It would be interesting to see DelphiFreak have a go at Paul’s snippet.

Writing readable code – Comment for Context

While waiting for more suggestions to Paul’s snippet, and while weaving and dodging through the virtual fireballs from the previous posts on code formatting and comments, I fearlessly continue my Don Quixote Crusade for readable code. I am not quite done with comments yet.

It is said that “Real programmers don’t write comments – It was hard to write – it should be hard to read”. That doctrine is way overdue for deletion.

Comments are the context frame of our code.

Comments are like garlic.
Too little and the result is bland and unpalatable, too much and it stinks things up. Comments should only in rare occasions assume that the reader is too dumb to read the code properly, so as a general rule – we should not rewrite our code in plain english, line for line. Such an approach tend to fog up [sic] the source, and it becomes a drag to maintain (Yeah, I know… garlic metaphors stink…).

Comments are like news headlines.
// File ready for saving!
// No more filehandles, says OS
// Raises exception for World+dog
They should be short, concise and to the point. In good tabloid tradition, they should also only touch on the very general points, and oversimplifying what really is going on.

Comments are like foreplay.
They serve to get us readers in the mood to appreciate the sleek lines, the richness of it’s properties, and possibly arouse our interest to the point where we are ready to ravage the code.

Comments are like, bi-se… uh directional.
Err, well … what I am trying to say is that sometimes we can write them before we do stuff, while at other times – it can be just as effective to write them after something has been done in the code. The first would probably indicate how we are going to do something, while the latter would focus on the result of what we just did.

Pick up the garbage!
Do you leave your workshop tools lying about in case you need them quickly, or do you store them safely away? Once you finalize your code, clean it up. Put “I deleted this, moved that, added that” comments in your version control commit/check-in comments, and don’t litter the source code with it. Once the code has been created or removed, the reason is uninteresting. What the code is supposed to do, is so much more valuable information. Remove unnecessary comments.

Don’t spread FUD.
Your commentary should embellish the correctness and function of your code, not underline how insecure, unexpectedly behaving, mysterious, or potentially unreliable code it is. If there is remaining work to be done, that should be written as a To-Do point, containing sufficient information to guide you to a starting point for that work.

It takes practice to write good commentary (life-long practice, some would say), but unless your code is totally clear and unambiguous to the point of self-explaining to your grandmother, you should probably add some comments.

To round it off, here are a couple of disturbing – but funny – comments I have come across in production code:


// ---- Better let sleeping dogs lie? Note the date. ----

try
if ValidParentForm(self) nil then
begin
ValidParentForm(Self).ActiveControl:=NIL;
//Turn this back on when I return from my holiday and have time to test it
//N.N. 20.07.2001
{if ValidParentForm(Self).ActiveControl=lFocusedControl then
//Control doesn't let go of focus, most likely because of some error message
exit; //Aborting the routine}
end;
if DataSet is TExtendedDataSet then
TExtendedDataSet(DataSet).Save
else
DataSet.Post;
finally
. . .

// ---- Unusual if/then/else construct ----

if (EditSomeProp.Text '')
and ((RequestTable[i].NUMREF+1) > StrToInt(EditSomeProp.Text)) then
// Watch out for Elsie here
else
begin
. . .

Should I or should I not uncomment the first comment? Well, it has been inactive for 7 years, so I think I’ll leave it as is, or remove it.
The second one is history, due to a rewrite.

Writing Readable Code – Paul’s Snippet – Show your formatting!

Paul made an comment in my previous post, but unfortunately the Blogger comment system ate all the whitespace. Paul, if you read this – please reformat the code below and email it to me as a zipped attachment! I would also love to receive everybody else’s reformatted version as well!
Your anonymity is guaranteed (Unless you explicitly permit me to name you and/or link to a site or something).

Please note that there was a syntax problem in the original snippet, and I have indicated that below. I suggest that unless Paul instruct us otherwise, we take out the semi-colon after the end on that line.


try
if ConditionA then
Code(1)
else if ConditionB then
if ConditionC then begin
if ConditionD then
code(2);
code(3);
end; // I assume this semi-colon has to go
else if ConditionE then begin
code(4);
morecode('X');
if ConditionF then
code(5)
else if ConditionG then
code(6);
end else begin
code(7);
morecode('Y');
end;
finally
code(8);
end;