XE8 Findings

Found some minor tweaks in XE8. Will add to this post as I find more.

Breaking changes in FireDAC.UI.Intf

TFDScriptOutputKind used to be TFDScriptOuputKind

I actually reported this on Quality Portal, so blame me 😛
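If the same code has to compile on both sides of the rename, one option is a small shim unit. This is only a sketch: the unit name FDScriptCompat is made up, and it assumes the rename landed exactly in XE8 (CompilerVersion 29.0), as described above.

```pascal
unit FDScriptCompat; // hypothetical unit name, not part of FireDAC

interface

uses
  FireDAC.UI.Intf;

{$IF CompilerVersion < 29.0} // 29.0 is XE8
type
  // Before XE8 only the misspelled name exists, so alias the corrected
  // spelling to it. From XE8 on this block compiles to nothing.
  TFDScriptOutputKind = TFDScriptOuputKind;
{$IFEND}

implementation

end.
```

Code that adds this unit to its uses clause can then refer to TFDScriptOutputKind on either version.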

Breaking changes in FireDAC.Comp.Client

TFDConnectionLoginEvent = procedure (AConnection: TFDCustomConnection; AParams: TFDConnectionDefParams) of object;

used to be

TFDConnectionLoginEvent = procedure (AConnection: TFDCustomConnection; const AConnectionDef: IFDStanConnectionDef) of object;

TFDErrorEvent = procedure (ASender, AInitiator: TObject; var AException: Exception) of object;

used to be

TFDErrorEvent = procedure (ASender: TObject; const AInitiator: IFDStanObject; var AException: Exception) of object;

TFDConnectionRecoverEvent = procedure (ASender, AInitiator: TObject; AException: Exception; var AAction: TFDPhysConnectionRecoverAction) of object;

used to be

TFDConnectionRecoverEvent = procedure (ASender: TObject; const AInitiator: IFDStanObject; AException: Exception; var AAction: TFDPhysConnectionRecoverAction) of object;

This one was a bit annoying, as I was actually accessing the IFDStanObject.Name property. I guess .ClassName will have to do.
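For handlers that must build on both versions, the two TFDErrorEvent signatures above can be selected with conditional compilation. A rough sketch, with some assumptions: the unit and class names are hypothetical, IFDStanObject is assumed to be declared in FireDAC.Stan.Intf, and XE8 is taken to be CompilerVersion 29.0. The handler just prefixes the exception message with whatever initiator name is available.

```pascal
unit FDErrorCompat; // hypothetical unit name

interface

uses
  System.SysUtils,
  FireDAC.Stan.Intf; // assumed home of IFDStanObject (needed before XE8)

type
  // Hypothetical helper class. HandleError matches TFDErrorEvent on either
  // version, so it can be assigned to an event of that type.
  TFDErrorLogger = class
  public
  {$IF CompilerVersion >= 29.0} // XE8 and later: initiator is a plain TObject
    procedure HandleError(ASender, AInitiator: TObject;
      var AException: Exception);
  {$ELSE} // before XE8: initiator is an IFDStanObject
    procedure HandleError(ASender: TObject; const AInitiator: IFDStanObject;
      var AException: Exception);
  {$IFEND}
  end;

implementation

{$IF CompilerVersion >= 29.0}
procedure TFDErrorLogger.HandleError(ASender, AInitiator: TObject;
  var AException: Exception);
var
  InitiatorName: string;
begin
  // Only the class name is available now that the interface is gone.
  if AInitiator <> nil then
    InitiatorName := AInitiator.ClassName
  else
    InitiatorName := '(unknown)';
  AException.Message := Format('[%s] %s', [InitiatorName, AException.Message]);
end;
{$ELSE}
procedure TFDErrorLogger.HandleError(ASender: TObject;
  const AInitiator: IFDStanObject; var AException: Exception);
var
  InitiatorName: string;
begin
  // The old signature exposes the Name property directly.
  if AInitiator <> nil then
    InitiatorName := AInitiator.Name
  else
    InitiatorName := '(unknown)';
  AException.Message := Format('[%s] %s', [InitiatorName, AException.Message]);
end;
{$IFEND}

end.
```

The same pattern should carry over to TFDConnectionRecoverEvent, whose AInitiator parameter changed in the same way.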

 

22 thoughts on “XE8 Findings”


  1. The changes in System.Generics.Collections are interesting too. Stefan Glienke can probably comment much better than me – it’s very similar to this blog post: http://delphisorcery.blogspot.com.au/2014/10/new-language-feature-in-xe7.html


    Lots of switches based on IsManagedType(T), SizeOf(T), etc., presumably with the other cases being optimised out at compile time, leading to fast special-cased behaviour. Look at TList<T>.Insert, for example.


    Also the ToolsAPI has some hooks that seem to be related to GetIt.


  2. David Millington The changes in System.Generics.Collections are their attempt to fight binary size. I leave it to you to find out if that was successful.


    As for “fast special-cased behavior”, execute this code in XE7 and in XE8:


    program Project1;

    {$APPTYPE CONSOLE}

    uses
      Diagnostics,
      Generics.Collections,
      SysUtils;

    var
      list: TList<Integer>;
      i: Integer;
      sw: TStopwatch;
    begin
      list := TList<Integer>.Create;
      // Time ten million appends to an integer list.
      sw := TStopwatch.StartNew;
      for i := 1 to 10000000 do
        list.Add(i);
      Writeln(sw.ElapsedMilliseconds);
      Readln;
    end.


  3. XE7 32-bit Release: 120-130ms


    XE7 64-bit Release: 130-180ms


    XE8 32-bit Release: 150-160ms


    XE8 64-bit Release: 160-180ms


    Also XE8 shows (briefly) a recompile dialog every single time it’s run, and running from inside the IDE (even with ‘Run Without Debugging’) seriously impacts the execution speed: it can take twice as long. For example, typical 32-bit Release times under XE8 are 300ms when run without debugging from the IDE, which drops to 160ms the moment you run it directly from Explorer. Goodness knows what it’s doing. That doesn’t happen in XE7.


  4. Stefan Glienke Interesting benchmark there. Even with pre-allocated memory using Capacity, the difference is clear.


    It also seems that basic array access perf has degraded in XE8. Try this variant:


    program Project1;

    {$APPTYPE CONSOLE}

    uses
      Diagnostics,
      Generics.Collections,
      SysUtils;

    const
      N = 400000000;

    var
      list: TList<Integer>;
      i: Integer;
      sw: TStopwatch;
      arr: TArray<Integer>;
    begin
      list := TList<Integer>.Create;
      // Pre-allocate so the timing excludes reallocation.
      list.Capacity := N;
      sw := TStopwatch.StartNew;
      for i := 1 to N do
        list.Add(i);
      Writeln(sw.ElapsedMilliseconds);

      // Same loop against a plain dynamic array.
      sw := TStopwatch.StartNew;
      SetLength(arr, N);
      for i := 0 to N-1 do
        arr[i] := i;
      Writeln(sw.ElapsedMilliseconds);
      Readln;
    end.


  5. David Heffernan In Release mode on Win32 it’s the same here (I had to reduce N to 100 million, though, as it was giving me an out of memory exception).


    It also creates the exact same code (O+, W-):


    for i := 0 to N-1 do
      xor ebx,ebx
    arr[i] := i;
      mov eax,[$004e03f0]
      mov [eax+ebx*4],ebx
      inc ebx
    for i := 0 to N-1 do
      cmp ebx,$05f5e100
      jnz $004d67b5


    One problem with the changes to TList<T> is that the extra TListHelper array now costs some additional cycles for some members (see TListHelper.FItems).


    Also inlining is still not good enough – it does not inline some things that should be inlined to gain some performance. The main performance drop in this example is caused by not inlining TListHelper.InternalGrowCheck.


  6. David Heffernan Same story – identical code generated.


    for i := 0 to N-1 do
      mov [rel $0002dbad],$00000000
    arr[i] := i;
      mov rax,[rel $0002dbc6]
      movsxd rcx,[rel $0002db9f]
      mov edx,[rel $0002db99]
      mov [rax+rcx*4],edx
      add dword ptr [rel $0002db8f],$01
    for i := 0 to N-1 do
      cmp [rel $0002db85],$05f5e100
      jnz Project1 + $143
      nop


  7. Stefan Glienke Very odd. I can see the same code too, but it’s definitely slower in XE8 for me. Presumably just a duff test case. Perhaps something about memory layout makes the difference. I don’t know.


    Anyway, do you know why TList<T>.Add is slower? I think that this size-reducing helper just has slower code. So instead of this in XE7:


    System.Generics.Collections.pas.917: FItems[Count] := Value;
    000000000054039B 488B4308         mov rax,[rbx+$08]
    000000000054039F 48634B10         movsxd rcx,[rbx+$10]
    00000000005403A3 893488           mov [rax+rcx*4],esi


    We have this in XE8:


    System.Generics.Collections.pas.2419: PCardinal(FItems^)[FCount] := Cardinal(Value);
    0000000000453CCA 488B442440       mov rax,[rsp+$40]
    0000000000453CCF 488B4030         mov rax,[rax+$30]
    0000000000453CD3 488B4C2440       mov rcx,[rsp+$40]
    0000000000453CD8 486309           movsxd rcx,[rcx]
    0000000000453CDB 8B13             mov edx,[rbx]
    0000000000453CDD 891488           mov [rax+rcx*4],edx


    Pretty sucky if you ask me. 


  8. Stefan Glienke No, I didn’t see that text before. Must have crossed with your edit. I’m dubious that lack of inlining TListHelper.InternalGrowCheck is a big issue. I’m much more suspicious of the extra indirection that can be seen in the access of the items array. That rather awful getter:


    function TListHelper.GetFItems: PPointer;
    begin
      Result := PPointer(PByte(@Self)+ SizeOf(Self));
    end;


    I cannot help feeling that Emba have thrown the baby out with the bath water.


  9. David Heffernan You can be dubious all day, but I profiled it and I know it’s the main cause. I have been working on some optimizations in Generics.Collections for a while, trying to move code from the generic type into a non-generic type to reduce the overhead. I came up with a broadly similar approach, and in my case too, not inlining the GrowCheck method caused a ~15% performance drop.


    Of course the extra indirection is not making it any better. And the reduction in binary size is also insignificant (every TList<T> where T is a class still adds to the binary size although it has 100% the same code).


  10. Stefan Glienke In XE7, I can see a 15% difference in the 32 bit compiler with GrowCheck inlined, compared with it not inlined. For the 64 bit compiler, inlining makes no difference in my test.


  11. David Heffernan I did not say that but it’s not the main reason in this particular case. I think we can argue all day and in the end we are both of the same opinion, these changes are bad and missing their goal. In fact I argued enough about the generic bloat issue in the past so I will leave that to other people now.


    I solved the problem in Spring4D 1.2 by kind of hardcasting a TObjectList to an IList when T is a class, which only causes an overhead of a few hundred bytes for each list of T where T is a class type. Other generic types will follow when necessary. Also, using the new DynamicArray can reduce the binary bloat when you just need some syntax sugar on an array (more than what was added in XE7).


  12. David Millington About: “Also XE8 shows (briefly) a recompile dialog every single time it’s run”. Do you have IDE Fix Pack installed in XE7? If so then the missing IDE Fix Pack for XE8 may be the reason for the constant recompile 😉


  13. Andreas Hausladen I do indeed have IDE Fix Pack installed! It’s an essential 🙂 (And while you’re here, thank you for making it.) I didn’t realise it affected the compile dialog when there were no changes to the project, though.


  14. +Stefan The thing is that we are looking at different compilers. I focus on x64 ‘cos that’s where my code runs predominantly. You are looking at x86.


    For x64 the grow check inlining seems not to matter.


    It seems that Emba care more about executable size than perf. What they ought to do is solve the problem properly in the compiler/linker.
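For readers unfamiliar with the compile-time switches on IsManagedType(T) and SizeOf(T) mentioned in the first comment, here is a minimal sketch of the idea. It is not the RTL's code: TStorageKind and Describe are hypothetical names, and it assumes XE7 or later, where the IsManagedType intrinsic is available. Both conditions are constant for each instantiation of T, so the compiler can in principle discard the branches that do not apply.

```pascal
program IntrinsicSwitchDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

type
  // Hypothetical demo type, not part of the RTL.
  TStorageKind = record
    class function Describe<T>: string; static;
  end;

class function TStorageKind.Describe<T>: string;
begin
  // IsManagedType(T) and SizeOf(T) are known at compile time for each T.
  if IsManagedType(T) then
    Result := 'managed type: needs reference counting or finalization'
  else
    case SizeOf(T) of
      1: Result := 'unmanaged, 1 byte';
      2: Result := 'unmanaged, 2 bytes';
      4: Result := 'unmanaged, 4 bytes';
      8: Result := 'unmanaged, 8 bytes';
    else
      Result := 'unmanaged, other size';
    end;
end;

begin
  Writeln(TStorageKind.Describe<string>());
  Writeln(TStorageKind.Describe<Integer>());
  Writeln(TStorageKind.Describe<Byte>());
  Readln;
end.
```

A container can use the same pattern to route, say, all 4-byte unmanaged element types through one shared non-generic routine, which is the kind of sharing the XE8 TListHelper appears to be aiming for.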