As most people know, our Island compiler backend is based on LLVM and relies on it to optimize and inline code. One problem with that is that LLVM works at the "module" level, that is, per object file, not across the final executable or between different libraries.
Elements creates one object file per class and/or generic instantiation, so a call to, say, List<Integer>.Item ends up as a real call to the underlying method instead of being inlined, even though it is a fairly simple method.
Until a short while ago, we used LLVM's link-time optimization to get around this. It works by emitting optimized LLVM bitcode instead of native code and letting the linker optimize the whole thing. But while working on the Go front-end, we found that this became very slow, very fast: compiling against Go.lib took over 15 minutes for a simple "Hello world" application.
So over the past two weeks I've been working on automatic compile-time inlining. It works like this: if the compiler is targeting an optimized build and a method is not external, the method becomes eligible for inlining. The compiler then calculates how complex the method is, taking into account the other inline methods it will call. If the complexity exceeds a certain value, the method is disqualified from inlining. Otherwise, the compiler compiles the method as it normally does, but also stores a serialized form of the method body in the .fx file. When a project uses this library with optimizations enabled, the compiler inlines the code automatically.
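To make the eligibility rule concrete, here is a rough model of it in Python. This is a sketch under stated assumptions, not the compiler's actual code: the `Method` record, the cost numbers, the `COMPLEXITY_LIMIT` value, and the way recursion is handled are all invented for illustration. The post only says that a method must be non-external, the build must be optimized, and the complexity (including inlined callees) must stay under "a certain value".

```python
from dataclasses import dataclass, field

# Hypothetical cutoff: the post only says "a certain value".
COMPLEXITY_LIMIT = 40


@dataclass
class Method:
    name: str
    own_cost: int                    # assumed cost of the method's own body
    is_external: bool = False
    disable_inlining: bool = False   # models the [DisableInlining] attribute
    callees: list = field(default_factory=list)


def is_candidate(method, optimize):
    """Optimized build, not external, not opted out."""
    return optimize and not method.is_external and not method.disable_inlining


def complexity(method, optimize=True, active=None):
    """Own cost plus the cost of every callee that would itself be inlined."""
    active = set() if active is None else active
    if method.name in active:
        # Recursive cycle: report it as over the limit so it is not expanded again.
        return COMPLEXITY_LIMIT + 1
    active.add(method.name)
    total = method.own_cost
    for callee in method.callees:
        if is_candidate(callee, optimize):
            cost = complexity(callee, optimize, active)
            if cost <= COMPLEXITY_LIMIT:
                total += cost
    active.discard(method.name)
    return total


def should_inline(method, optimize=True):
    """True if the method body would be serialized for inlining."""
    return is_candidate(method, optimize) and complexity(method, optimize) <= COMPLEXITY_LIMIT


set_method = Method("MyList.Set", own_cost=6)
emit = Method("MyList.Emit", own_cost=3, disable_inlining=True)
wrapper = Method("Wrapper", own_cost=5, callees=[set_method, emit])

print(should_inline(set_method))   # True
print(should_inline(emit))         # False
print(complexity(wrapper))         # 11
```

In this model a callee that is not inlined adds nothing to the caller's cost; a real cost model would likely charge something for the call itself.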
All of this can be disabled, of course, both at the project level and by applying the existing [DisableInlining] attribute to a method.
This means that code like this...
    namespace ConsoleApplication999;

    type
      MyList = public class
      private
        fData: Object;
      public
        constructor;
        begin
        end;

        method &Set(aData: Object);
        begin
          if aData = nil then
            raise new Exception('Data must be assigned');
          fData := aData;
        end;

        [DisableInlining]
        method Emit;
        begin
          writeLn(fData);
        end;
      end;

      Program = class
      public
        class method Main(args: array of String): Int32;
        begin
          var lItem := new MyList;
          lItem.Set('Test');
          lItem.Emit;
        end;
      end;

    end.
...generates code like this:
    MAIN:
        # Frame setup
        pushl %ebp
        movl %esp, %ebp
        pushl %esi
        # New MyList instance, which is 8 bytes.
        pushl $8
        pushl $__rtti_for_MyList
        calll __NewClass
        # Calls the constructor
        addl $8, %esp
        movl %eax, %esi
        pushl %eax
        calll _MyList_Constructor
        # Stores "Test" in the class instance (%esi) + 4.
        # The compiler knows that the test string isn't null, so it removed the exception and the check.
        addl $4, %esp
        movl $L_test_string+4, 4(%esi)
        # Call Emit.
        pushl %esi
        calll _MyList_Emit
        addl $4, %esp
        # Return 0
        xorl %eax, %eax
        # Frame teardown
        popl %esi
        popl %ebp
        retl
This completely eliminates the call to _MyList_Set: the method body is inlined into Main, and because the argument is a known string literal, the exception check is removed as well.
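Why the check can disappear is easier to see with a toy model. The Python sketch below is not how the compiler or LLVM represents code; the tuple-based statements and the `Const`/`Var` types are made up for illustration. It substitutes the call argument into a simplified body of Set and then drops the nil check whenever the argument is a constant known to be non-nil.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    value: object   # a literal known at compile time


@dataclass(frozen=True)
class Var:
    name: str       # a value only known at run time


# Simplified body of MyList.Set: check for nil, then store into fData.
SET_BODY = [
    ("if_nil", Var("aData"), ("raise", "Data must be assigned")),
    ("store", "fData", Var("aData")),
]


def substitute(operand, param, arg):
    """Replace a use of the parameter with the caller's argument."""
    return arg if operand == Var(param) else operand


def inline_and_fold(body, param, arg):
    """Inline the body for one call site and fold nil checks on constants."""
    out = []
    for stmt in body:
        if stmt[0] == "if_nil":
            cond = substitute(stmt[1], param, arg)
            if isinstance(cond, Const):
                if cond.value is None:
                    out.append(stmt[2])   # always nil: only the raise remains
                    break                 # everything after it is unreachable
                # known non-nil: the check and the raise are dead code
            else:
                out.append(("if_nil", cond, stmt[2]))
        elif stmt[0] == "store":
            out.append(("store", stmt[1], substitute(stmt[2], param, arg)))
        else:
            out.append(stmt)
    return out


# lItem.Set('Test'): only the store survives, as in the assembly listing.
print(inline_and_fold(SET_BODY, "aData", Const("Test")))
```

When the argument is a `Var`, the check has to stay, which is why this only pays off once the callee's body is visible at the call site.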
While far from the last optimization we'll do, this was a huge change that forms the basis for future optimizations. Along the way, it vastly improved the (compiler-side) inliner and ironed out some minor issues we found. We'll be working on eliminating more code that's never called, decreasing file sizes, and optimizing things overall as we find them.
Let us know what you think!