As most people know, our Island compiler backend is based on LLVM and uses it to optimize and inline code. One problem with that is that LLVM works on the "module" level, that is, per object file; it does not optimize across the final executable or between different libraries.
Elements creates one object file per class and/or generic instantiation, so when calling, say, List<Integer>.Item[0], it calls the underlying method, instead of inlining what is a fairly simple method.
Until a short while ago, we used LLVM's link-time optimization to get around this. This works by emitting optimized LLVM bitcode instead of native code and letting the linker optimize the whole thing. But while working on the Go front-end, we found that this became very slow, very fast: compiling against Go.lib took over 15 minutes for a simple "Hello world" application.
So over the past two weeks I've been working on automatic compile-time inlining. It works like this: if the compiler is targeting optimized builds and a method is not virtual, interface or external, it becomes eligible for inlining. The compiler then calculates how complex that method is, taking into account other inline methods it will call. If the complexity exceeds a certain value, the method is disqualified from inlining. Otherwise, the compiler compiles the method as it normally does, but it also stores a serialized form of the method body in the .fx file. When another project uses this library with optimizations enabled, the compiler inlines the code automatically.
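The eligibility rules above can be sketched roughly as follows. This is a small Python model for illustration only, not the compiler's actual code: the `Method` record, the `own_cost` numbers, the `COMPLEXITY_LIMIT` value and the handling of recursion are all assumptions, since the real cost model and threshold are not described here.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical budget; the post only says "a certain value".
COMPLEXITY_LIMIT = 20


@dataclass
class Method:
    name: str
    own_cost: int                   # assumed cost of the body itself
    is_virtual: bool = False
    is_interface: bool = False
    is_external: bool = False
    disable_inlining: bool = False  # models the [DisableInlining] attribute
    callees: List["Method"] = field(default_factory=list)


def is_candidate(m: Method, optimized: bool) -> bool:
    """Only optimized builds inline, and never virtual/interface/external methods."""
    if not optimized or m.disable_inlining:
        return False
    return not (m.is_virtual or m.is_interface or m.is_external)


def complexity(m: Method, optimized: bool, seen=frozenset()) -> int:
    """Own cost plus the cost of every callee that would itself be inlined."""
    if m.name in seen:
        # Assumption: a recursive call is never expanded again.
        return COMPLEXITY_LIMIT + 1
    seen = seen | {m.name}
    total = m.own_cost
    for callee in m.callees:
        if is_candidate(callee, optimized):
            cost = complexity(callee, optimized, seen)
            if cost <= COMPLEXITY_LIMIT:
                total += cost  # the callee's body ends up inside this method
    return total


def should_inline(m: Method, optimized: bool = True) -> bool:
    """True when the serialized body would be stored for later inlining."""
    return is_candidate(m, optimized) and complexity(m, optimized) <= COMPLEXITY_LIMIT


# Example methods with made-up costs.
set_ = Method("Set", own_cost=3)
emit = Method("Emit", own_cost=2, disable_inlining=True)
to_string = Method("ToString", own_cost=1, is_virtual=True)
wrapper = Method("Wrapper", own_cost=5, callees=[set_, emit])
big = Method("Big", own_cost=15, callees=[set_, set_, emit])
```

In this model, `Set` and `Wrapper` qualify, `Emit` and `ToString` never do, and `Big` is rejected because the two inlined copies of `Set` push it over the assumed budget.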
All of this can be disabled of course, both on the project level, and by applying the existing [DisableInlining] attribute to a method.
This means that code like this...
namespace ConsoleApplication999;

type
  MyList = public class
  private
    fData: Object;
  public
    constructor;
    begin
    end;

    method &Set(aData: Object);
    begin
      if aData = nil then raise new Exception('Data must be assigned');
      fData := aData;
    end;

    [DisableInlining]
    method Emit;
    begin
      writeLn(fData);
    end;
  end;

  Program = class
  public
    class method Main(args: array of String): Int32;
    begin
      var lItem := new MyList;
      lItem.Set('Test');
      lItem.Emit;
    end;
  end;

end.
...generates code like this:
MAIN:
    # Frame setup
    pushl   %ebp
    movl    %esp, %ebp
    pushl   %esi
    # New MyList, which is 8 bytes.
    pushl   $8
    pushl   $__rtti_for_MyList
    calll   __NewClass
    # Call the constructor.
    addl    $8, %esp
    movl    %eax, %esi
    pushl   %eax
    calll   _MyList_Constructor
    # Store "Test" in the class instance at (%esi) + 4.
    # The compiler knows the test string isn't nil, so it removed the check and the exception.
    addl    $4, %esp
    movl    $L_test_string+4, 4(%esi)
    # Call Emit.
    pushl   %esi
    calll   _MyList_Emit
    addl    $4, %esp
    # Return 0
    xorl    %eax, %eax
    # Frame teardown
    popl    %esi
    popl    %ebp
    retl
This completely eliminates the call to _MyList_Set, inlines it, and removes the exception check.
While far from the last optimization we'll do, this was a huge change that forms the basis for future optimizations. It vastly improved the compiler-side inliner and ironed out some minor issues we found along the way. We'll be working on eliminating more code that's not called, decreasing the file size, and optimizing things overall as we find them.
Let us know what you think!