As most people know, our Island compiler backend is based on LLVM and uses it to optimize and inline code. One problem with this is that LLVM works at the "module" level, meaning per object file; it does not optimize across the final executable or between different libraries.

Elements creates one object file per class and/or generic instantiation, so a call to, say, List<Integer>.Item[0] ends up as a real call to the underlying method rather than being inlined, even though the method is fairly simple.

Until a short while ago, we used LLVM's link time optimization to get around this. It works by emitting optimized LLVM bitcode instead of native code and letting the linker optimize the whole thing. But while working on the Go front-end, we found that this became very slow, very fast: compiling against Go.lib took over 15 minutes for a simple "Hello world" application.


So over the past two weeks I've been working on automatic compile time inlining. It works like this: if the compiler is targeting an optimized build and a method is not virtual, an interface method or external, it becomes eligible for inlining. The compiler then calculates how complex the method is, taking into account the other inline methods it will call. If the complexity exceeds a certain threshold, the method is disqualified from inlining. Otherwise, the compiler compiles the method as it normally does, but also stores a serialized form of the method body in the .fx file. When another project uses this library with optimizations enabled, the compiler inlines the stored code automatically.
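To make the decision described above a bit more concrete, here is a minimal sketch in Python. It is not the Island compiler's implementation: the Method record, the per-method cost numbers, MAX_INLINE_COST and CALL_COST are all hypothetical stand-ins, since the actual metric and threshold are not spelled out here.

```python
from dataclasses import dataclass, field

# Hypothetical numbers: the real threshold and cost metric are not published.
MAX_INLINE_COST = 40   # a method costing more than this is not inlined
CALL_COST = 1          # assumed cost of leaving a callee as a real call


@dataclass
class Method:
    name: str
    own_cost: int                    # assumed cost of the method's own statements
    is_virtual: bool = False
    is_interface: bool = False
    is_external: bool = False
    disable_inlining: bool = False   # stands in for the [DisableInlining] attribute
    callees: list = field(default_factory=list)


def is_eligible(m: Method, optimizing: bool) -> bool:
    """Optimized build, not opted out, and not virtual, interface or external."""
    if not optimizing or m.disable_inlining:
        return False
    return not (m.is_virtual or m.is_interface or m.is_external)


def inline_cost(m: Method, optimizing: bool, active: frozenset = frozenset()) -> int:
    """Own cost plus the cost of every callee that would itself be inlined."""
    active = active | {m.name}       # methods currently being expanded
    total = m.own_cost
    for callee in m.callees:
        cost = None
        # Recursive calls are never expanded; they stay ordinary calls.
        if callee.name not in active and is_eligible(callee, optimizing):
            cost = inline_cost(callee, optimizing, active)
        if cost is not None and cost <= MAX_INLINE_COST:
            total += cost            # callee body is folded into this method
        else:
            total += CALL_COST       # callee remains a call instruction
    return total


def should_inline(m: Method, optimizing: bool) -> bool:
    """True if the method body would be serialized for inlining by callers."""
    return is_eligible(m, optimizing) and inline_cost(m, optimizing) <= MAX_INLINE_COST
```

With these assumed numbers, a small accessor such as Method("Item", own_cost=3) qualifies in an optimized build, while a virtual method or one with own_cost=100 does not. Counting the cost of inlinable callees matters because a method that looks tiny can still expand into a lot of code once its own calls are inlined.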

All of this can be disabled, of course, both at the project level and per method by applying the existing [DisableInlining] attribute.

This means that code like this...

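namespace ConsoleApplication999;

type
  MyList = public class
  private
    fData: Object;
  public
    method &Set(aData: Object);
    begin
      if aData = nil then raise new Exception('Data must be assigned');
      fData := aData;
    end;

    method Emit;
    begin
      // body not shown
    end;
  end;

  Program = class
  public
    class method Main(args: array of String): Int32;
    begin
      var lItem := new MyList;
      // The next three statements are inferred from the generated code below.
      lItem.&Set('Test');
      lItem.Emit;
      result := 0;
    end;
  end;

end.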

...generates code like this:

# Frame setup
	pushl	%ebp
	movl	%esp, %ebp
	pushl	%esi
# New MyList, which is 8 bytes.
	pushl	$8
	pushl	$__rtti_for_MyList
	calll	__NewClass 
# Pops the __NewClass arguments, keeps the new instance in %esi and calls the constructor
	addl	$8, %esp
	movl	%eax, %esi
	pushl	%eax
	calll	_MyList_Constructor
# Stores "Test" in the class instance (%esi) + 4.
# The compiler knows the "Test" string isn't nil, so it removed the nil check and the exception.
	addl	$4, %esp
	movl	$L_test_string+4, 4(%esi) 
# Calls Emit.
	pushl	%esi
	calll	_MyList_Emit
	addl	$4, %esp    
# Return 0    
	xorl	%eax, %eax
# Frame teardown    
	popl	%esi
	popl	%ebp

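The call to _MyList_Set is gone completely: its body is inlined into Main, and because the argument is a string constant that is known not to be nil, the nil check and the exception are removed as well.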
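Why the check disappears can be shown with a toy model in Python. This is only an illustration under assumptions: the statement classes (Const, Var, RaiseIfNil, StoreField) and the two helper functions are made up for this sketch and are not the compiler's or LLVM's actual representation, and which pass performs the folding in the real pipeline is not specified here.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Const:        # a compile-time constant, e.g. the string 'Test'
    value: object


@dataclass(frozen=True)
class Var:          # a value that is only known at run time
    name: str


@dataclass(frozen=True)
class RaiseIfNil:   # models "if x = nil then raise ..."
    operand: object


@dataclass(frozen=True)
class StoreField:   # models "fData := x"
    field: str
    operand: object


# Body of MyList.Set, expressed over its parameter aData.
SET_BODY = [RaiseIfNil(Var("aData")), StoreField("fData", Var("aData"))]


def inline_call(body, param, arg):
    """Copy the callee body, replacing the parameter with the call-site argument."""
    def subst(operand):
        return arg if operand == Var(param) else operand
    return [replace(stmt, operand=subst(stmt.operand)) for stmt in body]


def fold(stmts):
    """Drop nil checks whose operand is a constant known to be non-nil."""
    return [
        stmt for stmt in stmts
        if not (
            isinstance(stmt, RaiseIfNil)
            and isinstance(stmt.operand, Const)
            and stmt.operand.value is not None
        )
    ]
```

Calling fold(inline_call(SET_BODY, "aData", Const("Test"))) leaves only the store, which mirrors the single movl into the instance in the listing above. With a run-time value such as Var("lInput") the check has to stay, and without inlining the optimizer never gets to see the constant argument in the first place.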

While far from the last optimization we'll do, this was a huge change that forms the basis for future optimizations. Along the way, it vastly improved the compiler-side inliner and let us iron out some minor issues we found. Next, we'll be working on eliminating more code that's never called, decreasing the file size and generally optimizing things as we find them.

Let us know what you think!