
Keeping it small: helping the compiler to remove unused code in OCaml

Authors
  • Chris Armstrong

I've been on a mission to understand how to remove unused code when compiling OCaml binaries. I went down this rabbit hole in my last article exploring symbol tables and undocumented options. I did not have much success.

However, my questions and the article itself stimulated discussion with others in the OCaml community, and I now have a clearer understanding of what OCaml does (at least at a macro level) to remove unused code.

Why size matters

I'm interested in seeing how OCaml fares for general-purpose applications, such as developing web applications or as a cloud-based backend.

In most deployment scenarios where your application runs in a container or bare-metal instance, binary size is usually a secondary concern to memory and CPU usage once an application has started up.

On the other hand, deployment size has a huge impact on serverless applications. Larger binary sizes increase startup time, or more specifically cold-start time. Even then, cold starts are really only an issue for user-facing scenarios, such as APIs or web requests, but there are also cost savings to be had when you can reduce the overall time it takes for your code to run.1

What affects binary size?

Binaries have two main contributors to their file size:1

  • native code, stored in .text sections
  • static data, stored in .data sections

OCaml compilation model

OCaml code is compiled similarly to C code. In order to understand how to minimise binary size, we need to know how its source code model maps to its binary model, and how it discards unused code.

The unit of compilation in OCaml is the file. Each OCaml file is a module, which is a first-order construct in the language.2 A module can contain type declarations, top-level let bindings (for functions and globally executed code), functor declarations, and importantly, nested modules.
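As a minimal sketch of what a single compilation unit can hold, the hypothetical file below (shapes.ml, which is not part of the example used later in this article) contains each kind of item listed above. All of the names are illustrative.

```ocaml
(* shapes.ml: a hypothetical compilation unit; the file itself is the module Shapes *)

(* a type declaration *)
type shape = Circle of float | Square of float

(* a top-level let binding that defines a function *)
let area = function
  | Circle r -> Float.pi *. r *. r
  | Square s -> s *. s

(* a nested module *)
module Describe = struct
  let name = function
    | Circle _ -> "circle"
    | Square _ -> "square"
end

(* a functor declaration: a module parameterised by another module *)
module Make_counter (Start : sig val value : int end) = struct
  let next n = n + Start.value
end

(* a functor application, which produces a new module *)
module By_two = Make_counter (struct let value = 2 end)

(* globally executed code: runs when the unit is initialised *)
let () = assert (By_two.next 1 = 3)
```

Everything here lives in one file, so by the linking rules described later it is kept or discarded as a whole.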

As with C, when a source code file (.ml) is compiled, an object file is produced (.o), along with an OCaml-specific object file containing extra linking information (.cmx). Multiple .cmx files can be combined into a library of object code (both a .a and a .cmxa).3

[Figure: The OCaml compilation process (simplified)]

The linking model: how unused code is discarded

OCaml relies on a system linker (such as lld or GNU ld) to link together its code.

When a program is linked, any transitively unused compilation units are discarded.

Because each file is both a top-level module and a compilation unit, only files that are entirely unreferenced are discarded. Conversely, referencing any function in a file, whether at its top level or inside one of its nested modules, causes all the code in that file to be included in the output.

For example, let's say we have two top-level modules, Greeting (greeting.ml) and Name_printer (name_printer.ml):

(* greeting.ml *)
module Hello = struct
  let print_hello () = Format.printf "Hello!\n"
end

module Goodbye = struct
  let print_goodbye () = Format.printf "Goodbye!\n"
end

(* name_printer.ml *)
module First_name = struct
  let print_first_name name = Format.printf "Your first name is %s.\n" name
end

module Full_name = struct
  let print_full_name first last = Format.printf "Your full name is %s %s\n" first last
end

We compile these and combine them into a library called messages.cmxa:

# `-c` skips linking step; `-a` creates a library
$ ocamlopt -c greeting.ml     # => greeting.cmx + greeting.o
$ ocamlopt -c name_printer.ml # => name_printer.cmx + name_printer.o
$ ocamlopt -a greeting.cmx name_printer.cmx -o messages.cmxa

If we then write some code using both modules, we expect to see both modules referenced in the output:

(* conversation.ml *)
let () =
  let first_name = Array.unsafe_get Sys.argv 1 in
  Greeting.Hello.print_hello ();
  Name_printer.First_name.print_first_name first_name

Compiling and running the program then gives us:

$ ocamlopt messages.cmxa conversation.ml -o ./conversation.exe

$ ./conversation.exe Chris
Hello!
Your first name is Chris.

We then inspect the symbols included in the binary (OCaml modules are prefixed with caml in the symbol table).

$ objdump -x conversation.exe | grep camlName_printer | cut -b 62-
camlName_printer__code_end
camlName_printer__code_begin
camlName_printer__gc_roots
camlName_printer__Pmakeblock102
camlName_printer__data_end
camlName_printer
camlName_printer__print_first_name_2
camlName_printer__Pmakeblock80
camlName_printer__frametable
camlName_printer__print_first_name_0_2_code
camlName_printer__print_full_name_1_3_code
camlName_printer__print_full_name_3
camlName_printer__data_begin
camlName_printer__entry
$ objdump -x conversation.exe | grep camlGreeting | cut -b 62-
camlGreeting__Pmakeblock78
camlGreeting__Pmakeblock59
camlGreeting__data_end
camlGreeting__gc_roots
camlGreeting__const_block33
camlGreeting__print_goodbye_1_3_code
camlGreeting__print_hello_0_2_code
camlGreeting
camlGreeting__code_begin
camlGreeting__code_end
camlGreeting__const_block16
camlGreeting__print_hello_2
camlGreeting__print_goodbye_3
camlGreeting__entry
camlGreeting__frametable
camlGreeting__immstring14
camlGreeting__immstring31
camlGreeting__data_begin

Note that although we didn't use Greeting.Goodbye or Name_printer.Full_name, their symbols are still included because they are part of the same compilation unit (file) as the modules we did use.

There are of course some more subtleties to this, which we'll discuss.

Dynamic Linking Environments

The OCaml Native compiler (ocamlopt) can be asked to produce code that is suitable for dynamic linking (either the shared library or the consumer).

In these cases, the -linkall flag is specified for the executable (.exe), which ensures that all libraries specified on the command line are linked in so they are available at runtime to any shared libraries that might require them.

If you use the -linkall flag (or any of the libraries you depend on were compiled with it), every module in the affected libraries is linked into your application, whether or not it is used.
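Dynamic linking is not the only reason a library might be built with -linkall. Another is a module that nothing references by name and that exists only for its initialisation side effects. The sketch below illustrates that pattern under stated assumptions: Registry and French_plugin are hypothetical names, and they are written as nested modules so the block runs as a single file. If each were its own file inside a .cmxa, nothing would reference French_plugin, so it would be dropped at link time and would never register itself unless -linkall were used.

```ocaml
(* Hypothetical names throughout. In a real library, Registry and
   French_plugin would each be a separate file (compilation unit). *)

module Registry = struct
  (* table of greeting functions, keyed by language name *)
  let table : (string, unit -> string) Hashtbl.t = Hashtbl.create 8
  let register lang greet = Hashtbl.replace table lang greet
  let find lang = Hashtbl.find_opt table lang
end

module French_plugin = struct
  (* Runs at initialisation time; no other module mentions
     French_plugin by name. *)
  let () = Registry.register "french" (fun () -> "Bonjour")
end

(* The main program only ever refers to Registry. *)
let greet lang =
  match Registry.find lang with
  | Some f -> f ()
  | None -> "no greeting registered for " ^ lang
```

The cost is the one described above: once -linkall is in play, every module in the library is kept, used or not.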

Aliased modules

Libraries typically "wrap up" a number of separate modules to provide a single point of entry, which makes them easier to discover and helps with code completion.

This is done with module aliasing, which looks like this:

(* french_greeting.ml *)
let print_hello () = Format.printf "Bonjour\n"
let print_goodbye () = Format.printf "au revoir!\n"

(* greeting.ml *)

(* ... *)

(* The equivalents in French *)
module French = French_greeting

If we then use Greeting.French in conversation.ml, we find that its wrapper module Greeting is not included in the output, even though we referenced French through it as a parent module.

(* conversation.ml *)
let () =
  let first_name = Array.unsafe_get Sys.argv 1 in
  Greeting.French.print_hello ();
  Name_printer.First_name.print_first_name first_name

# Recompile the dependent modules and rebuild our library
$ ocamlopt -c french_greeting.ml
$ ocamlopt -c greeting.ml
$ ocamlopt -a french_greeting.cmx greeting.cmx name_printer.cmx -o messages.cmxa
# Recompile our program
$ ocamlopt messages.cmxa conversation.ml -o conversation.exe

Inspecting the symbols again, we then find that Greeting is not included at all:4

$ objdump -x conversation.exe | grep camlGreeting | cut -b 62-
$ objdump -x conversation.exe | grep camlFrench | cut -b 62-
camlFrench_greeting__print_hello_0_2_code
camlFrench_greeting__frametable
camlFrench_greeting__immstring28
camlFrench_greeting__gc_roots
camlFrench_greeting__data_end
camlFrench_greeting__print_hello_2
camlFrench_greeting
camlFrench_greeting__entry
camlFrench_greeting__data_begin
camlFrench_greeting__print_goodbye_1_3_code
camlFrench_greeting__const_block15
camlFrench_greeting__immstring13
camlFrench_greeting__code_begin
camlFrench_greeting__print_goodbye_3
camlFrench_greeting__code_end
camlFrench_greeting__const_block30

This is because the compiler resolves the alias to the module it points at, so the program refers to French_greeting directly and only that top-level module is included.
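At the language level, a pure alias does not create a new module; it is another name for an existing one. The sketch below shows this in a single file, with nested modules standing in for french_greeting.ml and greeting.ml. The call counter and the string-returning hello function are assumptions added here so that the sharing can be observed; they are not part of the example above.

```ocaml
(* stands in for french_greeting.ml *)
module French_greeting = struct
  (* hypothetical counter, used only to make the sharing visible *)
  let calls = ref 0

  let hello () =
    incr calls;
    "Bonjour"
end

(* stands in for greeting.ml *)
module Greeting = struct
  (* a pure alias: no code or data is copied *)
  module French = French_greeting
end

(* Calling through the alias... *)
let greeting = Greeting.French.hello ()

(* ...updates the state of the original module. *)
let calls_seen = !French_greeting.calls
```

Because both paths name the same module, a call made through Greeting.French is counted in French_greeting.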

This only works with pure module aliases: if you are instantiating a functor from another file, it will still include the whole module:5

(* greetings.ml *)

(* This expands the functor Generate_greeting with the module Language_french
   inside this module, so referencing French brings in the whole module. *)
module French = Generate_greeting(Language_french)

Exploiting module aliases and separate compilation units to reduce unused code

What this means is that if you're designing a library with lots of code that might go unused, you should:

  • split up your code into separate files, not just separate modules in the same file
  • use module aliasing to make it easier for consumers to find your separated modules
  • when using functors, expand the functor inside its own file (i.e. include Generate_greeting(Language_french))
  • ensure you are not compiling any of your code (or the code you depend on) with -linkall

In a follow-up article, I'll explain my plans for doing this with smaws to minimise the size of AWS SDKs in compiled binary output.

Footnotes

  1. Yes, serverless is expensive vs a well-utilised server, but is usually cheaper for sporadic workloads, which is what most new applications have.

  2. See first-class modules

  3. .cmxa files can contain both individual object files (.cmx/.o) and other libraries (.cmxa/.a)

  4. Notice that the symbols for Greeting.French use the original module name, French_greeting.

  5. I suspect that instantiating a functor in a compilation unit causes the functor code to be expanded and included, as if it were declared as a nested module.