Analyze build times with clang


Long build times are a frustrating issue when working on large C++ codebases.

Unfortunately, there is no simple method for detecting where the compiler is spending so much time and, more importantly, why.

There are some guidelines, for example

  • avoid including too many files

  • avoid overusing templates

  • prefer forward-declarations where possible

  • remove dead code

  • switch from Windows to GNU/Linux.⁠[1]

    • Different Windows versions have different limits on the maximum number of cores; depending on your hardware, you might want to upgrade your license.

    • On Windows, accessing the file system is slower "by design"

    • Some processes (indexing, Cortana, AV, …​) are deeply integrated with the operating system and hard or impossible to disable

    • On Windows, tasks are scheduled differently than on GNU/Linux systems. In general, there seem to be several places where Windows does not use resources efficiently, but those effects are mostly observable only under a high workload.[2] On the bright side, for the projects Microsoft is interested in, they implement more performant algorithms so that those projects work efficiently even on Windows.

    • The antivirus, even with exclusion rules, can have a significant overhead

  • avoid MSBuild and prefer make or ninja when using the Microsoft Compiler, as they are much better at parallelizing jobs

But those are just guidelines: some have drawbacks, others cannot always be applied, and we generally do not know how much time we are wasting or how much we could win back.

With tools like quick-bench it is easy to measure the compile-time cost of using <cstdio> compared to <iostream>, or a tuple or pair instead of a struct, but that does not help to analyze an existing project of a certain size. Notice that in both examples, the differences in build times come mainly from the #include directives.

Another common piece of advice is to use more powerful hardware, which is simpler and less time-consuming than analyzing anything but might cost a lot after some upgrades.

Nevertheless, investing some time in reducing build-time has multiple benefits.

Clang, since version 9, supports the compiler flag -ftime-trace, which generates data that helps to analyze how much time the compiler spends on every file and how it spends it.

The only change necessary in the build system is thus to add -ftime-trace as a compile parameter for both C and C++.

For every source file foo.cpp, Clang writes, while compiling, a corresponding JSON file in the same directory as the object file. The name is derived from the object file by replacing its extension with .json (foo.json for foo.o).

Those files look like

{
  "traceEvents": [
    {
      "pid": 21380,
      "tid": 7800,
      "ph": "X",
      "ts": 54466,
      "dur": 5206,
      "name": "Source",
      "args": {
        "detail": "/libc/usr/include/stdc-predef.h"
      }
    },
    {
      "pid": 21380,
      "tid": 7800,
      "ph": "X",
      "ts": 66689,
      "dur": 1371,
      "name": "Source",
      "args": {
        "detail": "/libc/usr/include/sys/cdefs.h"
      }
    },

    ...

    {
      "cat": "",
      "pid": 21380,
      "tid": 7800,
      "ts": 0,
      "ph": "M",
      "name": "thread_name",
      "args": {
        "name": ""
      }
    }
  ],
  "beginningOfTime": 1643286424387042
}

Most values are self-explanatory. "pid" stands for "process ID" and "tid" for "thread ID". "ph" is the event phase: "X" marks a complete event with a duration, "M" a metadata event. "ts" is the start timestamp and "dur" the duration of an event, both in microseconds. "args", if not empty, contains additional information about the event, such as the file being processed (for example, an included file that is being parsed).
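To make the structure more concrete, here is a minimal Python sketch that lists the longest events of a given name in one trace. The function name longest_events is my own, and the inline data simply repeats the two "Source" events shown above instead of reading a real file. Keep in mind that includes are nested, so the duration of a "Source" event presumably contains the time of the headers it pulls in.

```python
import json


def longest_events(trace, name="Source", top=5):
    """Return the `top` longest events called `name` as (dur, detail) pairs."""
    events = [
        (event["dur"], event.get("args", {}).get("detail", ""))
        for event in trace.get("traceEvents", [])
        if event.get("name") == name and "dur" in event
    ]
    # Tuples sort by duration first; reverse=True puts the slowest on top.
    return sorted(events, reverse=True)[:top]


# Inline stand-in for json.load() on a real trace file.
trace = json.loads("""
{"traceEvents": [
  {"pid": 21380, "tid": 7800, "ph": "X", "ts": 54466, "dur": 5206,
   "name": "Source", "args": {"detail": "/libc/usr/include/stdc-predef.h"}},
  {"pid": 21380, "tid": 7800, "ph": "X", "ts": 66689, "dur": 1371,
   "name": "Source", "args": {"detail": "/libc/usr/include/sys/cdefs.h"}}
]}
""")

for dur, detail in longest_events(trace):
    print(f"{dur:>8} us  {detail}")
```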

Once Clang finishes compiling your project, the next question is: how to parse those files? How to extract some useful metrics?

With a GUI

If you have one of the many Chromium-based browsers, open chrome://tracing and load the JSON file.

Notice that you can only load one JSON file at a time. This is an annoyance if one wants to look at multiple files at once to compare build times.

Fortunately, there is a simple fix: pack those JSON files in a zip archive and open the archive from chrome://tracing.

Maybe it is also possible to "merge" all those JSON files together. I did not find anything useful and did not research the issue further.
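For what it is worth, here is an untested-in-the-viewer sketch of what such a merge could look like in Python. It rests on assumptions: every file gets its own artificial "pid" so that the viewer shows it as a separate process, a "process_name" metadata event labels that process with the file name, and timestamps are shifted by the difference in "beginningOfTime" (when present) so that all files share one timeline. The function names merge_traces and merge_files are hypothetical.

```python
import json
from pathlib import Path


def merge_traces(traces):
    """Merge (label, trace_dict) pairs into one trace dict.

    The input dicts are not modified.
    """
    starts = [t["beginningOfTime"] for _, t in traces if "beginningOfTime" in t]
    origin = min(starts, default=0)
    merged = []
    for pid, (label, trace) in enumerate(traces, start=1):
        # Traces without "beginningOfTime" are left unshifted.
        offset = trace.get("beginningOfTime", origin) - origin
        merged.append({"ph": "M", "pid": pid, "tid": 0, "ts": 0,
                       "name": "process_name", "args": {"name": label}})
        for event in trace.get("traceEvents", []):
            event = dict(event)  # shallow copy, keep the original intact
            event["pid"] = pid
            if event.get("ph") != "M":
                event["ts"] = event.get("ts", 0) + offset
            merged.append(event)
    return {"traceEvents": merged, "beginningOfTime": origin}


def merge_files(paths, out_path):
    """Read the given trace files and write the merged trace to out_path."""
    traces = [(str(p), json.loads(Path(p).read_text())) for p in paths]
    Path(out_path).write_text(json.dumps(merge_traces(traces)))
```

Calling merge_files on a handful of trace files should produce a single JSON file to load. I would expect the same responsiveness problems with large merged files as with large archives.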

Note that loading many files considerably increases the load time and generally makes the browser unresponsive.

Thus you could zip all files together with

find $build_dir -path '*/CMakeFiles/*.dir/*' -name '*.json' -type f -exec zip compileperf.zip {} +

but you probably do not want to.

If you do not want to use the browser, there is a standalone program, which is used internally by chromium.

From the command line

Having a graphical overview is very useful. It gives a feeling of how much slower some files are compared to others, but querying information, like finding the slowest file to compile, is impractical.

It would be much simpler to extract the information from the JSON files directly.

The entry "Total ExecuteCompiler" gives the total time for compiling a single file, so I decided to print its duration out.

jq -c '.traceEvents[] | select(.name | contains("Total ExecuteCompiler")) | .dur' "$filename";

Now it’s possible to save this snippet in a script, for example

#!/bin/sh

set -o errexit

time=$(jq -c '.traceEvents[] | select(.name | contains("Total ExecuteCompiler")) | .dur' "$1";)
printf '%s %s\n' "$time" "$1";

And invoke it on all files, and sort the output by time

find <build dir> -name '*.json' -type f -exec <script> {} \; | sort --numeric-sort --reverse;
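If jq is not available, the same ranking can be sketched in Python. This is only an approximation of the pipeline above: it matches the event name exactly instead of using a substring match, and it silently skips JSON files that are not traces (for example a compile_commands.json in the build directory). The names total_compile_time and rank_build_dir are my own.

```python
import json
from pathlib import Path

TOTAL = "Total ExecuteCompiler"


def total_compile_time(trace):
    """Return the duration (microseconds) of the total event, or None."""
    if not isinstance(trace, dict):
        return None  # e.g. a JSON file whose top level is a list
    for event in trace.get("traceEvents", []):
        if event.get("name") == TOTAL:
            return event.get("dur")
    return None


def rank_build_dir(build_dir):
    """Return (dur, path) pairs for every trace under build_dir, slowest first."""
    ranked = []
    for path in Path(build_dir).rglob("*.json"):
        try:
            trace = json.loads(path.read_text())
        except (OSError, ValueError):
            continue  # unreadable file or invalid JSON
        dur = total_compile_time(trace)
        if dur is not None:
            ranked.append((dur, str(path)))
    return sorted(ranked, reverse=True)


# Example, with "build" as a placeholder for the build directory:
#   for dur, path in rank_build_dir("build"):
#       print(dur, path)
```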

Knowing which files take longer to compile than the others, it makes sense to look at those first and then, if needed, use chrome://tracing to analyze a single JSON file.
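A complementary question is which headers are expensive across the whole project. The following Python sketch sums the "Source" durations per header over several already loaded traces; header_costs is a hypothetical name and the two small traces are made-up data for illustration. Since includes are nested, the sums presumably overlap (a header is charged for everything it includes), so the result is a hint where to look, not an exact accounting.

```python
from collections import defaultdict


def header_costs(traces):
    """Sum 'Source' durations per header over several trace dicts.

    Returns (total_dur, count, header) triples, most expensive first.
    """
    totals = defaultdict(lambda: [0, 0])  # header -> [total dur, occurrences]
    for trace in traces:
        for event in trace.get("traceEvents", []):
            if event.get("name") != "Source":
                continue
            header = event.get("args", {}).get("detail")
            if header is None:
                continue
            totals[header][0] += event.get("dur", 0)
            totals[header][1] += 1
    return sorted(
        ((dur, count, header) for header, (dur, count) in totals.items()),
        reverse=True,
    )


# Made-up traces of two translation units sharing one header.
unit_a = {"traceEvents": [
    {"name": "Source", "dur": 400, "args": {"detail": "common.hpp"}},
    {"name": "Source", "dur": 900, "args": {"detail": "a_only.hpp"}},
]}
unit_b = {"traceEvents": [
    {"name": "Source", "dur": 600, "args": {"detail": "common.hpp"}},
]}

for dur, count, header in header_costs([unit_a, unit_b]):
    print(f"{dur:>8} us  {count:>3}x  {header}")
```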


1. It is interesting to see how many bug reports to the WSL project show that Windows is slower at handling many processes and file system accesses.
2. On randomascii it is possible to find several articles on those issues.
