[distcc] Using ccache on distcc server

Building a large C-family project (C, C++, Objective-C, etc.) can be very slow because of long compile times. There are two simple ways to speed up compilation:

  1. ccache: caches the compiled result (the object file) locally and reuses it when the source code has not changed. This gives a huge improvement for incremental builds.
  2. distcc: distributes compilation tasks over the network to a pool of machines (distcc servers). Build performance scales roughly linearly with the number of machines you have (but is still bounded by network and I/O).

People usually combine 1. and 2. with CCACHE_PREFIX, as this article describes: ccache serves the local cache and sends the job to a distcc server only if a rebuild is necessary.
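As a rough illustration of that combination, a client-side environment might look like the sketch below. CCACHE_PREFIX and DISTCC_HOSTS are the real variable names read by ccache and distcc; the host names are placeholders I made up.

```shell
# Client-side sketch: ccache handles the local cache and, on a cache miss,
# prefixes the real compile command with distcc.
export CCACHE_PREFIX=distcc

# Hypothetical build hosts; replace them with your own distcc servers.
export DISTCC_HOSTS="buildhost1 buildhost2"

echo "cache misses will be run through: $CCACHE_PREFIX"
```

With this in place, the build itself is started the usual way; only cache misses leave the local machine.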

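There is a further improvement we can make: let the distcc servers cache object files with ccache too! The flow becomes: check the local ccache; on a miss, send the job to a distcc server; the server checks its own ccache and runs the compiler only on a second miss.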

The concept is straightforward, but it took me a while to get a working solution. Before that, let me first explain how ccache works. Once ccache is installed, it prepends its masquerade directory to PATH: PATH=/usr/local/ccache:$PATH. This is path masquerading: when you run g++ main.cc -c, the machine actually runs /usr/local/ccache/g++ main.cc -c.

/usr/local/ccache/g++ is a symlink to /usr/bin/ccache. ccache then searches the rest of $PATH for the real compiler, e.g. /usr/bin/g++. The command is eventually translated to ccache /usr/bin/g++ main.cc -c.

Before executing that command, ccache compares the preprocessed source (after #include expansion, macro expansion, etc.) with the results of previous compilations in its local cache, and reuses the cached object file if possible. Otherwise, it invokes the compiler to do the job.
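The masquerade and cache lookup described above can be emulated with a toy wrapper. This is only a sketch under loud assumptions: the fake compiler, the cache layout, and the use of cksum as the hash are all my own stand-ins, the wrapper treats its first argument as the source file, and nothing here is ccache's actual code.

```shell
# Toy emulation of masquerade + cache lookup, entirely inside a temp dir.
# No real ccache or compiler is touched; every name below is hypothetical.
demo=$(cd "$(mktemp -d)" && pwd -P)
mkdir -p "$demo/masq" "$demo/bin" "$demo/cache"

# Stand-in for the real compiler: "compiles" by copying the source to a .o file.
cat > "$demo/bin/g++" <<'EOF'
#!/bin/sh
echo "real g++ invoked: $*"
cp "$1" "${1%.*}.o"
EOF

# Stand-in for ccache, installed under the compiler's name.
cat > "$demo/masq/g++" <<'EOF'
#!/bin/sh
self_dir=$(cd "$(dirname "$0")" && pwd -P)
name=$(basename "$0")
cache=$self_dir/../cache

# 1. Find the real compiler: first match on PATH outside our own directory.
real=
old_ifs=$IFS; IFS=:
for dir in $PATH; do
    if [ "$dir" != "$self_dir" ] && [ -x "$dir/$name" ]; then
        real=$dir/$name
        break
    fi
done
IFS=$old_ifs
[ -n "$real" ] || { echo "no real $name on PATH" >&2; exit 1; }

# 2. Key the cache on a checksum of the input file (cksum stands in for
#    ccache's hash of the preprocessed source).
key=$(cksum < "$1" | cut -d' ' -f1)
obj=${1%.*}.o
if [ -f "$cache/$key" ]; then
    echo "cache hit"
    cp "$cache/$key" "$obj"
else
    echo "cache miss"
    # 3. Cache miss: run the real compiler, then store its output.
    "$real" "$@" && cp "$obj" "$cache/$key"
fi
EOF
chmod +x "$demo/bin/g++" "$demo/masq/g++"

# The masquerade directory goes first on PATH, as in the article.
printf 'int main() { return 0; }\n' > "$demo/main.cc"
first=$(PATH="$demo/masq:$demo/bin:$PATH" g++ "$demo/main.cc" -c)
second=$(PATH="$demo/masq:$demo/bin:$PATH" g++ "$demo/main.cc" -c)
echo "$first"
echo "$second"
```

The first call misses and falls through to the stand-in compiler; the second call is served from the cache without running it.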

So when we set CCACHE_PREFIX=distcc, ccache follows the same steps: it expands the command, looks up the local cache, and on a cache miss invokes the compiler with the distcc prefix, i.e. distcc /usr/bin/g++ main.cc -c.

The distcc command here then distributes the compilation task to a remote server. The distcc server receives this command and executes it: /usr/bin/g++ main.cc -c.

However, that command does not use ccache on the server side, because it plainly invokes the compiler rather than ccache. We don’t want the distcc server to receive /usr/bin/g++ main.cc -c; we want it to receive ccache /usr/bin/g++ main.cc -c.

This means we need to force the distcc server to use ccache even when ccache is not in the command prefix. Is that achievable? Yes, with the DISTCC_CMDLIST environment variable. From its man page:

DISTCC_CMDLIST

If the environment variable DISTCC_CMDLIST is set, load a list of supported commands from the file named by DISTCC_CMDLIST, and refuse to serve any command whose last DISTCC_CMDLIST_MATCHWORDS last words do not match those of a command in that list. See the comments in src/serve.c.

This lets the distcc server map one compiler path to another.
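To make the mapping concrete, here is a toy model of the lookup as the man page excerpt describes it. It is not distcc's code. I am assuming that "words" are '/'-separated path components and that only the last one is compared, and the list entries are hypothetical paths based on the masquerade directory used earlier.

```shell
# Toy model of the DISTCC_CMDLIST lookup (an emulation, not distcc itself).
# Assumption: only the last '/'-separated component of the path is matched.
cmdlist=$(mktemp)

# Hypothetical list content: ccache masquerade paths.
printf '%s\n' /usr/local/ccache/gcc /usr/local/ccache/g++ > "$cmdlist"

remap_compiler() {
    requested=$1
    while IFS= read -r entry; do
        if [ "${entry##*/}" = "${requested##*/}" ]; then
            printf '%s\n' "$entry"   # serve the listed command instead
            return 0
        fi
    done < "$cmdlist"
    echo "refused: $requested" >&2   # nothing in the list matches
    return 1
}

remap_compiler /usr/bin/g++               # prints /usr/local/ccache/g++
remap_compiler /usr/bin/clang++ || true   # refused: not in the list
```

In this model a request for /usr/bin/g++ ends up running the ccache masquerade entry, which is exactly the redirection we are after.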

Bingo! That’s what we want. Here are the instructions:

1. Create a file /home/.distcc/DISTCC_CMDLIST listing the compiler commands the server should serve.

2. In /etc/default/distcc, add three lines:

Line 1 tells the distcc server to use the DISTCC_CMDLIST file for the mapping.
Line 2 is necessary to make sure child processes spawned by distcc know where the ccache directory is located.
Line 3 tells the distcc server to use the ccache compiler masquerade.

3. Change the permissions of the ccache directory so that the user running the distcc server can write to it.

4. Restart the distcc server.

Good to go!

Now we get the benefits of caching not just on the local machine but also on the distcc servers. Caching build results is crucial for heavy C-family projects because compiling is so expensive: even a simple hello-world program can take 20 ms to compile, while with ccache it takes only 1 ms :)
