Distributed compilation can greatly speed up building Gentoo packages (and not only on Gentoo, of course). If you run Gentoo on a laptop or on a relatively old CPU, you may want to distribute package builds across multiple hosts.
Different Linux distributions use different configuration and environment schemes, and it is sometimes difficult to work out which part of a given configuration applies to your setup. This is not a tutorial on how to enable distributed compilation in Gentoo; it just documents our client-side setup.
By default, distcc limits each host to 4 parallel jobs, which is far too few, because nowadays most servers have more than 8 cores (logical compute units), and many have 16 or more.
The environment variable DISTCC_HOSTS controls which hosts receive compile jobs and how many parallel jobs each host may run.
In Gentoo we set this variable in /etc/portage/make.conf. Here is what you may include in make.conf to run up to 16 parallel jobs on the remote host. The -l4 option stops make from starting additional jobs while the local load average is above 4, which limits local compilation if the remote host fails:
MAKEOPTS="-j16 -l4"
FEATURES="distcc"
DISTCC_HOSTS="192.168.0.101/16"
We use the DISTCC_HOSTS variable (in Gentoo it goes into make.conf; in another Linux distribution, set an environment variable with this name) because it is easy to set up and to control globally for the Gentoo emerge system.
According to the distcc documentation:
In order, distcc looks in the $DISTCC_HOSTS environment variable, the user’s $DISTCC_DIR/hosts file, and the system-wide host file.
So when emerge builds packages, it looks, in this order, at $DISTCC_HOSTS in make.conf (/etc/portage/make.conf, or /etc/make.conf if you still use the old path), at the hosts file in “/var/tmp/portage/.distcc/” (the build process runs as the “portage” user and group, not root!), and at “/etc/distcc/hosts”. The first of these that is set determines the hosts and the limits for distributed compilation. So if you use $DISTCC_HOSTS in make.conf (or in the environment), you do not need to set up a “hosts” file.
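The lookup order described above can be sketched as a small shell function. This is only an illustration of the order, not how distcc itself is implemented: the function name distcc_hosts_source is made up for this example, the system-wide path /etc/distcc/hosts is the one used in this article, and the fallback to ~/.distcc applies when DISTCC_DIR is not set.

```shell
#!/bin/sh
# Report which host list would win, following the documented order:
# 1) $DISTCC_HOSTS, 2) $DISTCC_DIR/hosts, 3) the system-wide hosts file.
# distcc_hosts_source is a hypothetical helper, not part of distcc.
distcc_hosts_source() {
    user_dir="${DISTCC_DIR:-$HOME/.distcc}"
    if [ -n "${DISTCC_HOSTS:-}" ]; then
        echo "environment: DISTCC_HOSTS"
    elif [ -f "$user_dir/hosts" ]; then
        echo "file: $user_dir/hosts"
    elif [ -f /etc/distcc/hosts ]; then
        echo "file: /etc/distcc/hosts"
    else
        echo "none"
    fi
}

distcc_hosts_source
```

On a machine with distcc installed, running distcc --show-hosts in the same environment prints the host list distcc would actually use, which is a more direct check than this sketch.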
Separate multiple hosts with white space and always use the “/LIMIT” notation for each host. The default is only 4 parallel jobs per host (i.e., /4 is implicitly appended to every host in the configuration that has no explicit limit).
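To see how the per-host limits add up, here is a minimal sketch that sums the “/LIMIT” values of a host list. The addresses 192.168.0.102 and 192.168.0.103 and the function name total_slots are invented for the example, and the parser only understands plain HOST or HOST/LIMIT entries (no ports, options, or the lower default that distcc applies to localhost).

```shell
#!/bin/sh
# Sum the job limits of a DISTCC_HOSTS-style string.
# Entries without an explicit /LIMIT count as 4 (the default for remote hosts).
# total_slots is a hypothetical helper; it handles only HOST and HOST/LIMIT.
total_slots() {
    total=0
    for entry in $1; do
        case "$entry" in
            */*) limit=${entry##*/} ;;   # text after the last slash
            *)   limit=4 ;;              # implicit default
        esac
        total=$((total + limit))
    done
    echo "$total"
}

# Two hosts with explicit limits and one relying on the default:
total_slots "192.168.0.101/16 192.168.0.102/8 192.168.0.103"   # prints 28
```

The third host shows why the explicit notation matters: without “/LIMIT” it contributes only 4 jobs no matter how many cores it has.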
Monitoring
Verify that more than the default 4 parallel jobs are in use with the console monitoring program:
root@srv ~ # DISTCC_DIR="/var/tmp/portage/.distcc/" distccmon-text 1
24877 Compile gegl-region-generic.c 192.168.0.101[3]
24786 Compile gegl-sampler.c 192.168.0.101[4]
24891 Compile gegl-tile-source.c 192.168.0.101[5]
24659 Compile gegl-buffer-load.c 192.168.0.101[8]
24660 Compile gegl-buffer-save.c 192.168.0.101[9]
24664 Compile gegl-buffer-linear.c 192.168.0.101[10]
24769 Compile gegl-sampler-nearest.c 192.168.0.101[11]
24986 Connect gegl-tile-handler.c 192.168.0.101[14]
24919 Preprocess localhost[0]
24774 Preprocess localhost[1]
24953 Preprocess localhost[1]
24898 Preprocess localhost[2]
24968 Preprocess localhost[3]
24975 Preprocess localhost[4]
24832 Preprocess localhost[7]

24832 Compile gegl-sampler-lohalo.c 192.168.0.101[0]
24953 Compile gegl-tile-storage.c 192.168.0.101[1]
24898 Compile gegl-tile.c 192.168.0.101[2]
25045 Compile gegl-tile-handler-chain.c 192.168.0.101[3]
24919 Compile gegl-tile-backend.c 192.168.0.101[6]
24968 Compile gegl-tile-backend-file.c 192.168.0.101[7]
24660 Compile gegl-buffer-save.c 192.168.0.101[9]
24664 Compile gegl-buffer-linear.c 192.168.0.101[10]
24769 Compile gegl-sampler-nearest.c 192.168.0.101[11]
24774 Compile gegl-sampler-cubic.c 192.168.0.101[12]
24975 Compile gegl-tile-backend-ram.c 192.168.0.101[13]
25078 Compile gegl-tile-handler-log.c 192.168.0.101[14]
25068 Preprocess localhost[0]
25009 Preprocess localhost[0]

24832 Compile gegl-sampler-lohalo.c 192.168.0.101[0]
24898 Compile gegl-tile.c 192.168.0.101[2]
25101 Compile gegl-tile-handler-zoom.c 192.168.0.101[4]
25009 Compile gegl-tile-handler-cache.c 192.168.0.101[5]
24968 Compile gegl-tile-backend-file.c 192.168.0.101[7]
25068 Compile gegl-tile-handler-empty.c 192.168.0.101[8]
24660 Compile gegl-buffer-save.c 192.168.0.101[9]
Note that we set DISTCC_DIR="/var/tmp/portage/.distcc/" because, as mentioned, the Gentoo emerge command by default builds packages as the “portage” user in the directory “/var/tmp/portage”, so the distcc state directory is “/var/tmp/portage/.distcc/”. The argument 1 makes distccmon-text refresh every second, so you can watch the number of parallel jobs change.
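If you prefer a single number over reading the list, the output can be counted with awk. The sketch below assumes the line layout shown in the output above (PID, state, file name, host[slot]); the function name count_remote_jobs is made up, and the sample lines are copied from that output.

```shell
#!/bin/sh
# Count jobs that are in the "Compile" state on a non-local host.
# Reads distccmon-text style lines on stdin.
# count_remote_jobs is a hypothetical helper, not part of distcc.
count_remote_jobs() {
    awk '$2 == "Compile" && $NF !~ /^localhost\[/ { n++ } END { print n + 0 }'
}

# Three sample lines: two remote compiles and one local preprocess.
count_remote_jobs <<'EOF'
24832 Compile gegl-sampler-lohalo.c 192.168.0.101[0]
24898 Compile gegl-tile.c 192.168.0.101[2]
25068 Preprocess localhost[0]
EOF
# prints 2
```

During a real build, the same function can be fed from a single snapshot: DISTCC_DIR="/var/tmp/portage/.distcc/" distccmon-text | count_remote_jobs (without a delay argument, distccmon-text prints the current state once and exits).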
Bonus – client and server
A quick note on how to set up the client and the server. Check out the official Gentoo wiki: https://wiki.gentoo.org/wiki/Distcc Here we summarize the important parts.
- As stated in the official Gentoo wiki, the same GCC and binutils versions should be used on the server and the client. To be sure, just create the build hosts by rsyncing one of the clients!
- On both the Gentoo client and the Gentoo server, you only need to install sys-devel/distcc:
emerge -v sys-devel/distcc
- On the server, the only change needed is in the configuration file “/etc/conf.d/distccd”: add “-j 16” to DISTCCD_OPTS (16 is the maximum number of parallel jobs allowed, so change this value as appropriate) and allow the IP or network of the machines that will send compile jobs to this host (here the whole network 192.168.0.0/24 is allowed). The “-N 15” option sets the niceness of the daemon.
DISTCCD_OPTS="${DISTCCD_OPTS} -N 15 -j 16"
DISTCCD_OPTS="${DISTCCD_OPTS} --allow 192.168.0.0/24"
And start the server:
/etc/init.d/distccd start
Or, if you use systemd, check out the official wiki.
- On the client, add the three lines shown at the top (MAKEOPTS, FEATURES, DISTCC_HOSTS) to “/etc/portage/make.conf” (or /etc/make.conf if you still use the old path).
- Start building a package with emerge on the client.
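Since the first point in the list asks for identical GCC and binutils versions, a quick comparison before the first build can save a failed emerge. The following is a minimal sketch: the function name versions_match is made up, the version numbers are example values only, and the commented usage assumes you have SSH access to the server, which is not part of the setup described above.

```shell
#!/bin/sh
# Compare two version strings, e.g. the first line of "gcc --version"
# collected on the client and on the server.
# versions_match is a hypothetical helper for illustration.
versions_match() {
    if [ "$1" = "$2" ]; then
        echo "match: $1"
    else
        echo "MISMATCH: client=$1 server=$2"
        return 1
    fi
}

# Hypothetical usage, assuming SSH access to the server 192.168.0.101:
#   versions_match "$(gcc --version | head -n 1)" \
#                  "$(ssh 192.168.0.101 'gcc --version | head -n 1')"
#   versions_match "$(ld --version | head -n 1)" \
#                  "$(ssh 192.168.0.101 'ld --version | head -n 1')"

# Self-contained demonstration with example values:
versions_match "13.2.1" "13.2.1"
```

Comparing the full first line of the version output, rather than only the major version, also catches differences in the distribution patch level.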