Minimize Debian

More than once I have needed to create a minimal environment, and most of the time I decided to use Debian. It is an operating system I know relatively well, it is widely used, and it has a ton of officially packaged software.

The occasions on which I needed such a system have been very different, and I’m sure there are many other use cases where a small environment has many benefits.

The main reasons were:

  • reduce the amount of data downloaded when installing and fully upgrading the system

  • avoid programs running in the background "by accident", especially on systems with limited RAM, disk, and CPU capacity

  • remove redundant data, since when working in a virtual environment I can normally access documentation and other tools on the host

  • avoid throwing away perfectly working SD cards that are no longer suitable for smartphones, as modern (or "smart") applications tend to create a lot of unnecessary data

So sometimes the constraint was a physically limited medium, while most of the time it was about speed and size.

The Debian project even has an official guide with some tips on how to reduce the installation footprint, and so does Ubuntu.

Minimize installation footprint

A minimal installation is probably the easiest starting point for creating the smallest possible system. But even with a "normal" (desktop) installation, it is relatively easy to remove a lot of data.

First, it’s necessary to see what’s installed on the system. To list all installed packages, execute one of the following commands:

apt list --installed
aptitude search '~i'
dpkg --list
dpkg-query -l
dpkg-query -f '${binary:Package}\n' -W
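When hunting for space, it can help to see the same list ordered by size. The following is a minimal sketch, assuming the Installed-Size field (reported by dpkg in KiB) is a good enough estimate; the helper name rank_by_size is made up for this example.

```shell
#!/bin/sh
# rank_by_size [N]: read "<size-in-KiB><TAB><package>" lines on stdin and
# print the N largest entries (default 20). The name is hypothetical.
rank_by_size() {
    sort -t "$(printf '\t')" -k1,1nr | head -n "${1:-20}"
}

# Only query the package database where dpkg-query is available.
if command -v dpkg-query >/dev/null 2>&1; then
    dpkg-query -W -f '${Installed-Size}\t${binary:Package}\n' | rank_by_size 20
fi
```

The biggest entries are not necessarily removable, but they are a good place to start looking.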

Then look through the list for packages that are obviously not necessary.

For example, avoid having multiple browsers or mail clients installed, remove never-used programs or games, and so on.

Both synaptic (in case there is a GUI) and aptitude can show categories of software.

You might not need database software or amateur radio packages. Unless you develop software, debug packages and most development packages can be removed too. Similar arguments hold for every category, depending on the main usage of the environment.
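Without synaptic or aptitude, a rough overview of the categories can be obtained from the Section field that dpkg stores for each package. This is only a sketch; the helper name count_by_section is an assumption of this example.

```shell
#!/bin/sh
# count_by_section: read "<section><TAB><package>" lines on stdin and print
# "<count><TAB><section>", largest sections first. The name is hypothetical.
count_by_section() {
    awk -F '\t' '{ n[$1]++ } END { for (s in n) printf "%d\t%s\n", n[s], s }' |
        sort -t "$(printf '\t')" -k1,1nr -k2,2
}

# Only query the package database where dpkg-query is available.
if command -v dpkg-query >/dev/null 2>&1; then
    dpkg-query -W -f '${Section}\t${binary:Package}\n' | count_by_section
fi
```

Sections such as games or devel that you do not expect on the system are good candidates for a closer look.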

Remove unused packages and files

On a minimal, already working installation, there are of course fewer packages to remove, and thus fewer chances to break the system. On a full-fledged desktop environment, it is harder to see why some packages have been installed (probably as a dependency of other packages), so some research is needed to understand which packages are really necessary, which are not, and why they were installed in the first place.

Language packages, for example, are low-hanging fruit.

All packages that match the patterns hyphen-<lang>, aspell-<lang>, manpages-<lang>, task-<lang>, i<lang>, firefox-esr-l10n-<lang>, or l10n, and that can be removed without breaking any dependency, can go if you feel comfortable using an English system.
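As a starting point, some of these patterns can be turned into a filter over the package names. The sketch below is deliberately conservative: it assumes two-letter language codes, and it leaves out task-<lang> and i<lang> because they are harder to match safely with a simple pattern. The helper name l10n_candidates is made up for this example.

```shell
#!/bin/sh
# l10n_candidates: read package names on stdin and print those that look
# like localization packages. The name is hypothetical.
l10n_candidates() {
    grep -E -e '^(hyphen|aspell|manpages)-[a-z]{2}(-.*)?$' \
            -e '-l10n(-|$)' || true
}

# Only query the package database where dpkg-query is available.
if command -v dpkg-query >/dev/null 2>&1; then
    dpkg-query -W -f '${Package}\n' | l10n_candidates
fi
```

Before purging anything, passing the resulting names to apt-get --simulate purge shows what would be removed, including packages that depend on them.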

Kernels tend to occupy a lot of space, so if the newest version works without issues, remove all others.
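To see which kernel packages are candidates, the installed linux-image packages can be compared with the running kernel. This sketch assumes the system has already been rebooted into the newest kernel, so that uname -r reports the one to keep; the helper name old_kernels is an assumption of this example.

```shell
#!/bin/sh
# old_kernels RELEASE: read package names on stdin and print the versioned
# linux-image-* packages that do not belong to RELEASE. Hypothetical name.
old_kernels() {
    grep -E '^linux-image-[0-9]' | grep -v -F -- "$1" || true
}

# Only query the package database where dpkg-query is available.
if command -v dpkg-query >/dev/null 2>&1; then
    # Keep only fully installed packages, then filter the kernels.
    dpkg-query -W -f '${Status}\t${Package}\n' |
        awk -F '\t' '$1 == "install ok installed" { print $2 }' |
        old_kernels "$(uname -r)"
fi
```

The function only prints names; removing them is left as a separate, deliberate step.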

Documentation is also optional for a working system. This holds especially in a virtualized environment, and even more so if the host has the same packages installed: there is probably no need for two copies of the documentation. Also, most command-line utilities have built-in help, so in some use cases man pages are redundant on the host system too.

So packages like man-db, manpages*, and *-doc can be removed, again, if they do not break any dependency.

Do not install optional dependencies by default

Optional dependencies can add value and new functionality, but they might not be needed.

By default, on a minimal system, it might make more sense to install packages with only the necessary dependencies, and add the optional ones later if needed.

When installing new programs, apt accepts the --no-install-recommends option. Alternatively, it is possible to change the default behaviour by adding APT::Install-Recommends "0"; APT::Install-Suggests "0"; to /etc/apt/apt.conf.
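Instead of editing the main file, the same two settings can live in a separate drop-in file. The sketch below assumes the usual /etc/apt/apt.conf.d directory; the file name 99minimal and the function name are made up for this example.

```shell
#!/bin/sh
# write_apt_minimal DIR: write the two settings to DIR/99minimal.
# Both the function name and the file name are hypothetical.
write_apt_minimal() {
    cat > "${1%/}/99minimal" <<'EOF'
APT::Install-Recommends "0";
APT::Install-Suggests "0";
EOF
}

# Example (as root):
#   write_apt_minimal /etc/apt/apt.conf.d
# The effective values can then be checked with:
#   apt-config dump | grep -E 'Install-(Recommends|Suggests)'
```

A drop-in file is easy to delete again if a later installation turns out to need the recommended packages.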

Compression settings

It is possible to compress the package indexes.

This will probably increase CPU usage, so it is not obvious whether enabling this setting makes sense:

Acquire::GzipIndexes "true";
Acquire::CompressionTypes::Order:: "gz";

Remove package components

The Debian package system allows excluding (and re-including) specific paths, so that unnecessary files are not installed on the system.

Such configuration files are located under /etc/dpkg/dpkg.cfg.d.

For example

path-exclude /usr/share/man/??
path-exclude /usr/share/man/??_*

would avoid saving any localized man pages on disk, regardless of which packages are installed.

Alternatively, avoid installing man pages altogether and exclude the whole /usr/share/man directory to be sure:

path-exclude /usr/share/man/*

The same holds generally for other documentation systems:

path-exclude /usr/share/doc/*
path-exclude /usr/share/doc-base/*
path-exclude /usr/share/gtk-doc/*
path-exclude /usr/share/help/*
path-exclude /usr/share/info/*
path-exclude /usr/share/man/*
path-exclude /usr/share/man-db/*

Unless the system gets distributed, licenses and copyright notices can be removed too:

path-exclude /usr/share/common-licenses/*
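If the system might be distributed after all, the whitelist side of the mechanism helps: a path-include rule placed after a path-exclude rule re-includes the matching files. The sketch below drops package documentation but keeps each package's copyright file; the file name 01-nodoc and the function name are assumptions of this example.

```shell
#!/bin/sh
# write_dpkg_nodoc DIR: write an exclude/include pair to DIR/01-nodoc.
# Both the function name and the file name are hypothetical.
write_dpkg_nodoc() {
    cat > "${1%/}/01-nodoc" <<'EOF'
# Drop package documentation...
path-exclude /usr/share/doc/*
# ...but keep the copyright file of every package.
path-include /usr/share/doc/*/copyright
EOF
}

# Example (as root):
#   write_dpkg_nodoc /etc/dpkg/dpkg.cfg.d
```

The order of the two rules matters, since the later rule overrides the earlier one for the paths it matches.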

Depending on the programs in use, other files from the shared folder can be removed too:

path-exclude /usr/share/groff/*
path-exclude /usr/share/linda/*
path-exclude /usr/share/lintian/*

Virtual machines, chroots, and other minimal environments normally do not need the wireless modules, so those can be removed too. I’m mentioning them explicitly because, on my systems, they are currently one of the biggest directories under /usr/lib/modules/. The same holds for Bluetooth, and probably for other modules and drivers, depending on the platform.

path-exclude /usr/lib/modules/*/kernel/drivers/net/wireless/*

Note that these settings might break system upgrades. For example,

path-exclude /usr/share/nvim/runtime/doc/*

broke the upgrade process, as the updated package looked for a file in this directory.
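The exclusions are only applied when dpkg unpacks a package, so files that are already on disk stay there until the package is reinstalled or the files are removed by hand. A minimal sketch of the manual route follows, assuming GNU find (for -mindepth and -delete); the function name prune_dirs is made up for this example.

```shell
#!/bin/sh
# prune_dirs DIR...: delete everything below each DIR but keep DIR itself.
# The name is hypothetical; missing directories are skipped silently.
prune_dirs() {
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        find "$dir" -mindepth 1 -delete
    done
}

# Example (as root), matching two of the exclusions listed above:
#   prune_dirs /usr/share/man /usr/share/info
```

Keeping the directories themselves avoids surprising packages that expect them to exist.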

Other possibilities

Hibernation

On a virtualised system, hibernation is hardly useful:

# Remove the resume (hibernation) configuration; rebuild the initramfs only if the file existed.
if rm /etc/initramfs-tools/conf.d/resume 2>/dev/null; then
  update-initramfs -u >/dev/null
fi

Remove settings and clean leftovers

Also remember to remove all unnecessary dependency packages and leftover settings:

apt-get --assume-yes autoremove --purge >/dev/null;
apt-get --assume-yes clean >/dev/null;
apt-get --assume-yes autoclean >/dev/null;
dpkg -l | awk '/^rc/ {printf( "%s%c", $2, 0 )}' | xargs --null --no-run-if-empty dpkg --purge;

With aptitude:

aptitude clean;
aptitude autoclean;
aptitude purge '~c';

Remove cache files

After setting up a system, especially if it needs to be shared with someone else, it might be nice to clean logs and other cache files. Python and other programs, like the bash shell or the file manager, might have created such files automatically, like the .cache and __pycache__ folders, and the *.pyc, .xsession-errors and Thumbs.db files.

Find which package a file belongs to
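Such leftovers can be located with find before deciding what to delete. The following sketch only prints the matches; the function name find_cache_leftovers is an assumption of this example, and the list of names is the one mentioned above, not an exhaustive one.

```shell
#!/bin/sh
# find_cache_leftovers [DIR]: print cache directories and files below DIR
# (default: current directory). The name is hypothetical.
find_cache_leftovers() {
    find "${1:-.}" \
        \( -type d \( -name __pycache__ -o -name .cache \) -prune -print \) -o \
        \( -type f \( -name '*.pyc' -o -name .xsession-errors -o -name Thumbs.db \) -print \)
}

# Example: find_cache_leftovers /home
```

Matching directories are printed once and not descended into, so their contents do not clutter the output.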

To find out what Debian package a particular file belongs to:

dpkg -S <path to file>;

If possible, it is better to remove the offending package altogether instead of only some of its files.

Keep disk usage under control

du is surely able to do all of these tasks, but I found ncdu very practical to use, as it provides an interactive interface:

ncdu -x /;

It is possible to navigate into subfolders, see which files and directories take the most space, and delete them if needed.
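Where ncdu is not installed, a similar (non-interactive) overview can be approximated with plain du. This is a rough sketch, assuming a du that supports -d for the depth limit; the function name biggest_dirs is made up for this example.

```shell
#!/bin/sh
# biggest_dirs [DIR] [N]: print the N largest entries one level below DIR,
# sizes in KiB, staying on one filesystem. The name is hypothetical.
biggest_dirs() {
    du -x -k -d 1 "${1:-.}" 2>/dev/null | sort -k1,1nr | head -n "${2:-10}"
}

# Example: biggest_dirs / 15
```

The first line is the total for the directory itself; running the function again on one of the listed entries drills down one level at a time.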

For verifying how much disk space has been used, df does a good job:

df -h;