Git logo by Jason Long is licensed under the CC BY 3.0 License

Automate git interactive add


It might seem a contradiction: isn’t the point of an interactive add to be… interactive?

For a non-interactive add, isn’t it sufficient to pass git add the list of files to stage?

No.

The interactive git add is very useful if one wants to stage only some of the changes made to one or more files, but not all of them. Git is a content tracker, not a file tracker, and with git add --interactive (or git add --patch), one can avoid committing the whole file.

A common use-case: a transformation tool applied to your codebase. This can produce a big diff, and reviewing it all at once is a daunting task and probably not a very productive way to spend your time.

For example, the modernize-use-override check of clang-tidy does multiple things.

It adds the keyword override to the virtual destructor of subclasses. This is, for many developers, unexpected, but the change is technically correct. It also adds override to overriding functions, as one would expect, and it removes their virtual keyword, which is redundant because an overriding function is always virtual.

I would personally prefer to make three separate commits, especially because many projects do not want two of the three changes. Having all changes in a single commit makes it more difficult to revert only a part of it.

Thanks to git add --patch, it is trivial to add only the wanted changes. If needed, one can then use git checkout --patch to discard the changes we do not want.

But if there are a lot of changes, no matter how trivial, being able to automate it is important. Looking at thousands of files takes time, and surely some errors would slip in.

For those tasks, it also makes more sense to not drop anything and save everything as a separate commit. This way, the work is saved somewhere safe, and it is always possible to drop a commit afterward.

The package patchutils provides the binary grepdiff.

grepdiff makes it possible to find all changes that match a given regex in a patch, just like grep.

Unlike grep, it is also able to create a new patch that contains only the matching hunks. Thanks to the new patch, it is possible to tell git exactly what to stage with git apply --cached.
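To see the filtering in isolation, here is a small sketch that runs grepdiff on a hand-written zero-context patch, without involving git at all. The patch contents and the file name b.h are made up, and the example assumes patchutils is installed (it does nothing otherwise).

```shell
# A small hand-written zero-context patch; b.h and its contents are made up.
patch_file=$(mktemp)
cat > "$patch_file" <<'EOF'
--- a/b.h
+++ b/b.h
@@ -2 +2 @@
-  virtual ~B();
+  ~B() override;
@@ -4 +4 @@
-  virtual void f();
+  void f() override;
EOF

if command -v grepdiff >/dev/null 2>&1; then
    # Without further options, grepdiff reports which files have a match.
    grepdiff 'virtual ~' "$patch_file"

    # With --output-matching=hunk, the result is itself a patch that
    # contains only the hunks matching the regex.
    filtered=$(grepdiff 'virtual ~' --output-matching=hunk "$patch_file")
    printf '%s\n' "$filtered"
else
    echo 'grepdiff not found; install patchutils to run this example' >&2
fi
```

The filtered patch should keep the destructor hunk and drop the one for f().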

For example, after using modernize-use-override with run-clang-tidy, the following command would commit all changes to destructors that were marked as virtual, where virtual has been removed, and override added.

git diff --unified=0 | grepdiff 'virtual ~' --output-matching=hunk | git apply --cached --unidiff-zero && git commit

This command would commit the changes to all destructors that were, for good or bad, not marked as virtual (those marked as virtual have already been processed) and that only gained override.

git diff --unified=0 | grepdiff '~' --output-matching=hunk | git apply --cached --unidiff-zero && git commit

Now it’s the turn of functions marked as virtual (destructors excluded, as those have already been processed). This commit adds override to those functions and removes virtual from them.

git diff --unified=0 | grepdiff 'virtual' --output-matching=hunk | git apply --cached --unidiff-zero && git commit

The remaining changes are functions that are not marked virtual but do override some other functions. All other cases have already been processed.
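The three commands differ only in the regex, so they can be wrapped in a small function. This is a sketch, not a polished tool: commit_matching is a hypothetical helper name, the commit messages and the header b.h are made up, and the sed call merely imitates the edits of the tool. The demo part runs only if grepdiff is installed.

```shell
# Hypothetical helper (not part of git or patchutils): stage every
# zero-context hunk matching the regex in $1, then commit with message $2.
commit_matching() {
    git diff --unified=0 \
        | grepdiff "$1" --output-matching=hunk \
        | git apply --cached --unidiff-zero \
        && git commit --quiet -m "$2"
}

# Throwaway repository with a made-up header.
repo=$(mktemp -d)
git -C "$repo" init --quiet
git -C "$repo" config user.name demo
git -C "$repo" config user.email demo@example.com
cat > "$repo/b.h" <<'EOF'
struct B : A {
  virtual ~B();
  int x;
  virtual void f();
  int y;
  void g();
};
struct C : A {
  ~C();
};
EOF
git -C "$repo" add b.h
git -C "$repo" commit --quiet -m 'initial'

# Imitate the tool: drop "virtual", append "override".
sed -e 's/virtual //' -e 's/();/() override;/' "$repo/b.h" > "$repo/b.h.new"
mv "$repo/b.h.new" "$repo/b.h"

if command -v grepdiff >/dev/null 2>&1; then
    (
        cd "$repo" || exit 1
        # Order matters: later regexes only see what earlier commits left.
        commit_matching 'virtual ~' 'Destructors: replace virtual with override'
        commit_matching '~' 'Destructors: add override'
        commit_matching 'virtual' 'Functions: replace virtual with override'
    )
    commits=$(git -C "$repo" rev-list --count HEAD)
    remaining=$(git -C "$repo" diff --unified=0 | grep '^[+-][^+-]')
    printf 'commits: %s\nremaining:\n%s\n' "$commits" "$remaining"
fi
```

If no hunk matches, grepdiff produces no patch and git apply should fail on the empty input, so the && skips the commit instead of creating an empty one. What remains in the working tree afterwards is the change to g(), which was never marked virtual.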

Note that this approach is not error-free.

First of all, code can be formatted in strange ways, so grepping is not foolproof. Having a consistently formatted codebase is important if one wants to easily automate various tasks.

But even if grepping worked perfectly, a hunk sometimes spans more than one changed line, even with --unified=0. This happens, for example, when two consecutive lines have been changed. grepdiff then selects or rejects the hunk as a whole, including the lines that do not match.
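The effect is easy to reproduce with git alone. In this sketch (the header b.h is made up), a destructor and a function on adjacent lines are both changed, and the zero-context diff still contains a single hunk.

```shell
# Throwaway repository; b.h and its contents are made up for the demo.
repo=$(mktemp -d)
git -C "$repo" init --quiet
git -C "$repo" config user.name demo
git -C "$repo" config user.email demo@example.com
cat > "$repo/b.h" <<'EOF'
struct B : A {
  virtual ~B();
  virtual void f();
};
EOF
git -C "$repo" add b.h
git -C "$repo" commit --quiet -m 'initial'

# Both adjacent lines change.
cat > "$repo/b.h" <<'EOF'
struct B : A {
  ~B() override;
  void f() override;
};
EOF

# Count the hunk headers of the zero-context diff.
hunks=$(git -C "$repo" diff --unified=0 | grep -c '^@@')
echo "hunks: $hunks"
```

A regex such as 'virtual ~' matches this hunk, so selecting it at hunk level would also pull the change to f() into the destructor commit.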

In my case, the fastest approach was to use git restore --staged on the whole affected file and then manually stage only the wanted changes with git add --patch.

