Clean zip archives

Categories: scripting
Keywords: archive scripting zip

When creating an archive, macOS users might add files like .DS_Store and the directory __MACOSX by accident.

Users of macOS do not even see those files, as they are hidden on their operating system.

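Finder uses .DS_Store to remember how a folder is displayed, and the macOS archiver stores resource forks and extended attributes in __MACOSX as one "._" companion file per archived file, which is why that directory looks like a copy of all files. Neither is needed to use the archived files on any system (Windows, macOS, GNU/Linux, …), so they just take up space.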

As those files are not needed by anyone, it is safe to remove them from an existing archive (here archive.zip):

zip -d archive.zip '*/.DS_Store' '*/__MACOSX/*' '*/Thumbs.db';

Similar issues can also happen in other environments. For example, Windows Explorer generates hidden Thumbs.db files. Other programs generate hidden or cache files too: Picasa creates the hidden .picasa.ini file, and Python creates the __pycache__ folder and *.pyc files.

Given an archive, it is possible to remove all those files with the following command:

zip -d archive.zip '*/.DS_Store' '*/.[Pp]icasa.ini' '*/__MACOSX/*' '*/Thumbs.db' '*/__pycache__/*' '*.py[co]';
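
Before deleting anything, it can help to preview which entries would be affected. The following is a minimal sketch and not part of the commands above: list_junk is a hypothetical helper name, and its regular expression only approximates the zip patterns (unlike them, it also matches entries at the top level of the archive).

```shell
# Hypothetical helper: reads archive member names on stdin (one per line)
# and prints only those that look like OS or tool leftovers.
list_junk() {
  grep -E '(^|/)(\.DS_Store|Thumbs\.db|\.[Pp]icasa\.ini)$|(^|/)(__MACOSX|__pycache__)/|\.py[co]$'
}
```

Assuming the archive is called archive.zip and zipinfo (shipped with Info-ZIP unzip) is installed, the preview would be:

zipinfo -1 archive.zip | list_junk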

To create an archive of a directory while excluding those files:

zip -r archive.zip directory -x '*/.DS_Store' '*/.[Pp]icasa.ini' '*/__MACOSX/*' '*/Thumbs.db' '*/__pycache__/*' '*.py[co]'

In case you already have extracted the archive, it is possible to delete those files with:

find archive -type f \( -name '.DS_Store' -o -name '.[Pp]icasa.ini' -o -name '*.py[co]' -o -name 'Thumbs.db' \) -exec rm -f {} + -o -type d \( -name '__MACOSX' -o -name '__pycache__' \) -prune -exec rm -rf {} +;
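
Since that command deletes files irreversibly, a dry run is a reasonable first step. The sketch below is a variant of the find invocation above in which -print replaces both rm actions, so it only lists what would be removed. find_junk is a hypothetical helper name, and its name list includes .DS_Store.

```shell
# Dry run: print the files and directories the cleanup would delete,
# without deleting anything. Pass the extracted directory as $1.
find_junk() {
  find "$1" \
    -type f \( -name '.DS_Store' -o -name '.[Pp]icasa.ini' -o -name '*.py[co]' -o -name 'Thumbs.db' \) -print \
    -o -type d \( -name '__MACOSX' -o -name '__pycache__' \) -prune -print
}
```

For a directory named archive, find_junk archive prints one path per line. Matching directories are pruned, so their contents are not listed separately.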
