Clean zip archives

When creating an archive, macOS users might add files like .DS_Store and the directory __MACOSX by accident.

Users of macOS do not even see those files, as they are hidden on their operating system.

It is not obvious why those files are added automatically, but no system (Windows, macOS, GNU/Linux, …) needs them to read the archive, and they just take up space. This is especially true of __MACOSX, which mirrors the directory tree with a ._-prefixed metadata entry for each file, so it looks like a copy of everything.

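As those files are not needed by anyone, it is safe to remove them. Note that __MACOSX usually sits at the root of the archive, where a pattern starting with */ does not match, so both forms are listed:

zip -d archive.zip '.DS_Store' '*/.DS_Store' '__MACOSX/*' '*/__MACOSX/*' '*/Thumbs.db'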

Similar issues also occur in other environments. For example, Windows Explorer generates hidden Thumbs.db files. Other programs leave hidden or cache files in various folders too: Picasa creates the hidden .picasa.ini file, and Python creates the __pycache__ folder or *.pyc files.

Given an archive, it is possible to remove all those files with the following command:

zip -d archive.zip '.DS_Store' '*/.DS_Store' '__MACOSX/*' '*/__MACOSX/*' '*/Thumbs.db' '*/.[Pp]icasa.ini' '*/__pycache__/*' '*.py[co]'
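
Since zip -d modifies the archive in place, it can be worth previewing which entries would be affected first. The following is a minimal sketch, not part of the original workflow: junk_entries is a hypothetical helper name, and the expression is an assumed translation of the patterns above into a grep -E filter that reads entry names from standard input.

```shell
# junk_entries: read archive entry names on stdin (one per line) and print
# those that look like OS or tool residue. Name and pattern are illustrative.
junk_entries() {
    grep -E '(^|/)(\.DS_Store|Thumbs\.db|\.[Pp]icasa\.ini)$|(^|/)(__MACOSX|__pycache__)/|\.py[co]$'
}
```

With unzip installed, unzip -Z1 archive.zip | junk_entries prints one matching entry per line, and prints nothing if the archive is already clean.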

For convenience, you might want to add something similar to the following zip-clean script in your PATH:

#!/bin/sh

zip -d "$1" '.DS_Store' '*/.DS_Store' '__MACOSX/*' '*/__MACOSX/*' '*/Thumbs.db' '*/.[Pp]icasa.ini' '*/__pycache__/*' '*.py[co]'

To avoid archiving those hidden files in the first place, use the option -x, which excludes files matching the given patterns when the archive is created.
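
As an illustration, the sketch below wraps such a call in a small function. The name make_clean_zip and its two parameters are made up for this example, and the pattern list is limited to the macOS and Windows files mentioned so far; -r and -x are regular zip options.

```shell
# make_clean_zip ARCHIVE DIR: hypothetical helper that archives DIR
# recursively while skipping .DS_Store, __MACOSX and Thumbs.db entries.
make_clean_zip() {
    zip -r "$1" "$2" -x '.DS_Store' '*/.DS_Store' '__MACOSX/*' '*/__MACOSX/*' '*/Thumbs.db'
}
```

Calling make_clean_zip archive.zip folder from the parent directory of folder should produce an archive without those entries.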

As the option -x may appear anywhere on the command line, you can drop the following script, here named cleanzip, in your PATH:

#!/bin/sh

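# The lone @ ends the -x list, so the caller's arguments are not read as exclusion patterns.
exec zip -x '.DS_Store' '*/.DS_Store' '__MACOSX/*' '*/__MACOSX/*' '*/Thumbs.db' '*/.[Pp]icasa.ini' '*/__pycache__/*' '*.py[co]' @ "$@"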

and use it in place of zip, for example cleanzip -r archive.zip folder.

If you have already extracted the archive, or generally want to delete those files from a directory tree (here named archive), you can do so with:

find archive -type f \( -name '.DS_Store' -o -name '.[Pp]icasa.ini' -o -name '*.py[co]' -o -name 'Thumbs.db' \) -exec rm -f {} + -o -type d \( -name '__MACOSX' -o -name '__pycache__' \) -prune -exec rm -rf {} +
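
Because rm -rf is unforgiving, a dry run can be useful before the real cleanup. The sketch below uses the same kind of tests as the command above, but with -print in place of the rm calls, so nothing is deleted; list_junk is a hypothetical name chosen for this example.

```shell
# list_junk DIR: print what a cleanup of DIR would delete, without deleting.
# Matching directories are printed once and not descended into (-prune).
list_junk() {
    find "$1" \
        -type d \( -name '__MACOSX' -o -name '__pycache__' \) -prune -print \
        -o -type f \( -name '.DS_Store' -o -name 'Thumbs.db' \
                      -o -name '.[Pp]icasa.ini' -o -name '*.py[co]' \) -print
}
```

Running list_junk archive shows the candidates; once the list looks right, the find command with rm can be run on the same directory.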
