Digital audio library
After recovering, I put some thought into how to organize the audio library.
I am not an audiophile; in fact, I normally do not listen to music at all, and I definitely do not sing, except in extremely rare circumstances.
Nevertheless, I like to have an overview of what I have, be it inside my PC or physical objects.
So I could either toss the CD and audio library I have or try to organize it. As I’ve spent so much time recovering the digital archive, I obviously decided not to toss it.
The first step was deciding to digitize all the CDs I own. It makes it easier to have a consistent overview. The digitized CDs are currently in a box, unused.
This is not the first thing I did, but in retrospect it should have been the first step: create an account on MusicBrainz.
As it only requires an email address, an account can be totally anonymous.
I suppose there are similar websites; the main reason I created the account is that multiple programs use the MusicBrainz database as a backend for querying metadata about music files, like artist name, album, genre, and so on.
There would have been no reason to create an account if I had only wanted to query the database.
In fact, I had used it that way on multiple occasions for years before reorganizing my music collection.
But I knew I had some CDs that were not on MusicBrainz, or whose entries were incomplete, or where the cover art or some other information was missing.
Multiple reasons helped me decide to create an account and invest some time in adding the data I needed:
if I ever need to retag my music collection again, I will not need to add the metadata by hand a second time
if there are some errors, those might be caught and corrected by others
if no metadata is missing, I have one unified way to handle all of it
After just a couple of days, someone else had already reviewed my first contributions and corrected the typos I had made. Thus it already paid off during the first week.
Most things to do are trivial (adding cover art, missing artist, genre, …), while adding a new release is less intuitive, but well documented.
The next step was to look for a program to copy all the discs to the PC’s drive.
Many audio players can do it, but I preferred a separate program, for two main reasons.
The first is that I had not yet decided how to manage my music collection, for example which format to use and how to organize all the files. The second is that I might change the music player, and I definitely want to always use the same process for ripping my CDs to avoid, for example, too many differences in audio quality.
My main requirement was that the ripping process should automatically query MusicBrainz and embed at least some pieces of information, and that I should not have to fiddle with too many settings (audio quality, …).
I decided to use abcde (A Better CD Encoder). I did not make any comparison, but I liked the name. With Debian’s default settings (create .ogg files, search for metadata on MusicBrainz) I was happy enough not to change anything.
I was unsure whether .ogg is a well-supported format. As far as I could see, it is supported on Android, Rockbox, and PCs. Good enough for me.
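The two defaults mentioned above can also be pinned explicitly in ~/.abcde.conf, in case another distribution ships different ones. This is only a minimal sketch, not the configuration actually used here: the output directory and the naming pattern are assumptions chosen for illustration.

```shell
# ~/.abcde.conf (sketch; abcde reads this file as shell variable assignments)

# Look up the metadata on MusicBrainz
CDDBMETHOD=musicbrainz

# Encode to Ogg Vorbis
OUTPUTTYPE=ogg

# Assumption: where the ripped files should end up
OUTPUTDIR="$HOME/Music"

# Assumption: "artist/album/NN title" naming; the single quotes are
# intentional, abcde expands these placeholders itself for every track
OUTPUTFORMAT='${ARTISTFILE}/${ALBUMFILE}/${TRACKNUM} ${TRACKFILE}'
```

With such a file in place, running abcde without further options should behave the same on every machine.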
| Obviously I’ve inherited a device that I like and does not support |
I explicitly wanted to avoid something like FLAC because it simply takes too much space. Yes, external hard drives might be (relatively) cheap, but the storage of portable mp3 players and phones is small in comparison, and loading gigantic files also hurts battery life. Of course I could simply convert those files to mp3 or something else on the fly, but it would make synchronizing devices much more difficult. For example: do I have all the songs with the corrected metadata on my phone? If the formats are different, I need to somehow compare the metadata by opening all the files. If the format is the same, a binary comparison is sufficient (and also permits finding files that are invalid, or files that were changed by accident).
Currently, synchronizing is a simple drag-and-drop from any file manager, or a run of a diff tool, and I like it being so simple (and fast!).
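The binary comparison can be done with nothing more than find and cmp. The following is a minimal sketch; the function name compare_trees is made up for this example, and it assumes that the device is mounted as a normal directory.

```shell
# List files that are missing on the device or that differ byte for byte.
# Usage: compare_trees <library dir> <device dir>
compare_trees() {
    src=$1
    dst=$2
    # Walk the library (following symlinks); paths are relative to $src
    (cd "$src" && find -L . -type f) | sort | while IFS= read -r f; do
        if [ ! -f "$dst/$f" ]; then
            printf 'missing: %s\n' "$f"
        elif ! cmp -s "$src/$f" "$dst/$f"; then
            printf 'differs: %s\n' "$f"
        fi
    done
}
```

For example, compare_trees ~/Music /media/player (the mount point is hypothetical) prints one line for every file that still needs to be copied or that changed on one side. Files that exist only on the device are not reported; swapping the two arguments covers that case.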
While copying all files to my PC, I used Picard to find the exact information about each CD by comparing the barcodes (and I added those that were missing).
I also followed Picard’s default folder structure convention, artist name/album name, as it seems a sensible decision, but I preferred to manage the structure manually, as there are a couple of exceptions.
The biggest one is that such a folder structure does not prevent collisions. For example, 50 Cent released "Get Rich or Die Tryin'" (identical album name, and of course same artist) in different years, like 2003 and 2005.
For those albums, I also added the year to the folder name.
Of course, there are other possible collisions; it is not uncommon for a release to appear in a different part of the world with the same name, but not necessarily the same content. This issue could be avoided by adding the MusicBrainz ID (MBID) of every album to the folder name. As those collisions are rare in my library, it is probably not worth it.
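To make the naming scheme concrete, here is a small sketch of a helper that builds the folder name. The function album_dir and the exact layout, "album (year) [MBID]", are assumptions for illustration; the text above only says that the year (or the MBID) becomes part of the folder name, not how.

```shell
# Build the folder name for a release.
# Usage: album_dir <artist> <album> [year] [mbid]
album_dir() {
    artist=$1
    album=$2
    year=${3:-}
    mbid=${4:-}
    dir="$artist/$album"
    # Append the year only when it is needed to tell two releases apart
    if [ -n "$year" ]; then
        dir="$dir ($year)"
    fi
    # Optionally append the MusicBrainz ID, which is unique per release
    if [ -n "$mbid" ]; then
        dir="$dir [$mbid]"
    fi
    printf '%s\n' "$dir"
}
```

Calling album_dir "Some Artist" "Some Album" 2003 prints Some Artist/Some Album (2003), while leaving out the year gives the plain artist name/album name form.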
| In case of a collision in the naming scheme, if you decide that Picard should move the files to the corresponding folders, rest assured that it would delete any file. For example, instead of overwriting |
Some music files, like Classical Music, Audio Tracks, Ringtones, Audiobooks, Kids Music or Podcasts, are in a separate sub-folder, because the way I consume those is very different.
I am mainly using "handcrafted"
.m3u files, located in the
The paths are relative, so that it is easy to copy those between devices, for example
#EXTM3U
#PLAYLIST: Christmas
../Compilation/Its Christmas/01 John Yoko _ Plastic Ono Band - Happy Xmas (War Is Over).ogg
../Compilation/Its Christmas/02 Band Aid - Do They Know It's Christmas_.ogg
../Compilation/Its Christmas/03 Roy WoodwithWizzard - I Wish It Could Be Christmas Everyday.ogg
Obviously the files are not completely handcrafted; for example, I used
printf '#EXTM3U\n\n#PLAYLIST: Christmas\n' > christmas.m3u
find "../Compilation/Its Christmas" -xtype f \( -name '*.ogg' -o -name '*.flac' -o -name '*.mp3' \) | sort >> christmas.m3u
# and other folders containing mainly christmas music
for creating the playlists.
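Since the paths are relative, a playlist silently breaks when an album folder is renamed. A minimal sketch of a check for that follows; check_playlist is a made-up name, and the sketch assumes one entry per line with Unix line endings.

```shell
# Print every playlist entry that does not point to an existing file.
# Usage: check_playlist <playlist.m3u>
check_playlist() {
    playlist=$1
    base=$(dirname "$playlist")
    # The "|| [ -n ... ]" part also handles a last line without newline
    while IFS= read -r entry || [ -n "$entry" ]; do
        case $entry in
            ''|'#'*) continue ;;          # blank lines and #EXT... directives
            /*) path=$entry ;;            # absolute entry, use as-is
            *) path=$base/$entry ;;       # relative to the playlist itself
        esac
        if [ ! -e "$path" ]; then
            printf 'broken: %s\n' "$entry"
        fi
    done < "$playlist"
}
```

Running check_playlist christmas.m3u prints nothing when every entry can be found, and one "broken:" line for each entry that cannot.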
| I am using |
For some time I used Amarok; it works reasonably well and supports both GNU/Linux and Windows systems. Nowadays I mostly use cmus (also available on Windows thanks to Cygwin), but as long as a player
does not touch or reorganize my music files (as some players like to do)
does not spin up the disk unnecessarily
shows a view by "Album Artist" and "Album"
shows some metadata
I should be happy with it.
A big plus is being able to open the audio player with a given directory as a parameter, for example if I want to play some classical music. I know it is a small subset located at a specific place, and if I change some settings, I would prefer that the change not affect the whole audio library.
Another plus is support for .nomusic files, or something similar.
These notes would be a complete waste of time if they did not mention how I am ensuring that there is a backup of the digital library.
I am currently using two methods, simply because I am a slow learner.
The first is a copy of the ~/Music folder on an external drive.
It is a manual process, so I do not do it regularly, but it is a foolproof process and easy to understand.
The second, "automated" method involves git annex.
The short explanation is that my ~/Music folder is a git repository and that all files are read-only. When I want to change something (for example, correct some metadata), the files need to be unlocked/made writable, edited, and then committed (which makes them read-only again).
After committing, changes are pushed to a different machine, thus ensuring a second copy of all files.
The first question would be: why not use git directly instead of git annex?
There are multiple reasons; the first one is that git is slow when handling large binary files.
The second reason is that such a repository would get very big after a few commits, and I am generally not interested in the whole history of an audio file. If it cost an unnoticeable amount of space, I would not have anything against it, but if it makes the ~/Music folder bigger by a factor of ten or a hundred, then it is not an issue that can simply be ignored.
git annex solves both problems by replacing the files with a symlink to a read-only file inside the
Using symlinks adds many new possibilities, but unfortunately also a lot of complexity.
git annex strives to permit every workflow while keeping the interface as simple as possible, but as I am not using it regularly (my music collection does not change that often), I am still unsure how some operations work, as there are some differences compared to git. The walkthrough provides a good overview, but there are some limitations that I do not like (mainly Windows support).
For the sake of completeness, this is an overview of the most used commands.
When adding files, it is possible to use git add or git annex add. In the second case, only a symlink will be committed. This has multiple implications (like the possibility of not pulling all files), but the main advantage is that git remains fast in all its operations.
With git annex add, the file is moved inside .git/annex and replaced with a symlink. The file inside .git/annex is also read-only, making it (generally) not possible to change.
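Because an annexed file is just a symlink, it is easy to check from a script whether a given file is managed by git annex. The helper below is a sketch and not part of git annex itself; the name is_annexed is made up, and the check only looks at where the symlink points.

```shell
# Succeed if <path> is a symlink whose target lies inside .git/annex.
# Usage: is_annexed <path>
is_annexed() {
    # A regular file (for example an unlocked one) is not a symlink
    [ -L "$1" ] || return 1
    case $(readlink "$1") in
        *.git/annex/*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, is_annexed song.ogg && echo "managed by git annex" prints the message only for files that have been replaced by a symlink into .git/annex.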
If one wants to modify an already committed file, it is possible to unlock it with git annex unlock <files and/or directories>, edit it, and add it again with git annex add.
When pushing files with git push, only the symlink is pushed, not the file itself(!).
For pushing the content to a separate PC (or pulling it on my second machine), I mostly use git annex sync --content.
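Put together, the commands above form a short edit cycle. The following sketch wraps them in one function. The names run and annex_edit, the DRY_RUN switch, and the commit message are assumptions added for illustration; depending on the git annex repository version, an unlocked file may behave slightly differently after the commit, so this is a starting point rather than a reference.

```shell
# Run a command, or only print it when DRY_RUN is set to a non-empty value.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Unlock a file, edit it with the given command, and record the result.
# Usage: annex_edit <file> <edit command> [edit command arguments...]
annex_edit() {
    if [ "$#" -lt 2 ]; then
        echo 'usage: annex_edit <file> <edit command...>' >&2
        return 2
    fi
    file=$1
    shift
    # Each step only runs if the previous one succeeded
    run git annex unlock "$file" &&
    run "$@" "$file" &&
    run git annex add "$file" &&
    run git commit -m "Update $file" &&
    run git annex sync --content
}
```

Setting DRY_RUN=1 before calling annex_edit song.ogg some-tag-editor (the editor name is a placeholder) only prints the five commands, which is a cheap way to double-check the sequence before touching the library.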
| Why not use |
git annex supports Windows, but I do not find the current status usable.
Windows and NTFS both support symlinks and read-only files, but git annex does not take advantage of them. It is actually worse: it does not recognize symlinks as such. (Granted, symlinks on Windows are a mess.)
Instead, files are copied, which means that the ~/Music folder would be twice as big. Also, every time I execute the test suite, some tests fail. I am sure those are testing some corner cases, maybe even using strange characters, but it does not inspire much confidence.
For the Windows machines I am interested in, I use the external backup drive. The data does not change that often, and synchronizing by hand is not an issue (the data never changes on the Windows machines; if it does, it is by accident).
git annex is not as widely used as other approaches, and after GitLab dropped support for it, most hopes that its Windows support would improve vanished.
My main fear is that, even if it uses a simple format, it might vanish because there is too little interest in supporting it.
git annex is not only useful as a backup system, but also for synchronizing data. That is, in fact, the main use case for which it was designed.
I am currently not taking advantage of these functionalities.
The main reason is that most, if not all, portable audio players use FAT32 as a filesystem, which does not support symlinks.
Android phones do support filesystems like ext3 or ext4, but cannot cope with SD cards that are not formatted in FAT32 or exFAT (which does not support symlinks either).
git annex actually works with FAT partitions too, but has a different workflow, as there are no symlinks.
At that point, it is just easier (for me) to synchronize those devices by hand, as most audio files do not change, and if they do, it is always from the same machine.
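When copying by hand from a folder full of symlinks to a FAT device, the symlinks have to be replaced by the real files on the way. A minimal sketch using cp follows; export_tree is a made-up name, and the sketch assumes that the content of every file is present on the machine doing the copy.

```shell
# Copy the content of a directory to a device, replacing every symlink
# with the file it points to.
# Usage: export_tree <source dir> <destination dir>
export_tree() {
    mkdir -p "$2" &&
    # -R copies recursively, -L follows symlinks instead of copying them
    cp -RL "$1/." "$2/"
}
```

It is best pointed at a subfolder, for example a single artist, as in export_tree ~/Music/"Some Artist" /media/player/"Some Artist" (both paths are placeholders). Pointing it at the top of the library would also copy the .git directory.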