Recover files from damaged media

Notes published the
7 - 9 minutes to read, 1704 words

That the hardware works correctly is often taken for granted, but sometimes the media where our data is stored might partially break.

The solution is simple: restore the data from your backup.

But what if you do not have a backup?

Then first back up all the data you care about and can still read (a simple copy on a different drive is a good start), and afterwards attempt to recover as much data as possible.

Workspace

Before trying to recover the data, I make some assumptions:

  • You are using a GNU/Linux-based system

  • The system recognizes the faulty device

  • You are trying to recover data from a faulty device, not data you deleted by accident

  • Your workspace has enough free space to hold a copy of the whole drive at least twice

It does not matter if the device is a USB stick or a DVD. In my case, I had to recover the data from a scratched Blu-ray disc. Blu-ray discs can hold a lot of data; mine contained approximately 41GB, so I made sure to have at least 80GB of free space.

41GB would have been the bare minimum, but I want to keep a second copy of the recovered data. As it might not be possible to recover everything, the recovered copy might be corrupt. Depending on the data, it might be possible to repair it, but the repair process alters the data and might destroy some information, so it is recommended to repair only a copy of the recovered data.
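
Checking the free space up front can be scripted. The following is a minimal sketch, assuming a POSIX df; has_enough_space is a helper name I made up for illustration, and the sizes are in KiB because that is what df -Pk reports.

```shell
# has_enough_space REQUIRED_KIB DIRECTORY
# Succeeds if the filesystem holding DIRECTORY has at least REQUIRED_KIB free.
# (Hypothetical helper, not part of any tool mentioned in this article.)
has_enough_space() {
    required_kib=$1
    dir=$2
    # With -P the output is one header line plus one data line;
    # the 4th column of the data line is the available space in KiB.
    available_kib=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$available_kib" -ge "$required_kib" ]
}
```

For my 41GB disc, the check would be something like has_enough_space $((2 * 41 * 1024 * 1024)) . executed inside the workspace directory.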

Optical media

In the case of optical media (CD, DVD, Blu-ray), it is possible to try to clean them up.

As explained in this guide:

To clean a CD or DVD a microfiber cleaning cloth or cotton-based tissue or cloth is recommended.

Any materials used should be checked to ensure they are non-abrasive. In most instances water will suffice as a cleaning agent. If water does not remove everything then a more powerful cleaner such as isopropyl alcohol can be used, but this should only be used in the most extreme of cases.

Dedicated CD and DVD wipes are also available which are already coated in a suitable cleaning solution.

For Blu-ray discs a soft cloth can be used instead of the tissue because it has a much harder surface, and a very mild detergent can also be used if necessary.

and (emphasis mine)

Hold the disc from the outer edges with the index finger and thumb of one hand. Avoid directly holding the playing surface, as this is likely to introduce further smudges, and possible damage. If a detergent is being used then it should be sprayed on the cloth or tissue rather than directly to the disc.

The discs must be cleaned in a specific pattern; Do not move the cloth in a circular pattern, rather wipe the disc from the centre, out towards the edges. This greatly reduces the chances of spoiling the disc as any scratches caused by such a motion will have very minimal effect on the readability of music or data. Specifically, Blu-ray discs should never be wiped in a circular pattern.

Pay attention during the process: you do not want to scratch the disc further.

With some luck, your disc player might be able to read all the data correctly.

I wasn’t lucky.

Memory cards

Adapters …​ USB adapters are, in my experience, the worst.

Luckily my computer has a slot for SD cards, and adapters from mini-SD to SD seem to be good enough.

External drives, USB sticks, and memory cards

Mount them as read-only, and instruct the OS not to try to repair the filesystem.
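
Which mount options achieve this depends on the filesystem. The sketch below is an assumption-laden starting point: ro_mount_options is a hypothetical helper, and the filesystem-specific options (noload for ext3/ext4, norecovery for XFS) should be double-checked against the man pages of your system before relying on them.

```shell
# ro_mount_options FSTYPE
# Prints mount options that mount read-only and, where I know of one,
# also skip journal or log replay (which would otherwise write to the device).
# (Hypothetical helper; verify the options in mount(8), ext4(5), and xfs(5).)
ro_mount_options() {
    case "$1" in
        ext3|ext4) echo "ro,noload" ;;      # do not load (replay) the journal
        xfs)       echo "ro,norecovery" ;;  # do not run log recovery
        *)         echo "ro" ;;             # plain read-only for everything else
    esac
}
```

It could be used as mount -o "$(ro_mount_options ext4)" /dev/sdb1 /mnt/recover, where /dev/sdb1 is only an example partition name.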

Use ddrescue to scrape as much data as possible

After realizing that I was unable to read some data from the Blu-ray disc, I fired up ddrescue.

This program is able to copy the data from the faulty device and store the information about which data it was not able to read.

This makes it possible, for example, to try to recover the data from different machines. A particular DVD player might be able to read some data that another player is not able to.

Or you might have different copies of the same data (like a commercial music disc, both ruined) and by merging the data together, you might be able to get a perfect copy of the data, even if it is not possible to restore the data from a single media.

ddrescue needs access to the raw data. In my case, the disc drive is located at /dev/sr0 (and /dev/cdrom), while the USB stick is located at /dev/sdb (the location might change, so be sure to double-check).

Accessing the raw data means that ddrescue does not care if an external drive is formatted in NTFS, ext4, or some filesystem not recognized by the OS.

It also means that this approach cannot be used on devices that do not give direct access to the data and use a web interface (like the tablet of ely) or the MTP protocol (like Android), unless one is able to execute the program on the device itself (good luck).

It seems that you are out of luck even with audio CDs:

As ddrescue uses standard library functions to read data from the device being rescued, only mountable device formats can be rescued with ddrescue. CD-ROMs and DVDs can be rescued, "compact disc digital audio" CDs can’t, […​]

I’m not exactly sure why audio CDs cannot be rescued with ddrescue, or whether there are equivalent programs.

To salvage a data disc, remove all media you are not interested in, and use the following commands:

# insert disc

# execute ddrescue, save data to file.dd
ddrescue --sector-size=2048 --no-scrape                /dev/sr0 file.dd rescue.map

# if there are errors, execute ddrescue again, and add new data (if any) to file.dd
ddrescue --sector-size=2048 --idirect --retry-passes=1 /dev/sr0 file.dd rescue.map

Other scenarios (like using multiple drives or optical media) are described in the manual 🗄️.

Note 📝
You might want to create, from time to time, a copy of file.dd while ddrescue is working, and check whether you can already extract the data you are interested in.

For external drives, USB sticks, and memory cards, the commands are similar:

# attach an external device
# ensure it is not mounted,
# and that no other program will try to write to it (for example: repairing the file system)

# execute ddrescue, save data to file.dd
ddrescue --no-scrape                /dev/sdb file.dd rescue.map

# if there are errors, execute ddrescue again, and add new data (if any) to file.dd
ddrescue --idirect --retry-passes=1 /dev/sdb file.dd rescue.map
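
The "ensure it is not mounted" step can be checked before starting ddrescue. This is a rough sketch, assuming that /proc/mounts lists the device path in its first column; is_mounted is a hypothetical helper, and the prefix match is deliberately crude (it treats /dev/sdb1 as belonging to /dev/sdb).

```shell
# is_mounted DEVICE [MOUNTS_FILE]
# Succeeds if DEVICE, or anything whose name starts with DEVICE
# (for example a partition like /dev/sdb1), appears as a mount source.
# MOUNTS_FILE defaults to /proc/mounts; the parameter exists to ease testing.
is_mounted() {
    device=$1
    mounts_file=${2:-/proc/mounts}
    awk -v dev="$device" '
        index($1, dev) == 1 { found = 1 }   # first column starts with DEVICE
        END { exit !found }                 # exit status 0 only if a match was seen
    ' "$mounts_file"
}
```

For example: is_mounted /dev/sdb && echo "unmount it first".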

The documentation also provides some examples with what to do in case of errors 🗄️.
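
The rescue.map file written by the commands above is plain text, so it can be inspected with standard tools. The sketch below assumes the layout described in the ddrescue manual: comment lines starting with #, one line with the current position and status, and then one line per block with position, size (both hexadecimal), and a status character. map_bytes is a hypothetical helper of mine, not something shipped with ddrescue.

```shell
# map_bytes STATUS MAPFILE
# Prints the total size, in bytes, of all blocks marked with STATUS.
# Status characters as I understand them from the manual:
#   '+' finished, '-' bad sector, '/' non-scraped, '*' non-trimmed, '?' non-tried
map_bytes() {
    status=$1
    map_path=$2
    # Drop comments, then drop the first remaining line (current position/status).
    grep -v '^#' "$map_path" | tail -n +2 | {
        total=0
        while read -r pos size st; do
            if [ "$st" = "$status" ]; then
                # $size is a hexadecimal constant such as 0x00000200
                total=$(( total + $size ))
            fi
        done
        echo "$total"
    }
}
```

For example, map_bytes - rescue.map would print how many bytes are currently marked as bad sectors.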

Note that, depending on the media, reading the data might take a while (hours, days, weeks, …​).

And while your computer is doing such a sensitive operation, you might not want to stress it with other tasks.

Sounds like a perfect use case for an old PC.

If you have a spare desktop or laptop at home, and are not interested in turning it into a home media center, gaming console, or web server, using it for tasks that take a lot of time seems to be the best way to use it.

Recover data from the output of ddrescue

Once ddrescue finishes creating a copy of the faulty device, it’s time to extract the data from it.

With some luck, ddrescue will have created an exact copy of the original data. In that case, mounting the image and copying the data out of it, or "burning" the .dd file to another disc, would be sufficient.

But at this point, and especially if ddrescue was not able to copy all the data, it might make sense to verify the data for correctness or consistency and eventually try to repair it.

In my case, at the end of the process, the output looked like this:

     ipos:   29511 MB, non-trimmed:        0 B,  current rate:       0 B/s
     opos:   29511 MB, non-scraped:  896835 kB,  average rate:    1016 B/s
non-tried:        0 B,  bad-sector:   23404 kB,    error rate:       0 B/s
  rescued:   43016 MB,   bad areas:    11428,        run time:  2d 10h 43m
pct rescued:   97.90%, read errors:    14690,  remaining time: 15d 20h 55m
                              time since last successful read:      2m 47s
Trimming failed blocks... (forwards)
Finished

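97.90% might seem like a high number, but 2% of 41GB is more or less 800MB of data; the output above actually reports about 920MB still unread (896835 kB non-scraped plus 23404 kB in bad sectors). Depending on which data is missing, it might mean that some files will not be recoverable.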
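
The amount of unread data can be taken directly from the ddrescue output by adding the non-scraped and bad-sector values. A tiny sketch of that arithmetic, with unread_kb being a made-up helper name:

```shell
# unread_kb NON_SCRAPED_KB BAD_SECTOR_KB
# Prints the amount of data ddrescue could not read, in kB.
unread_kb() {
    echo $(( $1 + $2 ))
}

# Values copied from the output shown above.
unread_kb 896835 23404   # prints 920239
```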

Executing ddrescue again might rescue some other bits, but depending on your hardware, you might not be able to get any more data.

To access the files in the binary image created by ddrescue, you need to mount it.

A simple mkdir /mnt/recover/; mount -o loop,ro file.dd /mnt/recover/ might be sufficient. If the file system is corrupt, you might need to try to repair it before being able to mount it, or you might need another program to access the files.

For example, some archive managers can access the content of .iso files too, and it could be that those are more fault-tolerant.

Depending on which data is corrupt, you might use appropriate programs to validate and eventually repair the broken files.

Data recovery is an interesting topic: most programs that can be used for trying to recover deleted files can be used in this situation too. The first step, if you are not able to mount the image, would be to try to repair it. TestDisk is probably the most appropriate tool for this step, as it can work with many different filesystems.

Otherwise, for ext-based filesystems, fdisk and e2fsck might be sufficient.
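
As repairing alters the image, I would run any repair tool on a copy and keep the ddrescue output untouched. A minimal sketch, where make_working_copy and the .work.dd suffix are my own illustrative choices:

```shell
# make_working_copy IMAGE
# Copies IMAGE (e.g. file.dd) to a sibling file (e.g. file.work.dd)
# and prints the path of the copy, so repairs never touch the original.
make_working_copy() {
    image=$1
    copy="${image%.dd}.work.dd"
    cp -- "$image" "$copy" && echo "$copy"
}
```

For an ext-based filesystem, a repair attempt could then look like e2fsck -f "$(make_working_copy file.dd)".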

Once you can mount the image, you should be able to copy out the content you are interested in.

If you cannot mount the image, or if some files are missing after repairing the filesystem, you can try to recover the data by using the same technique used for recovering deleted files. The main disadvantage is sorting the mess out, as the folder structure and file names might not be preserved. I’ve waited years before sorting out my family music collection.


Do you want to share your opinion? Or is there an error, some parts that are not clear enough?

You can contact me anytime.