Sometimes a disk develops errors, or an SSD ends up with a bad NAND cell. Saving the whole disk may not be necessary when only one or two specific files matter and they cannot be copied with cp or rsync because of an “Unrecovered read error”.
Furthermore, an SSD reallocates bad cells only when they are written to, which may not happen for years, while the same cells may be read every day. Reading from a sector with bad NAND cells results in slow I/O, because multiple read commands are executed before the drive gives up. Copying the file to a new location with only 512 bytes missing may leave the data usable, but it is difficult to do with the generic copy tools.
This article shows how to save single files from a mounted ext4 file system with bad sectors using the GNU ddrescue tool – https://www.gnu.org/software/ddrescue/. ddrescue can rescue both single files and whole devices.
STEP 1) Install ddrescue.
Installing ddrescue is easy. The tool is packaged by almost all Linux distributions and has few dependencies. Note that there is another tool named dd_rescue, which is a different program; follow the link above for the tool used here.
CentOS 7/8 or Fedora:
yum install -y ddrescue
Ubuntu (releases from the last 10 years):
apt install -y gddrescue
Gentoo:
emerge -v ddrescue
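Because dd_rescue and GNU ddrescue are easy to confuse, a quick check after installation can help. The following is a minimal sketch that assumes the first line of `ddrescue --version` starts with "GNU ddrescue" (as in the "GNU ddrescue 1.25" banner shown in STEP 2); `is_gnu_ddrescue` is a hypothetical helper name, not part of the tool.

```shell
#!/bin/sh
# Return 0 only if the ddrescue found on PATH identifies itself as GNU ddrescue.
# Assumption: the version banner begins with "GNU ddrescue".
is_gnu_ddrescue() {
    command -v ddrescue >/dev/null 2>&1 || return 1
    ddrescue --version 2>/dev/null | head -n 1 | grep -q '^GNU ddrescue'
}

if is_gnu_ddrescue; then
    echo "GNU ddrescue found: $(ddrescue --version | head -n 1)"
else
    echo "GNU ddrescue not found (dd_rescue is a different tool)" >&2
fi
```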
STEP 2) Rescuing a single file with read errors because of bad sectors in a mounted file system.
[root@srv Snapshots]# ddrescue -v \{9f02ae0a-6dae-4729-b6a6-ec3f0550f294\}.vdi test2.vdi
GNU ddrescue 1.25
About to copy 15724 MBytes from '{9f02ae0a-6dae-4729-b6a6-ec3f0550f294}.vdi' to 'test2.vdi'
Starting positions: infile = 0 B, outfile = 0 B
Copy block size: 128 sectors Initial skip size: 384 sectors
Sector size: 512 Bytes
Press Ctrl-C to interrupt
ipos: 13495 MB, non-trimmed: 0 B, current rate: 0 B/s
opos: 13495 MB, non-scraped: 0 B, average rate: 162 MB/s
non-tried: 0 B, bad-sector: 8192 B, error rate: 4608 B/s
rescued: 15724 MB, bad areas: 2, run time: 1m 36s
pct rescued: 99.99%, read errors: 18, remaining time: 0s
time since last successful read: 0s
Finished
[root@srv Snapshots]# ls -al
total 52602944
drwx------. 2 root root 4096 Jun 2 02:22 .
drwxr-xr-x. 4 root root 4096 Jun 1 14:16 ..
-rw-------. 1 root root 459981735 Nov 8 2018 2018-11-08T15-19-17-776317000Z.sav
-rw-------. 1 root root 566704069 Jun 1 14:16 2020-06-01T11-16-05-735318000Z.sav
-rw-------. 1 root root 8329887744 Jun 1 12:53 {3d30ebea-2e2f-4e33-8088-d3d66f315e2c}.vdi
-rw-------. 1 root root 15724445696 Nov 8 2018 {9f02ae0a-6dae-4729-b6a6-ec3f0550f294}.vdi
-rw-------. 1 root root 4012900352 Jun 1 14:16 {f7e72510-7dce-48fd-b62c-630664ad984f}.vdi
-rw-r--r--. 1 root root 15724445696 Jun 2 02:24 test2.vdi
-rw-------. 1 root root 9051041792 Jun 2 02:19 test.vdi
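The run above uses only the input and output file. As a hedged variation, ddrescue also accepts a mapfile as a third argument, which records which areas were already rescued so a later run can resume and retry only the bad areas. The sketch below wraps two passes in a hypothetical helper, `rescue_file`; the file names in the usage line are placeholders, and the retry count of 3 is an arbitrary choice.

```shell
#!/bin/sh
# Sketch: rescue one file in two passes, recording progress in a mapfile.
# rescue_file is a hypothetical helper name, not part of ddrescue.
rescue_file() {
    src=$1   # damaged source file
    dst=$2   # destination, ideally on a healthy disk
    map=$3   # mapfile: lets ddrescue resume and skip already-rescued areas

    # Pass 1: copy everything that reads cleanly and skip the slow
    # scraping phase (-n, --no-scrape).
    ddrescue -v -n "$src" "$dst" "$map"

    # Pass 2: return to the areas still marked bad in the mapfile and
    # retry them up to 3 times (-r3, --retry-passes=3).
    ddrescue -v -r3 "$src" "$dst" "$map"
}

# Usage (placeholder names):
#   rescue_file damaged.vdi /mnt/healthy/rescued.vdi /mnt/healthy/rescued.map
```

Keeping the mapfile on the healthy disk avoids writing to the failing one while the rescue is in progress.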
Here is an animated gif of the ddrescue procedure:
rsync reports Input/Output error:
root@srv ~ # rsync --verbose --progress --stats --recursive --times --perms --links --owner --group --hard-links --devices /mnt/stor/Virtualbox/ /mnt/stor2/Virtualbox/
sending incremental file list
.VirtualBox/Machines/linux-tests/Snapshots/{9f02ae0a-6dae-4729-b6a6-ec3f0550f294}.vdi
15,724,445,696 100% 93.58MB/s 0:02:40 (xfr#1, to-chk=145/235)
rsync: read errors mapping "/mnt/stor/Virtualbox/.VirtualBox/Machines/centos8-tests/Snapshots/{9f02ae0a-6dae-4729-b6a6-ec3f0550f294}.vdi": Input/output error (5)
WARNING: .VirtualBox/Machines/centos8-tests/Snapshots/{9f02ae0a-6dae-4729-b6a6-ec3f0550f294}.vdi failed verification -- update discarded (will try again).
.VirtualBox/Machines/centos8-tests/Snapshots/{9f02ae0a-6dae-4729-b6a6-ec3f0550f294}.vdi
15,724,445,696 100% 128.77MB/s 0:01:56 (xfr#2, to-chk=145/235)
rsync: read errors mapping "/mnt/stor/Virtualbox/.VirtualBox/Machines/centos8-tests/Snapshots/{9f02ae0a-6dae-4729-b6a6-ec3f0550f294}.vdi": Input/output error (5)
ERROR: .VirtualBox/Machines/centos8-tests/Snapshots/{9f02ae0a-6dae-4729-b6a6-ec3f0550f294}.vdi failed verification -- update discarded.
Number of files: 235 (reg: 186, dir: 49)
Number of created files: 1 (reg: 1)
Number of deleted files: 0
Number of regular files transferred: 2
Total file size: 127,959,654,814 bytes
Total transferred file size: 31,448,891,392 bytes
Literal data: 31,448,891,392 bytes
Matched data: 0 bytes
File list size: 0
File list generation time: 0.001 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 31,456,576,574
Total bytes received: 415
sent 31,456,576,574 bytes received 415 bytes 112,950,007.14 bytes/sec
total size is 127,959,654,814 speedup is 4.07
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1189) [sender=3.1.3]
Unfortunately, rsync has no option to skip read errors, so a file with a read error is not copied at all!
So because of a single bad sector, the user may lose the whole 15G file when relying on a backup that is assumed to be healthy.
cp also reports Input/output errors
root@srv ~ # Snapshots # cp "{9f02ae0a-6dae-4729-b6a6-ec3f0550f294}.vdi" test.vdi
cp: error reading '{9f02ae0a-6dae-4729-b6a6-ec3f0550f294}.vdi': Input/output error
root@srv ~ # Snapshots # ls -al
total 37247044
drwx------. 2 root root 4096 Jun 1 19:24 .
drwxr-xr-x. 4 root root 4096 Jun 1 14:16 ..
-rw-------. 1 root root 15724445696 Nov 8 2018 {9f02ae0a-6dae-4729-b6a6-ec3f0550f294}.vdi
-rw-------. 1 root root 9051041792 Jun 1 19:25 test.vdi
cp stopped copying the file at the first read error, so the destination file is truncated: 9051041792 bytes instead of the original 15724445696 bytes!
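To put the two sizes from the listing in perspective, the short calculation below works out how far cp got before giving up. It is only shell arithmetic on the byte counts shown by ls; the variable names are illustrative.

```shell
#!/bin/sh
# Sizes taken from the ls -al listing of the Snapshots directory.
src_bytes=15724445696   # original .vdi file
dst_bytes=9051041792    # partial copy left behind by cp

pct=$(( dst_bytes * 100 / src_bytes ))   # integer percentage copied
sector=$(( dst_bytes / 512 ))            # 512-byte sector offset inside the file
missing=$(( src_bytes - dst_bytes ))     # bytes that were never copied

echo "copied: ${pct}%"
echo "stopped at file offset ${dst_bytes} B (sector ${sector})"
echo "missing: ${missing} B"
```

cp therefore kept about 57% of the file and dropped the remaining 6673403904 bytes, even though ddrescue reports only 8192 bytes as unreadable.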
dmesg is full of read errors – Medium Error – Unrecovered read error – auto reallocate failed
[327673.819916] ata1.00: supports DRM functions and may not be fully accessible
[327673.824985] ata1.00: disabling queued TRIM support
[327673.830234] ata1.00: supports DRM functions and may not be fully accessible
[327673.835056] ata1.00: disabling queued TRIM support
[327673.840059] ata1.00: configured for UDMA/133
[327673.840070] sd 0:0:0:0: [sda] tag#18 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[327673.840072] sd 0:0:0:0: [sda] tag#18 Sense Key : Medium Error [current]
[327673.840073] sd 0:0:0:0: [sda] tag#18 Add. Sense: Unrecovered read error - auto reallocate failed
[327673.840075] sd 0:0:0:0: [sda] tag#18 CDB: Read(10) 28 00 41 e1 a9 38 00 00 08 00
[327673.840077] blk_update_request: I/O error, dev sda, sector 1105307960 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[327673.840088] ata1: EH complete
[327673.875676] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[327673.875677] ata1.00: irq_stat 0x40000001
[327673.875679] ata1.00: failed command: READ DMA EXT
[327673.875683] ata1.00: cmd 25/00:08:38:a9:e1/00:00:41:00:00/e0 tag 30 dma 4096 in res 51/40:08:37:a9:e1/00:00:41:00:00/e1 Emask 0x9 (media error)
[327673.875684] ata1.00: status: { DRDY ERR }
[327673.875684] ata1.00: error: { UNC }
[327673.875911] ata1.00: supports DRM functions and may not be fully accessible
[327673.880795] ata1.00: disabling queued TRIM support
[327673.886034] ata1.00: supports DRM functions and may not be fully accessible
[327673.890930] ata1.00: disabling queued TRIM support
[327673.895877] ata1.00: configured for UDMA/133
[327673.895887] sd 0:0:0:0: [sda] tag#30 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[327673.895889] sd 0:0:0:0: [sda] tag#30 Sense Key : Medium Error [current]
[327673.895891] sd 0:0:0:0: [sda] tag#30 Add. Sense: Unrecovered read error - auto reallocate failed
[327673.895893] sd 0:0:0:0: [sda] tag#30 CDB: Read(10) 28 00 41 e1 a9 38 00 00 08 00
[327673.895895] blk_update_request: I/O error, dev sda, sector 1105307960 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[327673.895904] ata1: EH complete
[327673.943641] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[327673.943643] ata1.00: irq_stat 0x40000001
[327673.943647] ata1.00: failed command: READ DMA EXT
[327673.943650] ata1.00: cmd 25/00:08:f8:f5:1a/00:00:2b:00:00/e0 tag 23 dma 4096 in res 51/40:08:f7:f5:1a/00:00:2b:00:00/eb Emask 0x9 (media error)
[327673.943651] ata1.00: status: { DRDY ERR }
[327673.943652] ata1.00: error: { UNC }
[327673.943952] ata1.00: supports DRM functions and may not be fully accessible
[327673.948868] ata1.00: disabling queued TRIM support
[327673.954149] ata1.00: supports DRM functions and may not be fully accessible
[327673.958897] ata1.00: disabling queued TRIM support
[327673.963838] ata1.00: configured for UDMA/133
[327673.963860] sd 0:0:0:0: [sda] tag#23 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[327673.963871] sd 0:0:0:0: [sda] tag#23 Sense Key : Medium Error [current]
[327673.963873] sd 0:0:0:0: [sda] tag#23 Add. Sense: Unrecovered read error - auto reallocate failed
[327673.963875] sd 0:0:0:0: [sda] tag#23 CDB: Read(10) 28 00 2b 1a f5 f8 00 00 08 00
[327673.963877] blk_update_request: I/O error, dev sda, sector 723187192 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[327673.963887] ata1: EH complete
[327673.995644] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[327673.995645] ata1.00: irq_stat 0x40000001
[327673.995647] ata1.00: failed command: READ DMA EXT
[327673.995650] ata1.00: cmd 25/00:08:f8:f5:1a/00:00:2b:00:00/e0 tag 10 dma 4096 in res 51/40:08:f7:f5:1a/00:00:2b:00:00/eb Emask 0x9 (media error)
[327673.995651] ata1.00: status: { DRDY ERR }
[327673.995652] ata1.00: error: { UNC }
[327673.995921] ata1.00: supports DRM functions and may not be fully accessible
[327674.000939] ata1.00: disabling queued TRIM support
[327674.006292] ata1.00: supports DRM functions and may not be fully accessible
[327674.011202] ata1.00: disabling queued TRIM support
[327674.016239] ata1.00: configured for UDMA/133
[327674.016249] sd 0:0:0:0: [sda] tag#10 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[327674.016251] sd 0:0:0:0: [sda] tag#10 Sense Key : Medium Error [current]
[327674.016253] sd 0:0:0:0: [sda] tag#10 Add. Sense: Unrecovered read error - auto reallocate failed
[327674.016255] sd 0:0:0:0: [sda] tag#10 CDB: Read(10) 28 00 2b 1a f5 f8 00 00 08 00
[327674.016257] blk_update_request: I/O error, dev sda, sector 723187192 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[327674.016266] ata1: EH complete
[327674.047641] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[327674.047642] ata1.00: irq_stat 0x40000001
[327674.047644] ata1.00: failed command: READ DMA EXT
[327674.047647] ata1.00: cmd 25/00:08:f8:f5:1a/00:00:2b:00:00/e0 tag 15 dma 4096 in res 51/40:08:f7:f5:1a/00:00:2b:00:00/eb Emask 0x9 (media error)
[327674.047648] ata1.00: status: { DRDY ERR }
[327674.047649] ata1.00: error: { UNC }
[327674.047891] ata1.00: supports DRM functions and may not be fully accessible
[327674.052852] ata1.00: disabling queued TRIM support
[327674.058024] ata1.00: supports DRM functions and may not be fully accessible
[327674.062805] ata1.00: disabling queued TRIM support
[327674.067669] ata1.00: configured for UDMA/133
[327674.067681] sd 0:0:0:0: [sda] tag#15 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[327674.067683] sd 0:0:0:0: [sda] tag#15 Sense Key : Medium Error [current]
[327674.067685] sd 0:0:0:0: [sda] tag#15 Add. Sense: Unrecovered read error - auto reallocate failed
[327674.067687] sd 0:0:0:0: [sda] tag#15 CDB: Read(10) 28 00 2b 1a f5 f8 00 00 08 00
[327674.067689] blk_update_request: I/O error, dev sda, sector 723187192 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[327674.067699] ata1: EH complete
[327674.095649] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[327674.095651] ata1.00: irq_stat 0x40000001
[327674.095653] ata1.00: failed command: READ DMA EXT
[327674.095657] ata1.00: cmd 25/00:08:f8:f5:1a/00:00:2b:00:00/e0 tag 31 dma 4096 in res 51/40:08:f7:f5:1a/00:00:2b:00:00/eb Emask 0x9 (media error)
[327674.095658] ata1.00: status: { DRDY ERR }
[327674.095658] ata1.00: error: { UNC }
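The kernel log repeats the same few sectors many times, so it can help to reduce it to a short list. The following is a rough sketch that assumes the "I/O error, dev ..., sector ..." wording seen in the log above (the exact message format can differ between kernel versions) and one message per line; `bad_sectors` is a hypothetical helper name.

```shell
#!/bin/sh
# Print each failing "device sector" pair once, prefixed by how many
# times the kernel reported it. Reads kernel log text on stdin.
# Assumption: messages look like
#   "... I/O error, dev sda, sector 1105307960 op 0x0:(READ) ..."
bad_sectors() {
    sed -n 's/.*I\/O error, dev \([a-z0-9]*\), sector \([0-9]*\).*/\1 \2/p' |
        sort | uniq -c
}

# Usage:
#   dmesg | bad_sectors
```

For the excerpt above this would list only two distinct sectors on sda, 1105307960 and 723187192, which is consistent with the "bad areas: 2" that ddrescue reports.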