Copy files with read errors successfully – skipping only errors (i.e. bad sectors)

Sometimes a hard disk develops bad sectors or an SSD has a bad NAND cell. Saving the whole disk may not be necessary when only one or two specific files are important, yet those files cannot be copied with cp or rsync because of an “Unrecovered read error”.
Furthermore, an SSD reallocates bad cells only when they are written to, which may not happen for years, while the same cells may be read every day. Reading a sector with bad NAND cells results in slow IO (multiple read commands are executed before the drive gives up). A copy of the file that is missing only 512 bytes may still be perfectly usable, but such a copy is difficult to make with the generic copy tools.
This article shows how to save single files from a mounted ext4 file system with bad sectors using the ddrescue tool: https://www.gnu.org/software/ddrescue/ In fact, ddrescue can save single files as well as whole devices.

STEP 1) Install ddrescue.

Installing ddrescue is easy. The tool is packaged for almost all Linux distributions and has few dependencies. Note that there is another tool named dd_rescue, which is a different program; the tool used here is GNU ddrescue from the link above.
CentOS 7/8 or Fedora:

yum install -y ddrescue

Ubuntu (releases from the last 10 years):

apt install -y gddrescue

Gentoo:

emerge -v ddrescue
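
Because the two similarly named tools are easy to confuse, it may help to check which one ended up on the system. The following is a minimal sketch, assuming that the first line of `ddrescue --version` starts with "GNU ddrescue" (the same banner that appears in the verbose output later in this article). The helper name is_gnu_ddrescue is made up for this example.

```shell
#!/bin/sh
# Returns success when the given version banner belongs to GNU ddrescue.
# is_gnu_ddrescue is a hypothetical helper name, not part of any package.
is_gnu_ddrescue() {
    case "$1" in
        "GNU ddrescue"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Only query the binary if it is installed, so the script is safe to run anywhere.
if command -v ddrescue >/dev/null 2>&1; then
    # Assumption: the first line of --version is the program banner.
    banner=$(ddrescue --version 2>/dev/null | head -n 1)
    if is_gnu_ddrescue "$banner"; then
        echo "OK: $banner"
    else
        echo "ddrescue found, but it does not look like GNU ddrescue: $banner"
    fi
else
    echo "ddrescue is not installed"
fi
```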

STEP 2) Rescuing a single file with read errors because of bad sectors in a mounted file system.

[root@srv Snapshots]# ddrescue -v \{9f02ae0a-6dae-4729-b6a6-ec3f0550f294\}.vdi test2.vdi
GNU ddrescue 1.25
About to copy 15724 MBytes from '{9f02ae0a-6dae-4729-b6a6-ec3f0550f294}.vdi' to 'test2.vdi'
    Starting positions: infile = 0 B,  outfile = 0 B
    Copy block size: 128 sectors       Initial skip size: 384 sectors
Sector size: 512 Bytes

Press Ctrl-C to interrupt
     ipos:   13495 MB, non-trimmed:        0 B,  current rate:       0 B/s
     opos:   13495 MB, non-scraped:        0 B,  average rate:    162 MB/s
non-tried:        0 B,  bad-sector:     8192 B,    error rate:    4608 B/s
  rescued:   15724 MB,   bad areas:        2,        run time:      1m 36s
pct rescued:   99.99%, read errors:       18,  remaining time:          0s
                              time since last successful read:          0s
Finished                                      
[root@srv Snapshots]# ls -al
total 52602944
drwx------. 2 root root        4096 Jun  2 02:22 .
drwxr-xr-x. 4 root root        4096 Jun  1 14:16 ..
-rw-------. 1 root root   459981735 Nov  8  2018 2018-11-08T15-19-17-776317000Z.sav
-rw-------. 1 root root   566704069 Jun  1 14:16 2020-06-01T11-16-05-735318000Z.sav
-rw-------. 1 root root  8329887744 Jun  1 12:53 {3d30ebea-2e2f-4e33-8088-d3d66f315e2c}.vdi
-rw-------. 1 root root 15724445696 Nov  8  2018 {9f02ae0a-6dae-4729-b6a6-ec3f0550f294}.vdi
-rw-------. 1 root root  4012900352 Jun  1 14:16 {f7e72510-7dce-48fd-b62c-630664ad984f}.vdi
-rw-r--r--. 1 root root 15724445696 Jun  2 02:24 test2.vdi
-rw-------. 1 root root  9051041792 Jun  2 02:19 test.vdi
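
The run above was done without a mapfile. GNU ddrescue accepts an optional third argument, the mapfile, in which it records which areas were rescued and which were bad, so an interrupted run can resume and the bad areas can be retried later. The sketch below is one possible way to wrap that. The function name rescue_file, the DRY_RUN switch, the ".map" suffix, and the choice of three retry passes (-r3) are assumptions made for illustration and are not taken from the original run.

```shell
#!/bin/sh
# rescue_file SRC DST: copy SRC to DST with GNU ddrescue, keeping a mapfile
# next to DST so an interrupted run can resume and bad areas can be retried.
# rescue_file and DRY_RUN are hypothetical names used only in this sketch.
rescue_file() {
    src=$1
    dst=$2
    map="$dst.map"
    # -v = verbose, -r3 = up to three retry passes over the bad areas.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Dry run: only print the command that would be executed.
        echo "ddrescue -v -r3 $src $dst $map"
    else
        ddrescue -v -r3 "$src" "$dst" "$map"
    fi
}

# Print the command for the file from this article instead of running it.
DRY_RUN=1
rescue_file "{9f02ae0a-6dae-4729-b6a6-ec3f0550f294}.vdi" test2.vdi
```

Running the same command again with the same mapfile should make ddrescue skip the areas it has already rescued and spend its time only on the remaining bad ones.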

[Animated GIF of the ddrescue procedure: ddrescue – copy files with bad sectors]

rsync reports Input/Output error:

root@srv ~ # rsync --verbose --progress --stats --recursive --times --perms --links --owner --group --hard-links --devices /mnt/stor/Virtualbox/ /mnt/stor2/Virtualbox/
sending incremental file list
.VirtualBox/Machines/linux-tests/Snapshots/{9f02ae0a-6dae-4729-b6a6-ec3f0550f294}.vdi
 15,724,445,696 100%   93.58MB/s    0:02:40 (xfr#1, to-chk=145/235)
rsync: read errors mapping "/mnt/stor/Virtualbox/.VirtualBox/Machines/centos8-tests/Snapshots/{9f02ae0a-6dae-4729-b6a6-ec3f0550f294}.vdi": Input/output error (5)
WARNING: .VirtualBox/Machines/centos8-tests/Snapshots/{9f02ae0a-6dae-4729-b6a6-ec3f0550f294}.vdi failed verification -- update discarded (will try again).
.VirtualBox/Machines/centos8-tests/Snapshots/{9f02ae0a-6dae-4729-b6a6-ec3f0550f294}.vdi
 15,724,445,696 100%  128.77MB/s    0:01:56 (xfr#2, to-chk=145/235)
rsync: read errors mapping "/mnt/stor/Virtualbox/.VirtualBox/Machines/centos8-tests/Snapshots/{9f02ae0a-6dae-4729-b6a6-ec3f0550f294}.vdi": Input/output error (5)
ERROR: .VirtualBox/Machines/centos8-tests/Snapshots/{9f02ae0a-6dae-4729-b6a6-ec3f0550f294}.vdi failed verification -- update discarded.

Number of files: 235 (reg: 186, dir: 49)
Number of created files: 1 (reg: 1)
Number of deleted files: 0
Number of regular files transferred: 2
Total file size: 127,959,654,814 bytes
Total transferred file size: 31,448,891,392 bytes
Literal data: 31,448,891,392 bytes
Matched data: 0 bytes
File list size: 0
File list generation time: 0.001 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 31,456,576,574
Total bytes received: 415

sent 31,456,576,574 bytes  received 415 bytes  112,950,007.14 bytes/sec
total size is 127,959,654,814  speedup is 4.07
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1189) [sender=3.1.3]

Unfortunately, rsync has no option to skip read errors, so a file with a read error is not copied at all!

So because of a single bad sector, a user who relies on what looks like a healthy backup may lose the whole 15 GB file.
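
To put that trade-off in numbers, the following sketch uses only values printed earlier in this article: the file size from the directory listing, and the "bad-sector" and "Sector size" fields from the ddrescue output. The variable names are chosen for this example.

```shell
#!/bin/sh
# Values copied from the ddrescue output and the directory listing above.
file_size=15724445696   # bytes, size of the .vdi file
bad_bytes=8192          # "bad-sector: 8192 B"
sector_size=512         # "Sector size: 512 Bytes"

# How many 512-byte sectors ddrescue had to give up on.
bad_sectors=$((bad_bytes / sector_size))
# How many bytes of the file were copied intact.
rescued_bytes=$((file_size - bad_bytes))

echo "unreadable sectors: $bad_sectors"
echo "rescued bytes:      $rescued_bytes of $file_size"
```

With rsync the destination gets nothing of this file, while ddrescue leaves out only the sectors that could not be read.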

cp also reports Input/output errors

root@srv Snapshots # cp "{9f02ae0a-6dae-4729-b6a6-ec3f0550f294}.vdi" test.vdi
cp: error reading '{9f02ae0a-6dae-4729-b6a6-ec3f0550f294}.vdi': Input/output error
root@srv Snapshots # ls -al
total 37247044
drwx------. 2 root root        4096 Jun  1 19:24 .
drwxr-xr-x. 4 root root        4096 Jun  1 14:16 ..
-rw-------. 1 root root 15724445696 Nov  8  2018 {9f02ae0a-6dae-4729-b6a6-ec3f0550f294}.vdi
-rw-------. 1 root root  9051041792 Jun  1 19:25 test.vdi

cp stopped copying at the first read error, so the destination file holds only about 9 GB of the 15.7 GB source!

dmesg is full of read errors – Medium Error – Unrecovered read error – auto reallocate failed

[327673.819916] ata1.00: supports DRM functions and may not be fully accessible
[327673.824985] ata1.00: disabling queued TRIM support
[327673.830234] ata1.00: supports DRM functions and may not be fully accessible
[327673.835056] ata1.00: disabling queued TRIM support
[327673.840059] ata1.00: configured for UDMA/133
[327673.840070] sd 0:0:0:0: [sda] tag#18 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[327673.840072] sd 0:0:0:0: [sda] tag#18 Sense Key : Medium Error [current] 
[327673.840073] sd 0:0:0:0: [sda] tag#18 Add. Sense: Unrecovered read error - auto reallocate failed
[327673.840075] sd 0:0:0:0: [sda] tag#18 CDB: Read(10) 28 00 41 e1 a9 38 00 00 08 00
[327673.840077] blk_update_request: I/O error, dev sda, sector 1105307960 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[327673.840088] ata1: EH complete
[327673.875676] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[327673.875677] ata1.00: irq_stat 0x40000001
[327673.875679] ata1.00: failed command: READ DMA EXT
[327673.875683] ata1.00: cmd 25/00:08:38:a9:e1/00:00:41:00:00/e0 tag 30 dma 4096 in
                         res 51/40:08:37:a9:e1/00:00:41:00:00/e1 Emask 0x9 (media error)
[327673.875684] ata1.00: status: { DRDY ERR }
[327673.875684] ata1.00: error: { UNC }
[327673.875911] ata1.00: supports DRM functions and may not be fully accessible
[327673.880795] ata1.00: disabling queued TRIM support
[327673.886034] ata1.00: supports DRM functions and may not be fully accessible
[327673.890930] ata1.00: disabling queued TRIM support
[327673.895877] ata1.00: configured for UDMA/133
[327673.895887] sd 0:0:0:0: [sda] tag#30 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[327673.895889] sd 0:0:0:0: [sda] tag#30 Sense Key : Medium Error [current] 
[327673.895891] sd 0:0:0:0: [sda] tag#30 Add. Sense: Unrecovered read error - auto reallocate failed
[327673.895893] sd 0:0:0:0: [sda] tag#30 CDB: Read(10) 28 00 41 e1 a9 38 00 00 08 00
[327673.895895] blk_update_request: I/O error, dev sda, sector 1105307960 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[327673.895904] ata1: EH complete
[327673.943641] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[327673.943643] ata1.00: irq_stat 0x40000001
[327673.943647] ata1.00: failed command: READ DMA EXT
[327673.943650] ata1.00: cmd 25/00:08:f8:f5:1a/00:00:2b:00:00/e0 tag 23 dma 4096 in
                         res 51/40:08:f7:f5:1a/00:00:2b:00:00/eb Emask 0x9 (media error)
[327673.943651] ata1.00: status: { DRDY ERR }
[327673.943652] ata1.00: error: { UNC }
[327673.943952] ata1.00: supports DRM functions and may not be fully accessible
[327673.948868] ata1.00: disabling queued TRIM support
[327673.954149] ata1.00: supports DRM functions and may not be fully accessible
[327673.958897] ata1.00: disabling queued TRIM support
[327673.963838] ata1.00: configured for UDMA/133
[327673.963860] sd 0:0:0:0: [sda] tag#23 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[327673.963871] sd 0:0:0:0: [sda] tag#23 Sense Key : Medium Error [current] 
[327673.963873] sd 0:0:0:0: [sda] tag#23 Add. Sense: Unrecovered read error - auto reallocate failed
[327673.963875] sd 0:0:0:0: [sda] tag#23 CDB: Read(10) 28 00 2b 1a f5 f8 00 00 08 00
[327673.963877] blk_update_request: I/O error, dev sda, sector 723187192 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[327673.963887] ata1: EH complete
[327673.995644] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[327673.995645] ata1.00: irq_stat 0x40000001
[327673.995647] ata1.00: failed command: READ DMA EXT
[327673.995650] ata1.00: cmd 25/00:08:f8:f5:1a/00:00:2b:00:00/e0 tag 10 dma 4096 in
                         res 51/40:08:f7:f5:1a/00:00:2b:00:00/eb Emask 0x9 (media error)
[327673.995651] ata1.00: status: { DRDY ERR }
[327673.995652] ata1.00: error: { UNC }
[327673.995921] ata1.00: supports DRM functions and may not be fully accessible
[327674.000939] ata1.00: disabling queued TRIM support
[327674.006292] ata1.00: supports DRM functions and may not be fully accessible
[327674.011202] ata1.00: disabling queued TRIM support
[327674.016239] ata1.00: configured for UDMA/133
[327674.016249] sd 0:0:0:0: [sda] tag#10 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[327674.016251] sd 0:0:0:0: [sda] tag#10 Sense Key : Medium Error [current] 
[327674.016253] sd 0:0:0:0: [sda] tag#10 Add. Sense: Unrecovered read error - auto reallocate failed
[327674.016255] sd 0:0:0:0: [sda] tag#10 CDB: Read(10) 28 00 2b 1a f5 f8 00 00 08 00
[327674.016257] blk_update_request: I/O error, dev sda, sector 723187192 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[327674.016266] ata1: EH complete
[327674.047641] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[327674.047642] ata1.00: irq_stat 0x40000001
[327674.047644] ata1.00: failed command: READ DMA EXT
[327674.047647] ata1.00: cmd 25/00:08:f8:f5:1a/00:00:2b:00:00/e0 tag 15 dma 4096 in
                         res 51/40:08:f7:f5:1a/00:00:2b:00:00/eb Emask 0x9 (media error)
[327674.047648] ata1.00: status: { DRDY ERR }
[327674.047649] ata1.00: error: { UNC }
[327674.047891] ata1.00: supports DRM functions and may not be fully accessible
[327674.052852] ata1.00: disabling queued TRIM support
[327674.058024] ata1.00: supports DRM functions and may not be fully accessible
[327674.062805] ata1.00: disabling queued TRIM support
[327674.067669] ata1.00: configured for UDMA/133
[327674.067681] sd 0:0:0:0: [sda] tag#15 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[327674.067683] sd 0:0:0:0: [sda] tag#15 Sense Key : Medium Error [current] 
[327674.067685] sd 0:0:0:0: [sda] tag#15 Add. Sense: Unrecovered read error - auto reallocate failed
[327674.067687] sd 0:0:0:0: [sda] tag#15 CDB: Read(10) 28 00 2b 1a f5 f8 00 00 08 00
[327674.067689] blk_update_request: I/O error, dev sda, sector 723187192 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[327674.067699] ata1: EH complete
[327674.095649] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[327674.095651] ata1.00: irq_stat 0x40000001
[327674.095653] ata1.00: failed command: READ DMA EXT
[327674.095657] ata1.00: cmd 25/00:08:f8:f5:1a/00:00:2b:00:00/e0 tag 31 dma 4096 in
                         res 51/40:08:f7:f5:1a/00:00:2b:00:00/eb Emask 0x9 (media error)
[327674.095658] ata1.00: status: { DRDY ERR }
[327674.095658] ata1.00: error: { UNC }
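
The "CDB: Read(10)" lines in the log above carry the same information as the "sector" field of the blk_update_request lines. In a SCSI Read(10) command, bytes 2 to 5 are the big-endian logical block address and bytes 7 to 8 are the transfer length in blocks. The sketch below decodes one such line. The helper name decode_read10 is made up for this example, and the 512-byte block size is an assumption taken from the "Sector size" line of the ddrescue output.

```shell
#!/bin/sh
# decode_read10 "28 00 41 e1 a9 38 00 00 08 00"
# Prints the starting LBA and the transfer length of a SCSI Read(10) CDB.
# decode_read10 is a hypothetical helper name for this sketch.
decode_read10() {
    # Split the hex bytes into positional parameters $1..$10.
    set -- $1
    lba=$((0x$3$4$5$6))   # bytes 2-5: logical block address, big-endian
    len=$((0x$8$9))       # bytes 7-8: transfer length in blocks
    # Assumption: 512-byte logical blocks.
    echo "lba=$lba sectors=$len bytes=$((len * 512))"
}

# The two distinct CDBs that appear in the dmesg output above.
decode_read10 "28 00 41 e1 a9 38 00 00 08 00"
decode_read10 "28 00 2b 1a f5 f8 00 00 08 00"
```

The decoded addresses match the sectors 1105307960 and 723187192 reported by the kernel, and each failed read covers 8 sectors (4096 bytes), which fits the two bad areas totalling 8192 bytes in the ddrescue summary.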
