git status and bus error on SSD – fix READ errors by recovering part of the file

Combining an SSD with Linux disk encryption may not be the best idea, especially when TRIM is not passed through (the allow-discards option) or fstrim has never been run. Nevertheless, this error is not specific to SSDs; it can occur wherever there is a corrupted file system or a failing device.
In our case, the SSD has some read errors. Apparently, git could not read some files, or parts of them:

[myuser@dekstop kernel]# git status -v
Bus error 84/115708)

With SSD bad reads, the only working solution is to find the problem file(s) and either overwrite them or remove and recreate them. A more sophisticated solution is to dump the file to another location with dd, with its skip-errors option (conv=noerror) enabled, and then overwrite the old file with the new one. Only the unreadable area of the file is lost, which in most cases is just one or two sectors, i.e. 512 or 1024 bytes of data.

STEP 1) Find the bad files with the find command.

Use the find command to read every file with cat, so a bad sector will produce an input/output error on READ. Writes usually do not generate errors; instead, the drive automatically remaps the sector to a healthy one (the bad sector is marked and never used again).

[myuser@dekstop kernel]#  find -type f -exec cat {} > /dev/null \;
cat: ./servers/logo_description.txt: Input/output error

If multiple files are found, repeat the procedure for each file.
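
When many files are affected, it can help to collect just the names of the unreadable files instead of scanning cat's error messages. The following is a minimal sketch of that idea; the function name list_unreadable is hypothetical and not part of any tool. Note that it reports every file cat fails on, so files that are unreadable because of permissions are listed as well.

```shell
#!/bin/sh
# list_unreadable [DIR]: print the path of every regular file under DIR
# (default: current directory) that cannot be read from start to end.
list_unreadable() {
    dir="${1:-.}"
    # Read each file completely and discard the data; a failing read
    # (for example an input/output error) makes cat exit non-zero.
    find "$dir" -type f -exec sh -c '
        for f in "$@"; do
            cat -- "$f" > /dev/null 2>&1 || printf "%s\n" "$f"
        done
    ' sh {} +
}
```

For example, list_unreadable . > /tmp/bad_files.txt saves the list to a file on another location, so each entry can be processed in STEP 2.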

STEP 2) Copy the healthy portion of the file.

The easiest way to remove the error is simply to delete the file (or overwrite it), but if the healthy portion of the file is worth keeping, the dd utility can recover it:

[myuser@dekstop someproject]# dd if=./servers/logo_description.txt of=./servers/logo_description.txt_tmp bs=512 conv=noerror
dd: error reading './servers/logo_description.txt': Input/output error
944+0 records in
944+0 records out
483328 bytes (483 kB, 472 KiB) copied, 1.63801 s, 295 kB/s
dd: error reading './servers/logo_description.txt': Input/output error
944+0 records in
944+0 records out
483328 bytes (483 kB, 472 KiB) copied, 1.83853 s, 263 kB/s
dd: error reading './servers/logo_description.txt': Input/output error
944+0 records in
944+0 records out
483328 bytes (483 kB, 472 KiB) copied, 2.03279 s, 238 kB/s
dd: error reading './servers/logo_description.txt': Input/output error
944+0 records in
944+0 records out
483328 bytes (483 kB, 472 KiB) copied, 2.21598 s, 218 kB/s
dd: error reading './servers/logo_description.txt': Input/output error
944+0 records in
944+0 records out
483328 bytes (483 kB, 472 KiB) copied, 2.3993 s, 201 kB/s
dd: error reading './servers/logo_description.txt': Input/output error
944+0 records in
944+0 records out
483328 bytes (483 kB, 472 KiB) copied, 2.58314 s, 187 kB/s
dd: error reading './servers/logo_description.txt': Input/output error
944+0 records in
944+0 records out
483328 bytes (483 kB, 472 KiB) copied, 2.7656 s, 175 kB/s
dd: error reading './servers/logo_description.txt': Input/output error
944+0 records in
944+0 records out
483328 bytes (483 kB, 472 KiB) copied, 2.95258 s, 164 kB/s
dd: error reading './servers/logo_description.txt': Input/output error
992+0 records in
992+0 records out
507904 bytes (508 kB, 496 KiB) copied, 4.782 s, 106 kB/s
dd: error reading './servers/logo_description.txt': Input/output error
992+0 records in
992+0 records out
507904 bytes (508 kB, 496 KiB) copied, 4.96481 s, 102 kB/s
dd: error reading './servers/logo_description.txt': Input/output error
992+0 records in
992+0 records out
507904 bytes (508 kB, 496 KiB) copied, 5.15198 s, 98.6 kB/s
dd: error reading './servers/logo_description.txt': Input/output error
992+0 records in
992+0 records out
507904 bytes (508 kB, 496 KiB) copied, 5.33217 s, 95.3 kB/s
dd: error reading './servers/logo_description.txt': Input/output error
992+0 records in
992+0 records out
507904 bytes (508 kB, 496 KiB) copied, 5.51823 s, 92.0 kB/s
dd: error reading './servers/logo_description.txt': Input/output error
992+0 records in
992+0 records out
507904 bytes (508 kB, 496 KiB) copied, 5.71177 s, 88.9 kB/s
dd: error reading './servers/logo_description.txt': Input/output error
992+0 records in
992+0 records out
507904 bytes (508 kB, 496 KiB) copied, 5.89475 s, 86.2 kB/s
dd: error reading './servers/logo_description.txt': Input/output error
992+0 records in
992+0 records out
.....
.....
7824384 bytes (7.8 MB, 7.5 MiB) copied, 18.3921 s, 425 kB/s
15452+1 records in
15452+1 records out
7911850 bytes (7.9 MB, 7.5 MiB) copied, 18.6403 s, 424 kB/s
[myuser@dekstop someproject]# stat ./servers/logo_description.txt
  File: ./servers/logo_description.txt
  Size: 7942570         Blocks: 15520      IO Block: 4096   regular file
Device: 253,0   Inode: 22677767    Links: 1
Access: (0444/-r--r--r--)  Uid: ( 1000/   myuser)   Gid: (  1000/   myuser)
Access: 2021-09-12 22:03:34.339990902 +0300
Modify: 2019-08-13 11:02:31.234818571 +0300
Change: 2021-09-12 20:40:36.099997049 +0300
 Birth: 2021-09-12 20:40:36.049997049 +0300
[myuser@dekstop someproject]# stat ./servers/logo_description.txt_tmp 
  File: ./servers/logo_description.txt_tmp
  Size: 7911850         Blocks: 15456      IO Block: 4096   regular file
Device: 253,0   Inode: 22677890    Links: 1
Access: (0644/-rw-r--r--)  Uid: (    1000/    myuser)   Gid: (    1000/    myuser)
Access: 2022-05-26 02:55:28.541391526 +0300
Modify: 2022-05-26 02:55:47.181498428 +0300
Change: 2022-05-26 02:55:47.181498428 +0300
 Birth: 2022-05-26 02:55:28.541391526 +0300
[myuser@dekstop someproject]# mv ./servers/logo_description.txt_tmp ./servers/logo_description.txt

Some of the output is trimmed because multiple READ errors are reported, but still, 7911850 of 7942570 bytes are recovered. The missing 30720 bytes correspond to 60 blocks of 512 bytes that could not be read.
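
With conv=noerror alone, dd leaves the unreadable blocks out of the copy, so the recovered file is shorter than the original (compare the two Size values in the stat output above) and all data after a bad block moves to a lower offset. If the offsets matter, for example in a binary file, one possible variation is to add conv=sync, which pads each failed block with NUL bytes. The sketch below assumes GNU coreutils (dd, stat -c, truncate); the function name salvage_file and the .salvaged suffix are hypothetical.

```shell
#!/bin/sh
# salvage_file SRC [DST]: copy SRC to DST (default: SRC.salvaged) in
# 512-byte blocks, replacing unreadable blocks with NUL bytes.
salvage_file() {
    src="$1"
    dst="${2:-$1.salvaged}"
    # noerror: keep going after a read error.
    # sync:    pad every short or failed input block with NULs up to bs,
    #          so the data that follows keeps its original offset.
    dd if="$src" of="$dst" bs=512 conv=noerror,sync
    # sync also pads the final partial block, so cut the copy back
    # to the size of the original file.
    truncate -s "$(stat -c %s "$src")" "$dst"
}
```

After checking the result, the copy can replace the original with mv as shown above. The new file gets default permissions (0644 instead of 0444 in the stat output), so chmod may be needed if the original mode matters.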

STEP 3) Continue to use git or whatever program was failing before.

After “repairing” the file, the git status command works again:

[myuser@dekstop someproject]# git status -v
Refresh index: 100% (115708/115708), done.
On branch master
Changes not staged for commit:
  (use "git add/rm <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   servers/logo_description.txt
        modified:   servers/network_ifconfig.txt
        modified:   servers/processses_all.txt
        
no changes added to commit (use "git add" and/or "git commit -a")

No more segfaults or bus errors!
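
Git now reports the repaired file as modified, because the recovered copy differs from the committed one. If the file was committed before the damage and the repository's object store is still readable (both are assumptions that may not hold on a failing drive), the intact version can be written back from Git, as the hint in the status output suggests. A minimal sketch, with the hypothetical function name restore_from_git:

```shell
#!/bin/sh
# restore_from_git FILE: show how FILE differs from the last commit,
# then overwrite the working-tree copy with the committed version.
restore_from_git() {
    # Summary of the difference between the working tree and HEAD.
    git diff --stat HEAD -- "$1"
    # Write the committed content back; this discards the recovered copy.
    git checkout HEAD -- "$1"
}
```

For example: restore_from_git servers/logo_description.txt. Afterwards git status should no longer list the file as modified.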

Errors in dmesg

The READ errors also show up in the Linux kernel log (dmesg), with output similar to this:

[365583.689704] sd 0:0:0:0: [sda] tag#30 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[365583.689708] sd 0:0:0:0: [sda] tag#30 Sense Key : Medium Error [current] 
[365583.689710] sd 0:0:0:0: [sda] tag#30 Add. Sense: Unrecovered read error - auto reallocate failed
[365583.689713] sd 0:0:0:0: [sda] tag#30 CDB: Read(16) 88 00 00 00 00 00 2d 5d c3 98 00 00 00 38 00 00
[365583.689714] blk_update_request: I/O error, dev sda, sector 761119664 op 0x0:(READ) flags 0x80700 phys_seg 2 prio class 0
[365583.689747] ata1: EH complete
[365583.689820] ata1.00: Enabling discard_zeroes_data
[365583.881137] ata1.00: exception Emask 0x0 SAct 0x8000 SErr 0x0 action 0x0
[365583.881155] ata1.00: irq_stat 0x40000008
[365583.881164] ata1.00: failed command: READ FPDMA QUEUED
[365583.881168] ata1.00: cmd 60/08:78:b0:c3:5d/00:00:2d:00:00/40 tag 15 ncq dma 4096 in
                         res 41/40:08:b0:c3:5d/00:00:2d:00:00/00 Emask 0x409 (media error) <F>
[365583.881177] ata1.00: status: { DRDY ERR }
[365583.881179] ata1.00: error: { UNC }
[365583.881423] ata1.00: supports DRM functions and may not be fully accessible
[365583.881766] ata1.00: disabling queued TRIM support
[365583.884056] ata1.00: supports DRM functions and may not be fully accessible
[365583.884400] ata1.00: disabling queued TRIM support
[365583.886432] ata1.00: configured for UDMA/133
[365583.886448] sd 0:0:0:0: [sda] tag#15 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[365583.886451] sd 0:0:0:0: [sda] tag#15 Sense Key : Medium Error [current] 
[365583.886453] sd 0:0:0:0: [sda] tag#15 Add. Sense: Unrecovered read error - auto reallocate failed
[365583.886455] sd 0:0:0:0: [sda] tag#15 CDB: Read(16) 88 00 00 00 00 00 2d 5d c3 b0 00 00 00 08 00 00
[365583.886456] blk_update_request: I/O error, dev sda, sector 761119664 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[365583.886472] ata1: EH complete
[365583.886540] ata1.00: Enabling discard_zeroes_data
[365584.061126] ata1.00: exception Emask 0x0 SAct 0x20000 SErr 0x0 action 0x0
[365584.061131] ata1.00: irq_stat 0x40000008
[365584.061134] ata1.00: failed command: READ FPDMA QUEUED
[365584.061135] ata1.00: cmd 60/08:88:b0:c3:5d/00:00:2d:00:00/40 tag 17 ncq dma 4096 in
                         res 41/40:08:b0:c3:5d/00:00:2d:00:00/00 Emask 0x409 (media error) <F>
[365584.061141] ata1.00: status: { DRDY ERR }
[365584.061142] ata1.00: error: { UNC }
[365584.061386] ata1.00: supports DRM functions and may not be fully accessible
[365584.061718] ata1.00: disabling queued TRIM support
[365584.064153] ata1.00: supports DRM functions and may not be fully accessible
[365584.064897] ata1.00: disabling queued TRIM support
[365584.066937] ata1.00: configured for UDMA/133
[365584.066978] sd 0:0:0:0: [sda] tag#17 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[365584.066987] sd 0:0:0:0: [sda] tag#17 Sense Key : Medium Error [current] 
[365584.066991] sd 0:0:0:0: [sda] tag#17 Add. Sense: Unrecovered read error - auto reallocate failed
[365584.066995] sd 0:0:0:0: [sda] tag#17 CDB: Read(16) 88 00 00 00 00 00 2d 5d c3 b0 00 00 00 08 00 00
[365584.066996] blk_update_request: I/O error, dev sda, sector 761119664 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[365584.067030] ata1: EH complete
[365584.068522] ata1.00: Enabling discard_zeroes_data
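
The log is long, but the useful part is the device and sector number in the "I/O error" lines. As a rough sketch, they can be filtered out with sed; the function name failed_sectors is hypothetical, and the pattern assumes the line format shown above, which may differ between kernel versions.

```shell
#!/bin/sh
# failed_sectors: read kernel log text on stdin and print each distinct
# "device sector" pair found in block-layer I/O error lines.
failed_sectors() {
    sed -n 's/.*I\/O error, dev \([^,]*\), sector \([0-9][0-9]*\).*/\1 \2/p' |
        sort -u
}
```

Usage: dmesg | failed_sectors. For the log above, all three error lines refer to the same sector, so a single line is printed: sda 761119664.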
