PIDVR-11: SD Card Corruption



Issue Information

Issue Type: Bug
 
Priority: Blocker
Status: In Progress

Reported By:
Ben Tasker
Assigned To:
Ben Tasker
Project: PiDVR (PIDVR)
Resolution: Unresolved

Created: 2014-11-20 11:45:09
Time Spent Working


Description
May be related to some recent power blips, but the SD card has corrupted. Will need to see if the filesystem can be repaired.


Attachments

virtualbox_image_creation.txt
Toggle State Changes

Activity


btasker changed priority from 'Major' to 'Blocker'
Running fsck:

~# fsck.ext4 /dev/sdj2 
e2fsck 1.41.11 (14-Mar-2010)
/dev/sdj2: recovering journal
Error reading block 341658 (Attempt to read block from filesystem resulted in short read).  Ignore error<y>? yes

Force rewrite<y>? yes

fsck.ext4: Attempt to read block from filesystem resulted in short read while trying to re-open /dev/sdj2
e2fsck: io manager magic bad!


Looks like seeing io manager magic bad! is a particularly bad thing - https://www.redhat.com/archives/ext3-users/2002-September/msg00160.html


Syslog has

Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392069] INFO: task hald-addon-stor:6863 blocked for more than 120 seconds.
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392074] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392077] hald-addon-st D 000059f2     0  6863   8457 0x00000000
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392084]  ed169bc8 00000082 00000000 000059f2 00000000 c08914c0 dcf64264 c08914c0
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392093]  984d1049 00040268 c08914c0 c08914c0 dcf64264 c08914c0 c08914c0 ef5ba840
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392101]  984c6b68 00040268 dcf63fc0 7fffffff 7fffffff ed169c88 ed169c28 c05b5e9d
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392110] Call Trace:
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392119]  [<c05b5e9d>] schedule_timeout+0x1ad/0x280
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392125]  [<c03594ed>] ? kobject_put+0x1d/0x50
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392131]  [<c01310e8>] ? default_spin_lock_flags+0x8/0x10
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392135]  [<c05b7698>] ? _spin_lock_irq+0x18/0x20
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392139]  [<c05b5b56>] wait_for_common+0xa6/0x130
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392145]  [<c014daa0>] ? default_wake_function+0x0/0x20
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392149]  [<c05b5c97>] wait_for_completion+0x17/0x20
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392154]  [<c034b69a>] blk_execute_rq+0x7a/0xd0
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392158]  [<c034b550>] ? blk_end_sync_rq+0x0/0x30
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392165]  [<c0347c72>] ? get_request_wait+0x22/0x170
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392169]  [<c013ac1e>] ? enqueue_task+0x7e/0xa0
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392174]  [<c013c74f>] ? check_preempt_curr+0x4f/0x60
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392178]  [<c0347e23>] ? blk_get_request+0x63/0x80
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392183]  [<c04192d7>] scsi_execute+0xb7/0x140
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392188]  [<c041951a>] scsi_execute_req+0x8a/0x140
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392192]  [<c041966d>] scsi_test_unit_ready+0x9d/0x1a0
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392198]  [<c0432aef>] ? sd_media_changed+0x12f/0x260
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392202]  [<c0432b1e>] sd_media_changed+0x15e/0x260
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392207]  [<c0359652>] ? kobject_get+0x12/0x20
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392212]  [<c03face8>] ? get_device+0x18/0x20
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392228]  [<c023df96>] check_disk_change+0x26/0x60
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392232]  [<c04300d2>] sd_open+0x82/0x1b0
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392235]  [<c03594ed>] ? kobject_put+0x1d/0x50
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392239]  [<c0430050>] ? sd_open+0x0/0x1b0
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392242]  [<c023ede9>] __blkdev_get+0x1d9/0x340
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392247]  [<c021f242>] ? __link_path_walk+0x632/0xca0
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392251]  [<c023ef5f>] blkdev_get+0xf/0x20
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392254]  [<c023efd3>] blkdev_open+0x63/0xb0
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392259]  [<c0212435>] __dentry_open+0xe5/0x290
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392263]  [<c030305d>] ? security_inode_permission+0x1d/0x30
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392267]  [<c021d43e>] ? inode_permission+0x9e/0xb0
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392271]  [<c02126dd>] nameidata_to_filp+0x5d/0x70
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392275]  [<c023ef70>] ? blkdev_open+0x0/0xb0
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392279]  [<c0220e02>] do_filp_open+0x4d2/0x990
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392283]  [<c0213ff4>] ? do_sync_read+0xc4/0x100
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392287]  [<c0212195>] do_sys_open+0x55/0x160
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392291]  [<c021230e>] sys_open+0x2e/0x40
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392295]  [<c01097cc>] syscall_call+0x7/0xb
Nov 20 16:46:56 milleniumfalcon kernel: [1128722.392300]  [<c05b0000>] ? init_amd+0x1fa/0x22f
Nov 20 16:47:01 milleniumfalcon kernel: [1128728.165130] sd 7:0:0:3: timing out command, waited 180s
Nov 20 16:47:01 milleniumfalcon kernel: [1128728.165142] sd 7:0:0:3: [sdj] Unhandled sense code
Nov 20 16:47:01 milleniumfalcon kernel: [1128728.165145] sd 7:0:0:3: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Nov 20 16:47:01 milleniumfalcon kernel: [1128728.165151] sd 7:0:0:3: [sdj] Sense Key : Hardware Error [current] 
Nov 20 16:47:01 milleniumfalcon kernel: [1128728.165157] sd 7:0:0:3: [sdj] Add. Sense: Unrecovered read error
Nov 20 16:47:01 milleniumfalcon kernel: [1128728.165163] sd 7:0:0:3: [sdj] CDB: Read(10): 28 00 00 01 e0 00 00 00 08 00
Nov 20 16:47:01 milleniumfalcon kernel: [1128728.165177] end_request: I/O error, dev sdj, sector 122880
Nov 20 16:47:01 milleniumfalcon kernel: [1128728.165182] Buffer I/O error on device sdj2, logical block 0



Testing with badblocks gives a lot of output

root@milleniumfalcon:~# badblocks -n -v /dev/sdj
Checking for bad blocks in non-destructive read-write mode
From block 0 to 3813375
Testing with random pattern: badblocks: Input/output error during test data write, block 0
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
.. snip..



The card wasn't brand new when it went in, but it looks like we've probably killed it rather than it being on the way out when it went in.

Should (hopefully) have a backup image from partway through the build (otherwise we'll see how good my notekeeping is...) but suspect it might be better to restore onto a USB stick rather than an SD card.
btasker changed status from 'Open' to 'In Progress'
To get things moving again (and to avoid knackering another SD card), am going to rebuild in a Wheezy based Virtual Machine. One the basic software setup is designed/in place can look at migrating the setup to a Pi.

Using a VM will also allow the disk usage profile to be assessed to reduce the risk of regularly killing SD cards.
For reference, am attaching the steps used to build the VM
btasker added 'virtualbox_image_creation.txt' to Attachments
I think given the issues seen in PIDVR-12 I'm probably going to give up on the VM route. May get around to building another Pi based system at some point, but at the moment I'm not feeling particularly motivated to continue as a quick review of the TV listings suggests there's currently nothing worth recording.