Re: kupdated, bdflush and kjournald stuck in D state on RAID1 device (deadlock?)

Neil Brown (neilb@cse.unsw.edu.au)
Thu, 30 Aug 2001 10:39:16 +1000 (EST)


On Wednesday August 29, dbr@greenhydrant.com wrote:
> On Wed, Aug 29, 2001 at 03:48:12PM -0700, Andrew Morton wrote:
> > David Rees wrote:
> >
> > Are you able to access all the underlying devices on the array?
> > For example, if /dev/md0 consists of /dev/hda1 and /dev/hdb2,
> > can you run 'cp /dev/hda1 /dev/null' and 'cp /dev/hdb1 /dev/null'?
> >
> > If so, then I'm all out of ideas. Your raid1 buffers have disappeared
> > into thin air :(
>
> Copying as I write this (actually, `cat /dev/hde1 > /dev/null` and `cat
> /dev/hdg1 >/dev/null`).
>
> Well, if no-one knows, I'll reboot and cross my fingers that it doesn't
> happen again.

Thanks David and Andrew for providing all the helpful details.
I know what happened. As Andrew said, the raid1 buffers have simply
disappeared into thin air.
The line that makes them invisible is
    r1_bh->state = 0;
at line 165 in drivers/md/raid1.c. This should be more like
    r1_bh->state = (1 << R1BH_PreAlloc);
We need to clear the Uptodate bit and the Phase bit, but not
the PreAlloc bit.
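
To illustrate why losing that bit makes buffers vanish, here is a small
user-space sketch, not the driver code. It assumes that the free path only
puts a buffer back on the free list when its PreAlloc bit is still set.
The struct names, pool_get(), pool_put() and the bit numbers are
hypothetical stand-ins chosen for the example; check raid1.h and raid1.c
in your own tree for the real definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Bit numbers are assumed for illustration only. */
enum { R1BH_Uptodate = 1, R1BH_SyncPhase = 2, R1BH_PreAlloc = 3 };

/* Hypothetical stand-in for the driver's per-request buffer. */
struct r1bh_model {
    unsigned long state;
    struct r1bh_model *next_r1;
};

/* Hypothetical stand-in for the free list kept in the raid1 conf. */
struct pool_model {
    struct r1bh_model *freer1;
    int freer1_cnt;
};

/* Take a buffer off the free list and reset its state to new_state. */
static struct r1bh_model *pool_get(struct pool_model *p,
                                   unsigned long new_state)
{
    struct r1bh_model *b;

    assert(p != NULL);
    b = p->freer1;
    if (b == NULL)
        return NULL;            /* pool empty: a real caller would sleep */
    p->freer1 = b->next_r1;
    p->freer1_cnt--;
    b->next_r1 = NULL;
    b->state = new_state;
    return b;
}

/*
 * Give a buffer back. Assumption: only buffers still marked PreAlloc are
 * requeued; anything else is treated as a one-off allocation and dropped.
 * Returns 1 if the buffer went back on the list, 0 if it was dropped.
 */
static int pool_put(struct pool_model *p, struct r1bh_model *b)
{
    if (!(b->state & (1UL << R1BH_PreAlloc)))
        return 0;               /* the pool has just shrunk by one */
    b->next_r1 = p->freer1;
    p->freer1 = b;
    p->freer1_cnt++;
    return 1;
}
```

Calling pool_get(&pool, 0) models the old assignment: the buffer that
comes back from pool_put() is dropped and freer1_cnt never recovers.
Calling pool_get(&pool, 1UL << R1BH_PreAlloc) models the proposed fix:
the other bits are still cleared, but the buffer returns to the list.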

Linus: Please consider applying this patch.

NeilBrown

--- drivers/md/raid1.c 2001/08/30 00:36:54 1.1
+++ drivers/md/raid1.c 2001/08/30 00:37:03
@@ -162,7 +162,7 @@
 		conf->freer1 = r1_bh->next_r1;
 		conf->freer1_cnt--;
 		r1_bh->next_r1 = NULL;
-		r1_bh->state = 0;
+		r1_bh->state = (1 << R1BH_PreAlloc);
 		r1_bh->bh_req.b_state = 0;
 	}
 	md_spin_unlock_irq(&conf->device_lock);