Re: kupdated, bdflush and kjournald stuck in D state on RAID1 device (deadlock?)

David Rees (dbr@greenhydrant.com)
Wed, 29 Aug 2001 15:38:18 -0700


On Wed, Aug 29, 2001 at 03:26:01PM -0700, Andrew Morton wrote:
> David Rees wrote:
> >
> > On Wed, Aug 29, 2001 at 02:38:23PM -0700, Andrew Morton wrote:
> > >
> > > OK, thanks. bdflush is stuck in raid1_alloc_r1bh() and
> > > everything else is blocked by it. I thought we fixed
> > > that a couple of months ago :(
> > >
> > > Could you send the output of `cat /proc/meminfo'?
> > >
> > > > 18239 -bash wait4
> > > > 18274 umount /opt rwsem_down_write_failed
> > >
> > > What are we trying to do here? Is /opt the deadlocked
> > > filesytem?
> >
> > Yep, /dev/md0 is mounted on /opt.
> >
>
> OK, and according to your /proc/meminfo:
>
> total: used: free: shared: buffers: cached:
> Mem: 525422592 497364992 28057600 0 133500928 335839232
> Swap: 1052794880 4710400 1048084480
> MemTotal: 513108 kB
> MemFree: 27400 kB
> MemShared: 0 kB
> Buffers: 130372 kB
> Cached: 323368 kB
> SwapCached: 4600 kB
> Active: 293704 kB
> Inact_dirty: 161536 kB
> Inact_clean: 3100 kB
> Inact_target: 16 kB
> HighTotal: 0 kB
> HighFree: 0 kB
> LowTotal: 513108 kB
> LowFree: 27400 kB
> SwapTotal: 1028120 kB
> SwapFree: 1023520 kB
>
> it's not an out-of-memory deadlock.
>
> The RAID1 buffer allocation is pretty simple - unless the
> disk controller has decided to stop delivering interrupts,
> everything shold just come back to life as physical writes
> complete. I assume the hardware is still working OK?
>
> It's a uniprocessor machine, yes?

It is a uniprocessor machine, yes. The machine is a 1.1GHz Athlon on a Soyo
motherboard. This is the first problem we've seen with the machine.

The machine was and is still working mostly ok. Since I typed "umount /opt",
the opt directory doesn't show any contents any more, but before that things
appear OK. I can still use fdisk to look at the partition layout of the
drives in /opt raid1 array. /proc/mdstat is normal.

There are no other software raid devices on the machine (/ is also ext3 on a
separate drive/controller). There are no suspicious messages printed from
dmesg, either.

-Dave
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/