Re: Unresponsiveness of 2.4.16

Andrew Morton (akpm@zip.com.au)
Mon, 26 Nov 2001 16:36:25 -0800


Rik van Riel wrote:
>
> However, I suspect this unresponsiveness issue is related to
> either IO scheduling or write throttling, and that code is
> the same in both VMs. I'll take a look at smoothing out writes
> so we can get this thing fixed in both VMs.
>

umm... What I said.

balance_dirty_state() is allowing writes to flood the machine
with locked buffers.

elevator is penalising reads horridly. Try this on your
64 megabyte box:

dd if=/dev/zero of=foo bs=1024k count=8000

and then try to log in to it. Be patient. Very patient. Five
minutes pass. Still being patient? In fact with this test I've
never been able to get a login prompt while the dd is running. The
filesystem which holds `foo' is only 8 gigs, so it eventually fills
up, the writes stop, and only then does the login go through.

What happens is this: sshd gets paged out. It wakes up, faults
and tries to read a page. That read gets stuck on the request
queue behind about 50 megabytes of write data. Eventually, it
gets read. Then sshd faults in another page. That gets stuck
on the request queue behind about 50 megabytes of data. By the time
this one gets read, the first page is probably paged out again. See
how this isn't getting us very far?
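
To put rough numbers on that, here is a back-of-the-envelope sketch in
Python. Only the 50 megabytes comes from the description above. The disk
throughput and the number of page faults a login needs are assumptions
picked purely for illustration, and the model ignores pages being evicted
again in the meantime, so if anything it flatters the real behaviour.

```python
def fault_wait_seconds(queued_write_mb, disk_mb_per_s):
    """Time one page-in read waits behind the writes queued ahead of it."""
    return queued_write_mb / disk_mb_per_s


def login_wait_seconds(page_faults, queued_write_mb, disk_mb_per_s):
    """Total wait if every fault queues behind a full backlog of writes."""
    return page_faults * fault_wait_seconds(queued_write_mb, disk_mb_per_s)


QUEUED_WRITE_MB = 50    # from the description above
DISK_MB_PER_S = 20.0    # assumption: sustained write throughput of the disk
PAGE_FAULTS = 100       # assumption: faults needed to get sshd running again

per_fault = fault_wait_seconds(QUEUED_WRITE_MB, DISK_MB_PER_S)
total = login_wait_seconds(PAGE_FAULTS, QUEUED_WRITE_MB, DISK_MB_PER_S)
print(f"per fault: {per_fault:.1f} s, whole login: {total:.0f} s")
```

Under those assumed figures each fault costs a couple of seconds and the
login as a whole costs minutes, which is in line with what the dd test
shows.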

The patch I sent puts read requests near the head of the request
queue, and to hell with aggregate throughput. It's tunable with
`elvtune -b'. And it fixes it.
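
The idea can be shown with a toy model in Python. This is not the patch
and not the kernel's elevator code. The `enqueue' and `kb_ahead_of_read'
helpers and the `read_slot' parameter are made up for this sketch, and
`read_slot' is not meant to mirror what `elvtune -b' actually controls.
The 128 KB write size is likewise an assumption.

```python
def enqueue(queue, request, read_slot=None):
    """Add a (kind, size_kb) request to the queue.

    With read_slot=None everything is appended (plain FIFO). Otherwise a
    read is inserted at most read_slot entries from the head, while
    writes still go to the tail.
    """
    kind, _size_kb = request
    if kind == "read" and read_slot is not None:
        queue.insert(min(read_slot, len(queue)), request)
    else:
        queue.append(request)


def kb_ahead_of_read(queue):
    """Kilobytes queued in front of the first read, or None if no read."""
    ahead = 0
    for kind, size_kb in queue:
        if kind == "read":
            return ahead
        ahead += size_kb
    return None


def build_queue(read_slot):
    """Queue 400 writes of 128 KB (50 MB), then one 4 KB page-in read."""
    queue = []
    for _ in range(400):
        enqueue(queue, ("write", 128), read_slot)
    enqueue(queue, ("read", 4), read_slot)
    return queue


fifo_kb = kb_ahead_of_read(build_queue(read_slot=None))
near_head_kb = kb_ahead_of_read(build_queue(read_slot=2))
print(f"FIFO: {fifo_kb} KB ahead of the read")
print(f"read near head: {near_head_kb} KB ahead of the read")
```

In the FIFO case the read waits for the whole 50 megabytes. With reads
placed near the head it waits for a couple of write requests at most, at
the price of reordering writes and so losing some aggregate throughput.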
