Some remarks:
Mostly for (1) and (4) ... There may be one problem: if a loop device is
used, or someone really does a `cat /dev/zero > /dev/<extremely_big_part>'
(/dev/md<x>), there is a high probability that kflushd itself ends up in
refill_freelist() via ll_rw_block() and getblk(). The probability of
touching the reserved pages may therefore increase.
Next point: in bdflush() the wake_up(&bdflush_done) call should be moved
so that it runs only once, just before bdflush goes to sleep again. To
avoid a deadlock, a check on the _actual_ number of dirty buffers should
also be added. See the corresponding part of patch-2.0.31:
---------------------------------------------------------------------------
@@ -1685,16 +1731,16 @@
* dirty buffers, then make the next write to a
* loop device to be a blocking write.
* This lets us block--which we _must_ do! */
- if (ndirty == 0 && nr_buffers_type[BUF_DIRTY] > 0) {
+ if (ndirty == 0 && nr_buffers_type[BUF_DIRTY] > 0 && wrta_cmd != WRITE) {
wrta_cmd = WRITE;
continue;
}
run_task_queue(&tq_disk);
- wake_up(&bdflush_done);
/* If there are still a lot of dirty buffers around, skip the sleep
and flush some more */
- if(nr_buffers_type[BUF_DIRTY] <= nr_buffers * bdf_prm.b_un.nfract/100) {
+ if(ndirty == 0 || nr_buffers_type[BUF_DIRTY] <= nr_buffers * bdf_prm.b_un.nfract/100) {
+ wake_up(&bdflush_done);
current->signal = 0;
interruptible_sleep_on(&bdflush_wait);
}
---------------------------------------------------------------------------
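To make the effect of the two changed conditions easier to follow, here is
a small userspace model of the decision taken at the end of one bdflush()
pass. It is only a sketch: the function bdflush_next_step(), the enums and
the parameter names are made up for illustration and are not kernel code.
It assumes ndirty is the number of buffers written in the current pass and
nr_dirty stands in for nr_buffers_type[BUF_DIRTY].

```c
#include <assert.h>

/* Hypothetical userspace model, not kernel code. */
enum write_cmd { CMD_WRITEA, CMD_WRITE };
enum next_step { RETRY_BLOCKING, FLUSH_MORE, WAKE_AND_SLEEP };

/*
 * ndirty     - buffers written during this pass
 * nr_dirty   - dirty buffers still on the list
 * nr_buffers - total number of buffers
 * nfract     - percentage threshold (models bdf_prm.b_un.nfract)
 * wrta_cmd   - current write command, switched to blocking on retry
 */
enum next_step bdflush_next_step(int ndirty, int nr_dirty, int nr_buffers,
                                 int nfract, enum write_cmd *wrta_cmd)
{
    assert(wrta_cmd != 0 && nfract >= 0 && nfract <= 100);

    /* Nothing was written although dirty buffers exist: switch to
     * blocking writes and retry, unless we already did so. */
    if (ndirty == 0 && nr_dirty > 0 && *wrta_cmd != CMD_WRITE) {
        *wrta_cmd = CMD_WRITE;
        return RETRY_BLOCKING;
    }

    /* Sleep if this pass wrote nothing, or if the dirty share is at or
     * below nfract percent. The wakeup of waiters happens only here. */
    if (ndirty == 0 || nr_dirty <= nr_buffers * nfract / 100)
        return WAKE_AND_SLEEP;

    return FLUSH_MORE;
}
```

With wrta_cmd already set to blocking and ndirty == 0, the model falls
through to WAKE_AND_SLEEP instead of retrying forever, which is the point
of the added `wrta_cmd != WRITE' and `ndirty == 0' checks.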
Werner
> The changes are as follows:
> (1) Do a non-blocking wakeup if there are any dirty buffers. If we
> reached this section of code without finding any available buffers, we
> want to make sure any dirty buffers are flushed.
>
> (2) Allocate buffers to reach half the goal, but don't bother repeating
> the buffer list search. This is a minor change, but since the GFP_BUFFER
> allocations don't block, it just wastes CPU time to search the lists
> again.
>
> (3) Return now if we got any buffers, so that the current task can
> proceed.
>
> (4) If no buffers were found or allocated, do a blocking wakeup of
> bdflush, and then find a locked buffer to wait on. This is the most
> important change from prior practice, as it forces an explicit wait on a
> locked buffer before proceeding with atomic allocations.
>
> In the majority of cases that this code is reached, the system will have
> plenty of buffers, but the buffers will all be locked. Performing atomic
> allocations will drain the system's reserved pages, and just calling
> schedule() wastes CPU time by repeatedly searching the lists instead of
> waiting for an unlock.
>
> (5) In the unlikely event that no locked buffers are found, do an atomic
> allocation, but only if we still have at least half of the reserved pages.
> Allowing this limited access covers the case of using a device with
> non-standard sized buffers.
>
> This code should cover all of the cases pretty well, but it isn't a
> panacea for other memory management problems. In particular, if you're
> having a problem with fragmentation and allocation of network buffers,
> this patch probably won't help much, if at all. But otherwise I would
> expect it to allow arbitrarily extensive i/o without ever running into
> buffer problems.
>
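For reference, the five steps quoted above can be written down as a plain
decision function. This is a userspace sketch only, not the patch itself:
refill_decide(), the structs and all field names are hypothetical, and the
REFILL_RETRY outcome for the case "no locked buffers and fewer than half of
the reserved pages left" is my assumption, since the description does not
say what happens then.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of the quoted steps (1)-(5). */
enum refill_action {
    REFILL_PROCEED,         /* (3) got buffers, caller continues   */
    REFILL_WAIT_ON_LOCKED,  /* (4) wait for a locked buffer        */
    REFILL_ATOMIC_ALLOC,    /* (5) limited atomic allocation       */
    REFILL_RETRY            /* assumption: nothing else to try now */
};

struct refill_state {
    int nr_dirty;        /* dirty buffers                            */
    int nr_obtained;     /* buffers found or allocated non-blocking  */
    int nr_locked;       /* locked buffers available to wait on      */
    int free_pages;      /* currently free pages                     */
    int reserved_pages;  /* size of the reserved page pool           */
};

struct refill_decision {
    bool wake_nonblocking;  /* step (1) */
    bool wake_blocking;     /* step (4) */
    enum refill_action action;
};

struct refill_decision refill_decide(const struct refill_state *s)
{
    struct refill_decision d = { false, false, REFILL_RETRY };

    assert(s != 0);

    /* (1) non-blocking wakeup of bdflush if anything is dirty */
    d.wake_nonblocking = s->nr_dirty > 0;

    /* (2)+(3) non-blocking allocations gave us something: proceed */
    if (s->nr_obtained > 0) {
        d.action = REFILL_PROCEED;
        return d;
    }

    /* (4) blocking wakeup, then wait on a locked buffer if there is one */
    d.wake_blocking = true;
    if (s->nr_locked > 0) {
        d.action = REFILL_WAIT_ON_LOCKED;
        return d;
    }

    /* (5) atomic allocation only while at least half the reserve is left */
    if (2 * s->free_pages >= s->reserved_pages)
        d.action = REFILL_ATOMIC_ALLOC;

    return d;
}
```

The ordering is what matters here: atomic allocation is tried last, and
only after waiting on a locked buffer has been ruled out.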