Re: ext3-2.4-0.9.0

Neil Brown (neilb@cse.unsw.edu.au)
Sun, 8 Jul 2001 16:02:23 +1000 (EST)


On Sunday July 8, andrewm@uow.edu.au wrote:
>
> Could well be. ext3 will happily feed 2,000 buffers into submit_bh()
> prior to running tq_disk. Everything else is happy with this, so I blame
> nfsd and raid5 :) Rapid fsyncs will break this up, however.
>

raid5 is definitely happy with large sequences of requests between
tq_disk runs (in fact, that is best), but I think I have found a situation
where lots of small requests can confuse it. It seems that your
intuition about the direction of blame is better than mine :-)

When a write request arrives at raid5, the queue is (potentially)
plugged, and then the request is (potentially) queued, and there is a
window between the two where the queue can be unplugged by another
process. If this happens, then the tq_disk run that follows the write
request will not wake up raid5d, so the raid5 queue will not be
run, and the request will just sit there until something else causes
raid5d to run.
I'm guessing that ext3 imposes more sequencing on requests than ext2
does, and so it is easier for one request being stalled to stall the
whole filesystem.
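
The window described above can be replayed in a toy, single-threaded C
model. This is only a sketch under simplifying assumptions: every name in
it (toy_conf, toy_plug, toy_unplug_old and so on) is hypothetical and none
of it is the actual raid5 code. It only contrasts "wake the worker only if
the queue was plugged" with "always wake the worker on unplug".

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, not the kernel's data structures. */
struct toy_conf {
    bool plugged;       /* queue is plugged, waiting for an unplug */
    int  delayed;       /* requests parked until the worker runs */
    int  handled;       /* requests the worker has processed */
    bool wake_pending;  /* worker has been asked to run */
};

static void toy_plug(struct toy_conf *c)  { c->plugged = true; }
static void toy_queue(struct toy_conf *c) { c->delayed++; }

/* Unplug, conditional variant: wake the worker only if plugged. */
static void toy_unplug_old(struct toy_conf *c)
{
    if (c->plugged) {
        c->plugged = false;
        c->wake_pending = true;
    }
}

/* Unplug, unconditional variant: always wake the worker. */
static void toy_unplug_new(struct toy_conf *c)
{
    if (c->plugged)
        c->plugged = false;
    c->wake_pending = true;
}

/* Worker: runs only when woken, then drains whatever is queued. */
static void toy_worker(struct toy_conf *c)
{
    if (!c->wake_pending)
        return;
    c->wake_pending = false;
    c->handled += c->delayed;
    c->delayed = 0;
}

/* Replays the interleaving; returns the number of requests left stalled. */
int toy_replay(void (*unplug)(struct toy_conf *))
{
    struct toy_conf c = {0};

    toy_plug(&c);            /* writer: plugs the queue */
    unplug(&c);              /* other process: unplugs inside the window */
    toy_worker(&c);          /* worker runs but nothing is queued yet */
    assert(c.handled == 0);
    toy_queue(&c);           /* writer: only now queues its request */
    unplug(&c);              /* the tq_disk run that follows the write */
    toy_worker(&c);
    return c.delayed;
}
```

In this model toy_replay(toy_unplug_old) leaves one request stalled, while
toy_replay(toy_unplug_new) leaves none.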

In any case, the following patch against raid5 seems to have relieved the
situation, but more testing is underway.

So ThankYou to ext3 for helping to find a bug in raid5 :-)

NeilBrown

--- drivers/md/raid5.c 2001/07/07 06:23:02 1.1
+++ drivers/md/raid5.c 2001/07/08 00:22:52
@@ -66,9 +66,10 @@
 			BUG();
 		if (atomic_read(&conf->active_stripes)==0)
 			BUG();
-		if (test_bit(STRIPE_DELAYED, &sh->state))
+		if (test_bit(STRIPE_DELAYED, &sh->state)) {
 			list_add_tail(&sh->lru, &conf->delayed_list);
-		else if (test_bit(STRIPE_HANDLE, &sh->state)) {
+			md_wakeup_thread(conf->thread);
+		} else if (test_bit(STRIPE_HANDLE, &sh->state)) {
 			list_add_tail(&sh->lru, &conf->handle_list);
 			md_wakeup_thread(conf->thread);
 		} else {
@@ -1167,10 +1168,9 @@

 	raid5_activate_delayed(conf);
 
-	if (conf->plugged) {
+	if (conf->plugged)
 		conf->plugged = 0;
-		md_wakeup_thread(conf->thread);
-	}
+	md_wakeup_thread(conf->thread);
 	spin_unlock_irqrestore(&conf->device_lock, flags);
 }
