We don't actually need more lists, I expect. Dirty and under-writeback
pages just don't go on a list at all - cut them off the LRU and
bring them back at IO completion. We can't do anything useful with
a list of dirty/writeback pages anyway, so why have the list?
It kind of depends whether we want to put swapcache on that list. I
may just give swapper_inode a superblock and let pdflush write swap.
The interrupt-time page motion is of course essential if we are to
avoid long scans of that list.
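To make the mechanism concrete, here is a minimal userland sketch of the idea - this is illustrative code, not actual kernel source; the names (`set_page_writeback`, `end_page_writeback`, `inactive_list`) are stand-ins for whatever the real VM would use. A page entering writeback is unhooked from the LRU entirely, so reclaim scans never see it; the IO completion path (interrupt time in the real kernel) hooks it back on:

```c
/* Illustrative model only -- not kernel code. */
#include <stdbool.h>
#include <stddef.h>

struct list_head { struct list_head *prev, *next; };

static void list_init(struct list_head *h) { h->prev = h->next = h; }

static void list_add(struct list_head *n, struct list_head *h)
{
    n->next = h->next; n->prev = h;
    h->next->prev = n; h->next = n;
}

static void list_del(struct list_head *n)
{
    n->prev->next = n->next; n->next->prev = n->prev;
    n->prev = n->next = n;
}

static bool list_empty(struct list_head *h) { return h->next == h; }

struct page {
    struct list_head lru;
    bool writeback;
};

static struct list_head inactive_list;

/* Page is going under writeback: unhook it from the LRU entirely,
 * so reclaim scans never see (and never have to skip) it. */
static void set_page_writeback(struct page *p)
{
    p->writeback = true;
    list_del(&p->lru);
}

/* IO completion (interrupt context in the real kernel): the page
 * is clean again, so put it back on the LRU for reclaim to find. */
static void end_page_writeback(struct page *p)
{
    p->writeback = false;
    list_add(&p->lru, &inactive_list);
}
```

The point of doing the re-add at interrupt time is exactly the one above: if clean pages only returned to the LRU lazily, reclaim would have to scan past a pile of unhooked or unreclaimable pages to find them.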
That, and replacing the blk_congestion_wait() throttling with a per-classzone
wait_for_some_pages_to_come_clean() throttling pretty much eliminates the
remaining pointless scan activity from the VM, and fixes a current false OOM
scenario in -mm4.
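The throttling change can be sketched like this - again a hedged userland model, not the real implementation; the `classzone` struct, field names and the condition-variable mechanics are my assumptions. The essential property is that a task wanting to dirty pages in an over-dirty zone just sleeps until IO completions have made some pages clean, instead of polling or rescanning:

```c
/* Illustrative sketch only -- not kernel code. */
#include <pthread.h>

struct classzone {
    pthread_mutex_t lock;
    pthread_cond_t  pages_cleaned;
    int nr_dirty;       /* dirty/writeback pages in this zone */
    int dirty_limit;    /* above this, dirtiers must wait */
};

/* Called from the IO completion path: one page came clean. */
static void zone_page_came_clean(struct classzone *z)
{
    pthread_mutex_lock(&z->lock);
    z->nr_dirty--;
    pthread_cond_broadcast(&z->pages_cleaned);
    pthread_mutex_unlock(&z->lock);
}

/* Called by a task that wants to dirty pages in an over-dirty zone:
 * block until completions bring the zone back under its limit.
 * No scan activity happens here -- we just sleep until woken. */
static void wait_for_some_pages_to_come_clean(struct classzone *z)
{
    pthread_mutex_lock(&z->lock);
    while (z->nr_dirty > z->dirty_limit)
        pthread_cond_wait(&z->pages_cleaned, &z->lock);
    pthread_mutex_unlock(&z->lock);
}
```

Because the wait is per-classzone rather than global (which is what blk_congestion_wait() gives you), a task dirtying pages in one zone isn't blocked by congestion in another - which is also how a false OOM goes away: the waiter is woken by actual progress, not by a timeout that may fire before anything came clean.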
> but that sure looks like the low hanging fruit.
It's low alright. AFAIK Linux has always had this problem of
seizing up when there's a lot of dirty data around.
Let me quantify infinity:
With mem=512m, on the quad:
`make -j6 bzImage' takes two minutes and two seconds.
On 2.5.34, a concurrent 4 x `dbench 100' slows that same kernel
build down to 35 minutes and 16 seconds.
On 2.5.34-mm4, while running 4 x `dbench 100' that kernel build
takes three minutes and 45 seconds.
That's with seven disks: four for the dbenches, one for the kernel
build, one for swap and one for the executables. Things would be
worse with fewer disks because of seek contention. But that's
to be expected. The intent of this work is to eliminate this
crosstalk between different activities. And to avoid blocking things
which aren't touching disk at all.