Yes, to phrase this more precisely, after we've submitted all the 
too-old buffers we then gain the freedom to select which of the younger 
buffers to flush.  When there is memory pressure we could benefit by 
skipping over some of the sys_write buffers in favor of page_launder 
buffers.  We may well be able to recognize the latter by looking for 
!bh->b_page->age.  This method would be an alternative to your 
writepage approach.
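To make the selection order concrete, here is a minimal userspace sketch, not kernel code. The struct definitions are toy stand-ins that carry only the fields mentioned above, and looks_laundered(), pick_next() and the memory_pressure flag are hypothetical names invented for illustration. It also compares times directly and ignores jiffies wraparound.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures: only the fields used here. */
struct page {
    unsigned long age;
};

struct buffer_head {
    unsigned long b_flushtime;   /* time at which the buffer counts as too old */
    struct page *b_page;
};

/* Hypothetical helper: guess that a buffer came from page_launder
 * because its page has aged down to zero (the !bh->b_page->age test). */
static int looks_laundered(const struct buffer_head *bh)
{
    return bh->b_page != NULL && !bh->b_page->age;
}

/*
 * Hypothetical policy: return the index of the next buffer to submit.
 *   1. Any too-old buffer goes first, in list order.
 *   2. Otherwise, under memory pressure, prefer a laundered buffer.
 *   3. Otherwise take the head of the list.
 * Returns -1 when the list is empty.
 */
static int pick_next(const struct buffer_head *bufs, int n,
                     unsigned long now, int memory_pressure)
{
    int i;

    assert(n >= 0);

    for (i = 0; i < n; i++)
        if (bufs[i].b_flushtime <= now)
            return i;

    if (memory_pressure)
        for (i = 0; i < n; i++)
            if (looks_laundered(&bufs[i]))
                return i;

    return n > 0 ? 0 : -1;
}
```

With no too-old buffers left and memory pressure set, this skips a sys_write buffer whose page is still young and picks the first buffer whose page age is zero.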
> > By the way, I think you should combine (2) and (3) using an and,
> > which gets us back to the "kupdate thing" vs the "bdflush thing".
>
> Perhaps, since I think they would be handled in roughly the same way.
(warning: I'm going to drift pretty far off the original topic now...)
I don't see why it makes sense to have both a kupdate and a bdflush 
thread.  We should complete the process of merging these (sharing 
flush_dirty_buffers was a big step) and look into the possibility of 
adding more intelligence about what to submit next.  The proof of the 
pudding is to come up with a throughput-improving patch, which is not so easy 
since the ore in these hills has been sought after for a good number of 
years by many skilled prospectors.
Note that bdflush also competes with an unbounded number of threads 
doing wakeup_bdflush(1)->flush_dirty_buffers.
These are called through balance_dirty:
  mark_buffer_dirty->balance_dirty
  __block_commit_write->balance_dirty
  refill_freelist->balance_dirty
(Curiously, refill_freelist also calls wakeup_bdflush(1) directly.)  You 
can see that each of these paths is very popular, and as soon as we pass 
the hard_dirty_limit everybody will jump in to try to help with buffer 
writeout.
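The pile-up can be illustrated with a toy model, under loud assumptions: this is not the real balance_dirty, every toy_ name is hypothetical, and it models only the one property described above, namely that any caller who finds the dirty count over hard_dirty_limit enters the flush itself.

```c
#include <assert.h>

/* Toy model state: all names here are made up for illustration. */
struct toy_state {
    unsigned long nr_dirty;          /* dirty buffers outstanding */
    unsigned long hard_dirty_limit;  /* threshold past which callers flush */
    int active_flushers;             /* callers currently inside the flush */
    int max_flushers;                /* peak value of active_flushers */
};

/* Model of a caller reaching balance_dirty. Returns 1 if the caller
 * ended up in the flush path itself, 0 if it returned immediately. */
static int toy_balance_dirty(struct toy_state *s)
{
    if (s->nr_dirty <= s->hard_dirty_limit)
        return 0;

    /* Over the hard limit: nothing stops this caller from joining in. */
    s->active_flushers++;
    if (s->active_flushers > s->max_flushers)
        s->max_flushers = s->active_flushers;
    return 1;
}

/* Worst case: ncallers all arrive before any flush completes.
 * Returns the peak number of simultaneous flushers. */
static int toy_simulate(struct toy_state *s, int ncallers)
{
    int i;

    assert(ncallers >= 0);
    for (i = 0; i < ncallers; i++)
        toy_balance_dirty(s);
    return s->max_flushers;
}
```

In this model the peak number of flushers equals the number of callers once the limit is exceeded, which is the unbounded behavior in question.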
As I recall, the current arrangement was arrived at after a flurry of 
dbench-inspired tweaking last fall and hasn't changed much since then.  
I think we need to take another look at this.  My instinct is that 
it's wrong to ever have more than one instance of flush_dirty_buffers 
active per spindle, and that the current arrangement is an attempt to 
reduce context switches or perhaps to keep buffer submission flowing 
even when page_launder blocks on writepage-><read metadata 
synchronously>.  There has to be a cleaner way to approach this.
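One possible shape for "at most one flusher per spindle" is a per-device try-lock, sketched below in userspace C11. This is only an assumption about how such a guard might look, not a proposal for the actual kernel code: toy_flush_begin(), toy_flush_end() and the fixed spindle table are hypothetical, and a real version would need to decide what a refused caller does instead (sleep, or just return).

```c
#include <assert.h>
#include <stdatomic.h>

#define TOY_MAX_SPINDLES 4

/* One flag per spindle: set while a flush of that spindle is active. */
static atomic_flag toy_flushing[TOY_MAX_SPINDLES] = {
    ATOMIC_FLAG_INIT, ATOMIC_FLAG_INIT, ATOMIC_FLAG_INIT, ATOMIC_FLAG_INIT
};

/* Try to become the single flusher for a spindle.
 * Returns 1 on success, 0 if another flush is already running there. */
static int toy_flush_begin(int spindle)
{
    assert(spindle >= 0 && spindle < TOY_MAX_SPINDLES);
    /* test_and_set returns the previous value: true means already held. */
    return !atomic_flag_test_and_set(&toy_flushing[spindle]);
}

/* Release the spindle so the next caller may flush it. */
static void toy_flush_end(int spindle)
{
    assert(spindle >= 0 && spindle < TOY_MAX_SPINDLES);
    atomic_flag_clear(&toy_flushing[spindle]);
}
```

A caller would wrap its flush in toy_flush_begin()/toy_flush_end() and back off when begin returns 0, so a second thread arriving for the same spindle never runs a competing flush, while flushes on other spindles proceed independently.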
-- Daniel