OK, this is a straight-up design bug.  Although this can also happen with 
normal memory, it's much more likely to happen with highmem because of heavy 
demand for bounce buffers while a process is in PF_MEMALLOC state.  You can 
just turn off highmem and these messages will go away, or become so rare that 
you are unlikely to ever see one.
Now lets chase the real problem.  The gfp=0x30 tells us the requestor is 
willing to wait (0x10) and that it is not allowed to do any io (0x40) or call 
->writepage (0x80).  (By process of elimination, it's a bounce buffer.) 
Furthermore, this is a recursive memory request (/1) so __alloc_pages won't 
call page_launder because that could hit another allocation request resulting 
in a fatal infinite recursion (note to self: why couldn't we call 
page_launder here, with NOIO?).
There are probably dirty pages in flight and __alloc_pages is allowed to wait 
for them, but it doesn't - it trys reclaim_page once (in 
__alloc_pages_limit), falls the rest of the way through __alloc_pages and 
gives up with NULL.  This is clearly a bad thing because whoever wanted the 
page needs it to do writeout.  Memory users are supposed to be able to 
tolerate alloc failure, but in a case like this, there isn't much choice 
other than to spin.
So what could we do better here?  Well, obviously when there are writeout 
pages in flight, __alloc_pages should wait and not give up.  Secondly, we 
should be sure that when writeout does complete, the newly freeable page is 
given to a PF_MEMALLOC waiter in preference to a normal user.  We don't have 
mechanisms in place for doing either of those things right now, although some 
preliminary design ideas have been discussed.  This gets way outside the 
bound of what we should be doing in 2.4, we will need such things as 
reservations (which Ben has done some work on) and orderly prioritization of 
requests in __alloc_pages, with explicit blocking for low priority requests.  
What can we do right now?  We could always just comment out the alloc failed 
message.  The result will be a lot of busy waiting on dirty page writeout 
which will work but it will keep us from focussing on the question: how did 
we get so short of bounce buffers?  Well, maybe we are submitting too much IO 
without intelligent throttling (/me waves at Ben).  That sounds like the 
place to attack first.
-- Daniel - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/