RE: [2.2.14 to 2.2.17pre9] infinite or long loop in shrink_mmap()

Willy Tarreau (willy@novworld.Novecom.Fr)
Fri, 30 Jun 2000 00:56:39 +0200 (CEST)


Hi !

Well, at least I've found why my system freezes very quickly when I run
several copies of mmap002: since they all run on the same file, all the pages
have a count > 1.
(Marcello, you don't need to test my proggies and crash your system anymore).

In shrink_mmap(), there's a preliminary test (with a comment above it) saying
that we won't free a page unless it has just one user. All the logs and
counters I've added to the function show that it very quickly starts
skipping thousands of pages because their count is > 1. Moreover, the logs
show that the only pages that were freed were buffer pages.
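
To make the effect of that test concrete, here is a minimal user-space
sketch. It is a toy model only, not the kernel's shrink_mmap(): struct
toy_page, struct scan_stats and toy_scan() are hypothetical names invented
for illustration, and the only thing modelled is the "just one user" check
described above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a page descriptor; NOT the kernel's struct page. */
struct toy_page {
    int count;  /* number of users currently holding the page */
    int freed;  /* set to 1 once the toy scan has reclaimed it */
};

/* Counters similar in spirit to the ones mentioned in the mail. */
struct scan_stats {
    size_t skipped_busy;  /* pages skipped because count > 1 */
    size_t freed;         /* pages actually reclaimed */
};

/* Walk the pages once and reclaim only those with a single user. */
struct scan_stats toy_scan(struct toy_page *pages, size_t n)
{
    struct scan_stats st = {0, 0};

    assert(pages != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (pages[i].freed)
            continue;
        if (pages[i].count > 1) {  /* more than one user: never freed */
            st.skipped_busy++;
            continue;
        }
        pages[i].freed = 1;
        st.freed++;
    }
    return st;
}
```

In this model, if every page has a count of 2, each pass skips all of the
pages and reclaims none, which matches the behaviour the counters showed.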

To validate this, I've allowed pages to be freed even if they were buffer
pages. With that change, the system does not hang completely. It begins by
filling the swap, and there's always a bit of disk activity.

Although I understand we can't leave it that way, I wonder what can be done
to avoid this problem. Concretely, does this mean that any user would just
have to call mmap() twice on the same file to lock the pages for good and
kill the system?

Please excuse my lack of knowledge of MM, but there's still something I don't
understand: the "referenced" flag for a page is taken from a test_and_clear().
If the page ends up not being freed, its flag is left unset even if it was set
before, so further lookups will logically find it unset. To me, this means
that 2 consecutive calls to shrink_mmap() may behave differently. I would have
expected a test_and_set_bit() before every continue statement when
referenced was set.
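
The flag question can be illustrated with a small user-space sketch. Again
this is a toy model under stated assumptions, not the kernel code:
TOY_REFERENCED, struct toy_page, toy_test_and_clear() and toy_visit() are
hypothetical names, the bit operation is a plain non-atomic stand-in, and
the restore_on_skip switch only models the idea of putting the flag back
before a continue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_REFERENCED 0x1u  /* hypothetical flag bit for this model */

/* Toy stand-in for a page descriptor; NOT the kernel's struct page. */
struct toy_page {
    unsigned flags;
    int count;  /* number of users of the page */
};

/* Non-atomic stand-in for a test-and-clear bit operation. */
static bool toy_test_and_clear(unsigned *flags, unsigned bit)
{
    bool was_set = (*flags & bit) != 0;

    *flags &= ~bit;
    return was_set;
}

/*
 * One scan visit to a page. Returns the value of the referenced flag as
 * seen by this visit. When the page is skipped because it has more than
 * one user, the flag is put back only if restore_on_skip is true.
 */
bool toy_visit(struct toy_page *p, bool restore_on_skip)
{
    assert(p != NULL);

    bool referenced = toy_test_and_clear(&p->flags, TOY_REFERENCED);

    if (p->count > 1) {  /* skipped: the page is not freed */
        if (restore_on_skip && referenced)
            p->flags |= TOY_REFERENCED;
    }
    return referenced;
}
```

Without the restore, two consecutive visits to the same shared page see the
flag as set and then unset. With it, both visits see the same value.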

I'm sorry, I must stop for now, but I'll have some more spare time next
week for other tests. Andrea, I haven't tested all your patches yet; I think
I will do so this weekend.

Hope this helps,

Willy
