2.4.10pre VM changes: Potential race condition on swap code

Marcelo Tosatti (marcelo@conectiva.com.br)
Tue, 11 Sep 2001 19:40:01 -0300 (BRT)


It seems there is a potential race caused by the swap changes. The reason is
that we no longer increase the swap map count of the entry on swapin
readahead. The comment above the swap_duplicate() call in
read_swap_cache_async() says:

	/*
	 * Make sure the swap entry is still in use.  It could have gone
	 * while caller waited for BKL, or while allocating page above,
	 * or while allocating page in prior call via swapin_readahead.
	 */
	if (!swap_duplicate(entry))	/* Account for the swap cache */
		goto out_free_page;

The BKL protects the logic against concurrent read_swap_cache_async()
calls, but it does not protect against get_swap_page() in try_to_swap_out().

I do not see what protects us against the following race (on older kernels,
increasing the swap map count in valid_swaphandles() used to be the
protection):

- swapin_readahead() finds a used entry in the swap map (valid_swaphandles()).
- The user of this entry deletes the swap map entry, so it becomes free. Then:

read_swap_cache_async()                 try_to_swap_out()
Second __find_get_page() fails
                                        get_swap_page() returns swap
                                        entry which CPU0 is trying to read
swap_duplicate() for the entry
succeeds: CPU1 just allocated it.

add_to_swap_cache()                     add_to_swap_cache()

Now we have two pages in the hash tables for the "same" data. From this
point on there is no guarantee _which_ data will be returned by a
pagecache lookup.
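
To make the interleaving above easier to follow, here is a rough userspace
sketch that replays it deterministically on a toy swap map. This is not
kernel code: every toy_* function, replay_race(), the array sizes and the
pin_on_readahead flag are made up for illustration and only mimic the
counting behaviour described in this mail. With pin_on_readahead set, the
readahead step takes a reference, which is how I understand the older
kernels to have behaved.

```c
#include <assert.h>
#include <string.h>

#define NR_ENTRIES 4	/* toy swap area size (assumption) */
#define NR_SLOTS   8	/* toy swap cache capacity (assumption) */

static unsigned short swap_map[NR_ENTRIES];	/* use count, 0 == free */
static int cache_entry[NR_SLOTS];		/* entry claimed by each cached page */
static int cache_len;

/* Take a reference on an entry; fail if it is free (mimics swap_duplicate). */
static int toy_swap_duplicate(int entry)
{
	if (swap_map[entry] == 0)
		return 0;
	swap_map[entry]++;
	return 1;
}

/* Drop one reference on an entry. */
static void toy_swap_free(int entry)
{
	assert(swap_map[entry] > 0);
	swap_map[entry]--;
}

/* Allocate the first free entry (mimics get_swap_page); -1 if none. */
static int toy_get_swap_page(void)
{
	for (int i = 0; i < NR_ENTRIES; i++) {
		if (swap_map[i] == 0) {
			swap_map[i] = 1;
			return i;
		}
	}
	return -1;
}

/* Insert a page for this entry, with no duplicate check, as in the race. */
static void toy_add_to_swap_cache(int entry)
{
	assert(cache_len < NR_SLOTS);
	cache_entry[cache_len++] = entry;
}

/* Count how many cached pages claim the given entry. */
static int toy_cached_pages(int entry)
{
	int n = 0;

	for (int i = 0; i < cache_len; i++)
		if (cache_entry[i] == entry)
			n++;
	return n;
}

/*
 * Replay the interleaving from the mail in a fixed order and return the
 * number of swap cache pages that end up claiming the entry.
 */
static int replay_race(int pin_on_readahead)
{
	int entry = 0, victim;

	memset(swap_map, 0, sizeof(swap_map));
	cache_len = 0;
	swap_map[entry] = 1;			/* one existing user */

	/* CPU0: readahead sees a used entry; older kernels pinned it here. */
	if (pin_on_readahead)
		toy_swap_duplicate(entry);

	toy_swap_free(entry);			/* the user drops the entry */

	/* CPU0: the second cache lookup finds nothing for the entry. */
	assert(toy_cached_pages(entry) == 0);

	victim = toy_get_swap_page();		/* CPU1: try_to_swap_out() */

	if (toy_swap_duplicate(entry))		/* CPU0: succeeds if in use */
		toy_add_to_swap_cache(entry);
	if (victim >= 0)
		toy_add_to_swap_cache(victim);	/* CPU1 */

	return toy_cached_pages(entry);
}
```

In this toy model, replay_race(0) ends with two cached pages claiming the
same entry, while replay_race(1) ends with one, because the pinned entry
never becomes free and get_swap_page() has to hand out a different one.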

Linus, Hugh ?
