Re: [patch 6/12] hold atomic kmaps across generic_file_read

Andrew Morton (akpm@zip.com.au)
Sat, 10 Aug 2002 02:08:59 -0700


Linus Torvalds wrote:
>
> On Fri, 9 Aug 2002, Andrew Morton wrote:
> >
> > What would be nice is a way of formalising the prefault, to pin
> > the mm's pages across the copy_*_user() in some manner, perhaps?
>
> Too easy to create a DoS-type attack with any trivial implementation.

hmm, yes. The pin has to be held across ->prepare_write. That
tears it.

> However, I don't think pinning is worthwhile, since even if the page goes
> away, the prefaulting was just a performance optimization. The code should
> work fine without it. In fact, it would probably be good to _not_ prefault
> for a development kernel, and verify that the code works without it. That
> way we can sleep safe in the knowledge that there isn't some race through
> code that requires the prefaulting..

OK. That covers reads. But we need to do something short-term to get
these large performance benefits, and I don't know how to properly fix
the write deadlock. The choices here are:

- live with the current __get_user thing

- make filemap_nopage aware of the problem, via a new `struct page *'
in task_struct (this would be very messy on the reader side).

- or?

(Of course, the write deadlock is a different and longstanding
problem, and I don't _have_ to fix it here, weasel, weasel)

> I agree that if you could guarantee pinning the out-of-line code would be
> a bit simpler, but since we have to handle the EFAULT case anyway, I doubt
> that it is _that_ much simpler.
>
> Also, there are actually advantages to doing it the "hard" way. If we ever
> want to, we can actually play clever tricks that avoid doing the copy at
> all with the slow path.
>
> Example tricks: we can, if we want to, do a read() with no copy for a
> common case by adding a COW-bit to the page cache, and if you do aligned
> reads into a page that will fault on write, you can just map in the page
> cache page directly, mark it COW in the page cache (assuming the page
> count tells us we're the only user, of course), and mark it COW in the
> mapping.

glibc malloc currently returns well-aligned-address + 8. If
it were taught to return well-aligned-address+0 then presumably a
lot of applications would automatically benefit from these
zero-copy reads.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/