Re: O_DIRECT read and holes in 2.5.26

Stephen Lord (lord@sgi.com)
26 Jul 2002 15:22:37 -0500


On Sun, 2002-07-21 at 21:26, Andrew Morton wrote:
> Stephen Lord wrote:
> >
> > Andrew,
> >
> > Did you realize that the new O_DIRECT code in 2.5 cannot read over holes
> > in a file?
>
> Well that was intentional, although I confess to not having
> put a lot of thought into the decision. The user wants
> O_DIRECT and we cannot do that. The CPU has to clear the
> memory by hand. Bad.

What it does mean is that things which used to work on Irix and
Linux no longer work. You can write an app on 2.4 and have
it fail on 2.5.
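
To make the failure concrete, here is a minimal user-space sketch of the kind
of read in question. It is an illustration under stated assumptions, not code
from any real application: the function name direct_read_over_hole, the 4096
byte alignment and the file path are all made up for the example, and the
alignment O_DIRECT actually needs depends on the filesystem and device. The
file is created with ftruncate() so that it is one large hole, then read back
through O_DIRECT.

```c
#define _GNU_SOURCE           /* exposes O_DIRECT in <fcntl.h> on Linux */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define ALIGN 4096            /* assumed alignment; real requirements vary */

/*
 * Create a sparse file at `path` that is `len` bytes long and consists
 * entirely of a hole, then read it back through O_DIRECT.
 *
 * Returns  1 if the read succeeded and every byte was zero,
 *          0 if the read succeeded but returned non-zero data,
 *         -1 on failure with errno set (EINVAL is the error this thread
 *            reports for 2.5.26, and also what open() gives on a
 *            filesystem without O_DIRECT support).
 */
int direct_read_over_hole(const char *path, size_t len)
{
    void *buf = NULL;
    int result = -1;
    int saved;

    assert(len > 0 && len % ALIGN == 0);   /* keep the size aligned */

    int fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0600);
    if (fd < 0)
        return -1;
    /* Extending with ftruncate() allocates no blocks: the file is a hole. */
    if (ftruncate(fd, (off_t)len) != 0) {
        saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    close(fd);

    fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0)
        return -1;

    if (posix_memalign(&buf, ALIGN, len) != 0) {
        close(fd);
        errno = ENOMEM;
        return -1;
    }
    memset(buf, 0xAA, len);   /* poison so leftover zeros cannot fool us */

    ssize_t n = read(fd, buf, len);
    if (n == (ssize_t)len) {
        result = 1;
        for (size_t i = 0; i < len; i++) {
            if (((unsigned char *)buf)[i] != 0) {
                result = 0;
                break;
            }
        }
    } else if (n >= 0) {
        errno = EIO;          /* short read: report it as a failure */
    }

    saved = errno;
    free(buf);
    close(fd);
    errno = saved;
    return result;
}
```

With the old behaviour a call such as direct_read_over_hole("/tmp/hole.bin",
8192) would be expected to return 1; with the behaviour described above it
would return -1 with errno set to EINVAL.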

>
> Obviously it's easy enough to put in the code to clear the
> memory out. Do you think that should be done?
>
> > The old code filled the user buffer with zeros, the new code
> > returns EINVAL if the get_block function returns an unmapped buffer.
> > With this exception, XFS does work with the new code - with more cpu
> > overhead than before due to the once per page get_block calls.
>
> OK, thanks. Presumably XFS has a fairly heavyweight get_block()?

No, not really that expensive, especially in the read and buffered
write path. I am objecting to the extra cpu cycles which we get to
spend in the kernel doing processing we do not need, as opposed to
spending those cycles in an application. It does not really show
up as a difference when you are sitting around waiting for the
I/O, but if you are doing processing in parallel with the I/O,
I prefer to put as many cycles in user space as possible. We have
customers who like to see 99.x% of their cpu time in user space.
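
For anyone who wants to check that split on their own workload, a rough
sketch follows. It uses getrusage(), which is a standard call; the helper
names tv_seconds and user_cpu_percent are invented for this example, and the
figure it reports covers only the calling process, so it is a starting point
rather than a proper profile.

```c
#include <stdio.h>
#include <sys/resource.h>
#include <sys/time.h>

/* Convert a struct timeval to seconds as a double. */
static double tv_seconds(struct timeval tv)
{
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/*
 * Percentage of this process's accounted CPU time spent in user space.
 * Returns -1.0 if getrusage() fails or no CPU time has been accounted yet.
 */
double user_cpu_percent(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1.0;

    double user  = tv_seconds(ru.ru_utime);   /* time in user mode   */
    double sys   = tv_seconds(ru.ru_stime);   /* time in kernel mode */
    double total = user + sys;

    if (total <= 0.0)
        return -1.0;
    return 100.0 * user / total;
}
```

Calling user_cpu_percent() at the end of an I/O-heavy run and printing the
result with printf("%.1f%% user\n", ...) gives a quick view of how much of
the CPU time went to the kernel instead of the application.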

>
> I'd be interested in seeing just how expensive that O_DIRECT
> I/O is, and whether we need to get down and implement
> many-block get_block() interface. Any numbers/profiles
> available?
>

I will try to generate some numbers once I emerge from under a
mountain of email - I cannot use the Linus approach to email
backlogs ;-)

Steve
