Re: fadvise syscall?

Martin K. Petersen (mkp@mkp.net)
18 Mar 2002 14:42:47 -0500


>>>>> "Andrew" == Andrew Morton <akpm@zip.com.au> writes:

Andrew> google fails me - where does your kiobuf-based splitter live?

It's in the kiobuf XFS patches.

Andrew> I'm curious to know how this will all work. Will it take a
Andrew> large BIO and split it into a number of smaller, newly
Andrew> allocated BIOs?

For kiobufs I walked the request, cloned a new one every time I
crossed a stripe/device boundary and sent it off. I had my own
completion function with an atomic counter that would call the parent
kiobuf's end_io function when all clones had completed.

So I didn't chop the request into page-sized chunks or anything like
that.
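
To make the scheme concrete, here is a rough userspace sketch in C.
It is not the code from the XFS patches: struct parent_io, struct
clone_io, split_at_stripes() and clone_end_io() are made-up names, the
structs are stand-ins rather than the real kiobuf, and a single fixed
stripe size is assumed. It only shows the shape of the idea: one clone
per stripe crossed, and an atomic counter so that the last clone to
complete calls the parent's end_io.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel's struct kiobuf or struct bio. */
struct parent_io {
    atomic_int pending;                      /* clones still in flight */
    int        completed;                    /* set once end_io has run */
    void     (*end_io)(struct parent_io *);  /* parent's completion hook */
};

struct clone_io {
    struct parent_io  *parent;
    unsigned long long offset;               /* byte offset on the volume */
    unsigned long long length;               /* bytes covered by this clone */
};

/* Example parent completion: just record that it ran. */
void parent_done(struct parent_io *p)
{
    p->completed = 1;
}

/* Per-clone completion: the last clone to finish completes the parent. */
void clone_end_io(struct clone_io *c)
{
    /* atomic_fetch_sub returns the value before the decrement. */
    if (atomic_fetch_sub(&c->parent->pending, 1) == 1)
        c->parent->end_io(c->parent);
}

/*
 * Walk [offset, offset + length) and fill out[] with one clone per
 * stripe touched. Returns the number of clones, or 0 if out[] is too
 * small. The pending count is set before any clone would be submitted.
 */
size_t split_at_stripes(struct parent_io *p,
                        unsigned long long offset,
                        unsigned long long length,
                        unsigned long long stripe,
                        struct clone_io *out, size_t max)
{
    size_t n = 0;

    assert(stripe > 0);
    while (length > 0) {
        /* Bytes left before the next stripe boundary. */
        unsigned long long room = stripe - (offset % stripe);
        unsigned long long chunk = length < room ? length : room;

        if (n == max)
            return 0;
        out[n].parent = p;
        out[n].offset = offset;
        out[n].length = chunk;
        n++;
        offset += chunk;
        length -= chunk;
    }
    atomic_store(&p->pending, (int)n);
    return n;
}
```

With 64K stripes, a 136K request starting 60K into the volume comes
out as four clones of 4K, 64K, 64K and 4K, which is a long way from
one clone per page.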

Andrew> If that's really the only way in which we can solve this
Andrew> problem, would it not be better to pass information up to the
Andrew> higher layer, telling it when the BIO which is currently under
Andrew> assembly cannot be grown further? Say,
Andrew> blk_can_i_add_more_stuff_to_this_bio()?

We tried different approaches. One of them was to let the lower layer
signal to upper layers that an I/O was too big and ask them to submit
smaller chunks. Running with that, however, the I/O size converged
toward small requests, because you'd often start an I/O close to a
stripe boundary - say 4K short of it. And that would kill it right
off.

So unless the filesystem knows about stripe/device boundaries it's
really hard to get the size signalling right. And then what happens
when you stack LVM and MD?
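
A toy model of how the size signalling goes wrong, in C. This is only
one plausible reading of the behaviour, not what any of the patches
actually did: accepted_bytes() and learned_limit() are invented names,
and the upper layer's policy (never ask for more than the smallest
size that was ever accepted) is an assumption made for illustration.

```c
#include <assert.h>

/*
 * Hypothetical lower-layer answer: how many of the `want` bytes
 * starting at `offset` fit before the next stripe boundary.
 */
unsigned long long accepted_bytes(unsigned long long offset,
                                  unsigned long long want,
                                  unsigned long long stripe)
{
    unsigned long long room;

    assert(stripe > 0);
    room = stripe - (offset % stripe);
    return want < room ? want : room;
}

/*
 * Hypothetical upper layer: issues `total` bytes starting at `offset`
 * and shrinks its request size whenever the lower layer accepts less
 * than it asked for. Returns the size limit it ends up with.
 */
unsigned long long learned_limit(unsigned long long offset,
                                 unsigned long long total,
                                 unsigned long long initial_max,
                                 unsigned long long stripe)
{
    unsigned long long limit = initial_max;

    while (total > 0) {
        unsigned long long want = total < limit ? total : limit;
        unsigned long long got = accepted_bytes(offset, want, stripe);

        if (got < want)
            limit = got;    /* "too big, please submit smaller chunks" */
        offset += got;
        total -= got;
    }
    return limit;
}
```

In this model, 1M of I/O starting on a 64K stripe boundary settles at
64K requests, but the same I/O starting 60K into the stripe is cut to
4K on the first submission and stays at 4K for the rest of the run.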

In the end, cloning the kiobuf handed down from above and adjusting
offset/length in the children turned out to be the best approach.
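
One reason splitting by clone copes with stacking is that each layer
only has to know its own boundaries. A minimal sketch in C, under
loud assumptions: struct extent, split_extent() and split_stacked()
are hypothetical names, both layers use fixed-size boundaries, and the
upper layer maps offsets straight through, which real LVM and MD do
not.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a child I/O: just offset and length. */
struct extent {
    unsigned long long offset;
    unsigned long long length;
};

/*
 * Split `in` at multiples of `boundary`, appending the pieces to out[]
 * starting at index n. Returns the new count, or 0 if out[] is full.
 */
size_t split_extent(struct extent in, unsigned long long boundary,
                    struct extent *out, size_t n, size_t max)
{
    assert(boundary > 0);
    while (in.length > 0) {
        unsigned long long room = boundary - (in.offset % boundary);
        unsigned long long chunk = in.length < room ? in.length : room;

        if (n == max)
            return 0;
        out[n].offset = in.offset;
        out[n].length = chunk;
        n++;
        in.offset += chunk;
        in.length -= chunk;
    }
    return n;
}

/*
 * Two stacked layers: the upper one splits at its own boundary, then
 * the lower one splits each resulting piece at its boundary. Neither
 * layer needs to know the other's geometry. Assumes the upper layer
 * produces at most 16 pieces and does not remap offsets.
 */
size_t split_stacked(struct extent req,
                     unsigned long long upper_boundary,
                     unsigned long long lower_boundary,
                     struct extent *out, size_t max)
{
    struct extent mid[16];
    size_t m = split_extent(req, upper_boundary, mid, 0, 16);
    size_t n = 0;
    size_t i;

    for (i = 0; i < m; i++) {
        n = split_extent(mid[i], lower_boundary, out, n, max);
        if (n == 0)
            return 0;
    }
    return n;
}
```

For a 192K request at offset 0 with a 96K upper boundary and 64K lower
stripes, this gives four children of 64K, 32K, 32K and 64K. The upper
layer never had to be told about the 64K stripes.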

And I suspect that's why Jens kept the clone facility around for bio
bufs :)

Andrew> Anyway. I'm interested. O_DIRECT is a bit of a weird
Andrew> curiosity, but I'm working on making these big-BIO code paths
Andrew> *the* way in which data gets to and from disk. It needs to be
Andrew> efficient ;)

*nod*

I'll try and poke at this again tonight. Will shoot you the patch
once I get the zoning evil sorted out.

-- 
Martin K. Petersen, Principal Linux Consultant, Linuxcare, Inc.
mkp@linuxcare.com, http://www.linuxcare.com/
SGI XFS for Linux Developer, http://oss.sgi.com/projects/xfs/