raw readv/writev

Janet Morgan (janetmor@us.ibm.com)
Wed, 16 Jan 2002 10:06:38 -0800 (PST)

I have two versions of a patch for improving the performance of readv/writev
for raw I/O, and I'm not sure which is the better approach.

The current implementation calls read/write once for each iovec (unless an
override routine is defined). As a result, an 8x32K readv, for example, takes
about twice as long as a single 256K read. Both versions of the patch nearly
eliminate this performance gap.
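
For illustration only, the per-iovec behaviour described above can be
approximated in user space. The sketch below is not the kernel code and not
part of either patch; readv_per_segment() is a hypothetical helper that issues
one read() per iovec, so an 8-segment request costs eight calls where a
vectored submission would need one.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper: emulate a readv that falls back to one read() per
 * iovec. Returns the total number of bytes read, or -1 if the first read
 * fails. *calls is set to the number of read() calls issued.
 */
ssize_t readv_per_segment(int fd, const struct iovec *iov, int iovcnt,
                          int *calls)
{
    ssize_t total = 0;

    assert(iovcnt >= 0);
    *calls = 0;

    for (int i = 0; i < iovcnt; i++) {
        ssize_t n = read(fd, iov[i].iov_base, iov[i].iov_len);
        (*calls)++;

        if (n < 0)
            return total > 0 ? total : -1;  /* report partial progress */

        total += n;
        if ((size_t)n < iov[i].iov_len)
            break;                          /* short read or EOF: stop */
    }
    return total;
}
```

With eight 32K segments this issues eight separate read() calls, whereas
readv(fd, iov, 8) on the same descriptor hands all eight segments to the
kernel in a single system call.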

The first version coalesces the iovecs (up to KIO_MAX_SECTORS bytes of data)
into a single kiobuf (pre-allocated at file open) and issues a single call to
brw_kiovec to submit the I/O. The second version of the patch also groups the
iovecs into a single call to brw_kiovec, but uses one kiobuf per iovec.
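
The grouping step common to both versions can be sketched in user space as
follows. This is only a rough model under stated assumptions: BATCH_MAX_BYTES,
next_batch() and count_submissions() are hypothetical names, the cap value is
a placeholder rather than the real kernel limit, and the sketch never splits
an iovec across two submissions, which the actual patches may handle
differently.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical per-submission cap; a placeholder, not the kernel's value. */
#define BATCH_MAX_BYTES (512u * 1024u)

/*
 * Count how many whole iovecs, starting at iov[start], fit into one
 * submission of at most max_bytes. The byte total is stored in *batch_bytes.
 */
int next_batch(const struct iovec *iov, int iovcnt, int start,
               size_t max_bytes, size_t *batch_bytes)
{
    size_t bytes = 0;
    int n = 0;

    assert(start >= 0 && start <= iovcnt);

    for (int i = start; i < iovcnt; i++) {
        if (iov[i].iov_len > max_bytes - bytes)
            break;                  /* this segment would exceed the cap */
        bytes += iov[i].iov_len;
        n++;
    }
    *batch_bytes = bytes;
    return n;
}

/*
 * Number of submissions needed for the whole vector, or -1 if a single
 * iovec is larger than the cap (this sketch does not split segments).
 */
int count_submissions(const struct iovec *iov, int iovcnt, size_t max_bytes)
{
    int start = 0;
    int submissions = 0;

    while (start < iovcnt) {
        size_t bytes;
        int n = next_batch(iov, iovcnt, start, max_bytes, &bytes);

        if (n == 0)
            return -1;
        start += n;
        submissions++;
    }
    return submissions;
}
```

Under these assumptions, eight 32K iovecs fit into a single submission, which
is the case the 8x32K readv example describes.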

Mapping discontiguous virtual memory into a single kiobuf, as the first
version does, is unconventional, but it minimizes the number of pre-allocated
buffer heads (1024 per kiobuf). It also avoids some of the logistics involved
in using one kiobuf for each iovec (e.g., when the kiobufs should be allocated
and freed, and whether there should be a system-wide limit on the number of
kiobufs in use for this purpose).

Thanks,
-Janet