Re: problem with blk_queue_bounce_limit()

David Mosberger (davidm@napali.hpl.hp.com)
Fri, 6 Jun 2003 13:13:38 -0700


>>>>> On Fri, 06 Jun 2003 00:32:30 -0700 (PDT), "David S. Miller" <davem@redhat.com> said:

David> From: David Mosberger <davidm@napali.hpl.hp.com> Date:
David> Fri, 6 Jun 2003 00:19:08 -0700

David> PCI_DMA_BUS_IS_PHYS (and it's description) is quite
David> misleading: it claims that it has something to do with there
David> being an equivalence between PCI bus and physical addresses.
David> That's actually the case for (small) ia64 platforms so that's
David> why we ended up setting it to 1.

David> It does have to do with such an equivalence. If your port
David> couldn't work if drivers use the deprecated
David> virt_to_bus/bus_to_virt, you must set PCI_DMA_BUS_IS_PHYS to
David> zero.

Yes, but the comment certainly is confusing. How about something like
this:

/*
* PCI_DMA_BUS_IS_PHYS should be set to 1 if there is necessarily a
* direct corespondence between device bus addresses and CPU physical
* addresses. Platforms with a hardware I/O MMU _must_ turn this off
* to suppress the bounce buffer handling code in the block and
* network device layers. Platforms with separate bus address spaces
* _must_ turn this off and provide a device DMA mapping
* implementation that takes care of the necessary address
* translation.
*/
#define PCI_DMA_BUS_IS_PHYS pci_dma_bus_is_phys

David> The whole block layer makes all kinds of assumptions about
David> what physically contiguous addresses mean about how they'll
David> be contiguous in the bus addresses the device will actually
David> use to perform the DMA transfer.

This sounds all very dramatic, but try as I might, all I find is three
places where PCI_DMA_BUS_IS_PHYS is used:

- ide-lib.c: used to disable bounce buffering
- scsi_lib.c: used to disable bounce buffering
- tg3.c: what the heck??

The tg3 code looks ugly in the extreme (sorry). If I understand it
right, it's trying to work around a bug which shows up when a packet
covers a certain address? With an I/O MMU, you then remap the
offending buffer again before freeing the old mapping which will
ensure a different address of the packet, whereas in the non-I/O MMU
case, you copy the entire socket buffer (since just remapping it won't
change the address and since there is no interface to copy just a
portion of a socket buffer) and then do the remapping.

Did I get this right (or at least close enough)?

It seems really bad to me to rely on implementation-specifics of the
DMA API. I suspect the code would break on a platform which has
separate bus address spaces but no (hardware) I/O TLB? (Yeah,
probably not a supported scenarious, but with a proper fix, this
wouldn't be a problem.)

Does this bug happen often enough that it's performance critical?
Otherwise, you could just always use the copy-the-entire-buffer
workaround. If its performance-critical, would it make sense
to extend the socket buffer API to allow copying a portion of
a buffer?

I really dislike PCI_DMA_BUS_IS_PHYS, because it introduces a
discontinuity. I don't think it should be necessary.

If it wasn't for tg3.c, couldn't PCI_DMA_BUS_IS_PHYS be gotten rid of
much more cleanly with a dma_max_phys_addr(dev) function, which would
return the maximum physical address that device DEV can address
(either directly, or via an I/O TLB)?

David> We could convert the few compile time checks of
David> PCI_DMA_BUS_IS_PHYS so that you can set this based upon the
David> configuration of the machine if for some configurations it is
David> true. drivers/net/tg3.c is the only offender, my bad :-)

Yes. Would you mind fixing that?

--david
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/