This is much the same question I have. One problem you will have here is that vmalloc() returns
pages that are only _virtually_ contiguous, not physically, so you would actually have to use
pci_map_sg() instead of pci_map_single(). The other problem is that vmalloc() itself is not
restricted to allocating pages that are guaranteed to be directly DMA capable for the PCI device, so
the pci_map_xxx functions will have to allocate bounce buffers for the data if the physical address
is not within the device's limits. I'm proposing a new interface, something along the lines of:
First set the DMA boundaries for our device:
pci_set_dma_mask(pcidev, 0xffffffff);
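To make the role of the mask concrete, here is a minimal userspace sketch of the check that decides
whether a buffer is directly reachable or needs a bounce buffer. range_fits_dma_mask() is a
hypothetical helper for illustration, not a kernel function, and it assumes the mask has the usual
"all low bits set" form (such as 0xffffffff).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): returns true when every byte
 * of the physical range [phys, phys + len) is addressable under dma_mask.
 * Assumes dma_mask has the form 2^n - 1, e.g. 0xffffffff.
 */
static bool range_fits_dma_mask(uint64_t phys, size_t len, uint64_t dma_mask)
{
    assert(len > 0);

    uint64_t last = phys + (uint64_t)len - 1;  /* address of the last byte */

    if (last < phys)        /* the range wraps around the address space */
        return false;

    return last <= dma_mask;
}
```

A page for which this returns false is one the pci_map_xxx functions would have to bounce.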
Then use this API to allocate DMA capable memory (this API also does the mapping to PCI, so the
pci_map_xxx calls are not needed when using it):
dmamem = pci_dmamem_alloc(pcidev, nbytes, &vaddr, CONSISTENT);
to get consistent memory (a memory region where caching and so on would be turned off on certain
platforms). This memory is of course physically contiguous (this is the equivalent of the existing
pci_alloc_consistent() function).
dmamem = pci_dmamem_alloc(pcidev, nbytes, &vaddr, BIDIRECTIONAL);
to get a streaming memory region which should be accessible from kernel space, but need not be
physically contiguous (i.e. a scatter-gather table is used for all the physical pages when mapping
it to PCI). vmalloc() could be used to get the pages here.
dmamem = pci_dmamem_alloc(pcidev, nbytes, &vaddr, BIDIRECTIONAL | CONTIGUOUS);
to get a streaming memory region which should be accessible from kernel space and also physically
contiguous (i.e. using get_free_pages() or kmalloc() to get the pages).
dmamem = pci_dmamem_alloc(pcidev, nbytes, NULL, BIDIRECTIONAL);
to get a streaming memory region which is not accessed by the kernel at all (e.g. a frame grabber
buffer or an SCI shared memory segment only used in user space).
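The four cases above can be summarised in a small userspace sketch of how such an allocator might
pick its backing store. Everything here is hypothetical: the DMAMEM_* flag values, the
dmamem_backing enum and dmamem_pick_backing() are made up for illustration, since I have only named
the flags and not defined their encoding. The treatment of CONTIGUOUS together with a NULL vaddr is
an assumption as well.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag encoding; only the flag names come from the proposal. */
enum dmamem_flags {
    DMAMEM_CONSISTENT    = 1 << 0,
    DMAMEM_BIDIRECTIONAL = 1 << 1,
    DMAMEM_CONTIGUOUS    = 1 << 2
};

/* Hypothetical description of where the pages would come from. */
enum dmamem_backing {
    BACKING_CONSISTENT,    /* like pci_alloc_consistent(): contiguous, uncached where needed */
    BACKING_PAGES_CONTIG,  /* like get_free_pages()/kmalloc(): one physical run */
    BACKING_PAGES_SG,      /* like vmalloc(): kernel mapped, scatter-gather to PCI */
    BACKING_UNMAPPED_SG    /* no kernel mapping at all, scatter-gather to PCI */
};

/*
 * want_vaddr mirrors whether the caller passed &vaddr (true) or NULL (false).
 * CONSISTENT takes priority over the streaming variants.
 */
static enum dmamem_backing dmamem_pick_backing(unsigned flags, bool want_vaddr)
{
    assert(flags & (DMAMEM_CONSISTENT | DMAMEM_BIDIRECTIONAL));

    if (flags & DMAMEM_CONSISTENT)
        return BACKING_CONSISTENT;
    if (flags & DMAMEM_CONTIGUOUS)
        return BACKING_PAGES_CONTIG;  /* assumption: same choice with or without vaddr */
    return want_vaddr ? BACKING_PAGES_SG : BACKING_UNMAPPED_SG;
}
```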
Feeding the I/O addresses and lengths to the actual PCI adapter should be done in much the same way
as before:
nents = pci_dmamem_nents(dmamem);
for (i = 0; i < nents; i++) {
   hw_address[i] = pci_dmamem_address(dmamem, i);
   hw_len[i] = pci_dmamem_len(dmamem, i);
}
On contiguous and consistent memory regions, nents should be one and therefore no looping should be
necessary:
hw_address = pci_dmamem_address(dmamem, 0);
hw_len = pci_dmamem_len(dmamem, 0);
hw_len here should of course correspond to the nbytes argument given to the pci_dmamem_alloc()
function.
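To illustrate the nents/address/len model, here is a minimal userspace sketch of a segment table
built from the bus address of each backing page. All names (struct dmamem_sketch, the
dmamem_sketch_* functions, SKETCH_PAGE_SIZE, SKETCH_MAX_SEGS) are hypothetical, the 4096 byte page
size is an assumption, and merging adjacent pages into one entry is one possible design choice, not
something the interface above requires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u   /* assumed page size */
#define SKETCH_MAX_SEGS  16      /* arbitrary table capacity for the sketch */

/* Hypothetical stand-in for the opaque dmamem handle. */
struct dmamem_sketch {
    size_t   nents;
    uint64_t addr[SKETCH_MAX_SEGS];
    size_t   len[SKETCH_MAX_SEGS];
};

/*
 * Build the table for a page-aligned buffer of nbytes backed by npages pages.
 * Pages that are adjacent on the bus are merged into a single entry.
 * Returns 0 on success, -1 if the table is too small.
 */
static int dmamem_sketch_build(struct dmamem_sketch *m, const uint64_t *page_addr,
                               size_t npages, size_t nbytes)
{
    size_t left = nbytes;

    assert(nbytes > 0 && nbytes <= npages * SKETCH_PAGE_SIZE);
    m->nents = 0;

    for (size_t i = 0; i < npages && left > 0; i++) {
        size_t chunk = left < SKETCH_PAGE_SIZE ? left : SKETCH_PAGE_SIZE;

        if (m->nents > 0 &&
            m->addr[m->nents - 1] + m->len[m->nents - 1] == page_addr[i]) {
            m->len[m->nents - 1] += chunk;      /* extends the previous entry */
        } else {
            if (m->nents == SKETCH_MAX_SEGS)
                return -1;
            m->addr[m->nents] = page_addr[i];   /* starts a new entry */
            m->len[m->nents] = chunk;
            m->nents++;
        }
        left -= chunk;
    }
    return 0;
}

static size_t dmamem_sketch_nents(const struct dmamem_sketch *m)
{
    return m->nents;
}

static uint64_t dmamem_sketch_address(const struct dmamem_sketch *m, size_t i)
{
    assert(i < m->nents);
    return m->addr[i];
}

static size_t dmamem_sketch_len(const struct dmamem_sketch *m, size_t i)
{
    assert(i < m->nents);
    return m->len[i];
}
```

With a physically contiguous set of pages this collapses to a single entry whose length equals
nbytes, which is the nents == 1 case described above.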
So, what do you think? Is this something we should think of for 2.5, or am I on the wrong side of
the road here?
Regards,
--
  Steffen Persvold   |  Scalable Linux Systems  |   Try out the world's best
 mailto:sp@scali.com |  http://www.scali.com    | performing MPI implementation:
Tel: (+47) 2262 8950 |   Olaf Helsets vei 6     |      - ScaMPI 1.13.8 -
Fax: (+47) 2262 8951 |   N0621 Oslo, NORWAY     | >320MBytes/s and <4uS latency