
Hi Christoph,
On Wed, Aug 19, 2020 at 8:57 AM Christoph Hellwig hch@lst.de wrote:
Add a new API to allocate and free pages that are guaranteed to be addressable by a device, but otherwise behave like pages allocated by alloc_pages. The intended APIs to sync them for use with the device and cpu are dma_sync_single_for_{device,cpu} that are also used for streaming mappings.
Switch all drivers over to this new API, but keep the usage of the crufty dma_cache_sync API for now, which will be cleaned up on a driver by driver basis.
Signed-off-by: Christoph Hellwig hch@lst.de
Documentation/core-api/dma-api.rst | 68 +++++++++++------------ Documentation/core-api/dma-attributes.rst | 8 --- arch/alpha/kernel/pci_iommu.c | 2 + arch/arm/mm/dma-mapping-nommu.c | 2 + arch/arm/mm/dma-mapping.c | 4 ++ arch/ia64/hp/common/sba_iommu.c | 2 + arch/mips/jazz/jazzdma.c | 7 +-- arch/powerpc/kernel/dma-iommu.c | 2 + arch/powerpc/platforms/ps3/system-bus.c | 4 ++ arch/powerpc/platforms/pseries/vio.c | 2 + arch/s390/pci/pci_dma.c | 2 + arch/x86/kernel/amd_gart_64.c | 2 + drivers/iommu/dma-iommu.c | 2 + drivers/iommu/intel/iommu.c | 4 ++ drivers/net/ethernet/i825xx/lasi_82596.c | 13 ++--- drivers/net/ethernet/seeq/sgiseeq.c | 12 ++-- drivers/parisc/ccio-dma.c | 2 + drivers/parisc/sba_iommu.c | 2 + drivers/scsi/53c700.c | 8 +-- drivers/scsi/sgiwd93.c | 12 ++-- drivers/xen/swiotlb-xen.c | 2 + include/linux/dma-direct.h | 5 ++ include/linux/dma-mapping.h | 29 ++++++++-- include/linux/dma-noncoherent.h | 3 - kernel/dma/direct.c | 51 ++++++++++++++++- kernel/dma/mapping.c | 43 +++++++++++++- kernel/dma/ops_helpers.c | 35 ++++++++++++ kernel/dma/virt.c | 2 + sound/mips/hal2.c | 20 +++---- 29 files changed, 254 insertions(+), 96 deletions(-)
Thanks for the patch. The general design looks quite nice, but please see my comments inline.
diff --git a/Documentation/core-api/dma-api.rst b/Documentation/core-api/dma-api.rst index 90239348b30f6f..047fcfffa0e5cf 100644 --- a/Documentation/core-api/dma-api.rst +++ b/Documentation/core-api/dma-api.rst @@ -516,48 +516,53 @@ routines, e.g.::: }
-Part II - Advanced dma usage
+Part II - Non-coherent DMA allocations +--------------------------------------
-Warning: These pieces of the DMA API should not be used in the -majority of cases, since they cater for unlikely corner cases that -don't belong in usual drivers. +These APIs allow to allocate pages that can be used like normal pages +in the kernel direct mapping, but are guaranteed to be DMA addressable.
Could we elaborate a bit more on what "like normal pages in kernel direct mapping" mean from the driver perspective?
If you don't understand how cache line coherency works between a processor and an I/O device, you should not be using this part of the -API at all. +API.
::
void *
dma_alloc_attrs(struct device *dev, size_t size, dma_addr_t *dma_handle,
gfp_t flag, unsigned long attrs)
dma_alloc_pages(struct device *dev, size_t size, dma_addr_t *dma_handle,
enum dma_data_direction dir, gfp_t gfp)
+This routine allocates a region of <size> bytes of consistent memory. It +returns a pointer to the allocated region (in the processor's virtual address +space) or NULL if the allocation failed. The returned memory is guanteed to +behave like memory allocated using alloc_pages.
There is one aspect that the existing dma_alloc_attrs() handles, but this new function doesn't: IOMMU support. The function will always allocate a physically-contiguous block memory, which is a costly operation and not even guaranteed to succeed, even if enough free memory is available.
Modern SoCs employ IOMMUs to avoid the need to allocate physically-contiguous memory and those happen to be also the devices that could benefit from non-coherent allocations a lot. One of the tasks of the DMA API was making it possible to allocate suitable memory for a given device, without having the driver know about the SoC integration details, such as the presence of an IOMMU.
Today, dma_alloc_attrs() uses the .alloc callback of the dma_ops struct and the IOMMU-aware implementations, like the dma-iommu helpers [1], would allocate discontiguous pages. Therefore, while I see the DMA-aware page allocation functionality as a useful functionality on its own for scatter-gather-capable hardware, I believe it is not a complete replacement for dma_alloc_attrs() with the DMA_ATTR_NON_CONSISTENT attribute.
[1] https://elixir.bootlin.com/linux/v5.9-rc1/source/drivers/iommu/dma-iommu.c#L...
Best regards, Tomasz