[PATCH 0/2] ALSA: memalloc: Fix for Xen PV and non-IOMMU systems
Hi,
this is a patch series to address the recent regression on Xen PV (and possibly non-IOMMU) systems in the SG-buffer memory allocation. We switched to dma_alloc_noncontiguous() in the hope that it would handle everything correctly, but it turned out that this doesn't always work. So this is one step back: use the explicit SG-buffer allocation with dma_alloc_coherent() calls again, but in a slightly more optimized way, and apply it only to those systems.
Takashi
===
Takashi Iwai (2):
  ALSA: memalloc: Explicit SG-allocations for Xen PV and non-IOMMU systems
  ALSA: memalloc: Use coherent DMA allocation for fallback again
 sound/core/memalloc.c | 68 +++++++++++++++++++++++++++++++++----------
 1 file changed, 53 insertions(+), 15 deletions(-)
For non-IOMMU systems, it makes little sense (and may even be buggy) to use dma_alloc_noncontiguous() for an x86 SG-buffer allocation, as it always results in large contiguous pages instead of an SG buffer. For Xen PV, it doesn't work well either, as the pages allocated there aren't guaranteed to be coherent.
This patch is a first step toward addressing those problems. It changes the memalloc helper to perform the explicit SG-page allocation via the existing fallback path on such platforms, instead of calling dma_alloc_noncontiguous().
Fixes: a8d302a0b770 ("ALSA: memalloc: Revive x86-specific WC page allocations again")
Fixes: 9736a325137b ("ALSA: memalloc: Don't fall back for SG-buffer with IOMMU")
Reported-and-tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Link: https://lore.kernel.org/r/87tu256lqs.wl-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
---
 sound/core/memalloc.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/sound/core/memalloc.c b/sound/core/memalloc.c
index 81025f50a542..30c9ad192986 100644
--- a/sound/core/memalloc.c
+++ b/sound/core/memalloc.c
@@ -541,10 +541,9 @@ static void *snd_dma_noncontig_alloc(struct snd_dma_buffer *dmab, size_t size)
 	struct sg_table *sgt;
 	void *p;
 
-	sgt = dma_alloc_noncontiguous(dmab->dev.dev, size, dmab->dev.dir,
-				      DEFAULT_GFP, 0);
 #ifdef CONFIG_SND_DMA_SGBUF
-	if (!sgt && !get_dma_ops(dmab->dev.dev)) {
+	if (cpu_feature_enabled(X86_FEATURE_XENPV) ||
+	    !get_dma_ops(dmab->dev.dev)) {
 		if (dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG)
 			dmab->dev.type = SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK;
 		else
@@ -552,6 +551,8 @@ static void *snd_dma_noncontig_alloc(struct snd_dma_buffer *dmab, size_t size)
 		return snd_dma_sg_fallback_alloc(dmab, size);
 	}
 #endif
+	sgt = dma_alloc_noncontiguous(dmab->dev.dev, size, dmab->dev.dir,
+				      DEFAULT_GFP, 0);
 	if (!sgt)
 		return NULL;
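(For illustration only, not part of the patch: a minimal standalone sketch of the selection logic above, with a hypothetical helper name. On Xen PV the pages returned by dma_alloc_noncontiguous() aren't guaranteed to be coherent, and without DMA ops the allocation would come out as one big contiguous block anyway, so both cases go straight to the explicit SG fallback.)

/*
 * Hypothetical helper mirroring the check added to snd_dma_noncontig_alloc()
 * above; a sketch for illustration, not the actual kernel code.
 * cpu_feature_enabled() comes from <asm/cpufeature.h>, get_dma_ops()
 * from <linux/dma-map-ops.h>.
 */
static bool want_explicit_sg_fallback(struct device *dev)
{
#ifdef CONFIG_SND_DMA_SGBUF
	/* Xen PV: noncontiguous pages may not be DMA-coherent */
	if (cpu_feature_enabled(X86_FEATURE_XENPV))
		return true;
	/* no DMA ops -> direct mapping, so allocate explicit SG pages */
	if (!get_dma_ops(dev))
		return true;
#endif
	return false;
}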
We switched the memory allocation for the fallback cases of the noncontig type to the standard alloc_pages*() in commit a8d302a0b770 ("ALSA: memalloc: Revive x86-specific WC page allocations again"), while dma_alloc_coherent() had been used before. The reason was that the page addresses derived from the virtual pointer returned by dma_alloc_coherent() can't be used on IOMMU systems. Meanwhile, we explicitly disabled the fallback allocation for IOMMU systems in commit 9736a325137b ("ALSA: memalloc: Don't fall back for SG-buffer with IOMMU") after the commit above; that is, using dma_alloc_coherent() should be OK again.
Now we've received reports that the current fallback page allocation causes a regression on Xen (and maybe other) systems; the sound disappears partially or completely. So it's time to go back to dma_alloc_coherent().
This patch switches the fallback allocations back to dma_alloc_coherent(). Unlike the previous implementation, the allocation is done in a more optimized way that tries larger chunks first. Also, the proper DMA address returned from dma_alloc_coherent() is used now.
The page count of each chunk is stored in the lower bits of its first page's DMA address; the get_addr memalloc op masks those bits off and recomputes the proper address in turn.
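(Illustration only: a minimal sketch of this encoding with hypothetical helper names; the actual code is in the diff below. It relies on dma_alloc_coherent() returning page-aligned DMA addresses, so the low PAGE_SHIFT bits of the first entry are free to hold the chunk's page count.)

/* Sketch only; encode_chunk()/decode_addr() are illustrative names.
 * PAGE_MASK/PAGE_SHIFT come from <linux/mm.h>. */
static void encode_chunk(dma_addr_t *addrs, dma_addr_t addr, size_t npages)
{
	size_t i;

	/* first entry carries the chunk's page count in its low bits */
	addrs[0] = addr | npages;
	for (i = 1; i < npages; i++)
		addrs[i] = addr + (i << PAGE_SHIFT);
}

static dma_addr_t decode_addr(const dma_addr_t *addrs, size_t offset)
{
	/* mask off the stored count, re-apply the in-page offset */
	return (addrs[offset >> PAGE_SHIFT] & PAGE_MASK) | (offset & ~PAGE_MASK);
}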
Fixes: a8d302a0b770 ("ALSA: memalloc: Revive x86-specific WC page allocations again")
Fixes: 9736a325137b ("ALSA: memalloc: Don't fall back for SG-buffer with IOMMU")
Reported-and-tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Link: https://lore.kernel.org/r/87tu256lqs.wl-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
---
 sound/core/memalloc.c | 61 ++++++++++++++++++++++++++++++++++---------
 1 file changed, 49 insertions(+), 12 deletions(-)
diff --git a/sound/core/memalloc.c b/sound/core/memalloc.c
index 30c9ad192986..5fe21c0080dc 100644
--- a/sound/core/memalloc.c
+++ b/sound/core/memalloc.c
@@ -720,17 +720,31 @@ static const struct snd_malloc_ops snd_dma_sg_wc_ops = {
 struct snd_dma_sg_fallback {
 	size_t count;
 	struct page **pages;
+	/* DMA address array; the first page contains #pages in ~PAGE_MASK */
+	dma_addr_t *addrs;
 };
 
 static void __snd_dma_sg_fallback_free(struct snd_dma_buffer *dmab,
 				       struct snd_dma_sg_fallback *sgbuf)
 {
-	bool wc = dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK;
-	size_t i;
-
-	for (i = 0; i < sgbuf->count && sgbuf->pages[i]; i++)
-		do_free_pages(page_address(sgbuf->pages[i]), PAGE_SIZE, wc);
+	size_t i, size;
+
+	if (sgbuf->pages && sgbuf->addrs) {
+		i = 0;
+		while (i < sgbuf->count) {
+			if (!sgbuf->pages[i] || !sgbuf->addrs[i])
+				break;
+			size = sgbuf->addrs[i] & ~PAGE_MASK;
+			if (WARN_ON(!size))
+				break;
+			dma_free_coherent(dmab->dev.dev, size << PAGE_SHIFT,
+					  page_address(sgbuf->pages[i]),
+					  sgbuf->addrs[i] & PAGE_MASK);
+			i += size;
+		}
+	}
 	kvfree(sgbuf->pages);
+	kvfree(sgbuf->addrs);
 	kfree(sgbuf);
 }
 
@@ -739,9 +753,9 @@ static void *snd_dma_sg_fallback_alloc(struct snd_dma_buffer *dmab, size_t size)
 	struct snd_dma_sg_fallback *sgbuf;
 	struct page **pagep, *curp;
 	size_t chunk, npages;
+	dma_addr_t *addrp;
 	dma_addr_t addr;
 	void *p;
-	bool wc = dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK;
 
 	sgbuf = kzalloc(sizeof(*sgbuf), GFP_KERNEL);
 	if (!sgbuf)
@@ -749,14 +763,16 @@ static void *snd_dma_sg_fallback_alloc(struct snd_dma_buffer *dmab, size_t size)
 	size = PAGE_ALIGN(size);
 	sgbuf->count = size >> PAGE_SHIFT;
 	sgbuf->pages = kvcalloc(sgbuf->count, sizeof(*sgbuf->pages), GFP_KERNEL);
-	if (!sgbuf->pages)
+	sgbuf->addrs = kvcalloc(sgbuf->count, sizeof(*sgbuf->addrs), GFP_KERNEL);
+	if (!sgbuf->pages || !sgbuf->addrs)
 		goto error;
 
 	pagep = sgbuf->pages;
-	chunk = size;
+	addrp = sgbuf->addrs;
+	chunk = (PAGE_SIZE - 1) << PAGE_SHIFT; /* to fit in low bits in addrs */
 	while (size > 0) {
 		chunk = min(size, chunk);
-		p = do_alloc_pages(dmab->dev.dev, chunk, &addr, wc);
+		p = dma_alloc_coherent(dmab->dev.dev, chunk, &addr, DEFAULT_GFP);
 		if (!p) {
 			if (chunk <= PAGE_SIZE)
 				goto error;
@@ -768,17 +784,25 @@ static void *snd_dma_sg_fallback_alloc(struct snd_dma_buffer *dmab, size_t size)
 		size -= chunk;
 		/* fill pages */
 		npages = chunk >> PAGE_SHIFT;
+		*addrp = npages; /* store in lower bits */
 		curp = virt_to_page(p);
-		while (npages--)
+		while (npages--) {
 			*pagep++ = curp++;
+			*addrp++ |= addr;
+			addr += PAGE_SIZE;
+		}
 	}
 
 	p = vmap(sgbuf->pages, sgbuf->count, VM_MAP, PAGE_KERNEL);
 	if (!p)
 		goto error;
+
+	if (dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK)
+		set_pages_array_wc(sgbuf->pages, sgbuf->count);
+
 	dmab->private_data = sgbuf;
 	/* store the first page address for convenience */
-	dmab->addr = snd_sgbuf_get_addr(dmab, 0);
+	dmab->addr = sgbuf->addrs[0] & PAGE_MASK;
 	return p;
 
 error:
@@ -788,10 +812,23 @@ static void *snd_dma_sg_fallback_alloc(struct snd_dma_buffer *dmab, size_t size)
 
 static void snd_dma_sg_fallback_free(struct snd_dma_buffer *dmab)
 {
+	struct snd_dma_sg_fallback *sgbuf = dmab->private_data;
+
 	vunmap(dmab->area);
+	if (dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK)
+		set_pages_array_wb(sgbuf->pages, sgbuf->count);
 	__snd_dma_sg_fallback_free(dmab, dmab->private_data);
 }
 
+static dma_addr_t snd_dma_sg_fallback_get_addr(struct snd_dma_buffer *dmab,
+					       size_t offset)
+{
+	struct snd_dma_sg_fallback *sgbuf = dmab->private_data;
+	size_t index = offset >> PAGE_SHIFT;
+
+	return (sgbuf->addrs[index] & PAGE_MASK) | (offset & ~PAGE_MASK);
+}
+
 static int snd_dma_sg_fallback_mmap(struct snd_dma_buffer *dmab,
 				    struct vm_area_struct *area)
 {
@@ -806,8 +843,8 @@ static const struct snd_malloc_ops snd_dma_sg_fallback_ops = {
 	.alloc = snd_dma_sg_fallback_alloc,
 	.free = snd_dma_sg_fallback_free,
 	.mmap = snd_dma_sg_fallback_mmap,
+	.get_addr = snd_dma_sg_fallback_get_addr,
 	/* reuse vmalloc helpers */
-	.get_addr = snd_dma_vmalloc_get_addr,
 	.get_page = snd_dma_vmalloc_get_page,
 	.get_chunk_size = snd_dma_vmalloc_get_chunk_size,
 };
On Tue, 24 Jan 2023 10:27:42 +0100, Takashi Iwai wrote:
> Hi,
>
> this is a patch series to address the recent regression on Xen PV (and possibly non-IOMMU) systems in the SG-buffer memory allocation. We switched to dma_alloc_noncontiguous() in the hope that it would handle everything correctly, but it turned out that this doesn't always work. So this is one step back: use the explicit SG-buffer allocation with dma_alloc_coherent() calls again, but in a slightly more optimized way, and apply it only to those systems.
It seems that the second patch causes a problem; at least I see occasional Oopses on my system with the patch applied. Let's scrap it.
I'll resubmit the fix. Marek, could you try it later and report back whether it still works and doesn't break anything again?
thanks,
Takashi
On Wed, Jan 25, 2023 at 04:29:19PM +0100, Takashi Iwai wrote:
> On Tue, 24 Jan 2023 10:27:42 +0100, Takashi Iwai wrote:
> > Hi,
> >
> > this is a patch series to address the recent regression on Xen PV (and possibly non-IOMMU) systems in the SG-buffer memory allocation. We switched to dma_alloc_noncontiguous() in the hope that it would handle everything correctly, but it turned out that this doesn't always work. So this is one step back: use the explicit SG-buffer allocation with dma_alloc_coherent() calls again, but in a slightly more optimized way, and apply it only to those systems.
>
> It seems that the second patch causes a problem; at least I see occasional Oopses on my system with the patch applied. Let's scrap it.
>
> I'll resubmit the fix. Marek, could you try it later and report back whether it still works and doesn't break anything again?
Sure, just cc me.