On Tue, 09 Jun 2020 07:43:06 +0200, Christoph Hellwig wrote:
On Mon, Jun 08, 2020 at 07:31:47PM -0700, David Rientjes wrote:
On Mon, 8 Jun 2020, Alex Xu (Hello71) wrote:
Excerpts from Christoph Hellwig's message of June 8, 2020 2:19 am:
Can you do a listing using gdb where this happens?
gdb vmlinux
l *(snd_pcm_hw_params+0x3f3)
?
(gdb) l *(snd_pcm_hw_params+0x3f3)
0xffffffff817efc85 is in snd_pcm_hw_params (.../linux/sound/core/pcm_native.c:749).
744             while (runtime->boundary * 2 <= LONG_MAX - runtime->buffer_size)
745                     runtime->boundary *= 2;
746
747             /* clear the buffer for avoiding possible kernel info leaks */
748             if (runtime->dma_area && !substream->ops->copy_user)
749                     memset(runtime->dma_area, 0, runtime->dma_bytes);
750
751             snd_pcm_timer_resolution_change(substream);
752             snd_pcm_set_state(substream, SNDRV_PCM_STATE_SETUP);
753
Working theory is that CONFIG_DMA_NONCOHERENT_MMAP getting set is causing the error_code in the page fault path. Debugging with Alex off-thread we found that dma_{alloc,free}_from_pool() are not getting called from the new code in dma_direct_{alloc,free}_pages() and he has not enabled mem_encrypt.
While DMA_COHERENT_POOL absolutely should not select DMA_NONCOHERENT_MMAP (and you should send your patch either way), I don't think it is going to make a difference here, as DMA_NONCOHERENT_MMAP just means we allow mmaps even for non-coherent devices, and we do not support non-coherent devices on x86.
From the disassembly it seems like a vmalloc allocation is NULL, which
seems really weird as this patch shouldn't make a difference for them, and I also only see a single place that allocates the field, and that checks for an allocation failure. But the sound code is a little hard to unwind sometimes.
It's not clear which sound device is affected, but if it's HD-audio on x86, runtime->dma_area points to a vmapped buffer from SG-pages allocated by dma_alloc_coherent().
OTOH, if it's USB-audio, runtime->dma_area is a buffer allocated by vmalloc().
Takashi
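
For readers following the listing, the guard at lines 748-749 can be modelled in user space. This is only a sketch under stated assumptions: struct model_runtime, struct model_ops and model_clear_dma_area() are hypothetical stand-ins, not kernel definitions, and only the fields that appear in the quoted lines are modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space stand-ins for the kernel structures; only the
 * fields used by the quoted lines 748-749 are modelled. */
struct model_ops {
    /* NULL when the driver provides no copy_user op (assumed signature) */
    int (*copy_user)(void *dst, const void *src, size_t len);
};

struct model_runtime {
    unsigned char *dma_area; /* in the kernel: vmapped SG pages or a vmalloc() buffer */
    size_t dma_bytes;        /* size of the buffer in bytes */
};

/* Mirrors the guard in the listing: clear the buffer only when it exists
 * and the driver has no copy_user op.
 * Returns 1 if the buffer was cleared, 0 if the clear was skipped. */
static int model_clear_dma_area(struct model_runtime *rt,
                                const struct model_ops *ops)
{
    assert(rt != NULL && ops != NULL);

    if (rt->dma_area && !ops->copy_user) {
        memset(rt->dma_area, 0, rt->dma_bytes);
        return 1;
    }
    return 0;
}
```

Note that the condition only skips a NULL dma_area. It does not check whether a non-NULL pointer is actually mapped, so any access to a bad mapping would happen inside the memset() itself.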