regression with SG DMA buf allocations and IOMMU in low-mem

Kai Vehmanen kai.vehmanen at linux.intel.com
Thu Nov 10 13:52:08 CET 2022


Hi,

On Fri, 4 Nov 2022, Kai Vehmanen wrote:

> On Fri, 4 Nov 2022, Takashi Iwai wrote:
> 
> > On Fri, 04 Nov 2022 16:42:29 +0100, Kai Vehmanen wrote:
> > > I think an explicit error would be best. The problem now is that the 
> > > driver will think the allocation (and mapping to device) is fine and 
> > > proceeds to program the hardware to use the address. This will then create 
> > > > an IOMMU fault down the line that is not so straightforward to recover from 
> > > (worst case is that a full device level reset needs to be done). And given 
> > > driver doesn't know it got a faulty mapping, it's hard to make the 
> > > decision why the fault happened.
> > 
> > OK, then what I posted in another mail (it went to nirvana) might
> > work.  Attached below again.
> 
> thanks! Let me put this through testing and get back to you next 
> week. We'll also debug a bit more what exactly goes on that leads to 
> dma_alloc_noncontiguous failing.

Takashi, the patch seems good. We've included it in multiple test runs and
it seems to be working as intended, so:

Reviewed-by: Kai Vehmanen <kai.vehmanen at linux.intel.com>

... for the patch. Root-causing why dma_alloc_noncontiguous() fails is 
still ongoing (the reproduction rate is very, very low). I've been looking 
at snd_dma_sg_fallback_alloc() and comparing it to the IOMMU DMA allocator 
code, and it's curious how we can have a case where the former succeeds but 
the latter does not. For example, we now pass __GFP_RETRY_MAYFAIL to 
dma_alloc_noncontiguous(), while snd_dma_sg_fallback_alloc() does its 
alloc_pages_exact() call with __GFP_NORETRY. Of course, 
dma_alloc_noncontiguous() can also fail when doing the mapping, and there's 
more code involved. But alas, this is a separate issue for us to track down.
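
To illustrate the behaviour discussed in this thread, here is a rough 
user-space sketch. It is not the patch and not kernel code. Every name in it 
(sim_state, sim_sg_alloc(), and so on) is hypothetical. It assumes that the 
primary path needs both a page allocation and an IOMMU mapping step, that 
the fallback path only needs pages, and that a fallback buffer is not usable 
by the device when an IOMMU is active, so an explicit error is returned 
instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of system state; none of this is a kernel API. */
struct sim_state {
	bool pages_available; /* page allocator can satisfy the request */
	bool iommu_map_ok;    /* IOMMU mapping step would succeed */
	bool iommu_active;    /* device sits behind an IOMMU */
};

enum sim_result {
	SIM_ERROR,        /* explicit failure reported to the driver */
	SIM_OK_NONCONTIG, /* primary path succeeded (pages + mapping) */
	SIM_OK_FALLBACK   /* fallback path succeeded (pages only) */
};

/* Models the primary path: two steps, either of which can fail. */
static bool sim_alloc_noncontiguous(const struct sim_state *s)
{
	if (!s->pages_available)
		return false;
	return s->iommu_map_ok;
}

/* Models the fallback path: page allocation only, no mapping step. */
static bool sim_fallback_alloc(const struct sim_state *s)
{
	return s->pages_available;
}

/*
 * Assumption: with an IOMMU active, a fallback buffer has no valid
 * device mapping, so fail explicitly rather than hand the driver an
 * address that would later trigger an IOMMU fault.
 */
static enum sim_result sim_sg_alloc(const struct sim_state *s)
{
	assert(s != NULL);

	if (sim_alloc_noncontiguous(s))
		return SIM_OK_NONCONTIG;
	if (s->iommu_active)
		return SIM_ERROR;
	return sim_fallback_alloc(s) ? SIM_OK_FALLBACK : SIM_ERROR;
}
```

In this model, the state { pages_available = true, iommu_map_ok = false } is 
the "former succeeds but latter not" case: the fallback can get pages while 
the primary path fails at the mapping step. With iommu_active set, 
sim_sg_alloc() then reports SIM_ERROR so the caller can react immediately.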

Br, Kai

