Hi,
On Fri, 4 Nov 2022, Kai Vehmanen wrote:
> On Fri, 4 Nov 2022, Takashi Iwai wrote:
> > On Fri, 04 Nov 2022 16:42:29 +0100, Kai Vehmanen wrote:
> > > I think an explicit error would be best. The problem now is that the driver will think the allocation (and mapping to device) is fine and proceeds to program the hardware to use the address. This will then create an IOMMU fault down the line that is not so straightforward to recover from (worst case is that a full device-level reset needs to be done). And given that the driver doesn't know it got a faulty mapping, it's hard to determine why the fault happened.
> > OK, then what I posted in another mail (it went to nirvana) might work. Attached below again.
> Thanks! Let me put this through testing and get back to you next week. We'll also debug a bit more what exactly goes on that leads to dma_alloc_noncontiguous failing.
Takashi, the patch seems good. We've included it in multiple test runs and it seems to be working as intended, so:
Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
... for the patch. Root-causing why dma_alloc_noncontiguous() fails is still ongoing (the reproduction rate is very, very low). I've been looking at snd_dma_sg_fallback_alloc() and comparing it to the IOMMU DMA allocator code, and it's curious how we can have a case where the former succeeds but the latter does not. E.g. we now pass __GFP_RETRY_MAYFAIL to dma_alloc_noncontiguous(), but snd_dma_sg_fallback_alloc() does the alloc_pages_exact() call with __GFP_NORETRY. Of course, it can also fail when doing the mapping, and there's more code involved there. But alas, this is a separate issue for us to track down.
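For reference, the "explicit error" behaviour discussed in this thread can be sketched roughly as below. This is only a userspace mock to illustrate the idea, not the actual patch or kernel code: every name in it (mock_dma_buf, mock_hw, mock_alloc_and_map(), mock_setup_stream()) is hypothetical, and the mapping_fails flag simply stands in for a failed device mapping.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a DMA buffer: CPU pointer plus device address. */
struct mock_dma_buf {
	void *area;
	uint64_t dev_addr;
};

/* Hypothetical stand-in for the hardware's buffer address register. */
struct mock_hw {
	uint64_t programmed_addr;
	bool programmed;
};

/*
 * Mock allocator: allocate memory, then "map" it for the device.
 * If the mapping step fails, release the memory and report the error
 * explicitly instead of returning a buffer with a bogus device address.
 */
static int mock_alloc_and_map(struct mock_dma_buf *buf, size_t size,
			      bool mapping_fails)
{
	buf->area = malloc(size);
	if (!buf->area)
		return -ENOMEM;

	if (mapping_fails) {
		free(buf->area);
		buf->area = NULL;
		return -ENOMEM;
	}

	/* Pretend the device sees the buffer at its CPU address. */
	buf->dev_addr = (uint64_t)(uintptr_t)buf->area;
	return 0;
}

/*
 * Mock driver path: the hardware is only programmed when the
 * allocation and the mapping both succeeded.
 */
static int mock_setup_stream(struct mock_hw *hw, struct mock_dma_buf *buf,
			     size_t size, bool mapping_fails)
{
	int err = mock_alloc_and_map(buf, size, mapping_fails);

	if (err < 0)
		return err;	/* fail early, hardware stays untouched */

	hw->programmed_addr = buf->dev_addr;
	hw->programmed = true;
	return 0;
}
```

The point of the sketch is just that the caller gets an error at allocation time and can bail out, rather than finding out later via an IOMMU fault.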
Br, Kai