On Mon, 09 Sep 2024 22:02:08 +0200, Elliott Mitchell wrote:
On Sat, Sep 07, 2024 at 11:38:50AM +0100, Andrew Cooper wrote:
On 07/09/2024 8:46 am, Takashi Iwai wrote:
On Fri, 06 Sep 2024 20:42:09 +0200, Ariadne Conill wrote:
This patch attempted to work around a DMA issue involving Xen, but causes subtle kernel memory corruption.
When I brought up this patch in the XenDevel matrix channel, I was told that it had been requested by the Qubes OS developers because they were trying to fix an issue where the sound stack would fail after a few hours of uptime. They wound up disabling SG buffering entirely instead as a workaround.
Accordingly, I propose that we should revert this workaround patch, since it causes kernel memory corruption and that the ALSA and Xen communities should collaborate on fixing the underlying problem in such a way that SG buffering works correctly under Xen.
This reverts commit 53466ebdec614f915c691809b0861acecb941e30.
Signed-off-by: Ariadne Conill ariadne@ariadne.space Cc: stable@vger.kernel.org Cc: xen-devel@lists.xenproject.org Cc: alsa-devel@alsa-project.org Cc: Takashi Iwai tiwai@suse.de
The relevant code has been largely rewritten for 6.12, so please check the behavior with sound.git tree for-next branch. I guess the same issue should happen as the Xen workaround was kept and applied there, too, but it has to be checked at first.
If the issue is persistent with there, the fix for 6.12 code would be rather much simpler like the blow.
thanks,
Takashi
--- a/sound/core/memalloc.c +++ b/sound/core/memalloc.c @@ -793,9 +793,6 @@ static void *snd_dma_sg_alloc(struct snd_dma_buffer *dmab, size_t size) int type = dmab->dev.type; void *p;
- if (cpu_feature_enabled(X86_FEATURE_XENPV))
return snd_dma_sg_fallback_alloc(dmab, size);
- /* try the standard DMA API allocation at first */ if (type == SNDRV_DMA_TYPE_DEV_WC_SG) dmab->dev.type = SNDRV_DMA_TYPE_DEV_WC;
Individual subsystems ought not to know or care about XENPV; it's a layering violation.
If the main APIs don't behave properly, then it probably means we've got a bug at a lower level (e.g. Xen SWIOTLB is a constant source of fun) which is probably affecting other subsystems too.
This is a big problem. Debian bug #988477 (https://bugs.debian.org/988477) showed up in May 2021. While some characteristics are quite different, the time when it was first reported is similar to the above and it is also likely a DMA bug with Xen.
Yes, some incompatible behavior has been seen on Xen wrt DMA buffer handling, as it seems. But note that, in the case of above, it was triggered by the change in the sound driver side, hence we needed a quick workaround there. The result was to move back to the old method for Xen in the end.
As already mentioned in another mail, the whole code was changed for 6.12, and the revert isn't applicable in anyway.
So I'm going to submit another patch to drop this Xen PV-specific workaround for 6.12. The new code should work without the workaround (famous last words). If the problem happens there, I'd rather leave it to Xen people ;)
thanks,
Takashi