Re: [RFC PATCH v2 0/5] ASoC: soc-pcm: fix trigger race conditions with shared BE

7 Oct 2021


      On Wed, 06 Oct 2021 21:47:33 +0200,
Pierre-Louis Bossart wrote:
...
...
I tested this further. It appears that things work fine till 'patch 3/5'
of yours. After I take 'patch 4/5', the print "BUG: scheduling while
atomic: aplay" started showing up and I see a hang. This failure was
seen for 2x1 mixer test itself.
The 'patch 4/5' introduces mutex_lock/unlock() in dpcm_be_dai_trigger().
This seems to be the problem, since trigger() runs in atomic context
depending on the PCM 'nonatomic' flag. I am not sure if your design sets
'nonatomic' flag by default and that is why the issue is not seen at
your end?
With below (just for testing purpose), tests ran well. I was able to run
2x1, 3x1 ... 10x1 mixer tests.
This is useful information, thanks a lot Sameer for your time.
Unfortunately removing the lines below will not work, that's
precisely what's needed to prevent multiple triggers from being sent to
the same shared BE.
I don't have a clear idea why we see different results, and now I have
even less understanding of the ALSA/ASoC/DPCM locking model. We use
card->mutex, card->pcm_mutex, card->dpcm_lock,
substream->self_group.mutex or lock, client_mutex, dapm_mutex....
I don't think any of the DPCM code is ever called from an interrupt
context, so the errors you reported seem more like a card configuration,
specifically the interaction with 'aplay'/userspace will happen for FEs.
BEs are mostly hidden to userspace.
One possible difference is that all our DPCM solutions are based on
non-atomic FEs (since they all involve an IPC with a DSP), so we would
always use the top banch of these tests:
if (nonatomic) \
   	mutex_ ## mutex_action(&group->mutex); \
   else \
   	spin_ ## action(&group->lock);
I don't see this nonatomic flag set anywhere in the sound/soc/tegra
code, so it's possible that in your case the spin_lock/spin_lock_irq is
used before triggering the FE, and using a mutex before the BE trigger
throws errors? I think Takashi mentioned that the BEs inherit the
properties of the FE to some extent.
Looking at the code, I see that only Intel legacy, SOF and Qualcomm
drivers set nonatomic=1, so maybe we can assume that that FEs for a card
are either all atomic or all non-atomic, maybe we could then use a
spinlock or a mutex. But if the FEs can be different then I am not sure
what locking model might work, we can't have a BE protected by a
spinlock or a mutex depending on the property of the FE. We need one
method only to guarantee a mutual exclusion.
Takashi, Mark, do you think that an all/none assumption on FE nonatomic
properties would make sense?
As long as BE's are updated from FE's PCM callback, BE has to follow
the atomicity of its FE, so we can't assume all or none globally.
How is the expected lock granularity and the protection context?  Do
we need a card global lock/mutex, or is it specific to each BE
substream?
thanks,
Takashi

Re: [RFC PATCH v2 0/5] ASoC: soc-pcm: fix trigger race conditions with shared BE

Takashi Iwai