On Mon, 11 Oct 2021 22:06:51 +0200, Pierre-Louis Bossart wrote:
Basically we need to protect two things:
- The BE links
- The concurrent accesses to BEs
The former belongs to each FE that holds the links, and the FE stream lock would cover. The latter is rather a per-BE business.
An oft-seen risk of multiple locks is deadlocking, but in this case, as long as we keep the lock order FE->BE, nothing wrong can happen.
famous last words "nothing wrong can happen." :-)
I already added a helper to do this FE lock, I can easily replace the implementation to remove the spin_lock and use the FE PCM lock. we might even add the lock in the definition of for_each_dpcm_be() to avoid misses.
Let me try this out today, thanks for the suggestions.
well, it's not successful at all...
When I replace the existing dpcm_lock with the FE PCM lock as you suggested, without any additional changes, speaker-test produces valid audio on the endpoints, but if I try a Ctrl-C or limit the number of loops with the '-l' option, I hear an endless loop on the same buffer and I have to power cycle my test device to stop the sound.
See 2 patches attached, the first patch with the introduction of the helper works fine, the second with the use of the FE PCM lock doesn't. In hindsight I am glad I worked on minimal patches, one after the other, to identify problems.
And when I add the BE lock, then nothing happens. Device stuck and no audio...
There must be something we're missing related to the locking...
And indeed there's a deadlock!
snd_pcm_period_elapsed() takes the FE pcm stream lock, and will call snd_pcm_trigger.
Indeed, this would deadlock.
So if we also take the pcm stream lock in the BE trigger, there's a conceptual deadlock: we painted ourselves in a corner by using the same lock twice.
Takashi, are you really sure we should protect these for_each_dpcm_be() loops with the same pcm lock?
The call within the FE lock is done only in dpcm_dai_trigger_fe_be(), and this should call dpcm_be_dai_trigger() as is. In other places, the calls are without FE lock, hence they can take the lock, e.g. create a variant dpcm_dai_trigger_fe_be_lock().
it seems like asking for trouble to revisit the ALSA core just to walking through a list of BEs? Would you object to changing dpcm_lock as I suggested, but not interfering with stream handling?
That would work, too, it's just a pity to degrade the fine-grained locks that have been already taken into global locks...
Takashi