[alsa-devel] pcm_post_start race condition with hardware error interrupts

Takashi Iwai tiwai at suse.de
Wed Dec 20 19:02:14 CET 2017


On Wed, 20 Dec 2017 18:07:50 +0100,
srainey at delta-info.com wrote:
> 
> 
> This is more of a general question/issue, but a bit of background: I encountered this in trying to resolve an issue on a fairly mature embedded ARM platform wherein captured audio was left/right swapped some small percent (~4%) of the time on startup.  I should mention that I'm locked into using a fairly old kernel on this platform, but having a quick look at the latest at kernel.org reveals it to be not much changed.
>  
> In tracing events, I found that during the stream startup, the following sequence is possible (and indeed does happen some small percent of time):
>  
> -Audio Pipeline startups routines are kicked off through snd_pcm_action_start, eventually calling the driver's trigger op.
>  
> -The driver's trigger function is called with  SNDRV_PCM_TRIGGER_START command, which configures the necessary hardware and enables interrupts.
>  
> -While the startup sequence is still running, an interrupt is generated due to some real error condition in the hardware, and the ISR sets the stream's state to SNDRV_PCM_STATE_XRUN.

At setting XRUN, did you take the substream lock?  Without that, it's
racy, of course.

> -The startup sequence reaches snd_pcm_post_start, which unconditionally sets the stream's state to RUNNING, clobbering the error state if it was set.  See https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/sound/core/pcm_native.c?h=v4.15-rc4#n1192
>  
> If the hardware reaches an error state, the user must be informed in order to take action, but the unconditional assignment here does not allow that to happen if there is a hardware error in this small startup time.  I don't have a patch because I am working on a very old vendor provided kernel (2.6.37), but by only setting the state if the current state is PREPARED, along with some other necessary changes in the driver, I was able to mitigate the l/r swap issue and get correct behavior.
>  
> This is likely a very rare issue, but also it was somewhat difficult to debug.  I'm not familiar enough with kernel dev processes or the audio framework to say something should be done or if there is perhaps some other solution here.  It seems like something *should* be done anyway. 

By taking the substream lock (e.g. via snd_pcm_stop_xrun()), it
assures that the whole start sequence has been done.  It's the way to
avoid such an inconsistent state.


Takashi


More information about the Alsa-devel mailing list