[alsa-devel] pcm_post_start race condition with hardware error interrupts

srainey at delta-info.com srainey at delta-info.com
Wed Dec 20 20:13:22 CET 2017


>-----Original Message-----
>From: "Takashi Iwai" <tiwai at suse.de>
>Sent: Wednesday, December 20, 2017 1:02pm
>On Wed, 20 Dec 2017 18:07:50 +0100,
>srainey at delta-info.com wrote:
>> 
>> 
>> This is more of a general question/issue, but a bit of background: I encountered this in trying to resolve an issue on a fairly mature embedded ARM platform wherein captured audio was left/right swapped some small percent (~4%) of the time on startup. I should mention that I'm locked into using a fairly old kernel on this platform, but having a quick look at the latest at kernel.org reveals it to be not much changed.
>> 
>> In tracing events, I found that during the stream startup, the following sequence is possible (and indeed does happen some small percent of time):
>> 
>> -Audio Pipeline startups routines are kicked off through snd_pcm_action_start, eventually calling the driver's trigger op.
>> 
>> -The driver's trigger function is called with SNDRV_PCM_TRIGGER_START command, which configures the necessary hardware and enables interrupts.
>> 
>> -While the startup sequence is still running, an interrupt is generated due to some real error condition in the hardware, and the ISR sets the stream's state to SNDRV_PCM_STATE_XRUN.
>
>At setting XRUN, did you take the substream lock? Without that, it's
>racy, of course.
>
>> -The startup sequence reaches snd_pcm_post_start, which unconditionally sets the stream's state to RUNNING, clobbering the error state if it was set. See https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/sound/core/pcm_native.c?h=v4.15-rc4#n1192
>> 
>> If the hardware reaches an error state, the user must be informed in order to take action, but the unconditional assignment here does not allow that to happen if there is a hardware error in this small startup time. I don't have a patch because I am working on a very old vendor provided kernel (2.6.37), but by only setting the state if the current state is PREPARED, along with some other necessary changes in the driver, I was able to mitigate the l/r swap issue and get correct behavior.
>> 
>> This is likely a very rare issue, but also it was somewhat difficult to debug. I'm not familiar enough with kernel dev processes or the audio framework to say something should be done or if there is perhaps some other solution here. It seems like something *should* be done anyway. 
>
>By taking the substream lock (e.g. via snd_pcm_stop_xrun()), it
>assures that the whole start sequence has been done. It's the way to
>avoid such an inconsistent state.
>
>
>Takashi
 
The driver/IRQ handler in my case is the davinci mcasp driver https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/sound/soc/davinci/davinci-mcasp.c?h=v4.15-rc4#n309
 
It does manually grab the lock. I also currently have it modified to ALWAYS set the state to xrun while the lock is held - when the IRQ handler is triggered before we reach running state, the user should be notified as things may need to be reset (in this case I think we want to go from prepared->xrun instead of the usual running->xrun). However this does not happen in the current unmodified kernel - snd_pcm_post_start does not check the state for error before setting it, and no errors are reported to the user.  
 
 


More information about the Alsa-devel mailing list