[alsa-devel] [PROBLEM] ALSA: hda: PCM streams are suspended while opening

Jon Hunter jonathanh at nvidia.com
Mon Mar 25 11:19:31 CET 2019


On 25/03/2019 09:59, Takashi Iwai wrote:
> On Thu, 21 Mar 2019 15:28:23 +0100,
> Jon Hunter wrote:
>>
>>
>> On 20/03/2019 17:59, Jon Hunter wrote:
>>>
>>> On 20/03/2019 17:19, Takashi Iwai wrote:
>>>> On Wed, 20 Mar 2019 17:04:14 +0100,
>>>> Jon Hunter wrote:
>>>>>
>>>>> From: Jonathan Hunter <jonathanh at nvidia.com>
>>>>>
>>>>> This issue is not observed on the latest mainline kernel since the
>>>>> 'ALSA: PCM suspend cleanup' series has been applied and the
>>>>> snd_pcm_suspend_all() function call in the HDA codec driver has been
>>>>> removed. However, this issue impacts stable kernel branches and so I
>>>>> am trying to find a solution for these branches.
>>>>>
>>>>> When stress testing audio playback across multiple HDA streams
>>>>> simultaneously, the following failure is sometimes observed when
>>>>> starting playback ...
>>>>>
>>>>>  Unable to set hw params for playback: File descriptor in bad state
>>>>>
>>>>> The problem is caused because ...
>>>>> 1. Playback is starting for one HDA codec and so the chip->open_mutex
>>>>>    in azx_pcm_open() has been acquired.
>>>>> 2. Playback for another HDA codec is starting but is blocked waiting to
>>>>>    acquire the chip->open_mutex in azx_pcm_open().
>>>>> 3. For the HDA codec that is blocked, its runtime-pm status is still
>>>>>    enabled from a previous playback session that has since completed and
>>>>>    been closed, however its autosuspend timeout has not expired yet.
>>>>> 4. For the HDA codec that is blocked, the runtime-pm autosuspend timeout
>>>>>    now occurs calling the runtime-pm suspend callback and this calls
>>>>>    snd_pcm_suspend_all() placing all PCM streams in the suspended state.
>>>>> 5. The block HDA codec is now unblocked and fails to set the HW params
>>>>>    because the PCM stream is in the suspended state.
>>>>>
>>>>> The above has been observed on Tegra platforms where the autosuspend
>>>>> delay is set to 1 second and there is a delay to an AZX command. This
>>>>> highlights that there is a window of time where autosuspend can place
>>>>> the HDA PCM stream in the suspended state while opening the stream
>>>>> and cause playback to fail.
>>>>>
>>>>> Looking at commit 3d21ef0b49f8 ('ALSA: pcm: Suspend streams globally via
>>>>> device type PM ops') it appears that the call to snd_pcm_suspend_all()
>>>>> has now been moved to the to the suspend handler for PCM streams and so
>>>>> I am wondering if it would be equivalent to make the following change to
>>>>> the HDA codec driver to achieve the same behaviour but only for the HDA
>>>>> driver. So far the issue has not been observed with this change.
>>>>>
>>>>> Please note that I am not sending this as a formal patch as I wanted to
>>>>> get some feedback/comments on the above.
>>>>
>>>> I guess just backporting 3d21ef0b49f8 should be OK even for older
>>>> kernels.  This assures the PCM stream gets suspended before entering
>>>> any parent device suspend call.
>>>>
>>>> The remaining changes (except for the one for atiixp) are merely
>>>> cleanup of superfluous calls, and keeping them are harmless.
>>>>
>>>> Could you check whether my theory above correct?
>>>
>>> I believe that you will need that change and 17bc4815de58 because we
>>> need to prevent snd_pcm_suspend_all() being called in the pm-runtime
>>> suspend callback. Applying only 3d21ef0b49f8 will means that at runtime
>>> snd_pcm_suspend_all() can still be called from within the HDA codec
>>> driver when the autosuspend timeout occurs. I will test this and
>>> confirm. Maybe both could be backported?
>>
>> I confirmed today that with just 3d21ef0b49f8 the issue can occur, but
>> with both 3d21ef0b49f8 and 17bc4815de58 the issue is not seen.
>>
>> Do you think that it would be appropriate to include these in the stable
>> branches? They apply cleanly to v5.0, but we would probably need to
>> back-port for early kernels such as v4.19, v4.9, etc.
> 
> I reconsidered the problem again, and noticed that the very same
> problem may appear with the system PM, not only with runtime PM.
> The suspend can happen at any time, so even a stream in OPEN state may
> go to suspend, and you'll hit the same problem.  It's just a corner
> case so no one really cared much.

Yes I was not sure if it could also happen in the system suspend case or
not.

> So, I think the patch like below should fix the problem.
> This can be easily backported to all stable trees, and it alone should
> work without backporting the else intrusive changes.
> 
> Could you check whether my theory is correct?

Do you want me to test this on the same stable branch I have been
testing with where commits 3d21ef0b49f8 and 17bc4815de58 are not present?

Cheers
Jon
 --
nvpublic


More information about the Alsa-devel mailing list