[PATCH v4] ALSA: core: Fix deadlock when shutdown a frozen userspace

Pierre-Louis Bossart pierre-louis.bossart at linux.intel.com
Mon Nov 28 18:26:03 CET 2022



On 11/28/22 11:04, Takashi Iwai wrote:
> On Mon, 28 Nov 2022 17:49:20 +0100,
> Pierre-Louis Bossart wrote:
>>
>>
>>
>> On 11/28/22 07:42, Ricardo Ribalda wrote:
>>> During kexec(), the userspace is frozen. Therefore we cannot wait for it
>>> to complete.
>>>
>>> Avoid running snd_sof_machine_unregister during shutdown.
>>>
>>> This fixes:
>>>
>>> [   84.943749] Freezing user space processes ... (elapsed 0.111 seconds) done.
>>> [  246.784446] INFO: task kexec-lite:5123 blocked for more than 122 seconds.
>>> [  246.819035] Call Trace:
>>> [  246.821782]  <TASK>
>>> [  246.824186]  __schedule+0x5f9/0x1263
>>> [  246.828231]  schedule+0x87/0xc5
>>> [  246.831779]  snd_card_disconnect_sync+0xb5/0x127
>>> ...
>>> [  246.889249]  snd_sof_device_shutdown+0xb4/0x150
>>> [  246.899317]  pci_device_shutdown+0x37/0x61
>>> [  246.903990]  device_shutdown+0x14c/0x1d6
>>> [  246.908391]  kernel_kexec+0x45/0xb9
>>>
>>> And:
>>>
>>> [  246.893222] INFO: task kexec-lite:4891 blocked for more than 122 seconds.
>>> [  246.927709] Call Trace:
>>> [  246.930461]  <TASK>
>>> [  246.932819]  __schedule+0x5f9/0x1263
>>> [  246.936855]  ? fsnotify_grab_connector+0x5c/0x70
>>> [  246.942045]  schedule+0x87/0xc5
>>> [  246.945567]  schedule_timeout+0x49/0xf3
>>> [  246.949877]  wait_for_completion+0x86/0xe8
>>> [  246.954463]  snd_card_free+0x68/0x89
>>> ...
>>> [  247.001080]  platform_device_unregister+0x12/0x35
>>>
>>> Cc: stable at vger.kernel.org
>>> Fixes: 83bfc7e793b5 ("ASoC: SOF: core: unregister clients and machine drivers in .shutdown")
>>> Signed-off-by: Ricardo Ribalda <ribalda at chromium.org>
>>> ---
>>> To: Pierre-Louis Bossart <pierre-louis.bossart at linux.intel.com>
>>> To: Liam Girdwood <lgirdwood at gmail.com>
>>> To: Peter Ujfalusi <peter.ujfalusi at linux.intel.com>
>>> To: Bard Liao <yung-chuan.liao at linux.intel.com>
>>> To: Ranjani Sridharan <ranjani.sridharan at linux.intel.com>
>>> To: Kai Vehmanen <kai.vehmanen at linux.intel.com>
>>> To: Daniel Baluta <daniel.baluta at nxp.com>
>>> To: Mark Brown <broonie at kernel.org>
>>> To: Jaroslav Kysela <perex at perex.cz>
>>> To: Takashi Iwai <tiwai at suse.com>
>>> Cc: sound-open-firmware at alsa-project.org
>>> Cc: alsa-devel at alsa-project.org
>>> Cc: linux-kernel at vger.kernel.org
>>> ---
>>> Changes in v4:
>>> - Do not call snd_sof_machine_unregister from shutdown.
>>> - Link to v3: https://lore.kernel.org/r/20221127-snd-freeze-v3-0-a2eda731ca14@chromium.org
>>>
>>> Changes in v3:
>>> - Wrap pm_freezing in a function
>>> - Link to v2: https://lore.kernel.org/r/20221127-snd-freeze-v2-0-d8a425ea9663@chromium.org
>>>
>>> Changes in v2:
>>> - Only use pm_freezing if CONFIG_FREEZER 
>>> - Link to v1: https://lore.kernel.org/r/20221127-snd-freeze-v1-0-57461a366ec2@chromium.org
>>> ---
>>>  sound/soc/sof/core.c | 7 ++-----
>>>  1 file changed, 2 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/sound/soc/sof/core.c b/sound/soc/sof/core.c
>>> index 3e6141d03770..9616ba607ded 100644
>>> --- a/sound/soc/sof/core.c
>>> +++ b/sound/soc/sof/core.c
>>> @@ -475,19 +475,16 @@ EXPORT_SYMBOL(snd_sof_device_remove);
>>>  int snd_sof_device_shutdown(struct device *dev)
>>>  {
>>>  	struct snd_sof_dev *sdev = dev_get_drvdata(dev);
>>> -	struct snd_sof_pdata *pdata = sdev->pdata;
>>>  
>>>  	if (IS_ENABLED(CONFIG_SND_SOC_SOF_PROBE_WORK_QUEUE))
>>>  		cancel_work_sync(&sdev->probe_work);
>>>  
>>>  	/*
>>> -	 * make sure clients and machine driver(s) are unregistered to force
>>> -	 * all userspace devices to be closed prior to the DSP shutdown sequence
>>> +	 * make sure clients are unregistered prior to the DSP shutdown
>>> +	 * sequence.
>>>  	 */
>>>  	sof_unregister_clients(sdev);
>>>  
>>> -	snd_sof_machine_unregister(sdev, pdata);
>>> -
>>
>> The comment clearly says that we do want all userspace devices to be
>> closed. This was added in 83bfc7e793b5 ("ASoC: SOF: core: unregister
>> clients and machine drivers in .shutdown") precisely to avoid a platform
>> hang if the devices are used after the shutdown completes.
> 
> The problem is that it unnecessarily requires the user-space programs
> to *close* their devices.  Basically the shutdown can be seen as a sort
> of device hot unplug; i.e. disconnecting the device files and cleaning
> up the device state are the main tasks.  The difference is that a hot
> unplug (unbind) usually then waits for all processes to close the
> device (so that all resources can be released gracefully), while this
> step is skipped for the shutdown (no need to free resources).

Sorry Takashi, I don't have enough background to follow your explanations.

As Kai mentioned, this step helped with an S5 issue earlier in 2022.
Removing it will mechanically bring that issue back and break other
Chromebooks.
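
The distinction discussed above (disconnecting the device files versus
additionally waiting for user space to close them) can be sketched as a
small userspace model. This is an illustration only, not kernel code:
every model_* name is hypothetical, and the bounded loop merely stands
in for the unbounded wait seen in the snd_card_disconnect_sync() and
snd_card_free() traces.

```c
#include <stdbool.h>

/* Hypothetical model of a sound card; not the kernel's struct snd_card. */
struct model_card {
	bool shutdown;   /* device files disconnected, new I/O is rejected */
	int open_files;  /* file handles still held by user space */
};

/*
 * One scheduling step of user space. A running process notices the
 * disconnect and closes one handle; a frozen process never runs, so it
 * never closes anything.
 */
void model_userspace_tick(struct model_card *card, bool frozen)
{
	if (!frozen && card->shutdown && card->open_files > 0)
		card->open_files--;
}

/* Disconnect only: mark the card as gone and return immediately. */
void model_disconnect(struct model_card *card)
{
	card->shutdown = true;
}

/*
 * Disconnect, then wait for every handle to be closed. The real wait is
 * unbounded; max_ticks is an assumption of this model so that the
 * stalled case can be observed instead of hanging.
 * Returns true if all handles were closed, false if the wait stalled.
 */
bool model_disconnect_sync(struct model_card *card, bool frozen, int max_ticks)
{
	model_disconnect(card);
	for (int i = 0; i < max_ticks && card->open_files > 0; i++)
		model_userspace_tick(card, frozen);
	return card->open_files == 0;
}
```

In this model, model_disconnect_sync() succeeds when user space is
running and stalls when it is frozen, while model_disconnect() returns
immediately in both cases and leaves the open handles untouched.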

