[PATCH] ASoC: hdac_hdmi: remove cancel_work_sync in runtime suspend
A deadlock is identified when there are three contexts running at the same time: - a HDMI jack work which is calling snd_soc_dapm_sync(). - user space is calling snd_pcm_release() to close pcm device. - pm is calling runtime suspend function of HDMI codec driver.
By removing the clear_dapm_works() invocation in the hdac_hdmi_runtime_suspend() function, the snd_pcm_release() could always returns from dapm_power_widgets() function call without blocking the hdac_hdmi_jack_dapm_work() work thread or being blocked by the hdac_hdmi_runtime_suspend() function. The purpose of the jack work is to enable/disable the dapm jack pin so it's not necessary to cancel the work in runtime suspend function which is usually called when pcm device is closed.
Signed-off-by: Brent Lu brent.lu@intel.com --- sound/soc/codecs/hdac_hdmi.c | 2 -- 1 file changed, 2 deletions(-)
diff --git a/sound/soc/codecs/hdac_hdmi.c b/sound/soc/codecs/hdac_hdmi.c index f26b77f..9052edd 100644 --- a/sound/soc/codecs/hdac_hdmi.c +++ b/sound/soc/codecs/hdac_hdmi.c @@ -2097,8 +2097,6 @@ static int hdac_hdmi_runtime_suspend(struct device *dev) if (!bus) return 0;
- clear_dapm_works(hdev); - /* * Power down afg. * codec_read is preferred over codec_write to set the power state.
A deadlock is identified when there are three contexts running at the same time:
- a HDMI jack work which is calling snd_soc_dapm_sync().
- user space is calling snd_pcm_release() to close pcm device.
- pm is calling runtime suspend function of HDMI codec driver.
By removing the clear_dapm_works() invocation in the hdac_hdmi_runtime_suspend() function, the snd_pcm_release() could always returns from dapm_power_widgets() function call without blocking the hdac_hdmi_jack_dapm_work() work thread or being blocked by the hdac_hdmi_runtime_suspend() function. The purpose of the jack work is to enable/disable the dapm jack pin so it's not necessary to cancel the work in runtime suspend function which is usually called when pcm device is closed.
Signed-off-by: Brent Lu brent.lu@intel.com
It is found on CML Chromebox 'Puff' which is using Chrome-v4.19 branch. Kernel resets itself because tasks getting stuck in mutex for 2 mins. Following is the reproduce steps:
1. Connect to external HDMI display. 2. Open web page "https://bitmovin.com/demos/drm" 3. Reload the web page multiple times until device reboot
snd_pcm_release() - In dapm_power_widgets() waitng for dapm_pre_sequence_async() to complete. - Holding the card->dapm_mutex when calling snd_soc_dapm_stream_event() function. - Call trace: [ 1352.228063] __schedule+0x4b8/0xf5c [ 1352.228074] schedule+0x3f/0x7a [ 1352.228083] async_synchronize_cookie_domain+0xb5/0x11c [ 1352.228091] ? init_wait_entry+0x2e/0x2e [ 1352.228101] dapm_power_widgets+0x1eb/0x3b5 [ 1352.228111] snd_soc_dapm_stream_event+0x80/0x92 [ 1352.228121] dpcm_dapm_stream_event+0x58/0x7c [ 1352.228130] dpcm_fe_dai_close+0xa3/0x118 [ 1352.228141] snd_pcm_release_substream+0x72/0xbc [ 1352.228148] snd_pcm_release+0x3e/0x9b [ 1352.228156] __fput+0xfa/0x1ef [ 1352.228166] task_work_run+0x7f/0xa7 [ 1352.228174] exit_to_usermode_loop+0xa5/0xa7 [ 1352.228182] do_syscall_64+0xfb/0x14b [ 1352.228190] entry_SYSCALL_64_after_hwframe+0x44/0xa9
dapm_pre_sequence_async() - In rpm_resume() waiting for hdac_hdmi_runtime_suspend() to complete. - Call trace: [ 1352.227903] Workqueue: events_unbound async_run_entry_fn [ 1352.227908] Call Trace: [ 1352.227917] __schedule+0x4b8/0xf5c [ 1352.227926] schedule+0x3f/0x7a [ 1352.227934] rpm_resume+0x16e/0x5ee [ 1352.227944] ? init_wait_entry+0x2e/0x2e [ 1352.227951] __pm_runtime_resume+0x6b/0x83 [ 1352.227959] dapm_pre_sequence_async+0x2c/0x8f [ 1352.227967] async_run_entry_fn+0x3d/0xd1 [ 1352.227975] process_one_work+0x19b/0x4c9 [ 1352.227983] worker_thread+0x10d/0x284 [ 1352.227991] kthread+0x138/0x140 [ 1352.227998] ? process_one_work+0x4c9/0x4c9 [ 1352.228006] ? kthread_associate_blkcg+0xbe/0xbe [ 1352.228012] ret_from_fork+0x1f/0x40
hdac_hdmi_runtime_suspend() - In clear_dapm_works() waiting for hdac_hdmi_jack_dapm_work() to complete. - Call trace: [ 1352.228359] Workqueue: pm pm_runtime_work [ 1352.228379] Call Trace: [ 1352.228396] __schedule+0x4b8/0xf5c [ 1352.228410] ? delete_node+0x74/0x1ab [ 1352.228426] schedule+0x3f/0x7a [ 1352.228442] schedule_timeout+0x11f/0x176 [ 1352.228461] do_wait_for_common+0x118/0x164 [ 1352.228477] ? console_conditional_schedule+0x28/0x28 [ 1352.228494] ? queue_core_balance+0x75/0x75 [ 1352.228509] wait_for_completion+0x4c/0x58 [ 1352.228526] __flush_work+0x112/0x1a1 [ 1352.228540] ? insert_wq_barrier+0x118/0x118 [ 1352.228556] __cancel_work_timer+0x9f/0x1ad [ 1352.228569] ? __switch_to_asm+0x41/0x70 [ 1352.228582] ? __switch_to_asm+0x35/0x70 [ 1352.228596] ? __switch_to_asm+0x41/0x70 [ 1352.228613] clear_dapm_works+0x3f/0x60 [snd_soc_hdac_hdmi] [ 1352.228627] ? hdmi_codec_resume+0x40/0x40 [snd_soc_hdac_hdmi] [ 1352.228642] hdac_hdmi_runtime_suspend+0x20/0xac [snd_soc_hdac_hdmi] [ 1352.228654] __rpm_callback+0x8a/0xf3 [ 1352.228669] ? hdmi_codec_resume+0x40/0x40 [snd_soc_hdac_hdmi] [ 1352.228680] rpm_suspend+0x3b4/0x73e [ 1352.228696] pm_runtime_work+0x5d/0xa0 [ 1352.228708] process_one_work+0x19b/0x4c9 [ 1352.228717] worker_thread+0x10d/0x284 [ 1352.228727] kthread+0x138/0x140 [ 1352.228736] ? process_one_work+0x4c9/0x4c9 [ 1352.228746] ? kthread_associate_blkcg+0xbe/0xbe [ 1352.228761] ret_from_fork+0x1f/0x40
hdac_hdmi_jack_dapm_work() - In snd_soc_dapm_sync() waiting for waiting for card->dapm_mutex to be released. - The mutex is held by snd_pcm_release(), deadlock here. - Call trace: [ 1352.228835] Workqueue: events hdac_hdmi_jack_dapm_work [snd_soc_hdac_hdmi] [ 1352.228845] Call Trace: [ 1352.228860] __schedule+0x4b8/0xf5c [ 1352.228878] ? mutex_spin_on_owner+0x86/0x94 [ 1352.228896] schedule_preempt_disabled+0x47/0x8a [ 1352.228911] __mutex_lock_common+0x25c/0x52c [ 1352.228929] __mutex_lock_slowpath+0x20/0x26 [ 1352.228946] snd_soc_dapm_sync+0x1b/0x4d [ 1352.228960] process_one_work+0x19b/0x4c9 [ 1352.228976] worker_thread+0x1e4/0x284 [ 1352.228994] kthread+0x138/0x140 [ 1352.229007] ? process_one_work+0x4c9/0x4c9 [ 1352.229021] ? kthread_associate_blkcg+0xbe/0xbe [ 1352.229036] ret_from_fork+0x1f/0x40
Regards, Brent
sound/soc/codecs/hdac_hdmi.c | 2 -- 1 file changed, 2 deletions(-)
diff --git a/sound/soc/codecs/hdac_hdmi.c b/sound/soc/codecs/hdac_hdmi.c index f26b77f..9052edd 100644 --- a/sound/soc/codecs/hdac_hdmi.c +++ b/sound/soc/codecs/hdac_hdmi.c @@ -2097,8 +2097,6 @@ static int hdac_hdmi_runtime_suspend(struct device *dev) if (!bus) return 0;
- clear_dapm_works(hdev);
- /*
- Power down afg.
- codec_read is preferred over codec_write to set the power state.
-- 2.7.4
On Wed, 15 Jul 2020 21:01:50 +0800, Brent Lu wrote:
A deadlock is identified when there are three contexts running at the same time:
- a HDMI jack work which is calling snd_soc_dapm_sync().
- user space is calling snd_pcm_release() to close pcm device.
- pm is calling runtime suspend function of HDMI codec driver.
By removing the clear_dapm_works() invocation in the hdac_hdmi_runtime_suspend() function, the snd_pcm_release() could always returns from dapm_power_widgets() function call without blocking the hdac_hdmi_jack_dapm_work() work thread or being blocked by the hdac_hdmi_runtime_suspend() function. The purpose of the jack work is to enable/disable the dapm jack pin so it's not necessary to cancel the work in runtime suspend function which is usually called when pcm device is closed.
Applied to
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next
Thanks!
[1/1] ASoC: hdac_hdmi: remove cancel_work_sync in runtime suspend commit: cec6e41ce094affad3b5f9f5e5aa1f81c66ce682
All being well this means that it will be integrated into the linux-next tree (usually sometime in the next 24 hours) and sent to Linus during the next merge window (or sooner if it is a bug fix), however if problems are discovered then the patch may be dropped or reverted.
You may get further e-mails resulting from automated or manual testing and review of the tree, please engage with people reporting problems and send followup patches addressing any issues that are reported if needed.
If any updates are required or you are submitting further changes they should be sent as incremental updates against current git, existing patches will not be replaced.
Please add any relevant lists and maintainers to the CCs when replying to this mail.
Thanks, Mark
participants (3)
-
Brent Lu
-
Lu, Brent
-
Mark Brown