On Wed, 16 Mar 2016 15:04:20 +0100, Ville Syrjälä wrote:
But now I got a lockdep spew when I enabled the HDMI video output [1]
And sure enough mplayer got stuck in the kernel when I tried to use the HDMI audio device [2]
[1]
[ 1939.476458] =============================================
[ 1939.476460] [ INFO: possible recursive locking detected ]
[ 1939.476463] 4.5.0-vga+ #13 Not tainted
[ 1939.476464] ---------------------------------------------
[ 1939.476466] kworker/2:2/1016 is trying to acquire lock:
[ 1939.476469] (&spec->pcm_lock){+.+...}, at: [<ffffffffa020b868>] hdmi_present_sense+0x38/0x300 [snd_hda_codec_hdmi]
[ 1939.476480] but task is already holding lock:
[ 1939.476482] (&spec->pcm_lock){+.+...}, at: [<ffffffffa020b868>] hdmi_present_sense+0x38/0x300 [snd_hda_codec_hdmi]
[ 1939.476489] other info that might help us debug this:
[ 1939.476491] Possible unsafe locking scenario:
[ 1939.476493] CPU0
[ 1939.476495] ----
[ 1939.476496] lock(&spec->pcm_lock);
[ 1939.476499] lock(&spec->pcm_lock);
[ 1939.476502] *** DEADLOCK ***
[ 1939.476504] May be due to missing lock nesting notation
Unfortunately not; this is a real deadlock. See the trace below: hdmi_present_sense() is entered twice because it issues a verb that triggers a runtime resume of the codec, and the resume path calls hdmi_present_sense() again while the first call still holds pcm_lock.
[ 1939.476622] [<ffffffffa020b868>] hdmi_present_sense+0x38/0x300 [snd_hda_codec_hdmi]
....
[ 1939.476642] [<ffffffffa020bd6d>] generic_hdmi_resume+0x4d/0x60 [snd_hda_codec_hdmi]
....
[ 1939.476690] [<ffffffffa017de62>] snd_hdac_power_up_pm+0x52/0x60 [snd_hda_core]
[ 1939.476694] [<ffffffffa020b9c3>] hdmi_present_sense+0x193/0x300 [snd_hda_codec_hdmi]
[ 1939.476699] [<ffffffffa020bba0>] check_presence_and_report+0x70/0x90 [snd_hda_codec_hdmi]
[ 1939.476703] [<ffffffffa020bcba>] hdmi_unsol_event+0x9a/0xb0 [snd_hda_codec_hdmi]
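For illustration only, here is a minimal user-space sketch of the same re-entrancy. It is not the driver code: present_sense(), power_up(), codec_resume() and simulate_unsol_event() are hypothetical stand-ins for the functions in the trace, and a POSIX error-checking mutex is assumed in place of the kernel mutex so that the nested lock attempt reports EDEADLK instead of hanging.

```c
/* User-space sketch only; every name here is a hypothetical stand-in,
 * not the actual snd_hda_codec_hdmi code. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t pcm_lock;	/* stands in for spec->pcm_lock */
static int powered_up;			/* 0: codec is runtime-suspended */
static int nested_lock_result;		/* result of the nested call */

static int present_sense(void);

/* Stand-in for the runtime-resume callback (generic_hdmi_resume). */
static void codec_resume(void)
{
	nested_lock_result = present_sense();
}

/* Stand-in for snd_hdac_power_up_pm(): resume only when suspended. */
static void power_up(void)
{
	if (!powered_up) {
		powered_up = 1;
		codec_resume();
	}
}

/* Stand-in for hdmi_present_sense(): take the lock, then issue a verb. */
static int present_sense(void)
{
	int err = pthread_mutex_lock(&pcm_lock);

	if (err)
		return err;	/* EDEADLK: this thread already holds it */
	power_up();		/* issuing the verb may trigger a resume */
	pthread_mutex_unlock(&pcm_lock);
	return 0;
}

/* Returns what the nested call saw: EDEADLK if a resume happened, else 0. */
int simulate_unsol_event(int codec_powered)
{
	pthread_mutexattr_t attr;
	int err;

	pthread_mutexattr_init(&attr);
	/* Error-checking type: a relock by the owner fails, not hangs. */
	pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	pthread_mutex_init(&pcm_lock, &attr);
	pthread_mutexattr_destroy(&attr);

	powered_up = codec_powered;
	nested_lock_result = 0;
	err = present_sense();
	assert(err == 0);	/* the outer call itself always succeeds */
	(void)err;
	pthread_mutex_destroy(&pcm_lock);
	return nested_lock_result;
}
```

In this sketch, simulate_unsol_event(0) models a suspended codec: the nested call fails with EDEADLK, which is where the real code blocks forever. simulate_unsol_event(1) models an already powered codec, where no resume happens and the lock is taken only once.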
This went unnoticed until now because the code path using the i915 audio notifier doesn't need to power up the codec. Now that we've switched to the old method for old chips, the bug is exposed. It's good to have caught it now, since it basically hits all non-Intel chips.
Takashi