On Wed, 27 Dec 2023 08:37:07 +0100, Dominik Brodowski wrote:
Hi,
unfortunately, the latest 6.7.0-rc7 and the two previous rc kernels cause an oops in hdac_hda_dev_probe(); sound and resume-from-suspend subsequently do not work:

BUG: kernel NULL pointer dereference, address: 0000000000000078
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
Hardware name: Dell Inc. XPS 9315/00KRKP, BIOS 1.1.3 05/11/2022
Workqueue: events sof_probe_work
RIP: 0010:hdac_hda_dev_probe+0x42/0xf0
Code: 48 8b 37 48 8b bb c8 04 00 00 e8 09 9b 0a 00 48 85 c0 48 89 c5 0f 84 a6 00 00 00 48 8b bb c8 04 00 00 48 89 c6 e8 1e 9a 0a 00 <41> 80 7c 24 78 00 75 46 b9 03 00 00 00 48 c7 c2 c0 b2 a1 ac 48 c7
RSP: 0000:ffffc90000207b50 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88811495d000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff888108691600 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: ffff88811400b028
FS: 0000000000000000(0000) GS:ffff88886f500000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000078 CR3: 00000002b7a5a000 CR4: 0000000000f50ef0
PKRU: 55555554
Call Trace:
 <TASK>
 ? __die+0x1e/0x70
 ? page_fault_oops+0x17c/0x4b0
 ? snd_hdac_ext_bus_link_get+0x24/0xc0
 ? exc_page_fault+0x462/0x8e0
 ? asm_exc_page_fault+0x26/0x30
 ? hdac_hda_dev_probe+0x42/0xf0
 really_probe+0x166/0x300
 ? __pfx___device_attach_driver+0x10/0x10
 __driver_probe_device+0x6e/0x120
 driver_probe_device+0x1a/0x90
 __device_attach_driver+0x8e/0xd0
 bus_for_each_drv+0x90/0xf0
 __device_attach+0xac/0x1a0
 bus_probe_device+0x93/0xb0
 device_add+0x669/0x860
 snd_hdac_device_register+0x10/0x60
 hda_codec_probe_bus+0x189/0x290
 hda_dsp_probe+0x211/0x550
 sof_probe_work+0x2c/0x430
 ? process_one_work+0x19c/0x500
 process_one_work+0x205/0x500
 worker_thread+0x1dc/0x3e0
 ? __pfx_worker_thread+0x10/0x10
 kthread+0xea/0x120
 ? __pfx_kthread+0x10/0x10
 ret_from_fork+0x2c/0x50
 ? __pfx_kthread+0x10/0x10
 ret_from_fork_asm+0x1b/0x30
 </TASK>
Modules linked in:
CR2: 0000000000000078
---[ end trace 0000000000000000 ]---
RIP: 0010:hdac_hda_dev_probe+0x42/0xf0
Code: 48 8b 37 48 8b bb c8 04 00 00 e8 09 9b 0a 00 48 85 c0 48 89 c5 0f 84 a6 00 00 00 48 8b bb c8 04 00 00 48 89 c6 e8 1e 9a 0a 00 <41> 80 7c 24 78 00 75 46 b9 03 00 00 00 48 c7 c2 c0 b2 a1 ac 48 c7
RSP: 0000:ffffc90000207b50 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88811495d000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff888108691600 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: ffff88811400b028
FS: 0000000000000000(0000) GS:ffff88886f500000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000078 CR3: 00000002b7a5a000 CR4: 0000000000f50ef0
PKRU: 55555554
note: kworker/2:0[24] exited with irqs disabled

I was able to bisect the issue to commit a0575b4add21 ("ASoC: hdac_hda: Conditionally register dais for HDMI and Analog"). Reverting that patch on top of mainline fixes it.
As I've been (and still am) off, I haven't had time for a deeper look yet, unfortunately. But my wild guess is that it's a NULL dereference of the hdac_hda_priv obtained via the hdac device. If that's correct, a one-liner like the one below should work around the crash. Could you give it a try?
thanks,
Takashi
-- 8< --
--- a/sound/soc/codecs/hdac_hda.c
+++ b/sound/soc/codecs/hdac_hda.c
@@ -630,7 +630,7 @@ static int hdac_hda_dev_probe(struct hdac_device *hdev)
 	snd_hdac_ext_bus_link_get(hdev->bus, hlink);
 
 	/* ASoC specific initialization */
-	if (hda_pvt->need_display_power)
+	if (hda_pvt && hda_pvt->need_display_power)
 		ret = devm_snd_soc_register_component(&hdev->dev,
 						&hdac_hda_hdmi_codec, hdac_hda_hdmi_dais,
 						ARRAY_SIZE(hdac_hda_hdmi_dais));
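
For what it's worth, the oops looks consistent with the NULL-pointer guess: the faulting bytes "<41> 80 7c 24 78 00" decode to cmpb $0x0,0x78(%r12), R12 is 0, and CR2 is 0x78. The userspace sketch below only illustrates that arithmetic and the guarded read. struct fake_hda_priv, its padding, and the helper names are hypothetical; the thread does not show the real layout of hdac_hda_priv, so the assumption that need_display_power sits at offset 0x78 is unconfirmed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for struct hdac_hda_priv. The padding is chosen
 * only so that need_display_power lands at offset 0x78, matching CR2 in
 * the oops; the real structure layout is an assumption here.
 */
struct fake_hda_priv {
	char other_members[0x78];
	bool need_display_power;
};

/* Address the CPU would access when reading base->need_display_power. */
uintptr_t member_address(uintptr_t base)
{
	return base + offsetof(struct fake_hda_priv, need_display_power);
}

/* Guarded read, mirroring the "hda_pvt && ..." check in the patch. */
bool wants_display_power(const struct fake_hda_priv *p)
{
	return p && p->need_display_power;
}

/* Small self-check of the two helpers above. */
void sketch_self_check(void)
{
	struct fake_hda_priv priv = { .need_display_power = true };

	/* A NULL base plus the member offset gives the faulting address. */
	assert(member_address(0) == 0x78);
	/* The guard short-circuits instead of touching address 0x78. */
	assert(!wants_display_power(NULL));
	assert(wants_display_power(&priv));
}
```

Note that the guard only avoids the crash; with a NULL private pointer the probe would fall through to the non-HDMI branch, so it is a workaround rather than an explanation of why the pointer is unset.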