[PATCH 0/9] ALSA: hda: Codec-reload bug fixes and cleanups

Total of 6 fixes and 3 cleanups - cleanups are last.
All of the fixes address problems that present themselves in situation when user engages in codec driver reload. Second condition to reproduce is two-step initialization of HDAudio codec - this is the case only for ASoC HDAudio bus driver as snd_hda_intel calls only compound function snd_hda_codec_new(). Once these conditions are met, several reload/unload scenarios end with null-ptr-deref and page faults. Goal of the series is to allow codec/bus driver reloading without any errors.
Amadeusz Sławiński (2): ALSA: hda: Reset all SIE bits in INTCTL ALSA: hda: Remove unused macro definition
Cezary Rojewski (7): ALSA: hda: Do not unset preset when cleaning up codec ALSA: hda: Fix null-ptr-deref when i915 fails and hdmi is denylisted ALSA: hda: Make device usage_count consistent across subsequent probing ALSA: hda: Fix put_device() inconsistency in error path ALSA: hda: Skip event processing for unregistered codecs ALSA: hda: Fix page fault in snd_hda_codec_shutdown() ALSA: hda/realtek: Remove redundant init_hook() in alc_default_init()
include/sound/hda_codec.h | 1 - include/sound/hdaudio.h | 1 + sound/hda/ext/hdac_ext_controller.c | 7 --- sound/hda/hdac_bus.c | 2 +- sound/hda/hdac_controller.c | 7 +-- sound/pci/hda/hda_bind.c | 7 +++ sound/pci/hda/hda_codec.c | 83 +++++++++++++++-------------- sound/pci/hda/patch_realtek.c | 3 -- sound/soc/codecs/hda.c | 4 +- 9 files changed, 56 insertions(+), 59 deletions(-)

snd_hda_codec_cleanup_for_unbind() unsets preset what interferes with module unloading and triggers null-ptr-deref. Preset is assigned only once, during device/driver matching whereas module reload and unload follow completely different path and may occur several times during runtime.
Fixes: 9a6246ff78ac ("ALSA: hda - Implement unbind more safely") Signed-off-by: Cezary Rojewski cezary.rojewski@intel.com --- sound/pci/hda/hda_codec.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/sound/pci/hda/hda_codec.c b/sound/pci/hda/hda_codec.c index 7579a6982f47..9ceb642ac819 100644 --- a/sound/pci/hda/hda_codec.c +++ b/sound/pci/hda/hda_codec.c @@ -795,7 +795,6 @@ void snd_hda_codec_cleanup_for_unbind(struct hda_codec *codec) snd_array_free(&codec->cvt_setups); snd_array_free(&codec->spdif_out); snd_array_free(&codec->verbs); - codec->preset = NULL; codec->follower_dig_outs = NULL; codec->spdif_status_reset = 0; snd_array_free(&codec->mixers);

On Wed, 06 Jul 2022 14:02:22 +0200, Cezary Rojewski wrote:
snd_hda_codec_cleanup_for_unbind() unsets preset what interferes with module unloading and triggers null-ptr-deref. Preset is assigned only once, during device/driver matching whereas module reload and unload follow completely different path and may occur several times during runtime.
Hm, the driver reload/unload does unbind. Keeping this field mean to leave the pointer to the possibly freed object, no?
And if it's not cleared, where is this field cleared instead?
thanks,
Takashi

On 2022-07-09 6:34 PM, Takashi Iwai wrote:
On Wed, 06 Jul 2022 14:02:22 +0200, Cezary Rojewski wrote:
snd_hda_codec_cleanup_for_unbind() unsets preset what interferes with module unloading and triggers null-ptr-deref. Preset is assigned only once, during device/driver matching whereas module reload and unload follow completely different path and may occur several times during runtime.
Hm, the driver reload/unload does unbind. Keeping this field mean to leave the pointer to the possibly freed object, no?
And if it's not cleared, where is this field cleared instead?
avs-driver i.e. the bus driver takes responsibility for the codec device only. There is no real probe(), just the device creation and initialization of its fields. The rest is handled by the component driver (sound/soc/hda.c). If this field is cleared and the test is limited to reloading HDAudio codec module alone, we get a panic. Something similar to the stack found below my message.
In regard to the other question - are presets freed at all? It seems all of them are part of the static device-driver matching list. If so, the pointer is always valid.
[ 136.827856] RIP: 0010:hda_codec_probe+0x16c/0x560 [snd_soc_hda_codec] [ 136.827929] Code: ff 85 c0 0f 88 5b 0b 00 00 4d 8d bc 24 d0 03 00 00 4c 89 ff e8 e5 a2 9e ca 49 8b 9c 24 d0 03 00 00 48 8d 7b 10 e8 d4 a2 9e ca <48> 8b 73 10 4c 89 e7 e8 e8 7d fb ff 85 c0 0f 88 43 0b 00 00 4c 89 [ 136.828028] RSP: 0018:ffff888101af74d0 EFLAGS: 00010286 [ 136.828079] RAX: 0000000000000001 RBX: 0000000000000000 RCX: ffffffff8b4f1b1a [ 136.828128] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffffff8e323d20 [ 136.828175] RBP: ffff888101af7540 R08: 1ffffffff1c647a4 R09: fffffbfff1c647a5 [ 136.828224] R10: ffffffff8e323d27 R11: fffffbfff1c647a4 R12: ffff888102920000 [ 136.828272] R13: ffff88810812e428 R14: ffff888102925028 R15: ffff8881029203d0 [ 136.828323] FS: 00007f9049dd8540(0000) GS:ffff888227100000(0000) knlGS:0000000000000000 [ 136.828380] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 136.828425] CR2: 0000000000000010 CR3: 000000010f086001 CR4: 00000000003706e0 [ 136.828474] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 136.828520] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 136.828568] Call Trace: [ 136.828593] <TASK> [ 136.828628] snd_soc_component_probe+0x3a/0x60 [snd_soc_core] [ 136.828981] soc_probe_component+0x276/0x4a0 [snd_soc_core] [ 136.829274] snd_soc_bind_card+0x819/0x13d0 [snd_soc_core] [ 136.829560] ? __kasan_slab_alloc+0x32/0x90 [ 136.829614] snd_soc_register_card+0x24e/0x260 [snd_soc_core] [ 136.829900] devm_snd_soc_register_card+0x48/0x90 [snd_soc_core] [ 136.830204] avs_hdaudio_probe+0x298/0x2c0 [snd_soc_avs_hdaudio] [ 136.830269] platform_probe+0x67/0x100 [ 136.830313] really_probe+0x1ff/0x500 [ 136.830354] __driver_probe_device+0xeb/0x240 [ 136.830397] driver_probe_device+0x4e/0xf0 [ 136.830438] __driver_attach+0xfd/0x210 [ 136.830478] ? __device_attach_driver+0x170/0x170 [ 136.830520] bus_for_each_dev+0xf9/0x150 [ 136.830557] ? subsys_dev_iter_exit+0x10/0x10 [ 136.830597] ? preempt_count_sub+0x18/0xc0 [ 136.830643] driver_attach+0x2d/0x40 [ 136.830679] bus_add_driver+0x28e/0x320 [ 136.830722] driver_register+0xdc/0x170 [ 136.830763] ? 0xffffffffc0428000 [ 136.830796] __platform_driver_register+0x39/0x40 [ 136.830842] avs_hdaudio_driver_init+0x1c/0x1000 [snd_soc_avs_hdaudio] [ 136.830902] do_one_initcall+0xa0/0x2e0 [ 136.830939] ? initcall_blacklisted+0x170/0x170 [ 136.830981] ? __kasan_kmalloc+0x88/0xa0 [ 136.831020] ? kasan_poison+0x3c/0x50 [ 136.831059] ? kasan_unpoison+0x28/0x50 [ 136.831100] ? kasan_poison+0x3c/0x50 [ 136.831139] ? __asan_register_globals+0x5e/0x70 [ 136.831187] do_init_module+0xf6/0x350 [ 136.831228] load_module+0x2bf5/0x2e30 (...)

On Mon, 11 Jul 2022 10:25:17 +0200, Cezary Rojewski wrote:
On 2022-07-09 6:34 PM, Takashi Iwai wrote:
On Wed, 06 Jul 2022 14:02:22 +0200, Cezary Rojewski wrote:
snd_hda_codec_cleanup_for_unbind() unsets preset what interferes with module unloading and triggers null-ptr-deref. Preset is assigned only once, during device/driver matching whereas module reload and unload follow completely different path and may occur several times during runtime.
Hm, the driver reload/unload does unbind. Keeping this field mean to leave the pointer to the possibly freed object, no?
And if it's not cleared, where is this field cleared instead?
avs-driver i.e. the bus driver takes responsibility for the codec device only. There is no real probe(), just the device creation and initialization of its fields. The rest is handled by the component driver (sound/soc/hda.c). If this field is cleared and the test is limited to reloading HDAudio codec module alone, we get a panic. Something similar to the stack found below my message.
In regard to the other question - are presets freed at all? It seems all of them are part of the static device-driver matching list. If so, the pointer is always valid.
When the codec driver is unbound and the module is unloaded, the whole objects and symbols are gone.
[ 136.827856] RIP: 0010:hda_codec_probe+0x16c/0x560 [snd_soc_hda_codec] [ 136.827929] Code: ff 85 c0 0f 88 5b 0b 00 00 4d 8d bc 24 d0 03 00 00 4c 89 ff e8 e5 a2 9e ca 49 8b 9c 24 d0 03 00 00 48 8d 7b 10 e8 d4 a2 9e ca <48> 8b 73 10 4c 89 e7 e8 e8 7d fb ff 85 c0 0f 88 43 0b 00 00 4c 89 [ 136.828028] RSP: 0018:ffff888101af74d0 EFLAGS: 00010286 [ 136.828079] RAX: 0000000000000001 RBX: 0000000000000000 RCX: ffffffff8b4f1b1a [ 136.828128] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffffff8e323d20 [ 136.828175] RBP: ffff888101af7540 R08: 1ffffffff1c647a4 R09: fffffbfff1c647a5 [ 136.828224] R10: ffffffff8e323d27 R11: fffffbfff1c647a4 R12: ffff888102920000 [ 136.828272] R13: ffff88810812e428 R14: ffff888102925028 R15: ffff8881029203d0 [ 136.828323] FS: 00007f9049dd8540(0000) GS:ffff888227100000(0000) knlGS:0000000000000000 [ 136.828380] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 136.828425] CR2: 0000000000000010 CR3: 000000010f086001 CR4: 00000000003706e0 [ 136.828474] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 136.828520] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 136.828568] Call Trace: [ 136.828593] <TASK> [ 136.828628] snd_soc_component_probe+0x3a/0x60 [snd_soc_core] [ 136.828981] soc_probe_component+0x276/0x4a0 [snd_soc_core] [ 136.829274] snd_soc_bind_card+0x819/0x13d0 [snd_soc_core] [ 136.829560] ? __kasan_slab_alloc+0x32/0x90 [ 136.829614] snd_soc_register_card+0x24e/0x260 [snd_soc_core] [ 136.829900] devm_snd_soc_register_card+0x48/0x90 [snd_soc_core] [ 136.830204] avs_hdaudio_probe+0x298/0x2c0 [snd_soc_avs_hdaudio] [ 136.830269] platform_probe+0x67/0x100 [ 136.830313] really_probe+0x1ff/0x500 [ 136.830354] __driver_probe_device+0xeb/0x240 [ 136.830397] driver_probe_device+0x4e/0xf0 [ 136.830438] __driver_attach+0xfd/0x210 [ 136.830478] ? __device_attach_driver+0x170/0x170 [ 136.830520] bus_for_each_dev+0xf9/0x150 [ 136.830557] ? subsys_dev_iter_exit+0x10/0x10 [ 136.830597] ? preempt_count_sub+0x18/0xc0 [ 136.830643] driver_attach+0x2d/0x40 [ 136.830679] bus_add_driver+0x28e/0x320 [ 136.830722] driver_register+0xdc/0x170 [ 136.830763] ? 0xffffffffc0428000 [ 136.830796] __platform_driver_register+0x39/0x40 [ 136.830842] avs_hdaudio_driver_init+0x1c/0x1000 [snd_soc_avs_hdaudio] [ 136.830902] do_one_initcall+0xa0/0x2e0 [ 136.830939] ? initcall_blacklisted+0x170/0x170 [ 136.830981] ? __kasan_kmalloc+0x88/0xa0 [ 136.831020] ? kasan_poison+0x3c/0x50 [ 136.831059] ? kasan_unpoison+0x28/0x50 [ 136.831100] ? kasan_poison+0x3c/0x50 [ 136.831139] ? __asan_register_globals+0x5e/0x70 [ 136.831187] do_init_module+0xf6/0x350 [ 136.831228] load_module+0x2bf5/0x2e30 (...)
Hmm, in the Oops above, at which moment, snd_hda_codec_cleanup_for_unbind() is called via which function? Is it the unload of HD-audio codec driver during the probe of AVS HD-audio?
The preset is assigned to the given HD-audio device object for the attached codec driver. Once after the codec driver gets unbound, you must not access to this codec driver's methods any longer, hence we clear the preset field.
So I wonder how the access to the codec->preset happens after the codec unbind.
thanks,
Takashi

On 2022-07-11 4:12 PM, Takashi Iwai wrote:
On Mon, 11 Jul 2022 10:25:17 +0200, Cezary Rojewski wrote:
...
avs-driver i.e. the bus driver takes responsibility for the codec device only. There is no real probe(), just the device creation and initialization of its fields. The rest is handled by the component driver (sound/soc/hda.c). If this field is cleared and the test is limited to reloading HDAudio codec module alone, we get a panic. Something similar to the stack found below my message.
In regard to the other question - are presets freed at all? It seems all of them are part of the static device-driver matching list. If so, the pointer is always valid.
When the codec driver is unbound and the module is unloaded, the whole objects and symbols are gone.
hda_codec_driver_remove() won't get even called when soc-card is being unbound so everything is still here.
[ 136.827856] RIP: 0010:hda_codec_probe+0x16c/0x560 [snd_soc_hda_codec]
[ 136.828568] Call Trace: [ 136.828593] <TASK> [ 136.828628] snd_soc_component_probe+0x3a/0x60 [snd_soc_core] [ 136.828981] soc_probe_component+0x276/0x4a0 [snd_soc_core] [ 136.829274] snd_soc_bind_card+0x819/0x13d0 [snd_soc_core] [ 136.829560] ? __kasan_slab_alloc+0x32/0x90 [ 136.829614] snd_soc_register_card+0x24e/0x260 [snd_soc_core] [ 136.829900] devm_snd_soc_register_card+0x48/0x90 [snd_soc_core] [ 136.830204] avs_hdaudio_probe+0x298/0x2c0 [snd_soc_avs_hdaudio]
(...)
Hmm, in the Oops above, at which moment, snd_hda_codec_cleanup_for_unbind() is called via which function? Is it the unload of HD-audio codec driver during the probe of AVS HD-audio?
The preset is assigned to the given HD-audio device object for the attached codec driver. Once after the codec driver gets unbound, you must not access to this codec driver's methods any longer, hence we clear the preset field.
So I wonder how the access to the codec->preset happens after the codec unbind.
Test scenario: - enumerate avs-driver stack on machine with HDAudio codec present - rmmod snd_soc_avs_hdaudio // just the machine board driver i.e. soc-card driver - modprobe snd_soc_avs_hdaudio
panic <<<
snd_hda_codec_cleanup_for_unbind() is called in more places than just HDAudio codec driver's probe() and remove(). It's also called whenever HDAudio codec soc-component is being removed. Relevant part of the stack showing when does the cleanup function get called during rmmod:
[ 220.549349] snd_hda_codec_cleanup_for_unbind+0x25/0x451 [snd_hda_codec] [ 220.549536] ? dump_stack_lvl+0x45/0x49 [ 220.549568] hda_codec_remove.cold+0x14/0x138 [snd_soc_hda_codec] [ 220.549609] snd_soc_component_remove+0x34/0x40 [snd_soc_core] [ 220.549942] soc_remove_component+0x113/0x120 [snd_soc_core] [ 220.550249] soc_cleanup_card_resources+0x1a7/0x4a0 [snd_soc_core] [ 220.550561] snd_soc_unbind_card+0x9e/0x190 [snd_soc_core] [ 220.550885] snd_soc_unregister_card+0x28/0x80 [snd_soc_core] [ 220.551193] devm_card_release+0x1d/0x20 [snd_soc_core] [ 220.551527] release_nodes+0x73/0x170 [ 220.551549] ? preempt_count_sub+0x18/0xc0 [ 220.551576] devres_release_all+0x10a/0x150 [ 220.551600] ? devres_remove_group+0x260/0x260 [ 220.551630] device_unbind_cleanup+0x14/0xd0 [ 220.551656] device_release_driver_internal+0x146/0x1d0 [ 220.551688] driver_detach+0x81/0xf0 [ 220.551716] bus_remove_driver+0xae/0x170 [ 220.551743] driver_unregister+0x4d/0x70 [ 220.551770] platform_driver_unregister+0x12/0x20 [ 220.551799] avs_hdaudio_driver_exit+0x10/0x12 [snd_soc_avs_hdaudio]
Regards, Czarek

On Tue, 12 Jul 2022 11:42:56 +0200, Cezary Rojewski wrote:
On 2022-07-11 4:12 PM, Takashi Iwai wrote:
On Mon, 11 Jul 2022 10:25:17 +0200, Cezary Rojewski wrote:
...
avs-driver i.e. the bus driver takes responsibility for the codec device only. There is no real probe(), just the device creation and initialization of its fields. The rest is handled by the component driver (sound/soc/hda.c). If this field is cleared and the test is limited to reloading HDAudio codec module alone, we get a panic. Something similar to the stack found below my message.
In regard to the other question - are presets freed at all? It seems all of them are part of the static device-driver matching list. If so, the pointer is always valid.
When the codec driver is unbound and the module is unloaded, the whole objects and symbols are gone.
hda_codec_driver_remove() won't get even called when soc-card is being unbound so everything is still here.
[ 136.827856] RIP: 0010:hda_codec_probe+0x16c/0x560 [snd_soc_hda_codec]
[ 136.828568] Call Trace: [ 136.828593] <TASK> [ 136.828628] snd_soc_component_probe+0x3a/0x60 [snd_soc_core] [ 136.828981] soc_probe_component+0x276/0x4a0 [snd_soc_core] [ 136.829274] snd_soc_bind_card+0x819/0x13d0 [snd_soc_core] [ 136.829560] ? __kasan_slab_alloc+0x32/0x90 [ 136.829614] snd_soc_register_card+0x24e/0x260 [snd_soc_core] [ 136.829900] devm_snd_soc_register_card+0x48/0x90 [snd_soc_core] [ 136.830204] avs_hdaudio_probe+0x298/0x2c0 [snd_soc_avs_hdaudio]
(...)
Hmm, in the Oops above, at which moment, snd_hda_codec_cleanup_for_unbind() is called via which function? Is it the unload of HD-audio codec driver during the probe of AVS HD-audio?
The preset is assigned to the given HD-audio device object for the attached codec driver. Once after the codec driver gets unbound, you must not access to this codec driver's methods any longer, hence we clear the preset field.
So I wonder how the access to the codec->preset happens after the codec unbind.
Test scenario:
- enumerate avs-driver stack on machine with HDAudio codec present
- rmmod snd_soc_avs_hdaudio // just the machine board driver
i.e. soc-card driver
- modprobe snd_soc_avs_hdaudio
panic <<<
snd_hda_codec_cleanup_for_unbind() is called in more places than just HDAudio codec driver's probe() and remove(). It's also called whenever HDAudio codec soc-component is being removed. Relevant part of the stack showing when does the cleanup function get called during rmmod:
[ 220.549349] snd_hda_codec_cleanup_for_unbind+0x25/0x451 [snd_hda_codec] [ 220.549536] ? dump_stack_lvl+0x45/0x49 [ 220.549568] hda_codec_remove.cold+0x14/0x138 [snd_soc_hda_codec] [ 220.549609] snd_soc_component_remove+0x34/0x40 [snd_soc_core] [ 220.549942] soc_remove_component+0x113/0x120 [snd_soc_core] [ 220.550249] soc_cleanup_card_resources+0x1a7/0x4a0 [snd_soc_core] [ 220.550561] snd_soc_unbind_card+0x9e/0x190 [snd_soc_core] [ 220.550885] snd_soc_unregister_card+0x28/0x80 [snd_soc_core] [ 220.551193] devm_card_release+0x1d/0x20 [snd_soc_core] [ 220.551527] release_nodes+0x73/0x170 [ 220.551549] ? preempt_count_sub+0x18/0xc0 [ 220.551576] devres_release_all+0x10a/0x150 [ 220.551600] ? devres_remove_group+0x260/0x260 [ 220.551630] device_unbind_cleanup+0x14/0xd0 [ 220.551656] device_release_driver_internal+0x146/0x1d0 [ 220.551688] driver_detach+0x81/0xf0 [ 220.551716] bus_remove_driver+0xae/0x170 [ 220.551743] driver_unregister+0x4d/0x70 [ 220.551770] platform_driver_unregister+0x12/0x20 [ 220.551799] avs_hdaudio_driver_exit+0x10/0x12 [snd_soc_avs_hdaudio]
So, IMO, you're scratching a wrong surface. The problem is rather that snd_hda_codec_cleanup_for_unbind() is called even if it's not for unbinding the codec.
Takashi

On 2022-07-12 12:46 PM, Takashi Iwai wrote:
On Tue, 12 Jul 2022 11:42:56 +0200, Cezary Rojewski wrote:
...
snd_hda_codec_cleanup_for_unbind() is called in more places than just HDAudio codec driver's probe() and remove(). It's also called whenever HDAudio codec soc-component is being removed. Relevant part of the stack showing when does the cleanup function get called during rmmod:
[ 220.549349] snd_hda_codec_cleanup_for_unbind+0x25/0x451 [snd_hda_codec] [ 220.549536] ? dump_stack_lvl+0x45/0x49 [ 220.549568] hda_codec_remove.cold+0x14/0x138 [snd_soc_hda_codec] [ 220.549609] snd_soc_component_remove+0x34/0x40 [snd_soc_core] [ 220.549942] soc_remove_component+0x113/0x120 [snd_soc_core] [ 220.550249] soc_cleanup_card_resources+0x1a7/0x4a0 [snd_soc_core] [ 220.550561] snd_soc_unbind_card+0x9e/0x190 [snd_soc_core] [ 220.550885] snd_soc_unregister_card+0x28/0x80 [snd_soc_core] [ 220.551193] devm_card_release+0x1d/0x20 [snd_soc_core] [ 220.551527] release_nodes+0x73/0x170 [ 220.551549] ? preempt_count_sub+0x18/0xc0 [ 220.551576] devres_release_all+0x10a/0x150 [ 220.551600] ? devres_remove_group+0x260/0x260 [ 220.551630] device_unbind_cleanup+0x14/0xd0 [ 220.551656] device_release_driver_internal+0x146/0x1d0 [ 220.551688] driver_detach+0x81/0xf0 [ 220.551716] bus_remove_driver+0xae/0x170 [ 220.551743] driver_unregister+0x4d/0x70 [ 220.551770] platform_driver_unregister+0x12/0x20 [ 220.551799] avs_hdaudio_driver_exit+0x10/0x12 [snd_soc_avs_hdaudio]
So, IMO, you're scratching a wrong surface. The problem is rather that snd_hda_codec_cleanup_for_unbind() is called even if it's not for unbinding the codec.
This is how ASoC HDAudio codec component was behaving for years, see sound/soc/codecs/hdac_hda.c. If the intention is _not_ do call snd_hda_codec_cleanup_for_unbind() then the teardown procedure will probably need a little update. Actually.. I'm afraid the init one would need an update to. Given how the implementation of HDAudio codec component's probe() and remove() looks like, there is no dropping the cleanup function without changing the upward path accordingly.
Well, to be honest the init/free procedures of HDAudio codec are a little hairy, perhaps it's time to address this.
Regards, Czarek

On Tue, 12 Jul 2022 12:58:09 +0200, Cezary Rojewski wrote:
On 2022-07-12 12:46 PM, Takashi Iwai wrote:
On Tue, 12 Jul 2022 11:42:56 +0200, Cezary Rojewski wrote:
...
snd_hda_codec_cleanup_for_unbind() is called in more places than just HDAudio codec driver's probe() and remove(). It's also called whenever HDAudio codec soc-component is being removed. Relevant part of the stack showing when does the cleanup function get called during rmmod:
[ 220.549349] snd_hda_codec_cleanup_for_unbind+0x25/0x451 [snd_hda_codec] [ 220.549536] ? dump_stack_lvl+0x45/0x49 [ 220.549568] hda_codec_remove.cold+0x14/0x138 [snd_soc_hda_codec] [ 220.549609] snd_soc_component_remove+0x34/0x40 [snd_soc_core] [ 220.549942] soc_remove_component+0x113/0x120 [snd_soc_core] [ 220.550249] soc_cleanup_card_resources+0x1a7/0x4a0 [snd_soc_core] [ 220.550561] snd_soc_unbind_card+0x9e/0x190 [snd_soc_core] [ 220.550885] snd_soc_unregister_card+0x28/0x80 [snd_soc_core] [ 220.551193] devm_card_release+0x1d/0x20 [snd_soc_core] [ 220.551527] release_nodes+0x73/0x170 [ 220.551549] ? preempt_count_sub+0x18/0xc0 [ 220.551576] devres_release_all+0x10a/0x150 [ 220.551600] ? devres_remove_group+0x260/0x260 [ 220.551630] device_unbind_cleanup+0x14/0xd0 [ 220.551656] device_release_driver_internal+0x146/0x1d0 [ 220.551688] driver_detach+0x81/0xf0 [ 220.551716] bus_remove_driver+0xae/0x170 [ 220.551743] driver_unregister+0x4d/0x70 [ 220.551770] platform_driver_unregister+0x12/0x20 [ 220.551799] avs_hdaudio_driver_exit+0x10/0x12 [snd_soc_avs_hdaudio]
So, IMO, you're scratching a wrong surface. The problem is rather that snd_hda_codec_cleanup_for_unbind() is called even if it's not for unbinding the codec.
This is how ASoC HDAudio codec component was behaving for years, see sound/soc/codecs/hdac_hda.c. If the intention is _not_ do call snd_hda_codec_cleanup_for_unbind() then the teardown procedure will probably need a little update.
Do we see a similar crash with the hdac-hda stuff, too?
And, after avs_hdaudio_driver_exit() is called, why the codec object still remains bound with the HD-audio (Realtek or whatever) codec driver?
Actually.. I'm afraid the init one would need an update to. Given how the implementation of HDAudio codec component's probe() and remove() looks like, there is no dropping the cleanup function without changing the upward path accordingly.
Well, to be honest the init/free procedures of HDAudio codec are a little hairy, perhaps it's time to address this.
Admittedly, the plumbing work for ASoC HD-audio was somewhat messy, and it's fine if we can clean things up.
snd_hda_codec_cleanup_for_unbind() is certainly written for unbinding the codec driver, and if a part of that function code is needed for different purposes, it should be factored out properly, at least.
thanks,
Takashi

On 2022-07-15 4:55 PM, Takashi Iwai wrote:
On Tue, 12 Jul 2022 12:58:09 +0200, Cezary Rojewski wrote:
This is how ASoC HDAudio codec component was behaving for years, see sound/soc/codecs/hdac_hda.c. If the intention is _not_ do call snd_hda_codec_cleanup_for_unbind() then the teardown procedure will probably need a little update.
Do we see a similar crash with the hdac-hda stuff, too?
And, after avs_hdaudio_driver_exit() is called, why the codec object still remains bound with the HD-audio (Realtek or whatever) codec driver?
Hello Takashi,
Your reply was somehow missed by me and shows as a review for patch 5/9 in my email-client. Sorry for the delay.
In regard to the hdac_hda.c question, we did test reloading for the skylake-driver and there are several places where the driver can cause panics, that is, it may not even get to hdac_hda failing - some other panic will pop up faster.
But yes, the exact same problem exists there as both implementations handle hdev_attach/detach() and component's probe/remove() is similar fashion.
Actually.. I'm afraid the init one would need an update to. Given how the implementation of HDAudio codec component's probe() and remove() looks like, there is no dropping the cleanup function without changing the upward path accordingly.
Well, to be honest the init/free procedures of HDAudio codec are a little hairy, perhaps it's time to address this.
Admittedly, the plumbing work for ASoC HD-audio was somewhat messy, and it's fine if we can clean things up.
snd_hda_codec_cleanup_for_unbind() is certainly written for unbinding the codec driver, and if a part of that function code is needed for different purposes, it should be factored out properly, at least.
On ASoC side, component->probe() and component->remove() basically mimic the behavior of hda_codec_driver_probe/remove() found in sound/pci/hda/hda_bind.c. As ASoC sound card may be unbound without codec device being actually removed from the system, relying solely on driver's (not component's) probe/remove() may not be an option.
So, the discussion does not circle around just snd_hda_codec_cleanup_for_unbind() but basically any function that takes part in driver's probe() and remove() routines.
Right now, we are in a situation where user can generate a panic with a single rmmod. Also, our tests show no regression with modprobe/rmmod on snd_hda_intel side with this patch applied.
Regards, Czarek

On Tue, 17 Jan 2023 15:45:17 +0100, Cezary Rojewski wrote:
On 2022-07-15 4:55 PM, Takashi Iwai wrote:
On Tue, 12 Jul 2022 12:58:09 +0200, Cezary Rojewski wrote:
This is how ASoC HDAudio codec component was behaving for years, see sound/soc/codecs/hdac_hda.c. If the intention is _not_ do call snd_hda_codec_cleanup_for_unbind() then the teardown procedure will probably need a little update.
Do we see a similar crash with the hdac-hda stuff, too?
And, after avs_hdaudio_driver_exit() is called, why the codec object still remains bound with the HD-audio (Realtek or whatever) codec driver?
Hello Takashi,
Your reply was somehow missed by me and shows as a review for patch 5/9 in my email-client. Sorry for the delay.
In regard to the hdac_hda.c question, we did test reloading for the skylake-driver and there are several places where the driver can cause panics, that is, it may not even get to hdac_hda failing - some other panic will pop up faster.
But yes, the exact same problem exists there as both implementations handle hdev_attach/detach() and component's probe/remove() is similar fashion.
Actually.. I'm afraid the init one would need an update to. Given how the implementation of HDAudio codec component's probe() and remove() looks like, there is no dropping the cleanup function without changing the upward path accordingly.
Well, to be honest the init/free procedures of HDAudio codec are a little hairy, perhaps it's time to address this.
Admittedly, the plumbing work for ASoC HD-audio was somewhat messy, and it's fine if we can clean things up.
snd_hda_codec_cleanup_for_unbind() is certainly written for unbinding the codec driver, and if a part of that function code is needed for different purposes, it should be factored out properly, at least.
On ASoC side, component->probe() and component->remove() basically mimic the behavior of hda_codec_driver_probe/remove() found in sound/pci/hda/hda_bind.c. As ASoC sound card may be unbound without codec device being actually removed from the system, relying solely on driver's (not component's) probe/remove() may not be an option.
So, the discussion does not circle around just snd_hda_codec_cleanup_for_unbind() but basically any function that takes part in driver's probe() and remove() routines.
Right now, we are in a situation where user can generate a panic with a single rmmod. Also, our tests show no regression with modprobe/rmmod on snd_hda_intel side with this patch applied.
Let's focus on that bug, then. Can we restart the thread with the minimal change? Otherwise it's hard to review and discuss further.
thanks,
Takashi

If snd_hda_hdmi_codec module is denylisted and any event causes i915 enumeration to fail, is_likely_hdmi_codec() ends in null-ptr-deref.
As snd_soc_hda is an ASoC-based driver, its initialization is delayed until all the necessary components appear in the system - allowing actual sound card to enumerate. snd_hda_codec_configure() gets called by the avs-driver core during probe_codecs() but the snd_hda_codec_device_new(), necessary to complete codecs initialization, happens only when codec-component of hda sound card is being probed.
Denylisting snd_hda_codec_hdmi module causes snd_hda_codec_configure() to reach: codec_bind_generic() -> is_likely_hdmi_codec() which makes use of ->wcaps and at this point the it isn't initialized yet - again, requires completion of snd_hda_codec_device_new().
Signed-off-by: Cezary Rojewski cezary.rojewski@intel.com --- sound/pci/hda/hda_bind.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/sound/pci/hda/hda_bind.c b/sound/pci/hda/hda_bind.c index c572fb5886d5..cae9a975cbcc 100644 --- a/sound/pci/hda/hda_bind.c +++ b/sound/pci/hda/hda_bind.c @@ -248,6 +248,13 @@ static bool is_likely_hdmi_codec(struct hda_codec *codec) { hda_nid_t nid;
+ /* + * For ASoC users, if snd_hda_hdmi_codec module is denylisted and any + * event causes i915 enumeration to fail, ->wcaps remains uninitialized. + */ + if (!codec->wcaps) + return true; + for_each_hda_codec_node(nid, codec) { unsigned int wcaps = get_wcaps(codec, nid); switch (get_wcaps_type(wcaps)) {

AVS HDAudio bus driver does not tie with codec drivers tighly and snd_hda_codec_device_new() can be called after codec's module reload. In such case, rpm is forbidden and invoking pm_runtime_forbid() unconditionally causes device's usage_count to become unbalanced. This is later caught by WARN_ON() found in sound/soc/hda.c. Detect such circumstance and bump the usage_count instead.
Signed-off-by: Cezary Rojewski cezary.rojewski@intel.com --- sound/pci/hda/hda_codec.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/sound/pci/hda/hda_codec.c b/sound/pci/hda/hda_codec.c index 9ceb642ac819..83d954ab056f 100644 --- a/sound/pci/hda/hda_codec.c +++ b/sound/pci/hda/hda_codec.c @@ -1044,8 +1044,14 @@ int snd_hda_codec_device_new(struct hda_bus *bus, struct snd_card *card, goto error; }
+#ifdef CONFIG_PM /* PM runtime needs to be enabled later after binding codec */ - pm_runtime_forbid(&codec->core.dev); + if (codec->core.dev.power.runtime_auto) + pm_runtime_forbid(&codec->core.dev); + else + /* Keep the usage_count consistent across subsequent probing */ + pm_runtime_get_noresume(&codec->core.dev); +#endif
return 0;

AVS HDAudio bus driver does not tie with codec drivers tighly. Codec device and its respective driver cleanup procedures are split and may not occur one after the other. Device cleanup is performed only on snd_hdac_ext_bus_device_remove() i.e. it's the bus driver's responsibility. If codec component probing fails, put_device() found in snd_hda_codec_device_new() may lead to page fault. Relocate it to snd_hda_codec_new() to address the problem on ASoC side while keeping status quo for snd_hda_intel.
Signed-off-by: Cezary Rojewski cezary.rojewski@intel.com --- sound/pci/hda/hda_codec.c | 23 +++++++++++------------ 1 file changed, 11 insertions(+), 12 deletions(-)
diff --git a/sound/pci/hda/hda_codec.c b/sound/pci/hda/hda_codec.c index 83d954ab056f..2381aced492f 100644 --- a/sound/pci/hda/hda_codec.c +++ b/sound/pci/hda/hda_codec.c @@ -949,6 +949,7 @@ int snd_hda_codec_new(struct hda_bus *bus, struct snd_card *card, unsigned int codec_addr, struct hda_codec **codecp) { struct hda_codec *codec; + int ret;
codec = snd_hda_codec_device_init(bus, codec_addr, "hdaudioC%dD%d", card->number, codec_addr); @@ -956,7 +957,11 @@ int snd_hda_codec_new(struct hda_bus *bus, struct snd_card *card, return PTR_ERR(codec); *codecp = codec;
- return snd_hda_codec_device_new(bus, card, codec_addr, *codecp, true); + ret = snd_hda_codec_device_new(bus, card, codec_addr, *codecp, true); + if (ret) + put_device(hda_codec_dev(*codecp)); + + return ret; } EXPORT_SYMBOL_GPL(snd_hda_codec_new);
@@ -1011,19 +1016,17 @@ int snd_hda_codec_device_new(struct hda_bus *bus, struct snd_card *card,
if (codec->bus->modelname) { codec->modelname = kstrdup(codec->bus->modelname, GFP_KERNEL); - if (!codec->modelname) { - err = -ENOMEM; - goto error; - } + if (!codec->modelname) + return -ENOMEM; }
fg = codec->core.afg ? codec->core.afg : codec->core.mfg; err = read_widget_caps(codec, fg); if (err < 0) - goto error; + return err; err = read_pin_defaults(codec); if (err < 0) - goto error; + return err;
/* power-up all before initialization */ hda_set_power_state(codec, AC_PWRST_D0); @@ -1041,7 +1044,7 @@ int snd_hda_codec_device_new(struct hda_bus *bus, struct snd_card *card, /* ASoC features component management instead */ err = snd_device_new(card, SNDRV_DEV_CODEC, codec, &dev_ops); if (err < 0) - goto error; + return err; }
#ifdef CONFIG_PM @@ -1054,10 +1057,6 @@ int snd_hda_codec_device_new(struct hda_bus *bus, struct snd_card *card, #endif
return 0; - - error: - put_device(hda_codec_dev(codec)); - return err; } EXPORT_SYMBOL_GPL(snd_hda_codec_device_new);

When codec is unbound but not yet removed, in the eyes of snd_hdac_bus_process_unsol_events() it is still a valid target to delegate work to. Such behaviour may lead to use-after-free errors. Address by verifying if codec is actually registered.
Signed-off-by: Cezary Rojewski cezary.rojewski@intel.com --- include/sound/hda_codec.h | 1 - include/sound/hdaudio.h | 1 + sound/hda/hdac_bus.c | 2 +- sound/pci/hda/hda_codec.c | 10 +++++----- sound/soc/codecs/hda.c | 4 ++-- 5 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/include/sound/hda_codec.h b/include/sound/hda_codec.h index b7be300b6b18..6d3c82c4b6ac 100644 --- a/include/sound/hda_codec.h +++ b/include/sound/hda_codec.h @@ -231,7 +231,6 @@ struct hda_codec { /* misc flags */ unsigned int configured:1; /* codec was configured */ unsigned int in_freeing:1; /* being released */ - unsigned int registered:1; /* codec was registered */ unsigned int display_power_control:1; /* needs display power */ unsigned int spdif_status_reset :1; /* needs to toggle SPDIF for each * status change diff --git a/include/sound/hdaudio.h b/include/sound/hdaudio.h index 15f15075238d..797bf67a164d 100644 --- a/include/sound/hdaudio.h +++ b/include/sound/hdaudio.h @@ -93,6 +93,7 @@ struct hdac_device { bool lazy_cache:1; /* don't wake up for writes */ bool caps_overwriting:1; /* caps overwrite being in process */ bool cache_coef:1; /* cache COEF read/write too */ + unsigned int registered:1; /* codec was registered */ };
/* device/driver type used for matching */ diff --git a/sound/hda/hdac_bus.c b/sound/hda/hdac_bus.c index 71db8592b33d..d497414a5538 100644 --- a/sound/hda/hdac_bus.c +++ b/sound/hda/hdac_bus.c @@ -183,7 +183,7 @@ static void snd_hdac_bus_process_unsol_events(struct work_struct *work) if (!(caddr & (1 << 4))) /* no unsolicited event? */ continue; codec = bus->caddr_tbl[caddr & 0x0f]; - if (!codec || !codec->dev.driver) + if (!codec || !codec->registered) continue; spin_unlock_irq(&bus->reg_lock); drv = drv_to_hdac_driver(codec->dev.driver); diff --git a/sound/pci/hda/hda_codec.c b/sound/pci/hda/hda_codec.c index 2381aced492f..75e85bf58681 100644 --- a/sound/pci/hda/hda_codec.c +++ b/sound/pci/hda/hda_codec.c @@ -772,11 +772,11 @@ static void codec_release_pcms(struct hda_codec *codec) */ void snd_hda_codec_cleanup_for_unbind(struct hda_codec *codec) { - if (codec->registered) { + if (codec->core.registered) { /* pm_runtime_put() is called in snd_hdac_device_exit() */ pm_runtime_get_noresume(hda_codec_dev(codec)); pm_runtime_disable(hda_codec_dev(codec)); - codec->registered = 0; + codec->core.registered = 0; }
snd_hda_codec_disconnect_pcms(codec); @@ -824,14 +824,14 @@ void snd_hda_codec_display_power(struct hda_codec *codec, bool enable) */ void snd_hda_codec_register(struct hda_codec *codec) { - if (codec->registered) + if (codec->core.registered) return; if (device_is_registered(hda_codec_dev(codec))) { snd_hda_codec_display_power(codec, true); pm_runtime_enable(hda_codec_dev(codec)); /* it was powered up in snd_hda_codec_new(), now all done */ snd_hda_power_down(codec); - codec->registered = 1; + codec->core.registered = 1; } } EXPORT_SYMBOL_GPL(snd_hda_codec_register); @@ -3047,7 +3047,7 @@ void snd_hda_codec_shutdown(struct hda_codec *codec) struct hda_pcm *cpcm;
/* Skip the shutdown if codec is not registered */ - if (!codec->registered) + if (!codec->core.registered) return;
cancel_delayed_work_sync(&codec->jackpoll_work); diff --git a/sound/soc/codecs/hda.c b/sound/soc/codecs/hda.c index edcb8bc6806b..ad20a3dff9b7 100644 --- a/sound/soc/codecs/hda.c +++ b/sound/soc/codecs/hda.c @@ -274,7 +274,7 @@ static void hda_codec_remove(struct snd_soc_component *component) struct hdac_device *hdev = &codec->core; struct hdac_bus *bus = hdev->bus; struct hdac_ext_link *hlink; - bool was_registered = codec->registered; + bool was_registered = codec->core.registered;
/* Don't allow any more runtime suspends */ pm_runtime_forbid(&hdev->dev); @@ -376,7 +376,7 @@ static int hda_hdev_detach(struct hdac_device *hdev) { struct hda_codec *codec = dev_to_hda_codec(&hdev->dev);
- if (codec->registered) + if (codec->core.registered) cancel_delayed_work_sync(&codec->jackpoll_work);
snd_soc_unregister_component(&hdev->dev);

On Wed, 06 Jul 2022 14:02:26 +0200, Cezary Rojewski wrote:
When codec is unbound but not yet removed, in the eyes of snd_hdac_bus_process_unsol_events() it is still a valid target to delegate work to. Such behaviour may lead to use-after-free errors. Address by verifying if codec is actually registered.
Signed-off-by: Cezary Rojewski cezary.rojewski@intel.com
include/sound/hda_codec.h | 1 - include/sound/hdaudio.h | 1 + sound/hda/hdac_bus.c | 2 +- sound/pci/hda/hda_codec.c | 10 +++++----- sound/soc/codecs/hda.c | 4 ++--
This patch doesn't apply cleanly because the change in sound/soc/codecs/hda.c still not included in my tree.
Mark, there's yet one case we need the merge of asoc-next :)
thanks,
Takashi

On Sat, 09 Jul 2022 18:47:41 +0200, Takashi Iwai wrote:
On Wed, 06 Jul 2022 14:02:26 +0200, Cezary Rojewski wrote:
When codec is unbound but not yet removed, in the eyes of snd_hdac_bus_process_unsol_events() it is still a valid target to delegate work to. Such behaviour may lead to use-after-free errors. Address by verifying if codec is actually registered.
Signed-off-by: Cezary Rojewski cezary.rojewski@intel.com
include/sound/hda_codec.h | 1 - include/sound/hdaudio.h | 1 + sound/hda/hdac_bus.c | 2 +- sound/pci/hda/hda_codec.c | 10 +++++----- sound/soc/codecs/hda.c | 4 ++--
This patch doesn't apply cleanly because the change in sound/soc/codecs/hda.c still not included in my tree.
Now merged to for-next branch.
thanks,
Takashi

If early probe of HDAudio bus driver fails e.g.: due to missing firmware file, snd_hda_codec_shutdown() ends in manipulating uninitialized codec->pcm_list_head causing page fault.
Iinitialization of HDAudio codec in ASoC is split in two: - snd_hda_codec_device_init() - snd_hda_codec_device_new()
snd_hda_codec_device_init() is called during probe_codecs() by HDAudio bus driver while snd_hda_codec_device_new() is called by codec-component's ->probe(). The second call will not happen until all components required by related sound card are present within the ASoC framework. With firmware failing to load during the PCI's deferred initialization i.e.: probe_work(), no platform components are ever registered. HDAudio codec enumeration is done at that point though, so the codec components became registered to ASoC framework, calling snd_hda_codec_device_init() in the process.
Now, during platform reboot snd_hda_codec_shutdown() is called for every codec found on the HDAudio bus causing oops if any of them has not completed both of their initialization steps. Relocating field initialization fixes the issue.
Signed-off-by: Cezary Rojewski cezary.rojewski@intel.com --- sound/pci/hda/hda_codec.c | 41 +++++++++++++++++++-------------------- 1 file changed, 20 insertions(+), 21 deletions(-)
diff --git a/sound/pci/hda/hda_codec.c b/sound/pci/hda/hda_codec.c index 75e85bf58681..677d0a78f19c 100644 --- a/sound/pci/hda/hda_codec.c +++ b/sound/pci/hda/hda_codec.c @@ -930,8 +930,28 @@ snd_hda_codec_device_init(struct hda_bus *bus, unsigned int codec_addr, }
codec->bus = bus; + codec->depop_delay = -1; + codec->fixup_id = HDA_FIXUP_ID_NOT_SET; + codec->core.dev.release = snd_hda_codec_dev_release; + codec->core.exec_verb = codec_exec_verb; codec->core.type = HDA_DEV_LEGACY;
+ mutex_init(&codec->spdif_mutex); + mutex_init(&codec->control_mutex); + snd_array_init(&codec->mixers, sizeof(struct hda_nid_item), 32); + snd_array_init(&codec->nids, sizeof(struct hda_nid_item), 32); + snd_array_init(&codec->init_pins, sizeof(struct hda_pincfg), 16); + snd_array_init(&codec->driver_pins, sizeof(struct hda_pincfg), 16); + snd_array_init(&codec->cvt_setups, sizeof(struct hda_cvt_setup), 8); + snd_array_init(&codec->spdif_out, sizeof(struct hda_spdif_out), 16); + snd_array_init(&codec->jacktbl, sizeof(struct hda_jack_tbl), 16); + snd_array_init(&codec->verbs, sizeof(struct hda_verb *), 8); + INIT_LIST_HEAD(&codec->conn_list); + INIT_LIST_HEAD(&codec->pcm_list_head); + INIT_DELAYED_WORK(&codec->jackpoll_work, hda_jackpoll_work); + refcount_set(&codec->pcm_ref, 1); + init_waitqueue_head(&codec->remove_sleep); + return codec; } EXPORT_SYMBOL_GPL(snd_hda_codec_device_init); @@ -984,29 +1004,8 @@ int snd_hda_codec_device_new(struct hda_bus *bus, struct snd_card *card, if (snd_BUG_ON(codec_addr > HDA_MAX_CODEC_ADDRESS)) return -EINVAL;
- codec->core.dev.release = snd_hda_codec_dev_release; - codec->core.exec_verb = codec_exec_verb; - codec->card = card; codec->addr = codec_addr; - mutex_init(&codec->spdif_mutex); - mutex_init(&codec->control_mutex); - snd_array_init(&codec->mixers, sizeof(struct hda_nid_item), 32); - snd_array_init(&codec->nids, sizeof(struct hda_nid_item), 32); - snd_array_init(&codec->init_pins, sizeof(struct hda_pincfg), 16); - snd_array_init(&codec->driver_pins, sizeof(struct hda_pincfg), 16); - snd_array_init(&codec->cvt_setups, sizeof(struct hda_cvt_setup), 8); - snd_array_init(&codec->spdif_out, sizeof(struct hda_spdif_out), 16); - snd_array_init(&codec->jacktbl, sizeof(struct hda_jack_tbl), 16); - snd_array_init(&codec->verbs, sizeof(struct hda_verb *), 8); - INIT_LIST_HEAD(&codec->conn_list); - INIT_LIST_HEAD(&codec->pcm_list_head); - refcount_set(&codec->pcm_ref, 1); - init_waitqueue_head(&codec->remove_sleep); - - INIT_DELAYED_WORK(&codec->jackpoll_work, hda_jackpoll_work); - codec->depop_delay = -1; - codec->fixup_id = HDA_FIXUP_ID_NOT_SET;
#ifdef CONFIG_PM codec->power_jiffies = jiffies;

On 7/6/22 07:02, Cezary Rojewski wrote:
If early probe of HDAudio bus driver fails e.g.: due to missing firmware file, snd_hda_codec_shutdown() ends in manipulating uninitialized codec->pcm_list_head causing page fault.
Iinitialization of HDAudio codec in ASoC is split in two:
- snd_hda_codec_device_init()
- snd_hda_codec_device_new()
snd_hda_codec_device_init() is called during probe_codecs() by HDAudio bus driver while snd_hda_codec_device_new() is called by codec-component's ->probe(). The second call will not happen until all components required by related sound card are present within the ASoC framework. With firmware failing to load during the PCI's deferred initialization i.e.: probe_work(), no platform components are ever registered. HDAudio codec enumeration is done at that point though, so the codec components became registered to ASoC framework, calling snd_hda_codec_device_init() in the process.
Now, during platform reboot snd_hda_codec_shutdown() is called for every codec found on the HDAudio bus causing oops if any of them has not completed both of their initialization steps. Relocating field initialization fixes the issue.
Signed-off-by: Cezary Rojewski cezary.rojewski@intel.com
This patch causes an across-the-board regression on all SOF platforms using an HDaudio or iDISP codec. Only 'nocodec' platforms are unaffected, see results at https://sof-ci.01.org/linuxpr/PR3763/build394/devicetest/
Reverting this commit seems to fix the issue.
Upstream merge: https://github.com/thesofproject/linux/pull/3763
Issue and bisect results https://github.com/thesofproject/linux/issues/3764
I don't know what this was supposed to fix on the shutdown path but it's clearly having side effects on the probe/init path.
sound/pci/hda/hda_codec.c | 41 +++++++++++++++++++-------------------- 1 file changed, 20 insertions(+), 21 deletions(-)
diff --git a/sound/pci/hda/hda_codec.c b/sound/pci/hda/hda_codec.c index 75e85bf58681..677d0a78f19c 100644 --- a/sound/pci/hda/hda_codec.c +++ b/sound/pci/hda/hda_codec.c @@ -930,8 +930,28 @@ snd_hda_codec_device_init(struct hda_bus *bus, unsigned int codec_addr, }
codec->bus = bus;
codec->depop_delay = -1;
codec->fixup_id = HDA_FIXUP_ID_NOT_SET;
codec->core.dev.release = snd_hda_codec_dev_release;
codec->core.exec_verb = codec_exec_verb; codec->core.type = HDA_DEV_LEGACY;
mutex_init(&codec->spdif_mutex);
mutex_init(&codec->control_mutex);
snd_array_init(&codec->mixers, sizeof(struct hda_nid_item), 32);
snd_array_init(&codec->nids, sizeof(struct hda_nid_item), 32);
snd_array_init(&codec->init_pins, sizeof(struct hda_pincfg), 16);
snd_array_init(&codec->driver_pins, sizeof(struct hda_pincfg), 16);
snd_array_init(&codec->cvt_setups, sizeof(struct hda_cvt_setup), 8);
snd_array_init(&codec->spdif_out, sizeof(struct hda_spdif_out), 16);
snd_array_init(&codec->jacktbl, sizeof(struct hda_jack_tbl), 16);
snd_array_init(&codec->verbs, sizeof(struct hda_verb *), 8);
INIT_LIST_HEAD(&codec->conn_list);
INIT_LIST_HEAD(&codec->pcm_list_head);
INIT_DELAYED_WORK(&codec->jackpoll_work, hda_jackpoll_work);
refcount_set(&codec->pcm_ref, 1);
init_waitqueue_head(&codec->remove_sleep);
return codec;
} EXPORT_SYMBOL_GPL(snd_hda_codec_device_init); @@ -984,29 +1004,8 @@ int snd_hda_codec_device_new(struct hda_bus *bus, struct snd_card *card, if (snd_BUG_ON(codec_addr > HDA_MAX_CODEC_ADDRESS)) return -EINVAL;
- codec->core.dev.release = snd_hda_codec_dev_release;
- codec->core.exec_verb = codec_exec_verb;
- codec->card = card; codec->addr = codec_addr;
- mutex_init(&codec->spdif_mutex);
- mutex_init(&codec->control_mutex);
- snd_array_init(&codec->mixers, sizeof(struct hda_nid_item), 32);
- snd_array_init(&codec->nids, sizeof(struct hda_nid_item), 32);
- snd_array_init(&codec->init_pins, sizeof(struct hda_pincfg), 16);
- snd_array_init(&codec->driver_pins, sizeof(struct hda_pincfg), 16);
- snd_array_init(&codec->cvt_setups, sizeof(struct hda_cvt_setup), 8);
- snd_array_init(&codec->spdif_out, sizeof(struct hda_spdif_out), 16);
- snd_array_init(&codec->jacktbl, sizeof(struct hda_jack_tbl), 16);
- snd_array_init(&codec->verbs, sizeof(struct hda_verb *), 8);
- INIT_LIST_HEAD(&codec->conn_list);
- INIT_LIST_HEAD(&codec->pcm_list_head);
- refcount_set(&codec->pcm_ref, 1);
- init_waitqueue_head(&codec->remove_sleep);
- INIT_DELAYED_WORK(&codec->jackpoll_work, hda_jackpoll_work);
- codec->depop_delay = -1;
- codec->fixup_id = HDA_FIXUP_ID_NOT_SET;
#ifdef CONFIG_PM codec->power_jiffies = jiffies;

On Fri, 15 Jul 2022 20:16:14 +0200, Pierre-Louis Bossart wrote:
On 7/6/22 07:02, Cezary Rojewski wrote:
If early probe of HDAudio bus driver fails e.g.: due to missing firmware file, snd_hda_codec_shutdown() ends in manipulating uninitialized codec->pcm_list_head causing page fault.
Iinitialization of HDAudio codec in ASoC is split in two:
- snd_hda_codec_device_init()
- snd_hda_codec_device_new()
snd_hda_codec_device_init() is called during probe_codecs() by HDAudio bus driver while snd_hda_codec_device_new() is called by codec-component's ->probe(). The second call will not happen until all components required by related sound card are present within the ASoC framework. With firmware failing to load during the PCI's deferred initialization i.e.: probe_work(), no platform components are ever registered. HDAudio codec enumeration is done at that point though, so the codec components became registered to ASoC framework, calling snd_hda_codec_device_init() in the process.
Now, during platform reboot snd_hda_codec_shutdown() is called for every codec found on the HDAudio bus causing oops if any of them has not completed both of their initialization steps. Relocating field initialization fixes the issue.
Signed-off-by: Cezary Rojewski cezary.rojewski@intel.com
This patch causes an across-the-board regression on all SOF platforms using an HDaudio or iDISP codec. Only 'nocodec' platforms are unaffected, see results at https://sof-ci.01.org/linuxpr/PR3763/build394/devicetest/
Reverting this commit seems to fix the issue.
Upstream merge: https://github.com/thesofproject/linux/pull/3763
Issue and bisect results https://github.com/thesofproject/linux/issues/3764
I don't know what this was supposed to fix on the shutdown path but it's clearly having side effects on the probe/init path.
Yeah, obviously the patch ignores the fact that hdac_hda does initialize the HD-audio codec without snd_hda_codec_device_init().
I'm going to revert the commit.
thanks,
Takashi

On 2022-07-15 8:23 PM, Takashi Iwai wrote:
On Fri, 15 Jul 2022 20:16:14 +0200, Pierre-Louis Bossart wrote:
...
This patch causes an across-the-board regression on all SOF platforms using an HDaudio or iDISP codec. Only 'nocodec' platforms are unaffected, see results at https://sof-ci.01.org/linuxpr/PR3763/build394/devicetest/
Reverting this commit seems to fix the issue.
Upstream merge: https://github.com/thesofproject/linux/pull/3763
Issue and bisect results https://github.com/thesofproject/linux/issues/3764
I don't know what this was supposed to fix on the shutdown path but it's clearly having side effects on the probe/init path.
Yeah, obviously the patch ignores the fact that hdac_hda does initialize the HD-audio codec without snd_hda_codec_device_init().
I'm going to revert the commit.
Thanks for the report and sorry the trouble.
This patch is still valid - will re-check hdac_hda.c and update it so that both the regression and the page fault are gone.
Regards, Czarek

From: Amadeusz Sławiński amadeuszx.slawinski@linux.intel.com
Old code resets SIE for up to 8 streams using byte accessor, but register is laid out in following way:
31 GIE 30 CIE 29:x Reserved x-1:0 SIE
If there is more than 8 streams, some of them may and up with enabled interrupts. To fix this just clear whole INTCTL register when disabling interrupts.
Signed-off-by: Amadeusz Sławiński amadeuszx.slawinski@linux.intel.com Signed-off-by: Cezary Rojewski cezary.rojewski@intel.com --- sound/hda/hdac_controller.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/sound/hda/hdac_controller.c b/sound/hda/hdac_controller.c index f7bd6e2db085..9a60bfdb39ba 100644 --- a/sound/hda/hdac_controller.c +++ b/sound/hda/hdac_controller.c @@ -474,11 +474,8 @@ static void azx_int_disable(struct hdac_bus *bus) list_for_each_entry(azx_dev, &bus->stream_list, list) snd_hdac_stream_updateb(azx_dev, SD_CTL, SD_INT_MASK, 0);
- /* disable SIE for all streams */ - snd_hdac_chip_writeb(bus, INTCTL, 0); - - /* disable controller CIE and GIE */ - snd_hdac_chip_updatel(bus, INTCTL, AZX_INT_CTRL_EN | AZX_INT_GLOBAL_EN, 0); + /* disable SIE for all streams & disable controller CIE and GIE */ + snd_hdac_chip_writel(bus, INTCTL, 0); }
/* clear interrupts */

From: Amadeusz Sławiński amadeuszx.slawinski@linux.intel.com
It is not used anywhere in the file, so there is no need to keep it.
Signed-off-by: Amadeusz Sławiński amadeuszx.slawinski@linux.intel.com Signed-off-by: Cezary Rojewski cezary.rojewski@intel.com --- sound/hda/ext/hdac_ext_controller.c | 7 ------- 1 file changed, 7 deletions(-)
diff --git a/sound/hda/ext/hdac_ext_controller.c b/sound/hda/ext/hdac_ext_controller.c index b072392725c7..a42f66f561f5 100644 --- a/sound/hda/ext/hdac_ext_controller.c +++ b/sound/hda/ext/hdac_ext_controller.c @@ -14,13 +14,6 @@ #include <sound/hda_register.h> #include <sound/hdaudio_ext.h>
-/* - * maximum HDAC capablities we should parse to avoid endless looping: - * currently we have 4 extended caps, so this is future proof for now. - * extend when this limit is seen meeting in real HW - */ -#define HDAC_MAX_CAPS 10 - /* * processing pipe helpers - these helpers are useful for dealing with HDA * new capability of processing pipelines

snd_hda_gen_init() does that for every codec already.
Signed-off-by: Cezary Rojewski cezary.rojewski@intel.com --- sound/pci/hda/patch_realtek.c | 3 --- 1 file changed, 3 deletions(-)
diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index f3ad454b3fbf..a8688025352d 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -923,9 +923,6 @@ static int alc_init(struct hda_codec *codec) if (is_s4_resume(codec)) alc_pre_init(codec);
- if (spec->init_hook) - spec->init_hook(codec); - spec->gen.skip_verbs = 1; /* applied in below */ snd_hda_gen_init(codec); alc_fix_pll(codec);

On Wed, 06 Jul 2022 14:02:30 +0200, Cezary Rojewski wrote:
snd_hda_gen_init() does that for every codec already.
Definitely not.
snd_hda_gen_init() calls init_hook of snd_hda_gen_spec.
OTOH, what you're trying to kill the init_hook call of alc_spec. Both are in different layers and they don't share the same callback.
thanks,
Takashi
Signed-off-by: Cezary Rojewski cezary.rojewski@intel.com
sound/pci/hda/patch_realtek.c | 3 --- 1 file changed, 3 deletions(-)
diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index f3ad454b3fbf..a8688025352d 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -923,9 +923,6 @@ static int alc_init(struct hda_codec *codec) if (is_s4_resume(codec)) alc_pre_init(codec);
- if (spec->init_hook)
spec->init_hook(codec);
- spec->gen.skip_verbs = 1; /* applied in below */ snd_hda_gen_init(codec); alc_fix_pll(codec);
-- 2.25.1

On 2022-07-09 6:46 PM, Takashi Iwai wrote:
On Wed, 06 Jul 2022 14:02:30 +0200, Cezary Rojewski wrote:
snd_hda_gen_init() does that for every codec already.
Definitely not.
snd_hda_gen_init() calls init_hook of snd_hda_gen_spec.
OTOH, what you're trying to kill the init_hook call of alc_spec. Both are in different layers and they don't share the same callback.
I see the mistake now. This patch was generated when I was debugging one of the module-reload issues, and once it was fixed, double ->init_hook() caught my eye so I decided to add additional cleanup patch on top. Meh.
Thanks for paying attention and good review!
Regards, Czarek

On Wed, 06 Jul 2022 14:02:21 +0200, Cezary Rojewski wrote:
Total of 6 fixes and 3 cleanups - cleanups are last.
All of the fixes address problems that present themselves in situation when user engages in codec driver reload. Second condition to reproduce is two-step initialization of HDAudio codec - this is the case only for ASoC HDAudio bus driver as snd_hda_intel calls only compound function snd_hda_codec_new(). Once these conditions are met, several reload/unload scenarios end with null-ptr-deref and page faults. Goal of the series is to allow codec/bus driver reloading without any errors.
Amadeusz Sławiński (2): ALSA: hda: Reset all SIE bits in INTCTL ALSA: hda: Remove unused macro definition
Cezary Rojewski (7): ALSA: hda: Do not unset preset when cleaning up codec ALSA: hda: Fix null-ptr-deref when i915 fails and hdmi is denylisted ALSA: hda: Make device usage_count consistent across subsequent probing ALSA: hda: Fix put_device() inconsistency in error path ALSA: hda: Skip event processing for unregistered codecs ALSA: hda: Fix page fault in snd_hda_codec_shutdown() ALSA: hda/realtek: Remove redundant init_hook() in alc_default_init()
Now applied partially what are applicable and look correct: patches 2, 3, 4, 6, 7, and 8.
Patch 5 waits for Mark's PR of ASoC changes, and patch 1 needs a bit more clarification and investigation.
thanks,
Takashi
participants (3)
-
Cezary Rojewski
-
Pierre-Louis Bossart
-
Takashi Iwai