On 6/7/22 00:40, Amadeusz Sławiński wrote:
On 6/3/2022 10:14 PM, Dean Gehnert wrote:
Don't deference card in comp->card->snd_card before checking for NULL card.
During the unloading of ASoC kernel modules, there is a kernel oops in snd_soc_tplg_component_remove() that happens because comp->card is set to NULL in soc_cleanup_component().
Cc: Liam Girdwood lgirdwood@gmail.com Cc: Mark Brown broonie@kernel.org Cc: Jaroslav Kysela perex@perex.cz Cc: Takashi Iwai tiwai@suse.com Cc: alsa-devel@alsa-project.org Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org Fixes: 7e567b5ae063 ("ASoC: topology: Add missing rwsem around snd_ctl_remove() calls") Signed-off-by: Dean Gehnert deang@tpi.com
sound/soc/soc-topology.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/sound/soc/soc-topology.c b/sound/soc/soc-topology.c index 3f9d314fba16..cf0efe1147c2 100644 --- a/sound/soc/soc-topology.c +++ b/sound/soc/soc-topology.c @@ -2613,15 +2613,18 @@ EXPORT_SYMBOL_GPL(snd_soc_tplg_component_load); /* remove dynamic controls from the component driver */ int snd_soc_tplg_component_remove(struct snd_soc_component *comp) { - struct snd_card *card = comp->card->snd_card; + struct snd_card *card; struct snd_soc_dobj *dobj, *next_dobj; int pass; /* process the header types from end to start */ for (pass = SOC_TPLG_PASS_END; pass >= SOC_TPLG_PASS_START; pass--) { + card = (comp->card) ? comp->card->snd_card : NULL;
/* remove mixer controls */ - down_write(&card->controls_rwsem); + if (card) + down_write(&card->controls_rwsem); list_for_each_entry_safe(dobj, next_dobj, &comp->dobj_list, list) {
I'm pretty sure that quite a lot of operations in this list_for_each_entry_safe() loop require existing card...
I get your point... Many of the cases in the loop, the functions are also dereferencing 'comp->card->', so this patch may not be the actual root cause fix... It worked for us since the kernel oops no longer happens, but looks like there are many more dereferences that could still cause problems.
And trying to investigate more closely, there is no soc_cleanup_component() mentioned in commit message, for quite a few kernel versions - seems to have been removed during v5.5-rc1.
Ah yes! You are correct. soc_cleanup_component() functionality has been moved to soc_remove_component() in later kernels.
I would say to not merge this, unless problem can be reproduced with latest kernel and even then would consider if it is a correct fix.
Agreed... This patch made our our kernel oops go away, however, we are going to dig deeper into the kernel oops call stack to see if I can find the root cause problem. There is something going on with the sound cleanup since this happens during rmmod... We just need to dig deeper and see if we can find what is really causing the problems.
@@ -2660,7 +2663,8 @@ int snd_soc_tplg_component_remove(struct snd_soc_component *comp) break; } } - up_write(&card->controls_rwsem); + if (card) + up_write(&card->controls_rwsem); } /* let caller know if FW can be freed when no objects are left */