Hi Pierre-Louis, again Cc Takashi-san
Takashi's feedback seems to hint at problems with this patch, so will wait for further instructions here if you want me to test. Thanks!
OK, I will post patch as "RFC". Because I can't test it.
I try to explain it here, not at RFC :)
I'm not 100% sure detail of SOF / topology, but in my understanding, topology want to add *extra* dai_link by using snd_soc_add_dai_link(). And it will be removed by snd_soc_romove_dai_link().
Before, this snd_soc_add/remove_dai_link() and/or its related functions are unbalanced. Now, these are balance-uping, and it finds the random operation issue.
topology will call snd_soc_remove_dai_link() via (A). In my understanding, currently, SOF and skylake are calling it.
static void soc_cleanup_card_resources(struct snd_soc_card *card) { struct snd_soc_dai_link *link, *_link;
/* This should be called before snd_card_free() */ (A) soc_remove_link_components(card);
/* free the ALSA card at first; this syncs with pending operations */ if (card->snd_card) { (B) snd_card_free(card->snd_card); card->snd_card = NULL; }
/* remove and free each DAI */ (X) soc_remove_link_dais(card);
for_each_card_links_safe(card, link, _link) (C) snd_soc_remove_dai_link(card, link);
... }
At (A), topology calls snd_soc_remove_dai_link(). Then topology rtd, and its related all data are freed.
Next, (B) is called, and then, pcm->private_free = soc_pcm_private_free() is called.
static void soc_pcm_private_free(struct snd_pcm *pcm) { struct snd_soc_pcm_runtime *rtd = pcm->private_data;
/* need to sync the delayed work before releasing resources */ flush_delayed_work(&rtd->delayed_work); snd_soc_pcm_component_free(pcm); }
Here, it gets rtd via pcm->private_data. But, topology related rtd are already freed at (A). Above both flush_delayed_work() and snd_soc_pcm_component_free() are using rtd via pcm->private_data. I guess, your kernel access to already freed memory, and get Oops.
Normal sound card is freeing rtd at (C), thus no damage.
These flush_delayed_work() and snd_soc_pcm_component_free() are rtd related data's finalization. If so, these should be called when rtd was freed, not sound card was freed. It is very natural and understandable, I think.
In other words, pcm->private_free = soc_pcm_private_free() is no longer needed.
But, here one other issue exist. If we do above idea, there is zero chance to call soc_remove_dai() to topology related dai at (X). Because (A) removes rtd from card too, and, (X) is based on card connected rtd.
This means, (X) need to be called before (C) (= for normal sound) and (A) (= for topology).
Now, I want to focus this patch which is the reason why snd_card_free() = (B) is located there.
4efda5f2130da033aeedc5b3205569893b910de2 ("ASoC: Fix use-after-free at card unregistration")
Original snd_card_free() was called last of this function. But moved to top to avoid use-after-free issue. The issue was happen at soc_pcm_free() which was pcm->private_free, today it is updated to soc_pcm_private_free().
In other words, (B) need to be called before (C) (= for normal sound) and (A) (= for topology), because it needs (not yet freed) rtd. But, (A) need to be called before (B), because it needs card->snd_card pointer.
If we call flush_delayed_work() and snd_soc_pcm_component_free() (= same as soc_pcm_private_free()) when rtd was freed (= (C), (A)), there is no reason to call snd_card_free() at top of this function. It can be called end of this function, again.
But, in such case, it will likely break unbind again, as Takashi-san said. when unbind is performed in a busy state, the code may release still-in-use resources. At least we need to call snd_card_disconnect_sync() at the first place.
The final code can be...
static void soc_cleanup_card_resources(struct snd_soc_card *card) { struct snd_soc_dai_link *link, *_link;
if (card->snd_card) (Z) snd_card_disconnect_sync(card->snd_card);
(X) soc_remove_link_dais(card); (A) soc_remove_link_components(card);
for_each_card_links_safe(card, link, _link) (C) snd_soc_remove_dai_link(card, link);
... if (card->snd_card) { (B) snd_card_free(card->snd_card); card->snd_card = NULL; } }
To avoid release still-in-use resources, call snd_card_disconnect_sync() at (Z).
(X) is called for both non-topology and topology.
topology removes rtd via (A), and non topology removes rtd via (C).
And snd_card_free() is no longer related to use-after-free issue. Thus, (B) is OK.
But, these are just my guess. It works for me, but I can't re-produce the issue.
Below is the patch for "testing".
------------- diff --git a/include/sound/soc-component.h b/include/sound/soc-component.h index 6aa3ecb..cf726a5 100644 --- a/include/sound/soc-component.h +++ b/include/sound/soc-component.h @@ -413,6 +413,6 @@ struct page *snd_soc_pcm_component_page(struct snd_pcm_substream *substream, int snd_soc_pcm_component_mmap(struct snd_pcm_substream *substream, struct vm_area_struct *vma); int snd_soc_pcm_component_new(struct snd_pcm *pcm); -void snd_soc_pcm_component_free(struct snd_pcm *pcm); +void snd_soc_pcm_component_free(struct snd_soc_pcm_runtime *rtd);
#endif /* __SOC_COMPONENT_H */ diff --git a/sound/soc/soc-component.c b/sound/soc/soc-component.c index 98ef066..195979f 100644 --- a/sound/soc/soc-component.c +++ b/sound/soc/soc-component.c @@ -516,13 +516,12 @@ int snd_soc_pcm_component_new(struct snd_pcm *pcm) return 0; }
-void snd_soc_pcm_component_free(struct snd_pcm *pcm) +void snd_soc_pcm_component_free(struct snd_soc_pcm_runtime *rtd) { - struct snd_soc_pcm_runtime *rtd = pcm->private_data; struct snd_soc_rtdcom_list *rtdcom; struct snd_soc_component *component;
for_each_rtd_components(rtd, rtdcom, component) if (component->driver->pcm_destruct) - component->driver->pcm_destruct(component, pcm); + component->driver->pcm_destruct(component, rtd->pcm); } diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c index 92260a9..42d87bb 100644 --- a/sound/soc/soc-core.c +++ b/sound/soc/soc-core.c @@ -419,6 +419,9 @@ static void soc_free_pcm_runtime(struct snd_soc_pcm_runtime *rtd)
list_del(&rtd->list);
+ flush_delayed_work(&rtd->delayed_work); + snd_soc_pcm_component_free(rtd); + /* * we don't need to call kfree() for rtd->dev * see @@ -1944,17 +1947,12 @@ static void soc_cleanup_card_resources(struct snd_soc_card *card) { struct snd_soc_dai_link *link, *_link;
- /* This should be called before snd_card_free() */ - soc_remove_link_components(card); - - /* free the ALSA card at first; this syncs with pending operations */ - if (card->snd_card) { - snd_card_free(card->snd_card); - card->snd_card = NULL; - } + if (card->snd_card) + snd_card_disconnect_sync(card->snd_card);
/* remove and free each DAI */ soc_remove_link_dais(card); + soc_remove_link_components(card);
for_each_card_links_safe(card, link, _link) snd_soc_remove_dai_link(card, link); @@ -1969,6 +1967,11 @@ static void soc_cleanup_card_resources(struct snd_soc_card *card) /* remove the card */ if (card->remove) card->remove(card); + + if (card->snd_card) { + snd_card_free(card->snd_card); + card->snd_card = NULL; + } }
static int snd_soc_bind_card(struct snd_soc_card *card) diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c index 4bf71e3..6e24f0a5 100644 --- a/sound/soc/soc-pcm.c +++ b/sound/soc/soc-pcm.c @@ -2892,15 +2892,6 @@ static int dpcm_fe_dai_close(struct snd_pcm_substream *fe_substream) return ret; }
-static void soc_pcm_private_free(struct snd_pcm *pcm) -{ - struct snd_soc_pcm_runtime *rtd = pcm->private_data; - - /* need to sync the delayed work before releasing resources */ - flush_delayed_work(&rtd->delayed_work); - snd_soc_pcm_component_free(pcm); -} - /* create a new pcm */ int soc_new_pcm(struct snd_soc_pcm_runtime *rtd, int num) { @@ -3042,7 +3033,6 @@ int soc_new_pcm(struct snd_soc_pcm_runtime *rtd, int num) return ret; }
- pcm->private_free = soc_pcm_private_free; pcm->no_device_suspend = true; out: dev_info(rtd->card->dev, "%s <-> %s mapping ok\n",
Thank you for your help !! Best regards --- Kuninori Morimoto