On Fri, 04 Aug 2023 21:17:56 +0200, Curtis Malainey wrote:
On Fri, Aug 4, 2023 at 1:58 AM Takashi Iwai tiwai@suse.de wrote:
Now I've been reconsidering the problem, and thought of another possible workaround. We may add the card's refcount control for the device init and release, so that we delay the actual resource free. The basic idea is to take card->card_ref at the device init and unref it at the actual device release callback. Then the snd_card_free() call is held until all those refcounted devices are released.
Below is a PoC patch (yes, this can be split, too ;) A quick test on VM seems OK, but needs more reviews and tests.
It contains somewhat ugly conditional put_device() at the dev_free ops. We can make those a bit saner with some helpers later, too.
BTW, this approach may avoid another potential problem by the delayed release; if we finish the driver remove without waiting for the actual device releases, it may hit a code (the piece of the device release callback) of the already removed module, and it's not guaranteed to be present. I'm not sure whether this really happens, but it's another thing to be considered.
thanks,
Takashi
diff --git a/include/sound/core.h b/include/sound/core.h index f6e0dd648b80..00c514a80a4a 100644 --- a/include/sound/core.h +++ b/include/sound/core.h
Unfortunately it doesn't hold up in my testing, hit the devm vs device race bug after a little over 30min of hammering snd-dummy (on top of your for-next branch)
(snip)
I was talking with Stephen Boyd about this bug and his recommendation was to keep snd_card always out of devm and just allocate a pointer in devm to snd_card to puppet it as if it was managed via devm and just reference count rather than release the card so that its always handled through device->release. What do you think? This would go alongside the current patch proposed.
Indeed, that's another issue about devres vs delayed kobj release. A quick fix would be something like below, I suppose. (note: totally untested)
thanks,
Takashi
--- a/sound/core/init.c +++ b/sound/core/init.c @@ -188,7 +188,10 @@ EXPORT_SYMBOL(snd_card_new);
static void __snd_card_release(struct device *dev, void *data) { - snd_card_free(data); + struct snd_card **card_p = data; + + if (card_p) + snd_card_free(*card_p); }
/** @@ -221,21 +224,25 @@ int snd_devm_card_new(struct device *parent, int idx, const char *xid, struct snd_card **card_ret) { struct snd_card *card; + struct snd_card **card_devres; int err;
*card_ret = NULL; - card = devres_alloc(__snd_card_release, sizeof(*card) + extra_size, - GFP_KERNEL); + card_devres = devres_alloc(__snd_card_release, sizeof(void *), GFP_KERNEL); + if (!card_devres) + return -ENOMEM; + devres_add(parent, card_devres); + + card = kzalloc(sizeof(*card) + extra_size, GFP_KERNEL); if (!card) return -ENOMEM; + card->managed = true; err = snd_card_init(card, parent, idx, xid, module, extra_size); - if (err < 0) { - devres_free(card); /* in managed mode, we need to free manually */ + if (err < 0) return err; - }
- devres_add(parent, card); + *card_devres = card; *card_ret = card; return 0; } @@ -295,8 +302,7 @@ static int snd_card_init(struct snd_card *card, struct device *parent, mutex_unlock(&snd_card_mutex); dev_err(parent, "cannot find the slot for index %d (range 0-%i), error: %d\n", idx, snd_ecards_limit - 1, err); - if (!card->managed) - kfree(card); /* manually free here, as no destructor called */ + kfree(card); /* manually free here, as no destructor called */ return err; } set_bit(idx, snd_cards_lock); /* lock it */ @@ -592,8 +598,7 @@ static int snd_card_do_free(struct snd_card *card) #endif if (card->release_completion) complete(card->release_completion); - if (!card->managed) - kfree(card); + kfree(card); return 0; }