On Sun, 06 Aug 2023 20:32:06 +0200, Takashi Iwai wrote:
On Sat, 05 Aug 2023 10:09:58 +0200, Takashi Iwai wrote:
On Fri, 04 Aug 2023 21:17:56 +0200, Curtis Malainey wrote:
On Fri, Aug 4, 2023 at 1:58 AM Takashi Iwai tiwai@suse.de wrote:
Now I've been reconsidering the problem, and thought of another possible workaround. We may add the card's refcount control for the device init and release, so that we delay the actual resource free. The basic idea is to take card->card_ref at the device init and unref it at the actual device release callback. Then the snd_card_free() call is held until all those refcounted devices are released.
Below is a PoC patch (yes, this can be split, too ;) A quick test on VM seems OK, but needs more reviews and tests.
It contains somewhat ugly conditional put_device() at the dev_free ops. We can make those a bit saner with some helpers later, too.
BTW, this approach may avoid another potential problem by the delayed release; if we finish the driver remove without waiting for the actual device releases, it may hit a code (the piece of the device release callback) of the already removed module, and it's not guaranteed to be present. I'm not sure whether this really happens, but it's another thing to be considered.
thanks,
Takashi
diff --git a/include/sound/core.h b/include/sound/core.h index f6e0dd648b80..00c514a80a4a 100644 --- a/include/sound/core.h +++ b/include/sound/core.h
Unfortunately it doesn't hold up in my testing, hit the devm vs device race bug after a little over 30min of hammering snd-dummy (on top of your for-next branch)
(snip)
I was talking with Stephen Boyd about this bug and his recommendation was to keep snd_card always out of devm and just allocate a pointer in devm to snd_card to puppet it as if it was managed via devm and just reference count rather than release the card so that its always handled through device->release. What do you think? This would go alongside the current patch proposed.
Indeed, that's another issue about devres vs delayed kobj release. A quick fix would be something like below, I suppose. (note: totally untested)
Scratch it. It's still obviously broken. Will cook more proper patches later.
Now RFC patchset is ready. I'll submit for testing in another thread.
Takashi