[PATCH RFC 0/6] ALSA: Fix UAF with delayed kobj release
Hi,
this is a test patch set for addressing the UAF problems with delayed kobj releases reported by Curtis: https://lore.kernel.org/r/20230801171928.1460120-1-cujomalainey@chromium.org
The patch introduced a simple helper for allocating memory with a refcount, and converts the card object, control, PCM and compress objects with the new type. With the refcount, the actual memory release is delayed until all referrer are gone.
It's just a RFC and only lightly tested. I myself am not sure whether this is the best way to go. It might be better to take Curtis' approach, just converting the device to its own allocation, too. (But I don't know whether Curtis' patch set covers all cases -- can still be a UAF of card_dev due to devres vs kobj release?)
thanks,
Takashi
===
Takashi Iwai (6): ALSA: core: Introduced referenced memory allocator ALSA: core: Fix potential UAF by delayed kobject release of card_dev ALSA: core: Associate memory reference with device initialization ALSA: pcm: Release memory with reference ALSA: control: Reference card by ctl_dev ALSA: compress: Reference card by the device
include/sound/core.h | 7 ++- sound/core/compress_offload.c | 2 +- sound/core/control.c | 2 +- sound/core/hwdep.c | 2 +- sound/core/init.c | 105 +++++++++++++++++++++++++++------ sound/core/pcm.c | 6 +- sound/core/rawmidi.c | 2 +- sound/core/seq/seq_clientmgr.c | 2 +- sound/core/timer.c | 2 +- 9 files changed, 101 insertions(+), 29 deletions(-)
Introduce simple helpers to allocate memory with a refcount. The refcount can be chained to the parent, so that it assures to keep the parent memory until all children are released.
Signed-off-by: Takashi Iwai tiwai@suse.de --- include/sound/core.h | 5 ++++ sound/core/init.c | 58 ++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 63 insertions(+)
diff --git a/include/sound/core.h b/include/sound/core.h index f6e0dd648b80..6fccec08a12f 100644 --- a/include/sound/core.h +++ b/include/sound/core.h @@ -75,6 +75,11 @@ struct snd_device {
#define snd_device(n) list_entry(n, struct snd_device, list)
+/* referenced memory allocation */ +void *snd_refmem_alloc(size_t bytes, void *parent); +void *snd_refmem_get(void *p); +void snd_refmem_put(void *p); + /* main structure for soundcard */
struct snd_card { diff --git a/sound/core/init.c b/sound/core/init.c index baef2688d0cf..7e7c4b8d4e11 100644 --- a/sound/core/init.c +++ b/sound/core/init.c @@ -111,6 +111,64 @@ static int get_slot_from_bitmask(int mask, int (*check)(struct module *, int), return mask; /* unchanged */ }
+/* + * referenced memory allocation + */ + +struct snd_refmem { + struct kref kref; + void *parent; + char data[]; +}; + +#define to_refmem(p) container_of(p, struct snd_refmem, data) + +void *snd_refmem_alloc(size_t bytes, void *parent) +{ + struct snd_refmem *ref; + + ref = kzalloc(bytes + sizeof(*ref), GFP_KERNEL); + if (!ref) + return NULL; + kref_init(&ref->kref); + ref->parent = parent; + snd_refmem_get(parent); + return ref->data; +} +EXPORT_SYMBOL_GPL(snd_refmem_alloc); + +void *snd_refmem_get(void *p) +{ + struct snd_refmem *ref; + + if (!p) + return NULL; + ref = to_refmem(p); + kref_get(&ref->kref); + return p; +} +EXPORT_SYMBOL_GPL(snd_refmem_get); + +static void snd_refmem_release(struct kref *kref) +{ + struct snd_refmem *ref = container_of(kref, struct snd_refmem, kref); + void *parent = ref->parent; + + kfree(ref); + snd_refmem_put(parent); +} + +void snd_refmem_put(void *p) +{ + struct snd_refmem *ref; + + if (!p) + return; + ref = to_refmem(p); + kref_put(&ref->kref, snd_refmem_release); +} +EXPORT_SYMBOL_GPL(snd_refmem_put); + /* the default release callback set in snd_device_initialize() below; * this is just NOP for now, as almost all jobs are already done in * dev_free callback of snd_device chain instead.
Use a new refmem allocation for the card object, and fix the potential UAF of card object due to the race between the devres and the delayed kobj release. Now the devres keeps only the card object pointer, not the card object itself, and the card object is unreferenced at both releases.
Signed-off-by: Takashi Iwai tiwai@suse.de --- sound/core/init.c | 38 ++++++++++++++++++++++---------------- 1 file changed, 22 insertions(+), 16 deletions(-)
diff --git a/sound/core/init.c b/sound/core/init.c index 7e7c4b8d4e11..22da438faf40 100644 --- a/sound/core/init.c +++ b/sound/core/init.c @@ -231,7 +231,7 @@ int snd_card_new(struct device *parent, int idx, const char *xid,
if (extra_size < 0) extra_size = 0; - card = kzalloc(sizeof(*card) + extra_size, GFP_KERNEL); + card = snd_refmem_alloc(sizeof(*card) + extra_size, NULL); if (!card) return -ENOMEM;
@@ -246,7 +246,14 @@ EXPORT_SYMBOL(snd_card_new);
static void __snd_card_release(struct device *dev, void *data) { - snd_card_free(data); + struct snd_card **card_p = data; + struct snd_card *card; + + if (card_p) { + card = *card_p; + snd_card_free(card); + snd_refmem_put(card); + } }
/** @@ -279,21 +286,22 @@ int snd_devm_card_new(struct device *parent, int idx, const char *xid, struct snd_card **card_ret) { struct snd_card *card; + struct snd_card **card_devres; int err;
*card_ret = NULL; - card = devres_alloc(__snd_card_release, sizeof(*card) + extra_size, - GFP_KERNEL); - if (!card) + card_devres = devres_alloc(__snd_card_release, sizeof(void *), GFP_KERNEL); + if (!card_devres) return -ENOMEM; - card->managed = true; - err = snd_card_init(card, parent, idx, xid, module, extra_size); - if (err < 0) { - devres_free(card); /* in managed mode, we need to free manually */ - return err; - } + devres_add(parent, card_devres);
- devres_add(parent, card); + err = snd_card_new(parent, idx, xid, module, extra_size, &card); + if (err) + return err; + + card->managed = true; + snd_refmem_get(card); + *card_devres = card; *card_ret = card; return 0; } @@ -353,8 +361,7 @@ static int snd_card_init(struct snd_card *card, struct device *parent, mutex_unlock(&snd_card_mutex); dev_err(parent, "cannot find the slot for index %d (range 0-%i), error: %d\n", idx, snd_ecards_limit - 1, err); - if (!card->managed) - kfree(card); /* manually free here, as no destructor called */ + snd_refmem_put(card); /* manually free here, as no destructor called */ return err; } set_bit(idx, snd_cards_lock); /* lock it */ @@ -650,8 +657,7 @@ static int snd_card_do_free(struct snd_card *card) #endif if (card->release_completion) complete(card->release_completion); - if (!card->managed) - kfree(card); + snd_refmem_put(card); return 0; }
Use a new refmem allocation for the card object, and fix the race between the devres and the delayed kobj release. Now the devres keeps only the card object pointer, not the card object itself, and the card object is unreferenced at both releases.
Signed-off-by: Takashi Iwai tiwai@suse.de --- sound/core/init.c | 38 ++++++++++++++++++++++---------------- 1 file changed, 22 insertions(+), 16 deletions(-)
diff --git a/sound/core/init.c b/sound/core/init.c index 7e7c4b8d4e11..22da438faf40 100644 --- a/sound/core/init.c +++ b/sound/core/init.c @@ -231,7 +231,7 @@ int snd_card_new(struct device *parent, int idx, const char *xid,
if (extra_size < 0) extra_size = 0; - card = kzalloc(sizeof(*card) + extra_size, GFP_KERNEL); + card = snd_refmem_alloc(sizeof(*card) + extra_size, NULL); if (!card) return -ENOMEM;
@@ -246,7 +246,14 @@ EXPORT_SYMBOL(snd_card_new);
static void __snd_card_release(struct device *dev, void *data) { - snd_card_free(data); + struct snd_card **card_p = data; + struct snd_card *card; + + if (card_p) { + card = *card_p; + snd_card_free(card); + snd_refmem_put(card); + } }
/** @@ -279,21 +286,22 @@ int snd_devm_card_new(struct device *parent, int idx, const char *xid, struct snd_card **card_ret) { struct snd_card *card; + struct snd_card **card_devres; int err;
*card_ret = NULL; - card = devres_alloc(__snd_card_release, sizeof(*card) + extra_size, - GFP_KERNEL); - if (!card) + card_devres = devres_alloc(__snd_card_release, sizeof(void *), GFP_KERNEL); + if (!card_devres) return -ENOMEM; - card->managed = true; - err = snd_card_init(card, parent, idx, xid, module, extra_size); - if (err < 0) { - devres_free(card); /* in managed mode, we need to free manually */ - return err; - } + devres_add(parent, card_devres);
- devres_add(parent, card); + err = snd_card_new(parent, idx, xid, module, extra_size, &card); + if (err) + return err; + + card->managed = true; + snd_refmem_get(card); + *card_devres = card; *card_ret = card; return 0; } @@ -353,8 +361,7 @@ static int snd_card_init(struct snd_card *card, struct device *parent, mutex_unlock(&snd_card_mutex); dev_err(parent, "cannot find the slot for index %d (range 0-%i), error: %d\n", idx, snd_ecards_limit - 1, err); - if (!card->managed) - kfree(card); /* manually free here, as no destructor called */ + snd_refmem_put(card); /* manually free here, as no destructor called */ return err; } set_bit(idx, snd_cards_lock); /* lock it */ @@ -650,8 +657,7 @@ static int snd_card_do_free(struct snd_card *card) #endif if (card->release_completion) complete(card->release_completion); - if (!card->managed) - kfree(card); + snd_refmem_put(card); return 0; }
On Mon, 07 Aug 2023 15:52:03 +0200, Takashi Iwai wrote:
Use a new refmem allocation for the card object, and fix the race between the devres and the delayed kobj release. Now the devres keeps only the card object pointer, not the card object itself, and the card object is unreferenced at both releases.
Signed-off-by: Takashi Iwai tiwai@suse.de
My bad, scratch this mail; it was slipped into the submission from the previously generated patch files...
Other 6 patches (with RFC prefix) are fine.
Takashi
Allow to assign a refmem pointer to snd_device_initialize(). It takes the reference, and does unreference at the release callback in turn.
A caveat is that this uses drvdata for keeping the associated pointer.
Signed-off-by: Takashi Iwai tiwai@suse.de --- include/sound/core.h | 2 +- sound/core/compress_offload.c | 2 +- sound/core/control.c | 2 +- sound/core/hwdep.c | 2 +- sound/core/init.c | 9 ++++++--- sound/core/pcm.c | 2 +- sound/core/rawmidi.c | 2 +- sound/core/seq/seq_clientmgr.c | 2 +- sound/core/timer.c | 2 +- 9 files changed, 14 insertions(+), 11 deletions(-)
diff --git a/include/sound/core.h b/include/sound/core.h index 6fccec08a12f..dfa5b44d9666 100644 --- a/include/sound/core.h +++ b/include/sound/core.h @@ -244,7 +244,7 @@ extern struct dentry *sound_debugfs_root;
void snd_request_card(int card);
-void snd_device_initialize(struct device *dev, struct snd_card *card); +void snd_device_initialize(struct device *dev, struct snd_card *card, void *refp);
int snd_register_device(int type, struct snd_card *card, int dev, const struct file_operations *f_ops, diff --git a/sound/core/compress_offload.c b/sound/core/compress_offload.c index 30f73097447b..d91fa8925cde 100644 --- a/sound/core/compress_offload.c +++ b/sound/core/compress_offload.c @@ -1189,7 +1189,7 @@ int snd_compress_new(struct snd_card *card, int device,
snd_compress_set_id(compr, id);
- snd_device_initialize(&compr->dev, card); + snd_device_initialize(&compr->dev, card, NULL); dev_set_name(&compr->dev, "comprC%iD%i", card->number, device);
ret = snd_device_new(card, SNDRV_DEV_COMPRESS, compr, &ops); diff --git a/sound/core/control.c b/sound/core/control.c index 8386b53acdcd..5b9340f5cb8c 100644 --- a/sound/core/control.c +++ b/sound/core/control.c @@ -2395,7 +2395,7 @@ int snd_ctl_create(struct snd_card *card) if (snd_BUG_ON(card->number < 0 || card->number >= SNDRV_CARDS)) return -ENXIO;
- snd_device_initialize(&card->ctl_dev, card); + snd_device_initialize(&card->ctl_dev, card, NULL); dev_set_name(&card->ctl_dev, "controlC%d", card->number);
err = snd_device_new(card, SNDRV_DEV_CONTROL, card, &ops); diff --git a/sound/core/hwdep.c b/sound/core/hwdep.c index e95fa275c289..5edea1094a07 100644 --- a/sound/core/hwdep.c +++ b/sound/core/hwdep.c @@ -382,7 +382,7 @@ int snd_hwdep_new(struct snd_card *card, char *id, int device, if (id) strscpy(hwdep->id, id, sizeof(hwdep->id));
- snd_device_initialize(&hwdep->dev, card); + snd_device_initialize(&hwdep->dev, card, NULL); hwdep->dev.release = release_hwdep_device; dev_set_name(&hwdep->dev, "hwC%iD%i", card->number, device); #ifdef CONFIG_SND_OSSEMUL diff --git a/sound/core/init.c b/sound/core/init.c index 22da438faf40..6bc77705ecc3 100644 --- a/sound/core/init.c +++ b/sound/core/init.c @@ -170,25 +170,28 @@ void snd_refmem_put(void *p) EXPORT_SYMBOL_GPL(snd_refmem_put);
/* the default release callback set in snd_device_initialize() below; - * this is just NOP for now, as almost all jobs are already done in - * dev_free callback of snd_device chain instead. + * unreference the memory here if it's specified at initialization */ static void default_release(struct device *dev) { + snd_refmem_put(dev_get_drvdata(dev)); }
/** * snd_device_initialize - Initialize struct device for sound devices * @dev: device to initialize * @card: card to assign, optional + * @refp: memory associated with snd_refmem */ -void snd_device_initialize(struct device *dev, struct snd_card *card) +void snd_device_initialize(struct device *dev, struct snd_card *card, void *refp) { device_initialize(dev); if (card) dev->parent = &card->card_dev; dev->class = &sound_class; dev->release = default_release; + dev_set_drvdata(dev, refp); + snd_refmem_get(refp); } EXPORT_SYMBOL_GPL(snd_device_initialize);
diff --git a/sound/core/pcm.c b/sound/core/pcm.c index 9d95e3731123..461a10cc0db9 100644 --- a/sound/core/pcm.c +++ b/sound/core/pcm.c @@ -650,7 +650,7 @@ int snd_pcm_new_stream(struct snd_pcm *pcm, int stream, int substream_count) if (!substream_count) return 0;
- snd_device_initialize(&pstr->dev, pcm->card); + snd_device_initialize(&pstr->dev, pcm->card, NULL); pstr->dev.groups = pcm_dev_attr_groups; pstr->dev.type = &pcm_dev_type; dev_set_name(&pstr->dev, "pcmC%iD%i%c", pcm->card->number, pcm->device, diff --git a/sound/core/rawmidi.c b/sound/core/rawmidi.c index 2d3cec908154..34f124b126ca 100644 --- a/sound/core/rawmidi.c +++ b/sound/core/rawmidi.c @@ -1906,7 +1906,7 @@ int snd_rawmidi_init(struct snd_rawmidi *rmidi, if (id != NULL) strscpy(rmidi->id, id, sizeof(rmidi->id));
- snd_device_initialize(&rmidi->dev, card); + snd_device_initialize(&rmidi->dev, card, NULL); rmidi->dev.release = release_rawmidi_device; if (rawmidi_is_ump(rmidi)) dev_set_name(&rmidi->dev, "umpC%iD%i", card->number, device); diff --git a/sound/core/seq/seq_clientmgr.c b/sound/core/seq/seq_clientmgr.c index e3f9ea67d019..66e73b35e57e 100644 --- a/sound/core/seq/seq_clientmgr.c +++ b/sound/core/seq/seq_clientmgr.c @@ -2730,7 +2730,7 @@ int __init snd_sequencer_device_init(void) { int err;
- snd_device_initialize(&seq_dev, NULL); + snd_device_initialize(&seq_dev, NULL, NULL); dev_set_name(&seq_dev, "seq");
mutex_lock(®ister_mutex); diff --git a/sound/core/timer.c b/sound/core/timer.c index 9d0d2a5c2e15..04e77a89ecb6 100644 --- a/sound/core/timer.c +++ b/sound/core/timer.c @@ -2311,7 +2311,7 @@ static int __init alsa_timer_init(void) { int err;
- snd_device_initialize(&timer_dev, NULL); + snd_device_initialize(&timer_dev, NULL, NULL); dev_set_name(&timer_dev, "timer");
#ifdef SNDRV_OSS_INFO_DEV_TIMERS
Use refmem allocation for the PCM object that holds two PCM devices (for playback and capture). This fixes the UAF bug by the delayed kobj release.
Signed-off-by: Takashi Iwai tiwai@suse.de --- sound/core/pcm.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/sound/core/pcm.c b/sound/core/pcm.c index 461a10cc0db9..1e96437f3f0e 100644 --- a/sound/core/pcm.c +++ b/sound/core/pcm.c @@ -650,7 +650,7 @@ int snd_pcm_new_stream(struct snd_pcm *pcm, int stream, int substream_count) if (!substream_count) return 0;
- snd_device_initialize(&pstr->dev, pcm->card, NULL); + snd_device_initialize(&pstr->dev, pcm->card, pcm); pstr->dev.groups = pcm_dev_attr_groups; pstr->dev.type = &pcm_dev_type; dev_set_name(&pstr->dev, "pcmC%iD%i%c", pcm->card->number, pcm->device, @@ -721,7 +721,7 @@ static int _snd_pcm_new(struct snd_card *card, const char *id, int device, return -ENXIO; if (rpcm) *rpcm = NULL; - pcm = kzalloc(sizeof(*pcm), GFP_KERNEL); + pcm = snd_refmem_alloc(sizeof(*pcm), card); if (!pcm) return -ENOMEM; pcm->card = card; @@ -872,7 +872,7 @@ static int snd_pcm_free(struct snd_pcm *pcm) snd_pcm_lib_preallocate_free_for_all(pcm); snd_pcm_free_stream(&pcm->streams[SNDRV_PCM_STREAM_PLAYBACK]); snd_pcm_free_stream(&pcm->streams[SNDRV_PCM_STREAM_CAPTURE]); - kfree(pcm); + snd_refmem_put(pcm); return 0; }
Add the reference to the card object at control device initialization. This fixes the potential UAF by the delayed kobj release.
Signed-off-by: Takashi Iwai tiwai@suse.de --- sound/core/control.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/core/control.c b/sound/core/control.c index 5b9340f5cb8c..7ac077b57709 100644 --- a/sound/core/control.c +++ b/sound/core/control.c @@ -2395,7 +2395,7 @@ int snd_ctl_create(struct snd_card *card) if (snd_BUG_ON(card->number < 0 || card->number >= SNDRV_CARDS)) return -ENXIO;
- snd_device_initialize(&card->ctl_dev, card, NULL); + snd_device_initialize(&card->ctl_dev, card, card); dev_set_name(&card->ctl_dev, "controlC%d", card->number);
err = snd_device_new(card, SNDRV_DEV_CONTROL, card, &ops);
Add the reference to the card object at compress device initialization. This fixes the potential UAF by the delayed kobj release.
Signed-off-by: Takashi Iwai tiwai@suse.de --- sound/core/compress_offload.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/core/compress_offload.c b/sound/core/compress_offload.c index d91fa8925cde..971f6d9f5906 100644 --- a/sound/core/compress_offload.c +++ b/sound/core/compress_offload.c @@ -1189,7 +1189,7 @@ int snd_compress_new(struct snd_card *card, int device,
snd_compress_set_id(compr, id);
- snd_device_initialize(&compr->dev, card, NULL); + snd_device_initialize(&compr->dev, card, card); dev_set_name(&compr->dev, "comprC%iD%i", card->number, device);
ret = snd_device_new(card, SNDRV_DEV_COMPRESS, compr, &ops);
It's just a RFC and only lightly tested.
Thanks for the series
I will be hammering this in my test setup for next several hours
I myself am not sure whether this is the best way to go. It might be better to take Curtis' approach, just converting the device to its own allocation, too. (But I don't know whether Curtis' patch set covers all cases -- can still be a UAF of card_dev due to devres vs kobj release?)
My original commit does not cover the devres kobj release race, only the race among the kobj themselves.
Curtis
On Mon, Aug 7, 2023 at 3:34 PM Curtis Malainey cujomalainey@google.com wrote:
It's just a RFC and only lightly tested.
Thanks for the series
I will be hammering this in my test setup for next several hours
Testing has yielded 0 bugs overnight.
After discussion it seems like this might be more of a workaround for the APIs than properly using them. Adding Stephen for more input but having two kobj in the same allocation is technically not correct as you essentially refcounting the same thing twice. Also having an empty release function essentially nullifies the purpose of the refcounts. We should probably consider something that uses the API as intended rather than trying to fight their function.
Curtis
Curtis
I myself am not sure whether this is the best way to go. It might be better to take Curtis' approach, just converting the device to its own allocation, too. (But I don't know whether Curtis' patch set covers all cases -- can still be a UAF of card_dev due to devres vs kobj release?)
My original commit does not cover the devres kobj release race, only the race among the kobj themselves.
Curtis
On Tue, 08 Aug 2023 21:26:55 +0200, Curtis Malainey wrote:
On Mon, Aug 7, 2023 at 3:34 PM Curtis Malainey cujomalainey@google.com wrote:
It's just a RFC and only lightly tested.
Thanks for the series
I will be hammering this in my test setup for next several hours
Testing has yielded 0 bugs overnight.
After discussion it seems like this might be more of a workaround for the APIs than properly using them. Adding Stephen for more input but having two kobj in the same allocation is technically not correct as you essentially refcounting the same thing twice. Also having an empty release function essentially nullifies the purpose of the refcounts. We should probably consider something that uses the API as intended rather than trying to fight their function.
Moving each PCM device and control device to own object and properly release in the own device release could be another way to go.
OTOH, I'm still wondering whether how to assure synchronization if all device releases are done asynchronously. If there are some dependencies between the resources (e.g. taking the parent's lock) at release, and how can it be guaranteed to work? Or, the release calls must not touch anything outside its own? If so, we'll still need to two places to finish the stuff: quiesce and release.
thanks,
Takashi
On Wed, 09 Aug 2023 10:10:04 +0200, Takashi Iwai wrote:
On Tue, 08 Aug 2023 21:26:55 +0200, Curtis Malainey wrote:
On Mon, Aug 7, 2023 at 3:34 PM Curtis Malainey cujomalainey@google.com wrote:
It's just a RFC and only lightly tested.
Thanks for the series
I will be hammering this in my test setup for next several hours
Testing has yielded 0 bugs overnight.
After discussion it seems like this might be more of a workaround for the APIs than properly using them. Adding Stephen for more input but having two kobj in the same allocation is technically not correct as you essentially refcounting the same thing twice. Also having an empty release function essentially nullifies the purpose of the refcounts. We should probably consider something that uses the API as intended rather than trying to fight their function.
Moving each PCM device and control device to own object and properly release in the own device release could be another way to go.
OTOH, I'm still wondering whether how to assure synchronization if all device releases are done asynchronously. If there are some dependencies between the resources (e.g. taking the parent's lock) at release, and how can it be guaranteed to work? Or, the release calls must not touch anything outside its own? If so, we'll still need to two places to finish the stuff: quiesce and release.
And now looking back at kobj code and device code, they do refcount parent objects. Maybe the problem is in our side -- the all devices are created with the original real device as the parent, including the card_dev, while there are some dependencies among children. So, if we build up a proper tree, pci_dev -> card_dev -> ctl_dev, pcm_dev, etc, one of the problems could be solved. It's more or less similar as what I suggested initially (referring card_dev at pcm), while changing the parent would make it implicitly.
Takashi
And now looking back at kobj code and device code, they do refcount parent objects. Maybe the problem is in our side -- the all devices are created with the original real device as the parent, including the card_dev, while there are some dependencies among children. So, if we build up a proper tree, pci_dev -> card_dev -> ctl_dev, pcm_dev, etc, one of the problems could be solved. It's more or less similar as what I suggested initially (referring card_dev at pcm), while changing the parent would make it implicitly.
Yes I think this would be the long term proper way to go, that way parents just put children and remove their reference, then they cleanup on their own time making everyone happy. My first patch was a very lazy attempt that, if we wanted to do the right thing we would obviously have to split the structs and free functions to operate in their own release. If you have time feel free to take another swing at the patches, otherwise I won't be able to start until next week.
Curtis
On Wed, 09 Aug 2023 23:11:45 +0200, Curtis Malainey wrote:
And now looking back at kobj code and device code, they do refcount parent objects. Maybe the problem is in our side -- the all devices are created with the original real device as the parent, including the card_dev, while there are some dependencies among children. So, if we build up a proper tree, pci_dev -> card_dev -> ctl_dev, pcm_dev, etc, one of the problems could be solved. It's more or less similar as what I suggested initially (referring card_dev at pcm), while changing the parent would make it implicitly.
Yes I think this would be the long term proper way to go, that way parents just put children and remove their reference, then they cleanup on their own time making everyone happy. My first patch was a very lazy attempt that, if we wanted to do the right thing we would obviously have to split the structs and free functions to operate in their own release. If you have time feel free to take another swing at the patches, otherwise I won't be able to start until next week.
Now looking back at the problem again, I noticed that actually my previous comment was wrong: as default, the device dependencies aren't kept at the release time, but it's already cleared at device_del() call. The device_del() calls kobject_del() and put_device(parent). So after this moment, both device releases become independent, and it'll hit a problem if the released object has still some dependency (such as the case of card vs ctl_dev in our case).
An extra dependency to card_dev as I put in my early patch would "fix" it. But, there is yet another problem: the call of dev_free call for snd_device object with SNDRV_DEV_LOWLEVEL can happen before releasing PCM and other devices when the delayed kobj release is enabled. And, usually this callback does release the top-level resources, which might be still accessed during the other releases.
So, if we tie the object resource with each struct device release, we have a lot of works: 1. Add extra dependencies among device hierarchy 2. Don't use card_dev refcount for managing the sync to device closes, introduce another kref instead; otherwise card_dev refcount would never reach to zero 3. Fix race of devres vs card_dev release 4. Move the second half part of snd_card_do_free() to the release callback of card_dev itself to sync with the top-level release 5. Rewrite all SNDRV_DEV_LOWLEVEL usages to be called via card->private_free or such; maybe the only problem is hda_intel.c and hda_tegra.c that need some work at the disconnection time, and we may introduce another hook in the card object to replace that
And, at this moment, I feel that it'd be easier to go back to the early way of device management, i.e. it'll be just like your patch, managing the device object independently from the rest resources. (This means also that the way freeing the resource for hwdep and rawmidi will go back again without the embedded device, too; they also suffer from the same problem of SNDRV_DEV_LOWLEVEL.)
The change 2 and 3 above can be still applied with your change, which will fix the remaining devres-vs-card_dev problem.
Once after fixing the current problem, we may work further on other stuff (e.g. item 5), so that we can switch again to the device-release model eventually later, too.
Takashi
On Sun, Aug 13, 2023 at 1:08 AM Takashi Iwai tiwai@suse.de wrote:
On Wed, 09 Aug 2023 23:11:45 +0200, Curtis Malainey wrote:
And now looking back at kobj code and device code, they do refcount parent objects. Maybe the problem is in our side -- the all devices are created with the original real device as the parent, including the card_dev, while there are some dependencies among children. So, if we build up a proper tree, pci_dev -> card_dev -> ctl_dev, pcm_dev, etc, one of the problems could be solved. It's more or less similar as what I suggested initially (referring card_dev at pcm), while changing the parent would make it implicitly.
Yes I think this would be the long term proper way to go, that way parents just put children and remove their reference, then they cleanup on their own time making everyone happy. My first patch was a very lazy attempt that, if we wanted to do the right thing we would obviously have to split the structs and free functions to operate in their own release. If you have time feel free to take another swing at the patches, otherwise I won't be able to start until next week.
Now looking back at the problem again, I noticed that actually my previous comment was wrong: as default, the device dependencies aren't kept at the release time, but it's already cleared at device_del() call. The device_del() calls kobject_del() and put_device(parent). So after this moment, both device releases become independent, and it'll hit a problem if the released object has still some dependency (such as the case of card vs ctl_dev in our case).
An extra dependency to card_dev as I put in my early patch would "fix" it. But, there is yet another problem: the call of dev_free call for snd_device object with SNDRV_DEV_LOWLEVEL can happen before releasing PCM and other devices when the delayed kobj release is enabled. And, usually this callback does release the top-level resources, which might be still accessed during the other releases.
I was doing some testing late last week that confirms these fears actually. Basically the rig was opening a non-blocking PCM and just never sending data, removing the hw sound device (in this case a USB peripheral) but not closing the userspace app. As a result kobjects were released out of order with the top level "sound" being released at disconnect of the USB device, but the actual card not being released until the app closed. See annotated logs below
[ 690.528577] kobject: 'sound' (00000000266cd308): kobject_add_internal: parent: '3-9:1.0', set: '(null)' <----- plug in device [ 690.528607] kobject: 'card1' (00000000f7fa0903): kobject_add_internal: parent: 'sound', set: 'devices' [ 690.528732] kobject: 'card1' (00000000f7fa0903): kobject_uevent_env [ 690.528756] kobject: 'card1' (00000000f7fa0903): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1' [ 690.528988] kobject: 'pcmC1D0p' (0000000095ff4473): kobject_add_internal: parent: 'card1', set: 'devices' [ 690.529640] kobject: 'pcmC1D0p' (0000000095ff4473): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/pcmC1D0p' [ 690.529778] kobject: 'pcmC1D0c' (00000000c4879d24): kobject_add_internal: parent: 'card1', set: 'devices' [ 690.530006] kobject: 'pcmC1D0c' (00000000c4879d24): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/pcmC1D0c' [ 690.530108] kobject: 'controlC1' (00000000a0a25449): kobject_add_internal: parent: 'card1', set: 'devices' [ 690.530373] kobject: 'controlC1' (00000000a0a25449): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/controlC1' [ 690.671889] kobject: 'card1' (00000000f7fa0903): kobject_uevent_env [ 690.671906] kobject: 'card1' (00000000f7fa0903): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1' [ 700.009244] kobject: 'controlC1' (00000000a0a25449): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/controlC1' [ 700.010131] kobject: 'pcmC1D0p' (0000000095ff4473): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/pcmC1D0p' [ 700.011344] kobject: 'pcmC1D0c' (00000000c4879d24): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/pcmC1D0c' [ 700.012916] kobject: 'card1' (00000000f7fa0903): kobject_uevent_env [ 700.012951] kobject: 'card1' (00000000f7fa0903): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1' <---- start blocked playback here [ 700.013057] kobject: 'sound' (00000000266cd308): kobject_release, parent 0000000000000000 (delayed 1000) <--- unplug usb device [ 701.054221] kobject: 'sound' (00000000266cd308): kobject_cleanup, parent 0000000000000000 [ 701.054236] kobject: 'sound' (00000000266cd308): calling ktype release [ 701.054257] kobject: 'sound': free name [ 713.639843] kobject: 'card1' (00000000f7fa0903): kobject_release, parent 0000000000000000 (delayed 3000) <--- Send EOF to playback stream [ 716.669776] kobject: 'card1' (00000000f7fa0903): kobject_cleanup, parent 0000000000000000 [ 716.669810] kobject: 'card1' (00000000f7fa0903): calling ktype release [ 716.670834] kobject: 'card1': free name
So, if we tie the object resource with each struct device release, we have a lot of works:
- Add extra dependencies among device hierarchy
- Don't use card_dev refcount for managing the sync to device closes, introduce another kref instead; otherwise card_dev refcount would never reach to zero
- Fix race of devres vs card_dev release
- Move the second half part of snd_card_do_free() to the release callback of card_dev itself to sync with the top-level release
- Rewrite all SNDRV_DEV_LOWLEVEL usages to be called via card->private_free or such; maybe the only problem is hda_intel.c and hda_tegra.c that need some work at the disconnection time, and we may introduce another hook in the card object to replace that
And, at this moment, I feel that it'd be easier to go back to the early way of device management, i.e. it'll be just like your patch, managing the device object independently from the rest resources. (This means also that the way freeing the resource for hwdep and rawmidi will go back again without the embedded device, too; they also suffer from the same problem of .)
I agree, I think as a simple stopgap, my earlier patch would at least appease the test until we can figure out the best way to do some heavier work on the kobj. I think the proxy pointer for devres would also be the best short term for 3.
The change 2 and 3 above can be still applied with your change, which will fix the remaining devres-vs-card_dev problem.
I am not sure I follow the need for 2. If we broke ctl_dev out into its own memory region and structured everything as a proper tree we would have a proper cleanup and be able to use the refcounting properly.
Once after fixing the current problem, we may work further on other stuff (e.g. item 5), so that we can switch again to the device-release model eventually later, too.
Agreed, I don't have any experience with SNDRV_DEV_LOWLEVEL but I am happy to help out here where I can.
I am going to see if I can split the release card as mentioned but also have refcount work as expected and have the release calls roll up the tree.
Curtis
Takashi
On Mon, 14 Aug 2023 22:20:29 +0200, Curtis Malainey wrote:
On Sun, Aug 13, 2023 at 1:08 AM Takashi Iwai tiwai@suse.de wrote:
On Wed, 09 Aug 2023 23:11:45 +0200, Curtis Malainey wrote:
And now looking back at kobj code and device code, they do refcount parent objects. Maybe the problem is in our side -- the all devices are created with the original real device as the parent, including the card_dev, while there are some dependencies among children. So, if we build up a proper tree, pci_dev -> card_dev -> ctl_dev, pcm_dev, etc, one of the problems could be solved. It's more or less similar as what I suggested initially (referring card_dev at pcm), while changing the parent would make it implicitly.
Yes I think this would be the long term proper way to go, that way parents just put children and remove their reference, then they cleanup on their own time making everyone happy. My first patch was a very lazy attempt that, if we wanted to do the right thing we would obviously have to split the structs and free functions to operate in their own release. If you have time feel free to take another swing at the patches, otherwise I won't be able to start until next week.
Now looking back at the problem again, I noticed that actually my previous comment was wrong: as default, the device dependencies aren't kept at the release time, but it's already cleared at device_del() call. The device_del() calls kobject_del() and put_device(parent). So after this moment, both device releases become independent, and it'll hit a problem if the released object has still some dependency (such as the case of card vs ctl_dev in our case).
An extra dependency to card_dev as I put in my early patch would "fix" it. But, there is yet another problem: the call of dev_free call for snd_device object with SNDRV_DEV_LOWLEVEL can happen before releasing PCM and other devices when the delayed kobj release is enabled. And, usually this callback does release the top-level resources, which might be still accessed during the other releases.
I was doing some testing late last week that confirms these fears actually. Basically the rig was opening a non-blocking PCM and just never sending data, removing the hw sound device (in this case a USB peripheral) but not closing the userspace app. As a result kobjects were released out of order with the top level "sound" being released at disconnect of the USB device, but the actual card not being released until the app closed. See annotated logs below
[ 690.528577] kobject: 'sound' (00000000266cd308): kobject_add_internal: parent: '3-9:1.0', set: '(null)' <----- plug in device [ 690.528607] kobject: 'card1' (00000000f7fa0903): kobject_add_internal: parent: 'sound', set: 'devices' [ 690.528732] kobject: 'card1' (00000000f7fa0903): kobject_uevent_env [ 690.528756] kobject: 'card1' (00000000f7fa0903): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1' [ 690.528988] kobject: 'pcmC1D0p' (0000000095ff4473): kobject_add_internal: parent: 'card1', set: 'devices' [ 690.529640] kobject: 'pcmC1D0p' (0000000095ff4473): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/pcmC1D0p' [ 690.529778] kobject: 'pcmC1D0c' (00000000c4879d24): kobject_add_internal: parent: 'card1', set: 'devices' [ 690.530006] kobject: 'pcmC1D0c' (00000000c4879d24): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/pcmC1D0c' [ 690.530108] kobject: 'controlC1' (00000000a0a25449): kobject_add_internal: parent: 'card1', set: 'devices' [ 690.530373] kobject: 'controlC1' (00000000a0a25449): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/controlC1' [ 690.671889] kobject: 'card1' (00000000f7fa0903): kobject_uevent_env [ 690.671906] kobject: 'card1' (00000000f7fa0903): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1' [ 700.009244] kobject: 'controlC1' (00000000a0a25449): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/controlC1' [ 700.010131] kobject: 'pcmC1D0p' (0000000095ff4473): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/pcmC1D0p' [ 700.011344] kobject: 'pcmC1D0c' (00000000c4879d24): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/pcmC1D0c' [ 700.012916] kobject: 'card1' (00000000f7fa0903): kobject_uevent_env [ 700.012951] kobject: 'card1' (00000000f7fa0903): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1' <---- start blocked playback here [ 700.013057] kobject: 'sound' (00000000266cd308): kobject_release, parent 0000000000000000 (delayed 1000) <--- unplug usb device [ 701.054221] kobject: 'sound' (00000000266cd308): kobject_cleanup, parent 0000000000000000 [ 701.054236] kobject: 'sound' (00000000266cd308): calling ktype release [ 701.054257] kobject: 'sound': free name [ 713.639843] kobject: 'card1' (00000000f7fa0903): kobject_release, parent 0000000000000000 (delayed 3000) <--- Send EOF to playback stream [ 716.669776] kobject: 'card1' (00000000f7fa0903): kobject_cleanup, parent 0000000000000000 [ 716.669810] kobject: 'card1' (00000000f7fa0903): calling ktype release [ 716.670834] kobject: 'card1': free name
So, if we tie the object resource with each struct device release, we have a lot of works:
- Add extra dependencies among device hierarchy
- Don't use card_dev refcount for managing the sync to device closes, introduce another kref instead; otherwise card_dev refcount would never reach to zero
- Fix race of devres vs card_dev release
- Move the second half part of snd_card_do_free() to the release callback of card_dev itself to sync with the top-level release
- Rewrite all SNDRV_DEV_LOWLEVEL usages to be called via card->private_free or such; maybe the only problem is hda_intel.c and hda_tegra.c that need some work at the disconnection time, and we may introduce another hook in the card object to replace that
And, at this moment, I feel that it'd be easier to go back to the early way of device management, i.e. it'll be just like your patch, managing the device object independently from the rest resources. (This means also that the way freeing the resource for hwdep and rawmidi will go back again without the embedded device, too; they also suffer from the same problem of .)
I agree, I think as a simple stopgap, my earlier patch would at least appease the test until we can figure out the best way to do some heavier work on the kobj. I think the proxy pointer for devres would also be the best short term for 3.
The change 2 and 3 above can be still applied with your change, which will fix the remaining devres-vs-card_dev problem.
I am not sure I follow the need for 2. If we broke ctl_dev out into its own memory region and structured everything as a proper tree we would have a proper cleanup and be able to use the refcounting properly.
My thought was about the devres release that does kfree() of the card while the card's card_dev release itself is still delayed. This might be a needless fear, though, as snd_card_free() should sync with the actual card_dev release.
But, splitting the release-trigger and the actual memory release could be still worth.
Once after fixing the current problem, we may work further on other stuff (e.g. item 5), so that we can switch again to the device-release model eventually later, too.
Agreed, I don't have any experience with SNDRV_DEV_LOWLEVEL but I am happy to help out here where I can.
I am going to see if I can split the release card as mentioned but also have refcount work as expected and have the release calls roll up the tree.
I quickly worked on and made a patch series. It's put in topic/dev-split branch of sound git tree https://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git/log/?h=topic...
It'd be appreciated if you can review / test it.
I'll submit the patch set once (likely in tomorrow) after running a quick smoke test.
thanks,
Takashi
On Tue, Aug 15, 2023 at 9:07 AM Takashi Iwai tiwai@suse.de wrote:
On Mon, 14 Aug 2023 22:20:29 +0200, Curtis Malainey wrote:
On Sun, Aug 13, 2023 at 1:08 AM Takashi Iwai tiwai@suse.de wrote:
On Wed, 09 Aug 2023 23:11:45 +0200, Curtis Malainey wrote:
And now looking back at kobj code and device code, they do refcount parent objects. Maybe the problem is in our side -- the all devices are created with the original real device as the parent, including the card_dev, while there are some dependencies among children. So, if we build up a proper tree, pci_dev -> card_dev -> ctl_dev, pcm_dev, etc, one of the problems could be solved. It's more or less similar as what I suggested initially (referring card_dev at pcm), while changing the parent would make it implicitly.
Yes I think this would be the long term proper way to go, that way parents just put children and remove their reference, then they cleanup on their own time making everyone happy. My first patch was a very lazy attempt that, if we wanted to do the right thing we would obviously have to split the structs and free functions to operate in their own release. If you have time feel free to take another swing at the patches, otherwise I won't be able to start until next week.
Now looking back at the problem again, I noticed that actually my previous comment was wrong: as default, the device dependencies aren't kept at the release time, but it's already cleared at device_del() call. The device_del() calls kobject_del() and put_device(parent). So after this moment, both device releases become independent, and it'll hit a problem if the released object has still some dependency (such as the case of card vs ctl_dev in our case).
An extra dependency to card_dev as I put in my early patch would "fix" it. But, there is yet another problem: the call of dev_free call for snd_device object with SNDRV_DEV_LOWLEVEL can happen before releasing PCM and other devices when the delayed kobj release is enabled. And, usually this callback does release the top-level resources, which might be still accessed during the other releases.
I was doing some testing late last week that confirms these fears actually. Basically the rig was opening a non-blocking PCM and just never sending data, removing the hw sound device (in this case a USB peripheral) but not closing the userspace app. As a result kobjects were released out of order with the top level "sound" being released at disconnect of the USB device, but the actual card not being released until the app closed. See annotated logs below
[ 690.528577] kobject: 'sound' (00000000266cd308): kobject_add_internal: parent: '3-9:1.0', set: '(null)' <----- plug in device [ 690.528607] kobject: 'card1' (00000000f7fa0903): kobject_add_internal: parent: 'sound', set: 'devices' [ 690.528732] kobject: 'card1' (00000000f7fa0903): kobject_uevent_env [ 690.528756] kobject: 'card1' (00000000f7fa0903): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1' [ 690.528988] kobject: 'pcmC1D0p' (0000000095ff4473): kobject_add_internal: parent: 'card1', set: 'devices' [ 690.529640] kobject: 'pcmC1D0p' (0000000095ff4473): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/pcmC1D0p' [ 690.529778] kobject: 'pcmC1D0c' (00000000c4879d24): kobject_add_internal: parent: 'card1', set: 'devices' [ 690.530006] kobject: 'pcmC1D0c' (00000000c4879d24): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/pcmC1D0c' [ 690.530108] kobject: 'controlC1' (00000000a0a25449): kobject_add_internal: parent: 'card1', set: 'devices' [ 690.530373] kobject: 'controlC1' (00000000a0a25449): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/controlC1' [ 690.671889] kobject: 'card1' (00000000f7fa0903): kobject_uevent_env [ 690.671906] kobject: 'card1' (00000000f7fa0903): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1' [ 700.009244] kobject: 'controlC1' (00000000a0a25449): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/controlC1' [ 700.010131] kobject: 'pcmC1D0p' (0000000095ff4473): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/pcmC1D0p' [ 700.011344] kobject: 'pcmC1D0c' (00000000c4879d24): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/pcmC1D0c' [ 700.012916] kobject: 'card1' (00000000f7fa0903): kobject_uevent_env [ 700.012951] kobject: 'card1' (00000000f7fa0903): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1' <---- start blocked playback here [ 700.013057] kobject: 'sound' (00000000266cd308): kobject_release, parent 0000000000000000 (delayed 1000) <--- unplug usb device [ 701.054221] kobject: 'sound' (00000000266cd308): kobject_cleanup, parent 0000000000000000 [ 701.054236] kobject: 'sound' (00000000266cd308): calling ktype release [ 701.054257] kobject: 'sound': free name [ 713.639843] kobject: 'card1' (00000000f7fa0903): kobject_release, parent 0000000000000000 (delayed 3000) <--- Send EOF to playback stream [ 716.669776] kobject: 'card1' (00000000f7fa0903): kobject_cleanup, parent 0000000000000000 [ 716.669810] kobject: 'card1' (00000000f7fa0903): calling ktype release [ 716.670834] kobject: 'card1': free name
So, if we tie the object resource with each struct device release, we have a lot of works:
- Add extra dependencies among device hierarchy
- Don't use card_dev refcount for managing the sync to device closes, introduce another kref instead; otherwise card_dev refcount would never reach to zero
- Fix race of devres vs card_dev release
- Move the second half part of snd_card_do_free() to the release callback of card_dev itself to sync with the top-level release
- Rewrite all SNDRV_DEV_LOWLEVEL usages to be called via card->private_free or such; maybe the only problem is hda_intel.c and hda_tegra.c that need some work at the disconnection time, and we may introduce another hook in the card object to replace that
And, at this moment, I feel that it'd be easier to go back to the early way of device management, i.e. it'll be just like your patch, managing the device object independently from the rest resources. (This means also that the way freeing the resource for hwdep and rawmidi will go back again without the embedded device, too; they also suffer from the same problem of .)
I agree, I think as a simple stopgap, my earlier patch would at least appease the test until we can figure out the best way to do some heavier work on the kobj. I think the proxy pointer for devres would also be the best short term for 3.
The change 2 and 3 above can be still applied with your change, which will fix the remaining devres-vs-card_dev problem.
I am not sure I follow the need for 2. If we broke ctl_dev out into its own memory region and structured everything as a proper tree we would have a proper cleanup and be able to use the refcounting properly.
My thought was about the devres release that does kfree() of the card while the card's card_dev release itself is still delayed. This might be a needless fear, though, as snd_card_free() should sync with the actual card_dev release.
But, splitting the release-trigger and the actual memory release could be still worth.
Once after fixing the current problem, we may work further on other stuff (e.g. item 5), so that we can switch again to the device-release model eventually later, too.
Agreed, I don't have any experience with SNDRV_DEV_LOWLEVEL but I am happy to help out here where I can.
I am going to see if I can split the release card as mentioned but also have refcount work as expected and have the release calls roll up the tree.
I quickly worked on and made a patch series. It's put in topic/dev-split branch of sound git tree https://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git/log/?h=topic...
It'd be appreciated if you can review / test it.
Took a look and ran it through the tests
You need to apply this diff
diff --git a/sound/usb/media.c b/sound/usb/media.c index 6d11fedb46326..d48db6f3ae659 100644 --- a/sound/usb/media.c +++ b/sound/usb/media.c @@ -35,7 +35,7 @@ int snd_media_stream_init(struct snd_usb_substream *subs, struct snd_pcm *pcm, { struct media_device *mdev; struct media_ctl *mctl; - struct device *pcm_dev = &pcm->streams[stream].dev; + struct device *pcm_dev = pcm->streams[stream].dev; u32 intf_type; int ret = 0; u16 mixer_pad;
Hammering probe and remove appears to be fine. Went 45min without issue.
Userspace holding references past hw removal appears to still be broken as sound is released while the app is still running.
-- remove usb device -- [ 4819.827476] kobject: 'controlC1' (00000000255a51c8): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/controlC1' [ 4819.828114] kobject: 'pcmC1D0p' (00000000f0627532): kobject_uevent_env [ 4819.828145] kobject: 'pcmC1D0p' (00000000f0627532): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/pcmC1D0p' [ 4819.828822] kobject: 'pcmC1D0c' (000000001b707a15): kobject_uevent_env [ 4819.828850] kobject: 'pcmC1D0c' (000000001b707a15): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/pcmC1D0c' [ 4819.829405] kobject: 'card1' (000000005bce975e): kobject_uevent_env [ 4819.829428] kobject: 'card1' (000000005bce975e): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1' [ 4819.829516] kobject: 'sound' (000000000bb52434): kobject_release, parent 0000000000000000 (delayed 4000) [ 4823.873625] kobject: 'sound' (000000000bb52434): kobject_cleanup, parent 0000000000000000 [ 4823.873645] kobject: 'sound' (000000000bb52434): calling ktype release [ 4823.873654] kobject: 'sound': free name
-- end app -- [ 4849.581815] kobject: 'pcmC1D0p' (00000000f0627532): kobject_release, parent 0000000000000000 (delayed 2000) [ 4849.581960] kobject: 'pcmC1D0c' (000000001b707a15): kobject_release, parent 0000000000000000 (delayed 2000) [ 4849.582626] kobject: 'card1' (000000005bce975e): kobject_release, parent 0000000000000000 (delayed 1000) [ 4850.625615] kobject: 'card1' (000000005bce975e): kobject_cleanup, parent 0000000000000000 [ 4850.625638] kobject: 'card1' (000000005bce975e): calling ktype release [ 4850.625663] kobject: 'card1': free name [ 4851.585647] kobject: 'pcmC1D0c' (000000001b707a15): kobject_cleanup, parent 0000000000000000 [ 4851.585672] kobject: 'pcmC1D0c' (000000001b707a15): calling ktype release [ 4851.585708] kobject: 'pcmC1D0c': free name [ 4851.585727] kobject: 'pcmC1D0p' (00000000f0627532): kobject_cleanup, parent 0000000000000000 [ 4851.585737] kobject: 'pcmC1D0p' (00000000f0627532): calling ktype release [ 4851.585752] kobject: 'pcmC1D0p': free name
I still don't understand why you need the kref. The devices are already reference counting, why not use them? If we split them up into their own structs we could then just device_put everything on removal and let it roll up the tree with releases automatically, blocking where userspace is still holding references. I will share a patches sometime this week of what I mean. They will probably be a bit bigger blast radius but I think its what is needed here.
Curtis
I'll submit the patch set once (likely in tomorrow) after running a quick smoke test.
thanks,
Takashi
On Tue, 15 Aug 2023 23:32:31 +0200, Curtis Malainey wrote:
On Tue, Aug 15, 2023 at 9:07 AM Takashi Iwai tiwai@suse.de wrote:
On Mon, 14 Aug 2023 22:20:29 +0200, Curtis Malainey wrote:
On Sun, Aug 13, 2023 at 1:08 AM Takashi Iwai tiwai@suse.de wrote:
On Wed, 09 Aug 2023 23:11:45 +0200, Curtis Malainey wrote:
And now looking back at kobj code and device code, they do refcount parent objects. Maybe the problem is in our side -- the all devices are created with the original real device as the parent, including the card_dev, while there are some dependencies among children. So, if we build up a proper tree, pci_dev -> card_dev -> ctl_dev, pcm_dev, etc, one of the problems could be solved. It's more or less similar as what I suggested initially (referring card_dev at pcm), while changing the parent would make it implicitly.
Yes I think this would be the long term proper way to go, that way parents just put children and remove their reference, then they cleanup on their own time making everyone happy. My first patch was a very lazy attempt that, if we wanted to do the right thing we would obviously have to split the structs and free functions to operate in their own release. If you have time feel free to take another swing at the patches, otherwise I won't be able to start until next week.
Now looking back at the problem again, I noticed that actually my previous comment was wrong: as default, the device dependencies aren't kept at the release time, but it's already cleared at device_del() call. The device_del() calls kobject_del() and put_device(parent). So after this moment, both device releases become independent, and it'll hit a problem if the released object has still some dependency (such as the case of card vs ctl_dev in our case).
An extra dependency to card_dev as I put in my early patch would "fix" it. But, there is yet another problem: the call of dev_free call for snd_device object with SNDRV_DEV_LOWLEVEL can happen before releasing PCM and other devices when the delayed kobj release is enabled. And, usually this callback does release the top-level resources, which might be still accessed during the other releases.
I was doing some testing late last week that confirms these fears actually. Basically the rig was opening a non-blocking PCM and just never sending data, removing the hw sound device (in this case a USB peripheral) but not closing the userspace app. As a result kobjects were released out of order with the top level "sound" being released at disconnect of the USB device, but the actual card not being released until the app closed. See annotated logs below
[ 690.528577] kobject: 'sound' (00000000266cd308): kobject_add_internal: parent: '3-9:1.0', set: '(null)' <----- plug in device [ 690.528607] kobject: 'card1' (00000000f7fa0903): kobject_add_internal: parent: 'sound', set: 'devices' [ 690.528732] kobject: 'card1' (00000000f7fa0903): kobject_uevent_env [ 690.528756] kobject: 'card1' (00000000f7fa0903): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1' [ 690.528988] kobject: 'pcmC1D0p' (0000000095ff4473): kobject_add_internal: parent: 'card1', set: 'devices' [ 690.529640] kobject: 'pcmC1D0p' (0000000095ff4473): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/pcmC1D0p' [ 690.529778] kobject: 'pcmC1D0c' (00000000c4879d24): kobject_add_internal: parent: 'card1', set: 'devices' [ 690.530006] kobject: 'pcmC1D0c' (00000000c4879d24): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/pcmC1D0c' [ 690.530108] kobject: 'controlC1' (00000000a0a25449): kobject_add_internal: parent: 'card1', set: 'devices' [ 690.530373] kobject: 'controlC1' (00000000a0a25449): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/controlC1' [ 690.671889] kobject: 'card1' (00000000f7fa0903): kobject_uevent_env [ 690.671906] kobject: 'card1' (00000000f7fa0903): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1' [ 700.009244] kobject: 'controlC1' (00000000a0a25449): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/controlC1' [ 700.010131] kobject: 'pcmC1D0p' (0000000095ff4473): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/pcmC1D0p' [ 700.011344] kobject: 'pcmC1D0c' (00000000c4879d24): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/pcmC1D0c' [ 700.012916] kobject: 'card1' (00000000f7fa0903): kobject_uevent_env [ 700.012951] kobject: 'card1' (00000000f7fa0903): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1' <---- start blocked playback here [ 700.013057] kobject: 'sound' (00000000266cd308): kobject_release, parent 0000000000000000 (delayed 1000) <--- unplug usb device [ 701.054221] kobject: 'sound' (00000000266cd308): kobject_cleanup, parent 0000000000000000 [ 701.054236] kobject: 'sound' (00000000266cd308): calling ktype release [ 701.054257] kobject: 'sound': free name [ 713.639843] kobject: 'card1' (00000000f7fa0903): kobject_release, parent 0000000000000000 (delayed 3000) <--- Send EOF to playback stream [ 716.669776] kobject: 'card1' (00000000f7fa0903): kobject_cleanup, parent 0000000000000000 [ 716.669810] kobject: 'card1' (00000000f7fa0903): calling ktype release [ 716.670834] kobject: 'card1': free name
So, if we tie the object resource with each struct device release, we have a lot of works:
- Add extra dependencies among device hierarchy
- Don't use card_dev refcount for managing the sync to device closes, introduce another kref instead; otherwise card_dev refcount would never reach to zero
- Fix race of devres vs card_dev release
- Move the second half part of snd_card_do_free() to the release callback of card_dev itself to sync with the top-level release
- Rewrite all SNDRV_DEV_LOWLEVEL usages to be called via card->private_free or such; maybe the only problem is hda_intel.c and hda_tegra.c that need some work at the disconnection time, and we may introduce another hook in the card object to replace that
And, at this moment, I feel that it'd be easier to go back to the early way of device management, i.e. it'll be just like your patch, managing the device object independently from the rest resources. (This means also that the way freeing the resource for hwdep and rawmidi will go back again without the embedded device, too; they also suffer from the same problem of .)
I agree, I think as a simple stopgap, my earlier patch would at least appease the test until we can figure out the best way to do some heavier work on the kobj. I think the proxy pointer for devres would also be the best short term for 3.
The change 2 and 3 above can be still applied with your change, which will fix the remaining devres-vs-card_dev problem.
I am not sure I follow the need for 2. If we broke ctl_dev out into its own memory region and structured everything as a proper tree we would have a proper cleanup and be able to use the refcounting properly.
My thought was about the devres release that does kfree() of the card while the card's card_dev release itself is still delayed. This might be a needless fear, though, as snd_card_free() should sync with the actual card_dev release.
But, splitting the release-trigger and the actual memory release could be still worth.
Once after fixing the current problem, we may work further on other stuff (e.g. item 5), so that we can switch again to the device-release model eventually later, too.
Agreed, I don't have any experience with SNDRV_DEV_LOWLEVEL but I am happy to help out here where I can.
I am going to see if I can split the release card as mentioned but also have refcount work as expected and have the release calls roll up the tree.
I quickly worked on and made a patch series. It's put in topic/dev-split branch of sound git tree https://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git/log/?h=topic...
It'd be appreciated if you can review / test it.
Took a look and ran it through the tests
You need to apply this diff
Thanks, I'll fix it up.
diff --git a/sound/usb/media.c b/sound/usb/media.c index 6d11fedb46326..d48db6f3ae659 100644 --- a/sound/usb/media.c +++ b/sound/usb/media.c @@ -35,7 +35,7 @@ int snd_media_stream_init(struct snd_usb_substream *subs, struct snd_pcm *pcm, { struct media_device *mdev; struct media_ctl *mctl;
struct device *pcm_dev = &pcm->streams[stream].dev;
struct device *pcm_dev = pcm->streams[stream].dev; u32 intf_type; int ret = 0; u16 mixer_pad;
Hammering probe and remove appears to be fine. Went 45min without issue.
Userspace holding references past hw removal appears to still be broken as sound is released while the app is still running.
-- remove usb device -- [ 4819.827476] kobject: 'controlC1' (00000000255a51c8): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/controlC1' [ 4819.828114] kobject: 'pcmC1D0p' (00000000f0627532): kobject_uevent_env [ 4819.828145] kobject: 'pcmC1D0p' (00000000f0627532): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/pcmC1D0p' [ 4819.828822] kobject: 'pcmC1D0c' (000000001b707a15): kobject_uevent_env [ 4819.828850] kobject: 'pcmC1D0c' (000000001b707a15): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/pcmC1D0c' [ 4819.829405] kobject: 'card1' (000000005bce975e): kobject_uevent_env [ 4819.829428] kobject: 'card1' (000000005bce975e): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1' [ 4819.829516] kobject: 'sound' (000000000bb52434): kobject_release, parent 0000000000000000 (delayed 4000) [ 4823.873625] kobject: 'sound' (000000000bb52434): kobject_cleanup, parent 0000000000000000 [ 4823.873645] kobject: 'sound' (000000000bb52434): calling ktype release [ 4823.873654] kobject: 'sound': free name
-- end app -- [ 4849.581815] kobject: 'pcmC1D0p' (00000000f0627532): kobject_release, parent 0000000000000000 (delayed 2000) [ 4849.581960] kobject: 'pcmC1D0c' (000000001b707a15): kobject_release, parent 0000000000000000 (delayed 2000) [ 4849.582626] kobject: 'card1' (000000005bce975e): kobject_release, parent 0000000000000000 (delayed 1000) [ 4850.625615] kobject: 'card1' (000000005bce975e): kobject_cleanup, parent 0000000000000000 [ 4850.625638] kobject: 'card1' (000000005bce975e): calling ktype release [ 4850.625663] kobject: 'card1': free name [ 4851.585647] kobject: 'pcmC1D0c' (000000001b707a15): kobject_cleanup, parent 0000000000000000 [ 4851.585672] kobject: 'pcmC1D0c' (000000001b707a15): calling ktype release [ 4851.585708] kobject: 'pcmC1D0c': free name [ 4851.585727] kobject: 'pcmC1D0p' (00000000f0627532): kobject_cleanup, parent 0000000000000000 [ 4851.585737] kobject: 'pcmC1D0p' (00000000f0627532): calling ktype release [ 4851.585752] kobject: 'pcmC1D0p': free name
It's the designed behavior. Those are device *files* that are deleted immediately at the disconnection while the application is still running. It's for avoiding a new application to be started after the disconnect. That is, only the device files in /dev/snd/* become invisible. Meanwhile, the already opened objects are still handled internally.
I still don't understand why you need the kref. The devices are already reference counting, why not use them? If we split them up into their own structs we could then just device_put everything on removal and let it roll up the tree with releases automatically, blocking where userspace is still holding references. I will share a patches sometime this week of what I mean. They will probably be a bit bigger blast radius but I think its what is needed here.
We want to trigger the top-level release free procedure once when all files are closed. This top-level release does put_device() of all belonging devices.
The card_dev device refcount was used for this purpose. OTOH, if we want to construct the topology of the devices until the actual deletion (i.e. keep card_dev until pcm and others are really released/deleted), the card_dev refcount will be used for managing the topology, too. So, it'll get a side-effect side-effect that the card_dev refcount won't be zero even after all files are closed (it's referred from the children).
So, it's a kind of preparation for the future.
Takashi
On Wed, 16 Aug 2023 07:35:52 +0200, Takashi Iwai wrote:
On Tue, 15 Aug 2023 23:32:31 +0200, Curtis Malainey wrote:
I still don't understand why you need the kref. The devices are already reference counting, why not use them? If we split them up into their own structs we could then just device_put everything on removal and let it roll up the tree with releases automatically, blocking where userspace is still holding references. I will share a patches sometime this week of what I mean. They will probably be a bit bigger blast radius but I think its what is needed here.
We want to trigger the top-level release free procedure once when all files are closed. This top-level release does put_device() of all belonging devices.
The card_dev device refcount was used for this purpose. OTOH, if we want to construct the topology of the devices until the actual deletion (i.e. keep card_dev until pcm and others are really released/deleted), the card_dev refcount will be used for managing the topology, too. So, it'll get a side-effect side-effect that the card_dev refcount won't be zero even after all files are closed (it's referred from the children).
So, it's a kind of preparation for the future.
That said, the first two patches are basically independent from the rest. We can apply the rest changes at first for addressing the existing UAF issues, then think of the further restructuring, too.
Takashi
On Tue, Aug 15, 2023 at 10:35 PM Takashi Iwai tiwai@suse.de wrote:
On Tue, 15 Aug 2023 23:32:31 +0200, Curtis Malainey wrote:
On Tue, Aug 15, 2023 at 9:07 AM Takashi Iwai tiwai@suse.de wrote:
On Mon, 14 Aug 2023 22:20:29 +0200, Curtis Malainey wrote:
On Sun, Aug 13, 2023 at 1:08 AM Takashi Iwai tiwai@suse.de wrote:
So, if we tie the object resource with each struct device release, we have a lot of works:
- Add extra dependencies among device hierarchy
- Don't use card_dev refcount for managing the sync to device closes, introduce another kref instead; otherwise card_dev refcount would never reach to zero
- Fix race of devres vs card_dev release
- Move the second half part of snd_card_do_free() to the release callback of card_dev itself to sync with the top-level release
- Rewrite all SNDRV_DEV_LOWLEVEL usages to be called via card->private_free or such; maybe the only problem is hda_intel.c and hda_tegra.c that need some work at the disconnection time, and we may introduce another hook in the card object to replace that
And, at this moment, I feel that it'd be easier to go back to the early way of device management, i.e. it'll be just like your patch, managing the device object independently from the rest resources. (This means also that the way freeing the resource for hwdep and rawmidi will go back again without the embedded device, too; they also suffer from the same problem of .)
I agree, I think as a simple stopgap, my earlier patch would at least appease the test until we can figure out the best way to do some heavier work on the kobj. I think the proxy pointer for devres would also be the best short term for 3.
The change 2 and 3 above can be still applied with your change, which will fix the remaining devres-vs-card_dev problem.
I am not sure I follow the need for 2. If we broke ctl_dev out into its own memory region and structured everything as a proper tree we would have a proper cleanup and be able to use the refcounting properly.
My thought was about the devres release that does kfree() of the card while the card's card_dev release itself is still delayed. This might be a needless fear, though, as snd_card_free() should sync with the actual card_dev release.
But, splitting the release-trigger and the actual memory release could be still worth.
Once after fixing the current problem, we may work further on other stuff (e.g. item 5), so that we can switch again to the device-release model eventually later, too.
Agreed, I don't have any experience with SNDRV_DEV_LOWLEVEL but I am happy to help out here where I can.
I am going to see if I can split the release card as mentioned but also have refcount work as expected and have the release calls roll up the tree.
I quickly worked on and made a patch series. It's put in topic/dev-split branch of sound git tree https://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git/log/?h=topic...
It'd be appreciated if you can review / test it.
Took a look and ran it through the tests
You need to apply this diff
Thanks, I'll fix it up.
diff --git a/sound/usb/media.c b/sound/usb/media.c index 6d11fedb46326..d48db6f3ae659 100644 --- a/sound/usb/media.c +++ b/sound/usb/media.c @@ -35,7 +35,7 @@ int snd_media_stream_init(struct snd_usb_substream *subs, struct snd_pcm *pcm, { struct media_device *mdev; struct media_ctl *mctl;
struct device *pcm_dev = &pcm->streams[stream].dev;
struct device *pcm_dev = pcm->streams[stream].dev; u32 intf_type; int ret = 0; u16 mixer_pad;
Hammering probe and remove appears to be fine. Went 45min without issue.
Userspace holding references past hw removal appears to still be broken as sound is released while the app is still running.
-- remove usb device -- [ 4819.827476] kobject: 'controlC1' (00000000255a51c8): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/controlC1' [ 4819.828114] kobject: 'pcmC1D0p' (00000000f0627532): kobject_uevent_env [ 4819.828145] kobject: 'pcmC1D0p' (00000000f0627532): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/pcmC1D0p' [ 4819.828822] kobject: 'pcmC1D0c' (000000001b707a15): kobject_uevent_env [ 4819.828850] kobject: 'pcmC1D0c' (000000001b707a15): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/pcmC1D0c' [ 4819.829405] kobject: 'card1' (000000005bce975e): kobject_uevent_env [ 4819.829428] kobject: 'card1' (000000005bce975e): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1' [ 4819.829516] kobject: 'sound' (000000000bb52434): kobject_release, parent 0000000000000000 (delayed 4000) [ 4823.873625] kobject: 'sound' (000000000bb52434): kobject_cleanup, parent 0000000000000000 [ 4823.873645] kobject: 'sound' (000000000bb52434): calling ktype release [ 4823.873654] kobject: 'sound': free name
-- end app -- [ 4849.581815] kobject: 'pcmC1D0p' (00000000f0627532): kobject_release, parent 0000000000000000 (delayed 2000) [ 4849.581960] kobject: 'pcmC1D0c' (000000001b707a15): kobject_release, parent 0000000000000000 (delayed 2000) [ 4849.582626] kobject: 'card1' (000000005bce975e): kobject_release, parent 0000000000000000 (delayed 1000) [ 4850.625615] kobject: 'card1' (000000005bce975e): kobject_cleanup, parent 0000000000000000 [ 4850.625638] kobject: 'card1' (000000005bce975e): calling ktype release [ 4850.625663] kobject: 'card1': free name [ 4851.585647] kobject: 'pcmC1D0c' (000000001b707a15): kobject_cleanup, parent 0000000000000000 [ 4851.585672] kobject: 'pcmC1D0c' (000000001b707a15): calling ktype release [ 4851.585708] kobject: 'pcmC1D0c': free name [ 4851.585727] kobject: 'pcmC1D0p' (00000000f0627532): kobject_cleanup, parent 0000000000000000 [ 4851.585737] kobject: 'pcmC1D0p' (00000000f0627532): calling ktype release [ 4851.585752] kobject: 'pcmC1D0p': free name
It's the designed behavior. Those are device *files* that are deleted immediately at the disconnection while the application is still running. It's for avoiding a new application to be started after the disconnect. That is, only the device files in /dev/snd/* become invisible. Meanwhile, the already opened objects are still handled internally.
It seems incorrect from a refcounting perspective though. The device still has active children so we need to remove the entry but should not be triggering the release yet.
Based on the docs for kobject this is exactly how kobject_del behaves (remove from sysfs without dropping refcount) so it looks like we just need to fix the refcounting part.
I still don't understand why you need the kref. The devices are already reference counting, why not use them? If we split them up into their own structs we could then just device_put everything on removal and let it roll up the tree with releases automatically, blocking where userspace is still holding references. I will share a patches sometime this week of what I mean. They will probably be a bit bigger blast radius but I think its what is needed here.
We want to trigger the top-level release free procedure once when all files are closed. This top-level release does put_device() of all belonging devices.
Is there a need for this synchronization?
The card_dev device refcount was used for this purpose. OTOH, if we want to construct the topology of the devices until the actual deletion (i.e. keep card_dev until pcm and others are really released/deleted), the card_dev refcount will be used for managing the topology, too. So, it'll get a side-effect side-effect that the card_dev refcount won't be zero even after all files are closed (it's referred from the children).
I think this is the intended use case for this system and would make it much easier to free the code. This will even allow early partial removal of the card until user space lets go of whatever parts it's holding onto (e.g. PCM is still open but controls have cleaned up if there is no dependency on controls). It would allow for a lot of synchronization code to be removed and just let the card handle itself.
So, it's a kind of preparation for the future.
Takashi
On Wed, 16 Aug 2023 23:46:06 +0200, Curtis Malainey wrote:
On Tue, Aug 15, 2023 at 10:35 PM Takashi Iwai tiwai@suse.de wrote:
On Tue, 15 Aug 2023 23:32:31 +0200, Curtis Malainey wrote:
On Tue, Aug 15, 2023 at 9:07 AM Takashi Iwai tiwai@suse.de wrote:
On Mon, 14 Aug 2023 22:20:29 +0200, Curtis Malainey wrote:
On Sun, Aug 13, 2023 at 1:08 AM Takashi Iwai tiwai@suse.de wrote:
So, if we tie the object resource with each struct device release, we have a lot of works:
- Add extra dependencies among device hierarchy
- Don't use card_dev refcount for managing the sync to device closes, introduce another kref instead; otherwise card_dev refcount would never reach to zero
- Fix race of devres vs card_dev release
- Move the second half part of snd_card_do_free() to the release callback of card_dev itself to sync with the top-level release
- Rewrite all SNDRV_DEV_LOWLEVEL usages to be called via card->private_free or such; maybe the only problem is hda_intel.c and hda_tegra.c that need some work at the disconnection time, and we may introduce another hook in the card object to replace that
And, at this moment, I feel that it'd be easier to go back to the early way of device management, i.e. it'll be just like your patch, managing the device object independently from the rest resources. (This means also that the way freeing the resource for hwdep and rawmidi will go back again without the embedded device, too; they also suffer from the same problem of .)
I agree, I think as a simple stopgap, my earlier patch would at least appease the test until we can figure out the best way to do some heavier work on the kobj. I think the proxy pointer for devres would also be the best short term for 3.
The change 2 and 3 above can be still applied with your change, which will fix the remaining devres-vs-card_dev problem.
I am not sure I follow the need for 2. If we broke ctl_dev out into its own memory region and structured everything as a proper tree we would have a proper cleanup and be able to use the refcounting properly.
My thought was about the devres release that does kfree() of the card while the card's card_dev release itself is still delayed. This might be a needless fear, though, as snd_card_free() should sync with the actual card_dev release.
But, splitting the release-trigger and the actual memory release could be still worth.
Once after fixing the current problem, we may work further on other stuff (e.g. item 5), so that we can switch again to the device-release model eventually later, too.
Agreed, I don't have any experience with SNDRV_DEV_LOWLEVEL but I am happy to help out here where I can.
I am going to see if I can split the release card as mentioned but also have refcount work as expected and have the release calls roll up the tree.
I quickly worked on and made a patch series. It's put in topic/dev-split branch of sound git tree https://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git/log/?h=topic...
It'd be appreciated if you can review / test it.
Took a look and ran it through the tests
You need to apply this diff
Thanks, I'll fix it up.
diff --git a/sound/usb/media.c b/sound/usb/media.c index 6d11fedb46326..d48db6f3ae659 100644 --- a/sound/usb/media.c +++ b/sound/usb/media.c @@ -35,7 +35,7 @@ int snd_media_stream_init(struct snd_usb_substream *subs, struct snd_pcm *pcm, { struct media_device *mdev; struct media_ctl *mctl;
struct device *pcm_dev = &pcm->streams[stream].dev;
struct device *pcm_dev = pcm->streams[stream].dev; u32 intf_type; int ret = 0; u16 mixer_pad;
Hammering probe and remove appears to be fine. Went 45min without issue.
Userspace holding references past hw removal appears to still be broken as sound is released while the app is still running.
-- remove usb device -- [ 4819.827476] kobject: 'controlC1' (00000000255a51c8): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/controlC1' [ 4819.828114] kobject: 'pcmC1D0p' (00000000f0627532): kobject_uevent_env [ 4819.828145] kobject: 'pcmC1D0p' (00000000f0627532): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/pcmC1D0p' [ 4819.828822] kobject: 'pcmC1D0c' (000000001b707a15): kobject_uevent_env [ 4819.828850] kobject: 'pcmC1D0c' (000000001b707a15): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/pcmC1D0c' [ 4819.829405] kobject: 'card1' (000000005bce975e): kobject_uevent_env [ 4819.829428] kobject: 'card1' (000000005bce975e): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1' [ 4819.829516] kobject: 'sound' (000000000bb52434): kobject_release, parent 0000000000000000 (delayed 4000) [ 4823.873625] kobject: 'sound' (000000000bb52434): kobject_cleanup, parent 0000000000000000 [ 4823.873645] kobject: 'sound' (000000000bb52434): calling ktype release [ 4823.873654] kobject: 'sound': free name
-- end app -- [ 4849.581815] kobject: 'pcmC1D0p' (00000000f0627532): kobject_release, parent 0000000000000000 (delayed 2000) [ 4849.581960] kobject: 'pcmC1D0c' (000000001b707a15): kobject_release, parent 0000000000000000 (delayed 2000) [ 4849.582626] kobject: 'card1' (000000005bce975e): kobject_release, parent 0000000000000000 (delayed 1000) [ 4850.625615] kobject: 'card1' (000000005bce975e): kobject_cleanup, parent 0000000000000000 [ 4850.625638] kobject: 'card1' (000000005bce975e): calling ktype release [ 4850.625663] kobject: 'card1': free name [ 4851.585647] kobject: 'pcmC1D0c' (000000001b707a15): kobject_cleanup, parent 0000000000000000 [ 4851.585672] kobject: 'pcmC1D0c' (000000001b707a15): calling ktype release [ 4851.585708] kobject: 'pcmC1D0c': free name [ 4851.585727] kobject: 'pcmC1D0p' (00000000f0627532): kobject_cleanup, parent 0000000000000000 [ 4851.585737] kobject: 'pcmC1D0p' (00000000f0627532): calling ktype release [ 4851.585752] kobject: 'pcmC1D0p': free name
It's the designed behavior. Those are device *files* that are deleted immediately at the disconnection while the application is still running. It's for avoiding a new application to be started after the disconnect. That is, only the device files in /dev/snd/* become invisible. Meanwhile, the already opened objects are still handled internally.
It seems incorrect from a refcounting perspective though. The device still has active children so we need to remove the entry but should not be triggering the release yet.
Do you mean the release of 'sound'? It's a sound class, and by calling device_del(), all children are gone at the disconnection, so it must be fine. All other calls are only about device file deletion, hence no release happened until the end of app.
Based on the docs for kobject this is exactly how kobject_del behaves (remove from sysfs without dropping refcount) so it looks like we just need to fix the refcounting part.
device_del() decreases the refcount of the *parent*, although it doesn't touch the refcount of the device it self. So, after device_del(), the tree hierarchy is gone, and that's one of the problems that makes things difficult.
I still don't understand why you need the kref. The devices are already reference counting, why not use them? If we split them up into their own structs we could then just device_put everything on removal and let it roll up the tree with releases automatically, blocking where userspace is still holding references. I will share a patches sometime this week of what I mean. They will probably be a bit bigger blast radius but I think its what is needed here.
We want to trigger the top-level release free procedure once when all files are closed. This top-level release does put_device() of all belonging devices.
Is there a need for this synchronization?
Sort of, yes.
The card_dev device refcount was used for this purpose. OTOH, if we want to construct the topology of the devices until the actual deletion (i.e. keep card_dev until pcm and others are really released/deleted), the card_dev refcount will be used for managing the topology, too. So, it'll get a side-effect side-effect that the card_dev refcount won't be zero even after all files are closed (it's referred from the children).
I think this is the intended use case for this system and would make it much easier to free the code. This will even allow early partial removal of the card until user space lets go of whatever parts it's holding onto (e.g. PCM is still open but controls have cleaned up if there is no dependency on controls).
And, that's the problem. If the control is cleared before PCM, PCM release may hit another UAF, as it tries to the associated channel map and other controls. And, for deleting a PCM or a control, it'll touch the card's lock or linked list, so it still relies on the parent. The release of each component can't be done alone, as there are some dependencies.
It would allow for a lot of synchronization code to be removed and just let the card handle itself.
I guess it's doable to rewrite many things, but the handling will be still complex. Imagine you'd need to deal with a dynamic removal of a PCM device while the system is in use (e.g. when a codec is disconnected). This deletion procedure must access the top-level card stuff for locking or such, so the dependency is still present depending on the situation.
Takashi
On Wed, Aug 16, 2023 at 11:21 PM Takashi Iwai tiwai@suse.de wrote:
On Wed, 16 Aug 2023 23:46:06 +0200, Curtis Malainey wrote:
On Tue, Aug 15, 2023 at 10:35 PM Takashi Iwai tiwai@suse.de wrote:
On Tue, 15 Aug 2023 23:32:31 +0200, Curtis Malainey wrote:
On Tue, Aug 15, 2023 at 9:07 AM Takashi Iwai tiwai@suse.de wrote:
On Mon, 14 Aug 2023 22:20:29 +0200, Curtis Malainey wrote:
On Sun, Aug 13, 2023 at 1:08 AM Takashi Iwai tiwai@suse.de wrote: > > > > So, if we tie the object resource with each struct device release, we > have a lot of works: > 1. Add extra dependencies among device hierarchy > 2. Don't use card_dev refcount for managing the sync to device closes, > introduce another kref instead; otherwise card_dev refcount would > never reach to zero > 3. Fix race of devres vs card_dev release > 4. Move the second half part of snd_card_do_free() to the release > callback of card_dev itself to sync with the top-level release > 5. Rewrite all SNDRV_DEV_LOWLEVEL usages to be called via > card->private_free or such; > maybe the only problem is hda_intel.c and hda_tegra.c that need > some work at the disconnection time, and we may introduce another > hook in the card object to replace that > > And, at this moment, I feel that it'd be easier to go back to the > early way of device management, i.e. it'll be just like your patch, > managing the device object independently from the rest resources. > (This means also that the way freeing the resource for hwdep and > rawmidi will go back again without the embedded device, too; they > also suffer from the same problem of .)
I agree, I think as a simple stopgap, my earlier patch would at least appease the test until we can figure out the best way to do some heavier work on the kobj. I think the proxy pointer for devres would also be the best short term for 3.
> > The change 2 and 3 above can be still applied with your change, which > will fix the remaining devres-vs-card_dev problem.
I am not sure I follow the need for 2. If we broke ctl_dev out into its own memory region and structured everything as a proper tree we would have a proper cleanup and be able to use the refcounting properly.
My thought was about the devres release that does kfree() of the card while the card's card_dev release itself is still delayed. This might be a needless fear, though, as snd_card_free() should sync with the actual card_dev release.
But, splitting the release-trigger and the actual memory release could be still worth.
> Once after fixing the current problem, we may work further on other > stuff (e.g. item 5), so that we can switch again to the device-release > model eventually later, too.
Agreed, I don't have any experience with SNDRV_DEV_LOWLEVEL but I am happy to help out here where I can.
I am going to see if I can split the release card as mentioned but also have refcount work as expected and have the release calls roll up the tree.
I quickly worked on and made a patch series. It's put in topic/dev-split branch of sound git tree https://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git/log/?h=topic...
It'd be appreciated if you can review / test it.
Took a look and ran it through the tests
You need to apply this diff
Thanks, I'll fix it up.
diff --git a/sound/usb/media.c b/sound/usb/media.c index 6d11fedb46326..d48db6f3ae659 100644 --- a/sound/usb/media.c +++ b/sound/usb/media.c @@ -35,7 +35,7 @@ int snd_media_stream_init(struct snd_usb_substream *subs, struct snd_pcm *pcm, { struct media_device *mdev; struct media_ctl *mctl;
struct device *pcm_dev = &pcm->streams[stream].dev;
struct device *pcm_dev = pcm->streams[stream].dev; u32 intf_type; int ret = 0; u16 mixer_pad;
Hammering probe and remove appears to be fine. Went 45min without issue.
Userspace holding references past hw removal appears to still be broken as sound is released while the app is still running.
-- remove usb device -- [ 4819.827476] kobject: 'controlC1' (00000000255a51c8): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/controlC1' [ 4819.828114] kobject: 'pcmC1D0p' (00000000f0627532): kobject_uevent_env [ 4819.828145] kobject: 'pcmC1D0p' (00000000f0627532): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/pcmC1D0p' [ 4819.828822] kobject: 'pcmC1D0c' (000000001b707a15): kobject_uevent_env [ 4819.828850] kobject: 'pcmC1D0c' (000000001b707a15): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/pcmC1D0c' [ 4819.829405] kobject: 'card1' (000000005bce975e): kobject_uevent_env [ 4819.829428] kobject: 'card1' (000000005bce975e): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1' [ 4819.829516] kobject: 'sound' (000000000bb52434): kobject_release, parent 0000000000000000 (delayed 4000) [ 4823.873625] kobject: 'sound' (000000000bb52434): kobject_cleanup, parent 0000000000000000 [ 4823.873645] kobject: 'sound' (000000000bb52434): calling ktype release [ 4823.873654] kobject: 'sound': free name
-- end app -- [ 4849.581815] kobject: 'pcmC1D0p' (00000000f0627532): kobject_release, parent 0000000000000000 (delayed 2000) [ 4849.581960] kobject: 'pcmC1D0c' (000000001b707a15): kobject_release, parent 0000000000000000 (delayed 2000) [ 4849.582626] kobject: 'card1' (000000005bce975e): kobject_release, parent 0000000000000000 (delayed 1000) [ 4850.625615] kobject: 'card1' (000000005bce975e): kobject_cleanup, parent 0000000000000000 [ 4850.625638] kobject: 'card1' (000000005bce975e): calling ktype release [ 4850.625663] kobject: 'card1': free name [ 4851.585647] kobject: 'pcmC1D0c' (000000001b707a15): kobject_cleanup, parent 0000000000000000 [ 4851.585672] kobject: 'pcmC1D0c' (000000001b707a15): calling ktype release [ 4851.585708] kobject: 'pcmC1D0c': free name [ 4851.585727] kobject: 'pcmC1D0p' (00000000f0627532): kobject_cleanup, parent 0000000000000000 [ 4851.585737] kobject: 'pcmC1D0p' (00000000f0627532): calling ktype release [ 4851.585752] kobject: 'pcmC1D0p': free name
It's the designed behavior. Those are device *files* that are deleted immediately at the disconnection while the application is still running. It's for avoiding a new application to be started after the disconnect. That is, only the device files in /dev/snd/* become invisible. Meanwhile, the already opened objects are still handled internally.
It seems incorrect from a refcounting perspective though. The device still has active children so we need to remove the entry but should not be triggering the release yet.
Do you mean the release of 'sound'? It's a sound class, and by calling device_del(), all children are gone at the disconnection, so it must be fine. All other calls are only about device file deletion, hence no release happened until the end of app.
Thanks for the clarification. Having a parent disappear is very strange compared to how I would expect the system to clean up. Removal makes sense but I was not expecting the release on the parent.
Based on the docs for kobject this is exactly how kobject_del behaves (remove from sysfs without dropping refcount) so it looks like we just need to fix the refcounting part.
device_del() decreases the refcount of the *parent*, although it doesn't touch the refcount of the device it self. So, after device_del(), the tree hierarchy is gone, and that's one of the problems that makes things difficult.
Ah and that is what I missed in the call. Thanks for the pointer. Although that seems bizarre, because won't we double put our parent on release as well as part of device_put->kobject_cleanup once we are actually released?
I still don't understand why you need the kref. The devices are already reference counting, why not use them? If we split them up into their own structs we could then just device_put everything on removal and let it roll up the tree with releases automatically, blocking where userspace is still holding references. I will share a patches sometime this week of what I mean. They will probably be a bit bigger blast radius but I think its what is needed here.
We want to trigger the top-level release free procedure once when all files are closed. This top-level release does put_device() of all belonging devices.
Is there a need for this synchronization?
Sort of, yes.
The card_dev device refcount was used for this purpose. OTOH, if we want to construct the topology of the devices until the actual deletion (i.e. keep card_dev until pcm and others are really released/deleted), the card_dev refcount will be used for managing the topology, too. So, it'll get a side-effect side-effect that the card_dev refcount won't be zero even after all files are closed (it's referred from the children).
I think this is the intended use case for this system and would make it much easier to free the code. This will even allow early partial removal of the card until user space lets go of whatever parts it's holding onto (e.g. PCM is still open but controls have cleaned up if there is no dependency on controls).
And, that's the problem. If the control is cleared before PCM, PCM release may hit another UAF, as it tries to the associated channel map and other controls. And, for deleting a PCM or a control, it'll touch the card's lock or linked list, so it still relies on the parent. The release of each component can't be done alone, as there are some dependencies.
Makes sense. I guess that raises the question, are there any cyclic dependencies in the graph? If yes then this indeed would be a lot tougher compared to just letting leaves release in order.
It would allow for a lot of synchronization code to be removed and just let the card handle itself.
I guess it's doable to rewrite many things, but the handling will be still complex. Imagine you'd need to deal with a dynamic removal of a PCM device while the system is in use (e.g. when a codec is disconnected). This deletion procedure must access the top-level card stuff for locking or such, so the dependency is still present depending on the situation.
Makes sense. Yea the parent model doesn't work amazingly here does it for synchronized release. Makes me think maybe we could just put a barrier in the release functions so we know they have all been put to 0.
Takashi
On Thu, Aug 17, 2023 at 10:25 AM Curtis Malainey cujomalainey@google.com wrote:
On Wed, Aug 16, 2023 at 11:21 PM Takashi Iwai tiwai@suse.de wrote:
On Wed, 16 Aug 2023 23:46:06 +0200, Curtis Malainey wrote:
On Tue, Aug 15, 2023 at 10:35 PM Takashi Iwai tiwai@suse.de wrote:
On Tue, 15 Aug 2023 23:32:31 +0200, Curtis Malainey wrote:
On Tue, Aug 15, 2023 at 9:07 AM Takashi Iwai tiwai@suse.de wrote:
On Mon, 14 Aug 2023 22:20:29 +0200, Curtis Malainey wrote: > > On Sun, Aug 13, 2023 at 1:08 AM Takashi Iwai tiwai@suse.de wrote: > > > > > > > > So, if we tie the object resource with each struct device release, we > > have a lot of works: > > 1. Add extra dependencies among device hierarchy > > 2. Don't use card_dev refcount for managing the sync to device closes, > > introduce another kref instead; otherwise card_dev refcount would > > never reach to zero > > 3. Fix race of devres vs card_dev release > > 4. Move the second half part of snd_card_do_free() to the release > > callback of card_dev itself to sync with the top-level release > > 5. Rewrite all SNDRV_DEV_LOWLEVEL usages to be called via > > card->private_free or such; > > maybe the only problem is hda_intel.c and hda_tegra.c that need > > some work at the disconnection time, and we may introduce another > > hook in the card object to replace that > > > > And, at this moment, I feel that it'd be easier to go back to the > > early way of device management, i.e. it'll be just like your patch, > > managing the device object independently from the rest resources. > > (This means also that the way freeing the resource for hwdep and > > rawmidi will go back again without the embedded device, too; they > > also suffer from the same problem of .) > > I agree, I think as a simple stopgap, my earlier patch would at least > appease the test until we can figure out the best way to do some > heavier work on the kobj. I think the proxy pointer for devres would > also be the best short term for 3. > > > > > The change 2 and 3 above can be still applied with your change, which > > will fix the remaining devres-vs-card_dev problem. > > I am not sure I follow the need for 2. If we broke ctl_dev out into > its own memory region and structured everything as a proper tree we > would have a proper cleanup and be able to use the refcounting > properly.
My thought was about the devres release that does kfree() of the card while the card's card_dev release itself is still delayed. This might be a needless fear, though, as snd_card_free() should sync with the actual card_dev release.
But, splitting the release-trigger and the actual memory release could be still worth.
> > Once after fixing the current problem, we may work further on other > > stuff (e.g. item 5), so that we can switch again to the device-release > > model eventually later, too. > > Agreed, I don't have any experience with SNDRV_DEV_LOWLEVEL but I am > happy to help out here where I can. > > I am going to see if I can split the release card as mentioned but > also have refcount work as expected and have the release calls roll up > the tree.
I quickly worked on and made a patch series. It's put in topic/dev-split branch of sound git tree https://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git/log/?h=topic...
It'd be appreciated if you can review / test it.
Took a look and ran it through the tests
You need to apply this diff
Thanks, I'll fix it up.
diff --git a/sound/usb/media.c b/sound/usb/media.c index 6d11fedb46326..d48db6f3ae659 100644 --- a/sound/usb/media.c +++ b/sound/usb/media.c @@ -35,7 +35,7 @@ int snd_media_stream_init(struct snd_usb_substream *subs, struct snd_pcm *pcm, { struct media_device *mdev; struct media_ctl *mctl;
struct device *pcm_dev = &pcm->streams[stream].dev;
struct device *pcm_dev = pcm->streams[stream].dev; u32 intf_type; int ret = 0; u16 mixer_pad;
Hammering probe and remove appears to be fine. Went 45min without issue.
Userspace holding references past hw removal appears to still be broken as sound is released while the app is still running.
-- remove usb device -- [ 4819.827476] kobject: 'controlC1' (00000000255a51c8): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/controlC1' [ 4819.828114] kobject: 'pcmC1D0p' (00000000f0627532): kobject_uevent_env [ 4819.828145] kobject: 'pcmC1D0p' (00000000f0627532): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/pcmC1D0p' [ 4819.828822] kobject: 'pcmC1D0c' (000000001b707a15): kobject_uevent_env [ 4819.828850] kobject: 'pcmC1D0c' (000000001b707a15): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1/pcmC1D0c' [ 4819.829405] kobject: 'card1' (000000005bce975e): kobject_uevent_env [ 4819.829428] kobject: 'card1' (000000005bce975e): fill_kobj_path: path = '/devices/pci0000:00/0000:00:14.0/usb3/3-9/3-9:1.0/sound/card1' [ 4819.829516] kobject: 'sound' (000000000bb52434): kobject_release, parent 0000000000000000 (delayed 4000) [ 4823.873625] kobject: 'sound' (000000000bb52434): kobject_cleanup, parent 0000000000000000 [ 4823.873645] kobject: 'sound' (000000000bb52434): calling ktype release [ 4823.873654] kobject: 'sound': free name
-- end app -- [ 4849.581815] kobject: 'pcmC1D0p' (00000000f0627532): kobject_release, parent 0000000000000000 (delayed 2000) [ 4849.581960] kobject: 'pcmC1D0c' (000000001b707a15): kobject_release, parent 0000000000000000 (delayed 2000) [ 4849.582626] kobject: 'card1' (000000005bce975e): kobject_release, parent 0000000000000000 (delayed 1000) [ 4850.625615] kobject: 'card1' (000000005bce975e): kobject_cleanup, parent 0000000000000000 [ 4850.625638] kobject: 'card1' (000000005bce975e): calling ktype release [ 4850.625663] kobject: 'card1': free name [ 4851.585647] kobject: 'pcmC1D0c' (000000001b707a15): kobject_cleanup, parent 0000000000000000 [ 4851.585672] kobject: 'pcmC1D0c' (000000001b707a15): calling ktype release [ 4851.585708] kobject: 'pcmC1D0c': free name [ 4851.585727] kobject: 'pcmC1D0p' (00000000f0627532): kobject_cleanup, parent 0000000000000000 [ 4851.585737] kobject: 'pcmC1D0p' (00000000f0627532): calling ktype release [ 4851.585752] kobject: 'pcmC1D0p': free name
It's the designed behavior. Those are device *files* that are deleted immediately at the disconnection while the application is still running. It's for avoiding a new application to be started after the disconnect. That is, only the device files in /dev/snd/* become invisible. Meanwhile, the already opened objects are still handled internally.
It seems incorrect from a refcounting perspective though. The device still has active children so we need to remove the entry but should not be triggering the release yet.
Do you mean the release of 'sound'? It's a sound class, and by calling device_del(), all children are gone at the disconnection, so it must be fine. All other calls are only about device file deletion, hence no release happened until the end of app.
Thanks for the clarification. Having a parent disappear is very strange compared to how I would expect the system to clean up. Removal makes sense but I was not expecting the release on the parent.
Based on the docs for kobject this is exactly how kobject_del behaves (remove from sysfs without dropping refcount) so it looks like we just need to fix the refcounting part.
device_del() decreases the refcount of the *parent*, although it doesn't touch the refcount of the device it self. So, after device_del(), the tree hierarchy is gone, and that's one of the problems that makes things difficult.
Ah and that is what I missed in the call. Thanks for the pointer. Although that seems bizarre, because won't we double put our parent on release as well as part of device_put->kobject_cleanup once we are actually released?
I think I must be missing a cleanup of the reference to the parent here, which would explain my logic error.
I still don't understand why you need the kref. The devices are already reference counting, why not use them? If we split them up into their own structs we could then just device_put everything on removal and let it roll up the tree with releases automatically, blocking where userspace is still holding references. I will share a patches sometime this week of what I mean. They will probably be a bit bigger blast radius but I think its what is needed here.
We want to trigger the top-level release free procedure once when all files are closed. This top-level release does put_device() of all belonging devices.
Is there a need for this synchronization?
Sort of, yes.
The card_dev device refcount was used for this purpose. OTOH, if we want to construct the topology of the devices until the actual deletion (i.e. keep card_dev until pcm and others are really released/deleted), the card_dev refcount will be used for managing the topology, too. So, it'll get a side-effect side-effect that the card_dev refcount won't be zero even after all files are closed (it's referred from the children).
I think this is the intended use case for this system and would make it much easier to free the code. This will even allow early partial removal of the card until user space lets go of whatever parts it's holding onto (e.g. PCM is still open but controls have cleaned up if there is no dependency on controls).
And, that's the problem. If the control is cleared before PCM, PCM release may hit another UAF, as it tries to the associated channel map and other controls. And, for deleting a PCM or a control, it'll touch the card's lock or linked list, so it still relies on the parent. The release of each component can't be done alone, as there are some dependencies.
Makes sense. I guess that raises the question, are there any cyclic dependencies in the graph? If yes then this indeed would be a lot tougher compared to just letting leaves release in order.
It would allow for a lot of synchronization code to be removed and just let the card handle itself.
I guess it's doable to rewrite many things, but the handling will be still complex. Imagine you'd need to deal with a dynamic removal of a PCM device while the system is in use (e.g. when a codec is disconnected). This deletion procedure must access the top-level card stuff for locking or such, so the dependency is still present depending on the situation.
Makes sense. Yea the parent model doesn't work amazingly here does it for synchronized release. Makes me think maybe we could just put a barrier in the release functions so we know they have all been put to 0.
Scratch the barrier idea, the more I think about it, this would vastly overcomplicate things.
I think your proposal is best that we just split out the devices as they are intended. Then just unbind and disconnect from userpace, then once everything is free call put on everything and let release clean it up in reverse.
Takashi
participants (2)
-
Curtis Malainey
-
Takashi Iwai