[PATCH] ALSA: control led: fix memory leak in snd_ctl_led_register
The snd_ctl_led_sysfs_add and snd_ctl_led_sysfs_remove should contain the refcount operations in pair. However, snd_ctl_led_sysfs_remove fails to decrease the refcount to zero, which causes device_release never to be invoked. This leads to memory leak to some resources, like struct device_private.
Fix this by calling put_device at the end of snd_ctl_led_sysfs_remove
Reported-by: syzbot+08a7d8b51ea048a74ffb@syzkaller.appspotmail.com Fixes: a135dfb5de1 ("ALSA: led control - add sysfs kcontrol LED marking layer") Signed-off-by: Dongliang Mu mudongliangabcd@gmail.com --- sound/core/control_led.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/sound/core/control_led.c b/sound/core/control_led.c index 25f57c14f294..fff2688b5019 100644 --- a/sound/core/control_led.c +++ b/sound/core/control_led.c @@ -371,6 +371,10 @@ static void snd_ctl_led_disconnect(struct snd_card *card) snd_ctl_led_refresh(); }
+static void snd_ctl_led_release(struct device *dev) +{ +} + /* * sysfs */ @@ -663,6 +667,7 @@ static void snd_ctl_led_sysfs_add(struct snd_card *card) led_card->number = card->number; led_card->led = led; device_initialize(&led_card->dev); + led_card->dev.release = snd_ctl_led_release; if (dev_set_name(&led_card->dev, "card%d", card->number) < 0) goto cerr; led_card->dev.parent = &led->dev; @@ -701,6 +706,7 @@ static void snd_ctl_led_sysfs_remove(struct snd_card *card) sysfs_remove_link(&card->ctl_dev.kobj, link_name); sysfs_remove_link(&led_card->dev.kobj, "card"); device_del(&led_card->dev); + put_device(&led_card->dev); kfree(led_card); led->cards[card->number] = NULL; }
On Fri, May 28, 2021 at 09:17:57PM +0800, Dongliang Mu wrote:
The snd_ctl_led_sysfs_add and snd_ctl_led_sysfs_remove should contain the refcount operations in pair. However, snd_ctl_led_sysfs_remove fails to decrease the refcount to zero, which causes device_release never to be invoked. This leads to memory leak to some resources, like struct device_private.
Fix this by calling put_device at the end of snd_ctl_led_sysfs_remove
Reported-by: syzbot+08a7d8b51ea048a74ffb@syzkaller.appspotmail.com Fixes: a135dfb5de1 ("ALSA: led control - add sysfs kcontrol LED marking layer") Signed-off-by: Dongliang Mu mudongliangabcd@gmail.com
sound/core/control_led.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/sound/core/control_led.c b/sound/core/control_led.c index 25f57c14f294..fff2688b5019 100644 --- a/sound/core/control_led.c +++ b/sound/core/control_led.c @@ -371,6 +371,10 @@ static void snd_ctl_led_disconnect(struct snd_card *card) snd_ctl_led_refresh(); }
+static void snd_ctl_led_release(struct device *dev) +{ +}
Whatever you're trying to do, adding a dummy function is never the answer.
regards, dan carpenter
-
On Fri, May 28, 2021 at 9:33 PM Dan Carpenter dan.carpenter@oracle.com wrote:
On Fri, May 28, 2021 at 09:17:57PM +0800, Dongliang Mu wrote:
The snd_ctl_led_sysfs_add and snd_ctl_led_sysfs_remove should contain the refcount operations in pair. However, snd_ctl_led_sysfs_remove fails to decrease the refcount to zero, which causes device_release never to be invoked. This leads to memory leak to some resources, like struct device_private.
Fix this by calling put_device at the end of snd_ctl_led_sysfs_remove
Reported-by: syzbot+08a7d8b51ea048a74ffb@syzkaller.appspotmail.com Fixes: a135dfb5de1 ("ALSA: led control - add sysfs kcontrol LED marking layer") Signed-off-by: Dongliang Mu mudongliangabcd@gmail.com
sound/core/control_led.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/sound/core/control_led.c b/sound/core/control_led.c index 25f57c14f294..fff2688b5019 100644 --- a/sound/core/control_led.c +++ b/sound/core/control_led.c @@ -371,6 +371,10 @@ static void snd_ctl_led_disconnect(struct snd_card *card) snd_ctl_led_refresh(); }
+static void snd_ctl_led_release(struct device *dev) +{ +}
Whatever you're trying to do, adding a dummy function is never the answer.
I see your point. This function I added is not to fix the root cause, but to fix an issue caused by the release function when the device is released.
The put_device is to fix the root cause(i.e., decrease the refcount to zero), however, the result is dev->p and kobject can be freed, but it will trigger a WARN [1] as it has no release method.
I don't know how to craft a release method for such a device. So this dummy function is generated following the default_release, also a dummy function.
Can you please give some advise on how to fix this WARN issue?
[1] https://elixir.bootlin.com/linux/latest/source/drivers/base/core.c#L2110
regards, dan carpenter
On Fri, May 28, 2021 at 09:50:49PM +0800, Dongliang Mu wrote:
Can you please give some advise on how to fix this WARN issue?
But it feels like it spoils the fun if I write the commit... Anyway:
regards, dan carpenter
diff --git a/sound/core/control_led.c b/sound/core/control_led.c index 25f57c14f294..dd357abc1b58 100644 --- a/sound/core/control_led.c +++ b/sound/core/control_led.c @@ -740,6 +740,7 @@ static int __init snd_ctl_led_init(void) for (; group > 0; group--) { led = &snd_ctl_leds[group - 1]; device_del(&led->dev); + device_put(&led->dev); } device_del(&snd_ctl_led_dev); return -ENOMEM; @@ -768,6 +769,7 @@ static void __exit snd_ctl_led_exit(void) for (group = 0; group < MAX_LED; group++) { led = &snd_ctl_leds[group]; device_del(&led->dev); + device_put(&led->dev); } device_del(&snd_ctl_led_dev); snd_ctl_led_clean(NULL);
On May 28, 2021, at 10:05 PM, Dan Carpenter dan.carpenter@oracle.com wrote:
On Fri, May 28, 2021 at 09:50:49PM +0800, Dongliang Mu wrote:
Can you please give some advise on how to fix this WARN issue?
But it feels like it spoils the fun if I write the commit... Anyway:
It’s fine. I am still in the learning process. It’s also good to learn experience by comparing your patch and my patch.
regards, dan carpenter
diff --git a/sound/core/control_led.c b/sound/core/control_led.c index 25f57c14f294..dd357abc1b58 100644 --- a/sound/core/control_led.c +++ b/sound/core/control_led.c @@ -740,6 +740,7 @@ static int __init snd_ctl_led_init(void) for (; group > 0; group--) { led = &snd_ctl_leds[group - 1]; device_del(&led->dev);
device_put(&led->dev); } device_del(&snd_ctl_led_dev); return -ENOMEM;
@@ -768,6 +769,7 @@ static void __exit snd_ctl_led_exit(void) for (group = 0; group < MAX_LED; group++) { led = &snd_ctl_leds[group]; device_del(&led->dev);
} device_del(&snd_ctl_led_dev); snd_ctl_led_clean(NULL);device_put(&led->dev);
Does this patch mean I should add device_put in the init and exit function other than snd_ctl_led_sysfs_remove? This will cause device_release bypass the release method checking?
On Sat, May 29, 2021 at 5:35 AM 慕冬亮 mudongliangabcd@gmail.com wrote:
On May 28, 2021, at 10:05 PM, Dan Carpenter dan.carpenter@oracle.com wrote:
On Fri, May 28, 2021 at 09:50:49PM +0800, Dongliang Mu wrote:
Can you please give some advise on how to fix this WARN issue?
But it feels like it spoils the fun if I write the commit... Anyway:
It’s fine. I am still in the learning process. It’s also good to learn experience by comparing your patch and my patch.
regards, dan carpenter
diff --git a/sound/core/control_led.c b/sound/core/control_led.c index 25f57c14f294..dd357abc1b58 100644 --- a/sound/core/control_led.c +++ b/sound/core/control_led.c @@ -740,6 +740,7 @@ static int __init snd_ctl_led_init(void) for (; group > 0; group--) { led = &snd_ctl_leds[group - 1]; device_del(&led->dev);
device_put(&led->dev); } device_del(&snd_ctl_led_dev); return -ENOMEM;
@@ -768,6 +769,7 @@ static void __exit snd_ctl_led_exit(void) for (group = 0; group < MAX_LED; group++) { led = &snd_ctl_leds[group]; device_del(&led->dev);
device_put(&led->dev); } device_del(&snd_ctl_led_dev); snd_ctl_led_clean(NULL);
Hi Dan,
I tried this patch, and it still triggers the memleak. My understanding is that the device object is already freed in the snd_ctl_led_sysfs_remove.
Does this patch mean I should add device_put in the init and exit function other than snd_ctl_led_sysfs_remove? This will cause device_release bypass the release method checking?
On Mon, May 31, 2021 at 11:03:36AM +0800, Dongliang Mu wrote:
On Sat, May 29, 2021 at 5:35 AM 慕冬亮 mudongliangabcd@gmail.com wrote:
On May 28, 2021, at 10:05 PM, Dan Carpenter dan.carpenter@oracle.com wrote:
On Fri, May 28, 2021 at 09:50:49PM +0800, Dongliang Mu wrote:
Can you please give some advise on how to fix this WARN issue?
But it feels like it spoils the fun if I write the commit... Anyway:
It’s fine. I am still in the learning process. It’s also good to learn experience by comparing your patch and my patch.
regards, dan carpenter
diff --git a/sound/core/control_led.c b/sound/core/control_led.c index 25f57c14f294..dd357abc1b58 100644 --- a/sound/core/control_led.c +++ b/sound/core/control_led.c @@ -740,6 +740,7 @@ static int __init snd_ctl_led_init(void) for (; group > 0; group--) { led = &snd_ctl_leds[group - 1]; device_del(&led->dev);
device_put(&led->dev); } device_del(&snd_ctl_led_dev); return -ENOMEM;
@@ -768,6 +769,7 @@ static void __exit snd_ctl_led_exit(void) for (group = 0; group < MAX_LED; group++) { led = &snd_ctl_leds[group]; device_del(&led->dev);
device_put(&led->dev); } device_del(&snd_ctl_led_dev); snd_ctl_led_clean(NULL);
Hi Dan,
I tried this patch, and it still triggers the memleak.
Did your patch fix the leak? Because my patch should have been equivalent except for it fixes an additional leak in the snd_ctl_led_init() error path.
My understanding is that the device object is already freed in the snd_ctl_led_sysfs_remove.
"Already freed"? Is it a memleak or is it a double free??? I probably should have read the syzbot email on this... But you didn't include a link to it or a reported-by tag so I don't have a way to look at the actual bug.
I did fix a bug, though... Just not the one from the report I guess. Please send a link to the bug report so I can look at that. ;)
regards, dan carpenter
On Mon, May 31, 2021 at 12:40 PM Dan Carpenter dan.carpenter@oracle.com wrote:
On Mon, May 31, 2021 at 11:03:36AM +0800, Dongliang Mu wrote:
On Sat, May 29, 2021 at 5:35 AM 慕冬亮 mudongliangabcd@gmail.com wrote:
On May 28, 2021, at 10:05 PM, Dan Carpenter dan.carpenter@oracle.com wrote:
On Fri, May 28, 2021 at 09:50:49PM +0800, Dongliang Mu wrote:
Can you please give some advise on how to fix this WARN issue?
But it feels like it spoils the fun if I write the commit... Anyway:
It’s fine. I am still in the learning process. It’s also good to learn experience by comparing your patch and my patch.
regards, dan carpenter
diff --git a/sound/core/control_led.c b/sound/core/control_led.c index 25f57c14f294..dd357abc1b58 100644 --- a/sound/core/control_led.c +++ b/sound/core/control_led.c @@ -740,6 +740,7 @@ static int __init snd_ctl_led_init(void) for (; group > 0; group--) { led = &snd_ctl_leds[group - 1]; device_del(&led->dev);
device_put(&led->dev); } device_del(&snd_ctl_led_dev); return -ENOMEM;
@@ -768,6 +769,7 @@ static void __exit snd_ctl_led_exit(void) for (group = 0; group < MAX_LED; group++) { led = &snd_ctl_leds[group]; device_del(&led->dev);
device_put(&led->dev); } device_del(&snd_ctl_led_dev); snd_ctl_led_clean(NULL);
Hi Dan,
I tried this patch, and it still triggers the memleak.
Did your patch fix the leak? Because my patch should have been equivalent except for it fixes an additional leak in the snd_ctl_led_init() error path.
The syzbot link is [1]. I have tested my patch in the syzbot dashboard and my local workspace.
I think the reason why your patch did not work should be led_card(struct snd_ctl_led_card) is already freed before returning in snd_ctl_led_sysfs_remove, rather than led(struct snd_ctl_led). See the implementation of snd_ctl_led_sysfs_remove for some details. Please correct me if I make any mistakes.
static void snd_ctl_led_sysfs_remove(struct snd_card *card) { unsigned int group; struct snd_ctl_led_card *led_card; struct snd_ctl_led *led; char link_name[32];
for (group = 0; group < MAX_LED; group++) { led = &snd_ctl_leds[group]; led_card = led->cards[card->number]; if (!led_card) continue; snprintf(link_name, sizeof(link_name), "led-%s", led->name); sysfs_remove_link(&card->ctl_dev.kobj, link_name); sysfs_remove_link(&led_card->dev.kobj, "card"); device_del(&led_card->dev); put_device(&led_card->dev); kfree(led_card); led->cards[card->number] = NULL; } }
[1] https://syzkaller.appspot.com/bug?id=6d9e1e89003c894e7a1855c92dfa558ebcb8f21...
My understanding is that the device object is already freed in the snd_ctl_led_sysfs_remove.
"Already freed"? Is it a memleak or is it a double free??? I probably should have read the syzbot email on this... But you didn't include a link to it or a reported-by tag so I don't have a way to look at the actual bug.
I listed the reported-by tag and fixes tag in the first email in this thread. The syzbot link is [1].
Please take a look at my patch testing request.
I did fix a bug, though... Just not the one from the report I guess. Please send a link to the bug report so I can look at that. ;)
We should talk about different bugs, memory leak for different objects and different paths.
regards, dan carpenter
On Mon, May 31, 2021 at 02:20:37PM +0800, Dongliang Mu wrote:
On Mon, May 31, 2021 at 12:40 PM Dan Carpenter dan.carpenter@oracle.com wrote:
On Mon, May 31, 2021 at 11:03:36AM +0800, Dongliang Mu wrote:
On Sat, May 29, 2021 at 5:35 AM 慕冬亮 mudongliangabcd@gmail.com wrote:
On May 28, 2021, at 10:05 PM, Dan Carpenter dan.carpenter@oracle.com wrote:
On Fri, May 28, 2021 at 09:50:49PM +0800, Dongliang Mu wrote:
Can you please give some advise on how to fix this WARN issue?
But it feels like it spoils the fun if I write the commit... Anyway:
It’s fine. I am still in the learning process. It’s also good to learn experience by comparing your patch and my patch.
regards, dan carpenter
diff --git a/sound/core/control_led.c b/sound/core/control_led.c index 25f57c14f294..dd357abc1b58 100644 --- a/sound/core/control_led.c +++ b/sound/core/control_led.c @@ -740,6 +740,7 @@ static int __init snd_ctl_led_init(void) for (; group > 0; group--) { led = &snd_ctl_leds[group - 1]; device_del(&led->dev);
device_put(&led->dev); } device_del(&snd_ctl_led_dev); return -ENOMEM;
@@ -768,6 +769,7 @@ static void __exit snd_ctl_led_exit(void) for (group = 0; group < MAX_LED; group++) { led = &snd_ctl_leds[group]; device_del(&led->dev);
device_put(&led->dev); } device_del(&snd_ctl_led_dev); snd_ctl_led_clean(NULL);
Hi Dan,
I tried this patch, and it still triggers the memleak.
Did your patch fix the leak? Because my patch should have been equivalent except for it fixes an additional leak in the snd_ctl_led_init() error path.
The syzbot link is [1]. I have tested my patch in the syzbot dashboard and my local workspace.
I think the reason why your patch did not work should be led_card(struct snd_ctl_led_card) is already freed before returning in snd_ctl_led_sysfs_remove, rather than led(struct snd_ctl_led). See the implementation of snd_ctl_led_sysfs_remove for some details. Please correct me if I make any mistakes.
static void snd_ctl_led_sysfs_remove(struct snd_card *card) { unsigned int group; struct snd_ctl_led_card *led_card; struct snd_ctl_led *led; char link_name[32];
for (group = 0; group < MAX_LED; group++) { led = &snd_ctl_leds[group]; led_card = led->cards[card->number]; if (!led_card) continue; snprintf(link_name, sizeof(link_name), "led-%s", led->name); sysfs_remove_link(&card->ctl_dev.kobj, link_name); sysfs_remove_link(&led_card->dev.kobj, "card"); device_del(&led_card->dev); put_device(&led_card->dev); kfree(led_card); led->cards[card->number] = NULL; }
}
This is frustrating to look at because it's not a diff so it doesn't show what you changed. I think you are saying that you added the put_device(&led_card->dev);. That's true. There are some other leaks as well. We should just fix them all. Use device_unregister() because it's cleaner.
If both device_initialize() and device_add() succeed then call device_unregister() to unwind.
diff --git a/sound/core/control_led.c b/sound/core/control_led.c index 25f57c14f294..561fe45e4449 100644 --- a/sound/core/control_led.c +++ b/sound/core/control_led.c @@ -700,7 +700,7 @@ static void snd_ctl_led_sysfs_remove(struct snd_card *card) snprintf(link_name, sizeof(link_name), "led-%s", led->name); sysfs_remove_link(&card->ctl_dev.kobj, link_name); sysfs_remove_link(&led_card->dev.kobj, "card"); - device_del(&led_card->dev); + device_unregister(&led_card->dev); kfree(led_card); led->cards[card->number] = NULL; } @@ -739,9 +739,9 @@ static int __init snd_ctl_led_init(void) put_device(&led->dev); for (; group > 0; group--) { led = &snd_ctl_leds[group - 1]; - device_del(&led->dev); + device_unregister(&led->dev); } - device_del(&snd_ctl_led_dev); + device_unregister(&snd_ctl_led_dev); return -ENOMEM; } } @@ -767,9 +767,9 @@ static void __exit snd_ctl_led_exit(void) } for (group = 0; group < MAX_LED; group++) { led = &snd_ctl_leds[group]; - device_del(&led->dev); + device_unregister(&led->dev); } - device_del(&snd_ctl_led_dev); + device_unregister(&snd_ctl_led_dev); snd_ctl_led_clean(NULL); }
On Mon, May 31, 2021 at 3:03 PM Dan Carpenter dan.carpenter@oracle.com wrote:
On Mon, May 31, 2021 at 02:20:37PM +0800, Dongliang Mu wrote:
On Mon, May 31, 2021 at 12:40 PM Dan Carpenter dan.carpenter@oracle.com wrote:
On Mon, May 31, 2021 at 11:03:36AM +0800, Dongliang Mu wrote:
On Sat, May 29, 2021 at 5:35 AM 慕冬亮 mudongliangabcd@gmail.com wrote:
On May 28, 2021, at 10:05 PM, Dan Carpenter dan.carpenter@oracle.com wrote:
On Fri, May 28, 2021 at 09:50:49PM +0800, Dongliang Mu wrote: > > Can you please give some advise on how to fix this WARN issue?
But it feels like it spoils the fun if I write the commit... Anyway:
It’s fine. I am still in the learning process. It’s also good to learn experience by comparing your patch and my patch.
regards, dan carpenter
diff --git a/sound/core/control_led.c b/sound/core/control_led.c index 25f57c14f294..dd357abc1b58 100644 --- a/sound/core/control_led.c +++ b/sound/core/control_led.c @@ -740,6 +740,7 @@ static int __init snd_ctl_led_init(void) for (; group > 0; group--) { led = &snd_ctl_leds[group - 1]; device_del(&led->dev);
device_put(&led->dev); } device_del(&snd_ctl_led_dev); return -ENOMEM;
@@ -768,6 +769,7 @@ static void __exit snd_ctl_led_exit(void) for (group = 0; group < MAX_LED; group++) { led = &snd_ctl_leds[group]; device_del(&led->dev);
device_put(&led->dev); } device_del(&snd_ctl_led_dev); snd_ctl_led_clean(NULL);
Hi Dan,
I tried this patch, and it still triggers the memleak.
Did your patch fix the leak? Because my patch should have been equivalent except for it fixes an additional leak in the snd_ctl_led_init() error path.
The syzbot link is [1]. I have tested my patch in the syzbot dashboard and my local workspace.
I think the reason why your patch did not work should be led_card(struct snd_ctl_led_card) is already freed before returning in snd_ctl_led_sysfs_remove, rather than led(struct snd_ctl_led). See the implementation of snd_ctl_led_sysfs_remove for some details. Please correct me if I make any mistakes.
static void snd_ctl_led_sysfs_remove(struct snd_card *card) { unsigned int group; struct snd_ctl_led_card *led_card; struct snd_ctl_led *led; char link_name[32];
for (group = 0; group < MAX_LED; group++) { led = &snd_ctl_leds[group]; led_card = led->cards[card->number]; if (!led_card) continue; snprintf(link_name, sizeof(link_name), "led-%s", led->name); sysfs_remove_link(&card->ctl_dev.kobj, link_name); sysfs_remove_link(&led_card->dev.kobj, "card"); device_del(&led_card->dev); put_device(&led_card->dev); kfree(led_card); led->cards[card->number] = NULL; }
}
This is frustrating to look at because it's not a diff so it doesn't show what you changed. I think you are saying that you added the put_device(&led_card->dev);. That's true. There are some other leaks as well. We should just fix them all. Use device_unregister() because it's cleaner.
Oh, I see your point. Yeah, we should fix these memory leaks all. I agree with device_unregister.
If both device_initialize() and device_add() succeed then call device_unregister() to unwind.
BTW, have you tested this new patch on two memory leaks?
diff --git a/sound/core/control_led.c b/sound/core/control_led.c index 25f57c14f294..561fe45e4449 100644 --- a/sound/core/control_led.c +++ b/sound/core/control_led.c @@ -700,7 +700,7 @@ static void snd_ctl_led_sysfs_remove(struct snd_card *card) snprintf(link_name, sizeof(link_name), "led-%s", led->name); sysfs_remove_link(&card->ctl_dev.kobj, link_name); sysfs_remove_link(&led_card->dev.kobj, "card");
device_del(&led_card->dev);
device_unregister(&led_card->dev); kfree(led_card); led->cards[card->number] = NULL; }
@@ -739,9 +739,9 @@ static int __init snd_ctl_led_init(void) put_device(&led->dev); for (; group > 0; group--) { led = &snd_ctl_leds[group - 1];
device_del(&led->dev);
device_unregister(&led->dev); }
device_del(&snd_ctl_led_dev);
device_unregister(&snd_ctl_led_dev); return -ENOMEM; } }
@@ -767,9 +767,9 @@ static void __exit snd_ctl_led_exit(void) } for (group = 0; group < MAX_LED; group++) { led = &snd_ctl_leds[group];
device_del(&led->dev);
device_unregister(&led->dev); }
device_del(&snd_ctl_led_dev);
device_unregister(&snd_ctl_led_dev); snd_ctl_led_clean(NULL);
}
On Mon, May 31, 2021 at 3:34 PM Dongliang Mu mudongliangabcd@gmail.com wrote:
On Mon, May 31, 2021 at 3:03 PM Dan Carpenter dan.carpenter@oracle.com wrote:
On Mon, May 31, 2021 at 02:20:37PM +0800, Dongliang Mu wrote:
On Mon, May 31, 2021 at 12:40 PM Dan Carpenter dan.carpenter@oracle.com wrote:
On Mon, May 31, 2021 at 11:03:36AM +0800, Dongliang Mu wrote:
On Sat, May 29, 2021 at 5:35 AM 慕冬亮 mudongliangabcd@gmail.com wrote:
> On May 28, 2021, at 10:05 PM, Dan Carpenter dan.carpenter@oracle.com wrote: > > On Fri, May 28, 2021 at 09:50:49PM +0800, Dongliang Mu wrote: >> >> Can you please give some advise on how to fix this WARN issue? > > But it feels like it spoils the fun if I write the commit... Anyway:
It’s fine. I am still in the learning process. It’s also good to learn experience by comparing your patch and my patch.
> > regards, > dan carpenter > > diff --git a/sound/core/control_led.c b/sound/core/control_led.c > index 25f57c14f294..dd357abc1b58 100644 > --- a/sound/core/control_led.c > +++ b/sound/core/control_led.c > @@ -740,6 +740,7 @@ static int __init snd_ctl_led_init(void) > for (; group > 0; group--) { > led = &snd_ctl_leds[group - 1]; > device_del(&led->dev); > + device_put(&led->dev); > } > device_del(&snd_ctl_led_dev); > return -ENOMEM; > @@ -768,6 +769,7 @@ static void __exit snd_ctl_led_exit(void) > for (group = 0; group < MAX_LED; group++) { > led = &snd_ctl_leds[group]; > device_del(&led->dev); > + device_put(&led->dev); > } > device_del(&snd_ctl_led_dev); > snd_ctl_led_clean(NULL);
Hi Dan,
I tried this patch, and it still triggers the memleak.
Did your patch fix the leak? Because my patch should have been equivalent except for it fixes an additional leak in the snd_ctl_led_init() error path.
The syzbot link is [1]. I have tested my patch in the syzbot dashboard and my local workspace.
I think the reason why your patch did not work should be led_card(struct snd_ctl_led_card) is already freed before returning in snd_ctl_led_sysfs_remove, rather than led(struct snd_ctl_led). See the implementation of snd_ctl_led_sysfs_remove for some details. Please correct me if I make any mistakes.
static void snd_ctl_led_sysfs_remove(struct snd_card *card) { unsigned int group; struct snd_ctl_led_card *led_card; struct snd_ctl_led *led; char link_name[32];
for (group = 0; group < MAX_LED; group++) { led = &snd_ctl_leds[group]; led_card = led->cards[card->number]; if (!led_card) continue; snprintf(link_name, sizeof(link_name), "led-%s", led->name); sysfs_remove_link(&card->ctl_dev.kobj, link_name); sysfs_remove_link(&led_card->dev.kobj, "card"); device_del(&led_card->dev); put_device(&led_card->dev); kfree(led_card); led->cards[card->number] = NULL; }
}
This is frustrating to look at because it's not a diff so it doesn't show what you changed. I think you are saying that you added the put_device(&led_card->dev);. That's true. There are some other leaks as well. We should just fix them all. Use device_unregister() because it's cleaner.
Oh, I see your point. Yeah, we should fix these memory leaks all. I agree with device_unregister.
If both device_initialize() and device_add() succeed then call device_unregister() to unwind.
BTW, have you tested this new patch on two memory leaks?
Please keep in mind that if we don't have any release method for struct snd_ctl_led_card, it will trigger a WARN[1] in the device_release function. That's why I have to add one dummy release function.
if (dev->release) dev->release(dev); else if (dev->type && dev->type->release) dev->type->release(dev); else if (dev->class && dev->class->dev_release) dev->class->dev_release(dev); else WARN(1, KERN_ERR "Device '%s' does not have a release() function, it is broken and must be fixed. See Documentation/core-api/kobject.rst.\n", dev_name(dev));
[1] https://elixir.bootlin.com/linux/latest/source/drivers/base/core.c#L2110
diff --git a/sound/core/control_led.c b/sound/core/control_led.c index 25f57c14f294..561fe45e4449 100644 --- a/sound/core/control_led.c +++ b/sound/core/control_led.c @@ -700,7 +700,7 @@ static void snd_ctl_led_sysfs_remove(struct snd_card *card) snprintf(link_name, sizeof(link_name), "led-%s", led->name); sysfs_remove_link(&card->ctl_dev.kobj, link_name); sysfs_remove_link(&led_card->dev.kobj, "card");
device_del(&led_card->dev);
device_unregister(&led_card->dev); kfree(led_card); led->cards[card->number] = NULL; }
@@ -739,9 +739,9 @@ static int __init snd_ctl_led_init(void) put_device(&led->dev); for (; group > 0; group--) { led = &snd_ctl_leds[group - 1];
device_del(&led->dev);
device_unregister(&led->dev); }
device_del(&snd_ctl_led_dev);
device_unregister(&snd_ctl_led_dev); return -ENOMEM; } }
@@ -767,9 +767,9 @@ static void __exit snd_ctl_led_exit(void) } for (group = 0; group < MAX_LED; group++) { led = &snd_ctl_leds[group];
device_del(&led->dev);
device_unregister(&led->dev); }
device_del(&snd_ctl_led_dev);
device_unregister(&snd_ctl_led_dev); snd_ctl_led_clean(NULL);
}
On Mon, May 31, 2021 at 04:08:04PM +0800, Dongliang Mu wrote:
On Mon, May 31, 2021 at 3:34 PM Dongliang Mu mudongliangabcd@gmail.com wrote:
On Mon, May 31, 2021 at 3:03 PM Dan Carpenter dan.carpenter@oracle.com wrote:
On Mon, May 31, 2021 at 02:20:37PM +0800, Dongliang Mu wrote:
On Mon, May 31, 2021 at 12:40 PM Dan Carpenter dan.carpenter@oracle.com wrote:
On Mon, May 31, 2021 at 11:03:36AM +0800, Dongliang Mu wrote:
On Sat, May 29, 2021 at 5:35 AM 慕冬亮 mudongliangabcd@gmail.com wrote: > > > > > On May 28, 2021, at 10:05 PM, Dan Carpenter dan.carpenter@oracle.com wrote: > > > > On Fri, May 28, 2021 at 09:50:49PM +0800, Dongliang Mu wrote: > >> > >> Can you please give some advise on how to fix this WARN issue? > > > > But it feels like it spoils the fun if I write the commit... Anyway: > > It’s fine. I am still in the learning process. It’s also good to learn experience by comparing your patch and my patch. > > > > > regards, > > dan carpenter > > > > diff --git a/sound/core/control_led.c b/sound/core/control_led.c > > index 25f57c14f294..dd357abc1b58 100644 > > --- a/sound/core/control_led.c > > +++ b/sound/core/control_led.c > > @@ -740,6 +740,7 @@ static int __init snd_ctl_led_init(void) > > for (; group > 0; group--) { > > led = &snd_ctl_leds[group - 1]; > > device_del(&led->dev); > > + device_put(&led->dev); > > } > > device_del(&snd_ctl_led_dev); > > return -ENOMEM; > > @@ -768,6 +769,7 @@ static void __exit snd_ctl_led_exit(void) > > for (group = 0; group < MAX_LED; group++) { > > led = &snd_ctl_leds[group]; > > device_del(&led->dev); > > + device_put(&led->dev); > > } > > device_del(&snd_ctl_led_dev); > > snd_ctl_led_clean(NULL);
Hi Dan,
I tried this patch, and it still triggers the memleak.
Did your patch fix the leak? Because my patch should have been equivalent except for it fixes an additional leak in the snd_ctl_led_init() error path.
The syzbot link is [1]. I have tested my patch in the syzbot dashboard and my local workspace.
I think the reason why your patch did not work should be led_card(struct snd_ctl_led_card) is already freed before returning in snd_ctl_led_sysfs_remove, rather than led(struct snd_ctl_led). See the implementation of snd_ctl_led_sysfs_remove for some details. Please correct me if I make any mistakes.
static void snd_ctl_led_sysfs_remove(struct snd_card *card) { unsigned int group; struct snd_ctl_led_card *led_card; struct snd_ctl_led *led; char link_name[32];
for (group = 0; group < MAX_LED; group++) { led = &snd_ctl_leds[group]; led_card = led->cards[card->number]; if (!led_card) continue; snprintf(link_name, sizeof(link_name), "led-%s", led->name); sysfs_remove_link(&card->ctl_dev.kobj, link_name); sysfs_remove_link(&led_card->dev.kobj, "card"); device_del(&led_card->dev); put_device(&led_card->dev); kfree(led_card); led->cards[card->number] = NULL; }
}
This is frustrating to look at because it's not a diff so it doesn't show what you changed. I think you are saying that you added the put_device(&led_card->dev);. That's true. There are some other leaks as well. We should just fix them all. Use device_unregister() because it's cleaner.
Oh, I see your point. Yeah, we should fix these memory leaks all. I agree with device_unregister.
If both device_initialize() and device_add() succeed then call device_unregister() to unwind.
BTW, have you tested this new patch on two memory leaks?
Please keep in mind that if we don't have any release method for struct snd_ctl_led_card, it will trigger a WARN[1] in the device_release function. That's why I have to add one dummy release function.
if (dev->release) dev->release(dev); else if (dev->type && dev->type->release) dev->type->release(dev); else if (dev->class && dev->class->dev_release) dev->class->dev_release(dev); else WARN(1, KERN_ERR "Device '%s' does not have a release() function, it is broken and must be fixed. See Documentation/core-api/kobject.rst.\n", dev_name(dev));
Oh yeah. You're right. The "kfree(led_card);" needs to be moved to a release function or it can lead to a use after free. For the others, I think a dummy release function is ok (because it is static data).
It feels like there should be a standard way to say that there is no need to release any data. That way it could be verified by static analysis tools.
regards, dan carpenter
[1] https://elixir.bootlin.com/linux/latest/source/drivers/base/core.c#L2110
diff --git a/sound/core/control_led.c b/sound/core/control_led.c index 25f57c14f294..561fe45e4449 100644 --- a/sound/core/control_led.c +++ b/sound/core/control_led.c @@ -700,7 +700,7 @@ static void snd_ctl_led_sysfs_remove(struct snd_card *card) snprintf(link_name, sizeof(link_name), "led-%s", led->name); sysfs_remove_link(&card->ctl_dev.kobj, link_name); sysfs_remove_link(&led_card->dev.kobj, "card");
device_del(&led_card->dev);
device_unregister(&led_card->dev); kfree(led_card); led->cards[card->number] = NULL; }
@@ -739,9 +739,9 @@ static int __init snd_ctl_led_init(void) put_device(&led->dev); for (; group > 0; group--) { led = &snd_ctl_leds[group - 1];
device_del(&led->dev);
device_unregister(&led->dev); }
device_del(&snd_ctl_led_dev);
device_unregister(&snd_ctl_led_dev); return -ENOMEM; } }
@@ -767,9 +767,9 @@ static void __exit snd_ctl_led_exit(void) } for (group = 0; group < MAX_LED; group++) { led = &snd_ctl_leds[group];
device_del(&led->dev);
device_unregister(&led->dev); }
device_del(&snd_ctl_led_dev);
device_unregister(&snd_ctl_led_dev); snd_ctl_led_clean(NULL);
}
On Mon, May 31, 2021 at 4:46 PM Dan Carpenter dan.carpenter@oracle.com wrote:
On Mon, May 31, 2021 at 04:08:04PM +0800, Dongliang Mu wrote:
On Mon, May 31, 2021 at 3:34 PM Dongliang Mu mudongliangabcd@gmail.com wrote:
On Mon, May 31, 2021 at 3:03 PM Dan Carpenter dan.carpenter@oracle.com wrote:
On Mon, May 31, 2021 at 02:20:37PM +0800, Dongliang Mu wrote:
On Mon, May 31, 2021 at 12:40 PM Dan Carpenter dan.carpenter@oracle.com wrote:
On Mon, May 31, 2021 at 11:03:36AM +0800, Dongliang Mu wrote: > On Sat, May 29, 2021 at 5:35 AM 慕冬亮 mudongliangabcd@gmail.com wrote: > > > > > > > > > On May 28, 2021, at 10:05 PM, Dan Carpenter dan.carpenter@oracle.com wrote: > > > > > > On Fri, May 28, 2021 at 09:50:49PM +0800, Dongliang Mu wrote: > > >> > > >> Can you please give some advise on how to fix this WARN issue? > > > > > > But it feels like it spoils the fun if I write the commit... Anyway: > > > > It’s fine. I am still in the learning process. It’s also good to learn experience by comparing your patch and my patch. > > > > > > > > regards, > > > dan carpenter > > > > > > diff --git a/sound/core/control_led.c b/sound/core/control_led.c > > > index 25f57c14f294..dd357abc1b58 100644 > > > --- a/sound/core/control_led.c > > > +++ b/sound/core/control_led.c > > > @@ -740,6 +740,7 @@ static int __init snd_ctl_led_init(void) > > > for (; group > 0; group--) { > > > led = &snd_ctl_leds[group - 1]; > > > device_del(&led->dev); > > > + device_put(&led->dev); > > > } > > > device_del(&snd_ctl_led_dev); > > > return -ENOMEM; > > > @@ -768,6 +769,7 @@ static void __exit snd_ctl_led_exit(void) > > > for (group = 0; group < MAX_LED; group++) { > > > led = &snd_ctl_leds[group]; > > > device_del(&led->dev); > > > + device_put(&led->dev); > > > } > > > device_del(&snd_ctl_led_dev); > > > snd_ctl_led_clean(NULL); > > Hi Dan, > > I tried this patch, and it still triggers the memleak.
Did your patch fix the leak? Because my patch should have been equivalent except for it fixes an additional leak in the snd_ctl_led_init() error path.
The syzbot link is [1]. I have tested my patch in the syzbot dashboard and my local workspace.
I think the reason why your patch did not work should be led_card(struct snd_ctl_led_card) is already freed before returning in snd_ctl_led_sysfs_remove, rather than led(struct snd_ctl_led). See the implementation of snd_ctl_led_sysfs_remove for some details. Please correct me if I make any mistakes.
static void snd_ctl_led_sysfs_remove(struct snd_card *card) { unsigned int group; struct snd_ctl_led_card *led_card; struct snd_ctl_led *led; char link_name[32];
for (group = 0; group < MAX_LED; group++) { led = &snd_ctl_leds[group]; led_card = led->cards[card->number]; if (!led_card) continue; snprintf(link_name, sizeof(link_name), "led-%s", led->name); sysfs_remove_link(&card->ctl_dev.kobj, link_name); sysfs_remove_link(&led_card->dev.kobj, "card"); device_del(&led_card->dev); put_device(&led_card->dev); kfree(led_card); led->cards[card->number] = NULL; }
}
This is frustrating to look at because it's not a diff so it doesn't show what you changed. I think you are saying that you added the put_device(&led_card->dev);. That's true. There are some other leaks as well. We should just fix them all. Use device_unregister() because it's cleaner.
Oh, I see your point. Yeah, we should fix these memory leaks all. I agree with device_unregister.
If both device_initialize() and device_add() succeed then call device_unregister() to unwind.
BTW, have you tested this new patch on two memory leaks?
Please keep in mind that if we don't have any release method for struct snd_ctl_led_card, it will trigger a WARN[1] in the device_release function. That's why I have to add one dummy release function.
if (dev->release) dev->release(dev); else if (dev->type && dev->type->release) dev->type->release(dev); else if (dev->class && dev->class->dev_release) dev->class->dev_release(dev); else WARN(1, KERN_ERR "Device '%s' does not have a release() function, it is broken and must be fixed. See Documentation/core-api/kobject.rst.\n", dev_name(dev));
Oh yeah. You're right. The "kfree(led_card);" needs to be moved to a release function or it can lead to a use after free. For the others, I think a dummy release function is ok (because it is static data).
Hi Dan,
I wonder if we shall split the current patch into two patches, one patch for each memory leak. It is better to satisfy the rule - "one patch only fixes one issue".
We should absolutely fix all these memory leaks. But one patch for two different bugs with different objects and different paths is not very suitable.
It feels like there should be a standard way to say that there is no need to release any data. That way it could be verified by static analysis tools.
regards, dan carpenter
[1] https://elixir.bootlin.com/linux/latest/source/drivers/base/core.c#L2110
diff --git a/sound/core/control_led.c b/sound/core/control_led.c index 25f57c14f294..561fe45e4449 100644 --- a/sound/core/control_led.c +++ b/sound/core/control_led.c @@ -700,7 +700,7 @@ static void snd_ctl_led_sysfs_remove(struct snd_card *card) snprintf(link_name, sizeof(link_name), "led-%s", led->name); sysfs_remove_link(&card->ctl_dev.kobj, link_name); sysfs_remove_link(&led_card->dev.kobj, "card");
device_del(&led_card->dev);
device_unregister(&led_card->dev); kfree(led_card); led->cards[card->number] = NULL; }
@@ -739,9 +739,9 @@ static int __init snd_ctl_led_init(void) put_device(&led->dev); for (; group > 0; group--) { led = &snd_ctl_leds[group - 1];
device_del(&led->dev);
device_unregister(&led->dev); }
device_del(&snd_ctl_led_dev);
device_unregister(&snd_ctl_led_dev); return -ENOMEM; } }
@@ -767,9 +767,9 @@ static void __exit snd_ctl_led_exit(void) } for (group = 0; group < MAX_LED; group++) { led = &snd_ctl_leds[group];
device_del(&led->dev);
device_unregister(&led->dev); }
device_del(&snd_ctl_led_dev);
device_unregister(&snd_ctl_led_dev); snd_ctl_led_clean(NULL);
}
On Mon, May 31, 2021 at 05:10:49PM +0800, Dongliang Mu wrote:
Hi Dan,
I wonder if we shall split the current patch into two patches, one patch for each memory leak. It is better to satisfy the rule - "one patch only fixes one issue".
We should absolutely fix all these memory leaks. But one patch for two different bugs with different objects and different paths is not very suitable.
I would just send one patch, because I only see it as one bug. But you do what you think is best.
regards, dan carpenter
On Mon, May 31, 2021 at 5:38 PM Dan Carpenter dan.carpenter@oracle.com wrote:
On Mon, May 31, 2021 at 05:10:49PM +0800, Dongliang Mu wrote:
Hi Dan,
I wonder if we shall split the current patch into two patches, one patch for each memory leak. It is better to satisfy the rule - "one patch only fixes one issue".
We should absolutely fix all these memory leaks. But one patch for two different bugs with different objects and different paths is not very suitable.
I would just send one patch, because I only see it as one bug. But you do what you think is best.
If you think they are the same bug, that's great. Just send and apply only one patch as it is. I will not send version v2.
BTW, could you please show me the syzbot link for the bug you tried to resolve? If it is not from syzbot dashboard, can you please show the bug report?
regards, dan carpenter
On Mon, May 31, 2021 at 06:35:33PM +0800, Dongliang Mu wrote:
On Mon, May 31, 2021 at 5:38 PM Dan Carpenter dan.carpenter@oracle.com wrote:
On Mon, May 31, 2021 at 05:10:49PM +0800, Dongliang Mu wrote:
Hi Dan,
I wonder if we shall split the current patch into two patches, one patch for each memory leak. It is better to satisfy the rule - "one patch only fixes one issue".
We should absolutely fix all these memory leaks. But one patch for two different bugs with different objects and different paths is not very suitable.
I would just send one patch, because I only see it as one bug. But you do what you think is best.
If you think they are the same bug, that's great. Just send and apply only one patch as it is. I will not send version v2.
Sorry for the miscommunication. That's not what I meant at all.
Your patch only introduces one put_device(). We need all five to fix the bug, and we'll have to change the kfree(link_whatever). Use device_unregister() instead put_device(). Include a Reported-by with the correct syzkaller information.
BTW, could you please show me the syzbot link for the bug you tried to resolve? If it is not from syzbot dashboard, can you please show the bug report?
What are you talking about? I'm not trying to fix a syzkaller bug. I'm just reviewing your patch.
regards, dan carpenter
On Mon, May 31, 2021 at 6:44 PM Dan Carpenter dan.carpenter@oracle.com wrote:
On Mon, May 31, 2021 at 06:35:33PM +0800, Dongliang Mu wrote:
On Mon, May 31, 2021 at 5:38 PM Dan Carpenter dan.carpenter@oracle.com wrote:
On Mon, May 31, 2021 at 05:10:49PM +0800, Dongliang Mu wrote:
Hi Dan,
I wonder if we shall split the current patch into two patches, one patch for each memory leak. It is better to satisfy the rule - "one patch only fixes one issue".
We should absolutely fix all these memory leaks. But one patch for two different bugs with different objects and different paths is not very suitable.
I would just send one patch, because I only see it as one bug. But you do what you think is best.
If you think they are the same bug, that's great. Just send and apply only one patch as it is. I will not send version v2.
Sorry for the miscommunication. That's not what I meant at all.
Your patch only introduces one put_device(). We need all five to fix the bug, and we'll have to change the kfree(link_whatever). Use device_unregister() instead put_device(). Include a Reported-by with the correct syzkaller information.
BTW, could you please show me the syzbot link for the bug you tried to resolve? If it is not from syzbot dashboard, can you please show the bug report?
What are you talking about? I'm not trying to fix a syzkaller bug. I'm just reviewing your patch.
It seems we truly have some miscommunication. Your message makes me think you are fixing another bug report that shares the same root cause with this bug.
Now let's sync and get on the same page.
You are trying to give me some suggestions to fix this bug. I need to listen to your advice and send another version v2 to you developers. Right?
regards, dan carpenter
On Mon, May 31, 2021 at 03:34:09PM +0800, Dongliang Mu wrote:
On Mon, May 31, 2021 at 3:03 PM Dan Carpenter dan.carpenter@oracle.com wrote:
On Mon, May 31, 2021 at 02:20:37PM +0800, Dongliang Mu wrote:
On Mon, May 31, 2021 at 12:40 PM Dan Carpenter dan.carpenter@oracle.com wrote:
On Mon, May 31, 2021 at 11:03:36AM +0800, Dongliang Mu wrote:
On Sat, May 29, 2021 at 5:35 AM 慕冬亮 mudongliangabcd@gmail.com wrote:
> On May 28, 2021, at 10:05 PM, Dan Carpenter dan.carpenter@oracle.com wrote: > > On Fri, May 28, 2021 at 09:50:49PM +0800, Dongliang Mu wrote: >> >> Can you please give some advise on how to fix this WARN issue? > > But it feels like it spoils the fun if I write the commit... Anyway:
It’s fine. I am still in the learning process. It’s also good to learn experience by comparing your patch and my patch.
> > regards, > dan carpenter > > diff --git a/sound/core/control_led.c b/sound/core/control_led.c > index 25f57c14f294..dd357abc1b58 100644 > --- a/sound/core/control_led.c > +++ b/sound/core/control_led.c > @@ -740,6 +740,7 @@ static int __init snd_ctl_led_init(void) > for (; group > 0; group--) { > led = &snd_ctl_leds[group - 1]; > device_del(&led->dev); > + device_put(&led->dev); > } > device_del(&snd_ctl_led_dev); > return -ENOMEM; > @@ -768,6 +769,7 @@ static void __exit snd_ctl_led_exit(void) > for (group = 0; group < MAX_LED; group++) { > led = &snd_ctl_leds[group]; > device_del(&led->dev); > + device_put(&led->dev); > } > device_del(&snd_ctl_led_dev); > snd_ctl_led_clean(NULL);
Hi Dan,
I tried this patch, and it still triggers the memleak.
Did your patch fix the leak? Because my patch should have been equivalent except for it fixes an additional leak in the snd_ctl_led_init() error path.
The syzbot link is [1]. I have tested my patch in the syzbot dashboard and my local workspace.
I think the reason why your patch did not work should be led_card(struct snd_ctl_led_card) is already freed before returning in snd_ctl_led_sysfs_remove, rather than led(struct snd_ctl_led). See the implementation of snd_ctl_led_sysfs_remove for some details. Please correct me if I make any mistakes.
static void snd_ctl_led_sysfs_remove(struct snd_card *card) { unsigned int group; struct snd_ctl_led_card *led_card; struct snd_ctl_led *led; char link_name[32];
for (group = 0; group < MAX_LED; group++) { led = &snd_ctl_leds[group]; led_card = led->cards[card->number]; if (!led_card) continue; snprintf(link_name, sizeof(link_name), "led-%s", led->name); sysfs_remove_link(&card->ctl_dev.kobj, link_name); sysfs_remove_link(&led_card->dev.kobj, "card"); device_del(&led_card->dev); put_device(&led_card->dev); kfree(led_card); led->cards[card->number] = NULL; }
}
This is frustrating to look at because it's not a diff so it doesn't show what you changed. I think you are saying that you added the put_device(&led_card->dev);. That's true. There are some other leaks as well. We should just fix them all. Use device_unregister() because it's cleaner.
Oh, I see your point. Yeah, we should fix these memory leaks all. I agree with device_unregister.
If both device_initialize() and device_add() succeed then call device_unregister() to unwind.
BTW, have you tested this new patch on two memory leaks?
No I have not tested it at all, and I don't remember if I even compiled it.
regards, dan carpenter
On Sat, May 29, 2021 at 05:35:22AM +0800, 慕冬亮 wrote:
diff --git a/sound/core/control_led.c b/sound/core/control_led.c index 25f57c14f294..dd357abc1b58 100644 --- a/sound/core/control_led.c +++ b/sound/core/control_led.c @@ -740,6 +740,7 @@ static int __init snd_ctl_led_init(void) for (; group > 0; group--) { led = &snd_ctl_leds[group - 1]; device_del(&led->dev);
device_put(&led->dev); } device_del(&snd_ctl_led_dev); return -ENOMEM;
@@ -768,6 +769,7 @@ static void __exit snd_ctl_led_exit(void) for (group = 0; group < MAX_LED; group++) { led = &snd_ctl_leds[group]; device_del(&led->dev);
} device_del(&snd_ctl_led_dev); snd_ctl_led_clean(NULL);device_put(&led->dev);
Does this patch mean I should add device_put in the init and exit function other than snd_ctl_led_sysfs_remove? This will cause device_release bypass the release method checking?
Please fix your email client to line wrap your emails.
I'm not sure what release method checking you're refering to.
regards, dan carpenter
On Fri, May 28, 2021 at 09:17:57PM +0800, Dongliang Mu wrote:
The snd_ctl_led_sysfs_add and snd_ctl_led_sysfs_remove should contain the refcount operations in pair. However, snd_ctl_led_sysfs_remove fails to decrease the refcount to zero, which causes device_release never to be invoked. This leads to memory leak to some resources, like struct device_private.
Fix this by calling put_device at the end of snd_ctl_led_sysfs_remove
Reported-by: syzbot+08a7d8b51ea048a74ffb@syzkaller.appspotmail.com Fixes: a135dfb5de1 ("ALSA: led control - add sysfs kcontrol LED marking layer") Signed-off-by: Dongliang Mu mudongliangabcd@gmail.com
sound/core/control_led.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/sound/core/control_led.c b/sound/core/control_led.c index 25f57c14f294..fff2688b5019 100644 --- a/sound/core/control_led.c +++ b/sound/core/control_led.c @@ -371,6 +371,10 @@ static void snd_ctl_led_disconnect(struct snd_card *card) snd_ctl_led_refresh(); }
+static void snd_ctl_led_release(struct device *dev) +{ +}
Just to clarify again some more, this call back has to free "led_card". This patch changes the memory leak into a use after free bug. (A use after free bug is worse than a memory leak).
There were some other leaks as discussed where a dummy free function is fine because they were dealing with static data structures (not allocated memory).
/*
- sysfs
*/ @@ -663,6 +667,7 @@ static void snd_ctl_led_sysfs_add(struct snd_card *card) led_card->number = card->number; led_card->led = led; device_initialize(&led_card->dev);
if (dev_set_name(&led_card->dev, "card%d", card->number) < 0) goto cerr; led_card->dev.parent = &led->dev;led_card->dev.release = snd_ctl_led_release;
@@ -701,6 +706,7 @@ static void snd_ctl_led_sysfs_remove(struct snd_card *card) sysfs_remove_link(&card->ctl_dev.kobj, link_name); sysfs_remove_link(&led_card->dev.kobj, "card"); device_del(&led_card->dev);
kfree(led_card); led->cards[card->number] = NULL; }put_device(&led_card->dev);
Btw, I have created a Smatch warning for this type of code where we have:
put_device(&foo->dev); kfree(foo);
sound/core/control_led.c:709 snd_ctl_led_sysfs_remove() warn: freeing device managed memory: 'led_card'
So hopefully that will prevent future similar bugs. I'll test it out overnight and report back tomorrow how it works.
regards, dan carpenter
/* * Copyright (C) 2021 Oracle. * * This program is free software; you can redistribute it and/or * modify it under the terms of the GNU General Public License * as published by the Free Software Foundation; either version 2 * of the License, or (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, see http://www.gnu.org/copyleft/gpl.txt */
#include "smatch.h"
static int my_id;
STATE(managed);
static void set_ignore(struct sm_state *sm, struct expression *mod_expr) { set_state(my_id, sm->name, sm->sym, &undefined); }
static void match_put_device(const char *fn, struct expression *expr, void *param) { struct expression *arg;
arg = get_argument_from_call_expr(expr->args, PTR_INT(param)); arg = strip_expr(arg); if (!arg || arg->type != EXPR_PREOP || arg->op != '&') return; arg = strip_expr(arg->unop); if (!arg || arg->type != EXPR_DEREF) return; arg = strip_expr(arg->deref); if (arg && arg->type == EXPR_PREOP && arg->op == '*') arg = arg->unop; set_state_expr(my_id, arg, &managed); }
static void match_free(const char *fn, struct expression *expr, void *param) { struct expression *arg; char *name;
arg = get_argument_from_call_expr(expr->args, PTR_INT(param)); if (get_state_expr(my_id, arg) != &managed) return; name = expr_to_str(arg); sm_warning("freeing device managed memory: '%s'", name); free_string(name); }
void check_put_device(int id) { my_id = id;
if (option_project != PROJ_KERNEL) return;
add_function_hook("put_device", &match_put_device, INT_PTR(0)); add_function_hook("device_unregister", &match_put_device, INT_PTR(0));
add_function_hook("kfree", &match_free, INT_PTR(0)); add_modification_hook(my_id, &set_ignore); }
I've also created a Smatch check for missing ->release() functions. It find one bug in that file. There are other bugs, but the static checker is not clever enough to find them. I expect Smatch will get smarter about this in the coming year.
sound/core/control_led.c:685 snd_ctl_led_sysfs_add() warn: calling put_device() with no ->release() function
So, yeay, I feel like this thread has been useful and I now understand put_device() a little better. Please review the email thread again and send a v2 patch. :)
regards, dan carpenter
/* * Copyright (C) 2021 Oracle. * * This program is free software; you can redistribute it and/or * modify it under the terms of the GNU General Public License * as published by the Free Software Foundation; either version 2 * of the License, or (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, see http://www.gnu.org/copyleft/gpl.txt */
#include "smatch.h" #include "smatch_extra.h"
static int my_id;
static void match_put_device(struct expression *expr, const char *name, struct symbol *sym, void *data) { struct smatch_state *state;
state = get_state(SMATCH_EXTRA, name, sym); if (!state || estate_min(state).value != 0 || estate_max(state).value != 0) return;
sm_warning("calling put_device() with no ->release() function"); }
void check_no_release(int id) { my_id = id;
if (option_project != PROJ_KERNEL) return;
add_function_param_key_hook("put_device", &match_put_device, 0, "$->release", NULL); add_function_param_key_hook("device_unregister", &match_put_device, 0, "$->release", NULL); }
On Mon, May 31, 2021 at 7:07 PM Dan Carpenter dan.carpenter@oracle.com wrote:
I've also created a Smatch check for missing ->release() functions. It find one bug in that file. There are other bugs, but the static checker is not clever enough to find them. I expect Smatch will get smarter about this in the coming year.
sound/core/control_led.c:685 snd_ctl_led_sysfs_add() warn: calling put_device() with no ->release() function
So, yeay, I feel like this thread has been useful and I now understand put_device() a little better. Please review the email thread again and send a v2 patch. :)
Sure. No problem. I will send a v2 patch after reading the whole thread again.
regards, dan carpenter
/*
- Copyright (C) 2021 Oracle.
- This program is free software; you can redistribute it and/or
- modify it under the terms of the GNU General Public License
- as published by the Free Software Foundation; either version 2
- of the License, or (at your option) any later version.
- This program is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- GNU General Public License for more details.
- You should have received a copy of the GNU General Public License
- along with this program; if not, see http://www.gnu.org/copyleft/gpl.txt
*/
#include "smatch.h" #include "smatch_extra.h"
static int my_id;
static void match_put_device(struct expression *expr, const char *name, struct symbol *sym, void *data) { struct smatch_state *state;
state = get_state(SMATCH_EXTRA, name, sym); if (!state || estate_min(state).value != 0 || estate_max(state).value != 0) return; sm_warning("calling put_device() with no ->release() function");
}
void check_no_release(int id) { my_id = id;
if (option_project != PROJ_KERNEL) return; add_function_param_key_hook("put_device", &match_put_device, 0, "$->release", NULL); add_function_param_key_hook("device_unregister", &match_put_device, 0, "$->release", NULL);
}
On Mon, May 31, 2021 at 7:02 PM Dan Carpenter dan.carpenter@oracle.com wrote:
On Fri, May 28, 2021 at 09:17:57PM +0800, Dongliang Mu wrote:
The snd_ctl_led_sysfs_add and snd_ctl_led_sysfs_remove should contain the refcount operations in pair. However, snd_ctl_led_sysfs_remove fails to decrease the refcount to zero, which causes device_release never to be invoked. This leads to memory leak to some resources, like struct device_private.
Fix this by calling put_device at the end of snd_ctl_led_sysfs_remove
Reported-by: syzbot+08a7d8b51ea048a74ffb@syzkaller.appspotmail.com Fixes: a135dfb5de1 ("ALSA: led control - add sysfs kcontrol LED marking layer") Signed-off-by: Dongliang Mu mudongliangabcd@gmail.com
sound/core/control_led.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/sound/core/control_led.c b/sound/core/control_led.c index 25f57c14f294..fff2688b5019 100644 --- a/sound/core/control_led.c +++ b/sound/core/control_led.c @@ -371,6 +371,10 @@ static void snd_ctl_led_disconnect(struct snd_card *card) snd_ctl_led_refresh(); }
+static void snd_ctl_led_release(struct device *dev) +{ +}
Just to clarify again some more, this call back has to free "led_card". This patch changes the memory leak into a use after free bug. (A use after free bug is worse than a memory leak).
Hi Dan,
I have read the whole thread several times. I don't quite understand why you think this call back needs to free "led_card". In current implementation, the led_card is allocated in snd_ctl_led_sysfs_add, and released in snd_ctl_led_sysfs_remove. It seems there is no logic issue. If we keep a dump function here, I think there should no UAF.
I agree with you. We shall be very careful about any added release function. It might turn a memory leak into double-free or use-after-free.
There were some other leaks as discussed where a dummy free function is fine because they were dealing with static data structures (not allocated memory).
/*
- sysfs
*/ @@ -663,6 +667,7 @@ static void snd_ctl_led_sysfs_add(struct snd_card *card) led_card->number = card->number; led_card->led = led; device_initialize(&led_card->dev);
led_card->dev.release = snd_ctl_led_release; if (dev_set_name(&led_card->dev, "card%d", card->number) < 0) goto cerr; led_card->dev.parent = &led->dev;
@@ -701,6 +706,7 @@ static void snd_ctl_led_sysfs_remove(struct snd_card *card) sysfs_remove_link(&card->ctl_dev.kobj, link_name); sysfs_remove_link(&led_card->dev.kobj, "card"); device_del(&led_card->dev);
put_device(&led_card->dev); kfree(led_card); led->cards[card->number] = NULL; }
Btw, I have created a Smatch warning for this type of code where we have:
put_device(&foo->dev); kfree(foo);
I don't think this should be a bug pattern. put_device will drop the final reference of one object with struct device and invoke device_release to release some resources.
The release function should only clean up the internal resources in the device object. It should not touch the led_card which contains the device object.
sound/core/control_led.c:709 snd_ctl_led_sysfs_remove() warn: freeing device managed memory: 'led_card'
So hopefully that will prevent future similar bugs. I'll test it out overnight and report back tomorrow how it works.
regards, dan carpenter
/*
- Copyright (C) 2021 Oracle.
- This program is free software; you can redistribute it and/or
- modify it under the terms of the GNU General Public License
- as published by the Free Software Foundation; either version 2
- of the License, or (at your option) any later version.
- This program is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- GNU General Public License for more details.
- You should have received a copy of the GNU General Public License
- along with this program; if not, see http://www.gnu.org/copyleft/gpl.txt
*/
#include "smatch.h"
static int my_id;
STATE(managed);
static void set_ignore(struct sm_state *sm, struct expression *mod_expr) { set_state(my_id, sm->name, sm->sym, &undefined); }
static void match_put_device(const char *fn, struct expression *expr, void *param) { struct expression *arg;
arg = get_argument_from_call_expr(expr->args, PTR_INT(param)); arg = strip_expr(arg); if (!arg || arg->type != EXPR_PREOP || arg->op != '&') return; arg = strip_expr(arg->unop); if (!arg || arg->type != EXPR_DEREF) return; arg = strip_expr(arg->deref); if (arg && arg->type == EXPR_PREOP && arg->op == '*') arg = arg->unop; set_state_expr(my_id, arg, &managed);
}
static void match_free(const char *fn, struct expression *expr, void *param) { struct expression *arg; char *name;
arg = get_argument_from_call_expr(expr->args, PTR_INT(param)); if (get_state_expr(my_id, arg) != &managed) return; name = expr_to_str(arg); sm_warning("freeing device managed memory: '%s'", name); free_string(name);
}
void check_put_device(int id) { my_id = id;
if (option_project != PROJ_KERNEL) return; add_function_hook("put_device", &match_put_device, INT_PTR(0)); add_function_hook("device_unregister", &match_put_device, INT_PTR(0)); add_function_hook("kfree", &match_free, INT_PTR(0)); add_modification_hook(my_id, &set_ignore);
}
On Tue, Jun 01, 2021 at 09:17:04PM +0800, Dongliang Mu wrote:
On Mon, May 31, 2021 at 7:02 PM Dan Carpenter dan.carpenter@oracle.com wrote:
@@ -701,6 +706,7 @@ static void snd_ctl_led_sysfs_remove(struct snd_card *card) sysfs_remove_link(&card->ctl_dev.kobj, link_name); sysfs_remove_link(&led_card->dev.kobj, "card"); device_del(&led_card->dev);
put_device(&led_card->dev); kfree(led_card); led->cards[card->number] = NULL; }
Btw, I have created a Smatch warning for this type of code where we have:
put_device(&foo->dev); kfree(foo);
I don't think this should be a bug pattern. put_device will drop the final reference of one object with struct device and invoke device_release to release some resources.
The release function should only clean up the internal resources in the device object. It should not touch the led_card which contains the device object.
It's only a use after free if you turn CONFIG_DEBUG_KOBJECT_RELEASE debugging on, which you would never do in a production environment. The put_device() function calls kobject_release():
lib/kobject.c 725 static void kobject_release(struct kref *kref) 726 { 727 struct kobject *kobj = container_of(kref, struct kobject, kref); 728 #ifdef CONFIG_DEBUG_KOBJECT_RELEASE 729 unsigned long delay = HZ + HZ * (get_random_int() & 0x3); 730 pr_info("kobject: '%s' (%p): %s, parent %p (delayed %ld)\n", 731 kobject_name(kobj), kobj, __func__, kobj->parent, delay); 732 INIT_DELAYED_WORK(&kobj->release, kobject_delayed_cleanup); ^^^^^^^^^^^^^^^^^^^^^^^
733 734 schedule_delayed_work(&kobj->release, delay); 735 #else 736 kobject_cleanup(kobj); 737 #endif 738 }
This release will be done later and it references led_card->dev which is now freed.
The Smatch check did work pretty decently. These are all use after free bugs if you have CONFIG_DEBUG_KOBJECT_RELEASE enabled. (Line numbers from Friday's linux-next). I'm not going to bother fixing them because they're only an issue for CONFIG_DEBUG_KOBJECT_RELEASE and not for production but I will email people when more of these bugs are introduced.
sound/core/control_led.c:688 snd_ctl_led_sysfs_add() warn: freeing device managed memory (UAF): 'led_card' drivers/usb/gadget/function/f_mass_storage.c:2649 fsg_common_remove_lun() warn: freeing device managed memory (UAF): 'lun' drivers/usb/gadget/function/f_mass_storage.c:2818 fsg_common_create_lun() warn: freeing device managed memory (UAF): 'lun' drivers/usb/gadget/function/f_mass_storage.c:2881 fsg_common_release() warn: freeing device managed memory (UAF): 'lun' drivers/w1/w1.c:810 w1_unref_slave() warn: freeing device managed memory (UAF): 'sl' drivers/pci/endpoint/pci-epc-core.c:671 pci_epc_destroy() warn: freeing device managed memory (UAF): 'epc' drivers/pci/endpoint/pci-epc-core.c:742 __pci_epc_create() warn: freeing device managed memory (UAF): 'epc' drivers/infiniband/ulp/srp/ib_srp.c:3930 srp_add_port() warn: freeing device managed memory (UAF): 'host' drivers/infiniband/ulp/srp/ib_srp.c:4058 srp_remove_one() warn: freeing device managed memory (UAF): 'host' drivers/infiniband/ulp/rtrs/rtrs-clt.c:2695 alloc_clt() warn: freeing device managed memory (UAF): 'clt' drivers/media/pci/solo6x10/solo6x10-core.c:156 free_solo_dev() warn: freeing device managed memory (UAF): 'solo_dev' drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:203 nfp_cpp_free() warn: freeing device managed memory (UAF): 'cpp' drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:1258 nfp_cpp_from_operations() warn: freeing device managed memory (UAF): 'cpp' drivers/net/netdevsim/bus.c:354 nsim_bus_dev_del() warn: freeing device managed memory (UAF): 'nsim_bus_dev' drivers/thermal/thermal_core.c:1002 __thermal_cooling_device_register() warn: freeing device managed memory (UAF): 'cdev'
regards, dan carpenter
On Tue, Jun 1, 2021 at 9:46 PM Dan Carpenter dan.carpenter@oracle.com wrote:
On Tue, Jun 01, 2021 at 09:17:04PM +0800, Dongliang Mu wrote:
On Mon, May 31, 2021 at 7:02 PM Dan Carpenter dan.carpenter@oracle.com wrote:
@@ -701,6 +706,7 @@ static void snd_ctl_led_sysfs_remove(struct snd_card *card) sysfs_remove_link(&card->ctl_dev.kobj, link_name); sysfs_remove_link(&led_card->dev.kobj, "card"); device_del(&led_card->dev);
put_device(&led_card->dev); kfree(led_card); led->cards[card->number] = NULL; }
Btw, I have created a Smatch warning for this type of code where we have:
put_device(&foo->dev); kfree(foo);
I don't think this should be a bug pattern. put_device will drop the final reference of one object with struct device and invoke device_release to release some resources.
The release function should only clean up the internal resources in the device object. It should not touch the led_card which contains the device object.
It's only a use after free if you turn CONFIG_DEBUG_KOBJECT_RELEASE debugging on, which you would never do in a production environment. The put_device() function calls kobject_release():
This is interesting. Let's dig a little deeper.
lib/kobject.c 725 static void kobject_release(struct kref *kref) 726 { 727 struct kobject *kobj = container_of(kref, struct kobject, kref); 728 #ifdef CONFIG_DEBUG_KOBJECT_RELEASE 729 unsigned long delay = HZ + HZ * (get_random_int() & 0x3); 730 pr_info("kobject: '%s' (%p): %s, parent %p (delayed %ld)\n", 731 kobject_name(kobj), kobj, __func__, kobj->parent, delay); 732 INIT_DELAYED_WORK(&kobj->release, kobject_delayed_cleanup); ^^^^^^^^^^^^^^^^^^^^^^^
733 734 schedule_delayed_work(&kobj->release, delay); 735 #else 736 kobject_cleanup(kobj); 737 #endif 738 }
This release will be done later and it references led_card->dev which is now freed.
The call chain of kobject_delayed_cleanup is kobject_delayed_cleanup -> kobject_cleanup. From the comment, kobject_cleanup should only clean the resources in the kobject, without touching the dev object. To further confirm, I checked the implementation and found out there seem no references to the dev object. Would you mind pointing out the reference to dev object? Moreover, if kobject_cleanup touches the resources out of kobject, shall we directly change this function other than its callees?
#ifdef CONFIG_DEBUG_KOBJECT_RELEASE static void kobject_delayed_cleanup(struct work_struct *work) { kobject_cleanup(container_of(to_delayed_work(work), struct kobject, release)); } #endif
/* * kobject_cleanup - free kobject resources. * @kobj: object to cleanup */ static void kobject_cleanup(struct kobject *kobj) { struct kobject *parent = kobj->parent; struct kobj_type *t = get_ktype(kobj); const char *name = kobj->name;
pr_debug("kobject: '%s' (%p): %s, parent %p\n", kobject_name(kobj), kobj, __func__, kobj->parent);
if (t && !t->release) pr_debug("kobject: '%s' (%p): does not have a release() function, it is broken and must be fixed. See Documentation/core-api/kobject.rst.\n", kobject_name(kobj), kobj);
/* remove from sysfs if the caller did not do it */ if (kobj->state_in_sysfs) { pr_debug("kobject: '%s' (%p): auto cleanup kobject_del\n", kobject_name(kobj), kobj); __kobject_del(kobj); } else { /* avoid dropping the parent reference unnecessarily */ parent = NULL; }
if (t && t->release) { pr_debug("kobject: '%s' (%p): calling ktype release\n", kobject_name(kobj), kobj); t->release(kobj); }
/* free name if we allocated it */ if (name) { pr_debug("kobject: '%s': free name\n", name); kfree_const(name); }
kobject_put(parent); }
The Smatch check did work pretty decently. These are all use after free bugs if you have CONFIG_DEBUG_KOBJECT_RELEASE enabled. (Line numbers from Friday's linux-next). I'm not going to bother fixing them because they're only an issue for CONFIG_DEBUG_KOBJECT_RELEASE and not for production but I will email people when more of these bugs are introduced.
sound/core/control_led.c:688 snd_ctl_led_sysfs_add() warn: freeing device managed memory (UAF): 'led_card' drivers/usb/gadget/function/f_mass_storage.c:2649 fsg_common_remove_lun() warn: freeing device managed memory (UAF): 'lun' drivers/usb/gadget/function/f_mass_storage.c:2818 fsg_common_create_lun() warn: freeing device managed memory (UAF): 'lun' drivers/usb/gadget/function/f_mass_storage.c:2881 fsg_common_release() warn: freeing device managed memory (UAF): 'lun' drivers/w1/w1.c:810 w1_unref_slave() warn: freeing device managed memory (UAF): 'sl' drivers/pci/endpoint/pci-epc-core.c:671 pci_epc_destroy() warn: freeing device managed memory (UAF): 'epc' drivers/pci/endpoint/pci-epc-core.c:742 __pci_epc_create() warn: freeing device managed memory (UAF): 'epc' drivers/infiniband/ulp/srp/ib_srp.c:3930 srp_add_port() warn: freeing device managed memory (UAF): 'host' drivers/infiniband/ulp/srp/ib_srp.c:4058 srp_remove_one() warn: freeing device managed memory (UAF): 'host' drivers/infiniband/ulp/rtrs/rtrs-clt.c:2695 alloc_clt() warn: freeing device managed memory (UAF): 'clt' drivers/media/pci/solo6x10/solo6x10-core.c:156 free_solo_dev() warn: freeing device managed memory (UAF): 'solo_dev' drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:203 nfp_cpp_free() warn: freeing device managed memory (UAF): 'cpp' drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:1258 nfp_cpp_from_operations() warn: freeing device managed memory (UAF): 'cpp' drivers/net/netdevsim/bus.c:354 nsim_bus_dev_del() warn: freeing device managed memory (UAF): 'nsim_bus_dev' drivers/thermal/thermal_core.c:1002 __thermal_cooling_device_register() warn: freeing device managed memory (UAF): 'cdev'
regards, dan carpenter
On Tue, Jun 01, 2021 at 10:19:22PM +0800, Dongliang Mu wrote:
On Tue, Jun 1, 2021 at 9:46 PM Dan Carpenter dan.carpenter@oracle.com wrote:
On Tue, Jun 01, 2021 at 09:17:04PM +0800, Dongliang Mu wrote:
On Mon, May 31, 2021 at 7:02 PM Dan Carpenter dan.carpenter@oracle.com wrote:
@@ -701,6 +706,7 @@ static void snd_ctl_led_sysfs_remove(struct snd_card *card) sysfs_remove_link(&card->ctl_dev.kobj, link_name); sysfs_remove_link(&led_card->dev.kobj, "card"); device_del(&led_card->dev);
put_device(&led_card->dev); kfree(led_card); led->cards[card->number] = NULL; }
Btw, I have created a Smatch warning for this type of code where we have:
put_device(&foo->dev); kfree(foo);
I don't think this should be a bug pattern. put_device will drop the final reference of one object with struct device and invoke device_release to release some resources.
The release function should only clean up the internal resources in the device object. It should not touch the led_card which contains the device object.
It's only a use after free if you turn CONFIG_DEBUG_KOBJECT_RELEASE debugging on, which you would never do in a production environment. The put_device() function calls kobject_release():
This is interesting. Let's dig a little deeper.
lib/kobject.c 725 static void kobject_release(struct kref *kref) 726 { 727 struct kobject *kobj = container_of(kref, struct kobject, kref); 728 #ifdef CONFIG_DEBUG_KOBJECT_RELEASE 729 unsigned long delay = HZ + HZ * (get_random_int() & 0x3); 730 pr_info("kobject: '%s' (%p): %s, parent %p (delayed %ld)\n", 731 kobject_name(kobj), kobj, __func__, kobj->parent, delay); 732 INIT_DELAYED_WORK(&kobj->release, kobject_delayed_cleanup); ^^^^^^^^^^^^^^^^^^^^^^^
733 734 schedule_delayed_work(&kobj->release, delay); 735 #else 736 kobject_cleanup(kobj); 737 #endif 738 }
This release will be done later and it references led_card->dev which is now freed.
The call chain of kobject_delayed_cleanup is kobject_delayed_cleanup -> kobject_cleanup. From the comment, kobject_cleanup should only clean the resources in the kobject, without touching the dev object. To further confirm, I checked the implementation and found out there seem no references to the dev object. Would you mind pointing out the reference to dev object?
The kobj struct is included in the dev struct, it's not a pointer.
led_card->dev.kobj.name
See all the '.' characters and only one "->"? If you kfree(led_card) then you can't use led_card->dev.kobj any more.
Moreover, if kobject_cleanup touches the resources out of kobject, shall we directly change this function other than its callees?
I don't understand your question here. The rest of the email looks like some copy and pasted code but I don't know what I'm supposed to be looking for.
I really feel like I have explained things very as well as I can and I'm not sure what more I can do to help... :/
regards, dan carpenter
On Tue, Jun 1, 2021 at 10:43 PM Dan Carpenter dan.carpenter@oracle.com wrote:
On Tue, Jun 01, 2021 at 10:19:22PM +0800, Dongliang Mu wrote:
On Tue, Jun 1, 2021 at 9:46 PM Dan Carpenter dan.carpenter@oracle.com wrote:
On Tue, Jun 01, 2021 at 09:17:04PM +0800, Dongliang Mu wrote:
On Mon, May 31, 2021 at 7:02 PM Dan Carpenter dan.carpenter@oracle.com wrote:
@@ -701,6 +706,7 @@ static void snd_ctl_led_sysfs_remove(struct snd_card *card) sysfs_remove_link(&card->ctl_dev.kobj, link_name); sysfs_remove_link(&led_card->dev.kobj, "card"); device_del(&led_card->dev);
put_device(&led_card->dev); kfree(led_card); led->cards[card->number] = NULL; }
Btw, I have created a Smatch warning for this type of code where we have:
put_device(&foo->dev); kfree(foo);
I don't think this should be a bug pattern. put_device will drop the final reference of one object with struct device and invoke device_release to release some resources.
The release function should only clean up the internal resources in the device object. It should not touch the led_card which contains the device object.
It's only a use after free if you turn CONFIG_DEBUG_KOBJECT_RELEASE debugging on, which you would never do in a production environment. The put_device() function calls kobject_release():
This is interesting. Let's dig a little deeper.
lib/kobject.c 725 static void kobject_release(struct kref *kref) 726 { 727 struct kobject *kobj = container_of(kref, struct kobject, kref); 728 #ifdef CONFIG_DEBUG_KOBJECT_RELEASE 729 unsigned long delay = HZ + HZ * (get_random_int() & 0x3); 730 pr_info("kobject: '%s' (%p): %s, parent %p (delayed %ld)\n", 731 kobject_name(kobj), kobj, __func__, kobj->parent, delay); 732 INIT_DELAYED_WORK(&kobj->release, kobject_delayed_cleanup); ^^^^^^^^^^^^^^^^^^^^^^^
733 734 schedule_delayed_work(&kobj->release, delay); 735 #else 736 kobject_cleanup(kobj); 737 #endif 738 }
This release will be done later and it references led_card->dev which is now freed.
The call chain of kobject_delayed_cleanup is kobject_delayed_cleanup -> kobject_cleanup. From the comment, kobject_cleanup should only clean the resources in the kobject, without touching the dev object. To further confirm, I checked the implementation and found out there seem no references to the dev object. Would you mind pointing out the reference to dev object?
The kobj struct is included in the dev struct, it's not a pointer.
led_card->dev.kobj.name
See all the '.' characters and only one "->"? If you kfree(led_card) then you can't use led_card->dev.kobj any more.
Yeah, you're right. I originally thought the field kobj is a pointer and there should no problem. Please leave alone the question below. I thought up this question based on the assumption before.
Moreover, if kobject_cleanup touches the resources out of kobject, shall we directly change this function other than its callees?
I don't understand your question here. The rest of the email looks like some copy and pasted code but I don't know what I'm supposed to be looking for.
I really feel like I have explained things very as well as I can and I'm not sure what more I can do to help... :/
You already helped too much, and I learned a lot from the discussion with you. Don't be bothered by my stupid questions. :)
regards, dan carpenter
participants (3)
-
Dan Carpenter
-
Dongliang Mu
-
慕冬亮