[alsa-devel] [BUG] Kernel crash on Allwinner H3 due to sound core changes
Hi all,
With today's linux-next (next-20180228), the kernel on the Allwinner H3 SoC crashes with a dmesg like this: https://pastebin.com/raw/0D5JeaJ8
I bisected the kernel and the first offending commit is: be7ee5f32a9a ("ASoC: soc-generic-dmaengine-pcm: replace platform to component")
I know that the crash message is completely unrelated to the sound subsystem, but it turns out that if I disable CONFIG_SND_SUN4I_CODEC the kernel works fine, although this way I lose analog audio output.
Any suggestions as to what the issue could be?
Best regards, Jernej
Hi Jernej
Thank you for your report
Hmm... I'm sorry but I have no idea...
One thing I noticed is that...
=> [ 1.662605] Unable to handle kernel NULL pointer dereference at virtual address 00000004
...
[ 1.703312] PC is at strlen+0x0/0x2c
[ 1.706976] LR is at kobject_get_path+0x1c/0xb4
My guess is that this "strlen" comes from get_kobj_path_length() (?), which is shown below.
static int get_kobj_path_length(struct kobject *kobj)
{
	...
=>	if (kobject_name(parent) == NULL)
		return 0;
=>	length += strlen(kobject_name(parent)) + 1;
	...
}
Your "parent name" is 0x00000004 instead of NULL somehow...
[ 2.170203] [<c063efbc>] (strlen) from [<c0633f08>] (kobject_get_path+0x1c/0xb4)
[ 2.183581] [<c0633f08>] (kobject_get_path) from [<c0635178>] (kobject_uevent_env+0xd4/0x5d0)
[ 2.198143] [<c0635178>] (kobject_uevent_env) from [<c0428c54>] (device_add+0x3b4/0x5b4)
[ 2.212252] [<c0428c54>] (device_add) from [<c05205d0>] (extcon_dev_register+0x348/0x6c0)
[ 2.226443] [<c05205d0>] (extcon_dev_register) from [<c05210a4>] (devm_extcon_dev_register+0x38/0x70)
[ 2.241717] [<c05210a4>] (devm_extcon_dev_register) from [<c037505c>] (sun4i_usb_phy_probe+0x180/0x614)
[ 2.257279] [<c037505c>] (sun4i_usb_phy_probe) from [<c042d0c0>] (platform_drv_probe+0x50/0xac)
According to the log, this crash came from the edev of extcon_dev_register(), which is *allocated* by devm_extcon_dev_allocate(). I guess "parent" is set by it? Hmm... do "snd_dmaengine_xxx" and "devm_extcon_dev_allocate" have any relation?
static int sun4i_usb_phy_probe(struct platform_device *pdev)
{
	...
=>	data->extcon = devm_extcon_dev_allocate(dev, sun4i_usb_phy0_cable);
	...
	ret = devm_extcon_dev_register(dev, data->extcon);
	                                    ~~~~~~~~~~~~
	...
}
Best regards --- Kuninori Morimoto
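The reasoning in the message above can be illustrated with a small userspace sketch. It is not the kernel code: `struct node`, `node_name()`, `path_length()`, `passes_null_check()` and `bogus_name()` are hypothetical stand-ins chosen for illustration. It only shows that a `NULL` check guards against a missing name, not against a corrupted non-NULL pointer such as 0x00000004, which would go on to reach `strlen()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct kobject: a name and a parent link. */
struct node {
	const char *name;
	struct node *parent;
};

static const char *node_name(const struct node *n)
{
	return n->name;
}

/*
 * Walk up the ancestors and add strlen(name) + 1 per level (the +1 is
 * the leading '/'). Returns 0 if any level has no name.
 */
static int path_length(const struct node *n)
{
	int length = 1; /* room for the terminating NUL */

	do {
		if (node_name(n) == NULL)
			return 0;
		length += (int)strlen(node_name(n)) + 1;
		n = n->parent;
	} while (n);

	return length;
}

/* The corrupted value from the oops; it is never dereferenced here. */
static const char *bogus_name(void)
{
	return (const char *)(uintptr_t)0x4;
}

/* True if the NULL check alone would let this pointer reach strlen(). */
static int passes_null_check(const char *name)
{
	return name != NULL;
}
```

With a node named "extcon0" under a parent named "devices", `path_length()` returns 17 (two 7-character names, one '/' each, plus the terminating NUL), while `passes_null_check(bogus_name())` is true, which is why the walk would not stop early on a corrupted name.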
Hi Kuninori,
I'm responding to my own mail, since I didn't receive yours for some reason, but I still saw your response in the mailing list archive.
I did a bit of research and I can tell you that different kernel options (some drivers added or removed) change how or where the kernel crashes. That would suggest some kind of memory corruption.
I removed parts of the code from the sun4i codec driver and, interestingly, it doesn't crash if I remove the following lines:
	ret = devm_snd_dmaengine_pcm_register(&pdev->dev, NULL, 0);
	if (ret) {
		dev_err(&pdev->dev, "Failed to register against DMAEngine\n");
		goto err_assert_reset;
	}
Is it possible that the NULL pointer causes trouble somewhere down the line?
I tested this on linux-next, next-20180228 tag.
Best regards, Jernej
On Thu, Mar 01, 2018 at 11:23:57PM +0100, Jernej Škrabec wrote:
I removed parts of the code from the sun4i codec driver and, interestingly, it doesn't crash if I remove the following lines:
	ret = devm_snd_dmaengine_pcm_register(&pdev->dev, NULL, 0);
	if (ret) {
		dev_err(&pdev->dev, "Failed to register against DMAEngine\n");
		goto err_assert_reset;
	}
Is it possible that the NULL pointer causes trouble somewhere down the line?
Shouldn't be. That's just the configuration, which is optional and not what we're crashing trying to register; we can mostly configure things by querying the capabilities of the DMA controller via the dmaengine API these days. You're removing all the DMA support there, so you're cutting out a huge segment of the initialization of both this driver and the machine driver. Other sunxi devices seem to be starting happily in -next, so there's something system dependent here...
Hi,
I enabled memory debugging and it seems that there is an issue caused by loading the sun4i-codec driver, and it is somehow connected to snd_dmaengine_pcm_unregister().
Here is the relevant dmesg: https://pastebin.com/raw/80K9GPnB
Does this tell you anything?
Best regards, Jernej
Hi,
I found the issue. Commit be7ee5f32a9a ("ASoC: soc-generic-dmaengine-pcm: replace platform to component") changes struct dmaengine_pcm:
 struct dmaengine_pcm {
 	struct dma_chan *chan[SNDRV_PCM_STREAM_LAST + 1];
 	const struct snd_dmaengine_pcm_config *config;
-	struct snd_soc_platform platform;
+	struct snd_soc_component component;
 	unsigned int flags;
 };
In snd_dmaengine_pcm_register():
	ret = snd_soc_add_component(dev, &pcm->component, &dmaengine_pcm_component, NULL, 0);
Now, sun4i-codec returns -EPROBE_DEFER the first time it is probed, since the driver for the analog part is not yet loaded. Because of that, all components get destroyed.
snd_dmaengine_pcm_unregister() calls snd_soc_unregister_component(), which in turn calls __snd_soc_unregister_component() repeatedly (until it fails).
The issue is that __snd_soc_unregister_component() calls kfree() on the component pointer, and that naturally can't succeed: the component was never kmalloc'ed, because it is part of a bigger structure, struct dmaengine_pcm.
What would be the best fix? Changing struct dmaengine_pcm to hold a pointer to a component, so that it can be freed?
Best regards, Jernej
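The failure mode described in the message above can be sketched in userspace C. The structures below are hypothetical, heavily reduced stand-ins (`struct dmaengine_pcm_like`, `to_pcm()`, `is_allocation_base()` and `demo_embedded_release()` are names invented for this illustration), and `calloc()`/`free()` play the role of `kzalloc()`/`kfree()`. The point is only that the address of an embedded member is not the address the allocator returned, so it must not be passed to the free function directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, heavily reduced stand-ins for the ASoC structures. */
struct component {
	int id;
};

struct dmaengine_pcm_like {
	void *chan[2];
	const void *config;
	struct component component; /* embedded, not separately allocated */
	unsigned int flags;
};

/* Recover the enclosing object from a pointer to its embedded member. */
static struct dmaengine_pcm_like *to_pcm(struct component *c)
{
	return (struct dmaengine_pcm_like *)
		((char *)c - offsetof(struct dmaengine_pcm_like, component));
}

/*
 * A pointer may only be handed to free()/kfree() if it is the address
 * the allocator returned. For an embedded member that is the address
 * of the container, never the address of the member.
 */
static int is_allocation_base(const void *allocation, const void *ptr)
{
	return allocation == ptr;
}

/* Allocate the container, then release it via the member pointer. */
static int demo_embedded_release(void)
{
	struct dmaengine_pcm_like *pcm = calloc(1, sizeof(*pcm));
	struct component *c;
	int member_is_base;

	if (!pcm)
		return -1;

	c = &pcm->component;
	member_is_base = is_allocation_base(pcm, c); /* 0: offset is non-zero */

	free(to_pcm(c)); /* correct: frees the container, not the member */
	return member_is_base;
}
```

Here `demo_embedded_release()` returns 0, showing that the member pointer is not an allocation base; calling `free()` on it directly would be undefined behaviour, which is the userspace analogue of the `kfree()` on `&pcm->component` discussed in the thread.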
Hi Jernej
Thank you for your hard work
Ahh.. indeed. Good catch! How about adding a flag like this? This is just an idea, not tested and not compiled, but maybe it can help you?
One note here is that reusing the "registered_as_component" flag is not a good idea, because it will be removed once platform/codec are removed.
------------------------
diff --git a/include/sound/soc.h b/include/sound/soc.h
index 1a73232..b9b1b4c 100644
--- a/include/sound/soc.h
+++ b/include/sound/soc.h
@@ -853,6 +853,7 @@ struct snd_soc_component {
 	unsigned int ignore_pmdown_time:1; /* pmdown_time is ignored at stop */
 	unsigned int registered_as_component:1;
 	unsigned int suspended:1; /* is in suspend PM state */
+	unsigned int alloced_component:1;
 
 	struct list_head list;
 	struct list_head card_aux_list; /* for auxiliary bound components */
diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c
index c0edac8..0e33bcf 100644
--- a/sound/soc/soc-core.c
+++ b/sound/soc/soc-core.c
@@ -3492,6 +3492,7 @@ int snd_soc_register_component(struct device *dev,
 	if (!component)
 		return -ENOMEM;
 
+	component->alloced_component = 1;
 	return snd_soc_add_component(dev, component, component_driver,
 				     dai_drv, num_dai);
 }
@@ -3523,7 +3524,9 @@ static int __snd_soc_unregister_component(struct device *dev)
 
 	if (found) {
 		snd_soc_component_cleanup(component);
-		kfree(component);
+
+		if (component->alloced_component)
+			kfree(component);
 	}
 
 	return found;
------------------------
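The ownership-flag idea in the diff above can be sketched in userspace as follows. This is an assumption-laden miniature, not the ASoC API: `add_component()`, `register_component()`, `unregister_component()` and the `live_allocations` counter are hypothetical names, and their signatures differ from the kernel functions they loosely mirror. The sketch only shows the rule "free what you allocated, and nothing else".

```c
#include <assert.h>
#include <stdlib.h>

static int live_allocations; /* counts outstanding heap components */

struct component {
	unsigned int alloced_component:1; /* set only by register_component() */
	int registered;
};

/* Caller owns the storage (for example, embedded in a larger struct). */
static int add_component(struct component *c)
{
	c->registered = 1;
	return 0;
}

/* The core allocates the storage and records that it must free it. */
static struct component *register_component(void)
{
	struct component *c = calloc(1, sizeof(*c));

	if (!c)
		return NULL;
	live_allocations++;
	c->alloced_component = 1;
	add_component(c);
	return c;
}

/* Free only what the core itself allocated. */
static void unregister_component(struct component *c)
{
	c->registered = 0;
	if (c->alloced_component) {
		free(c);
		live_allocations--;
	}
}
```

A component that lives on the stack or inside another structure goes through `add_component()` and is left alone at unregister time, while one obtained from `register_component()` is freed. The cost of this design is an extra bit of state that every allocation path has to keep correct.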
Hi,
Thank you for looking into it so quickly.
I tested this patch and there is no crash anymore. If you send it as a fix, you can add:
Reported-by: Jernej Skrabec <jernej.skrabec@siol.net>
Tested-by: Jernej Skrabec <jernej.skrabec@siol.net>
Best regards, Jernej
On Thu, Mar 08, 2018 at 01:21:02AM +0000, Kuninori Morimoto wrote:
Ahh.. indeed. Good catch! How about adding a flag like this? This is just an idea, not tested and not compiled, but maybe it can help you?
I think this makes sense as a patch. We might want to disallow allocating components as part of a bigger struct so everything is more consistent but that's a bigger thing.
Hi Mark, Jernej
My previous patch used a new flag (= .alloced_component), but I think that is not a good idea. I also noticed that snd_soc_add_component() calls kfree(component) too (= it has the same bug).
So how about the one below? I would like to post it instead of the previous one.
# I will go to ELC next week, so posting the patch will be
# 2 weeks later
------------
diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c
index c0edac8..4a8de23 100644
--- a/sound/soc/soc-core.c
+++ b/sound/soc/soc-core.c
@@ -3476,7 +3476,6 @@ int snd_soc_add_component(struct device *dev,
 err_cleanup:
 	snd_soc_component_cleanup(component);
 err_free:
-	kfree(component);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(snd_soc_add_component);
@@ -3488,7 +3487,7 @@ int snd_soc_register_component(struct device *dev,
 {
 	struct snd_soc_component *component;
 
-	component = kzalloc(sizeof(*component), GFP_KERNEL);
+	component = devm_kzalloc(dev, sizeof(*component), GFP_KERNEL);
 	if (!component)
 		return -ENOMEM;
 
@@ -3523,7 +3522,6 @@ static int __snd_soc_unregister_component(struct device *dev)
 
 	if (found) {
 		snd_soc_component_cleanup(component);
-		kfree(component);
 	}
 
 	return found;
------------
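The approach in the diff above ties the component's lifetime to the device instead of freeing it by hand. A rough userspace sketch of that idea follows; it is not how the kernel's devres code is implemented, and `struct device_like`, `managed_zalloc()`, `device_release_all()` and `unregister_component()` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Header placed in front of every managed allocation. */
struct managed {
	struct managed *next;
	/* the payload follows; pointer alignment is enough for this sketch */
};

/* Hypothetical miniature of a device that owns managed allocations. */
struct device_like {
	struct managed *resources;
	int released; /* number of allocations freed at teardown */
};

/* Allocate zeroed memory whose lifetime is tied to the device. */
static void *managed_zalloc(struct device_like *dev, size_t size)
{
	struct managed *m = calloc(1, sizeof(*m) + size);

	if (!m)
		return NULL;
	m->next = dev->resources;
	dev->resources = m;
	return m + 1;
}

/* Called on probe failure or unbind: frees every managed allocation. */
static void device_release_all(struct device_like *dev)
{
	while (dev->resources) {
		struct managed *m = dev->resources;

		dev->resources = m->next;
		free(m);
		dev->released++;
	}
}

struct component {
	int registered;
};

/* No free() here, so embedded and managed components are treated alike. */
static void unregister_component(struct component *c)
{
	c->registered = 0;
}
```

Because the unregister path no longer frees anything, it does not need to know where the component's storage came from; the device teardown releases whatever was allocated on its behalf, and embedded components are released together with their container.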
Hi,
That patch also prevents the crash, so you can add my Tested-by and Reported-by tags to this patch too.
Best regards, Jernej
On Thu, Mar 08, 2018 at 11:49:18PM +0000, Kuninori Morimoto wrote:
My previous patch used a new flag (= .alloced_component), but I think that is not a good idea. I also noticed that snd_soc_add_component() calls kfree(component) too (= it has the same bug).
So how about the one below? I would like to post it instead of the previous one.
That should work also.
# I will go to ELC next week, so posting the patch will be
# 2 weeks later
I'll be there as well.
The patch
soc-core: don't call kfree() for component
has been applied to the asoc tree at
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git
All being well this means that it will be integrated into the linux-next tree (usually sometime in the next 24 hours) and sent to Linus during the next merge window (or sooner if it is a bug fix), however if problems are discovered then the patch may be dropped or reverted.
You may get further e-mails resulting from automated or manual testing and review of the tree, please engage with people reporting problems and send followup patches addressing any issues that are reported if needed.
If any updates are required or you are submitting further changes they should be sent as incremental updates against current git, existing patches will not be replaced.
Please add any relevant lists and maintainers to the CCs when replying to this mail.
Thanks, Mark
From 7ecbd6a91b1e9bb90a4f3be641669347aacc5ab5 Mon Sep 17 00:00:00 2001
From: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Date: Mon, 19 Mar 2018 07:27:17 +0000
Subject: [PATCH] soc-core: don't call kfree() for component
When a driver registers its component with ALSA SoC, almost all drivers use snd_soc_register_component(), but soc-generic-dmaengine-pcm uses snd_soc_add_component().
The existing component functions assumed that a registered component had been allocated, and called kfree() on it. But a user of snd_soc_add_component() does not allocate it that way.
This patch uses devm_kzalloc() instead of kzalloc() for the component, and no longer calls kfree(). This patch fixes commit be7ee5f32a9a ("ASoC: soc-generic-dmaengine-pcm: replace platform to component"). The Allwinner H3 SoC will crash without this patch. Thanks to Jernej for the report.
Reported-by: Jernej Skrabec <jernej.skrabec@siol.net>
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Tested-by: Jernej Skrabec <jernej.skrabec@siol.net>
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 sound/soc/soc-core.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c
index 9558125b448d..8b1ef90f7b57 100644
--- a/sound/soc/soc-core.c
+++ b/sound/soc/soc-core.c
@@ -3454,7 +3454,6 @@ int snd_soc_add_component(struct device *dev,
 err_cleanup:
 	snd_soc_component_cleanup(component);
 err_free:
-	kfree(component);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(snd_soc_add_component);
@@ -3466,7 +3465,7 @@ int snd_soc_register_component(struct device *dev,
 {
 	struct snd_soc_component *component;
 
-	component = kzalloc(sizeof(*component), GFP_KERNEL);
+	component = devm_kzalloc(dev, sizeof(*component), GFP_KERNEL);
 	if (!component)
 		return -ENOMEM;
 
@@ -3501,7 +3500,6 @@ static int __snd_soc_unregister_component(struct device *dev)
 
 	if (found) {
 		snd_soc_component_cleanup(component);
-		kfree(component);
 	}
 
 	return found;
participants (3)
- Jernej Škrabec
- Kuninori Morimoto
- Mark Brown