Re: [alsa-devel] ALSA sound core device deinit crash
(Please don't drop Cc to ML)
On Fri, 08 Jul 2016 07:35:37 +0200, b_lkasam@codeaurora.org wrote:
On 2016-07-08 10:58, Takashi Iwai wrote:
On Fri, 08 Jul 2016 07:19:05 +0200, b_lkasam@codeaurora.org wrote:
Hi Alsa team, There is kernel crash observed when soundcard register failure case as below ->
diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c old mode 100644 new mode 100755 index 0495890..60a1eb0 --- a/sound/soc/soc-core.c +++ b/sound/soc/soc-core.c @@ -1382,6 +1382,7 @@ static int soc_probe_link_dais(struct snd_soc_card *card, int num, int order) /* do machine specific initialization */ if (dai_link->init) { ret = dai_link->init(rtd);
- ret = -ENODEV; // -> we can force error here to reproduce crash
easily. if (ret < 0) { dev_err(card->dev, "ASoC: failed to init %s: %d\n", dai_link->name, ret);
If sound card fails at initialize at above, it is crashing in pcm_chmap_ctl_private_free().
<1>[ 40.646642] [01-01-2016 00:07:16 CPU:0x3] Unable to handle kernel paging request at virtual address ffffffc0da644b68 <1>[ 40.646664] [01-01-2016 00:07:16 CPU:0x3] pgd = ffffffc001d28000 <1>[ 40.646673] [01-01-2016 00:07:16 CPU:0x3] [ffffffc0da644b68] *pgd=00000000857fc003, *pud=00000000857fc003, *pmd=000000017ddd8003, *pte=00c000015a644793 <0>[ 40.646697] [01-01-2016 00:07:16 CPU:0x3] Internal error: Oops: 9600004f [#1] PREEMPT SMP <6>[ 40.646708] [01-01-2016 00:07:16 CPU:0x3] Modules linked in: brcm_bt_drv fm_drv brcm_hci_ldisc texfat(PO) <6>[ 40.646735] [01-01-2016 00:07:16 CPU:0x3] CPU: 3 PID: 299 Comm: kworker/u8:8 Tainted: P O 3.18.24-g41b2dda2-00002-gbe25a74 #1 <6>[ 40.646744] [01-01-2016 00:07:16 CPU:0x3] Hardware name: Qualcomm Technologies, Inc. MSM 8996 v3.x + PMI8996 MTP (DT) <6>[ 40.646763] [01-01-2016 00:07:16 CPU:0x3] Workqueue: deferwq deferred_probe_work_func <6>[ 40.646773] [01-01-2016 00:07:16 CPU:0x3] task: ffffffc0e951c880 ti: ffffffc03368c000 task.ti: ffffffc03368c000 <6>[ 40.646787] [01-01-2016 00:07:16 CPU:0x3] PC is at pcm_chmap_ctl_private_free+0x1c/0x2c <6>[ 40.646798] [01-01-2016 00:07:16 CPU:0x3] LR is at snd_ctl_free_one+0x20/0x34
FIX:
Can you look at the change below and share your comments on this? diff --git a/sound/core/device.c b/sound/core/device.c old mode 100644 new mode 100755 index 41bec30..eaffde1 --- a/sound/core/device.c +++ b/sound/core/device.c @@ -219,6 +219,7 @@ void snd_device_free_all(struct snd_card *card)
if (snd_BUG_ON(!card)) return;
list_for_each_entry_safe_reverse(dev, next, &card->devices,
list)
list_for_each_entry_safe(dev, next, &card->devices,
list) __snd_device_free(dev); }
Since control sound device has the lowest type value (SNDRV_DEV_CONTROL), it will be the first entry linked in the card->devices linked list head and will be the last one to be freed.
This issue seems to be resolved by modifying the sequence the sound devices in the card->devices list are freed as shown below (from “prev” direction to “next” direction) but I’m not sure if this is a right approach from ALSA perspective.
This doesn't look correct. The strange thing is that this error shouldn't happen no matter which free loop direction is. The chmap ctl should have been already removed by the disconnection before freeing. It means that either the disconnection isn't done properly or something else is missing.
Could you give the full stack trace? It's important to know which code path triggers it.
thanks,
Takashi
Hi Takashi,
Here is full stack trace -->
[Callstack] : (struct snd_pcm_chmap *)info = 0xFFFFFFC0DA6A5200 : (struct snd_pcm *)pcm = 0xFFFFFFC0DA644A80 //part of buddy page, read-only : info->stream = 0 : snd_card_free() was called
-012|pcm_chmap_ctl_private_free() //info->pcm->streams[info->stream].chmap_kctl = NULL -013|snd_ctl_free_one() -014|snd_ctl_remove() -015|snd_ctl_dev_free() -016|__snd_device_free() -017|snd_device_free_all() -018|snd_card_do_free(inline) -018|release_card_device() -019|device_release() -020|kobject_cleanup(inline) -020|kobject_release() -021|kobject_put() -022|put_device() -023|snd_card_free_when_closed() -024|snd_card_free() -025|snd_soc_instantiate_card(inline) -> //Here instantiate card failed for some reason, then triggers snd_card_free -025|snd_soc_register_card() -026|msm8996_asoc_machine_probe() -027|platform_drv_probe() -028|really_probe(inline) -028|driver_probe_device() -029|__device_attach() -030|bus_for_each_drv() -031|device_attach() -032|bus_probe_device() -033|deferred_probe_work_func() -034|static_key_count(inline) -034|static_key_false(inline) -034|trace_workqueue_execute_end(inline) -034|process_one_work() -035|worker_thread() -036|kthread() -037|ret_from_fork(asm) ---|end of frame
trigger point in soc-core.c in API snd_soc_instantiate_card(),
card_probe_error: if (card->remove) card->remove(card); snd_card_free(card->snd_card); -> this is
snd_card_free which internally leads to above function call.
OK, thanks. So this is the case where it frees before registering, and indeed there is a bug in the PCM chmap code. It's freed at disconnection but the disconnect is called only when it was registered.
Below is a quick fix. Give it a try.
thanks,
Takashi
--- diff --git a/sound/core/control.c b/sound/core/control.c index 9ff081cd03f4..fb096cb20a80 100644 --- a/sound/core/control.c +++ b/sound/core/control.c @@ -160,6 +160,8 @@ void snd_ctl_notify(struct snd_card *card, unsigned int mask, if (snd_BUG_ON(!card || !id)) return; + if (card->shutdown) + return; read_lock(&card->ctl_files_rwlock); #if IS_ENABLED(CONFIG_SND_MIXER_OSS) card->mixer_oss_change_count++; diff --git a/sound/core/pcm.c b/sound/core/pcm.c index 308c9ecf73db..8e980aa678d0 100644 --- a/sound/core/pcm.c +++ b/sound/core/pcm.c @@ -849,6 +849,14 @@ int snd_pcm_new_internal(struct snd_card *card, const char *id, int device, } EXPORT_SYMBOL(snd_pcm_new_internal);
+static void free_chmap(struct snd_pcm_str *pstr) +{ + if (pstr->chmap_kctl) { + snd_ctl_remove(pstr->pcm->card, pstr->chmap_kctl); + pstr->chmap_kctl = NULL; + } +} + static void snd_pcm_free_stream(struct snd_pcm_str * pstr) { struct snd_pcm_substream *substream, *substream_next; @@ -871,6 +879,7 @@ static void snd_pcm_free_stream(struct snd_pcm_str * pstr) kfree(setup); } #endif + free_chmap(pstr); if (pstr->substream_count) put_device(&pstr->dev); } @@ -1135,10 +1144,7 @@ static int snd_pcm_dev_disconnect(struct snd_device *device) for (cidx = 0; cidx < 2; cidx++) { if (!pcm->internal) snd_unregister_device(&pcm->streams[cidx].dev); - if (pcm->streams[cidx].chmap_kctl) { - snd_ctl_remove(pcm->card, pcm->streams[cidx].chmap_kctl); - pcm->streams[cidx].chmap_kctl = NULL; - } + free_chmap(&pcm->streams[cidx]); } mutex_unlock(&pcm->open_mutex); mutex_unlock(®ister_mutex);
On 2016-07-08 11:23, Takashi Iwai wrote:
(Please don't drop Cc to ML)
On Fri, 08 Jul 2016 07:35:37 +0200, b_lkasam@codeaurora.org wrote:
On 2016-07-08 10:58, Takashi Iwai wrote:
On Fri, 08 Jul 2016 07:19:05 +0200, b_lkasam@codeaurora.org wrote:
Hi Alsa team, There is kernel crash observed when soundcard register failure case as below ->
diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c old mode 100644 new mode 100755 index 0495890..60a1eb0 --- a/sound/soc/soc-core.c +++ b/sound/soc/soc-core.c @@ -1382,6 +1382,7 @@ static int soc_probe_link_dais(struct snd_soc_card *card, int num, int order) /* do machine specific initialization */ if (dai_link->init) { ret = dai_link->init(rtd);
- ret = -ENODEV; // -> we can force error here to reproduce crash
easily. if (ret < 0) { dev_err(card->dev, "ASoC: failed to init %s: %d\n", dai_link->name, ret);
If sound card fails at initialize at above, it is crashing in pcm_chmap_ctl_private_free().
<1>[ 40.646642] [01-01-2016 00:07:16 CPU:0x3] Unable to handle kernel paging request at virtual address ffffffc0da644b68 <1>[ 40.646664] [01-01-2016 00:07:16 CPU:0x3] pgd = ffffffc001d28000 <1>[ 40.646673] [01-01-2016 00:07:16 CPU:0x3] [ffffffc0da644b68] *pgd=00000000857fc003, *pud=00000000857fc003, *pmd=000000017ddd8003, *pte=00c000015a644793 <0>[ 40.646697] [01-01-2016 00:07:16 CPU:0x3] Internal error: Oops: 9600004f [#1] PREEMPT SMP <6>[ 40.646708] [01-01-2016 00:07:16 CPU:0x3] Modules linked in: brcm_bt_drv fm_drv brcm_hci_ldisc texfat(PO) <6>[ 40.646735] [01-01-2016 00:07:16 CPU:0x3] CPU: 3 PID: 299 Comm: kworker/u8:8 Tainted: P O 3.18.24-g41b2dda2-00002-gbe25a74 #1 <6>[ 40.646744] [01-01-2016 00:07:16 CPU:0x3] Hardware name: Qualcomm Technologies, Inc. MSM 8996 v3.x + PMI8996 MTP (DT) <6>[ 40.646763] [01-01-2016 00:07:16 CPU:0x3] Workqueue: deferwq deferred_probe_work_func <6>[ 40.646773] [01-01-2016 00:07:16 CPU:0x3] task: ffffffc0e951c880 ti: ffffffc03368c000 task.ti: ffffffc03368c000 <6>[ 40.646787] [01-01-2016 00:07:16 CPU:0x3] PC is at pcm_chmap_ctl_private_free+0x1c/0x2c <6>[ 40.646798] [01-01-2016 00:07:16 CPU:0x3] LR is at snd_ctl_free_one+0x20/0x34
FIX:
Can you look at the change below and share your comments on this? diff --git a/sound/core/device.c b/sound/core/device.c old mode 100644 new mode 100755 index 41bec30..eaffde1 --- a/sound/core/device.c +++ b/sound/core/device.c @@ -219,6 +219,7 @@ void snd_device_free_all(struct snd_card *card)
if (snd_BUG_ON(!card)) return;
list_for_each_entry_safe_reverse(dev, next, &card->devices,
list)
list_for_each_entry_safe(dev, next, &card->devices,
list) __snd_device_free(dev); }
Since control sound device has the lowest type value (SNDRV_DEV_CONTROL), it will be the first entry linked in the card->devices linked list head and will be the last one to be freed.
This issue seems to be resolved by modifying the sequence the sound devices in the card->devices list are freed as shown below (from “prev” direction to “next” direction) but I’m not sure if this is a right approach from ALSA perspective.
This doesn't look correct. The strange thing is that this error shouldn't happen no matter which free loop direction is. The chmap ctl should have been already removed by the disconnection before freeing. It means that either the disconnection isn't done properly or something else is missing.
Could you give the full stack trace? It's important to know which code path triggers it.
thanks,
Takashi
Hi Takashi,
Here is full stack trace -->
[Callstack] : (struct snd_pcm_chmap *)info = 0xFFFFFFC0DA6A5200 : (struct snd_pcm *)pcm = 0xFFFFFFC0DA644A80 //part of buddy page, read-only : info->stream = 0 : snd_card_free() was called
-012|pcm_chmap_ctl_private_free() //info->pcm->streams[info->stream].chmap_kctl = NULL -013|snd_ctl_free_one() -014|snd_ctl_remove() -015|snd_ctl_dev_free() -016|__snd_device_free() -017|snd_device_free_all() -018|snd_card_do_free(inline) -018|release_card_device() -019|device_release() -020|kobject_cleanup(inline) -020|kobject_release() -021|kobject_put() -022|put_device() -023|snd_card_free_when_closed() -024|snd_card_free() -025|snd_soc_instantiate_card(inline) -> //Here instantiate card failed for some reason, then triggers snd_card_free -025|snd_soc_register_card() -026|msm8996_asoc_machine_probe() -027|platform_drv_probe() -028|really_probe(inline) -028|driver_probe_device() -029|__device_attach() -030|bus_for_each_drv() -031|device_attach() -032|bus_probe_device() -033|deferred_probe_work_func() -034|static_key_count(inline) -034|static_key_false(inline) -034|trace_workqueue_execute_end(inline) -034|process_one_work() -035|worker_thread() -036|kthread() -037|ret_from_fork(asm) ---|end of frame
trigger point in soc-core.c in API snd_soc_instantiate_card(),
card_probe_error: if (card->remove) card->remove(card); snd_card_free(card->snd_card); -> this is
snd_card_free which internally leads to above function call.
OK, thanks. So this is the case where it frees before registering, and indeed there is a bug in the PCM chmap code. It's freed at disconnection but the disconnect is called only when it was registered.
Below is a quick fix. Give it a try.
thanks,
Takashi
diff --git a/sound/core/control.c b/sound/core/control.c index 9ff081cd03f4..fb096cb20a80 100644 --- a/sound/core/control.c +++ b/sound/core/control.c @@ -160,6 +160,8 @@ void snd_ctl_notify(struct snd_card *card, unsigned int mask,
if (snd_BUG_ON(!card || !id)) return;
- if (card->shutdown)
read_lock(&card->ctl_files_rwlock);return;
#if IS_ENABLED(CONFIG_SND_MIXER_OSS) card->mixer_oss_change_count++; diff --git a/sound/core/pcm.c b/sound/core/pcm.c index 308c9ecf73db..8e980aa678d0 100644 --- a/sound/core/pcm.c +++ b/sound/core/pcm.c @@ -849,6 +849,14 @@ int snd_pcm_new_internal(struct snd_card *card, const char *id, int device, } EXPORT_SYMBOL(snd_pcm_new_internal);
+static void free_chmap(struct snd_pcm_str *pstr) +{
- if (pstr->chmap_kctl) {
snd_ctl_remove(pstr->pcm->card, pstr->chmap_kctl);
pstr->chmap_kctl = NULL;
- }
+}
static void snd_pcm_free_stream(struct snd_pcm_str * pstr) { struct snd_pcm_substream *substream, *substream_next; @@ -871,6 +879,7 @@ static void snd_pcm_free_stream(struct snd_pcm_str
- pstr) kfree(setup); }
#endif
- free_chmap(pstr); if (pstr->substream_count) put_device(&pstr->dev);
} @@ -1135,10 +1144,7 @@ static int snd_pcm_dev_disconnect(struct snd_device *device) for (cidx = 0; cidx < 2; cidx++) { if (!pcm->internal) snd_unregister_device(&pcm->streams[cidx].dev);
if (pcm->streams[cidx].chmap_kctl) {
snd_ctl_remove(pcm->card, pcm->streams[cidx].chmap_kctl);
pcm->streams[cidx].chmap_kctl = NULL;
}
} mutex_unlock(&pcm->open_mutex); mutex_unlock(®ister_mutex);free_chmap(&pcm->streams[cidx]);
hi Takashi, Thanks for the patch.
It is working fine after above patch.
thanks Kasam
On Mon, 11 Jul 2016 08:45:20 +0200, b_lkasam@codeaurora.org wrote:
On 2016-07-08 11:23, Takashi Iwai wrote:
(Please don't drop Cc to ML)
On Fri, 08 Jul 2016 07:35:37 +0200, b_lkasam@codeaurora.org wrote:
On 2016-07-08 10:58, Takashi Iwai wrote:
On Fri, 08 Jul 2016 07:19:05 +0200, b_lkasam@codeaurora.org wrote:
Hi Alsa team, There is kernel crash observed when soundcard register failure case as below ->
diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c old mode 100644 new mode 100755 index 0495890..60a1eb0 --- a/sound/soc/soc-core.c +++ b/sound/soc/soc-core.c @@ -1382,6 +1382,7 @@ static int soc_probe_link_dais(struct snd_soc_card *card, int num, int order) /* do machine specific initialization */ if (dai_link->init) { ret = dai_link->init(rtd);
- ret = -ENODEV; // -> we can force error here to reproduce crash
easily. if (ret < 0) { dev_err(card->dev, "ASoC: failed to init %s: %d\n", dai_link->name, ret);
If sound card fails at initialize at above, it is crashing in pcm_chmap_ctl_private_free().
<1>[ 40.646642] [01-01-2016 00:07:16 CPU:0x3] Unable to handle kernel paging request at virtual address ffffffc0da644b68 <1>[ 40.646664] [01-01-2016 00:07:16 CPU:0x3] pgd = ffffffc001d28000 <1>[ 40.646673] [01-01-2016 00:07:16 CPU:0x3] [ffffffc0da644b68] *pgd=00000000857fc003, *pud=00000000857fc003, *pmd=000000017ddd8003, *pte=00c000015a644793 <0>[ 40.646697] [01-01-2016 00:07:16 CPU:0x3] Internal error: Oops: 9600004f [#1] PREEMPT SMP <6>[ 40.646708] [01-01-2016 00:07:16 CPU:0x3] Modules linked in: brcm_bt_drv fm_drv brcm_hci_ldisc texfat(PO) <6>[ 40.646735] [01-01-2016 00:07:16 CPU:0x3] CPU: 3 PID: 299 Comm: kworker/u8:8 Tainted: P O 3.18.24-g41b2dda2-00002-gbe25a74 #1 <6>[ 40.646744] [01-01-2016 00:07:16 CPU:0x3] Hardware name: Qualcomm Technologies, Inc. MSM 8996 v3.x + PMI8996 MTP (DT) <6>[ 40.646763] [01-01-2016 00:07:16 CPU:0x3] Workqueue: deferwq deferred_probe_work_func <6>[ 40.646773] [01-01-2016 00:07:16 CPU:0x3] task: ffffffc0e951c880 ti: ffffffc03368c000 task.ti: ffffffc03368c000 <6>[ 40.646787] [01-01-2016 00:07:16 CPU:0x3] PC is at pcm_chmap_ctl_private_free+0x1c/0x2c <6>[ 40.646798] [01-01-2016 00:07:16 CPU:0x3] LR is at snd_ctl_free_one+0x20/0x34
FIX:
Can you look at the change below and share your comments on this? diff --git a/sound/core/device.c b/sound/core/device.c old mode 100644 new mode 100755 index 41bec30..eaffde1 --- a/sound/core/device.c +++ b/sound/core/device.c @@ -219,6 +219,7 @@ void snd_device_free_all(struct snd_card *card)
if (snd_BUG_ON(!card)) return;
list_for_each_entry_safe_reverse(dev, next, &card->devices,
list)
list_for_each_entry_safe(dev, next, &card->devices,
list) __snd_device_free(dev); }
Since control sound device has the lowest type value (SNDRV_DEV_CONTROL), it will be the first entry linked in the card->devices linked list head and will be the last one to be freed.
This issue seems to be resolved by modifying the sequence the sound devices in the card->devices list are freed as shown below (from “prev” direction to “next” direction) but I’m not sure if this is a right approach from ALSA perspective.
This doesn't look correct. The strange thing is that this error shouldn't happen no matter which free loop direction is. The chmap ctl should have been already removed by the disconnection before freeing. It means that either the disconnection isn't done properly or something else is missing.
Could you give the full stack trace? It's important to know which code path triggers it.
thanks,
Takashi
Hi Takashi,
Here is full stack trace -->
[Callstack] : (struct snd_pcm_chmap *)info = 0xFFFFFFC0DA6A5200 : (struct snd_pcm *)pcm = 0xFFFFFFC0DA644A80 //part of buddy page, read-only : info->stream = 0 : snd_card_free() was called
-012|pcm_chmap_ctl_private_free() //info->pcm->streams[info->stream].chmap_kctl = NULL -013|snd_ctl_free_one() -014|snd_ctl_remove() -015|snd_ctl_dev_free() -016|__snd_device_free() -017|snd_device_free_all() -018|snd_card_do_free(inline) -018|release_card_device() -019|device_release() -020|kobject_cleanup(inline) -020|kobject_release() -021|kobject_put() -022|put_device() -023|snd_card_free_when_closed() -024|snd_card_free() -025|snd_soc_instantiate_card(inline) -> //Here instantiate card failed for some reason, then triggers snd_card_free -025|snd_soc_register_card() -026|msm8996_asoc_machine_probe() -027|platform_drv_probe() -028|really_probe(inline) -028|driver_probe_device() -029|__device_attach() -030|bus_for_each_drv() -031|device_attach() -032|bus_probe_device() -033|deferred_probe_work_func() -034|static_key_count(inline) -034|static_key_false(inline) -034|trace_workqueue_execute_end(inline) -034|process_one_work() -035|worker_thread() -036|kthread() -037|ret_from_fork(asm) ---|end of frame
trigger point in soc-core.c in API snd_soc_instantiate_card(),
card_probe_error: if (card->remove) card->remove(card); snd_card_free(card->snd_card); -> this is
snd_card_free which internally leads to above function call.
OK, thanks. So this is the case where it frees before registering, and indeed there is a bug in the PCM chmap code. It's freed at disconnection but the disconnect is called only when it was registered.
Below is a quick fix. Give it a try.
thanks,
Takashi
diff --git a/sound/core/control.c b/sound/core/control.c index 9ff081cd03f4..fb096cb20a80 100644 --- a/sound/core/control.c +++ b/sound/core/control.c @@ -160,6 +160,8 @@ void snd_ctl_notify(struct snd_card *card, unsigned int mask,
if (snd_BUG_ON(!card || !id)) return;
- if (card->shutdown)
read_lock(&card->ctl_files_rwlock);return;
#if IS_ENABLED(CONFIG_SND_MIXER_OSS) card->mixer_oss_change_count++; diff --git a/sound/core/pcm.c b/sound/core/pcm.c index 308c9ecf73db..8e980aa678d0 100644 --- a/sound/core/pcm.c +++ b/sound/core/pcm.c @@ -849,6 +849,14 @@ int snd_pcm_new_internal(struct snd_card *card, const char *id, int device, } EXPORT_SYMBOL(snd_pcm_new_internal);
+static void free_chmap(struct snd_pcm_str *pstr) +{
- if (pstr->chmap_kctl) {
snd_ctl_remove(pstr->pcm->card, pstr->chmap_kctl);
pstr->chmap_kctl = NULL;
- }
+}
static void snd_pcm_free_stream(struct snd_pcm_str * pstr) { struct snd_pcm_substream *substream, *substream_next; @@ -871,6 +879,7 @@ static void snd_pcm_free_stream(struct snd_pcm_str * pstr) kfree(setup); } #endif
- free_chmap(pstr); if (pstr->substream_count) put_device(&pstr->dev);
} @@ -1135,10 +1144,7 @@ static int snd_pcm_dev_disconnect(struct snd_device *device) for (cidx = 0; cidx < 2; cidx++) { if (!pcm->internal) snd_unregister_device(&pcm->streams[cidx].dev);
if (pcm->streams[cidx].chmap_kctl) {
snd_ctl_remove(pcm->card, pcm->streams[cidx].chmap_kctl);
pcm->streams[cidx].chmap_kctl = NULL;
}
} mutex_unlock(&pcm->open_mutex); mutex_unlock(®ister_mutex);free_chmap(&pcm->streams[cidx]);
hi Takashi, Thanks for the patch.
It is working fine after above patch.
Good to hear. I queued the fix now. It'll be likely included in 4.7-final.
thanks,
Takashi
participants (2)
-
b_lkasam@codeaurora.org
-
Takashi Iwai