[alsa-devel] [PATCH v3 06/19] ASoC: soc-core: add soc_unbind_dai_link()

Pierre-Louis Bossart pierre-louis.bossart at linux.intel.com
Tue Nov 12 18:11:32 CET 2019



>> Does it happen from soc-topology.c :: remove_link ?

It seems to happen after the topology remove link; see the traces below.

> 
> I can't test, but can this patch solve your issue?

No, the problem remains after applying your suggested fix.

I added a bunch of traces and it seems we have a nasty case of corrupted 
linked lists:

diff --git a/sound/soc/soc-component.c b/sound/soc/soc-component.c
index 98ef0666add2..5b0139ebe8f3 100644
--- a/sound/soc/soc-component.c
+++ b/sound/soc/soc-component.c
@@ -518,11 +518,39 @@ int snd_soc_pcm_component_new(struct snd_pcm *pcm)

  void snd_soc_pcm_component_free(struct snd_pcm *pcm)
  {
-       struct snd_soc_pcm_runtime *rtd = pcm->private_data;
+       struct snd_soc_pcm_runtime *rtd;
         struct snd_soc_rtdcom_list *rtdcom;
         struct snd_soc_component *component;

-       for_each_rtd_components(rtd, rtdcom, component)
-               if (component->driver->pcm_destruct)
+       pr_err("plb: %s start\n", __func__);
+
+       if (!pcm)
+               pr_err("plb: %s PCM is NULL\n", __func__);
+
+       pr_err("plb: %s accessing private data\n", __func__);
+       rtd = pcm->private_data;
+       pr_err("plb: %s accessed private data\n", __func__);
+
+       if (!rtd)
+               pr_err("plb: %s RTD is NULL\n", __func__);
+
+       pr_err("plb: %s accessing components\n", __func__);
+       for_each_rtd_components(rtd, rtdcom, component) {
+               pr_err("plb: %s processing component\n", __func__);
+               if (!component)
+                       pr_err("plb: %s component is NULL\n", __func__);
+
+               if (!component->driver)
+                       pr_err("plb: %s component driver is NULL\n", __func__);
+
+               pr_err("plb: %s pcm_destruct checks\n", __func__);
+               if (component->driver->pcm_destruct) {
+                       pr_err("plb: %s pcm_destruct start\n", __func__);
                         component->driver->pcm_destruct(component, pcm);
+                       pr_err("plb: %s pcm_destruct done\n", __func__);
+               }
+               pr_err("plb: %s processing component done\n", __func__);
+       }
+
+       pr_err("plb: %s done\n", __func__);
  }

And the results show that the for_each_rtd_components loop goes into the
weeds on its first iteration:

[   82.069990] sof-audio-pci 0000:00:1f.3: plb: remove_link start
[   82.069993] sof-audio-pci 0000:00:1f.3: plb: remove_link 2
[   82.069996] sof-audio-pci 0000:00:1f.3: plb: remove_link before snd_soc_remove_dai_link
[   82.069998] plb: snd_soc_remove_dai_link start
[   82.070016] plb: snd_soc_remove_dai_link done
[   82.070020] sof-audio-pci 0000:00:1f.3: plb: remove_link done
<removed DSP power down sequence>
[   82.179021] plb: snd_soc_pcm_component_free start
[   82.179023] plb: snd_soc_pcm_component_free accessing private data
[   82.179024] plb: snd_soc_pcm_component_free accessed private data
[   82.179025] plb: snd_soc_pcm_component_free accessing components
[   82.179025] plb: snd_soc_pcm_component_free processing component
[   82.179029] BUG: kernel NULL pointer dereference, address: 0000000000000064
[   82.179030] #PF: supervisor read access in kernel mode
[   82.179031] #PF: error_code(0x0000) - not-present page
[   82.179032] PGD 0 P4D 0
[   82.179034] Oops: 0000 [#1] SMP NOPTI
[   82.179036] CPU: 3 PID: 768 Comm: pulseaudio Not tainted 5.4.0-rc5-test+ #31
[   82.179036] Hardware name: Acer Swift SF314-55/MILLER_WL, BIOS V1.05 10/03/2018
[   82.179042] RIP: 0010:snd_soc_pcm_component_free+0xc7/0x16a [snd_soc_core]
[   82.179043] Code: 43 08 48 c7 c6 f0 24 6e c0 4c 39 e0 0f 84 a9 00 00 00 48 8b 2b 48 85 ed 0f 84 9d 00 00 00 48 c7 c7 00 51 6e c0 e8 d2 5d 5d f2 <48> 83 7d 60 00 75 13 48 c7 c6 f0 24 6e c0 48 c7 c7 20 51 6e c0 e8
[   82.179044] RSP: 0018:ffffa70180bf3d78 EFLAGS: 00010246
[   82.179046] RAX: 0000000000000034 RBX: ffffa00f7aaf3968 RCX: 0000000000000006
[   82.179047] RDX: 0000000000000000 RSI: 0000000000000092 RDI: ffffa00fa5ad63d0
[   82.179048] RBP: 0000000000000004 R08: ffffa70180bf3c3d R09: 0000000000001518
[   82.179049] R10: ffffa70180bf3c38 R11: ffffa70180bf3c3d R12: ffffa00fa1be4eb0
[   82.179050] R13: ffffa00fa27aa000 R14: dead000000000122 R15: dead000000000100
[   82.179052] FS:  00007f4e7e5ebc80(0000) GS:ffffa00fa5ac0000(0000) knlGS:0000000000000000
[   82.179054] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   82.179055] CR2: 0000000000000064 CR3: 0000000253d68005 CR4: 00000000003606e0
[   82.179056] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   82.179057] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   82.179058] Call Trace:
[   82.179064]  snd_pcm_free+0x1a/0x50 [snd_pcm]
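
One hedged observation on the register dump: R15 and R14 hold
dead000000000100 and dead000000000122, which look like the poison values
that list_del() writes into a removed entry's next and prev pointers on
x86_64 (LIST_POISON1 and LIST_POISON2, as far as I can tell). That would be
consistent with the loop walking an rtdcom entry that was already unlinked,
but it is only a guess. Below is a minimal userspace sketch of that
poisoning. It uses a hand-rolled list_head rather than the kernel headers,
the poison constants are assumed from the dump above, and
stale_entry_demo() is a made-up name for illustration.

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Poison values assumed from the register dump (64-bit only). */
#define LIST_POISON1 ((struct list_head *)(uintptr_t)0xdead000000000100ULL)
#define LIST_POISON2 ((struct list_head *)(uintptr_t)0xdead000000000122ULL)

static void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Unlink the entry, then poison its pointers like the kernel does. */
static void list_del(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	e->next = LIST_POISON1;
	e->prev = LIST_POISON2;
}

static int list_entry_is_poisoned(const struct list_head *e)
{
	return e->next == LIST_POISON1 && e->prev == LIST_POISON2;
}

/*
 * Hypothetical demo: returns 1 when a caller still holding a pointer to
 * a deleted entry would see poison instead of valid neighbours.
 */
int stale_entry_demo(void)
{
	struct list_head head, a, b;

	list_init(&head);
	list_add_tail(&a, &head);
	list_add_tail(&b, &head);
	list_del(&a);

	/* The list itself is still consistent; only the stale entry is bad. */
	return head.next == &b && list_entry_is_poisoned(&a);
}
```

Anything that keeps iterating from such a stale entry follows the poison
pointers or reads whatever sits next to them, which is one way to end up
with a bogus component pointer like the 0x4 in RBP.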


I have absolutely no idea what all these data structures are, just 
reporting this.

Reverting "ASoC: soc-core: add soc_unbind_dai_link()" is the only 
work-around at this point. With the revert, I've tested module 
load/unload for hours without issues.

It's actually quite interesting, since snd_soc_pcm_component_free() 
calls a .pcm_destruct() callback that's not used by the SOF driver. On 
Intel platforms it's only used by the Skylake/SST driver; I'm not sure 
why, or whether SOF is missing something.

