[alsa-devel] [BUG BISECT] NULL pointer after commit "ASoC: dapm: Only power up active channels from a DAI"
Hi,
Bisect pointed to commit: commit 078a85f2806f0ffd11289009462a6a390f9adb5c Author: Charles Keepax ckeepax@opensource.cirrus.com Date: Thu Jan 31 13:30:18 2019 +0000 ASoC: dapm: Only power up active channels from a DAI
as a bad commit for NULL pointer on my Odroid XU3 and Odroid U3 board when doing "aplay /usr/share/sounds/alsa/Front_Right.wav".
1. Arch ARM Linux 2. exynos_defconfig 3. Odroid U3, XU3, Exynos SoC, ARMv7
Last address is in calltrace: c079552c dapm_update_dai_unlocked sound/soc/soc-dapm.c:2586
Full log:
Unable to handle kernel NULL pointer dereference at virtual address 0000007c pgd = d5b32ef7 [0000007c] *pgd=00000000 Internal error: Oops: 5 [#1] PREEMPT SMP ARM Modules linked in: s5p_mfc s5p_jpeg exynos_gsc v4l2_mem2mem v4l2_common videobuf2_dma_contig videobuf2_memops videobuf2_v4l2 videobuf2_common videodev media s5p_cec CPU: 7 PID: 359 Comm: aplay Not tainted 5.0.0-rc4-00177-g078a85f2806f #32 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) PC is at dapm_update_dai_unlocked+0x38/0x138 LR is at snd_soc_dapm_update_dai+0x34/0x4c pc : [<c079552c>] lr : [<c0795660>] psr: 60010013 sp : eb9d9be8 ip : eabc8b64 fp : ecce0e10 r10: c100b448 r9 : ecce0e0c r8 : 00000000 r7 : 00000000 r6 : 00000000 r5 : 00000002 r4 : ecce0800 r3 : 00000000 r2 : eccaa080 r1 : ecdab000 r0 : 00000000 Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none Control: 10c5387d Table: 6c09806a DAC: 00000051 Process aplay (pid: 359, stack limit = 0x4ae0a814) Stack: (0xeb9d9be8 to 0xeb9da000) 9be0: ecce0e0c c100b448 ecce0e10 ecce0800 ecd64400 ecdab000 9c00: eccaa080 c0795660 ee3c480c ecd64400 ecce0800 ecdab000 00000000 c079d760 9c20: eccaa080 ecce080c 00000000 00000001 00000000 00000000 00000000 00000000 9c40: 00000000 00000000 00000000 00000004 00000000 00000000 00000000 00000000 9c60: 00000000 00000000 00000000 00000001 00000000 00000000 00000000 00000000 9c80: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 9ca0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 9cc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 9ce0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 9d00: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 9d20: 00000000 00000000 00000000 00000010 00000010 00000004 00000020 00000020 9d40: 00000004 00000002 00000002 00000004 0000bb80 0000bb80 00000004 0001e848 9d60: 0001e848 00000004 00001770 00001770 00000004 00005dc0 00005dc0 00000004 9d80: 00000004 00000004 00000004 0007a120 0007a120 00000004 00005dc0 00005dc0 9da0: 00000004 00017700 00017700 00000004 00000000 00000000 00000004 00000000 9dc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 9de0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 9e00: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 9e20: 00000000 00000000 00000000 000fff07 00200103 00000010 0000bb80 00000001 9e40: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 9e60: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 9e80: 00000000 8c6972bb ecdab000 ecd64400 ecdab000 00000000 c0b97b1c ebfe8000 9ea0: 00000000 bea4eea0 b6effb88 c07825e4 c100b448 ecdab000 ecd64400 00000000 9ec0: ecfdd180 ec1bde40 bea4eea0 c07836e0 eb9d9f78 00000000 00000000 00000001 9ee0: 61746164 c0294d70 00000001 8c6972bb c100b448 bea4eea0 ecfdd180 c02a9bac 9f00: bea4eea0 eb9d8000 ed4aafa8 c02a9244 eb9cbb80 c100b448 00000000 00000000 9f20: 00000000 00000000 00000000 8c6972bb eb9d9fb0 ec6f11c0 eb9cbb80 00000002 9f40: eb9cbb88 00000001 b6e85b6f 00000001 61746164 c0295028 00000000 00000000 9f60: c100b448 8c6972bb 00000000 ecfdd180 ecfdd180 00000004 c25c4111 bea4eea0 9f80: eb9d8000 00000036 b6effb88 c02a9bac 004cf1b8 bea4eea0 004cf208 00000036 9fa0: c01011c4 c0101000 004cf1b8 bea4eea0 00000004 c25c4111 bea4eea0 00020001 9fc0: 004cf1b8 bea4eea0 004cf208 00000036 bea4ee10 00000000 b6fb5900 b6effb88 9fe0: b6f6c82c bea4edf4 b6efbfc4 b6e0bbdc 200d0010 00000004 00000000 00000000 [<c079552c>] (dapm_update_dai_unlocked) from [<c0795660>] (snd_soc_dapm_update_dai+0x34/0x4c) [<c0795660>] (snd_soc_dapm_update_dai) from [<c079d760>] (soc_pcm_hw_params+0x22c/0x624) [<c079d760>] (soc_pcm_hw_params) from [<c07825e4>] (snd_pcm_hw_params+0x128/0x378) [<c07825e4>] (snd_pcm_hw_params) from [<c07836e0>] (snd_pcm_ioctl+0x26c/0x1258) [<c07836e0>] (snd_pcm_ioctl) from [<c02a9244>] (do_vfs_ioctl+0xb0/0x9e4) [<c02a9244>] (do_vfs_ioctl) from [<c02a9bac>] (ksys_ioctl+0x34/0x58) [<c02a9bac>] (ksys_ioctl) from [<c0101000>] (ret_fast_syscall+0x0/0x28) Exception stack(0xeb9d9fa8 to 0xeb9d9ff0) 9fa0: 004cf1b8 bea4eea0 00000004 c25c4111 bea4eea0 00020001 9fc0: 004cf1b8 bea4eea0 004cf208 00000036 bea4ee10 00000000 b6fb5900 b6effb88 9fe0: b6f6c82c bea4edf4 b6efbfc4 b6e0bbdc Code: e3530000 e5926020 1a00003a e1a07006 (e5b7407c) ---[ end trace 8561600efcd2b2f1 ]---
Let me know if you need any other data.
Best regards, Krzysztof
Hi Krzysztof,
Cc: Charles
On 2/5/19 22:16, Krzysztof Kozlowski wrote:
Bisect pointed to commit: commit 078a85f2806f0ffd11289009462a6a390f9adb5c Author: Charles Keepax ckeepax@opensource.cirrus.com Date: Thu Jan 31 13:30:18 2019 +0000 ASoC: dapm: Only power up active channels from a DAI
as a bad commit for NULL pointer on my Odroid XU3 and Odroid U3 board when doing "aplay /usr/share/sounds/alsa/Front_Right.wav".
- Arch ARM Linux
- exynos_defconfig
- Odroid U3, XU3, Exynos SoC, ARMv7
Last address is in calltrace: c079552c dapm_update_dai_unlocked sound/soc/soc-dapm.c:2586
Thanks for bisecting this, I ran into same issue last night and I'm starting to debug this now. I have added some debug prints and it looks like it oopses on NULL playback_widget of the dummy DAI.
[ 30.701182] hdmi-audio-codec hdmi-audio-codec.0.auto: Update DAI routes for i2s-hifi playback [ 30.709630] dapm_update_dai_unlocked:2586 w=8bd27d28 [ 30.714403] dapm_update_dai_unlocked:2594 w=8bd27d28 [ 30.724688] max98090 5-0010: Update DAI routes for HiFi playback [ 30.730163] dapm_update_dai_unlocked:2586 w=3fc942af [ 30.735154] dapm_update_dai_unlocked:2594 w=3fc942af [ 30.745051] snd-soc-dummy snd-soc-dummy: Update DAI routes for snd-soc-dummy-dai playback [ 30.753128] dapm_update_dai_unlocked:2586 w= (null) [ 30.758114] Unable to handle kernel NULL pointer dereference at virtual address 0000007c
DAIs linked to the dummy will not have an associated playback/capture widget, so we need to skip the update in that case.
Fixes: 078a85f2806f ("ASoC: dapm: Only power up active channels from a DAI") Signed-off-by: Charles Keepax ckeepax@opensource.cirrus.com ---
Ok so that all makes sense, this patch is probably the best fix?
Thanks, Charles
sound/soc/soc-dapm.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/sound/soc/soc-dapm.c b/sound/soc/soc-dapm.c index 482ddb825fb59..5235d8828758a 100644 --- a/sound/soc/soc-dapm.c +++ b/sound/soc/soc-dapm.c @@ -2570,6 +2570,9 @@ static int dapm_update_dai_unlocked(struct snd_pcm_substream *substream, else w = dai->capture_widget;
+ if (!w) + return 0; + dev_dbg(dai->dev, "Update DAI routes for %s %s\n", dai->name, dir == SNDRV_PCM_STREAM_PLAYBACK ? "playback" : "capture");
On 2/6/19 11:05, Charles Keepax wrote:
DAIs linked to the dummy will not have an associated playback/capture widget, so we need to skip the update in that case.
Fixes: 078a85f2806f ("ASoC: dapm: Only power up active channels from a DAI") Signed-off-by: Charles Keepax ckeepax@opensource.cirrus.com
Tested-by: Sylwester Nawrocki s.nawrocki@samsung.com
Ok so that all makes sense, this patch is probably the best fix?
It seems so, everything works well with such change, thank you.
sound/soc/soc-dapm.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/sound/soc/soc-dapm.c b/sound/soc/soc-dapm.c index 482ddb825fb59..5235d8828758a 100644 --- a/sound/soc/soc-dapm.c +++ b/sound/soc/soc-dapm.c @@ -2570,6 +2570,9 @@ static int dapm_update_dai_unlocked(struct snd_pcm_substream *substream, else w = dai->capture_widget;
- if (!w)
return 0;
- dev_dbg(dai->dev, "Update DAI routes for %s %s\n", dai->name, dir == SNDRV_PCM_STREAM_PLAYBACK ? "playback" : "capture");
On Wed, 6 Feb 2019 at 11:05, Charles Keepax ckeepax@opensource.cirrus.com wrote:
DAIs linked to the dummy will not have an associated playback/capture widget, so we need to skip the update in that case.
Fixes: 078a85f2806f ("ASoC: dapm: Only power up active channels from a DAI") Signed-off-by: Charles Keepax ckeepax@opensource.cirrus.com
Ok so that all makes sense, this patch is probably the best fix?
Thanks, Charles
For this particular issue (NULL-pointer): Reported-by: Krzysztof Kozlowski krzk@kernel.org Tested-by: Krzysztof Kozlowski krzk@kernel.org
However now I see bug sleeping in atomic context:
[ 64.000828] BUG: sleeping function called from invalid context at ../kernel/locking/mutex.c:908 [ 64.008483] in_atomic(): 1, irqs_disabled(): 128, pid: 353, name: aplay [ 64.014978] 2 locks held by aplay/353: [ 64.018671] #0: 1fb9b63d (&(&group->lock)->rlock){....}, at: snd_pcm_action_lock_irq+0x18/0x3c [ 64.027337] #1: 8b42bfe8 (&(&pri_dai->spinlock)->rlock){....}, at: i2s_trigger+0x130/0x6c8 [ 64.035654] irq event stamp: 8754 [ 64.038925] hardirqs last enabled at (8753): [<c0a78758>] _raw_spin_unlock_irq+0x24/0x5c [ 64.047094] hardirqs last disabled at (8754): [<c0a785a0>] _raw_spin_lock_irq+0x18/0x50 [ 64.055068] softirqs last enabled at (8096): [<c0102698>] __do_softirq+0x4f0/0x5e4 [ 64.062680] softirqs last disabled at (8083): [<c012ee94>] irq_exit+0x160/0x16c [ 64.069953] Preemption disabled at: [ 64.069956] [<00000000>] (null) [ 64.076700] CPU: 6 PID: 353 Comm: aplay Not tainted 5.0.0-rc5-next-20190206-00001-g101ffa564f78 #348 [ 64.085822] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) [ 64.091882] [<c0112318>] (unwind_backtrace) from [<c010df2c>] (show_stack+0x10/0x14) [ 64.099601] [<c010df2c>] (show_stack) from [<c0a5626c>] (dump_stack+0x98/0xc4) [ 64.106788] [<c0a5626c>] (dump_stack) from [<c0156ac8>] (___might_sleep+0x264/0x2cc) [ 64.114501] [<c0156ac8>] (___might_sleep) from [<c0a732b4>] (__mutex_lock+0x40/0xa98) [ 64.122300] [<c0a732b4>] (__mutex_lock) from [<c0a73d28>] (mutex_lock_nested+0x1c/0x24) [ 64.130275] [<c0a73d28>] (mutex_lock_nested) from [<c04c2838>] (clk_prepare_lock+0x50/0xf8) [ 64.138596] [<c04c2838>] (clk_prepare_lock) from [<c04c5f48>] (clk_core_get_rate+0xc/0x60) [ 64.146824] [<c04c5f48>] (clk_core_get_rate) from [<c07b419c>] (i2s_trigger+0x444/0x6c8) [ 64.154883] [<c07b419c>] (i2s_trigger) from [<c079cba0>] (soc_pcm_trigger+0x100/0x140) [ 64.162768] [<c079cba0>] (soc_pcm_trigger) from [<c07839c0>] (snd_pcm_action_single+0x38/0x80) [ 64.171349] [<c07839c0>] (snd_pcm_action_single) from [<c0783a5c>] (snd_pcm_action+0x54/0x5c) [ 64.179841] [<c0783a5c>] (snd_pcm_action) from [<c0783bd4>] (snd_pcm_action_lock_irq+0x28/0x3c) [ 64.188508] [<c0783bd4>] (snd_pcm_action_lock_irq) from [<c07860b0>] (snd_pcm_ioctl+0x920/0x123c) [ 64.197350] [<c07860b0>] (snd_pcm_ioctl) from [<c02aa6d4>] (do_vfs_ioctl+0xb0/0x9f8) [ 64.205054] [<c02aa6d4>] (do_vfs_ioctl) from [<c02ab050>] (ksys_ioctl+0x34/0x5c) [ 64.212418] [<c02ab050>] (ksys_ioctl) from [<c0101000>] (ret_fast_syscall+0x0/0x28) [ 64.220045] Exception stack(0xe69dffa8 to 0xe69dfff0) [ 64.225058] ffa0: 004391b8 00001770 00000004 00004142 00439340 00439340 [ 64.233218] ffc0: 004391b8 00001770 00001770 00000036 00001770 00000000 00000000 be8db940 [ 64.241361] ffe0: b6fa382c be8db8ec b6f32a3c b6e42bdc [ 64.246386] [ 64.247823] ====================================================== [ 64.253979] WARNING: possible circular locking dependency detected [ 64.260133] 5.0.0-rc5-next-20190206-00001-g101ffa564f78 #348 Tainted: G W [ 64.268193] ------------------------------------------------------ [ 64.274342] aplay/353 is trying to acquire lock: [ 64.278937] 51044846 (prepare_lock){+.+.}, at: clk_prepare_lock+0x50/0xf8 [ 64.285694] [ 64.285694] but task is already holding lock: [ 64.291500] 8b42bfe8 (&(&pri_dai->spinlock)->rlock){....}, at: i2s_trigger+0x130/0x6c8 [ 64.299387] [ 64.299387] which lock already depends on the new lock. [ 64.299387] [ 64.307534] [ 64.307534] the existing dependency chain (in reverse order) is: [ 64.314985] [ 64.314985] -> #1 (&(&pri_dai->spinlock)->rlock){....}: [ 64.321667] clk_mux_set_parent+0x34/0xb8 [ 64.326162] clk_core_set_parent_nolock+0x21c/0x54c [ 64.331535] clk_set_parent+0x38/0x6c [ 64.335696] of_clk_set_defaults+0x11c/0x384 [ 64.340460] of_clk_add_provider+0x8c/0xc8 [ 64.345054] samsung_i2s_probe+0x484/0x64c [ 64.349651] platform_drv_probe+0x6c/0xa4 [ 64.354153] really_probe+0x280/0x414 [ 64.358311] driver_probe_device+0x78/0x1c4 [ 64.362991] bus_for_each_drv+0x74/0xb8 [ 64.367323] __device_attach+0xd4/0x16c [ 64.371655] bus_probe_device+0x88/0x90 [ 64.375988] deferred_probe_work_func+0x4c/0xd0 [ 64.381017] process_one_work+0x228/0x810 [ 64.385520] worker_thread+0x30/0x570 [ 64.389681] kthread+0x134/0x164 [ 64.393405] ret_from_fork+0x14/0x20 [ 64.397477] (null) [ 64.400249] [ 64.400249] -> #0 (prepare_lock){+.+.}: [ 64.405539] __mutex_lock+0x7c/0xa98 [ 64.409610] mutex_lock_nested+0x1c/0x24 [ 64.414029] clk_prepare_lock+0x50/0xf8 [ 64.418362] clk_core_get_rate+0xc/0x60 [ 64.422695] i2s_trigger+0x444/0x6c8 [ 64.426768] soc_pcm_trigger+0x100/0x140 [ 64.431188] snd_pcm_action_single+0x38/0x80 [ 64.435953] snd_pcm_action+0x54/0x5c [ 64.440112] snd_pcm_action_lock_irq+0x28/0x3c [ 64.445052] snd_pcm_ioctl+0x920/0x123c [ 64.449386] do_vfs_ioctl+0xb0/0x9f8 [ 64.453457] ksys_ioctl+0x34/0x5c [ 64.457269] ret_fast_syscall+0x0/0x28 [ 64.461516] 0xbe8db8ec [ 64.464460] [ 64.464460] other info that might help us debug this: [ 64.464460] [ 64.472438] Possible unsafe locking scenario: [ 64.472438] [ 64.478327] CPU0 CPU1 [ 64.482831] ---- ---- [ 64.487336] lock(&(&pri_dai->spinlock)->rlock); [ 64.492017] lock(prepare_lock); [ 64.497823] lock(&(&pri_dai->spinlock)->rlock); [ 64.505017] lock(prepare_lock); [ 64.508306] [ 64.508306] *** DEADLOCK *** [ 64.508306] [ 64.514203] 2 locks held by aplay/353: [ 64.517935] #0: 1fb9b63d (&(&group->lock)->rlock){....}, at: snd_pcm_action_lock_irq+0x18/0x3c [ 64.526596] #1: 8b42bfe8 (&(&pri_dai->spinlock)->rlock){....}, at: i2s_trigger+0x130/0x6c8 [ 64.534915] [ 64.534915] stack backtrace: [ 64.539246] CPU: 6 PID: 353 Comm: aplay Tainted: G W 5.0.0-rc5-next-20190206-00001-g101ffa564f78 #348 [ 64.549734] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) [ 64.555802] [<c0112318>] (unwind_backtrace) from [<c010df2c>] (show_stack+0x10/0x14) [ 64.563515] [<c010df2c>] (show_stack) from [<c0a5626c>] (dump_stack+0x98/0xc4) [ 64.570708] [<c0a5626c>] (dump_stack) from [<c01846b0>] (print_circular_bug+0x210/0x334) [ 64.578765] [<c01846b0>] (print_circular_bug) from [<c01874ec>] (__lock_acquire+0x12cc/0x1a5c) [ 64.587344] [<c01874ec>] (__lock_acquire) from [<c01886d8>] (lock_acquire+0xe0/0x268) [ 64.595142] [<c01886d8>] (lock_acquire) from [<c0a732f0>] (__mutex_lock+0x7c/0xa98) [ 64.602768] [<c0a732f0>] (__mutex_lock) from [<c0a73d28>] (mutex_lock_nested+0x1c/0x24) [ 64.610740] [<c0a73d28>] (mutex_lock_nested) from [<c04c2838>] (clk_prepare_lock+0x50/0xf8) [ 64.619059] [<c04c2838>] (clk_prepare_lock) from [<c04c5f48>] (clk_core_get_rate+0xc/0x60) [ 64.627291] [<c04c5f48>] (clk_core_get_rate) from [<c07b419c>] (i2s_trigger+0x444/0x6c8) [ 64.635349] [<c07b419c>] (i2s_trigger) from [<c079cba0>] (soc_pcm_trigger+0x100/0x140) [ 64.643235] [<c079cba0>] (soc_pcm_trigger) from [<c07839c0>] (snd_pcm_action_single+0x38/0x80) [ 64.651815] [<c07839c0>] (snd_pcm_action_single) from [<c0783a5c>] (snd_pcm_action+0x54/0x5c) [ 64.660306] [<c0783a5c>] (snd_pcm_action) from [<c0783bd4>] (snd_pcm_action_lock_irq+0x28/0x3c) [ 64.668972] [<c0783bd4>] (snd_pcm_action_lock_irq) from [<c07860b0>] (snd_pcm_ioctl+0x920/0x123c) [ 64.677811] [<c07860b0>] (snd_pcm_ioctl) from [<c02aa6d4>] (do_vfs_ioctl+0xb0/0x9f8) [ 64.685522] [<c02aa6d4>] (do_vfs_ioctl) from [<c02ab050>] (ksys_ioctl+0x34/0x5c) [ 64.692887] [<c02ab050>] (ksys_ioctl) from [<c0101000>] (ret_fast_syscall+0x0/0x28) [ 64.700510] Exception stack(0xe69dffa8 to 0xe69dfff0) [ 64.705536] ffa0: 004391b8 00001770 00000004 00004142 00439340 00439340 [ 64.713684] ffc0: 004391b8 00001770 00001770 00000036 00001770 00000000 00000000 be8db940 [ 64.721828] ffe0: b6fa382c be8db8ec b6f32a3c b6e42bdc
Best regards, Krzysztof
On Wed, Feb 06, 2019 at 11:22:33AM +0100, Krzysztof Kozlowski wrote:
On Wed, 6 Feb 2019 at 11:05, Charles Keepax ckeepax@opensource.cirrus.com wrote:
DAIs linked to the dummy will not have an associated playback/capture widget, so we need to skip the update in that case.
Fixes: 078a85f2806f ("ASoC: dapm: Only power up active channels from a DAI") Signed-off-by: Charles Keepax ckeepax@opensource.cirrus.com
Ok so that all makes sense, this patch is probably the best fix?
Thanks, Charles
For this particular issue (NULL-pointer): Reported-by: Krzysztof Kozlowski krzk@kernel.org Tested-by: Krzysztof Kozlowski krzk@kernel.org
However now I see bug sleeping in atomic context:
[ 64.000828] BUG: sleeping function called from invalid context at ../kernel/locking/mutex.c:908
Does this probably definitely get fixed by reverting my patch? It's just a bit odd as this seems to be complaining about a clock operation in i2s_trigger and I don't think my patch should have any affect on the trigger callback. It should get run from either the dai_link DAPM event or hw_params, neither of which should happen in an atomic context.
[ 64.307534] the existing dependency chain (in reverse order) is: [ 64.314985] [ 64.314985] -> #1 (&(&pri_dai->spinlock)->rlock){....}: [ 64.321667] clk_mux_set_parent+0x34/0xb8 [ 64.326162] clk_core_set_parent_nolock+0x21c/0x54c [ 64.331535] clk_set_parent+0x38/0x6c [ 64.335696] of_clk_set_defaults+0x11c/0x384 [ 64.340460] of_clk_add_provider+0x8c/0xc8 [ 64.345054] samsung_i2s_probe+0x484/0x64c [ 64.349651] platform_drv_probe+0x6c/0xa4 [ 64.354153] really_probe+0x280/0x414 [ 64.358311] driver_probe_device+0x78/0x1c4 [ 64.362991] bus_for_each_drv+0x74/0xb8 [ 64.367323] __device_attach+0xd4/0x16c [ 64.371655] bus_probe_device+0x88/0x90 [ 64.375988] deferred_probe_work_func+0x4c/0xd0 [ 64.381017] process_one_work+0x228/0x810 [ 64.385520] worker_thread+0x30/0x570 [ 64.389681] kthread+0x134/0x164 [ 64.393405] ret_from_fork+0x14/0x20 [ 64.397477] (null) [ 64.400249] [ 64.400249] -> #0 (prepare_lock){+.+.}: [ 64.405539] __mutex_lock+0x7c/0xa98 [ 64.409610] mutex_lock_nested+0x1c/0x24 [ 64.414029] clk_prepare_lock+0x50/0xf8 [ 64.418362] clk_core_get_rate+0xc/0x60 [ 64.422695] i2s_trigger+0x444/0x6c8 [ 64.426768] soc_pcm_trigger+0x100/0x140 [ 64.431188] snd_pcm_action_single+0x38/0x80 [ 64.435953] snd_pcm_action+0x54/0x5c [ 64.440112] snd_pcm_action_lock_irq+0x28/0x3c [ 64.445052] snd_pcm_ioctl+0x920/0x123c [ 64.449386] do_vfs_ioctl+0xb0/0x9f8 [ 64.453457] ksys_ioctl+0x34/0x5c [ 64.457269] ret_fast_syscall+0x0/0x28 [ 64.461516] 0xbe8db8ec
Thanks, Charles
On Wed, 6 Feb 2019 at 12:00, Charles Keepax ckeepax@opensource.cirrus.com wrote:
On Wed, Feb 06, 2019 at 11:22:33AM +0100, Krzysztof Kozlowski wrote:
On Wed, 6 Feb 2019 at 11:05, Charles Keepax ckeepax@opensource.cirrus.com wrote:
DAIs linked to the dummy will not have an associated playback/capture widget, so we need to skip the update in that case.
Fixes: 078a85f2806f ("ASoC: dapm: Only power up active channels from a DAI") Signed-off-by: Charles Keepax ckeepax@opensource.cirrus.com
Ok so that all makes sense, this patch is probably the best fix?
Thanks, Charles
For this particular issue (NULL-pointer): Reported-by: Krzysztof Kozlowski krzk@kernel.org Tested-by: Krzysztof Kozlowski krzk@kernel.org
However now I see bug sleeping in atomic context:
[ 64.000828] BUG: sleeping function called from invalid context at ../kernel/locking/mutex.c:908
Does this probably definitely get fixed by reverting my patch? It's just a bit odd as this seems to be complaining about a clock operation in i2s_trigger and I don't think my patch should have any affect on the trigger callback. It should get run from either the dai_link DAPM event or hw_params, neither of which should happen in an atomic context.
Before this fixup, probably NULL pointer happened before any of this. I tried it now few times and the possible deadlock and sleeping in invalid context did not appear. It might be random/racy or totally unrelated to your change.
Best regards, Krzysztof
On Wed, Feb 06, 2019 at 12:14:56PM +0100, Krzysztof Kozlowski wrote:
On Wed, 6 Feb 2019 at 12:00, Charles Keepax ckeepax@opensource.cirrus.com wrote:
On Wed, Feb 06, 2019 at 11:22:33AM +0100, Krzysztof Kozlowski wrote:
On Wed, 6 Feb 2019 at 11:05, Charles Keepax ckeepax@opensource.cirrus.com wrote:
DAIs linked to the dummy will not have an associated playback/capture widget, so we need to skip the update in that case.
Fixes: 078a85f2806f ("ASoC: dapm: Only power up active channels from a DAI") Signed-off-by: Charles Keepax ckeepax@opensource.cirrus.com
Ok so that all makes sense, this patch is probably the best fix?
Thanks, Charles
For this particular issue (NULL-pointer): Reported-by: Krzysztof Kozlowski krzk@kernel.org Tested-by: Krzysztof Kozlowski krzk@kernel.org
However now I see bug sleeping in atomic context:
[ 64.000828] BUG: sleeping function called from invalid context at ../kernel/locking/mutex.c:908
Does this probably definitely get fixed by reverting my patch? It's just a bit odd as this seems to be complaining about a clock operation in i2s_trigger and I don't think my patch should have any affect on the trigger callback. It should get run from either the dai_link DAPM event or hw_params, neither of which should happen in an atomic context.
Before this fixup, probably NULL pointer happened before any of this. I tried it now few times and the possible deadlock and sleeping in invalid context did not appear. It might be random/racy or totally unrelated to your change.
Looking through I think this is an unrelated issue. Assuming the driver in question is (sound/soc/samsung/i2s.c). Inside i2s_trigger, there is a call to config_setup(i2s), which in turn will call clk_get_rate if i2s->quirks contains the flag QUIRK_NO_MUXPSR.
The trigger callback can be made from an atomic context and clk_get_rate will take the prepare lock of the clock. The clock prepare lock is always a mutex which shouldn't be locked from an atomic context. So it seems like this will fail whenever that QUIRK_NO_MUXPSR flag is set, no idea what causes that to be set.
It looks like this bug was introduced by this change:
647d04f8e07a ("ASoC: samsung: i2s: Ensure the RCLK rate is properly determined")
Thanks, Charles
On 2/6/19 12:45, Charles Keepax wrote:
Looking through I think this is an unrelated issue. Assuming the driver in question is (sound/soc/samsung/i2s.c). Inside i2s_trigger, there is a call to config_setup(i2s), which in turn will call clk_get_rate if i2s->quirks contains the flag QUIRK_NO_MUXPSR.
The trigger callback can be made from an atomic context and clk_get_rate will take the prepare lock of the clock. The clock prepare lock is always a mutex which shouldn't be locked from an atomic context. So it seems like this will fail whenever that QUIRK_NO_MUXPSR flag is set, no idea what causes that to be set.
It looks like this bug was introduced by this change:
647d04f8e07a ("ASoC: samsung: i2s: Ensure the RCLK rate is properly determined")
Thanks or having a look at this. I somehow overlooked it before, there are multiple issues with that clk_get_rate() call. Apart of the problem described above config_setup() is already called with the i2s->lock spinlock held, exactly same spinlock that is passed to the clock API when registering a clock of which we try to get rate. :/ Presumably it works due to clk rate caching.
Whether QUIRK_NO_MUXPSR flag is set or not depends on the HW type, it is not set for modern SoCs so most of the time we will hit the problem in I2S master configurations.
As we are not using set_sysclk() of the CPU DAI I'm thinking about moving the clk_get_rate() call to the CPU DAI's hw_params() callback. The clock rate may be changed in hw_params() of an ASoC machine driver, and the CPU DAI needs to adjust to those changes.
It feels like locking in that driver might need revisiting, there is quite a lot happening while holding a spinlock.
On 2/6/19 10:46, Sylwester Nawrocki wrote:
On 2/5/19 22:16, Krzysztof Kozlowski wrote:
Bisect pointed to commit: commit 078a85f2806f0ffd11289009462a6a390f9adb5c Author: Charles Keepax ckeepax@opensource.cirrus.com Date: Thu Jan 31 13:30:18 2019 +0000 ASoC: dapm: Only power up active channels from a DAI
as a bad commit for NULL pointer on my Odroid XU3 and Odroid U3 board when doing "aplay /usr/share/sounds/alsa/Front_Right.wav".
- Arch ARM Linux
- exynos_defconfig
- Odroid U3, XU3, Exynos SoC, ARMv7
Last address is in calltrace: c079552c dapm_update_dai_unlocked sound/soc/soc-dapm.c:2586
Thanks for bisecting this, I ran into same issue last night and I'm starting to debug this now. I have added some debug prints and it looks like it oopses on NULL playback_widget of the dummy DAI.
[ 30.701182] hdmi-audio-codec hdmi-audio-codec.0.auto: Update DAI routes for i2s-hifi playback [ 30.709630] dapm_update_dai_unlocked:2586 w=8bd27d28 [ 30.714403] dapm_update_dai_unlocked:2594 w=8bd27d28 [ 30.724688] max98090 5-0010: Update DAI routes for HiFi playback [ 30.730163] dapm_update_dai_unlocked:2586 w=3fc942af [ 30.735154] dapm_update_dai_unlocked:2594 w=3fc942af [ 30.745051] snd-soc-dummy snd-soc-dummy: Update DAI routes for snd-soc-dummy-dai playback [ 30.753128] dapm_update_dai_unlocked:2586 w= (null) [ 30.758114] Unable to handle kernel NULL pointer dereference at virtual address 0000007c
With a change as below there is no oops and everything works again, but I'm not sure this is a proper fix.
diff --git a/sound/soc/soc-dapm.c b/sound/soc/soc-dapm.c index 5b74dffc9c11..111a23a9708a 100644 --- a/sound/soc/soc-dapm.c +++ b/sound/soc/soc-dapm.c @@ -2580,6 +2580,9 @@ static int dapm_update_dai_unlocked(struct snd_pcm_substream *substream, else w = dai->capture_widget;
+ if (!w) + return 0; + dev_dbg(dai->dev, "Update DAI routes for %s %s\n", dai->name, dir == SNDRV_PCM_STREAM_PLAYBACK ? "playback" : "capture");
On Wed, Feb 06, 2019 at 11:11:03AM +0100, Sylwester Nawrocki wrote:
On 2/6/19 10:46, Sylwester Nawrocki wrote:
On 2/5/19 22:16, Krzysztof Kozlowski wrote:
Bisect pointed to commit: commit 078a85f2806f0ffd11289009462a6a390f9adb5c Author: Charles Keepax ckeepax@opensource.cirrus.com Date: Thu Jan 31 13:30:18 2019 +0000 ASoC: dapm: Only power up active channels from a DAI
as a bad commit for NULL pointer on my Odroid XU3 and Odroid U3 board when doing "aplay /usr/share/sounds/alsa/Front_Right.wav".
- Arch ARM Linux
- exynos_defconfig
- Odroid U3, XU3, Exynos SoC, ARMv7
Last address is in calltrace: c079552c dapm_update_dai_unlocked sound/soc/soc-dapm.c:2586
Thanks for bisecting this, I ran into same issue last night and I'm starting to debug this now. I have added some debug prints and it looks like it oopses on NULL playback_widget of the dummy DAI.
[ 30.701182] hdmi-audio-codec hdmi-audio-codec.0.auto: Update DAI routes for i2s-hifi playback [ 30.709630] dapm_update_dai_unlocked:2586 w=8bd27d28 [ 30.714403] dapm_update_dai_unlocked:2594 w=8bd27d28 [ 30.724688] max98090 5-0010: Update DAI routes for HiFi playback [ 30.730163] dapm_update_dai_unlocked:2586 w=3fc942af [ 30.735154] dapm_update_dai_unlocked:2594 w=3fc942af [ 30.745051] snd-soc-dummy snd-soc-dummy: Update DAI routes for snd-soc-dummy-dai playback [ 30.753128] dapm_update_dai_unlocked:2586 w= (null) [ 30.758114] Unable to handle kernel NULL pointer dereference at virtual address 0000007c
With a change as below there is no oops and everything works again, but I'm not sure this is a proper fix.
I think this is the proper fix, if we have DAIs which don't have capture/playback widgets then there should be no parts of the DAPM graph that need updated in relation to the number of channels, so skipping the processing seems fine.
Thanks, Charles
diff --git a/sound/soc/soc-dapm.c b/sound/soc/soc-dapm.c index 5b74dffc9c11..111a23a9708a 100644 --- a/sound/soc/soc-dapm.c +++ b/sound/soc/soc-dapm.c @@ -2580,6 +2580,9 @@ static int dapm_update_dai_unlocked(struct snd_pcm_substream *substream, else w = dai->capture_widget;
if (!w)
return 0;
dev_dbg(dai->dev, "Update DAI routes for %s %s\n", dai->name, dir == SNDRV_PCM_STREAM_PLAYBACK ? "playback" : "capture");
-- Thanks, Sylwester
On Tue, Feb 05, 2019 at 10:16:22PM +0100, Krzysztof Kozlowski wrote:
Bisect pointed to commit: commit 078a85f2806f0ffd11289009462a6a390f9adb5c Author: Charles Keepax ckeepax@opensource.cirrus.com Date: Thu Jan 31 13:30:18 2019 +0000 ASoC: dapm: Only power up active channels from a DAI
as a bad commit for NULL pointer on my Odroid XU3 and Odroid U3 board when doing "aplay /usr/share/sounds/alsa/Front_Right.wav".
- Arch ARM Linux
- exynos_defconfig
- Odroid U3, XU3, Exynos SoC, ARMv7
Last address is in calltrace: c079552c dapm_update_dai_unlocked sound/soc/soc-dapm.c:2586
Apologies for this one, is a little strange. I am assuming either we end up with a NULL widget or more likely given the line number a NULL path/widget on the path. Having some difficulty seeing how we get into that state though.
Could you confirm what machine driver/DT is being used here? I am assuming this is arch/arm/boot/dts/exynos4412-odroidu3.dts and sound/soc/samsung/odroid.c. So will investigate those see if I can find anything.
And could you try this diff and send through the output:
diff --git a/sound/soc/soc-dapm.c b/sound/soc/soc-dapm.c index 482ddb825fb59..9ee44b0192874 100644 --- a/sound/soc/soc-dapm.c +++ b/sound/soc/soc-dapm.c @@ -2570,16 +2570,37 @@ static int dapm_update_dai_unlocked(struct snd_pcm_substream *substream, else w = dai->capture_widget;
+ if (!w) { + pr_err("CK -- NULL widget on %s\n", dai->name); + return 0; + } + dev_dbg(dai->dev, "Update DAI routes for %s %s\n", dai->name, dir == SNDRV_PCM_STREAM_PLAYBACK ? "playback" : "capture");
snd_soc_dapm_widget_for_each_sink_path(w, p) { + if (!p) { + pr_err("DEBUG -- NULL sink path on %s\n", w->name); + continue; + } + if (!p->sink) { + pr_err("DEBUG -- NULL wink widget on %s\n", w->name); + continue; + } ret = dapm_update_dai_chan(p, p->sink, channels); if (ret < 0) return ret; }
snd_soc_dapm_widget_for_each_source_path(w, p) { + if (!p) { + pr_err("DEBUG -- NULL source path on %s\n", w->name); + continue; + } + if (!p->source) { + pr_err("DEBUG -- NULL source widget on %s\n", w->name); + continue; + } ret = dapm_update_dai_chan(p, p->source, channels); if (ret < 0) return ret;
Thanks, Charles
participants (3)
-
Charles Keepax
-
Krzysztof Kozlowski
-
Sylwester Nawrocki