On Mon, 01 Jan 2018 17:14:25 +0100, Takashi Iwai wrote:
On Mon, 01 Jan 2018 11:29:51 +0100, Lars-Peter Clausen wrote:
On 01/01/2018 10:03 AM, Takashi Iwai wrote: [...]
CPU: 0 PID: 3502 Comm: syzkaller781065 Not tainted 4.15.0-rc5+ #154 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x194/0x257 lib/dump_stack.c:53 panic+0x1e4/0x41c kernel/panic.c:183 __warn+0x1dc/0x200 kernel/panic.c:547 report_bug+0x211/0x2d0 lib/bug.c:184 fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:178 fixup_bug arch/x86/kernel/traps.c:247 [inline] do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315 invalid_op+0x22/0x40 arch/x86/entry/entry_64.S:1079 RIP: 0010:snd_pcm_hw_param_first+0x289/0x690 sound/core/pcm_lib.c:1635 RSP: 0018:ffff8801c013f1a0 EFLAGS: 00010293 RAX: ffff8801c03bc3c0 RBX: ffff8801bff08dc0 RCX: ffffffff841bee19 RDX: 0000000000000000 RSI: 00000000ffffffea RDI: ffffed0038027e28 RBP: ffff8801c013f1f0 R08: ffffed0038027d63 R09: ffff8801c013eb10 R10: 0000000000000001 R11: ffffed0038027d62 R12: 000000000000000d R13: 00000000ffffffea R14: 0000000000000005 R15: 0000000000002000 snd_pcm_hw_param_near.constprop.27+0x78d/0x9a0 sound/core/oss/pcm_oss.c:457 snd_pcm_oss_change_params+0x17d3/0x3720 sound/core/oss/pcm_oss.c:969 snd_pcm_oss_make_ready+0xaa/0x130 sound/core/oss/pcm_oss.c:1128 snd_pcm_oss_sync+0x257/0x830 sound/core/oss/pcm_oss.c:1638 snd_pcm_oss_release+0x20b/0x280 sound/core/oss/pcm_oss.c:2431 __fput+0x327/0x7e0 fs/file_table.c:210 ____fput+0x15/0x20 fs/file_table.c:244 task_work_run+0x199/0x270 kernel/task_work.c:113 exit_task_work include/linux/task_work.h:22 [inline] do_exit+0x9bb/0x1ad0 kernel/exit.c:865 do_group_exit+0x149/0x400 kernel/exit.c:968 SYSC_exit_group kernel/exit.c:979 [inline] SyS_exit_group+0x1d/0x20 kernel/exit.c:977 do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline] do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389 entry_SYSENTER_compat+0x54/0x63 arch/x86/entry/entry_64_compat.S:129 RIP: 0023:0xf7f4ec79 RSP: 002b:00000000ffc2c18c EFLAGS: 00000292 ORIG_RAX: 00000000000000fc RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00000000080f0298 RDX: 0000000000000000 RSI: 00000000080d9b78 RDI: 00000000080f02a0 RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 Dumping ftrace buffer: (ftrace buffer empty) Kernel Offset: disabled Rebooting in 86400 seconds..
This must be a superfluous WARN_ON() call invoked by snd_BUG_ON() check that can be safely ignored. A quick fix patch is below.
I believe those snd_BUG_ON() are there because these calls to refine should never fail based on the checks done earlier. And them triggering indicates that something is wrong somewhere else.
I guess it's a place where no hw_params was set properly (it's through OSS emulation) but at closing it syncs forcibly that triggers the refine snd_BUG_ON(). But maybe better to double-check the condition, yes.
The situation became clearer after looking at the reproducer and the configs closely. This happens because the program tries to open and set up the loopback PCM devices concurrently. Since the aloop driver modifies the hw_params restriction dynamically depending on the connection, the configuration space restriction does change as well even while trying to determine the parameters. Although the whole code has a proper protection and no race happens, the space itself changes; that resulted in the unexpected hw_refine error which assumes only the static rules.
So I still think that the proper way to address this is just to get rid of snd_BUG_ON() checks. We can't remove the dynamic configuration of aloop, as it's the core feature. At most, we may invalidate the configuration in the other side, but still it's tricky and fragile, risky for regressions. In anyway, what's wrong is the assumption of static restrictions in hw params engine, after all.
I'll update the text in the patch to reflect that.
thanks,
Takashi