On Sun, 29 Oct 2017 10:57:58 +0100, Takashi Iwai wrote:
On Fri, 27 Oct 2017 10:11:18 +0200, Dmitry Vyukov wrote:
On Fri, Oct 27, 2017 at 10:09 AM, syzbot <bot+7feb8de6b4d6bf810cf098bef942cc387e79d0ad@syzkaller.appspotmail.com> wrote:
Hello,
syzkaller hit the following crash on
2bd6bf03f4c1c59381d62c61d03f6cc3fe71f66e
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/master
compiler: gcc (GCC) 7.1.1 20170620
.config is attached
Raw console output is attached.
C reproducer is attached
syzkaller reproducer is attached. See https://goo.gl/kgGztJ for
information about syzkaller reproducers
============================================
WARNING: possible recursive locking detected
4.14.0-rc1+ #88 Not tainted
syzkaller883997/2981 is trying to acquire lock:
 (&grp->list_mutex){++++}, at: [<ffffffff83d4dd49>] deliver_to_subscribers sound/core/seq/seq_clientmgr.c:666 [inline]
 (&grp->list_mutex){++++}, at: [<ffffffff83d4dd49>] snd_seq_deliver_event+0x279/0x790 sound/core/seq/seq_clientmgr.c:807

but task is already holding lock:
 (&grp->list_mutex){++++}, at: [<ffffffff83d4dd49>] deliver_to_subscribers sound/core/seq/seq_clientmgr.c:666 [inline]
 (&grp->list_mutex){++++}, at: [<ffffffff83d4dd49>] snd_seq_deliver_event+0x279/0x790 sound/core/seq/seq_clientmgr.c:807
other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&grp->list_mutex);
  lock(&grp->list_mutex);

 *** DEADLOCK ***
May be due to missing lock nesting notation
2 locks held by syzkaller883997/2981:
 #0:  (register_mutex#4){+.+.}, at: [<ffffffff83d60ada>] odev_release+0x4a/0x70 sound/core/seq/oss/seq_oss.c:152
 #1:  (&grp->list_mutex){++++}, at: [<ffffffff83d4dd49>] deliver_to_subscribers sound/core/seq/seq_clientmgr.c:666 [inline]
 #1:  (&grp->list_mutex){++++}, at: [<ffffffff83d4dd49>] snd_seq_deliver_event+0x279/0x790 sound/core/seq/seq_clientmgr.c:807
stack backtrace:
CPU: 1 PID: 2981 Comm: syzkaller883997 Not tainted 4.14.0-rc1+ #88
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:16 [inline]
 dump_stack+0x194/0x257 lib/dump_stack.c:52
 print_deadlock_bug kernel/locking/lockdep.c:1797 [inline]
 check_deadlock kernel/locking/lockdep.c:1844 [inline]
 validate_chain kernel/locking/lockdep.c:2453 [inline]
 __lock_acquire+0x1232/0x4620 kernel/locking/lockdep.c:3498
 lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:4002
 down_read+0x96/0x150 kernel/locking/rwsem.c:23
 deliver_to_subscribers sound/core/seq/seq_clientmgr.c:666 [inline]
 snd_seq_deliver_event+0x279/0x790 sound/core/seq/seq_clientmgr.c:807
 snd_seq_kernel_client_dispatch+0x11e/0x150 sound/core/seq/seq_clientmgr.c:2309
 dummy_input+0x2c4/0x400 sound/core/seq/seq_dummy.c:104
 snd_seq_deliver_single_event.constprop.11+0x2fb/0x940 sound/core/seq/seq_clientmgr.c:621
 deliver_to_subscribers sound/core/seq/seq_clientmgr.c:676 [inline]
 snd_seq_deliver_event+0x318/0x790 sound/core/seq/seq_clientmgr.c:807
 snd_seq_kernel_client_dispatch+0x11e/0x150 sound/core/seq/seq_clientmgr.c:2309
 dummy_input+0x2c4/0x400 sound/core/seq/seq_dummy.c:104
 snd_seq_deliver_single_event.constprop.11+0x2fb/0x940 sound/core/seq/seq_clientmgr.c:621
 snd_seq_deliver_event+0x12c/0x790 sound/core/seq/seq_clientmgr.c:818
 snd_seq_kernel_client_dispatch+0x11e/0x150 sound/core/seq/seq_clientmgr.c:2309
 snd_seq_oss_dispatch sound/core/seq/oss/seq_oss_device.h:150 [inline]
 snd_seq_oss_midi_reset+0x44b/0x700 sound/core/seq/oss/seq_oss_midi.c:481
 snd_seq_oss_synth_reset+0x398/0x980 sound/core/seq/oss/seq_oss_synth.c:416
 snd_seq_oss_reset+0x6c/0x260 sound/core/seq/oss/seq_oss_init.c:448
 snd_seq_oss_release+0x71/0x120 sound/core/seq/oss/seq_oss_init.c:425
 odev_release+0x52/0x70 sound/core/seq/oss/seq_oss.c:153
 __fput+0x333/0x7f0 fs/file_table.c:210
 ____fput+0x15/0x20 fs/file_table.c:244
 task_work_run+0x199/0x270 kernel/task_work.c:112
 exit_task_work include/linux/task_work.h:21 [inline]
 do_exit+0xa52/0x1b40 kernel/exit.c:865
 do_group_exit+0x149/0x400 kernel/exit.c:968
 SYSC_exit_group kernel/exit.c:979 [inline]
 SyS_exit_group+0x1d/0x20 kernel/exit.c:977
 entry_SYSCALL_64_fastpath+0x1f/0xbe
RIP: 0033:0x442c58
RSP: 002b:00007ffd15d4f8d8 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000442c58
RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
RBP: 0000000000000082 R08: 00000000000000e7 R09: ffffffffffffffd0
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000401ca0
R13: 0000000000401d30 R14
I've just re-reproduced this on upstream 15f859ae5c43c7f0a064ed92d33f7a5bc5de6de0 (Oct 26):
============================================
WARNING: possible recursive locking detected
4.14.0-rc6+ #10 Not tainted
a.out/3062 is trying to acquire lock:
 (&grp->list_mutex){++++}, at: [<ffffffff83d28879>] deliver_to_subscribers sound/core/seq/seq_clientmgr.c:666 [inline]
 (&grp->list_mutex){++++}, at: [<ffffffff83d28879>] snd_seq_deliver_event+0x279/0x790 sound/core/seq/seq_clientmgr.c:807

but task is already holding lock:
 (&grp->list_mutex){++++}, at: [<ffffffff83d28879>] deliver_to_subscribers sound/core/seq/seq_clientmgr.c:666 [inline]
 (&grp->list_mutex){++++}, at: [<ffffffff83d28879>] snd_seq_deliver_event+0x279/0x790 sound/core/seq/seq_clientmgr.c:807
other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&grp->list_mutex);
  lock(&grp->list_mutex);

 *** DEADLOCK ***
May be due to missing lock nesting notation
Indeed, this looks more like simply a missing lock nesting annotation.
A totally untested patch is below.
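To spell out what lockdep trips over: grp->list_mutex is a different
rwsem instance in each client group, but all instances share one lock
class, so acquiring one while already holding another looks like a
self-deadlock to lockdep.  Below is a minimal sketch of the annotation
idea with hypothetical names; only down_read_nested() and the
depth-as-subclass trick correspond to the real fix:

#include <linux/rwsem.h>

struct group {
	struct rw_semaphore list_mutex;	/* one instance per group */
};

static void deliver(struct group *grp, int hop)
{
	/*
	 * A plain down_read(&grp->list_mutex) here triggers the
	 * "possible recursive locking detected" splat as soon as
	 * deliver() recurses into another group's instance.  Passing
	 * the recursion depth as the lock subclass tells lockdep that
	 * each level is a distinct nesting level of the same class.
	 */
	down_read_nested(&grp->list_mutex, hop);
	/* ... walk subscribers, possibly recursing with hop + 1 ... */
	up_read(&grp->list_mutex);
}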
FWIW, the official patch with a proper description is below.
thanks,
Takashi
-- 8< --
From: Takashi Iwai <tiwai@suse.de>
Subject: [PATCH] ALSA: seq: Fix nested rwsem annotation for lockdep splat
syzkaller reported a lockdep splat about a possible deadlock on the
grp->list_mutex of each sequencer client object.  This is actually a
false-positive report caused by missing nested-lock annotations: a
sequencer client may deliver an event directly to another client,
which then takes its own list_mutex of the same lock class.
For addressing this, the patch replaces the plain down_read() with
down_read_nested().  As the lock subclass, the already existing "hop"
count can be reused, since it indicates the depth of the delivery
call chain (see also the sketch after the patch).
Reference: http://lkml.kernel.org/r/089e082686ac9b482e055c832617@google.com
Reported-by: syzbot <bot+7feb8de6b4d6bf810cf098bef942cc387e79d0ad@syzkaller.appspotmail.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
---
 sound/core/seq/seq_clientmgr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/core/seq/seq_clientmgr.c b/sound/core/seq/seq_clientmgr.c
index 6c9cba2166d9..d10c780dfd54 100644
--- a/sound/core/seq/seq_clientmgr.c
+++ b/sound/core/seq/seq_clientmgr.c
@@ -663,7 +663,7 @@ static int deliver_to_subscribers(struct snd_seq_client *client,
 	if (atomic)
 		read_lock(&grp->list_lock);
 	else
-		down_read(&grp->list_mutex);
+		down_read_nested(&grp->list_mutex, hop);
 	list_for_each_entry(subs, &grp->list_head, src_list) {
 		/* both ports ready? */
 		if (atomic_read(&subs->ref_count) != 2)
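For completeness, a rough sketch of how the hop depth travels through
a recursive delivery chain and doubles as the lockdep subclass.  The
struct, the MAX_HOPS constant and deliver() below are illustrative,
not the real sequencer API; only down_read_nested() and the idea of a
bounded hop count correspond to the real delivery code:

#include <linux/rwsem.h>
#include <linux/errno.h>

#define MAX_HOPS	10	/* illustrative recursion bound */

struct group {
	struct rw_semaphore list_mutex;
	struct group *next;	/* e.g. a subscribed client's group */
};

static int deliver(struct group *grp, int hop)
{
	int err = 0;

	if (hop >= MAX_HOPS)
		return -EMLINK;	/* refuse unbounded delivery chains */

	/* hop is both the recursion depth and the lockdep subclass,
	 * so level N is annotated as nesting level N of the class */
	down_read_nested(&grp->list_mutex, hop);
	if (grp->next)
		err = deliver(grp->next, hop + 1);
	up_read(&grp->list_mutex);
	return err;
}

With this annotation lockdep tracks each recursion level separately,
while genuinely circular subscription chains are still cut off by the
hop limit rather than by the locking.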