[alsa-devel] sound: deadlock involving snd_hrtimer_callback
Hi Takashi,
I've incorporated your hrtimer fixes (but also updated to ddce192106e4f984123884f8e878f66ace94b573) and now I am seeing lots of the following deadlock messages:
[ INFO: possible circular locking dependency detected ] 4.6.0-rc4+ #351 Not tainted ------------------------------------------------------- swapper/0/0 is trying to acquire lock: (&(&timer->lock)->rlock){-.-...}, at: [<ffffffff8537a749>] snd_timer_interrupt+0xa9/0xd30 sound/core/timer.c:701
but task is already holding lock: (&(&stime->lock)->rlock){-.....}, at: [<ffffffff85383d3f>] snd_hrtimer_callback+0x4f/0x2b0 sound/core/hrtimer.c:54
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&(&stime->lock)->rlock){-.....}: [<ffffffff81477693>] lock_acquire+0x1e3/0x460 kernel/locking/lockdep.c:3677 [< inline >] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:112 [<ffffffff867c005f>] _raw_spin_lock_irqsave+0x9f/0xd0 kernel/locking/spinlock.c:159 [<ffffffff85383baa>] snd_hrtimer_start+0x4a/0xf0 sound/core/hrtimer.c:112 [<ffffffff85379c74>] snd_timer_start1+0x2b4/0x5a0 sound/core/timer.c:457 [<ffffffff8537d33d>] snd_timer_start+0x5d/0xa0 sound/core/timer.c:571 [< inline >] seq_timer_start sound/core/seq/seq_timer.c:393 [<ffffffff853fc260>] snd_seq_timer_start+0x1a0/0x2b0 sound/core/seq/seq_timer.c:405 [< inline >] snd_seq_queue_process_event sound/core/seq/seq_queue.c:687 [<ffffffff853f6f14>] snd_seq_control_queue+0x304/0x8b0 sound/core/seq/seq_queue.c:748 [<ffffffff853fcbb5>] event_input_timer+0x25/0x30 sound/core/seq/seq_system.c:118 [<ffffffff853ef7d4>] snd_seq_deliver_single_event.constprop.11+0x3f4/0x740 sound/core/seq/seq_clientmgr.c:636 [<ffffffff853efc38>] snd_seq_deliver_event+0x118/0x800 sound/core/seq/seq_clientmgr.c:833 [<ffffffff853f1026>] snd_seq_kernel_client_dispatch+0x126/0x170 sound/core/seq/seq_clientmgr.c:2418 [<ffffffff85405c2b>] send_timer_event.isra.0+0x10b/0x150 sound/core/seq/oss/seq_oss_timer.c:153 [<ffffffff8540615a>] snd_seq_oss_timer_start+0x1ca/0x310 sound/core/seq/oss/seq_oss_timer.c:174 [< inline >] old_event sound/core/seq/oss/seq_oss_event.c:125 [<ffffffff854095ff>] snd_seq_oss_process_event+0xa1f/0x2ce0 sound/core/seq/oss/seq_oss_event.c:100 [< inline >] insert_queue sound/core/seq/oss/seq_oss_rw.c:179 [<ffffffff8540c201>] snd_seq_oss_write+0x321/0x810 sound/core/seq/oss/seq_oss_rw.c:148 [<ffffffff85403f99>] odev_write+0x59/0xa0 sound/core/seq/oss/seq_oss.c:177 [<ffffffff817f3fc3>] __vfs_write+0x113/0x4b0 fs/read_write.c:529 [<ffffffff817f5a97>] vfs_write+0x167/0x4a0 fs/read_write.c:578 [< inline >] SYSC_write fs/read_write.c:625 [<ffffffff817f98d1>] SyS_write+0x111/0x220 fs/read_write.c:617 [<ffffffff867c0440>] entry_SYSCALL_64_fastpath+0x23/0xc1 arch/x86/entry/entry_64.S:207
-> #0 (&(&timer->lock)->rlock){-.-...}: [< inline >] check_prev_add kernel/locking/lockdep.c:1823 [< inline >] check_prevs_add kernel/locking/lockdep.c:1933 [< inline >] validate_chain kernel/locking/lockdep.c:2238 [<ffffffff81474035>] __lock_acquire+0x3625/0x4d00 kernel/locking/lockdep.c:3298 [<ffffffff81477693>] lock_acquire+0x1e3/0x460 kernel/locking/lockdep.c:3677 [< inline >] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:112 [<ffffffff867c005f>] _raw_spin_lock_irqsave+0x9f/0xd0 kernel/locking/spinlock.c:159 [<ffffffff8537a749>] snd_timer_interrupt+0xa9/0xd30 sound/core/timer.c:701 [<ffffffff85383e75>] snd_hrtimer_callback+0x185/0x2b0 sound/core/hrtimer.c:59 [< inline >] __run_hrtimer kernel/time/hrtimer.c:1242 [<ffffffff814d1071>] __hrtimer_run_queues+0x331/0xe90 kernel/time/hrtimer.c:1306 [<ffffffff814d3a42>] hrtimer_interrupt+0x182/0x430 kernel/time/hrtimer.c:1340 [<ffffffff8125aa52>] local_apic_timer_interrupt+0x72/0xe0 arch/x86/kernel/apic/apic.c:907 [<ffffffff867c2d59>] smp_apic_timer_interrupt+0x79/0xa0 arch/x86/kernel/apic/apic.c:931 [<ffffffff867c10ac>] apic_timer_interrupt+0x8c/0xa0 arch/x86/entry/entry_64.S:454 [< inline >] arch_safe_halt ./arch/x86/include/asm/paravirt.h:118 [<ffffffff8120f402>] default_idle+0x52/0x370 arch/x86/kernel/process.c:307 [<ffffffff81210bba>] arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:298 [<ffffffff81454c28>] default_idle_call+0x48/0xa0 kernel/sched/idle.c:93 [< inline >] cpuidle_idle_call kernel/sched/idle.c:151 [< inline >] cpu_idle_loop kernel/sched/idle.c:242 [<ffffffff8145520f>] cpu_startup_entry+0x58f/0x7b0 kernel/sched/idle.c:291 [<ffffffff8679ac6d>] rest_init+0x18d/0x1a0 init/main.c:408 [<ffffffff88ce7759>] start_kernel+0x63a/0x660 init/main.c:661 [<ffffffff88ce6364>] x86_64_start_reservations+0x38/0x3a arch/x86/kernel/head64.c:195 [<ffffffff88ce64be>] x86_64_start_kernel+0x158/0x167 arch/x86/kernel/head64.c:176
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1 ---- ---- lock(&(&stime->lock)->rlock); lock(&(&timer->lock)->rlock); lock(&(&stime->lock)->rlock); lock(&(&timer->lock)->rlock);
*** DEADLOCK ***
1 lock held by swapper/0/0: #0: (&(&stime->lock)->rlock){-.....}, at: [<ffffffff85383d3f>] snd_hrtimer_callback+0x4f/0x2b0 sound/core/hrtimer.c:54
stack backtrace: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.6.0-rc4+ #351 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 ffffffff87eb25c0 ffff88003ec07a50 ffffffff82c8f92f ffffffff00000000 fffffbfff0fd64b8 ffffffff8997bad0 ffffffff8997bad0 ffffffff8998aa70 ffffffff87e27510 ffffffff87e26d00 ffff88003ec07aa0 ffffffff8146cf58 Call Trace: <IRQ> [< inline >] __dump_stack lib/dump_stack.c:15 <IRQ> [<ffffffff82c8f92f>] dump_stack+0x12e/0x18f lib/dump_stack.c:51 [<ffffffff8146cf58>] print_circular_bug+0x288/0x340 kernel/locking/lockdep.c:1196 [< inline >] check_prev_add kernel/locking/lockdep.c:1823 [< inline >] check_prevs_add kernel/locking/lockdep.c:1933 [< inline >] validate_chain kernel/locking/lockdep.c:2238 [<ffffffff81474035>] __lock_acquire+0x3625/0x4d00 kernel/locking/lockdep.c:3298 [<ffffffff81477693>] lock_acquire+0x1e3/0x460 kernel/locking/lockdep.c:3677 [< inline >] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:112 [<ffffffff867c005f>] _raw_spin_lock_irqsave+0x9f/0xd0 kernel/locking/spinlock.c:159 [<ffffffff8537a749>] snd_timer_interrupt+0xa9/0xd30 sound/core/timer.c:701 [<ffffffff85383e75>] snd_hrtimer_callback+0x185/0x2b0 sound/core/hrtimer.c:59 [< inline >] __run_hrtimer kernel/time/hrtimer.c:1242 [<ffffffff814d1071>] __hrtimer_run_queues+0x331/0xe90 kernel/time/hrtimer.c:1306 [<ffffffff814d3a42>] hrtimer_interrupt+0x182/0x430 kernel/time/hrtimer.c:1340 [<ffffffff8125aa52>] local_apic_timer_interrupt+0x72/0xe0 arch/x86/kernel/apic/apic.c:907 [<ffffffff867c2d59>] smp_apic_timer_interrupt+0x79/0xa0 arch/x86/kernel/apic/apic.c:931 [<ffffffff867c10ac>] apic_timer_interrupt+0x8c/0xa0 arch/x86/entry/entry_64.S:454 [< inline >] arch_safe_halt ./arch/x86/include/asm/paravirt.h:118 [<ffffffff8120f402>] default_idle+0x52/0x370 arch/x86/kernel/process.c:307 [<ffffffff81210bba>] arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:298 [<ffffffff81454c28>] default_idle_call+0x48/0xa0 kernel/sched/idle.c:93 [< inline >] cpuidle_idle_call kernel/sched/idle.c:151 [< inline >] cpu_idle_loop kernel/sched/idle.c:242 [<ffffffff8145520f>] cpu_startup_entry+0x58f/0x7b0 kernel/sched/idle.c:291 [<ffffffff8679ac6d>] rest_init+0x18d/0x1a0 init/main.c:408 [<ffffffff88ce7759>] start_kernel+0x63a/0x660 init/main.c:661 [<ffffffff88ce6364>] x86_64_start_reservations+0x38/0x3a arch/x86/kernel/head64.c:195 [<ffffffff88ce64be>] x86_64_start_kernel+0x158/0x167 arch/x86/kernel/head64.c:176
They are also followed by rcu stalls and actual spinlock lockups:
BUG: spinlock lockup suspected on CPU#3, syz-fuzzer/7040 lock: 0xffff880032ae2bd8, .magic: dead4ead, .owner: syz-executor/11206, .owner_cpu: 0 CPU: 3 PID: 7040 Comm: syz-fuzzer Not tainted 4.6.0-rc4+ #351 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 ffffffff87eb25c0 ffff88006d507c40 ffffffff82c8f92f ffffffff2c963698 fffffbfff0fd64b8 ffff880032ae2bd8 ffff88002c963040 ffff880061238040 ffff880032ae2be0 000000009a9d0630 ffff88006d507c78 ffffffff8147f58d Call Trace: <IRQ> [< inline >] __dump_stack lib/dump_stack.c:15 <IRQ> [<ffffffff82c8f92f>] dump_stack+0x12e/0x18f lib/dump_stack.c:51 [<ffffffff8147f58d>] spin_dump+0x14d/0x280 kernel/locking/spinlock_debug.c:67 [< inline >] __spin_lock_debug kernel/locking/spinlock_debug.c:117 [<ffffffff8147f8ad>] do_raw_spin_lock+0x15d/0x2b0 kernel/locking/spinlock_debug.c:137 [< inline >] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:119 [<ffffffff867c0067>] _raw_spin_lock_irqsave+0xa7/0xd0 kernel/locking/spinlock.c:159 [<ffffffff8537a749>] snd_timer_interrupt+0xa9/0xd30 sound/core/timer.c:701 [<ffffffff85383e75>] snd_hrtimer_callback+0x185/0x2b0 sound/core/hrtimer.c:59 [< inline >] __run_hrtimer kernel/time/hrtimer.c:1242 [<ffffffff814d1071>] __hrtimer_run_queues+0x331/0xe90 kernel/time/hrtimer.c:1306 [<ffffffff814d3a42>] hrtimer_interrupt+0x182/0x430 kernel/time/hrtimer.c:1340 [<ffffffff8125aa52>] local_apic_timer_interrupt+0x72/0xe0 arch/x86/kernel/apic/apic.c:907 [<ffffffff867c2d59>] smp_apic_timer_interrupt+0x79/0xa0 arch/x86/kernel/apic/apic.c:931 [<ffffffff867c10ac>] apic_timer_interrupt+0x8c/0xa0 arch/x86/entry/entry_64.S:454 <EOI>
On Sat, 23 Apr 2016 15:40:21 +0200, Dmitry Vyukov wrote:
Hi Takashi,
I've incorporated your hrtimer fixes (but also updated to ddce192106e4f984123884f8e878f66ace94b573) and now I am seeing lots of the following deadlock messages:
[ INFO: possible circular locking dependency detected ] 4.6.0-rc4+ #351 Not tainted
swapper/0/0 is trying to acquire lock: (&(&timer->lock)->rlock){-.-...}, at: [<ffffffff8537a749>] snd_timer_interrupt+0xa9/0xd30 sound/core/timer.c:701
but task is already holding lock: (&(&stime->lock)->rlock){-.....}, at: [<ffffffff85383d3f>] snd_hrtimer_callback+0x4f/0x2b0 sound/core/hrtimer.c:54
which lock already depends on the new lock.
Oh crap, my second patch is buggy, it leads to ABBA lock, indeed. The first patch is still OK, as it just adds a new behavior mode.
Could you replace the second patch with the below one?
thanks,
Takashi
-- 8< -- From: Takashi Iwai tiwai@suse.de Subject: [PATCH] ALSA: hrtimer: Use manual start/stop in callback (v2)
With the new SNDRV_TIMER_HW_RET_CTRL flag, hrtimer can manage the callback behavior more correctly. Now it gets a return value from snd_timer_interrupt() whether to reprogram or stop the timer, and it can choose the right return value.
Thanks to this change, we can revert the long-standing hack to work around the hrtimer_cancel() call at stopping. Since the stop is called only explicitly outside the callback, hrtimer_cancel() can be called safely now.
This fix essentially reverts the old commit [fcfdebe70759: ALSA: hrtimer - Fix lock-up] while converting to use the new framework with SNDRV_TIMER_HW_RET_CTRL.
Signed-off-by: Takashi Iwai tiwai@suse.de --- sound/core/hrtimer.c | 21 +++++++-------------- 1 file changed, 7 insertions(+), 14 deletions(-)
diff --git a/sound/core/hrtimer.c b/sound/core/hrtimer.c index 656d9a9032dc..27e2493c94e4 100644 --- a/sound/core/hrtimer.c +++ b/sound/core/hrtimer.c @@ -38,7 +38,6 @@ static unsigned int resolution; struct snd_hrtimer { struct snd_timer *timer; struct hrtimer hrt; - atomic_t running; };
static enum hrtimer_restart snd_hrtimer_callback(struct hrtimer *hrt) @@ -46,14 +45,11 @@ static enum hrtimer_restart snd_hrtimer_callback(struct hrtimer *hrt) struct snd_hrtimer *stime = container_of(hrt, struct snd_hrtimer, hrt); struct snd_timer *t = stime->timer; unsigned long oruns; - - if (!atomic_read(&stime->running)) - return HRTIMER_NORESTART; + int ret;
oruns = hrtimer_forward_now(hrt, ns_to_ktime(t->sticks * resolution)); - snd_timer_interrupt(stime->timer, t->sticks * oruns); - - if (!atomic_read(&stime->running)) + ret = snd_timer_interrupt(stime->timer, t->sticks * oruns); + if (ret == SNDRV_TIMER_RET_STOP) return HRTIMER_NORESTART; return HRTIMER_RESTART; } @@ -68,7 +64,6 @@ static int snd_hrtimer_open(struct snd_timer *t) hrtimer_init(&stime->hrt, CLOCK_MONOTONIC, HRTIMER_MODE_REL); stime->timer = t; stime->hrt.function = snd_hrtimer_callback; - atomic_set(&stime->running, 0); t->private_data = stime; return 0; } @@ -89,24 +84,22 @@ static int snd_hrtimer_start(struct snd_timer *t) { struct snd_hrtimer *stime = t->private_data;
- atomic_set(&stime->running, 0); - hrtimer_try_to_cancel(&stime->hrt); hrtimer_start(&stime->hrt, ns_to_ktime(t->sticks * resolution), HRTIMER_MODE_REL); - atomic_set(&stime->running, 1); return 0; }
static int snd_hrtimer_stop(struct snd_timer *t) { struct snd_hrtimer *stime = t->private_data; - atomic_set(&stime->running, 0); - hrtimer_try_to_cancel(&stime->hrt); + + hrtimer_cancel(&stime->hrt); return 0; }
static struct snd_timer_hardware hrtimer_hw = { - .flags = SNDRV_TIMER_HW_AUTO | SNDRV_TIMER_HW_TASKLET, + .flags = SNDRV_TIMER_HW_AUTO | SNDRV_TIMER_HW_TASKLET | + SNDRV_TIMER_HW_RET_CTRL, .open = snd_hrtimer_open, .close = snd_hrtimer_close, .start = snd_hrtimer_start,
On Sat, Apr 23, 2016 at 11:02 PM, Takashi Iwai tiwai@suse.de wrote:
On Sat, 23 Apr 2016 15:40:21 +0200, Dmitry Vyukov wrote:
Hi Takashi,
I've incorporated your hrtimer fixes (but also updated to ddce192106e4f984123884f8e878f66ace94b573) and now I am seeing lots of the following deadlock messages:
[ INFO: possible circular locking dependency detected ] 4.6.0-rc4+ #351 Not tainted
swapper/0/0 is trying to acquire lock: (&(&timer->lock)->rlock){-.-...}, at: [<ffffffff8537a749>] snd_timer_interrupt+0xa9/0xd30 sound/core/timer.c:701
but task is already holding lock: (&(&stime->lock)->rlock){-.....}, at: [<ffffffff85383d3f>] snd_hrtimer_callback+0x4f/0x2b0 sound/core/hrtimer.c:54
which lock already depends on the new lock.
Oh crap, my second patch is buggy, it leads to ABBA lock, indeed. The first patch is still OK, as it just adds a new behavior mode.
Could you replace the second patch with the below one?
I've replaced the second path with this one. The deadlocks has gone, but I've hit these two hangs that look related:
https://gist.githubusercontent.com/dvyukov/805718ea249c49d17ae759d1b0160684/... https://gist.githubusercontent.com/dvyukov/7f397ea4aeb9e35596e0c8053cf35a11/...
On Sun, Apr 24, 2016 at 5:16 PM, Dmitry Vyukov dvyukov@google.com wrote:
On Sat, Apr 23, 2016 at 11:02 PM, Takashi Iwai tiwai@suse.de wrote:
On Sat, 23 Apr 2016 15:40:21 +0200, Dmitry Vyukov wrote:
Hi Takashi,
I've incorporated your hrtimer fixes (but also updated to ddce192106e4f984123884f8e878f66ace94b573) and now I am seeing lots of the following deadlock messages:
[ INFO: possible circular locking dependency detected ] 4.6.0-rc4+ #351 Not tainted
swapper/0/0 is trying to acquire lock: (&(&timer->lock)->rlock){-.-...}, at: [<ffffffff8537a749>] snd_timer_interrupt+0xa9/0xd30 sound/core/timer.c:701
but task is already holding lock: (&(&stime->lock)->rlock){-.....}, at: [<ffffffff85383d3f>] snd_hrtimer_callback+0x4f/0x2b0 sound/core/hrtimer.c:54
which lock already depends on the new lock.
Oh crap, my second patch is buggy, it leads to ABBA lock, indeed. The first patch is still OK, as it just adds a new behavior mode.
Could you replace the second patch with the below one?
I've replaced the second path with this one. The deadlocks has gone, but I've hit these two hangs that look related:
https://gist.githubusercontent.com/dvyukov/805718ea249c49d17ae759d1b0160684/... https://gist.githubusercontent.com/dvyukov/7f397ea4aeb9e35596e0c8053cf35a11/...
Another one:
[ 509.169244] BUG: spinlock lockup suspected on CPU#2, syz-executor/23652 [ 509.169244] lock: 0xffff880065662038, .magic: dead4ead, .owner: syz-executor/23642, .owner_cpu: 0 [ 509.169244] CPU: 2 PID: 23652 Comm: syz-executor Not tainted 4.6.0-rc4+ #352 [ 509.169244] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 [ 509.169244] ffffffff87eb25c0 ffff88006d407c58 ffffffff82c8fe0f ffffffff5f369e98 [ 509.169244] fffffbfff0fd64b8 ffff880065662038 ffff88005f369840 ffff88005e806040 [ 509.169244] ffff880065662040 000000009a9d0630 ffff88006d407c90 ffffffff8147f75d [ 509.169244] Call Trace: [ 509.169244] <IRQ> [<ffffffff82c8fe0f>] dump_stack+0x12e/0x18f [ 509.169244] [<ffffffff8147f75d>] spin_dump+0x14d/0x280 [ 509.169244] [<ffffffff8537ad39>] ? snd_timer_interrupt+0xa9/0xd30 [ 509.169244] [<ffffffff8147fa7d>] do_raw_spin_lock+0x15d/0x2b0 [ 509.169244] [<ffffffff867c04e7>] _raw_spin_lock_irqsave+0xa7/0xd0 [ 509.169244] [<ffffffff8537ad39>] ? snd_timer_interrupt+0xa9/0xd30 [ 509.169244] [<ffffffff8537ad39>] snd_timer_interrupt+0xa9/0xd30 [ 509.169244] [<ffffffff8127c405>] ? kvm_clock_get_cycles+0x25/0x30 [ 509.169244] [<ffffffff814e53ba>] ? ktime_get+0x32a/0x550 [ 509.169244] [<ffffffff853842e7>] ? snd_hrtimer_callback+0xb7/0x180 [ 509.169244] [<ffffffff814cff2c>] ? hrtimer_forward+0x1cc/0x3e0 [ 509.169244] [<ffffffff85384338>] snd_hrtimer_callback+0x108/0x180 [ 509.169244] [<ffffffff814d1241>] __hrtimer_run_queues+0x331/0xe90 [ 509.169244] [<ffffffff85384230>] ? snd_hrtimer_close+0xa0/0xa0 [ 509.169244] [<ffffffff814d0f10>] ? enqueue_hrtimer+0x3d0/0x3d0 [ 509.169244] [<ffffffff814d3c12>] hrtimer_interrupt+0x182/0x430 [ 509.169244] [<ffffffff8125aa62>] local_apic_timer_interrupt+0x72/0xe0 [ 509.169244] [<ffffffff867c31d9>] smp_apic_timer_interrupt+0x79/0xa0 [ 509.169244] [<ffffffff867c152c>] apic_timer_interrupt+0x8c/0xa0 [ 509.169244] <EOI> [<ffffffff8171902e>] ? unmap_page_range+0x8be/0x19f0
On Sun, 24 Apr 2016 17:16:32 +0200, Dmitry Vyukov wrote:
On Sat, Apr 23, 2016 at 11:02 PM, Takashi Iwai tiwai@suse.de wrote:
On Sat, 23 Apr 2016 15:40:21 +0200, Dmitry Vyukov wrote:
Hi Takashi,
I've incorporated your hrtimer fixes (but also updated to ddce192106e4f984123884f8e878f66ace94b573) and now I am seeing lots of the following deadlock messages:
[ INFO: possible circular locking dependency detected ] 4.6.0-rc4+ #351 Not tainted
swapper/0/0 is trying to acquire lock: (&(&timer->lock)->rlock){-.-...}, at: [<ffffffff8537a749>] snd_timer_interrupt+0xa9/0xd30 sound/core/timer.c:701
but task is already holding lock: (&(&stime->lock)->rlock){-.....}, at: [<ffffffff85383d3f>] snd_hrtimer_callback+0x4f/0x2b0 sound/core/hrtimer.c:54
which lock already depends on the new lock.
Oh crap, my second patch is buggy, it leads to ABBA lock, indeed. The first patch is still OK, as it just adds a new behavior mode.
Could you replace the second patch with the below one?
I've replaced the second path with this one. The deadlocks has gone, but I've hit these two hangs that look related:
https://gist.githubusercontent.com/dvyukov/805718ea249c49d17ae759d1b0160684/... https://gist.githubusercontent.com/dvyukov/7f397ea4aeb9e35596e0c8053cf35a11/...
Hmm, so it wasn't a good idea to call hrtimer_cancel() in the spinlock, in anyway. Scratch the previous one.
OK, below is the yet revised two patches. One is the simplified version of the patch, and another is to call hrtimer_cancel() in a new timer op without spinlock. Apply these after the first patch "ALSA: timer: Allow backend disabling start/stop from handler".
thanks,
Takashi
On Sun, Apr 24, 2016 at 6:16 PM, Takashi Iwai tiwai@suse.de wrote:
On Sun, 24 Apr 2016 17:16:32 +0200, Dmitry Vyukov wrote:
On Sat, Apr 23, 2016 at 11:02 PM, Takashi Iwai tiwai@suse.de wrote:
On Sat, 23 Apr 2016 15:40:21 +0200, Dmitry Vyukov wrote:
Hi Takashi,
I've incorporated your hrtimer fixes (but also updated to ddce192106e4f984123884f8e878f66ace94b573) and now I am seeing lots of the following deadlock messages:
[ INFO: possible circular locking dependency detected ] 4.6.0-rc4+ #351 Not tainted
swapper/0/0 is trying to acquire lock: (&(&timer->lock)->rlock){-.-...}, at: [<ffffffff8537a749>] snd_timer_interrupt+0xa9/0xd30 sound/core/timer.c:701
but task is already holding lock: (&(&stime->lock)->rlock){-.....}, at: [<ffffffff85383d3f>] snd_hrtimer_callback+0x4f/0x2b0 sound/core/hrtimer.c:54
which lock already depends on the new lock.
Oh crap, my second patch is buggy, it leads to ABBA lock, indeed. The first patch is still OK, as it just adds a new behavior mode.
Could you replace the second patch with the below one?
I've replaced the second path with this one. The deadlocks has gone, but I've hit these two hangs that look related:
https://gist.githubusercontent.com/dvyukov/805718ea249c49d17ae759d1b0160684/... https://gist.githubusercontent.com/dvyukov/7f397ea4aeb9e35596e0c8053cf35a11/...
Hmm, so it wasn't a good idea to call hrtimer_cancel() in the spinlock, in anyway. Scratch the previous one.
OK, below is the yet revised two patches. One is the simplified version of the patch, and another is to call hrtimer_cancel() in a new timer op without spinlock. Apply these after the first patch "ALSA: timer: Allow backend disabling start/stop from handler".
Done. I will let you know if I see any failures.
On Sun, 24 Apr 2016 19:09:48 +0200, Dmitry Vyukov wrote:
On Sun, Apr 24, 2016 at 6:16 PM, Takashi Iwai tiwai@suse.de wrote:
On Sun, 24 Apr 2016 17:16:32 +0200, Dmitry Vyukov wrote:
On Sat, Apr 23, 2016 at 11:02 PM, Takashi Iwai tiwai@suse.de wrote:
On Sat, 23 Apr 2016 15:40:21 +0200, Dmitry Vyukov wrote:
Hi Takashi,
I've incorporated your hrtimer fixes (but also updated to ddce192106e4f984123884f8e878f66ace94b573) and now I am seeing lots of the following deadlock messages:
[ INFO: possible circular locking dependency detected ] 4.6.0-rc4+ #351 Not tainted
swapper/0/0 is trying to acquire lock: (&(&timer->lock)->rlock){-.-...}, at: [<ffffffff8537a749>] snd_timer_interrupt+0xa9/0xd30 sound/core/timer.c:701
but task is already holding lock: (&(&stime->lock)->rlock){-.....}, at: [<ffffffff85383d3f>] snd_hrtimer_callback+0x4f/0x2b0 sound/core/hrtimer.c:54
which lock already depends on the new lock.
Oh crap, my second patch is buggy, it leads to ABBA lock, indeed. The first patch is still OK, as it just adds a new behavior mode.
Could you replace the second patch with the below one?
I've replaced the second path with this one. The deadlocks has gone, but I've hit these two hangs that look related:
https://gist.githubusercontent.com/dvyukov/805718ea249c49d17ae759d1b0160684/... https://gist.githubusercontent.com/dvyukov/7f397ea4aeb9e35596e0c8053cf35a11/...
Hmm, so it wasn't a good idea to call hrtimer_cancel() in the spinlock, in anyway. Scratch the previous one.
OK, below is the yet revised two patches. One is the simplified version of the patch, and another is to call hrtimer_cancel() in a new timer op without spinlock. Apply these after the first patch "ALSA: timer: Allow backend disabling start/stop from handler".
Done. I will let you know if I see any failures.
After reconsideration, I rewrote the whole patchset again. Could you scratch all the previous three patches and replace with the single patch below? Sorry for inconvenience!
thanks,
Takashi
-- 8< -- From: Takashi Iwai tiwai@suse.de Subject: [PATCH] ALSA: hrtimer: Handle start/stop more properly
This patch tries to address the still remaining issues in ALSA hrtimer driver: - Spurious use-after-free was detected in hrtimer callback - Incorrect rescheduling due to delayed start - WARN_ON() is triggered in hrtimer_forward() invoked in hrtimer callback
The first issue happens only when the new timer is scheduled even while hrtimer is being closed. It's related with the second and third items; since ALSA timer core invokes hw.start callback during hrtimer interrupt, this may result in the explicit call of hrtimer_start().
Also, the similar problem is seen for the stop; ALSA timer core invokes hw.stop callback even in the hrtimer handler, too. Since we must not call the synced hrtimer_cancel() in such a context, it's just a hrtimer_try_to_cancel() call that doesn't properly work.
Another culprit of the second and third items is the call of hrtimer_forward_now() before snd_timer_interrupt(). The timer->stick value may change during snd_timer_interrupt() call, but this possibility is ignored completely.
For covering these subtle and messy issues, the following changes have been done in this patch: - A new flag, in_callback, is introduced in the private data to indicate that the hrtimer handler is being processed. - Both start and stop callbacks skip when called from (during) in_callback flag. - The hrtimer handler returns properly HRTIMER_RESTART and NORESTART depending on the running state now. - The hrtimer handler reprograms the expiry properly after snd_timer_interrupt() call, instead of before. - The close callback clears running flag and sets in_callback flag to block any further start/stop calls.
Signed-off-by: Takashi Iwai tiwai@suse.de --- sound/core/hrtimer.c | 56 ++++++++++++++++++++++++++++++++++++---------------- 1 file changed, 39 insertions(+), 17 deletions(-)
diff --git a/sound/core/hrtimer.c b/sound/core/hrtimer.c index 656d9a9032dc..e2f27022b363 100644 --- a/sound/core/hrtimer.c +++ b/sound/core/hrtimer.c @@ -38,37 +38,53 @@ static unsigned int resolution; struct snd_hrtimer { struct snd_timer *timer; struct hrtimer hrt; - atomic_t running; + bool in_callback; };
static enum hrtimer_restart snd_hrtimer_callback(struct hrtimer *hrt) { struct snd_hrtimer *stime = container_of(hrt, struct snd_hrtimer, hrt); struct snd_timer *t = stime->timer; - unsigned long oruns; - - if (!atomic_read(&stime->running)) - return HRTIMER_NORESTART; - - oruns = hrtimer_forward_now(hrt, ns_to_ktime(t->sticks * resolution)); - snd_timer_interrupt(stime->timer, t->sticks * oruns); + ktime_t delta; + unsigned long ticks; + enum hrtimer_restart ret = HRTIMER_NORESTART; + + spin_lock(&t->lock); + if (!t->running) + goto out; /* fast path */ + stime->in_callback = true; + ticks = t->sticks; + spin_unlock(&t->lock); + + /* calculate the drift */ + delta = ktime_sub(hrt->base->get_time(), hrtimer_get_expires(hrt)); + if (delta.tv64 > 0) + ticks += ktime_divns(delta, ticks * resolution); + + snd_timer_interrupt(stime->timer, ticks); + + spin_lock(&t->lock); + if (t->running) { + hrtimer_add_expires_ns(hrt, t->sticks * resolution); + ret = HRTIMER_RESTART; + }
- if (!atomic_read(&stime->running)) - return HRTIMER_NORESTART; - return HRTIMER_RESTART; + stime->in_callback = false; + out: + spin_unlock(&t->lock); + return ret; }
static int snd_hrtimer_open(struct snd_timer *t) { struct snd_hrtimer *stime;
- stime = kmalloc(sizeof(*stime), GFP_KERNEL); + stime = kzalloc(sizeof(*stime), GFP_KERNEL); if (!stime) return -ENOMEM; hrtimer_init(&stime->hrt, CLOCK_MONOTONIC, HRTIMER_MODE_REL); stime->timer = t; stime->hrt.function = snd_hrtimer_callback; - atomic_set(&stime->running, 0); t->private_data = stime; return 0; } @@ -78,6 +94,11 @@ static int snd_hrtimer_close(struct snd_timer *t) struct snd_hrtimer *stime = t->private_data;
if (stime) { + spin_lock_irq(&t->lock); + t->running = 0; /* just to be sure */ + stime->in_callback = 1; /* skip start/stop */ + spin_unlock_irq(&t->lock); + hrtimer_cancel(&stime->hrt); kfree(stime); t->private_data = NULL; @@ -89,18 +110,19 @@ static int snd_hrtimer_start(struct snd_timer *t) { struct snd_hrtimer *stime = t->private_data;
- atomic_set(&stime->running, 0); - hrtimer_try_to_cancel(&stime->hrt); + if (stime->in_callback) + return 0; hrtimer_start(&stime->hrt, ns_to_ktime(t->sticks * resolution), HRTIMER_MODE_REL); - atomic_set(&stime->running, 1); return 0; }
static int snd_hrtimer_stop(struct snd_timer *t) { struct snd_hrtimer *stime = t->private_data; - atomic_set(&stime->running, 0); + + if (stime->in_callback) + return 0; hrtimer_try_to_cancel(&stime->hrt); return 0; }
On Sun, Apr 24, 2016 at 11:31 PM, Takashi Iwai tiwai@suse.de wrote:
On Sun, 24 Apr 2016 19:09:48 +0200, Dmitry Vyukov wrote:
On Sun, Apr 24, 2016 at 6:16 PM, Takashi Iwai tiwai@suse.de wrote:
On Sun, 24 Apr 2016 17:16:32 +0200, Dmitry Vyukov wrote:
On Sat, Apr 23, 2016 at 11:02 PM, Takashi Iwai tiwai@suse.de wrote:
On Sat, 23 Apr 2016 15:40:21 +0200, Dmitry Vyukov wrote:
Hi Takashi,
I've incorporated your hrtimer fixes (but also updated to ddce192106e4f984123884f8e878f66ace94b573) and now I am seeing lots of the following deadlock messages:
[ INFO: possible circular locking dependency detected ] 4.6.0-rc4+ #351 Not tainted
swapper/0/0 is trying to acquire lock: (&(&timer->lock)->rlock){-.-...}, at: [<ffffffff8537a749>] snd_timer_interrupt+0xa9/0xd30 sound/core/timer.c:701
but task is already holding lock: (&(&stime->lock)->rlock){-.....}, at: [<ffffffff85383d3f>] snd_hrtimer_callback+0x4f/0x2b0 sound/core/hrtimer.c:54
which lock already depends on the new lock.
Oh crap, my second patch is buggy, it leads to ABBA lock, indeed. The first patch is still OK, as it just adds a new behavior mode.
Could you replace the second patch with the below one?
I've replaced the second path with this one. The deadlocks has gone, but I've hit these two hangs that look related:
https://gist.githubusercontent.com/dvyukov/805718ea249c49d17ae759d1b0160684/... https://gist.githubusercontent.com/dvyukov/7f397ea4aeb9e35596e0c8053cf35a11/...
Hmm, so it wasn't a good idea to call hrtimer_cancel() in the spinlock, in anyway. Scratch the previous one.
OK, below is the yet revised two patches. One is the simplified version of the patch, and another is to call hrtimer_cancel() in a new timer op without spinlock. Apply these after the first patch "ALSA: timer: Allow backend disabling start/stop from handler".
Done. I will let you know if I see any failures.
After reconsideration, I rewrote the whole patchset again. Could you scratch all the previous three patches and replace with the single patch below? Sorry for inconvenience!
I did not yet reapply the patch, but I hit this with over night: https://gist.githubusercontent.com/dvyukov/85531fe7923ebc9be4376e009d6c960a/...
On Mon, 25 Apr 2016 10:03:34 +0200, Dmitry Vyukov wrote:
On Sun, Apr 24, 2016 at 11:31 PM, Takashi Iwai tiwai@suse.de wrote:
On Sun, 24 Apr 2016 19:09:48 +0200, Dmitry Vyukov wrote:
On Sun, Apr 24, 2016 at 6:16 PM, Takashi Iwai tiwai@suse.de wrote:
On Sun, 24 Apr 2016 17:16:32 +0200, Dmitry Vyukov wrote:
On Sat, Apr 23, 2016 at 11:02 PM, Takashi Iwai tiwai@suse.de wrote:
On Sat, 23 Apr 2016 15:40:21 +0200, Dmitry Vyukov wrote: > > Hi Takashi, > > I've incorporated your hrtimer fixes (but also updated to > ddce192106e4f984123884f8e878f66ace94b573) and now I am seeing lots of > the following deadlock messages: > > > [ INFO: possible circular locking dependency detected ] > 4.6.0-rc4+ #351 Not tainted > ------------------------------------------------------- > swapper/0/0 is trying to acquire lock: > (&(&timer->lock)->rlock){-.-...}, at: [<ffffffff8537a749>] > snd_timer_interrupt+0xa9/0xd30 sound/core/timer.c:701 > > but task is already holding lock: > (&(&stime->lock)->rlock){-.....}, at: [<ffffffff85383d3f>] > snd_hrtimer_callback+0x4f/0x2b0 sound/core/hrtimer.c:54 > > which lock already depends on the new lock.
Oh crap, my second patch is buggy, it leads to ABBA lock, indeed. The first patch is still OK, as it just adds a new behavior mode.
Could you replace the second patch with the below one?
I've replaced the second path with this one. The deadlocks has gone, but I've hit these two hangs that look related:
https://gist.githubusercontent.com/dvyukov/805718ea249c49d17ae759d1b0160684/... https://gist.githubusercontent.com/dvyukov/7f397ea4aeb9e35596e0c8053cf35a11/...
Hmm, so it wasn't a good idea to call hrtimer_cancel() in the spinlock, in anyway. Scratch the previous one.
OK, below is the yet revised two patches. One is the simplified version of the patch, and another is to call hrtimer_cancel() in a new timer op without spinlock. Apply these after the first patch "ALSA: timer: Allow backend disabling start/stop from handler".
Done. I will let you know if I see any failures.
After reconsideration, I rewrote the whole patchset again. Could you scratch all the previous three patches and replace with the single patch below? Sorry for inconvenience!
I did not yet reapply the patch, but I hit this with over night: https://gist.githubusercontent.com/dvyukov/85531fe7923ebc9be4376e009d6c960a/...
The previous patches still use hrtimer_cancel() and this might causes an issue like this. The latest revised one has no blocking behavior, so hopefully we won't fall into that hole again.
thanks,
Takashi
On Sun, Apr 24, 2016 at 11:31 PM, Takashi Iwai tiwai@suse.de wrote:
On Sun, 24 Apr 2016 19:09:48 +0200, Dmitry Vyukov wrote:
On Sun, Apr 24, 2016 at 6:16 PM, Takashi Iwai tiwai@suse.de wrote:
On Sun, 24 Apr 2016 17:16:32 +0200, Dmitry Vyukov wrote:
On Sat, Apr 23, 2016 at 11:02 PM, Takashi Iwai tiwai@suse.de wrote:
On Sat, 23 Apr 2016 15:40:21 +0200, Dmitry Vyukov wrote:
Hi Takashi,
I've incorporated your hrtimer fixes (but also updated to ddce192106e4f984123884f8e878f66ace94b573) and now I am seeing lots of the following deadlock messages:
[ INFO: possible circular locking dependency detected ] 4.6.0-rc4+ #351 Not tainted
swapper/0/0 is trying to acquire lock: (&(&timer->lock)->rlock){-.-...}, at: [<ffffffff8537a749>] snd_timer_interrupt+0xa9/0xd30 sound/core/timer.c:701
but task is already holding lock: (&(&stime->lock)->rlock){-.....}, at: [<ffffffff85383d3f>] snd_hrtimer_callback+0x4f/0x2b0 sound/core/hrtimer.c:54
which lock already depends on the new lock.
Oh crap, my second patch is buggy, it leads to ABBA lock, indeed. The first patch is still OK, as it just adds a new behavior mode.
Could you replace the second patch with the below one?
I've replaced the second path with this one. The deadlocks has gone, but I've hit these two hangs that look related:
https://gist.githubusercontent.com/dvyukov/805718ea249c49d17ae759d1b0160684/... https://gist.githubusercontent.com/dvyukov/7f397ea4aeb9e35596e0c8053cf35a11/...
Hmm, so it wasn't a good idea to call hrtimer_cancel() in the spinlock, in anyway. Scratch the previous one.
OK, below is the yet revised two patches. One is the simplified version of the patch, and another is to call hrtimer_cancel() in a new timer op without spinlock. Apply these after the first patch "ALSA: timer: Allow backend disabling start/stop from handler".
Done. I will let you know if I see any failures.
After reconsideration, I rewrote the whole patchset again. Could you scratch all the previous three patches and replace with the single patch below? Sorry for inconvenience!
Reapplied. I will let you know if I see any crashes in sound.
-- 8< -- From: Takashi Iwai tiwai@suse.de Subject: [PATCH] ALSA: hrtimer: Handle start/stop more properly
This patch tries to address the still remaining issues in ALSA hrtimer driver:
- Spurious use-after-free was detected in hrtimer callback
- Incorrect rescheduling due to delayed start
- WARN_ON() is triggered in hrtimer_forward() invoked in hrtimer callback
The first issue happens only when the new timer is scheduled even while hrtimer is being closed. It's related with the second and third items; since ALSA timer core invokes hw.start callback during hrtimer interrupt, this may result in the explicit call of hrtimer_start().
Also, the similar problem is seen for the stop; ALSA timer core invokes hw.stop callback even in the hrtimer handler, too. Since we must not call the synced hrtimer_cancel() in such a context, it's just a hrtimer_try_to_cancel() call that doesn't properly work.
Another culprit of the second and third items is the call of hrtimer_forward_now() before snd_timer_interrupt(). The timer->stick value may change during snd_timer_interrupt() call, but this possibility is ignored completely.
For covering these subtle and messy issues, the following changes have been done in this patch:
- A new flag, in_callback, is introduced in the private data to indicate that the hrtimer handler is being processed.
- Both start and stop callbacks skip when called from (during) in_callback flag.
- The hrtimer handler returns properly HRTIMER_RESTART and NORESTART depending on the running state now.
- The hrtimer handler reprograms the expiry properly after snd_timer_interrupt() call, instead of before.
- The close callback clears running flag and sets in_callback flag to block any further start/stop calls.
Signed-off-by: Takashi Iwai tiwai@suse.de
sound/core/hrtimer.c | 56 ++++++++++++++++++++++++++++++++++++---------------- 1 file changed, 39 insertions(+), 17 deletions(-)
diff --git a/sound/core/hrtimer.c b/sound/core/hrtimer.c index 656d9a9032dc..e2f27022b363 100644 --- a/sound/core/hrtimer.c +++ b/sound/core/hrtimer.c @@ -38,37 +38,53 @@ static unsigned int resolution; struct snd_hrtimer { struct snd_timer *timer; struct hrtimer hrt;
atomic_t running;
bool in_callback;
};
static enum hrtimer_restart snd_hrtimer_callback(struct hrtimer *hrt) { struct snd_hrtimer *stime = container_of(hrt, struct snd_hrtimer, hrt); struct snd_timer *t = stime->timer;
unsigned long oruns;
if (!atomic_read(&stime->running))
return HRTIMER_NORESTART;
oruns = hrtimer_forward_now(hrt, ns_to_ktime(t->sticks * resolution));
snd_timer_interrupt(stime->timer, t->sticks * oruns);
ktime_t delta;
unsigned long ticks;
enum hrtimer_restart ret = HRTIMER_NORESTART;
spin_lock(&t->lock);
if (!t->running)
goto out; /* fast path */
stime->in_callback = true;
ticks = t->sticks;
spin_unlock(&t->lock);
/* calculate the drift */
delta = ktime_sub(hrt->base->get_time(), hrtimer_get_expires(hrt));
if (delta.tv64 > 0)
ticks += ktime_divns(delta, ticks * resolution);
snd_timer_interrupt(stime->timer, ticks);
spin_lock(&t->lock);
if (t->running) {
hrtimer_add_expires_ns(hrt, t->sticks * resolution);
ret = HRTIMER_RESTART;
}
if (!atomic_read(&stime->running))
return HRTIMER_NORESTART;
return HRTIMER_RESTART;
stime->in_callback = false;
- out:
spin_unlock(&t->lock);
return ret;
}
static int snd_hrtimer_open(struct snd_timer *t) { struct snd_hrtimer *stime;
stime = kmalloc(sizeof(*stime), GFP_KERNEL);
stime = kzalloc(sizeof(*stime), GFP_KERNEL); if (!stime) return -ENOMEM; hrtimer_init(&stime->hrt, CLOCK_MONOTONIC, HRTIMER_MODE_REL); stime->timer = t; stime->hrt.function = snd_hrtimer_callback;
atomic_set(&stime->running, 0); t->private_data = stime; return 0;
} @@ -78,6 +94,11 @@ static int snd_hrtimer_close(struct snd_timer *t) struct snd_hrtimer *stime = t->private_data;
if (stime) {
spin_lock_irq(&t->lock);
t->running = 0; /* just to be sure */
stime->in_callback = 1; /* skip start/stop */
spin_unlock_irq(&t->lock);
hrtimer_cancel(&stime->hrt); kfree(stime); t->private_data = NULL;
@@ -89,18 +110,19 @@ static int snd_hrtimer_start(struct snd_timer *t) { struct snd_hrtimer *stime = t->private_data;
atomic_set(&stime->running, 0);
hrtimer_try_to_cancel(&stime->hrt);
if (stime->in_callback)
return 0; hrtimer_start(&stime->hrt, ns_to_ktime(t->sticks * resolution), HRTIMER_MODE_REL);
atomic_set(&stime->running, 1); return 0;
}
static int snd_hrtimer_stop(struct snd_timer *t) { struct snd_hrtimer *stime = t->private_data;
atomic_set(&stime->running, 0);
if (stime->in_callback)
return 0; hrtimer_try_to_cancel(&stime->hrt); return 0;
}
2.8.1
On Mon, May 2, 2016 at 1:54 PM, Dmitry Vyukov dvyukov@google.com wrote:
On Sun, Apr 24, 2016 at 11:31 PM, Takashi Iwai tiwai@suse.de wrote:
On Sun, 24 Apr 2016 19:09:48 +0200, Dmitry Vyukov wrote:
On Sun, Apr 24, 2016 at 6:16 PM, Takashi Iwai tiwai@suse.de wrote:
On Sun, 24 Apr 2016 17:16:32 +0200, Dmitry Vyukov wrote:
On Sat, Apr 23, 2016 at 11:02 PM, Takashi Iwai tiwai@suse.de wrote:
On Sat, 23 Apr 2016 15:40:21 +0200, Dmitry Vyukov wrote: > > Hi Takashi, > > I've incorporated your hrtimer fixes (but also updated to > ddce192106e4f984123884f8e878f66ace94b573) and now I am seeing lots of > the following deadlock messages: > > > [ INFO: possible circular locking dependency detected ] > 4.6.0-rc4+ #351 Not tainted > ------------------------------------------------------- > swapper/0/0 is trying to acquire lock: > (&(&timer->lock)->rlock){-.-...}, at: [<ffffffff8537a749>] > snd_timer_interrupt+0xa9/0xd30 sound/core/timer.c:701 > > but task is already holding lock: > (&(&stime->lock)->rlock){-.....}, at: [<ffffffff85383d3f>] > snd_hrtimer_callback+0x4f/0x2b0 sound/core/hrtimer.c:54 > > which lock already depends on the new lock.
Oh crap, my second patch is buggy, it leads to ABBA lock, indeed. The first patch is still OK, as it just adds a new behavior mode.
Could you replace the second patch with the below one?
I've replaced the second path with this one. The deadlocks has gone, but I've hit these two hangs that look related:
https://gist.githubusercontent.com/dvyukov/805718ea249c49d17ae759d1b0160684/... https://gist.githubusercontent.com/dvyukov/7f397ea4aeb9e35596e0c8053cf35a11/...
Hmm, so it wasn't a good idea to call hrtimer_cancel() in the spinlock, in anyway. Scratch the previous one.
OK, below is the yet revised two patches. One is the simplified version of the patch, and another is to call hrtimer_cancel() in a new timer op without spinlock. Apply these after the first patch "ALSA: timer: Allow backend disabling start/stop from handler".
Done. I will let you know if I see any failures.
After reconsideration, I rewrote the whole patchset again. Could you scratch all the previous three patches and replace with the single patch below? Sorry for inconvenience!
Reapplied. I will let you know if I see any crashes in sound.
No crashes over two nights. We can consider it as fixed.
On Wed, 04 May 2016 10:34:33 +0200, Dmitry Vyukov wrote:
On Mon, May 2, 2016 at 1:54 PM, Dmitry Vyukov dvyukov@google.com wrote:
On Sun, Apr 24, 2016 at 11:31 PM, Takashi Iwai tiwai@suse.de wrote:
On Sun, 24 Apr 2016 19:09:48 +0200, Dmitry Vyukov wrote:
On Sun, Apr 24, 2016 at 6:16 PM, Takashi Iwai tiwai@suse.de wrote:
On Sun, 24 Apr 2016 17:16:32 +0200, Dmitry Vyukov wrote:
On Sat, Apr 23, 2016 at 11:02 PM, Takashi Iwai tiwai@suse.de wrote: > On Sat, 23 Apr 2016 15:40:21 +0200, > Dmitry Vyukov wrote: >> >> Hi Takashi, >> >> I've incorporated your hrtimer fixes (but also updated to >> ddce192106e4f984123884f8e878f66ace94b573) and now I am seeing lots of >> the following deadlock messages: >> >> >> [ INFO: possible circular locking dependency detected ] >> 4.6.0-rc4+ #351 Not tainted >> ------------------------------------------------------- >> swapper/0/0 is trying to acquire lock: >> (&(&timer->lock)->rlock){-.-...}, at: [<ffffffff8537a749>] >> snd_timer_interrupt+0xa9/0xd30 sound/core/timer.c:701 >> >> but task is already holding lock: >> (&(&stime->lock)->rlock){-.....}, at: [<ffffffff85383d3f>] >> snd_hrtimer_callback+0x4f/0x2b0 sound/core/hrtimer.c:54 >> >> which lock already depends on the new lock. > > Oh crap, my second patch is buggy, it leads to ABBA lock, indeed. > The first patch is still OK, as it just adds a new behavior mode. > > Could you replace the second patch with the below one?
I've replaced the second path with this one. The deadlocks has gone, but I've hit these two hangs that look related:
https://gist.githubusercontent.com/dvyukov/805718ea249c49d17ae759d1b0160684/... https://gist.githubusercontent.com/dvyukov/7f397ea4aeb9e35596e0c8053cf35a11/...
Hmm, so it wasn't a good idea to call hrtimer_cancel() in the spinlock, in anyway. Scratch the previous one.
OK, below is the yet revised two patches. One is the simplified version of the patch, and another is to call hrtimer_cancel() in a new timer op without spinlock. Apply these after the first patch "ALSA: timer: Allow backend disabling start/stop from handler".
Done. I will let you know if I see any failures.
After reconsideration, I rewrote the whole patchset again. Could you scratch all the previous three patches and replace with the single patch below? Sorry for inconvenience!
Reapplied. I will let you know if I see any crashes in sound.
No crashes over two nights. We can consider it as fixed.
Good to hear. The patch will be included in 4.7.
thanks,
Takashi
participants (2)
-
Dmitry Vyukov
-
Takashi Iwai