[alsa-devel] ALSA trigger soft lockup since "ALSA: seq: Fix race at creating a queue"
Jerome Glisse
jglisse at redhat.com
Mon Nov 6 02:08:32 CET 2017
On Sun, Nov 05, 2017 at 10:49:10AM +0100, Takashi Iwai wrote:
> On Sun, 05 Nov 2017 10:22:57 +0100,
> Takashi Iwai wrote:
> >
> > On Sat, 04 Nov 2017 21:32:42 +0100,
> > Jerome Glisse wrote:
> > >
> > > On Sat, Nov 04, 2017 at 08:14:50AM +0100, Takashi Iwai wrote:
> > > > On Sat, 04 Nov 2017 03:34:09 +0100,
> > > > Jerome Glisse wrote:
> > > > >
> > > > > On Wed, Nov 01, 2017 at 01:02:27PM -0700, Andrew Morton wrote:
> > > > > > Begin forwarded message:
> > > > > >
> > > > > > Date: Wed, 01 Nov 2017 12:54:16 -0700
> > > > > > From: syzbot <bot+63583aefef5457348dcfa06b87d4fd1378b26b09 at syzkaller.appspotmail.com>
> > > > > > To: aaron.lu at intel.com, akpm at linux-foundation.org, aneesh.kumar at linux.vnet.ibm.com, jack at suse.cz, kirill.shutemov at linux.intel.com, linux-kernel at vger.kernel.org, linux-mm at kvack.org, mhocko at suse.com, minchan at kernel.org, peterz at infradead.org, rientjes at google.com, sfr at canb.auug.org.au, shli at fb.com, syzkaller-bugs at googlegroups.com, willy at linux.intel.com, ying.huang at intel.com, zi.yan at cs.rutgers.edu
> > > > > > Subject: BUG: soft lockup
> > > > > >
> > > > > >
> > > > > > Hello,
> > > > > >
> > > > > > syzkaller hit the following crash on
> > > > > > 1d53d908b79d7870d89063062584eead4cf83448
> > > > > > git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
> > > > > > compiler: gcc (GCC) 7.1.1 20170620
> > > > > > .config is attached
> > > > > > Raw console output is attached.
> > > > > > C reproducer is attached
> > > > > > syzkaller reproducer is attached. See https://goo.gl/kgGztJ
> > > > > > for information about syzkaller reproducers
> > > > >
> > > > > Sorry to be the bringer of bad news, but after:
> > > > > 4842e98f26dd80be3623c4714a244ba52ea096a8
> > > > > and
> > > > > 7e1d90f60a0d501c8503e636942ca704a454d910
> > > > >
> > > > > The attached syzkaller repro triggers a soft lockup in the workqueue. I am unfamiliar
> > > > > with that code, so someone familiar with that domain is more likely to find a proper
> > > > > solution.
> > > > >
> > > > > Note that
> > > > > git revert 7e1d90f60a0d501c8503e636942ca704a454d910
> > > > > git revert 4842e98f26dd80be3623c4714a244ba52ea096a8
> > > > > fixes the issue.
> > > >
> > > > Could you give the stack trace?
> > >
> > > The stack trace doesn't say anything; it is different every single time.
> > > It was first reported against mm as the stack was showing some mm code,
> > > but that was a coincidence. The stack will show whatever is running at
> > > the time the workqueue soft lockup kicks in. The only constant is the
> > > watchdog stack trace, but that does not tell you anything:
> > >
> > > RIP: 0033:0x4012c8
> > > RSP: 002b:00007ffc12d93e10 EFLAGS: 00010286
> > > RAX: ffffffffffffffff RBX: ffffffffffffffff RCX: 0000000000439779
> > > RDX: 0000000000000000 RSI: 000000002076e000 RDI: ffffffffffffffff
> > > RBP: 00327265636e6575 R08: 0000000000000000 R09: 0000000000000000
> > > R10: 0000000000000000 R11: 0000000000000286 R12: 7165732f7665642f
> > > R13: 646e732f7665642f R14: 0030656c69662f2e R15: 0000000000000000
> > > Code: 60 39 d1 fc e9 8d fd ff ff be 08 00 00 00 4c 89 ff e8 2e 39 d1 fc e9
> > > 1c fd ff ff 90 90 90 90 90 90 90 90 90 b9 00 02 00 00 31 c0 <f3> 48 ab c3
> > > 0f 1f 44 00 00 31 c0 b9 40 00 00 00 66 0f 1f 84 00
> > > Kernel panic - not syncing: softlockup: hung tasks
> > > CPU: 3 PID: 2995 Comm: syzkaller355698 Tainted: G L
> > > 4.13.0-rc7-next-20170901+ #13
> > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> > > Call Trace:
> > > <IRQ>
> > > __dump_stack lib/dump_stack.c:16 [inline]
> > > dump_stack+0x194/0x257 lib/dump_stack.c:52
> > > panic+0x1e4/0x417 kernel/panic.c:181
> > > watchdog_timer_fn+0x401/0x410 kernel/watchdog.c:433
> > > __run_hrtimer kernel/time/hrtimer.c:1213 [inline]
> > > __hrtimer_run_queues+0x349/0xe10 kernel/time/hrtimer.c:1277
> > > hrtimer_interrupt+0x1d4/0x5f0 kernel/time/hrtimer.c:1311
> > > local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1021 [inline]
> > > smp_apic_timer_interrupt+0x156/0x710 arch/x86/kernel/apic/apic.c:1046
> > > apic_timer_interrupt+0x9d/0xb0 arch/x86/entry/entry_64.S:577
> > >
> > >
> > > If you want I can collect a bunch, but I haven't seen anything useful in
> > > any of them.
> >
> > OK, thanks for the information.
> >
> > So this looks like simply a consequence of too many concurrently
> > opened / processed ALSA timer instances. Below is a patch to plug
> > it, setting an upper limit to avoid the system resource hog.
>
> Scratch that, it may lead to an unbalanced count.
> Below is the fixed version.
>
>
> Takashi
>
> -- 8< --
> From: Takashi Iwai <tiwai at suse.de>
> Subject: [PATCH v2] ALSA: timer: Limit max instances per timer
>
> Currently we allow an unlimited number of timer instances, and this may
> leave the system hogging way too much CPU when too many timer
> instances are opened and processed concurrently. This may end up in
> a soft-lockup report, as triggered by syzkaller, especially when the
> hrtimer backend is deployed.
>
> Since such an insane number of instances isn't demanded by the normal
> use case of the ALSA sequencer and merely opens a risk of abuse,
> this patch introduces an upper limit for the number of instances per
> timer backend. By default it is set to 1000, but for a fine-grained
> timer like hrtimer it is set to 100.
>
> Reported-by: syzbot
> Cc: <stable at vger.kernel.org>
> Signed-off-by: Takashi Iwai <tiwai at suse.de>
Tested-by: Jérôme Glisse <jglisse at redhat.com>
> ---
> include/sound/timer.h | 2 ++
> sound/core/hrtimer.c | 1 +
> sound/core/timer.c | 67 +++++++++++++++++++++++++++++++++++++++++----------
> 3 files changed, 57 insertions(+), 13 deletions(-)
>
> diff --git a/include/sound/timer.h b/include/sound/timer.h
> index c4d76ff056c6..7ae226ab6990 100644
> --- a/include/sound/timer.h
> +++ b/include/sound/timer.h
> @@ -90,6 +90,8 @@ struct snd_timer {
> struct list_head ack_list_head;
> struct list_head sack_list_head; /* slow ack list head */
> struct tasklet_struct task_queue;
> + int max_instances; /* upper limit of timer instances */
> + int num_instances; /* current number of timer instances */
> };
>
> struct snd_timer_instance {
> diff --git a/sound/core/hrtimer.c b/sound/core/hrtimer.c
> index 5603e49f7bf7..18cb6f476bf4 100644
> --- a/sound/core/hrtimer.c
> +++ b/sound/core/hrtimer.c
> @@ -159,6 +159,7 @@ static int __init snd_hrtimer_init(void)
> timer->hw = hrtimer_hw;
> timer->hw.resolution = resolution;
> timer->hw.ticks = NANO_SEC / resolution;
> + timer->max_instances = 100; /* lower the limit */
>
> err = snd_timer_global_register(timer);
> if (err < 0) {
> diff --git a/sound/core/timer.c b/sound/core/timer.c
> index 09acaf2b2e57..ee09dace8bb1 100644
> --- a/sound/core/timer.c
> +++ b/sound/core/timer.c
> @@ -180,7 +180,7 @@ static void snd_timer_request(struct snd_timer_id *tid)
> *
> * call this with register_mutex down.
> */
> -static void snd_timer_check_slave(struct snd_timer_instance *slave)
> +static int snd_timer_check_slave(struct snd_timer_instance *slave)
> {
> struct snd_timer *timer;
> struct snd_timer_instance *master;
> @@ -190,16 +190,21 @@ static void snd_timer_check_slave(struct snd_timer_instance *slave)
> list_for_each_entry(master, &timer->open_list_head, open_list) {
> if (slave->slave_class == master->slave_class &&
> slave->slave_id == master->slave_id) {
> + if (master->timer->num_instances >=
> + master->timer->max_instances)
> + return -EBUSY;
> list_move_tail(&slave->open_list,
> &master->slave_list_head);
> + master->timer->num_instances++;
> spin_lock_irq(&slave_active_lock);
> slave->master = master;
> slave->timer = master->timer;
> spin_unlock_irq(&slave_active_lock);
> - return;
> + return 0;
> }
> }
> }
> + return 0;
> }
>
> /*
> @@ -208,7 +213,7 @@ static void snd_timer_check_slave(struct snd_timer_instance *slave)
> *
> * call this with register_mutex down.
> */
> -static void snd_timer_check_master(struct snd_timer_instance *master)
> +static int snd_timer_check_master(struct snd_timer_instance *master)
> {
> struct snd_timer_instance *slave, *tmp;
>
> @@ -216,7 +221,11 @@ static void snd_timer_check_master(struct snd_timer_instance *master)
> list_for_each_entry_safe(slave, tmp, &snd_timer_slave_list, open_list) {
> if (slave->slave_class == master->slave_class &&
> slave->slave_id == master->slave_id) {
> + if (master->timer->num_instances >=
> + master->timer->max_instances)
> + return -EBUSY;
> list_move_tail(&slave->open_list, &master->slave_list_head);
> + master->timer->num_instances++;
> spin_lock_irq(&slave_active_lock);
> spin_lock(&master->timer->lock);
> slave->master = master;
> @@ -228,8 +237,11 @@ static void snd_timer_check_master(struct snd_timer_instance *master)
> spin_unlock_irq(&slave_active_lock);
> }
> }
> + return 0;
> }
>
> +static int snd_timer_close_locked(struct snd_timer_instance *timeri);
> +
> /*
> * open a timer instance
> * when opening a master, the slave id must be here given.
> @@ -240,6 +252,7 @@ int snd_timer_open(struct snd_timer_instance **ti,
> {
> struct snd_timer *timer;
> struct snd_timer_instance *timeri = NULL;
> + int err;
>
> if (tid->dev_class == SNDRV_TIMER_CLASS_SLAVE) {
> /* open a slave instance */
> @@ -259,10 +272,14 @@ int snd_timer_open(struct snd_timer_instance **ti,
> timeri->slave_id = tid->device;
> timeri->flags |= SNDRV_TIMER_IFLG_SLAVE;
> list_add_tail(&timeri->open_list, &snd_timer_slave_list);
> - snd_timer_check_slave(timeri);
> + err = snd_timer_check_slave(timeri);
> + if (err < 0) {
> + snd_timer_close_locked(timeri);
> + timeri = NULL;
> + }
> > mutex_unlock(&register_mutex);
> *ti = timeri;
> - return 0;
> + return err;
> }
>
> /* open a master instance */
> @@ -288,6 +305,10 @@ int snd_timer_open(struct snd_timer_instance **ti,
> return -EBUSY;
> }
> }
> + if (timer->num_instances >= timer->max_instances) {
> > + mutex_unlock(&register_mutex);
> + return -EBUSY;
> + }
> timeri = snd_timer_instance_new(owner, timer);
> if (!timeri) {
> > mutex_unlock(&register_mutex);
> @@ -314,25 +335,27 @@ int snd_timer_open(struct snd_timer_instance **ti,
> }
>
> list_add_tail(&timeri->open_list, &timer->open_list_head);
> - snd_timer_check_master(timeri);
> + timer->num_instances++;
> + err = snd_timer_check_master(timeri);
> + if (err < 0) {
> + snd_timer_close_locked(timeri);
> + timeri = NULL;
> + }
> > mutex_unlock(&register_mutex);
> *ti = timeri;
> - return 0;
> + return err;
> }
> EXPORT_SYMBOL(snd_timer_open);
>
> /*
> * close a timer instance
> + * call this with register_mutex down.
> */
> -int snd_timer_close(struct snd_timer_instance *timeri)
> +static int snd_timer_close_locked(struct snd_timer_instance *timeri)
> {
> struct snd_timer *timer = NULL;
> struct snd_timer_instance *slave, *tmp;
>
> - if (snd_BUG_ON(!timeri))
> - return -ENXIO;
> -
> > - mutex_lock(&register_mutex);
> list_del(&timeri->open_list);
>
> /* force to stop the timer */
> @@ -340,6 +363,7 @@ int snd_timer_close(struct snd_timer_instance *timeri)
>
> timer = timeri->timer;
> if (timer) {
> + timer->num_instances--;
> /* wait, until the active callback is finished */
> spin_lock_irq(&timer->lock);
> while (timeri->flags & SNDRV_TIMER_IFLG_CALLBACK) {
> @@ -355,6 +379,7 @@ int snd_timer_close(struct snd_timer_instance *timeri)
> list_for_each_entry_safe(slave, tmp, &timeri->slave_list_head,
> open_list) {
> list_move_tail(&slave->open_list, &snd_timer_slave_list);
> + timer->num_instances--;
> slave->master = NULL;
> slave->timer = NULL;
> list_del_init(&slave->ack_list);
> @@ -382,9 +407,24 @@ int snd_timer_close(struct snd_timer_instance *timeri)
> module_put(timer->module);
> }
>
> > - mutex_unlock(&register_mutex);
> return 0;
> }
> +
> +/*
> + * close a timer instance
> + */
> +int snd_timer_close(struct snd_timer_instance *timeri)
> +{
> + int err;
> +
> + if (snd_BUG_ON(!timeri))
> + return -ENXIO;
> +
> > + mutex_lock(&register_mutex);
> + err = snd_timer_close_locked(timeri);
> > + mutex_unlock(&register_mutex);
> + return err;
> +}
> EXPORT_SYMBOL(snd_timer_close);
>
> unsigned long snd_timer_resolution(struct snd_timer_instance *timeri)
> @@ -856,6 +896,7 @@ int snd_timer_new(struct snd_card *card, char *id, struct snd_timer_id *tid,
> spin_lock_init(&timer->lock);
> tasklet_init(&timer->task_queue, snd_timer_tasklet,
> (unsigned long)timer);
> + timer->max_instances = 1000; /* default limit per timer */
> if (card != NULL) {
> timer->module = card->module;
> err = snd_device_new(card, SNDRV_DEV_TIMER, timer, &ops);
> --
> 2.15.0
>
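As a quick illustration of how the new limit surfaces to userspace: the following is a minimal sketch (not the syzkaller reproducer and not part of the patch above) that opens the raw ALSA timer device repeatedly and selects the global hrtimer backend. It assumes the kernel uAPI header <sound/asound.h> is installed and that CONFIG_SND_HRTIMER is enabled so the hrtimer backend is registered; with the patch applied, the SELECT ioctl is expected to start failing with EBUSY once the hrtimer backend reaches its limit of 100 instances.

/* Minimal sketch: exercise the per-timer instance limit from userspace
 * via the raw ALSA timer device.  Each /dev/snd/timer fd plus a SELECT
 * ioctl creates one timer instance in the kernel. */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <sound/asound.h>

int main(void)
{
    struct snd_timer_select sel;
    int i, fd;

    for (i = 0; i < 200; i++) {
        fd = open("/dev/snd/timer", O_RDONLY);
        if (fd < 0) {
            perror("open /dev/snd/timer");
            return 1;
        }
        memset(&sel, 0, sizeof(sel));
        sel.id.dev_class = SNDRV_TIMER_CLASS_GLOBAL;
        sel.id.device = SNDRV_TIMER_GLOBAL_HRTIMER;
        /* With the instance limit in place this is expected to fail
         * with EBUSY once the hrtimer backend hits 100 instances,
         * instead of letting the count grow without bound. */
        if (ioctl(fd, SNDRV_TIMER_IOCTL_SELECT, &sel) < 0) {
            printf("instance %d: %s\n", i, strerror(errno));
            close(fd);
            break;
        }
        /* keep fd open on purpose so the instance stays alive */
    }
    return 0;
}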