[alsa-devel] ALSA trigger soft lockup since "ALSA: seq: Fix race at creating a queue"

Jerome Glisse jglisse at redhat.com
Mon Nov 6 02:08:32 CET 2017


On Sun, Nov 05, 2017 at 10:49:10AM +0100, Takashi Iwai wrote:
> On Sun, 05 Nov 2017 10:22:57 +0100,
> Takashi Iwai wrote:
> > 
> > On Sat, 04 Nov 2017 21:32:42 +0100,
> > Jerome Glisse wrote:
> > > 
> > > On Sat, Nov 04, 2017 at 08:14:50AM +0100, Takashi Iwai wrote:
> > > > On Sat, 04 Nov 2017 03:34:09 +0100,
> > > > Jerome Glisse wrote:
> > > > > 
> > > > > On Wed, Nov 01, 2017 at 01:02:27PM -0700, Andrew Morton wrote:
> > > > > > Begin forwarded message:
> > > > > > 
> > > > > > Date: Wed, 01 Nov 2017 12:54:16 -0700
> > > > > > From: syzbot <bot+63583aefef5457348dcfa06b87d4fd1378b26b09 at syzkaller.appspotmail.com>
> > > > > > To: aaron.lu at intel.com, akpm at linux-foundation.org, aneesh.kumar at linux.vnet.ibm.com, jack at suse.cz, kirill.shutemov at linux.intel.com, linux-kernel at vger.kernel.org, linux-mm at kvack.org, mhocko at suse.com, minchan at kernel.org, peterz at infradead.org, rientjes at google.com, sfr at canb.auug.org.au, shli at fb.com, syzkaller-bugs at googlegroups.com, willy at linux.intel.com, ying.huang at intel.com, zi.yan at cs.rutgers.edu
> > > > > > Subject: BUG: soft lockup
> > > > > > 
> > > > > > 
> > > > > > Hello,
> > > > > > 
> > > > > > syzkaller hit the following crash on  
> > > > > > 1d53d908b79d7870d89063062584eead4cf83448
> > > > > > git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
> > > > > > compiler: gcc (GCC) 7.1.1 20170620
> > > > > > .config is attached
> > > > > > Raw console output is attached.
> > > > > > C reproducer is attached
> > > > > > syzkaller reproducer is attached. See https://goo.gl/kgGztJ
> > > > > > for information about syzkaller reproducers
> > > > > 
> > > > > Sorry to be the bringer of bad news but after:
> > > > > 4842e98f26dd80be3623c4714a244ba52ea096a8
> > > > > and
> > > > > 7e1d90f60a0d501c8503e636942ca704a454d910
> > > > > 
> > > > > The attached syzkaller repro triggers a soft lockup in the workqueue. I am
> > > > > unfamiliar with that code, so someone familiar with that domain is more
> > > > > likely to find a proper solution.
> > > > > 
> > > > > Note that reverting both:
> > > > > git revert 7e1d90f60a0d501c8503e636942ca704a454d910
> > > > > git revert 4842e98f26dd80be3623c4714a244ba52ea096a8
> > > > > fixes the issue.
> > > > 
> > > > Could you give the stack trace?
> > > 
> > > The stack trace doesn't say anything; it is different every single time.
> > > It was first reported against mm because the stack was showing some mm
> > > code, but that was coincidence. The stack will show whatever happens to
> > > be running at the time the workqueue soft lockup kicks in. The only
> > > constant is the watchdog stack trace, but that does not tell you anything:
> > > 
> > > RIP: 0033:0x4012c8
> > > RSP: 002b:00007ffc12d93e10 EFLAGS: 00010286
> > > RAX: ffffffffffffffff RBX: ffffffffffffffff RCX: 0000000000439779
> > > RDX: 0000000000000000 RSI: 000000002076e000 RDI: ffffffffffffffff
> > > RBP: 00327265636e6575 R08: 0000000000000000 R09: 0000000000000000
> > > R10: 0000000000000000 R11: 0000000000000286 R12: 7165732f7665642f
> > > R13: 646e732f7665642f R14: 0030656c69662f2e R15: 0000000000000000
> > > Code: 60 39 d1 fc e9 8d fd ff ff be 08 00 00 00 4c 89 ff e8 2e 39 d1 fc e9  
> > > 1c fd ff ff 90 90 90 90 90 90 90 90 90 b9 00 02 00 00 31 c0 <f3> 48 ab c3  
> > > 0f 1f 44 00 00 31 c0 b9 40 00 00 00 66 0f 1f 84 00
> > > Kernel panic - not syncing: softlockup: hung tasks
> > > CPU: 3 PID: 2995 Comm: syzkaller355698 Tainted: G             L  
> > > 4.13.0-rc7-next-20170901+ #13
> > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> > > Call Trace:
> > >   <IRQ>
> > >   __dump_stack lib/dump_stack.c:16 [inline]
> > >   dump_stack+0x194/0x257 lib/dump_stack.c:52
> > >   panic+0x1e4/0x417 kernel/panic.c:181
> > >   watchdog_timer_fn+0x401/0x410 kernel/watchdog.c:433
> > >   __run_hrtimer kernel/time/hrtimer.c:1213 [inline]
> > >   __hrtimer_run_queues+0x349/0xe10 kernel/time/hrtimer.c:1277
> > >   hrtimer_interrupt+0x1d4/0x5f0 kernel/time/hrtimer.c:1311
> > >   local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1021 [inline]
> > >   smp_apic_timer_interrupt+0x156/0x710 arch/x86/kernel/apic/apic.c:1046
> > >   apic_timer_interrupt+0x9d/0xb0 arch/x86/entry/entry_64.S:577
> > > 
> > > 
> > > If you want I can collect a bunch, but I haven't seen anything useful in
> > > any of them.
> > 
> > OK, thanks for the information.
> > 
> > So this looks simply like a consequence of too many concurrently
> > opened / processed ALSA timer instances.  Below is a patch to plug
> > it, setting an upper limit to avoid hogging system resources.
> 
> Scratch that, it may lead to an unbalanced count.
> Below is the fixed version.
> 
> 
> Takashi
> 
> -- 8< --
> From: Takashi Iwai <tiwai at suse.de>
> Subject: [PATCH v2] ALSA: timer: Limit max instances per timer
> 
> Currently we allow an unlimited number of timer instances, and this
> may make the system hog way too much CPU when too many timer
> instances are opened and processed concurrently.  This may end up
> with a soft-lockup report, as triggered by syzkaller, especially
> when the hrtimer backend is deployed.
> 
> Since such an insane number of instances isn't demanded by the
> normal use case of the ALSA sequencer and merely opens a risk of
> abuse, this patch introduces an upper limit on the number of
> instances per timer backend.  By default it's set to 1000, but for
> a fine-grained timer like hrtimer it's set to 100.
> 
> Reported-by: syzbot
> Cc: <stable at vger.kernel.org>
> Signed-off-by: Takashi Iwai <tiwai at suse.de>

Tested-by: Jérôme Glisse <jglisse at redhat.com>

> ---
>  include/sound/timer.h |  2 ++
>  sound/core/hrtimer.c  |  1 +
>  sound/core/timer.c    | 67 +++++++++++++++++++++++++++++++++++++++++----------
>  3 files changed, 57 insertions(+), 13 deletions(-)
> 
> diff --git a/include/sound/timer.h b/include/sound/timer.h
> index c4d76ff056c6..7ae226ab6990 100644
> --- a/include/sound/timer.h
> +++ b/include/sound/timer.h
> @@ -90,6 +90,8 @@ struct snd_timer {
>  	struct list_head ack_list_head;
>  	struct list_head sack_list_head; /* slow ack list head */
>  	struct tasklet_struct task_queue;
> +	int max_instances;	/* upper limit of timer instances */
> +	int num_instances;	/* current number of timer instances */
>  };
>  
>  struct snd_timer_instance {
> diff --git a/sound/core/hrtimer.c b/sound/core/hrtimer.c
> index 5603e49f7bf7..18cb6f476bf4 100644
> --- a/sound/core/hrtimer.c
> +++ b/sound/core/hrtimer.c
> @@ -159,6 +159,7 @@ static int __init snd_hrtimer_init(void)
>  	timer->hw = hrtimer_hw;
>  	timer->hw.resolution = resolution;
>  	timer->hw.ticks = NANO_SEC / resolution;
> +	timer->max_instances = 100; /* lower the limit */
>  
>  	err = snd_timer_global_register(timer);
>  	if (err < 0) {
> diff --git a/sound/core/timer.c b/sound/core/timer.c
> index 09acaf2b2e57..ee09dace8bb1 100644
> --- a/sound/core/timer.c
> +++ b/sound/core/timer.c
> @@ -180,7 +180,7 @@ static void snd_timer_request(struct snd_timer_id *tid)
>   *
>   * call this with register_mutex down.
>   */
> -static void snd_timer_check_slave(struct snd_timer_instance *slave)
> +static int snd_timer_check_slave(struct snd_timer_instance *slave)
>  {
>  	struct snd_timer *timer;
>  	struct snd_timer_instance *master;
> @@ -190,16 +190,21 @@ static void snd_timer_check_slave(struct snd_timer_instance *slave)
>  		list_for_each_entry(master, &timer->open_list_head, open_list) {
>  			if (slave->slave_class == master->slave_class &&
>  			    slave->slave_id == master->slave_id) {
> +				if (master->timer->num_instances >=
> +				    master->timer->max_instances)
> +					return -EBUSY;
>  				list_move_tail(&slave->open_list,
>  					       &master->slave_list_head);
> +				master->timer->num_instances++;
>  				spin_lock_irq(&slave_active_lock);
>  				slave->master = master;
>  				slave->timer = master->timer;
>  				spin_unlock_irq(&slave_active_lock);
> -				return;
> +				return 0;
>  			}
>  		}
>  	}
> +	return 0;
>  }
>  
>  /*
> @@ -208,7 +213,7 @@ static void snd_timer_check_slave(struct snd_timer_instance *slave)
>   *
>   * call this with register_mutex down.
>   */
> -static void snd_timer_check_master(struct snd_timer_instance *master)
> +static int snd_timer_check_master(struct snd_timer_instance *master)
>  {
>  	struct snd_timer_instance *slave, *tmp;
>  
> @@ -216,7 +221,11 @@ static void snd_timer_check_master(struct snd_timer_instance *master)
>  	list_for_each_entry_safe(slave, tmp, &snd_timer_slave_list, open_list) {
>  		if (slave->slave_class == master->slave_class &&
>  		    slave->slave_id == master->slave_id) {
> +			if (master->timer->num_instances >=
> +			    master->timer->max_instances)
> +				return -EBUSY;
>  			list_move_tail(&slave->open_list, &master->slave_list_head);
> +			master->timer->num_instances++;
>  			spin_lock_irq(&slave_active_lock);
>  			spin_lock(&master->timer->lock);
>  			slave->master = master;
> @@ -228,8 +237,11 @@ static void snd_timer_check_master(struct snd_timer_instance *master)
>  			spin_unlock_irq(&slave_active_lock);
>  		}
>  	}
> +	return 0;
>  }
>  
> +static int snd_timer_close_locked(struct snd_timer_instance *timeri);
> +
>  /*
>   * open a timer instance
>   * when opening a master, the slave id must be here given.
> @@ -240,6 +252,7 @@ int snd_timer_open(struct snd_timer_instance **ti,
>  {
>  	struct snd_timer *timer;
>  	struct snd_timer_instance *timeri = NULL;
> +	int err;
>  
>  	if (tid->dev_class == SNDRV_TIMER_CLASS_SLAVE) {
>  		/* open a slave instance */
> @@ -259,10 +272,14 @@ int snd_timer_open(struct snd_timer_instance **ti,
>  		timeri->slave_id = tid->device;
>  		timeri->flags |= SNDRV_TIMER_IFLG_SLAVE;
>  		list_add_tail(&timeri->open_list, &snd_timer_slave_list);
> -		snd_timer_check_slave(timeri);
> +		err = snd_timer_check_slave(timeri);
> +		if (err < 0) {
> +			snd_timer_close_locked(timeri);
> +			timeri = NULL;
> +		}
>  		mutex_unlock(&register_mutex);
>  		*ti = timeri;
> -		return 0;
> +		return err;
>  	}
>  
>  	/* open a master instance */
> @@ -288,6 +305,10 @@ int snd_timer_open(struct snd_timer_instance **ti,
>  			return -EBUSY;
>  		}
>  	}
> +	if (timer->num_instances >= timer->max_instances) {
> +		mutex_unlock(&register_mutex);
> +		return -EBUSY;
> +	}
>  	timeri = snd_timer_instance_new(owner, timer);
>  	if (!timeri) {
>  		mutex_unlock(&register_mutex);
> @@ -314,25 +335,27 @@ int snd_timer_open(struct snd_timer_instance **ti,
>  	}
>  
>  	list_add_tail(&timeri->open_list, &timer->open_list_head);
> -	snd_timer_check_master(timeri);
> +	timer->num_instances++;
> +	err = snd_timer_check_master(timeri);
> +	if (err < 0) {
> +		snd_timer_close_locked(timeri);
> +		timeri = NULL;
> +	}
>  	mutex_unlock(&register_mutex);
>  	*ti = timeri;
> -	return 0;
> +	return err;
>  }
>  EXPORT_SYMBOL(snd_timer_open);
>  
>  /*
>   * close a timer instance
> + * call this with register_mutex down.
>   */
> -int snd_timer_close(struct snd_timer_instance *timeri)
> +static int snd_timer_close_locked(struct snd_timer_instance *timeri)
>  {
>  	struct snd_timer *timer = NULL;
>  	struct snd_timer_instance *slave, *tmp;
>  
> -	if (snd_BUG_ON(!timeri))
> -		return -ENXIO;
> -
> -	mutex_lock(&register_mutex);
>  	list_del(&timeri->open_list);
>  
>  	/* force to stop the timer */
> @@ -340,6 +363,7 @@ int snd_timer_close(struct snd_timer_instance *timeri)
>  
>  	timer = timeri->timer;
>  	if (timer) {
> +		timer->num_instances--;
>  		/* wait, until the active callback is finished */
>  		spin_lock_irq(&timer->lock);
>  		while (timeri->flags & SNDRV_TIMER_IFLG_CALLBACK) {
> @@ -355,6 +379,7 @@ int snd_timer_close(struct snd_timer_instance *timeri)
>  		list_for_each_entry_safe(slave, tmp, &timeri->slave_list_head,
>  					 open_list) {
>  			list_move_tail(&slave->open_list, &snd_timer_slave_list);
> +			timer->num_instances--;
>  			slave->master = NULL;
>  			slave->timer = NULL;
>  			list_del_init(&slave->ack_list);
> @@ -382,9 +407,24 @@ int snd_timer_close(struct snd_timer_instance *timeri)
>  		module_put(timer->module);
>  	}
>  
> -	mutex_unlock(&register_mutex);
>  	return 0;
>  }
> +
> +/*
> + * close a timer instance
> + */
> +int snd_timer_close(struct snd_timer_instance *timeri)
> +{
> +	int err;
> +
> +	if (snd_BUG_ON(!timeri))
> +		return -ENXIO;
> +
> +	mutex_lock(&register_mutex);
> +	err = snd_timer_close_locked(timeri);
> +	mutex_unlock(&register_mutex);
> +	return err;
> +}
>  EXPORT_SYMBOL(snd_timer_close);
>  
>  unsigned long snd_timer_resolution(struct snd_timer_instance *timeri)
> @@ -856,6 +896,7 @@ int snd_timer_new(struct snd_card *card, char *id, struct snd_timer_id *tid,
>  	spin_lock_init(&timer->lock);
>  	tasklet_init(&timer->task_queue, snd_timer_tasklet,
>  		     (unsigned long)timer);
> +	timer->max_instances = 1000; /* default limit per timer */
>  	if (card != NULL) {
>  		timer->module = card->module;
>  		err = snd_device_new(card, SNDRV_DEV_TIMER, timer, &ops);
> -- 
> 2.15.0
> 

