[alsa-devel] sound: use-after-free in snd_timer_interrupt

Takashi Iwai tiwai at suse.de
Mon Jan 18 14:06:10 CET 2016


On Mon, 18 Jan 2016 11:53:00 +0100,
Dmitry Vyukov wrote:
> 
> On Fri, Jan 15, 2016 at 10:44 PM, Takashi Iwai <tiwai at suse.de> wrote:
> > On Fri, 15 Jan 2016 22:22:46 +0100,
> > Takashi Iwai wrote:
> >>
> >> On Fri, 15 Jan 2016 20:47:05 +0100,
> >> Dmitry Vyukov wrote:
> >> >
> >> > On Fri, Jan 15, 2016 at 8:18 PM, Takashi Iwai <tiwai at suse.de> wrote:
> >> > > On Fri, 15 Jan 2016 20:13:11 +0100,
> >> > > Dmitry Vyukov wrote:
> >> > >>
> >> > >> On Fri, Jan 15, 2016 at 3:38 PM, Dmitry Vyukov <dvyukov at google.com> wrote:
> >> > >> > On Fri, Jan 15, 2016 at 2:51 PM, Takashi Iwai <tiwai at suse.de> wrote:
> >> > >> >> On Fri, 15 Jan 2016 12:03:17 +0100,
> >> > >> >> Dmitry Vyukov wrote:
> >> > >> >>>
> >> > >> >>> On Fri, Jan 15, 2016 at 12:00 PM, Takashi Iwai <tiwai at suse.de> wrote:
> >> > >> >>> > On Fri, 15 Jan 2016 09:06:10 +0100,
> >> > >> >>> > Dmitry Vyukov wrote:
> >> > >> >>> >>
> >> > >> >>> >> On Thu, Jan 14, 2016 at 5:09 PM, Takashi Iwai <tiwai at suse.de> wrote:
> >> > >> >>> >> > On Wed, 13 Jan 2016 21:54:10 +0100,
> >> > >> >>> >> > Takashi Iwai wrote:
> >> > >> >>> >> >>
> >> > >> >>> >> >> OK, then this might be a possible race at the current snd_timer_stop()
> >> > >> >>> >> >> implementation.  There is no sync action there, so the ISR might be
> >> > >> >>> >> >> still alive after snd_timer_close() call.  Or might be another race.
> >> > >> >>> >> >> This pattern looks a bit different, as it's involved with hrtimer.
> >> > >> >>> >> >>
> >> > >> >>> >> >> I'll take a look at it tomorrow.
> >> > >> >>> >> >
> >> > >> >>> >> > I've audited the code today, but the open window doesn't look like
> >> > >> >>> >> > what I expected.  I found only some possible cases with slave timer
> >> > >> >>> >> > instances.
> >> > >> >>> >> >
> >> > >> >>> >> > Anyway, below is a test fix patch.  Since I couldn't reproduce the
> >> > >> >>> >> > issue on my local machines, it's hard to say whether this covers the
> >> > >> >>> >> > holes you fell into.  Let's see...
> >> > >> >>> >>
> >> > >> >>> >>
> >> > >> >>> >> Hi Takashi,
> >> > >> >>> >>
> >> > >> >>> >> I would be interested to understand why other people can't reproduce
> >> > >> >>> >> issues that I hit pretty reliably.
> >> > >> >>> >> I suspect that it can be due to .config. Please try with the following
> >> > >> >>> >> config values.
> >> > >> >>> >
> >> > >> >>> > I guess rather other config, e.g. the kernel debug options.
> >> > >> >>> > I suppose you enabled KASAN and DEBUG_LIST.  What else?
> >> > >> >>>
> >> > >> >>> I've attached my config (you will need to disable CONFIG_KCOV, it is
> >> > >> >>> not upstreamed).
> >> > >> >>
> >> > >> >> Hm, that has lots of other drivers built-in...
> >> > >> >>
> >> > >> >>> >> I also start qemu with "-soundhw all" arg.
> >> > >> >>> >
> >> > >> >>> > OK, so you're testing with a VM?  This makes it easier to recheck.
> >> > >> >>>
> >> > >> >>> Yes, I start qemu as:
> >> > >> >>>
> >> > >> >>> qemu-system-x86_64 -hda wheezy.img -net
> >> > >> >>> user,host=10.0.2.10,hostfwd=tcp::10022-:22 -net nic -nographic -kernel
> >> > >> >>> arch/x86/boot/bzImage -append "console=ttyS0 root=/dev/sda debug
> >> > >> >>> earlyprintk=serial slub_debug=UZ" -enable-kvm -m 2G -numa
> >> > >> >>> node,nodeid=0,cpus=0-1 -numa node,nodeid=1,cpus=2-3 -smp
> >> > >> >>> sockets=2,cores=2,threads=1 -usb -usbdevice mouse -usbdevice tablet
> >> > >> >>> -soundhw all
> >> > >> >>
> >> > >> >> And which test did trigger use-after-free, even with all previous
> >> > >> >> patches?
> >> > >> >
> >> > >> > I will try to extract a new reproducer now.
> >> > >>
> >> > >> Ok, I do not seem to see any crashes except the timer hangs below.
> >> > >> Let's consider all other bugs as fixed. I will report anything new
> >> > >> that I see separately.
> >> > >
> >> > > OK, good to hear.
> >> > >
> >> > >> > Meanwhile, can you try to reproduce this one:
> >> > >> > https://groups.google.com/forum/#!msg/syzkaller/bbtG9_h1ONU/CPLblMC6FAAJ
> >> > >> > ? I run the program in a tight parallel loop.
> >> > >
> >> > > I could reproduce this after your suggestion with parallel runs.
> >> > >
> >> > > This seems specific to hrtimer.  Possibly it's not about the snd-timer
> >> > > core itself.  Could you check whether this doesn't happen when
> >> > > CONFIG_SND_HRTIMER isn't set?
> >> >
> >> >
> >> > Does not happen without CONFIG_SND_HRTIMER.
> >> > Do you mean that this is hrtimer bug?
> >>
> >> I guess rather it's a bug in snd-hrtimer driver.
> >> Will check it later.
> >
> > The patch below *might* fix the issue.  There was a deadlock problem
> > and the current code has a weird workaround for it.  I suspect it
> > being the cause.
> >
> > If this works, I'll happily apply it before submitting the next pull
> > request for 4.5.  If not, I'll take a closer look at it in the next
> > week :)
> 
> 
> No, unfortunately the hang still happens with the patch:

Thanks for testing.  I think I understand the problem now.  We faced a
similar issue in the past and moved hrtimer_cancel() then.  But that
wasn't enough, as the start function may also be called in interrupt
context.

How about the one below instead?


Takashi

---
From: Takashi Iwai <tiwai at suse.de>
Subject: [PATCH] ALSA: hrtimer: Fix stall by hrtimer_cancel()

hrtimer_cancel() waits for the callback to complete, thus it must not
be called inside the callback itself.  This was already a problem, and
the earlier commit [fcfdebe70759: ALSA: hrtimer - Fix lock-up] tried to
address it.

However, the previous fix is still insufficient: it may still cause a
lockup when the ALSA timer instance reprograms itself in its callback.
Then the start function is invoked from snd_timer_interrupt(), which
itself runs in the hrtimer callback, resulting in a CPU stall.  This is
not a hypothetical problem; it was actually triggered by the syzkaller
fuzzer.

This patch tries to fix the issue again.  Now we call
hrtimer_try_to_cancel() in both the start and stop functions so that it
won't fall into a deadlock, while still giving a chance to cancel the
queued timer if the functions are called outside the callback.  The
proper hrtimer_cancel() is called at closing anyway, so this should be
enough.

Reported-by: Dmitry Vyukov <dvyukov at google.com>
Cc: <stable at vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai at suse.de>
---
 sound/core/hrtimer.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/sound/core/hrtimer.c b/sound/core/hrtimer.c
index f845ecf7e172..656d9a9032dc 100644
--- a/sound/core/hrtimer.c
+++ b/sound/core/hrtimer.c
@@ -90,7 +90,7 @@ static int snd_hrtimer_start(struct snd_timer *t)
 	struct snd_hrtimer *stime = t->private_data;
 
 	atomic_set(&stime->running, 0);
-	hrtimer_cancel(&stime->hrt);
+	hrtimer_try_to_cancel(&stime->hrt);
 	hrtimer_start(&stime->hrt, ns_to_ktime(t->sticks * resolution),
 		      HRTIMER_MODE_REL);
 	atomic_set(&stime->running, 1);
@@ -101,6 +101,7 @@ static int snd_hrtimer_stop(struct snd_timer *t)
 {
 	struct snd_hrtimer *stime = t->private_data;
 	atomic_set(&stime->running, 0);
+	hrtimer_try_to_cancel(&stime->hrt);
 	return 0;
 }
 
-- 
2.7.0
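
For readers who want to see why the blocking cancel stalls while the
try variant does not, here is a minimal user-space sketch.  It is not
kernel code: struct model_timer and the model_*() helpers are
hypothetical names made up for illustration, and the "stall" is
approximated by a bounded poll loop rather than a real spin.  The only
thing taken from the kernel API is the return convention of
hrtimer_try_to_cancel() (-1 while the callback is running, 0 if the
timer was not active, 1 if it was active and got dequeued).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for one hrtimer; not the kernel struct. */
struct model_timer {
	bool queued;            /* timer is armed and waiting to fire */
	bool callback_running;  /* expiry callback is executing right now */
};

/*
 * Mirrors the return convention of hrtimer_try_to_cancel():
 *  -1  callback is running, the timer cannot be stopped right now
 *   0  timer was not active
 *   1  timer was active and has been dequeued
 * It never waits.
 */
static int model_try_to_cancel(struct model_timer *t)
{
	assert(t != NULL);
	if (t->callback_running)
		return -1;
	if (!t->queued)
		return 0;
	t->queued = false;
	return 1;
}

/*
 * Models a blocking cancel: keep retrying until the callback is done.
 * A real caller would wait indefinitely; this model gives up after
 * max_spins polls and returns -1 to mean "this would have stalled".
 */
static int model_cancel(struct model_timer *t, int max_spins)
{
	for (int i = 0; i < max_spins; i++) {
		int ret = model_try_to_cancel(t);

		if (ret >= 0)
			return ret;
	}
	return -1; /* callback_running never clears while we wait inside it */
}

/* Models the patched start path: cancel only if possible, then re-arm. */
static int model_start(struct model_timer *t)
{
	int ret = model_try_to_cancel(t);

	t->queued = true;
	return ret;
}
```

With callback_running set, which models a call made from inside the
timer callback, model_cancel() uses up all of its polls and reports
the stall, whereas model_start() returns at once and leaves the timer
armed.  Once the flag is cleared, model_try_to_cancel() dequeues the
timer normally.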


