[alsa-devel] Races in alsa-lib with threads

Rémi Denis-Courmont remi at remlab.net
Wed Nov 14 22:19:08 CET 2012

Le mercredi 14 novembre 2012 22:54:15, Trent Piepho a écrit :
> gstreamer uses non-blocking I/O.  But uses snd_pcm_wait rather than
> poll directly.  I wonder if poll() is entirely correct when ALSA
> plugins are in the chain?
> "the snd_pcm_write() calls ...can be interlocked with snd_pcm_delay()
> calls".
> It's still a race to call writei() and delay() at the same time, even
> if writei() is non-blocking.  But only if the rate plugin is being
> used.  Of course the actual race is very small, and doesn't include
> the time spent doing resampling or mixing in plugins which are still
> part of the writei() call in non-blocking mode.  It's basically like
> the BKL around every ALSA call.

Of course, you will need a lock around many ALSA calls, at least all 
potentially concurrent call. But that's a great improvement over holding a 
lock while sleeping inside a blocking I/O operation.

> >> It seems that they didn't know there would be a problem.
> > 
> > Eh? It is rather the exception for a library to be thread-safe when using
> > the same object in multiple threads.
> It's ok to call kernel syscalls on the same fd at the same time.

Yes and no. Typically the kernel protects itself against undefined behaviour 
induced by user space. But it does not protect user space against their own 
undefined behaviour. The file descriptors table is a very good example of that.

In fact,  calling snd_pcm_delay() and snd_pcm_write() concurrently is just 
PLAIN STUPID. The delay measurement may or may not include the impeding write 
operation, or maybe only a part of it. Essentially, the delay value is not 
much more than random garbage.

Hiding a mutex inside alsa-lib will NOT fix this intrinsic problem: Instead of 
having data races in memory, you will have logical races and gstreamer would 
still be buggy in some ways.

> The kernel used to have the BKL for this.  Now it has fine grained locking
> around the actual critical sections.  So if ALSA is called thread
> safe, it doesn't seem that unreasonable to think one can do this.

Thread-safe means you can enter the library functions from multiple threads. 
Thread-safe never meant that you can enter the library functions from multiple 
threads with the SAME DATA.

See also Lennart's and Kay's example library...

> > It looks more like "they" did not pay attention to what they were doing.
> I think the point is that this bug was in gstreamer.  gstreamer is
> commonly used by many people.  The bug has been there for years.  Many
> programmers have looked at the code.  Yet no one noticed it or tracked
> it down until now.  If it's so obvious why did it take so long?
> You aren't aware of this problem in other multi-threaded code, but
> that doesn't mean it doesn't exist.  You weren't aware of the problem
> in gstreamer and yet it exists.

Hey, I don't use gstreamer.  VLC has interlocked its ALSA calls for many years 
and I did not blame alsa-lib for not doing it internally. In fact, I realized 
that this was not a problem that ALSA could nor should solve.

Also, all audio APIs are like that. Even PulseAudio requires the application 
to acquire locks explicitly.

Rémi Denis-Courmont

More information about the Alsa-devel mailing list