[alsa-devel] [PATCH] alsa-lib: snd_pcm_delay and friends do not account for a write being currently in progress
Hi,
In a multi-threaded application it is possible for snd_pcm_delay or an equivalent function to be called by one thread while another is sitting in snd_pcm_writei. In this case, snd_pcm_delay does not take into account that there may not be enough space for all the data passed to snd_pcm_writei to be written to the ring buffer at once, and will return incorrect values.
To illustrate:
Suppose a sound card has a maximum buffer size of 0.1 second. Then, suppose an application writes 0.5 second of audio. snd_pcm_writei will immediately write the first 0.1 second to the ring buffer, then will incrementally write the remaining 0.4 second as the buffer drains. During this time, snd_pcm_delay will return a constant fill level of 0.1 second. On the application side, the playback time counter will be calculated during this time as 0.5 seconds written to ALSA minus a delay of 0.1 second, giving a constant stream position of 0.4 second.
(This problem showed up when a recent change in Audacious caused it to call snd_pcm_writei with larger chunks of audio than formerly -- the result was that the visualization, which is synchronized with the time counter, began to jump, where it should be a smooth 30 FPS animation.)
The problem is a little tough to fix, since there must be some thread semantics involved so that the thread calling snd_pcm_delay knows at what instant snd_pcm_delay will begin taking into account the data passed to snd_pcm_writei. My solution is to create a variant of snd_pcm_writei ("snd_pcm_writei_locked"), which is the passed the additional argument of a pointer to a mutex locked in advance by the caller. snd_pcm_writei_locked will then unlock the mutex once the application can safely assume that snd_pcm_delay will take into account the new data.
I realize this is a substantial change, so please tell me whether this patch is along the right track or if a different solution is called for.
Signed-off-by: John Lindgren john.lindgren@tds.net ---
include/pcm.h | 8 +++++ src/pcm/pcm.c | 76 ++++++++++++++++++++++++++++++++++++++++++++++++--- src/pcm/pcm_hw.c | 39 +++++++++++++++++++++++++- src/pcm/pcm_local.h | 3 ++ 4 files changed, 121 insertions(+), 5 deletions(-)
diff --git a/include/pcm.h b/include/pcm.h index f3618c3..2234279 100644 --- a/include/pcm.h +++ b/include/pcm.h @@ -33,6 +33,10 @@ extern "C" { #endif
+#include <pthread.h> + +#define SND_PCM_HAVE_LOCKED_WRITE + /** * \defgroup PCM PCM Interface * See the \ref pcm page for more details. @@ -448,8 +452,12 @@ snd_pcm_sframes_t snd_pcm_rewind(snd_pcm_t *pcm, snd_pcm_uframes_t frames); snd_pcm_sframes_t snd_pcm_forwardable(snd_pcm_t *pcm); snd_pcm_sframes_t snd_pcm_forward(snd_pcm_t *pcm, snd_pcm_uframes_t frames); snd_pcm_sframes_t snd_pcm_writei(snd_pcm_t *pcm, const void *buffer, snd_pcm_uframes_t size); +snd_pcm_sframes_t snd_pcm_writei_locked (snd_pcm_t * pcm, const void * buffer, + snd_pcm_uframes_t size, pthread_mutex_t * mutex); snd_pcm_sframes_t snd_pcm_readi(snd_pcm_t *pcm, void *buffer, snd_pcm_uframes_t size); snd_pcm_sframes_t snd_pcm_writen(snd_pcm_t *pcm, void **bufs, snd_pcm_uframes_t size); +snd_pcm_sframes_t snd_pcm_writen_locked (snd_pcm_t * pcm, void * * bufs, + snd_pcm_uframes_t size, pthread_mutex_t * mutex); snd_pcm_sframes_t snd_pcm_readn(snd_pcm_t *pcm, void **bufs, snd_pcm_uframes_t size); int snd_pcm_wait(snd_pcm_t *pcm, int timeout);
diff --git a/src/pcm/pcm.c b/src/pcm/pcm.c index f910189..db03067 100644 --- a/src/pcm/pcm.c +++ b/src/pcm/pcm.c @@ -903,8 +903,12 @@ int snd_pcm_sw_params(snd_pcm_t *pcm, snd_pcm_sw_params_t *params) */ int snd_pcm_status(snd_pcm_t *pcm, snd_pcm_status_t *status) { - assert(pcm && status); - return pcm->fast_ops->status(pcm->fast_op_arg, status); + int result; + + assert(pcm && status); + result = pcm->fast_ops->status (pcm->fast_op_arg, status); + status->delay += pcm->fast_op_arg->write_delay; + return result; }
/** @@ -973,12 +977,17 @@ link_warning(snd_pcm_hwsync, "Warning: snd_pcm_hwsync() is deprecated, consider */ int snd_pcm_delay(snd_pcm_t *pcm, snd_pcm_sframes_t *delayp) { + int result; + assert(pcm); if (CHECK_SANITY(! pcm->setup)) { SNDMSG("PCM not set up"); return -EIO; } - return pcm->fast_ops->delay(pcm->fast_op_arg, delayp); + + result = pcm->fast_ops->delay (pcm->fast_op_arg, delayp); + * delayp += pcm->fast_op_arg->write_delay; + return result; }
/** @@ -1248,6 +1257,34 @@ snd_pcm_sframes_t snd_pcm_writei(snd_pcm_t *pcm, const void *buffer, snd_pcm_ufr }
/** + * \brief Variant of snd_pcm_writei for multithreaded applications + * \param mutex Mutex already locked by caller + * + * In a multithreaded application, there is uncertainty inherent in calling + * snd_pcm_delay while another thread is calling and_pcm_writei, since there is + * no way to know whether the data passed to snd_pcm_writei has yet been + * accounted for in the delay calculation. This function provides a solution to + * that problem by unlocking the mutex when it is safe to call snd_pcm_delay. + * The mutex is locked again before the function returns. An application using + * this function should be careful to avoid deadlocks. + * + * Programmers should check that the macro SND_PCM_HAVE_LOCKED_WRITE is defined + * before using this function. + */ + +snd_pcm_sframes_t snd_pcm_writei_locked (snd_pcm_t * pcm, const void * buffer, + snd_pcm_uframes_t size, pthread_mutex_t * mutex) +{ + snd_pcm_sframes_t result; + + assert (pcm && pcm->fast_op_arg && ! pcm->fast_op_arg->write_mutex); + pcm->fast_op_arg->write_mutex = mutex; + result = snd_pcm_writei (pcm, buffer, size); + pcm->fast_op_arg->write_mutex = NULL; + return result; +} + +/** * \brief Write non interleaved frames to a PCM * \param pcm PCM handle * \param bufs frames containing buffers (one for each channel) @@ -1280,6 +1317,25 @@ snd_pcm_sframes_t snd_pcm_writen(snd_pcm_t *pcm, void **bufs, snd_pcm_uframes_t }
/** + * \brief Variant of snd_pcm_writen for multithreaded applications + * \param Mutex already locked by caller + * + * (See snd_pcm_writei_locked for a full description.) + */ + +snd_pcm_sframes_t snd_pcm_writen_locked (snd_pcm_t * pcm, void * * bufs, + snd_pcm_uframes_t size, pthread_mutex_t * mutex) +{ + snd_pcm_sframes_t result; + + assert (pcm && pcm->fast_op_arg && ! pcm->fast_op_arg->write_mutex); + pcm->fast_op_arg->write_mutex = mutex; + result = snd_pcm_writen (pcm, bufs, size); + pcm->fast_op_arg->write_mutex = NULL; + return result; +} + +/** * \brief Read interleaved frames from a PCM * \param pcm PCM handle * \param buffer frames containing buffer @@ -2480,6 +2536,7 @@ int snd_pcm_avail_delay(snd_pcm_t *pcm, sf = pcm->fast_ops->delay(pcm->fast_op_arg, delayp); if (sf < 0) return (int)sf; + * delayp += pcm->fast_op_arg->write_delay; sf = pcm->fast_ops->avail_update(pcm->fast_op_arg); if (sf < 0) return (int)sf; @@ -6652,7 +6709,18 @@ snd_pcm_sframes_t snd_pcm_write_areas(snd_pcm_t *pcm, const snd_pcm_channel_area goto _end; }
- err = snd_pcm_wait(pcm, -1); + if (pcm->write_mutex) { + pcm->write_delay = size; + assert (! pthread_mutex_unlock (pcm->write_mutex)); + } + + err = snd_pcm_wait(pcm, -1); + + if (pcm->write_mutex) { + assert (! pthread_mutex_lock (pcm->write_mutex)); + pcm->write_delay = 0; + } + if (err < 0) break; goto _again; diff --git a/src/pcm/pcm_hw.c b/src/pcm/pcm_hw.c index 9d243d5..6a428d8 100644 --- a/src/pcm/pcm_hw.c +++ b/src/pcm/pcm_hw.c @@ -753,7 +753,8 @@ static int snd_pcm_hw_unlink(snd_pcm_t *pcm) return 0; }
-static snd_pcm_sframes_t snd_pcm_hw_writei(snd_pcm_t *pcm, const void *buffer, snd_pcm_uframes_t size) +static snd_pcm_sframes_t writei_real (snd_pcm_t * pcm, const void * buffer, + snd_pcm_uframes_t size) { int err; snd_pcm_hw_t *hw = pcm->private_data; @@ -772,6 +773,42 @@ static snd_pcm_sframes_t snd_pcm_hw_writei(snd_pcm_t *pcm, const void *buffer, s return xferi.result; }
+static snd_pcm_sframes_t snd_pcm_hw_writei (snd_pcm_t * pcm, const void * + buffer, snd_pcm_uframes_t size) +{ + snd_pcm_sframes_t remain = size; + snd_pcm_sframes_t result; + + while (1) { + if ((result = snd_pcm_hw_avail_update (pcm)) < 0) + return result; + if ((result = writei_real (pcm, buffer, result < remain ? result : + remain)) < 0) + return result; + + buffer = (const char *) buffer + snd_pcm_frames_to_bytes (pcm, result); + remain -= result; + + if (! remain) + return size; + + if (pcm->write_mutex) { + pcm->write_delay = remain; + assert (! pthread_mutex_unlock (pcm->write_mutex)); + } + + result = snd_pcm_wait (pcm, -1); + + if (pcm->write_mutex) { + assert (! pthread_mutex_lock (pcm->write_mutex)); + pcm->write_delay = 0; + } + + if (result < 0) + return result; + } +} + static snd_pcm_sframes_t snd_pcm_hw_writen(snd_pcm_t *pcm, void **bufs, snd_pcm_uframes_t size) { int err; diff --git a/src/pcm/pcm_local.h b/src/pcm/pcm_local.h index 9aa81e1..53eef78 100644 --- a/src/pcm/pcm_local.h +++ b/src/pcm/pcm_local.h @@ -229,6 +229,9 @@ struct _snd_pcm { snd_pcm_t *fast_op_arg; void *private_data; struct list_head async_handlers; + snd_pcm_uframes_t write_delay; /* size of data passed to snd_pcm_write* + * but not yet outputted to ring buffer */ + pthread_mutex_t * write_mutex; /* mutex passed to snd_pcm_write*_locked */ };
/* make local functions really local */
John Lindgren wrote:
In a multi-threaded application it is possible for snd_pcm_delay or an equivalent function to be called by one thread while another is sitting in snd_pcm_writei.
Alsa-lib is not thread safe. In theory, you are not even allowed to call snd_pcm_delay while another function on the same PCM device has not yet returned.
Suppose a sound card has a maximum buffer size of 0.1 second. Then, suppose an application writes 0.5 second of audio. snd_pcm_writei will immediately write the first 0.1 second to the ring buffer, then will incrementally write the remaining 0.4 second as the buffer drains.
It will write a chunk whenever it is woken up by the driver, which is usually at period boundaries.
During this time, snd_pcm_delay will return a constant fill level of 0.1 second.
Depending on how big the granularity of the hardware's readable DMA pointer is, but on most hardware, it will fluctuate between a full buffer and one period less than that, not counting further fluctuations due to scheduling delays.
On the application side, the playback time counter will be calculated during this time as 0.5 seconds written to ALSA
This is wrong; as long as the write call has not returned, you do not know how much has been written (and when an error occurs, writing can stop before).
To keep track of the actual amount of data written, use non-blocking mode and in a loop, write as much as possible in one call, then update your write counter, then wait for some more free space in the buffer with poll().
Regards, Clemens
Thanks for your reply.
On Thu, 2010-06-03 at 08:40 +0200, Clemens Ladisch wrote:
John Lindgren wrote:
In a multi-threaded application it is possible for snd_pcm_delay or an equivalent function to be called by one thread while another is sitting in snd_pcm_writei.
Alsa-lib is not thread safe. In theory, you are not even allowed to call snd_pcm_delay while another function on the same PCM device has not yet returned.
ALSA has the following significant features: ... SMP and thread-safe design.
So, that's a big lie?
...
On the application side, the playback time counter will be calculated during this time as 0.5 seconds written to ALSA
This is wrong; as long as the write call has not returned, you do not know how much has been written (and when an error occurs, writing can stop before).
To keep track of the actual amount of data written, use non-blocking mode and in a loop, write as much as possible in one call, then update your write counter, then wait for some more free space in the buffer with poll().
Would it work to simply call snd_pcm_wait?
John Lindgren
John Lindgren wrote:
On Thu, 2010-06-03 at 08:40 +0200, Clemens Ladisch wrote:
Alsa-lib is not thread safe.
From http://alsa-project.org/main/index.php/Main_Page:
ALSA has the following significant features: ... SMP and thread-safe design.
So, that's a big lie?
That applies to the kernel code.
Most functions in alsa-lib must not be called at the same time on the same device handle. (Don't ask me where this is documented.)
... poll()
Would it work to simply call snd_pcm_wait?
Yes. (I usually suggest poll because the code that writes audio data often wants to be informed of some other event. If your writing loop doesn't need to be interrupted, snd_pcm_wait works just fine.)
Regards, Clemens
On Thu, 2010-06-03 at 16:48 +0200, Clemens Ladisch wrote:
That applies to the kernel code.
Most functions in alsa-lib must not be called at the same time on the same device handle. (Don't ask me where this is documented.)
Do you have a problem with patches that improve the current situation?
Would it work to simply call snd_pcm_wait?
Yes. (I usually suggest poll because the code that writes audio data often wants to be informed of some other event. If your writing loop doesn't need to be interrupted, snd_pcm_wait works just fine.)
It is permissible, then, to call snd_pcm_delay during a snd_pcm_wait call?
What would be the cleanest way to interrupt snd_pcm_wait when we need to stop the stream? Will snd_pcm_drop work?
John Lindgren
John Lindgren wrote:
I understand that snd_pcm_delay and snd_pcm_writei currently "interfere with each other" in that snd_pcm_delay returns wrong values if called during snd_pcm_writei. That is the problem my patch tries to correct.
These values are not wrong; the problem is that your program does not have enough information to interpret it correctly.
Do you have a problem with patches that improve the current situation?
A blocking write call is an abstraction on top of the underlying non-blocking writes. Using a blocking write implies that your program wants to write the entire block and does not care about how the timing of the chunks that are actually written.
I do not consider it an improvement if a patch adds complexity, if the same functionality is already available. Non-blocking mode was designed for the case where you want to know the actual amount of data at every point in time.
Would it work to simply call snd_pcm_wait?
Yes. (I usually suggest poll because the code that writes audio data often wants to be informed of some other event. If your writing loop doesn't need to be interrupted, snd_pcm_wait works just fine.)
It is permissible, then, to call snd_pcm_delay during a snd_pcm_wait call?
It will work in practice (snd_pcm_wait is just a wrapper around poll).
What would be the cleanest way to interrupt snd_pcm_wait when we need to stop the stream? Will snd_pcm_drop work?
*sigh* This will _not_ work.
You have to use poll so that you can send your loop a message (through a pipe/eventfd/whatever) that tells it to stop.
Regards, Clemens
On Thu, 2010-06-03 at 19:03 +0200, Clemens Ladisch wrote:
John Lindgren wrote:
I understand that snd_pcm_delay and snd_pcm_writei currently "interfere with each other" in that snd_pcm_delay returns wrong values if called during snd_pcm_writei. That is the problem my patch tries to correct.
These values are not wrong; the problem is that your program does not have enough information to interpret it correctly.
Same thing.
What would be the cleanest way to interrupt snd_pcm_wait when we need to stop the stream? Will snd_pcm_drop work?
*sigh* This will _not_ work.
You have to use poll so that you can send your loop a message (through a pipe/eventfd/whatever) that tells it to stop.
Ugh. It seems you are determined to avoid complexity in ALSA at the expense of complexity in programs that use it.
At any rate, at least I know what needs to be done now, so thanks for that.
John Lindgren
On 2 June 2010 22:29, John Lindgren john.lindgren@tds.net wrote:
Hi,
In a multi-threaded application it is possible for snd_pcm_delay or an equivalent function to be called by one thread while another is sitting in snd_pcm_writei. In this case, snd_pcm_delay does not take into account that there may not be enough space for all the data passed to snd_pcm_writei to be written to the ring buffer at once, and will return incorrect values.
I believe the definition of snd_pcm_delay() is it returns a value that would be fairly accurate at this instance of time. If you followed the snd_pcm_delay() with a snd_pcm_writei(), the samples would reach the speaker "delay" time later.
I think it would be fair to say that the value of snd_pcm_delay() is undefined if called during a snd_pcm_writei() call, because you will get a return value from snd_pcm_delay() but you will have no idea how many samples of the current snd_pcm_writei() have been written and thus no idea what the delay will be on the next snd_pcm_writei().
Even in multi-threaded applications calling two functions at the same time that interfere with each other it not good. If a function is re-entrant it is generally considered to be thread safe. It does not necessarily mean that it is sensible to call two different functions at the same time.
On Thu, 2010-06-03 at 15:40 +0100, James Courtier-Dutton wrote:
On 2 June 2010 22:29, John Lindgren john.lindgren@tds.net wrote:
Hi,
In a multi-threaded application it is possible for snd_pcm_delay or an equivalent function to be called by one thread while another is sitting in snd_pcm_writei. In this case, snd_pcm_delay does not take into account that there may not be enough space for all the data passed to snd_pcm_writei to be written to the ring buffer at once, and will return incorrect values.
I believe the definition of snd_pcm_delay() is it returns a value that would be fairly accurate at this instance of time. If you followed the snd_pcm_delay() with a snd_pcm_writei(), the samples would reach the speaker "delay" time later.
I think it would be fair to say that the value of snd_pcm_delay() is undefined if called during a snd_pcm_writei() call, because you will get a return value from snd_pcm_delay() but you will have no idea how many samples of the current snd_pcm_writei() have been written and thus no idea what the delay will be on the next snd_pcm_writei().
Even in multi-threaded applications calling two functions at the same time that interfere with each other it not good. If a function is re-entrant it is generally considered to be thread safe. It does not necessarily mean that it is sensible to call two different functions at the same time.
I understand that snd_pcm_delay and snd_pcm_writei currently "interfere with each other" in that snd_pcm_delay returns wrong values if called during snd_pcm_writei. That is the problem my patch tries to correct. Do you disagree with this improvement? If so, why?
John Lindgren
On 3 June 2010 17:10, John Lindgren john.lindgren@tds.net wrote:
I understand that snd_pcm_delay and snd_pcm_writei currently "interfere with each other" in that snd_pcm_delay returns wrong values if called during snd_pcm_writei. That is the problem my patch tries to correct. Do you disagree with this improvement? If so, why?
I disagree because I believe the changes are unnecessary. The alsa api is big enough already, so adding anything extra must have an extremely good reason for it. snd_pcm_writei and snd_pcm_delay were always intended to be used in the same thread. Calling snd_pcm_writei and snd_pcm_delay at the same time is non-deterministic so fairly pointless to do. Adding more locks just makes another hit on performances so should be avoided. Locks really kill performance on a multi processor SMP machine.
I would argue that modifying your program to fit in the the above contraints is the way to go.
On Thu, 2010-06-03 at 17:34 +0100, James Courtier-Dutton wrote:
On 3 June 2010 17:10, John Lindgren john.lindgren@tds.net wrote:
I understand that snd_pcm_delay and snd_pcm_writei currently "interfere with each other" in that snd_pcm_delay returns wrong values if called during snd_pcm_writei. That is the problem my patch tries to correct. Do you disagree with this improvement? If so, why?
I disagree because I believe the changes are unnecessary. The alsa api is big enough already, so adding anything extra must have an extremely good reason for it.
I thought there was good enough reason, but I won't argue the point.
snd_pcm_writei and snd_pcm_delay were always intended to be used in the same thread. Calling snd_pcm_writei and snd_pcm_delay at the same time is non-deterministic so fairly pointless to do.
All you're saying is that because it currently doesn't work, it's pointless to make it work.
Adding more locks just makes another hit on performances so should be avoided. Locks really kill performance on a multi processor SMP machine.
The locks have to be there somewhere; it's just a question of whether that will be on the application side or the ALSA side. We have an interface running in one thread, and an audio decoder running in another. As long as output time is calculated from written time + delay, there must be some thread interaction to ensure that those values are in sync.
I would argue that modifying your program to fit in the the above contraints is the way to go.
That will be ugly, but it seems that is what I will have to do.
John Lindgren
On Wed, Jun 2, 2010 at 2:29 PM, John Lindgren john.lindgren@tds.net wrote:
In a multi-threaded application it is possible for snd_pcm_delay or an equivalent function to be called by one thread while another is sitting in snd_pcm_writei. In this case, snd_pcm_delay does not take into account that there may not be enough space for all the data passed to snd_pcm_writei to be written to the ring buffer at once, and will return incorrect values.
Hi. I've been getting "pcm_hw.c: snd_pcm_hw_delay() SNDRV_PCM_IOCTL_DELAY failed." in my xine log and after some talks with one of the devs, he suggested that alsa was not getting data fast enough in some cases (iirc). Could you please email me a copy of your patch (so I don't have to hand-patch) to try? Hopefully it will fix this damn problem once and for all.
Thanks, Derek
On Thu, 2010-06-03 at 10:40 -0700, VDR User wrote:
On Wed, Jun 2, 2010 at 2:29 PM, John Lindgren john.lindgren@tds.net wrote:
In a multi-threaded application it is possible for snd_pcm_delay or an equivalent function to be called by one thread while another is sitting in snd_pcm_writei. In this case, snd_pcm_delay does not take into account that there may not be enough space for all the data passed to snd_pcm_writei to be written to the ring buffer at once, and will return incorrect values.
Hi. I've been getting "pcm_hw.c: snd_pcm_hw_delay() SNDRV_PCM_IOCTL_DELAY failed." in my xine log and after some talks with one of the devs, he suggested that alsa was not getting data fast enough in some cases (iirc). Could you please email me a copy of your patch (so I don't have to hand-patch) to try? Hopefully it will fix this damn problem once and for all.
I don't think my patch has anything to do with that problem, but I suppose you can try it.
John Lindgren
VDR User wrote:
I've been getting "pcm_hw.c: snd_pcm_hw_delay() SNDRV_PCM_IOCTL_DELAY failed." in my xine log
The actual error (code) is missing.
and after some talks with one of the devs, he suggested that alsa was not getting data fast enough in some cases (iirc).
That would be an underrun. This has nothing to do with the interpretation of snd_pcm_delay's return value; this function just appears in the log because it happend to be the first one that detected this error condition. An underrung cannot be prevented by any change in alsa-lib; xine needs to use a larger buffer, or the scheduling needs to be improved so that some other thread or program doesn't prevent xine from being executed.
Regards, Clemens
participants (4)
-
Clemens Ladisch
-
James Courtier-Dutton
-
John Lindgren
-
VDR User