[alsa-devel] bug in alsa-lib/snd_pcm_plugin_delay for capture?
Since the PulseAudio commit 29acfd0e0413a9bd126782763ee2dcf10357546 I am seeing errors messages telling me that the information reported by snd_pcm_avail_delay is not accurate on the capture path, more specifically the delay is lower than the available samples. I spent a little bit of time to find the root cause, and I believe this is a problem with alsa-lib plugins. I never see any errors when using hw devices. Conversely I always have these errors when using the front: plugin (the pulseaudio default). When I looked at the code, I saw that snd_pcm_plugin_delay and snd_pcm_plugin_avail_update don't really match. There is additional code in snd_pcm_plugin_avail_update for the capture case that is not present when in the delay code, and that causes the delta between the values reported. Since I don't understand this code that has all kinds of mmap_commit/forward/undos, I have to ask higher powers for advice. Jaroslav, could you take a look at the code and let me know if my analysis is correct? Thanks, -Pierre
On Tue, 16 Nov 2010, pl bossart wrote:
Since the PulseAudio commit 29acfd0e0413a9bd126782763ee2dcf10357546 I am seeing errors messages telling me that the information reported by snd_pcm_avail_delay is not accurate on the capture path, more specifically the delay is lower than the available samples.
Could you test patch bellow?
http://git.alsa-project.org/?p=alsa-lib.git;a=commitdiff;h=ba9332e9192814a54...
Jaroslav
----- Jaroslav Kysela perex@perex.cz Linux Kernel Sound Maintainer ALSA Project, Red Hat, Inc.
Since the PulseAudio commit 29acfd0e0413a9bd126782763ee2dcf10357546 I am seeing errors messages telling me that the information reported by snd_pcm_avail_delay is not accurate on the capture path, more specifically the delay is lower than the available samples.
Could you test patch bellow?
http://git.alsa-project.org/?p=alsa-lib.git;a=commitdiff;h=ba9332e9192814a54...
Nah, no luck. PulseAudio still reports an error and if I add assert(*delayp >= *availp) the code bombs out. Here is one way to solve the problem. I don't claim it's the right one, but it's compatible with the way the availp value is computed, and it makes the error message go away in PulseAudio.
While I was at it I also removed the .client_frames and .slave_frames methods that no one seems to use. This seems to be dead code that hasn't been touched in years. -Pierre
On Mon, 22 Nov 2010, pl bossart wrote:
Since the PulseAudio commit 29acfd0e0413a9bd126782763ee2dcf10357546 I am seeing errors messages telling me that the information reported by snd_pcm_avail_delay is not accurate on the capture path, more specifically the delay is lower than the available samples.
Could you test patch bellow?
http://git.alsa-project.org/?p=alsa-lib.git;a=commitdiff;h=ba9332e9192814a54...
Nah, no luck. PulseAudio still reports an error and if I add assert(*delayp >= *availp) the code bombs out. Here is one way to solve the problem. I don't claim it's the right one, but it's compatible with the way the availp value is computed, and it makes the error message go away in PulseAudio.
The double avail_update() calling seems too expensive to me. Could you test to use snd_pcm_mmap_capture_avail() instead snd_pcm_avail_update() calls (to assign both slave_avail and plugin_avail local variables)?
Also, remove the assert. I think that it may be raised when the stream is in some xrun state (with xrun checks disabled). The avail_update can return the negative values in this specific configuration.
While I was at it I also removed the .client_frames and .slave_frames methods that no one seems to use. This seems to be dead code that hasn't been touched in years.
Good catch. This code was used with the really old rate plugin (this plugin was recoded completely). I applied this patch.
Jaroslav
----- Jaroslav Kysela perex@perex.cz Linux Kernel Sound Maintainer ALSA Project, Red Hat, Inc.
Could you test patch bellow?
http://git.alsa-project.org/?p=alsa-lib.git;a=commitdiff;h=ba9332e9192814a54...
Nah, no luck. PulseAudio still reports an error and if I add assert(*delayp >= *availp) the code bombs out. Here is one way to solve the problem. I don't claim it's the right one, but it's compatible with the way the availp value is computed, and it makes the error message go away in PulseAudio.
The double avail_update() calling seems too expensive to me. Could you test to use snd_pcm_mmap_capture_avail() instead snd_pcm_avail_update() calls (to assign both slave_avail and plugin_avail local variables)?
Nope, it still does not work if you use mmap_capture_avail. The root cause of the problem is the while loop in snd_pcm_plugin_avail_update (copied for reference below), it's called only in the capture case and the value returned by the xfer variable causes the delay to be eventually lower than the avail samples. My fix is indeed expensive but it makes sure the same loop is called twice, so that these two delay and avail values are consistent. I don't know how to fix it in a different manner since I don't understand this while loop in the first place. Maybe we need to refactor this loop so that it's called in the delay() routine as well? -Pierre
while (size > 0 && slave_size > 0) { snd_pcm_uframes_t frames = size; snd_pcm_uframes_t cont = pcm->buffer_size - hw_offset; const snd_pcm_channel_area_t *slave_areas; snd_pcm_uframes_t slave_offset; snd_pcm_uframes_t slave_frames = ULONG_MAX; snd_pcm_sframes_t result; int err;
err = snd_pcm_mmap_begin(slave, &slave_areas, &slave_offset, &slave_frames); if (err < 0) return xfer > 0 ? (snd_pcm_sframes_t)xfer : err; if (frames > cont) frames = cont; frames = (plugin->read)(pcm, areas, hw_offset, frames, slave_areas, slave_offset, &slave_frames); snd_atomic_write_begin(&plugin->watom); snd_pcm_mmap_hw_forward(pcm, frames); result = snd_pcm_mmap_commit(slave, slave_offset, slave_frames); snd_atomic_write_end(&plugin->watom); if (result > 0 && (snd_pcm_uframes_t)result != slave_frames) { snd_pcm_sframes_t res; res = plugin->undo_read(slave, areas, hw_offset, frames, slave_frames - result); if (res < 0) return xfer > 0 ? (snd_pcm_sframes_t)xfer : res; frames -= res; } if (result <= 0) return xfer > 0 ? (snd_pcm_sframes_t)xfer : result; if (frames == cont) hw_offset = 0; else hw_offset += frames; size -= frames; slave_size -= slave_frames; xfer += frames; } return (snd_pcm_sframes_t)xfer;
On Mon, 22 Nov 2010, pl bossart wrote:
Could you test patch bellow?
http://git.alsa-project.org/?p=alsa-lib.git;a=commitdiff;h=ba9332e9192814a54...
Nah, no luck. PulseAudio still reports an error and if I add assert(*delayp >= *availp) the code bombs out. Here is one way to solve the problem. I don't claim it's the right one, but it's compatible with the way the availp value is computed, and it makes the error message go away in PulseAudio.
The double avail_update() calling seems too expensive to me. Could you test to use snd_pcm_mmap_capture_avail() instead snd_pcm_avail_update() calls (to assign both slave_avail and plugin_avail local variables)?
Nope, it still does not work if you use mmap_capture_avail. The root cause of the problem is the while loop in snd_pcm_plugin_avail_update (copied for reference below), it's called only in the capture case and the value returned by the xfer variable causes the delay to be eventually lower than the avail samples.
After rethinking, I believe that the right fix should be to add the 'snd_pcm_mmap_capture_avail(pcm)' to the sd variable - delay(slave) - for the capture direction. You should not bother with avail_update or the slave avail values. The avail_update() capture loop just copied samples from the producer (slave) to the consumer (pcm). It means that the avail becomes lower for the producer (slave) and is increased for the consumer (pcm). The delay(slave) already contains the avail(slave) value.
So, the result should be 'delay(slave) + mmap_capture_avail(pcm)'.
Jaroslav
----- Jaroslav Kysela perex@perex.cz Linux Kernel Sound Maintainer ALSA Project, Red Hat, Inc.
After rethinking, I believe that the right fix should be to add the 'snd_pcm_mmap_capture_avail(pcm)' to the sd variable - delay(slave) - for the capture direction. You should not bother with avail_update or the slave avail values. The avail_update() capture loop just copied samples from the producer (slave) to the consumer (pcm). It means that the avail becomes lower for the producer (slave) and is increased for the consumer (pcm). The delay(slave) already contains the avail(slave) value.
So, the result should be 'delay(slave) + mmap_capture_avail(pcm)'.
The attached patch implements your recoomendation and makes error messages go away in PulseAudio. It'd good to know if it improves the perfomance of echo cancellation. Wim, can you try it out? Thanks, -Pierre
On Tue, 23 Nov 2010, pl bossart wrote:
After rethinking, I believe that the right fix should be to add the 'snd_pcm_mmap_capture_avail(pcm)' to the sd variable - delay(slave) - for the capture direction. You should not bother with avail_update or the slave avail values. The avail_update() capture loop just copied samples from the producer (slave) to the consumer (pcm). It means that the avail becomes lower for the producer (slave) and is increased for the consumer (pcm). The delay(slave) already contains the avail(slave) value.
So, the result should be 'delay(slave) + mmap_capture_avail(pcm)'.
The attached patch implements your recoomendation and makes error messages go away in PulseAudio.
I applied your patch to the alsa-lib repo. Thank you for discovering and testing.
Jaroslav
----- Jaroslav Kysela perex@perex.cz Linux Kernel Sound Maintainer ALSA Project, Red Hat, Inc.
On Tue, 2010-11-23 at 08:55 -0600, pl bossart wrote:
After rethinking, I believe that the right fix should be to add the 'snd_pcm_mmap_capture_avail(pcm)' to the sd variable - delay(slave) - for the capture direction. You should not bother with avail_update or the slave avail values. The avail_update() capture loop just copied samples from the producer (slave) to the consumer (pcm). It means that the avail becomes lower for the producer (slave) and is increased for the consumer (pcm). The delay(slave) already contains the avail(slave) value.
So, the result should be 'delay(slave) + mmap_capture_avail(pcm)'.
The attached patch implements your recoomendation and makes error messages go away in PulseAudio. It'd good to know if it improves the perfomance of echo cancellation. Wim, can you try it out?
I asked Arun to try it, will let you know if it actually improves things enough to celebrate. The current next plan to improve AEC between different devices is to first do further smoothing of the rate estimation in pa_smoother. and then make the sink (or source) do resampling to compensate (when we manage to get a stable resampling factor).
Wim
Thanks, -Pierre
2010/11/22 Jaroslav Kysela perex@perex.cz
On Mon, 22 Nov 2010, pl bossart wrote:
Since the PulseAudio commit 29acfd0e0413a9bd126782763ee2dcf10357546 I am seeing errors messages telling me that the information reported by snd_pcm_avail_delay is not accurate on the capture path, more specifically the delay is lower than the available samples.
Could you test patch bellow?
http://git.alsa-project.org/?p=alsa-lib.git;a=commitdiff;h=ba9332e9192814a54...
Nah, no luck. PulseAudio still reports an error and if I add assert(*delayp >= *availp) the code bombs out. Here is one way to solve the problem. I don't claim it's the right one, but it's compatible with the way the availp value is computed, and it makes the error message go away in PulseAudio.
The double avail_update() calling seems too expensive to me. Could you test to use snd_pcm_mmap_capture_avail() instead snd_pcm_avail_update() calls (to assign both slave_avail and plugin_avail local variables)?
Also, remove the assert. I think that it may be raised when the stream is in some xrun state (with xrun checks disabled). The avail_update can return the negative values in this specific configuration.
The point is the documentation does not mention that the state still remain as SND_PCM_STATE_RUNNING even when the distance between the application ptr and the hw_ptr is more than a buffer size , it only mention the device will do the endless loop.
The only way to know overrun/underrun is to check the avail is larger than the buffer size when the application set the stop threshold to boundary instead of buffer size.
so it is incorrect for PA to say this is a driver bug since it disable the automatic stop when underrun/overrun occur
e.g. dmix also set stop threshold to boundary but aplay can still get xrun when using small buffer size with plug:dmix
if you set the boudnary to buffer size instead of boundary in alsa-time-test.c , alsa-lib will tell you that the program is underrun/overrun
http://www.alsa-project.org/alsa-doc/alsa-lib/group___p_c_m___s_w___params.h...
PCM is automatically stopped in SND_PCM_STATE_XRUN state when available frames is >= threshold. If the stop threshold is equal to boundary (also software parameter - sw_param) then automatic stop will be disabled (thus device will do the endless loop in the ring buffer).
participants (4)
-
Jaroslav Kysela
-
pl bossart
-
Raymond Yau
-
Wim Taymans