[alsa-devel] snd_pcm_delay, hw buffers and driver api (v2)
Hi,
As I got at least one reply, I'm reposting this in a new thread to reach more people. Originally sent to the alsa-devel thread "[PATCH 11/20] OMAP: McBSP: Add link DMA mode selection": http://kerneltrap.org/mailarchive/alsa-devel/2009/7/30/6272573
---
So this is not strictly related to the patch that started the original thread, but it touches on, and is a good example of, one question I've been wondering about for some time now as an app developer. Could Takashi, Jaroslav, Mark, or others comment on this as well, perhaps? Dropping linux-omap as this is a generic ALSA question.
On Wed, 12 Aug 2009, Jarkko Nikula wrote: [i.e. pcm_pointer == hw_ptr]
The threshold based transfer will cause omap_pcm_pointer to lose a bit of its accuracy. Probably irrelevant but still better to play safe at least over one kernel release before making it default.
[...]
element: 614 669 691
[...]
threshold: 512 512 1024 1024
In both cases, this value (hw_ptr) shows the status of sending data to a separate entity, and one that has its own buffer (multiple ms) before the actual DAC/ADC.
So when an application developer uses snd_pcm_delay(), the result is not quite what the ALSA API says:
alsa-lib.git/src/pcm/pcm.c:
--cut--
 * For playback the delay is defined as the time that a frame that is written
 * to the PCM stream shortly after this call will take to be actually
 * audible. It is as such the overall latency from the write call to the final
 * DAC.
[...]
int snd_pcm_delay(snd_pcm_t *pcm, snd_pcm_sframes_t *delayp)
--cut--
Of course, there are many other cases where the same happens, as the latency of the HW (or where it is connected) is not known. But we also have many cases where we _do_ know more about the latency, and we are just missing the means to expose this info. I think this applies to this OMAP patch as well. E.g. the HW can tell fairly accurately the current status (i.e. which sample is played-out/captured _now_). Now that brings us to my question -- how to expose this info in an ALSA driver.
Elsewhere in the alsa-lib docs, there is a more accurate description of the current implementation and behaviour of snd_pcm_delay():
alsa-lib.git/src/pcm/pcm.c:
--cut--
The function #snd_pcm_delay() returns the delay in samples. For playback, it means count of samples in the ring buffer before the next sample will be sent to DAC.
--cut--
So e.g. the OMAP patch that this thread started from, complies with this definition, and element/threshold modes just differ in accuracy (although threshold implementation should warn the application with the SNDRV_PCM_INFO_BATCH flag).
But how would a driver expose both bits of info to apps (in a standard fashion): 1) the hw_ptr for shuffling bits out of the ringbuffer, and 2) the delay to next played-out/captured sample. For application developers, (2) is the piece of info we are mostly interested in (if I write a sample now, how long it will be until it's played out).
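To make (2) concrete, here is a minimal sketch of the arithmetic an application would do for a/v sync if the delay beyond the ring buffer were known. The function names and the ext_delay_frames parameter are hypothetical and not part of alsa-lib; the sketch only illustrates why both numbers matter.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a frame count to microseconds at the given sample rate. */
int64_t frames_to_usec(int64_t frames, unsigned int rate_hz)
{
    assert(rate_hz > 0);
    return frames * INT64_C(1000000) / (int64_t)rate_hz;
}

/*
 * Estimated time at which a frame written now becomes audible.
 * ring_delay_frames: frames still queued in the ring buffer
 *                    (what hw_ptr alone can tell us)
 * ext_delay_frames:  frames buffered past the ring buffer
 *                    (hypothetical input, assumed to be known)
 */
int64_t playout_time_usec(int64_t now_usec, int64_t ring_delay_frames,
                          int64_t ext_delay_frames, unsigned int rate_hz)
{
    return now_usec +
           frames_to_usec(ring_delay_frames + ext_delay_frames, rate_hz);
}
```

With 4800 frames in the ring buffer at 48 kHz, ignoring an external buffer of 480 frames would put the a/v sync estimate off by 10 ms.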
One obvious solution (that has been used already in other ALSA drivers) is to virtualize the hw_ptr, but this way you lose the accurate info about ringbuffer status. Ideally both pieces of info could be exposed. But as I see it, drivers currently can only implement "pointer" to relay this info.
Jaroslav proposed some ideas earlier this year, but then the discussion faded: "Re: Driver code with mpc5200 pointer problem.", 2009-04-28 http://article.gmane.org/gmane.linux.alsa.devel/62170
Takashi made basically the same point I've tried to make above: "Re: Driver code with mpc5200 pointer problem.", 2009-04-28 http://article.gmane.org/gmane.linux.alsa.devel/62172
Mark also commented to the same thread: http://article.gmane.org/gmane.linux.alsa.devel/62176, 2009-04-28
Any update/ideas on this topic? Personally I think adding a new driver callback would make most sense, as that would allow to take full benefit from hardware that allows to query sample-accurate position of playout (i.e. not just support exposing a fixed latency). Of course that's potentially a big change. In alsa-lib, snd_pcm_hwsync() could call this driver callback, and a new variant of snd_pcm_delay() could present the information to the applications.
Hi,
[ continuing to restart the thread. Here's a reply to Mark's mail to the original misplaced thread at: http://kerneltrap.org/mailarchive/alsa-devel/2009/7/30/6272573 ]
On Thu, 13 Aug 2009, Mark Brown wrote:
On Thu, Aug 13, 2009 at 11:46:52PM +0300, Kai Vehmanen wrote:
But how would a driver expose both bits of info to apps (in a standard fashion): 1) the hw_ptr for shuffling bits out of the ringbuffer, and 2) the delay to next played-out/captured sample. For application developers, (2) is the piece of info we are mostly interested in (if I write a sample now, how long it will be until it's played out).
Both bits of information are very interesting to applications - some applications want to work on the data they're sending for as long as possible (things like pulse which do mixing are the obvious example of this).
For sure, that's true. I meant to say that when using snd_pcm_delay(), (2) is what applications are mostly interested in. But yeah, snd_pcm_avail()/snd_pcm_avail_delay() are certainly important to applications as well. I initially ignored this part, as it would seem this part of the API is already in good shape (especially after the recent updates like snd_pcm_avail_delay()).
Any update/ideas on this topic? Personally I think adding a new driver callback would make most sense, as that would allow to take full benefit from hardware that allows to query sample-accurate position of playout (i.e. not just support exposing a fixed latency). Of course that's potentially a big change. In alsa-lib, snd_pcm_hwsync() could call this driver callback, and a new variant of snd_pcm_delay() could present the information to the applications.
If we're adding a new API it should be possible to add one which provides the required information without disturbing the semantics of the existing APIs.
For sure. Changes to e.g. snd_pcm_delay() semantics (especially with current set of drivers) would cause no end of nasty bugs, agreed. Probably reusing snd_pcm_hwsync() is a bad idea as well. Maybe just add a new snd_pcm_hwdelay() or some such which calls the new driver callback (-> provides an atomic snapshot of snd_pcm_avail_delay + fill-status of ext-hw buffers).
2009/8/13 Kai Vehmanen <kvehmanen@eca.cx>:
Hi,
Any update/ideas on this topic? Personally I think adding a new driver callback would make most sense, as that would allow to take full benefit from hardware that allows to query sample-accurate position of playout (i.e. not just support exposing a fixed latency). Of course that's potentially a big change. In alsa-lib, snd_pcm_hwsync() could call this driver callback, and a new variant of snd_pcm_delay() could present the information to the applications.
snd_pcm_delay() does not just support exposing a fixed latency. The value returned is dynamic.
In the current ALSA implementation, the value returns a real time count of how many samples are already in the hardware buffer. So, if one makes a call to snd_pcm_delay(), waits for a period of a few samples, the new value returned in snd_pcm_delay() is going to be different, but only for sound cards that support this level of accuracy.
So, I think the snd_pcm_delay() function is already doing what you want. It was me who originally requested the snd_pcm_delay() function to be introduced into the API, for the purposes of audio/video sync in the media player xine.
Kind Regards
James
Hi,
On Fri, 14 Aug 2009, James Courtier-Dutton wrote:
snd_pcm_delay() does not just support exposing a fixed latency. The value returned is dynamic.
yes, I'm aware of that, and I was in fact quoting the documentation in my original mail, which indeed states that (although there are two descriptions, written with slightly different wording in alsa-lib pcm.c).
In the current ALSA implementation, the value returns a real time count of how many samples are already in the hardware buffer. So, if one makes a call to snd_pcm_delay(), waits for a period of a few samples, the new value returned in snd_pcm_delay() is going to be different, but only for sound cards that support this level of accuracy.
[...]
So, I think the snd_pcm_delay() function is already doing what you want. It was me who originally requested the snd_pcm_delay() function to be introduced into the API, for the purposes of audio/video sync in the media player xine.
Yes, acking that as well. It's just that we now start to have more drivers for HW, which include separate, fairly big, buffers of their own. I.e. something like the "URB buffering" of USB audio, but with potentially even larger buffers. My understanding is that so far these delays have not been exposed via ALSA and snd_pcm_delay(), although semantically these should be taken into consideration in the result of snd_pcm_delay() (as they directly affect e.g. a/v sync).
So if we have drivers that could expose this information, my question is how to do it with ALSA.
I've been looking at the ALSA code, and e.g. current implementation of snd_pcm_delay()->snd_pcm_hw_delay() in alsa-lib/src/pcm/pcm_hw.c, just returns the diff between 'hw_ptr' and 'appl_ptr'. The SNDRV_PCM_IOCTL_DELAY is called only for cases where the control structs cannot be mmap'ed, and the result is the same anyways. On the kernel side, drivers just implement the "pointer" method returning one value (-> hw_ptr). This tells the delay to last transfer from ringbuffer to/from HW, but not necessarily the full latency up until the codec.
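As a rough model of that pointer difference (a simplified sketch, not the alsa-lib source), the playback delay is the distance from hw_ptr to appl_ptr, where both are free-running frame counters that wrap at a boundary value. The function name is made up for illustration, and underruns (hw_ptr overtaking appl_ptr) are not modelled.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Playback delay as the distance from hw_ptr to appl_ptr.
 * Both pointers are frame counters in the range [0, boundary).
 */
int64_t ring_delay_frames(uint64_t appl_ptr, uint64_t hw_ptr,
                          uint64_t boundary)
{
    assert(boundary > 0 && appl_ptr < boundary && hw_ptr < boundary);
    int64_t delay = (int64_t)appl_ptr - (int64_t)hw_ptr;
    if (delay < 0)
        delay += (int64_t)boundary; /* appl_ptr wrapped, hw_ptr not yet */
    return delay;
}
```

Nothing in this calculation can account for samples that have left the ring buffer but have not yet reached the codec.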
Now one approach is full double-buffering, or virtualizing the 'hw_ptr' (this seems to be done in e.g. the cs46xx driver). This is certainly one way to do it, but it seems somewhat messy, and based on earlier discussion (see the archive links in my original post to this thread), at least some of you share this view. A potential problem arises, for instance, if an application wants to do late mixing of samples and resubmit the samples [hw_ptr+X,hw_ptr+X+Y] in the ringbuffer. With the virtualized hw_ptr (usb-audio,cs46xx), this range might already have been transferred to the HW, and thus application edits of that range will be discarded. You can warn applications about this by declaring SNDRV_PCM_INFO_BATCH/BLOCK_TRANSFER flags in the driver (as is done by usb-audio+cs46xx), but for apps, an accurate hw_ptr reflecting which parts of the ringbuffer have been consumed would be easier to handle.
Basically, transferring the bytes from the ALSA ringbuffer to a codec/dsp-memory, and actually playing those samples out of a DAC or digital interface, are two different things, and ideally the application could track both of these pieces of information. The former is important for i/o scheduling, the latter for e.g. a/v sync.
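The late-mixing problem can be sketched as a simple range check. This is only an illustration under assumed names: `prefetched` stands for the number of frames past the reported hw_ptr that the driver has already copied to the device, a quantity that applications cannot currently query. The range is treated as half-open, [hw_ptr+x, hw_ptr+x+y).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Can the application still rewrite frames [hw_ptr+x, hw_ptr+x+y)?
 * queued:     frames between the reported hw_ptr and appl_ptr
 * prefetched: frames past the reported hw_ptr already copied to the
 *             device (0 when hw_ptr tracks the ring buffer exactly)
 */
bool range_is_editable(int64_t x, int64_t y, int64_t queued,
                       int64_t prefetched)
{
    assert(x >= 0 && y >= 0 && queued >= 0 && prefetched >= 0);
    return x >= prefetched && x + y <= queued;
}
```

With an accurate hw_ptr, `prefetched` is zero and any range inside the queued data is safe to edit. With a virtualized hw_ptr the same edit may silently be lost.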
So one alternative would be to extend the driver interface, so that they could expose both pieces of info. The 'pointer' method would return ringbuffer/hw_ptr as currently, and a new interface would provide the "virtualized hw-ptr".
Then how to expose this to apps is a bit more tricky. Semantically just exposing this via snd_pcm_delay() would seem ok (as the info is needed for a/v sync), but it risks breaking existing apps (as the returned delay can be significantly higher than the overall ringbuffer size).
Alternatively, I'm missing some obvious and easy solution to this problem, but I'm hoping that in that case someone will point it out to me. :)
PS: One existing mechanism is snd_pcm_hw_params_get_fifo_size(). But this is a fixed value, and very few apps seem to be using it (e.g. to fine-tune their a/v sync in a portable manner).
Kai Vehmanen wrote:
... In both cases, this value (hw_ptr) shows the status of sending data to a separate entity, and one that has its own buffer (multiple ms) before the actual DAC/ADC.
So when an application developer uses snd_pcm_delay(), the result is not quite what the ALSA API says: ...
snd_pcm_avail() returns the free part of the buffer. snd_pcm_delay() returns the filled part of the buffer plus the additional delay imposed by the device.
The additional delay can be set by the driver in pcm_substream->runtime->delay. (Currently, only the USB audio driver bothers to set this.)
HTH Clemens
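The relationship Clemens describes can be modelled for playback as below. This is a simplified sketch, not kernel or alsa-lib code: the struct and function names are invented, and `extra_delay` merely stands in for the value a driver would store in runtime->delay.

```c
#include <assert.h>
#include <stdint.h>

struct playback_state {
    int64_t buffer_size;  /* ring buffer size in frames */
    int64_t filled;       /* frames queued between hw_ptr and appl_ptr */
    int64_t extra_delay;  /* stand-in for runtime->delay, in frames */
};

/* Free part of the ring buffer: what snd_pcm_avail() reports. */
int64_t model_avail(const struct playback_state *s)
{
    assert(s->filled >= 0 && s->filled <= s->buffer_size);
    return s->buffer_size - s->filled;
}

/* Filled part plus the device's additional delay: snd_pcm_delay(). */
int64_t model_delay(const struct playback_state *s)
{
    return s->filled + s->extra_delay;
}
```

Note that once the extra delay is included, the reported delay can exceed the ring buffer size, so avail + delay no longer has to equal buffer_size.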
Hi,
On Fri, 14 Aug 2009, Clemens Ladisch wrote:
snd_pcm_delay() returns the filled part of the buffer plus the additional delay imposed by the device.
[...]
The additional delay can be set by the driver in pcm_substream->runtime->delay. (Currently, only the USB audio driver bothers to set this.)
Ah, now that's interesting, and I think that's just the missing piece I'm looking for. I was looking at the 2.6.30.4 kernel, and couldn't find any traces of 'runtime->delay', but oh yes, the 2.6.31 tree indeed has it - yay! :)
It would seem this was added by Takashi with commit "ALSA: Add extra delay count in PCM" on 2009-05-05, so no wonder not too many drivers set it yet.
Hmm, but isn't it so that this extra delay is added properly to snd_pcm_delay() result, when 'sync_ptr_ioctl==false' property is set for the pcm (i.e. use SYNC_PTR ioctl instead of mmap control structures). Attached is an alsa-lib patch attempting to fix this (as sufficient info is not available in the mmap control block, always go via IOCTL_DELAY).
But otherwise this does look good and is just what I was looking for. When using snd_pcm_status(), an application can get all the information (avail, delay) in one go. Of course one limitation is that the runtime->delay updates are synced to driver events, not to the application calling snd_pcm_delay() (unlike the query of hw_ptr status via the pointer driver callback), but maybe this is not even needed in the end. Have to think about this some more...
Thanks Clemens, James and Mark for the quick replies!
participants (3): Clemens Ladisch, James Courtier-Dutton, Kai Vehmanen