[alsa-devel] How to pair Wine with ALSA? part1: intro & underruns (long)
Dear knowledgeable ALSA developers,
Wine's audio subsystem is currently being rewritten. That makes this a good time to reflect on what Wine needs from ALSA and how it should use the ALSA API. What combination of hw_params and sw_params makes sense? How should ALSA be fed with samples?
I've identified three use cases and several other topics to help think about the issues:
UC1 Intermittent Mouse Clicks
UC2 Video
UC3 Background Music

T1 underrun
T2 buffer and period size/time
T3 blocking or not
T4 mmap
Due to size, I'll cover T2-T4 in a later e-mail.
UC1 Intermittent Mouse Clicks
An app may use the winmm API as follows:
1. open the device
2. from time to time, e.g. on mouse clicks, send wave data.
The requirements are:
R1 Nothing (or silence) is to be played when the app submits no data.
R2 Once the app submits data, it should play ASAP.
That is the app's view on it. It seems to translate to silence_threshold=0 and silence_size=boundary, doesn't it?
Alternatively, Wine's internal periodic audio worker could just as well send silence samples when it receives nothing from the app. If so, I fear additional complexity with snd_pcm_rewind to meet R2, e.g. in the case where the app submits samples shortly after Wine decided to play some silence.
UC2 Video
I don't have much to say here because I basically do not know how apps synchronize audio with video. GetPosition returns "the stream position of the sample that is currently playing through the speakers". This calls for snd_pcm_delay, not avail_update.
An app may select a particular device for output (e.g. 5.1 may not be available with the default mixer).
R3 Wine must offer access to several sound cards if available.
R4 Try and find hw/sw_params that work with the "default" device as well as individual cards (hw:x or perhaps plughw:x).
mmdevapi has a notion of exclusive and shared modes. I expect shared mode to be requested with the "default" device (for mixing to work), and exclusive mode with others (e.g. 5.1 sound). OTOH, apps may also ask for the default device in exclusive mode, which is said to be granted if no other app is playing sound (much like I can grab hw:0 only when dmix has nothing to do).
UC3 Background Music
The app ought to send a continuous stream of samples, perhaps mixing sources itself (e.g. what games let the dsound API do).
R5 No samples are ever skipped. If late, play late. See R1/R2.
I've observed that "dmix"+"pulse" and the "hw:x" devices approach underruns radically differently. In case of a 10s underrun, the former silently skip over the next 10s worth of samples(!) -- if the app does not submit those *fast* to catch up, you'll hear no more sound! -- while the latter immediately write them to their HW buffer.
IOW, the streaming semantics of dmix and pulse match neither winmm's nor mmdevapi's semantics.
To counter that, I'm considering using:
    /* call avail_update prior to every write: */
    avail = snd_pcm_avail_update(pcm);
    if (avail > buffer_size)
        snd_pcm_reset(pcm);   /* skip over late samples */
                              /* should be equivalent to snd_pcm_forward(pcm, avail) */
    snd_pcm_writei(pcm, data, frames);
However, there's still a short underrun between reset() and write().
Topic T1 underrun
I argue that the MS APIs winmm and mmdevapi have almost no notion of underrun during playback. You derive it indirectly at best ("buffer is empty", "all headers returned", "speaker position = amount of samples sent").
R6 GetPosition only counts submitted samples. Silence played during underruns must not count and is not reported to the app.
This seems unlike snd_pcm_delay. I've observed snd_pcm_delay grow towards minus infinity during underruns. How to nevertheless make use of the ALSA API to compute that?
I'm thinking about
    err = snd_pcm_avail_delay(pcm, &avail, &delay);
    if (err < 0)
        ...?
    if (avail > buffer_size)    /* underrun => everything was played */
        position = sum_of_submitted_samples;
    else
        position = some_function_of(delay, submitted_samples);
What do you recommend?
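One plausible candidate for the open "some function of delay" question can be sketched in plain C. This is only an illustration, not something the thread settles on: play_position() is a hypothetical helper, and it assumes that the delay value comes from snd_pcm_delay() or snd_pcm_avail_delay() and goes negative during an underrun, as observed above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of ALSA or Wine.
 * submitted: total number of frames the app has handed over so far.
 * delay:     frames still queued, as reported by snd_pcm_delay();
 *            assumed to be <= 0 once the device has run past the data. */
static uint64_t play_position(uint64_t submitted, int64_t delay)
{
    if (delay <= 0)
        return submitted;            /* underrun: everything submitted was played */
    if ((uint64_t)delay >= submitted)
        return 0;                    /* nothing has reached the speakers yet */
    return submitted - (uint64_t)delay;
}
```

Clamping to the range [0, submitted] keeps silence played during an underrun out of the reported position, which is what the GetPosition requirement asks for.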
Thank you for your help and for reading up to this point,
Jörg Höhle
Joerg-Cyril.Hoehle@t-systems.com wrote:
> UC1 Intermittent Mouse Clicks
> An app may use the winmm API as follows:
> - open the device
> - from time to time, e.g. on mouse clicks, send wave data.
> The requirements are:
> R1 Nothing (or silence) is to be played when the app submits no data.
> R2 Once the app submits data, it should play ASAP.
> That is the app's view on it. It seems to translate to silence_threshold=0 and silence_size=boundary, doesn't it?
This translates to stop_threshold=buffer_size and start_threshold=1. (Please note that a very small start_threshold increases the possibility of an underrun because it becomes more likely that the sound card immediately reads all valid samples.)
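As a rough sketch of the two configurations under discussion, the values can be written down as plain data. Everything here is illustrative: struct sw_settings and both functions are hypothetical names, the values would have to be passed to the corresponding snd_pcm_sw_params_set_*() calls, and using stop_threshold=boundary to keep the stream running through an xrun is an assumption based on common ALSA usage rather than something stated in this thread.

```c
#include <assert.h>

/* Hypothetical container for software parameters, all in frames. */
struct sw_settings {
    unsigned long start_threshold;
    unsigned long stop_threshold;
    unsigned long silence_threshold;
    unsigned long silence_size;
};

/* Stop-on-xrun setup with the values suggested in the reply above. */
static struct sw_settings stop_on_xrun(unsigned long buffer_size)
{
    assert(buffer_size > 0);
    struct sw_settings s = { 1, buffer_size, 0, 0 };
    return s;
}

/* Free-running setup with forced silence, using the silence values
 * proposed in the original question. 'boundary' is assumed to come
 * from snd_pcm_sw_params_get_boundary(). */
static struct sw_settings free_running(unsigned long boundary)
{
    struct sw_settings s = { 1, boundary, 0, boundary };
    return s;
}
```

A larger start_threshold than 1 trades a little start latency for a lower risk of an immediate underrun, as the note above points out.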
> UC2 Video
> I don't have much to say here because I basically do not know how apps synchronize audio with video. GetPosition returns "the stream position of the sample that is currently playing through the speakers". This calls for snd_pcm_delay, not avail_update.
Yes.
However, does that documentation actually make a distinction between the last sample that has been read from the buffer and the sample being played?
> R3 Wine must offer access to several sound cards if available.
It is not easy to enumerate available sound devices.
It's possible to enumerate hardware devices, but this does not give you software-implemented devices like PulseAudio or Bluetooth.
I'd suggest having a default device ("default") and one device for each hardware sound card ("default:x", where x is the card index, but some people write a custom .asoundrc where this doesn't work), with the possibility of adding other device names.
> R4 Try and find hw/sw_params that work with the "default" device as well as individual cards (hw:x or perhaps plughw:x).
The hw_params are hardware dependent; there is no set of values that is guaranteed to be available on all devices.
Unless you actually know that certain settings are possible, you should always use snd_pcm_hw_params_set_xxx_near/first/last.
> mmdevapi has a notion of exclusive and shared modes. I expect shared mode to be requested with the "default" device (for mixing to work), and exclusive mode with others (e.g. 5.1 sound). OTOH, apps may also ask for the default device in exclusive mode, which is said to be granted if no other app is playing sound (much like I can grab hw:0 only when dmix has nothing to do).
There are some conventions about which devices are sharable, but those are not always true with some custom configurations.
I would be tempted to just ignore the exclusive/share flag.
> R5 No samples are ever skipped. If late, play late. See R1/R2.
> I've observed that "dmix"+"pulse" and the "hw:x" devices approach underruns radically differently. In case of a 10s underrun, the former silently skip over the next 10s worth of samples(!) -- if the app does not submit those *fast* to catch up, you'll hear no more sound! -- while the latter immediately write them to their HW buffer.
During an underrun, the device does not have valid samples to play. A hardware device just plays whatever currently is in the ring buffer; PulseAudio does not bother to mix samples that it knows would be wrong.
There is no guarantee about the actual output resulting from those 'missed' samples. You can set silence_threshold/size to force silence.
The continue-on-xrun mode is intended for situations where the overall timing of the running stream is important. After an xrun, you indeed have to write the 'missed' samples as fast as possible to catch up, until the relationship between written samples and elapsed time is back to a situation that looks as if no xrun has happened.
Configuring the device to stop on xruns seems to be a better fit for your requirements.
Regards, Clemens
Hi,
[I've moved some blocking/polling from part2 herein.]
Clemens Ladisch wrote:
>> [...] GetPosition returns "the stream position of the sample that is currently playing through the speakers".
> However, does that documentation actually make a distinction between the last sample that has been read from the buffer and the sample being played?
Er, what do you mean?
My interpretation is: if there's a 5s network latency, that is included. The front-end of the audio-processing chain may have returned the buffer (as visible from GetCurrentPadding) containing that sample already 3 seconds ago to the app, that doesn't matter.
GetCurrentPadding: front-end view on the audio processing chain
GetPosition: back-end (speaker) view
> I would be tempted to just ignore the exclusive/share flag.
The flag is MS' view on it. I never would think about using O_EXCL on the host OS side.
BTW, I thought dmix could be used with any card, but I found no one-liner syntax to open card 1 with dmix. It seemed hard-wired (by configuration) to card0 (not that I'm familiar with ALSA conf syntax).
I'd expect to be able to tell the (hypothetical) 5.1 player on card1: "use dmix-style functionality too, don't grab plughw:1".
> There is no guarantee about the actual output resulting from those 'missed' samples. You can set silence_threshold/size to force silence.
Then I must use it. I don't want the user to hear random noise (or stuttering or similar effects).
> Configuring the device to stop on xruns seems to be a better fit for your requirements.
That's what Wine used to do in the former driver.
But it's precisely because dmix does(did?) not support xrun detection that I started looking into the free-running mode.
I currently believe that I can support both simultaneously (i.e. not care):
- xrun stops: As long as snd_pcm_avail_update and snd_pcm_delay continue to be updated (my tests tell me they do), I know by how many samples to correct results.
- free-running: Here too, snd_pcm_avail_update and delay continue to be updated, so I know both: a) how many samples to skip when there'll be something to play again, b) how many samples not to include in what GetPosition must return.
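The bookkeeping described in these two cases might look roughly like the following sketch. It assumes free-running mode, where the value returned by snd_pcm_avail_update() exceeds buffer_size by the number of frames the device has consumed beyond the written data; missed_frames(), note_underrun() and struct stream_account are hypothetical names introduced only for illustration.

```c
#include <assert.h>

/* Frames the device consumed beyond what the app had written.
 * Assumption: 'avail' comes from snd_pcm_avail_update() on a
 * free-running stream, so it keeps growing during an underrun. */
static long missed_frames(long avail, long buffer_size)
{
    assert(buffer_size > 0);
    return avail > buffer_size ? avail - buffer_size : 0;
}

/* Hypothetical per-stream counters. */
struct stream_account {
    unsigned long long submitted;  /* frames handed over by the app */
    unsigned long long skipped;    /* underrun frames hidden from GetPosition */
};

/* To be called before each write: records how many frames to skip
 * over (a) and to leave out of the reported position (b). */
static void note_underrun(struct stream_account *a, long avail, long buffer_size)
{
    a->skipped += (unsigned long long)missed_frames(avail, buffer_size);
}
```

The same missed count serves both purposes named above: it is the amount to skip before writing again, and the amount to subtract from any device-side frame counter before reporting a position to the app.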
> The meaning of ALSA's periods is as follows: 2) When ALSA is blocked (in snd_pcm_write* or in poll), it checks whether to wake up the application only when an interrupt arrives.
What about non-blocking mode? Do you mean to imply that in non-blocking mode, never using poll() causes period_size to become irrelevant from the app POV? ALSA may update its internal state upon every interrupt, but the app never observes an interrupt, does it?
> Non-blocking mode is perfectly fine if you're using poll() to wait for other events at the same time.
AFAICT Wine never used poll() with ALSA. It was in the code only to communicate via pipe() with the rest of the Wine driver. The new driver doesn't use poll at all. It uses a fixed rate timer signal. Is that against any recommendations? What's bad about it?
That's not ideal in terms of CPU interrupt frequency, nor latency. However those 10ms packets observable in mmdevapi make it IMHO unlikely that a fixed 10ms timer is wrong now in Wine. It is my conviction that Wine must mimic dynamic (timing) behaviour to avoid triggering bugs in apps. We'll see.
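For orientation, the amount of data a fixed-rate timer has to move per tick is simple arithmetic. The sketch below is only illustrative; frames_per_tick() and bytes_per_tick() are hypothetical helpers, and the 10 ms tick is taken from the packet size mentioned above.

```c
#include <assert.h>

/* Frames that correspond to one timer tick (truncating division). */
static unsigned frames_per_tick(unsigned rate_hz, unsigned tick_ms)
{
    return (unsigned)(((unsigned long long)rate_hz * tick_ms) / 1000u);
}

/* Bytes per tick for interleaved PCM data. */
static unsigned bytes_per_tick(unsigned rate_hz, unsigned tick_ms,
                               unsigned channels, unsigned bytes_per_sample)
{
    return frames_per_tick(rate_hz, tick_ms) * channels * bytes_per_sample;
}
```

Rates that are not a multiple of 100 Hz do not divide evenly into 10 ms packets, so a real implementation would need to carry the remainder from tick to tick.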
Regards, Jörg Höhle
Joerg-Cyril.Hoehle@t-systems.com wrote:
> Clemens Ladisch wrote:
>>> [...] GetPosition returns "the stream position of the sample that is currently playing through the speakers".
>> However, does that documentation actually make a distinction between the last sample that has been read from the buffer and the sample being played?
> Er, what do you mean?
Er, that I didn't read the documentation. After Googling, I see that your interpretation is correct.
> BTW, I thought dmix could be used with any card, but I found no one-liner syntax to open card 1 with dmix. It seemed hard-wired (by configuration) to card0 (not that I'm familiar with ALSA conf syntax).
Device names can have parameters; you can use "dmix:x" or "dmix:CARD=x" or "dmix:{ CARD=x; }" or whatever other syntax is allowed by ALSA's configuration system. And yes, it's mostly undocumented.
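A small sketch of building such a parameterized name is shown below. card_device_name() is a hypothetical helper; the resulting string would then be passed to snd_pcm_open(), and whether a given plugin name accepts a CARD parameter still depends on the ALSA configuration in use.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Builds "<plugin>:CARD=<index>", e.g. "dmix:CARD=1".
 * Returns 0 on success, -1 if the buffer is too small. */
static int card_device_name(char *out, size_t out_size,
                            const char *plugin, int card)
{
    int n = snprintf(out, out_size, "%s:CARD=%d", plugin, card);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

The same pattern covers the per-card "default:x" names suggested earlier in the thread, written in the CARD=x form.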
>> Configuring the device to stop on xruns seems to be a better fit for your requirements.
> That's what Wine used to do in the former driver.
> But it's precisely because dmix does (did?) not support xrun detection that I started looking into the free-running mode.
Any device _must_ stop on xrun, if so configured. Not doing so would be a bug. Could you still reproduce it now?
>> The meaning of ALSA's periods is as follows: 2) When ALSA is blocked (in snd_pcm_write* or in poll), it checks whether to wake up the application only when an interrupt arrives.
> What about non-blocking mode? Do you mean to imply that in non-blocking mode, never using poll() causes period_size to become irrelevant from the app POV?
Yes. Recently, the possibility to disable interrupts (period_wakeup) was added to a few drivers; PulseAudio uses this. http://www.alsa-project.org/alsa-doc/alsa-lib/group___p_c_m___h_w___params.h...
> The new driver doesn't use poll at all. It uses a fixed rate timer signal. Is that against any recommendations? What's bad about it?
Well, that duplicates PulseAudio's code. ;-)
Regards, Clemens
On Mon, 2011-08-15 at 17:50 +0200, Joerg-Cyril.Hoehle@t-systems.com wrote: [...]
>> I would be tempted to just ignore the exclusive/share flag.
> The flag is MS' view on it. I never would think about using O_EXCL on the host OS side.
FWIW, the passthrough support that's in PulseAudio git master should correspond to what exclusive mode does.
-- Arun
2011/8/14 Clemens Ladisch clemens@ladisch.de:
> Joerg-Cyril.Hoehle@t-systems.com wrote:
> I'd suggest having a default device ("default") and one device for each hardware sound card ("default:x", where x is the card index, but some people write a custom .asoundrc where this doesn't work), with the possibility of adding other device names.
http://git.alsa-project.org/?p=alsa-lib.git;a=commit;h=e6f990e5c9be5cac6f369...
With this patch, you can now use sysdefault:x or sysdefault:CARD=x to get the default device defined in /usr/share/alsa/cards/*.conf, the same as in the past.
participants (4): Arun Raghavan, Clemens Ladisch, Joerg-Cyril.Hoehle@t-systems.com, Raymond Yau