[alsa-devel] Master Plan on rewinding

7 Sep 2014


      Hello.
(TL;DR: nothing really new except the strawman proposal about threads 
and the note about interaction of variable sample rate with rewindability)
As Takashi Iwai told me that the audio miniconference is for discussion 
only, and not for presentation of anything, I guess that I need to 
present the plans and options now. That's why this e-mail. The goal 
here, besides merely presenting the plan, is to identify points that 
everyone agrees upon, so that they are not discussed pointlessly at the 
miniconference. Also, this e-mail serves as a justification for the 
pending seemingly-destructive work.
First, the status quo. If anyone disagrees with the facts below, please 
complain loudly, before I make any conclusions!
1. PulseAudio does not call snd_pcm_rewindable(), because for some ALSA 
plugins it crashed. This crash is completely fixed in alsa-lib 1.0.28, 
but in some cases snd_pcm_rewindable() still returns wrong results.
2. PulseAudio blindly assumes that it can rewind up to hwbuf_frames - 
(snd_pcm_avail() + rewind_safeguard) frames. The rewind safeguard is 
needed due to reasons that I don't completely understand, but one of 
them is imprecise reporting of the hardware pointer, and another one is 
that the hardware transfers several bytes at a time, and the bytes we 
need to overwrite may be already cached by the hardware.
3. On the hw plugin, I could demonstrate two other bugs regarding 
snd_pcm_rewindable(): stale data and bogus negative return values.
4. There are ALSA plugins like "a52" or the "bluetooth" plugin from old 
version of bluez that return non-zero for snd_pcm_rewindable() and 
snd_pcm_rewind, but actually don't rewind, and where rewinding is almost 
impossible to implement (e.g. because this would seemingly involve 
unsending of already-sent bluetooth packets). See below why "almost", 
you need to search for "Alternative strawman proposal".
5. PulseAudio contains a workaround for non-rewindability of the a52 
plugin that also happens to apply to other ioplug-based plugins. 
However, this workaround does not match extplug-based plugins (including 
"dca"), and my patch extending the workaround has been rejected with the 
argument that this has to be fixed in ALSA.
6. The rate plugin has been made non-rewindable in alsa-lib 1.0.28 
because the old implementation was wrong and no simple fix exists.
7. Issues regarding rewindability of PulseAudio internals such as 
virtual sinks will be discussed separately.
And now the topics for discussion.
My immediate plan is to fix (3) and, if anyone replies to the rewind 
safeguard proposal, maybe start fixing (2) from the alsa-lib side.
For all proposals below, unless I say otherwise, I intend to write the 
required alsa-lib code myself if the proposal is accepted.
=== On the rewind safeguard ===
The consensus is that rewind_safeguard is a workaround for an ALSA bug. 
This information should come from snd_pcm_rewindable() from the hw 
plugin. I.e., on a hw device, it should not be equal to 
snd_pcm_mmap_hw_avail(). While the method of dealing with stale data and 
negative returns is obvious to me, I am not completely sure about the 
safeguard. Where should this value come from?
Proposal (credit goes to Raymond Yau): the safeguard should be ideally 
equal to the granularity of hardware pointer updates. However, we don't 
know this granularity a priori, and cannot know this because nobody will 
find out this information for old cards. Since we can't get this, let's 
instead use the upper bound: the minimum possible period size for the 
given sample rate, sample format and the number of channels.
On my desktop PC, on snd-hda-intel with analog outputs for S16LE stereo, 
the granularity is 32 bytes (= 8 samples), and I get the pointer 
granularity of 64 bytes (=16 samples) over HDMI. The minimum period size 
is 32 samples in both cases. Thus Raymond's method will overestimate the 
required safeguard on my cards, but it is still better than no safeguard 
at all, and less than the hard-coded value in PulseAudio.
On ymfpci, the pointer granularity is 5 ms, thus the value currently 
hard-coded in PA is insufficient. If I understand the card's limitations 
correctly, Raymond's method will yield the spot-on result for this card.
For the allegedly existing cards where the pointer granularity is always 
the same as the period size, Raymond's method will not work. However, I 
don't know any concrete example of such card (I didn't look at 
snd-firewire in detail, though). And in any case, my opinion is that 
this "pointer granularity = period size" limitation on such cards can be 
treated as a driver bug that needs to be fixed (by someone who knows the 
driver, not by me). On cards where pointer updates happen only on 
interrupts, the driver should not configure the card in such a way that 
one period visible to the userspace corresponds to one interrupt. 
Instead, it should always configure the card for the minimum possible 
period size, and report only part of the period interrupts to 
period_elapsed(). I.e.: userspace asked for 2 periods 64 ms each, but 
let's configure the card for, say, 64 periods 2 ms each, and use the 
"extra" interrupts only for updating the pointer.
Hopefully, the above heuristic and driver fix will also let PulseAudio 
get rid of special treatment of the "batch" cards, simply because they 
(in the "pointer granularity = period size" definition which is 
currently used in PulseAudio) will no longer exist.
If the heuristic (or something else) is accepted, then the question 
remains how to enable the use of snd_pcm_rewindable in PulseAudio. The 
problem is compatibility with older ALSA versions that, let me remind 
you, crash if snd_pcm_rewindable is called. I don't have proposals or 
opinions here. The obvious options are: compile-time alsa-lib version 
check with fallback to old code on buggy versions, run-time alsa-lib 
version check with the same fallback, add a (both compile-time and 
run-time) hard requirement of a fully-fixed alsa-lib version, or do 
nothing at all.
Possible discussion: alternative heuristics and, as mentioned, migration 
path in PulseAudio.
=== On non-rewindability of the rate plugin ===
I intend to write a rewindable resampler eventually, but don't have time 
now. I understand that it is an important task, but issues below (and 
the dayjob which you can change by offering me a new one) have higher 
priority for me. However, I want everyone to understand the following 
point now:
"The resampler has to be written from scratch for the reasons explained 
in http://permalink.gmane.org/gmane.linux.alsa.devel/122179 , and 
similar arguments apply to all other kinds of sound processing code that 
needs history."
For PulseAudio, it is also needed to figure out the desired interaction 
between variable rate and rewindability. Should rewinding other than 
"discard everything completely" be allowed at all on variable rate 
streams when the rewind crosses the sample rate change point? I.e., 
write 100 samples, change rate, write 50 samples, rewind 100 samples, 
what should be the resulting rate? Should we special-case small changes 
vs big ones?
Slightly off-topic, but still a discussion point.
=== On possibly-incomplete rewindability of the file plugin ===
The file plugin has quite hairy code involving the use of the write 
buffer. I have not verified its correctness or studied the code in 
detail. Anyway, it looks like it allows rewinding over that buffer only, 
even though its slave may allow rewinding further. If true, this would 
make the apparent rewindability useless, as the applications depend on 
being able to rewind almost everything, and won't like the 
apparently-random limitation (random because it depends on some obscure 
"wbuf" implementation detail).
I think that the plugin has to be changed to avoid this limitation if it 
really exists (which I still need to verify). Namely, I propose keeping 
a shadow copy of the slave's buffer, and, at the beginning of each API 
call, querying the slave about its hardware pointer. Only the part of 
the shadow buffer that corresponds to the distance covered by the 
slave's hardware pointer since the last call has to be written to the file.
The same applies to recording. As the code is much simpler there, it is 
obvious that the problem exists: if one rewinds in order to look at the 
past recorded samples again, the plugin will read further samples from 
the file instead of supplying the already-looked-at samples. The same 
solution with the slave hardware pointer should be applicable.
The possible discussion here amounts to suggesting alternative proposals.
=== On bogus rewindability of ladspa and extplug plugins ===
For ioplug, see below, because the situation is different enough.
Neither LADSPA nor snd_pcm_extplug_t have an API that allows to tell the 
corresponding plugin to forget the last N samples. We cannot fix LADSPA, 
as it is beyond our control. And we cannot really fix extplug, because 
this just means moving the problem down one level. External libraries 
(which are depended upon by the extplug-based plugins) are almost 
universally not rewindable, are beyond our control, and will not be made 
rewindable, because rewindability is hard and is not needed by anyone 
except ALSA and PulseAudio. Still, neither .rewindable not .rewind 
currently return 0.
So, I'd argue that the only correct result from snd_pcm_rewindable() is 
0 for ladspa and extplug plugins. However, right now, they forward the 
result from the slave. This needs to be fixed, and I have a patch.
With snd_pcm_rewind(), the situation is a bit different. It, obviously, 
does nothing to tell the LADSPA or extplug plugin that the rewind 
happened (because it can't). It also performs a rewind on the slave. So, 
assuming that the rewind succeeds and the DSP algorithm in the plugin 
uses N samples from the past, we'll end up with the output of the 
correct duration but with N corrupted samples. Not ideal. But if we 
don't perform the rewind at all and return 0, we'll end up with 
regressing PulseAudio: huge latency when applying software volume 
changes or starting new streams. By setting up the large buffer size, it 
essentially assumes that any rewind will succeed.
So, my hackish proposal is: make the .rewindable callbacks on ladspa and 
extplug return 0, keep the current implementation of the .rewind 
callback for compatibility with the current versions of PulseAudio, 
document this hack. I already have patch for this, will send out if we 
agree that the proposal is good.
An alternative proposal (that I would implement if the proposal above is 
rejected and someone actively confirms this one - but I don't like it) 
would be to keep the current versions of the .rewindable and .rewind 
callbacks and add new versions of them (and new user-visible ALSA API) 
for the situation when imperfect results are unacceptable.
A completely strawman proposal would be to introduce a low-latency 
thread, see below in the ioplug section.
Same for .forwardable and .forward.
Any other alternatives that I missed?
=== On bogus rewindability of some ioplug-based plugins ===
Ioplug-based plugins currently report rewindability according to the 
same "appl.ptr - hw.ptr" rule that is used by the hw plugin. However, 
this is incorrect.
In fact, there are two types of ioplug-based plugins. The jack plugin 
does not even have a .transfer callback, and does all the work of 
irrevocably submitting samples to the JACK server in a thread. So, it is 
almost completely rewindable (with that "almost" being equal to JACK 
buffer size). Proposal: keep this logic for all plugins without the 
.transfer callback. Expose the JACK buffer size as the minimum period 
size. Then the same rule proposed in the section about the hw plugin 
will work for .rewindable.
Other plugins do their irrevocable work in the .transfer callback and 
thus are not really rewindable. Proposal: return 0 from .rewindable and 
rewind ioplug callbacks for all plugins with a .transfer callback (but 
see the next section for an amendment). Document that the best practice 
is to have no .transfer callback.
Alternative strawman proposal (speculative): if a .transfer callback 
exists, call it from an alsa-lib-created thread (not from 
snd_pcm_writei!) at the last possible moment, using the least possible 
amount of data (that truly cannot be rewound) in the mmap-style buffer. 
The new mantra is: .transfer is the thing that moves the hardware 
pointer. This way, previously non-rewindable plugins could become 
rewindable. Not sure if this is possible without changing the ABI or 
possible at all, though - it essentially forces all plugins to declare 
mmap support. Also not sure what to do with the .pointer callback.
For the strawman proposal, the benefit is a possibility to have dmix on 
top of a52 (because freewheeling will now work), the downside is that 
all programs (not only those intending to do rewinds) now pay the cost 
of the background low-latency thread. Can we avoid this (e.g. with a new 
flag for snd_pcm_open), so that only programs that intend to do rewinds 
pay the price? Can this new flag apply to the batch hw drivers? Won't 
that inflate the test matrix for plugin code?
For discussion: do we implement the first proposal or try the strawman 
proposal? Any other suggestions regarding ioplug internals? Anyone else 
has a problem with two completely different interfaces (with and without 
.transfer) being exposed under the same ioplug framework?
=== On the pulse plugin ===
PulseAudio natively supports rewinding on the wire. However, the 
ioplug-based plugin doesn't. It pretends that it rewound something, but 
always passes 0, 0 as the last two parameters to pa_stream_write(). 
That's bad.
It looks possible (but I haven't checked) to figure out the correct 
arguments (i.e. to detect the rewinds done before the call to 
snd_pcm_writei) by looking at the application pointer. This way, it 
looks possible to support rewinds, but does not look possible to get the 
correct result (and not the result implied by the circular-buffer model) 
from the snd_pcm_rewindable() function.
Also, we apply proposals from the previous section, we'll either end up 
with a plugin that is truly non-rewindable and doesn't pretend to be 
rewindable, or (with a strawman proposal) a rewindable plugin that 
forces low latency on the PulseAudio side and causes a lot of wakeups. 
Both situations are suboptimal.
So, I see no way around adding the .rewind and .rewindable callbacks to 
snd_pcm_ioplug_callback, and treating the plugins with them as 
rewindable. I guess .forward and .forwardable should be added, too.
For discussion: do we do the above or declare that ioplug is not really 
a suitable infrastructure for the pulse plugin? What other alternatives 
can be proposed?
=== On communication of non-rewindability to the program ===
PulseAudio attempts to use timer-based scheduling and rewinds. It makes 
a big hardware buffer in expectation that it will be able to rewind - 
with the exception of a workaround for ioplug-based plugins. It should 
not set big latency for non-rewindable plugins, but currently has no 
idea how to get this bit of information. See 
http://lists.freedesktop.org/archives/pulseaudio-discuss/2014-April/020457.h... 
for the initial problem statement.
If all of the proposals above are implemented, then we'll be at a point 
where each ALSA plugin is either almost fully rewindable, or not 
rewindable at all. So, contrary to the end of 
http://permalink.gmane.org/gmane.linux.alsa.devel/122191 , I think we 
need just snd_pcm_hw_params_is_seekable() (or rename it if you wish) 
with an essentially boolean result.
For discussion:
1. Decide, finally, on this bit of API, so that I can start working.
2. Decide what to do with old alsa-lib versions in PulseAudio. I.e. 
transition plan.
=== On the programmer expectations ===
(social issue)
Some people, including at least Andrew Eikum and Clemens Ladisch, at 
least once in the past expressed the opinion that amounts to "any plugin 
that does not allow random access is, as far as the ALSA API is 
concerned, buggy" (quoting 
http://permalink.gmane.org/gmane.linux.alsa.devel/122159 ), i.e. they 
are maybe asking for the impossible. We need to do something about that. 
Choices:
1. Document clearly that there exist non-rewindable plugins, and that it 
is not a bug until someone implements a working time machine.
2. Implement a strawman proposal with the background low-latency thread 
if it is doable (and I am not sure that it is doable).
3. Remove ioplug, extplug and ladspa as unfixable, fix the rate and 
adpcm plugins (or remove adpcm). Do something about pulse and jack 
because they rely on the to-be-removed ioplug infrastructure. Then all 
remaining plugins will indeed be fully rewindable.
4. Something else?
Sorry for the negative tone here, I know I am bad at formulating and 
resolving social issues.
=== Conclusion ===
When all the proposals are implemented, we'll have correct 
implementations of the rewind operation in all plugins (where "return 0" 
counts as correct), and "return 0" only where it is truly unavoidable. 
PulseAudio will be able to use all of that, and avoid doing rewinds on 
non-rewindable plugins. On this positive note, I'd like to finish 
writing this long e-mail.
-- 
Alexander E. Patrakov

Tanu Kaskinen

Raymond Yau

David Henningsson

Raymond Yau

Raymond Yau

David Henningsson

David Henningsson

David Henningsson

Raymond Yau

A. C. Censi

Raymond Yau

Raymond Yau

Raymond Yau

Raymond Yau

Raymond Yau

Raymond Yau

Raymond Yau

Raymond Yau

tags (0)

participants (9)