Re: [alsa-devel] [RFC PATCH 1/4] ALSA: core: let low-level driver or userspace disable rewinds

30 Jul 2015

      On 07/30/2015 01:11 AM, Alexander E. Patrakov wrote:
...
29.07.2015 22:46, Pierre-Louis Bossart wrote:
...
On 7/28/15 10:43 AM, Alexander E. Patrakov wrote:
...
28.07.2015 19:19, Pierre-Louis Bossart wrote:
...
On 7/11/15 12:06 PM, Alexander E. Patrakov wrote:
...
...

I have not seen any justification for the drastic measure of
making a DMA-based device completely unrewindable. Maybe a
more polite "please make this a batch/blocktransfer card"
request, thus disallowing only sub-period rewinds, would
still be useful for powersaving, without killing dmix.

If this "no rewinds" mode is not made the default, then exactly
nobody will use it. Everyone except sound servers
opens the default device with the default flags. I understand
the potential to break existing userspace, especially
PulseAudio, but we really need to think more here.
Not sure I understand the issue. When new functionality is
added, it takes time to be adopted. If we can push it into sound
servers first, then it creates a wide pool of users from day 1.
The issue is that the proposed functionality, in the currently
proposed "I promise to never rewind" form, is nearly useless by
its very nature for any sound server that cares about power
consumption significantly more than dmix does. It is indeed
usable by JACK (by its design, it doesn't rewind, and uses low
latency) and CRAS (which currently doesn't rewind, but I am not
sure whether this is a bug or a deliberate decision based on
non-public measurements of extra power savings that "rewinds + 
high latency" would allow).
As I have already explained, dmix, when mixing, writes to the
hardware buffer multiple times, which is equivalent to rewinding.
PulseAudio uses rewinds for a very specific purpose - to avoid
CPU wakeups in the common "nothing unexpected happened" case,
i.e. to allow very high average latency while keeping the latency
of reaction to unexpected events low. So, by convincing
PulseAudio to never rewind and to tell the driver about this
fact, you can save some power in the card, but (if the proponents
of rewinds are right - see my earlier request for non-public 
information) will waste way more power due to the need for much
more frequent CPU wakeups ("cannot rewind but have to react to
unexpected events within 20 ms" = "must wake up every 20 ms").
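The wakeup arithmetic above can be sketched roughly as follows. This is only an illustration: the function names are hypothetical, and the model ignores the safety margin a real sound server keeps before the buffer runs dry.

```c
#include <assert.h>

/* Longest sleep between wakeups, in ms (hypothetical helper).
 * With rewinds the server may fill the whole buffer and sleep through it;
 * without them, nothing queued can be replaced, so it may only queue up
 * to the reaction deadline. */
unsigned wakeup_interval_ms(unsigned buffer_ms, unsigned reaction_ms,
                            int can_rewind)
{
    assert(buffer_ms > 0 && reaction_ms > 0);
    if (can_rewind)
        return buffer_ms;
    return reaction_ms < buffer_ms ? reaction_ms : buffer_ms;
}

/* Wakeups per minute for a given sleep interval (hypothetical helper). */
unsigned wakeups_per_minute(unsigned interval_ms)
{
    assert(interval_ms > 0);
    return 60000u / interval_ms;
}
```

With an assumed 2000 ms buffer and the 20 ms deadline mentioned above, this model gives 30 wakeups per minute with rewinds against 3000 without.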
The only cases where this flag can be useful for sound servers
are:

Sound servers that already, by design, always waste CPU power by
running at low latency;
Your classification is not exhaustive enough. You need to take
into account sound servers that have two outputs per endpoint, one
for low-latency and one for low-power/deep-buffer with a
DSP/hardware mixer. Power is also no longer directly linked to
wakeup rates only but also to DDR access patterns. When a larger
on-chip buffer is available, disabling rewinds does let the
hardware know it can safely fetch data in bigger chunks rather
than use small data bursts. This sort of capability is becoming
prevalent these days, and not just on Intel platforms (see e.g.
Nexus 5/9 devices), and maybe PulseAudio and friends need to evolve
to make use of these resources rather than stay the course with
single output and rewind mechanisms that prevent power
optimizations on newer platforms.
OK, I see some new points there, but still no full picture. My point
still is: deep (>= 30-40 ms) buffer without rewinds or without any
other means to seamlessly cancel and replace what's there (up to a
certain unrewindable latency, i.e. the same 30-40 ms, that a human
won't object to) is absolutely useless to any current or future sound
server.
So, in your example, "I promise to never rewind" flag can definitely
be set for a low-latency endpoint, but not for the other one. That
other endpoint would benefit from an "I promise to never rewind
closer than X samples to the hardware pointer" API, which you
have flagged as a separate discussion. I would also not object to
that endpoint becoming a batch/blocktransfer PCM.
You are using low-latency with two different meanings: one is a fixed low latency for music applications, where rewinds are indeed useless. But there is also a case for variable latencies when you are sharing the same output between apps having different needs, and that could also be used with very limited buffering.
Also, for the deep-buffer solution, as long as you dedicate this output to a single media consumption app (playback, video), you have no need for rewinds.
In other words, whenever an output is used in an exclusive manner, without being shared between apps having different latency needs, you can optimize. For everything else, keep using rewinds with the max_burst information to know by how much you can rewind (and accept that some power optimizations linked to data transfers will not happen).
Also, the notion of block transfer is too limiting: the hardware granularity may be smaller than a period and the hardware may still be able to provide precise data on the hw_ptr and delays, so the traditional batch/block transfer definition is a tad obsolete.
...
Well, there's a hackish way to say "I promise to never rewind" on
both endpoints even with a deep buffer on a low-power endpoint. The
way is: instead of rewinding, open the low-latency endpoint and play
a correction signal (i.e. the difference between the actual and the
wanted contents of the deep buffer) through it, with low latency, and
let the hardware mix. But I don't like this, for purely subjective
reasons and because that endpoint would need twice as much
output amplitude as one would normally need. What do you propose
instead?
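To make the correction-signal idea concrete, here is a minimal sketch assuming signed 16-bit samples and an ideal hardware mixer that simply adds the two streams. The function names are hypothetical; nothing like this appears in the patches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* correction[i] = wanted[i] - queued[i], so that an ideal mixer yields
 * queued[i] + correction[i] == wanted[i]. The result is kept in 32 bits
 * because the difference of two 16-bit samples needs up to 17 bits. */
void make_correction(const int16_t *queued, const int16_t *wanted,
                     int32_t *correction, size_t n)
{
    assert(queued && wanted && correction);
    for (size_t i = 0; i < n; i++)
        correction[i] = (int32_t)wanted[i] - (int32_t)queued[i];
}

/* Largest absolute sample value in a buffer. */
int32_t peak_abs(const int32_t *buf, size_t n)
{
    int32_t peak = 0;
    for (size_t i = 0; i < n; i++) {
        int32_t v = buf[i] < 0 ? -buf[i] : buf[i];
        if (v > peak)
            peak = v;
    }
    return peak;
}
```

In the worst case (queued at full scale positive, wanted at full scale negative, or the reverse) the correction peaks at about twice the 16-bit full scale, which is the amplitude headroom problem described above.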
Still, I think that a more general API that says "I promise to never
rewind closer than X samples to the hardware pointer" would be more
useful than the black-and-white "I promise to never rewind" call,
assuming that it can be expressed to the hardware.
The patches provide information on the max burst, so you can know by how much to rewind. Setting the max burst based on a negotiation between driver and app is an interesting concept that we looked at, but there aren't too many devices that support this capability, so this might not be worth the effort.
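As a rough sketch of how an application might bound a rewind with the max burst value: the helpers below are hypothetical, and how the patches actually expose max burst to userspace is not shown in this message, so it is passed in as a plain parameter. A real application would still ask alsa-lib via snd_pcm_rewindable() and perform the rewind with snd_pcm_rewind().

```c
#include <assert.h>

/* Frames queued ahead of the hardware pointer. Both pointers are assumed
 * to wrap at `boundary`, as the ALSA appl_ptr and hw_ptr do. */
unsigned long queued_frames(unsigned long appl_ptr, unsigned long hw_ptr,
                            unsigned long boundary)
{
    assert(boundary > 0 && appl_ptr < boundary && hw_ptr < boundary);
    return appl_ptr >= hw_ptr ? appl_ptr - hw_ptr
                              : boundary - hw_ptr + appl_ptr;
}

/* Frames that can be rewound while staying at least max_burst_frames
 * away from the hardware pointer; 0 if the queue is already inside the
 * region the hardware may have fetched. */
unsigned long safe_rewind_frames(unsigned long queued,
                                 unsigned long max_burst_frames)
{
    return queued > max_burst_frames ? queued - max_burst_frames : 0;
}
```

The same clamp would serve for the "never rewind closer than X samples" variant, with X taking the place of max_burst_frames.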