[alsa-devel] [RFC PATCH 0/4] better support for bursty DMA usages

older
Re: [alsa-devel] [ALC668]: Asus...

Pierre-Louis Bossart

8 Jul 2015 8 Jul '15

12:10 p.m.

Set of patches to fix issues with hw_ptr fuzziness [1] and increased buffering w/ DSPs

1. disable rewinds to allow for new HDaudio SPIB DMA functionality (fetch up to the application pointer, rewinds not supported) 2. report max in-flight bytes to avoid problems with stale data (late wake-ups, rewinds)

[1] http://mailman.alsa-project.org/pipermail/alsa-devel/2015-June/093646.html

TODO: 1. fixes and alsa-lib updates (compile-tested only for now) 2. get feedback 3. if supported, set DMA buffering based on negotiation between driver and app (capabilities vs. latency needs)

Pierre-Louis Bossart (4): ALSA: core: let low-level driver or userspace disable rewinds ALSA: core: add .notify callback for pcm ops ALSA: core: add report of max dma burst ALSA: hda: add default value for max_dma_burst

include/sound/pcm.h | 5 +++++ include/uapi/sound/asound.h | 6 ++++-- sound/core/pcm_lib.c | 19 +++++++++++++++++++ sound/core/pcm_native.c | 33 ++++++++++++++++++++++++++++++++- sound/pci/hda/hda_controller.c | 1 + 5 files changed, 61 insertions(+), 3 deletions(-)

-- 1.9.1

Show replies by date

Pierre-Louis Bossart

8 Jul 8 Jul

12:10 p.m.

New subject: [alsa-devel] [RFC PATCH 1/4] ALSA: core: let low-level driver or userspace disable rewinds

Add new hw_params flag to explicitly tell driver that rewinds will never be used. This can be used by low-level driver to optimize DMA operations and reduce power consumption. Use this flag only when data written in ring buffer will never be invalidated, e.g. any update of appl_ptr is final.

Caveat: there is currently no way to query capabilities without opening a pcm stream, so applications might need to serially open all exposed devices, check what they support by looking at hw_params->info and close them (this is what PulseAudio does so might not be an issue)

Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com --- include/sound/pcm.h | 1 + include/uapi/sound/asound.h | 1 + sound/core/pcm_native.c | 8 ++++++++ 3 files changed, 10 insertions(+)

diff --git a/include/sound/pcm.h b/include/sound/pcm.h index 691e7ee..25310b7 100644 --- a/include/sound/pcm.h +++ b/include/sound/pcm.h @@ -370,6 +370,7 @@ struct snd_pcm_runtime { unsigned int rate_num; unsigned int rate_den; unsigned int no_period_wakeup: 1; + unsigned int no_rewinds:1;

/* -- SW params -- */ int tstamp_mode; /* mmap timestamp is updated */ diff --git a/include/uapi/sound/asound.h b/include/uapi/sound/asound.h index a45be6b..b62b162 100644 --- a/include/uapi/sound/asound.h +++ b/include/uapi/sound/asound.h @@ -356,6 +356,7 @@ typedef int snd_pcm_hw_param_t; #define SNDRV_PCM_HW_PARAMS_NORESAMPLE (1<<0) /* avoid rate resampling */ #define SNDRV_PCM_HW_PARAMS_EXPORT_BUFFER (1<<1) /* export buffer */ #define SNDRV_PCM_HW_PARAMS_NO_PERIOD_WAKEUP (1<<2) /* disable period wakeups */ +#define SNDRV_PCM_HW_PARAMS_NO_REWINDS (1<<3) /* disable rewinds */

struct snd_interval { unsigned int min, max; diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c index d126c03..a70e52d 100644 --- a/sound/core/pcm_native.c +++ b/sound/core/pcm_native.c @@ -543,6 +543,8 @@ static int snd_pcm_hw_params(struct snd_pcm_substream *substream, runtime->no_period_wakeup = (params->info & SNDRV_PCM_INFO_NO_PERIOD_WAKEUP) && (params->flags & SNDRV_PCM_HW_PARAMS_NO_PERIOD_WAKEUP); + runtime->no_rewinds = + params->flags & SNDRV_PCM_HW_PARAMS_NO_REWINDS;

bits = snd_pcm_format_physical_width(runtime->format); runtime->sample_bits = bits; @@ -2428,6 +2430,9 @@ static snd_pcm_sframes_t snd_pcm_playback_rewind(struct snd_pcm_substream *subst if (frames == 0) return 0;

+ if (runtime->no_rewinds) + return 0; + snd_pcm_stream_lock_irq(substream); switch (runtime->status->state) { case SNDRV_PCM_STATE_PREPARED: @@ -2476,6 +2481,9 @@ static snd_pcm_sframes_t snd_pcm_capture_rewind(struct snd_pcm_substream *substr if (frames == 0) return 0;

+ if (runtime->no_rewinds) + return 0; + snd_pcm_stream_lock_irq(substream); switch (runtime->status->state) { case SNDRV_PCM_STATE_PREPARED:

-- 1.9.1

Takashi Iwai

4:21 p.m.

New subject: [alsa-devel] [RFC PATCH 1/4] ALSA: core: let low-level driver or userspace disable rewinds

On Wed, 08 Jul 2015 12:10:33 +0200, Pierre-Louis Bossart wrote:

...

Add new hw_params flag to explicitly tell driver that rewinds will never be used. This can be used by low-level driver to optimize DMA operations and reduce power consumption. Use this flag only when data written in ring buffer will never be invalidated, e.g. any update of appl_ptr is final.

Caveat: there is currently no way to query capabilities without opening a pcm stream, so applications might need to serially open all exposed devices, check what they support by looking at hw_params->info and close them (this is what PulseAudio does so might not be an issue)

The forward should also fail with such hardware, no?

Takashi

...

Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com

include/sound/pcm.h | 1 + include/uapi/sound/asound.h | 1 + sound/core/pcm_native.c | 8 ++++++++ 3 files changed, 10 insertions(+)

diff --git a/include/sound/pcm.h b/include/sound/pcm.h index 691e7ee..25310b7 100644 --- a/include/sound/pcm.h +++ b/include/sound/pcm.h @@ -370,6 +370,7 @@ struct snd_pcm_runtime { unsigned int rate_num; unsigned int rate_den; unsigned int no_period_wakeup: 1;

unsigned int no_rewinds:1;

/* -- SW params -- */ int tstamp_mode; /* mmap timestamp is updated */

diff --git a/include/uapi/sound/asound.h b/include/uapi/sound/asound.h index a45be6b..b62b162 100644 --- a/include/uapi/sound/asound.h +++ b/include/uapi/sound/asound.h @@ -356,6 +356,7 @@ typedef int snd_pcm_hw_param_t; #define SNDRV_PCM_HW_PARAMS_NORESAMPLE (1<<0) /* avoid rate resampling */ #define SNDRV_PCM_HW_PARAMS_EXPORT_BUFFER (1<<1) /* export buffer */ #define SNDRV_PCM_HW_PARAMS_NO_PERIOD_WAKEUP (1<<2) /* disable period wakeups */ +#define SNDRV_PCM_HW_PARAMS_NO_REWINDS (1<<3) /* disable rewinds */

struct snd_interval { unsigned int min, max; diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c index d126c03..a70e52d 100644 --- a/sound/core/pcm_native.c +++ b/sound/core/pcm_native.c @@ -543,6 +543,8 @@ static int snd_pcm_hw_params(struct snd_pcm_substream *substream, runtime->no_period_wakeup = (params->info & SNDRV_PCM_INFO_NO_PERIOD_WAKEUP) && (params->flags & SNDRV_PCM_HW_PARAMS_NO_PERIOD_WAKEUP);
runtime->no_rewinds =
	params->flags & SNDRV_PCM_HW_PARAMS_NO_REWINDS;
bits = snd_pcm_format_physical_width(runtime->format); runtime->sample_bits = bits;
@@ -2428,6 +2430,9 @@ static snd_pcm_sframes_t snd_pcm_playback_rewind(struct snd_pcm_substream *subst if (frames == 0) return 0;
if (runtime->no_rewinds)
return 0;
snd_pcm_stream_lock_irq(substream); switch (runtime->status->state) { case SNDRV_PCM_STATE_PREPARED:
@@ -2476,6 +2481,9 @@ static snd_pcm_sframes_t snd_pcm_capture_rewind(struct snd_pcm_substream *substr if (frames == 0) return 0;
if (runtime->no_rewinds)
return 0;
snd_pcm_stream_lock_irq(substream); switch (runtime->status->state) { case SNDRV_PCM_STATE_PREPARED:
-- 1.9.1

Alsa-devel mailing list Alsa-devel@alsa-project.org http://mailman.alsa-project.org/mailman/listinfo/alsa-devel

Pierre-Louis Bossart

6:58 p.m.

New subject: [alsa-devel] [RFC PATCH 1/4] ALSA: core: let low-level driver or userspace disable rewinds

...

...
Add new hw_params flag to explicitly tell driver that rewinds will never be used. This can be used by low-level driver to optimize DMA operations and reduce power consumption. Use this flag only when data written in ring buffer will never be invalidated, e.g. any update of appl_ptr is final.

Caveat: there is currently no way to query capabilities without opening a pcm stream, so applications might need to serially open all exposed devices, check what they support by looking at hw_params->info and close them (this is what PulseAudio does so might not be an issue)

The forward should also fail with such hardware, no?

The way I understand the forward is just an update of the application pointer without writing/reading new data, so that works. The hardware will just use whatever is in the ring buffer. It's weird but it would work.

Alexander E. Patrakov

11 Jul 11 Jul

7:06 p.m.

New subject: [alsa-devel] [RFC PATCH 1/4] ALSA: core: let low-level driver or userspace disable rewinds

08.07.2015 15:10, Pierre-Louis Bossart wrote:

...

Add new hw_params flag to explicitly tell driver that rewinds will never be used. This can be used by low-level driver to optimize DMA operations and reduce power consumption. Use this flag only when data written in ring buffer will never be invalidated, e.g. any update of appl_ptr is final.

Caveat: there is currently no way to query capabilities without opening a pcm stream, so applications might need to serially open all exposed devices, check what they support by looking at hw_params->info and close them (this is what PulseAudio does so might not be an issue)

Thanks for the patch. I was also about to add this flag, for the use in userspace. It's good to know that it is also useful in the kernel space.

I usually don't comment on kernel patches, but this one provokes some questions and thoughts.

1. What amount of power-saving are we talking about, for Intel chips? (ideally, this should be in the commit message)

2. Shouldn't SND_PCM_APPEND (the mode used by dmix and friends) be also made incompatible with this flag, because, when mixing, it essentially overwrites incompletely-mixed data, i.e. rewinds without saying the word "rewind"? Shouldn't all other kinds of freewheeling be made incompatible with this flag, because the card, essentially, is never told about the application pointer? Or is this a userspace-only concern?

3. I have not seen any justification for the drastic measure of making a DMA-based device completely unrewindable. Maybe a more polite "please make this a batch/blocktransfer card" request, thus disallowing only sub-period rewinds, would still be useful for powersaving, without killing dmix.

4. If this "no rewinds" mode is not made the default, then exactly nobody will use it. Everyone except sound servers opens the default device with the default flags. I understand the potential to break existing userspace, especially PulseAudio, but we really need to think more here.

So, here is an alternative (or maybe additional - your flag does make sense, after all) hackish idea that should be compatible with the existing userspace.

Please make it possible to "allow", "put in auto mode" or "disallow" the device to DMA up to one period in advance. Please use the following logic when in "auto mode":

If the card is batch or blocktransfer "by itself", let it be. We can't do anything here, the DMA operations are already "optimized" whether we want it or not.

If the card does not support disabling the period wakeup, then FIXME - I don't know what to do. For compatibility, I think we can't optimize DMA operations here.

If the card supports disabling the period wakeup, and it is disabled, then don't allow it to optimize DMA operations.

If the card supports disabling the period wakeup, but it is not disabled, then allow it to optimize DMA operations.

As you have probably guessed, the idea here is to detect typical setup made by PulseAudio. I suspect that this also affects CRAS.

Also maybe this idea by David Henningsson is good:

""" E g, if the application sets a low period size but also sets the "disable period interrupts" flag, that could be an indicator that it wants lots of interrupts just to update the pointer, but nothing else. Maybe that's even the behaviour today (haven't checked). """

(taken from http://mailman.alsa-project.org/pipermail/alsa-devel/2014-September/080885.h... )

...

Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com

include/sound/pcm.h | 1 + include/uapi/sound/asound.h | 1 + sound/core/pcm_native.c | 8 ++++++++ 3 files changed, 10 insertions(+)

diff --git a/include/sound/pcm.h b/include/sound/pcm.h index 691e7ee..25310b7 100644 --- a/include/sound/pcm.h +++ b/include/sound/pcm.h @@ -370,6 +370,7 @@ struct snd_pcm_runtime { unsigned int rate_num; unsigned int rate_den; unsigned int no_period_wakeup: 1;

unsigned int no_rewinds:1;

/* -- SW params -- */ int tstamp_mode; /* mmap timestamp is updated */

diff --git a/include/uapi/sound/asound.h b/include/uapi/sound/asound.h index a45be6b..b62b162 100644 --- a/include/uapi/sound/asound.h +++ b/include/uapi/sound/asound.h @@ -356,6 +356,7 @@ typedef int snd_pcm_hw_param_t; #define SNDRV_PCM_HW_PARAMS_NORESAMPLE (1<<0) /* avoid rate resampling */ #define SNDRV_PCM_HW_PARAMS_EXPORT_BUFFER (1<<1) /* export buffer */ #define SNDRV_PCM_HW_PARAMS_NO_PERIOD_WAKEUP (1<<2) /* disable period wakeups */ +#define SNDRV_PCM_HW_PARAMS_NO_REWINDS (1<<3) /* disable rewinds */

struct snd_interval { unsigned int min, max; diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c index d126c03..a70e52d 100644 --- a/sound/core/pcm_native.c +++ b/sound/core/pcm_native.c @@ -543,6 +543,8 @@ static int snd_pcm_hw_params(struct snd_pcm_substream *substream, runtime->no_period_wakeup = (params->info & SNDRV_PCM_INFO_NO_PERIOD_WAKEUP) && (params->flags & SNDRV_PCM_HW_PARAMS_NO_PERIOD_WAKEUP);
runtime->no_rewinds =
	params->flags & SNDRV_PCM_HW_PARAMS_NO_REWINDS;
bits = snd_pcm_format_physical_width(runtime->format); runtime->sample_bits = bits;
@@ -2428,6 +2430,9 @@ static snd_pcm_sframes_t snd_pcm_playback_rewind(struct snd_pcm_substream *subst if (frames == 0) return 0;
if (runtime->no_rewinds)
return 0;
snd_pcm_stream_lock_irq(substream); switch (runtime->status->state) { case SNDRV_PCM_STATE_PREPARED:
@@ -2476,6 +2481,9 @@ static snd_pcm_sframes_t snd_pcm_capture_rewind(struct snd_pcm_substream *substr if (frames == 0) return 0;
if (runtime->no_rewinds)
return 0;
snd_pcm_stream_lock_irq(substream); switch (runtime->status->state) { case SNDRV_PCM_STATE_PREPARED:

-- Alexander E. Patrakov

Raymond Yau

14 Jul 14 Jul

5:32 a.m.

New subject: [alsa-devel] [RFC PATCH 1/4] ALSA: core: let low-level driver or userspace disable rewinds

Add new hw_params flag to explicitly tell driver that rewinds

...

...
will never be used. This can be used by low-level driver to optimize DMA operations and reduce power consumption. Use this flag only when data written in ring buffer will never be invalidated, e.g. any update of appl_ptr is final.

Caveat: there is currently no way to query capabilities without opening a pcm stream, so applications might need to serially open all exposed devices, check what they support by looking at hw_params->info and close them (this is what PulseAudio does so might not be an issue)

How about spdif->status AES0_NONAUDIO is set when AC3 passthrough SPDIF ?

...

What amount of power-saving are we talking about, for Intel chips?

(ideally, this should be in the commit message)

Refer to HDA043-A

Energy efficient audio buffering and dynamic FIFO limit change

3.3.11 Offset 12h: GCAP2 –Global Capabilities 2

FIFO Size Change (FIFOSC) This bit isRO

if GCAP2.EEAC = 0.

In static case (GCAP2.EEAC = 1 and GCAP2.DFFLCC = 0), this bit will only have effect before the first time RUN bit is set. Software must not attempt to write this bit, unless EECAP.EEAC bit is set. HW will treat this bit as read -only if the capability is not supported . In dynamic case (GCAP2.EEAC = 1 and GCAP2.DFFLCC = 1), on top of supporting the static behavior describe above, this bit can also be used to communicate the dynamic change request in FIFO size between SW and HW when the stream is active

3.3.41 Offset 90: {IOB}SDnFIFOS – Input/Output/Bidirectional Stream Descriptor n FIFO Size

When HW supports EE Audio capability,the FIFOS this value will represent the minimum size of cyclic buffer for efficient HW operation

Do application need to know driver is using static or dynamic case ?

is the minimum size of cyclic buffer for efficient HW operation larger than current hw params (period_bytes_min * periods_min) ?

...

Shouldn't SND_PCM_APPEND (the mode used by dmix and friends) be also

made incompatible with this flag, because, when mixing, it essentially overwrites incompletely-mixed data, i.e. rewinds without saying the word "rewind"? Shouldn't all other kinds of freewheeling be made incompatible with this flag, because the card, essentially, is never told about the application pointer? Or is this a userspace-only concern?

if SND_PCM_APPEND use fmode = O_APPEND when snd_device_open(), this seem support sequential write only and no random write

Do it mean pulseaudio cannot rewind dmix as dmix alway append data to hw device ?

...

I have not seen any justification for the drastic measure of making a

DMA-based device completely unrewindable. Maybe a more polite "please make this a batch/blocktransfer card" request, thus disallowing only sub-period rewinds, would still be useful for powersaving, without killing dmix.

Can those non-interleaved access pcm device with two periods support rewind ?

Alexander E. Patrakov

7:39 p.m.

New subject: [alsa-devel] [RFC PATCH 1/4] ALSA: core: let low-level driver or userspace disable rewinds

14.07.2015 08:32, Raymond Yau wrote:

...

1. What amount of power-saving are we talking about, for Intel
chips? (ideally, this should be in the commit message)
Refer to HDA043-A

Energy efficient audio buffering and dynamic FIFO limit change

I expected an answer in the form "XXX mW out of YYY, for such-and-such buffer and period sizes".

...

2. Shouldn't SND_PCM_APPEND (the mode used by dmix and friends) be
also made incompatible with this flag, because, when mixing, it
essentially overwrites incompletely-mixed data, i.e. rewinds without
saying the word "rewind"? Shouldn't all other kinds of freewheeling
be made incompatible with this flag, because the card, essentially,
is never told about the application pointer? Or is this a
userspace-only concern?

if SND_PCM_APPEND use fmode = O_APPEND when snd_device_open(), this seem support sequential write only and no random write

dmix uses mmap and never advances application pointer on the slave (which is a hw device). While each instance of dmix can indeed perform mmap-based writes sequentially, what the device sees is not sequential. E.g., for two periods and two applications, started simultaneously, the sequence is like this:

1st app writes 1st period 1st app writes 2nd period 2nd app overwrites 1st period with the mix 2nd app overwrites 2nd periodwith the mix hw completes playing 1st period 1st app writes 1st period 2nd app overwrites 1st period with the mix hw completes playing 2nd period 2nd app writes 2nd period 1st app overwrites 2nd period with the mix hw completes playing 1st period

...and so on, and we have not considered xruns here, which complicate the picture even further. The point is: what the hardware sees is not a stream of sequential writes, so the no-rewind flag is not appropriate.

And it is never notified about the application pointer, so the flag is definitely not appropriate again.

...

Do it mean pulseaudio cannot rewind dmix as dmix alway append data to hw device ?

No, see above.

Regarding dmix rewindability status: dmix is not actually rewindable (but pretends to be), and, while (unlike for extplug) I don't see any reason why it must be unrewindable, there is no opposition to mark it as unrewindable and then call the problem solved.

Regarding PulseAudio: for PATCH 1/4, it is completely irrelevant. Pulseaudio wants to be responsive to the events that new streams appear and existing ones change software-based volume. Let's suppose that it wants to respond in no later than 10 ms (just a made-up number). Then, at no point there should be more than 10 ms of buffered audio that cannot be replaced. If PulseAudio starts using this new flag, then it has to wake up the CPU every 10 ms. I.e. "let the audio DMA engine wake up every 0.125 ms, as it does by default, and let the CPU wake up every 1000 ms unless something important happens (in which case there is a rewind)" would be replaced by "let the audio DMA engine wake up once in 10 ms, and also let the CPU wake up every 10 ms", which is much worse. Unfortunately, the patch does not offer a way to say "let the DMA engine wake up every 10 ms, and CPU wake up every 1000 ms (and occasionally perform rewinds, but leave 10 ms untouched)".

...

3. I have not seen any justification for the drastic measure of
making a DMA-based device completely unrewindable. Maybe a more
polite "please make this a batch/blocktransfer card" request, thus
disallowing only sub-period rewinds, would still be useful for
powersaving, without killing dmix.

Can those non-interleaved access pcm device with two periods support rewind ?

Yes, but very carefully. E.g., suppose that the period size is 10 ms. Then, this sequence of events is valid, does not produce xruns and does not cause the card to play stale data:

0 ms: Application fills 2 periods via mmap, notices the timestamp and starts the PCM. 10 ms: Hardware plays out the 1st period and wakes up the application. Application writes the 1st period via mmap and notices the timestamp. 20 ms: Hardware plays out the 2nd period and wakes up the application. Application writes the 2nd period via mmap and notices the timestamp. 23 ms: Application wakes up for an unrelated reason and notices the timestamp. It knows that the sound card still has 17 ms to play, and 10 ms (one period, as usual for batch/blocktransfer cards) are unrewindable. So it rewinds 6 ms and writes new data. 30 ms: Hardware plays out the 1st period and wakes up the application. Application writes the 1st period via mmap and notices the timestamp.

... and so on.

-- Alexander E. Patrakov

Raymond Yau

15 Jul 15 Jul

3:23 a.m.

New subject: [alsa-devel] [RFC PATCH 1/4] ALSA: core: let low-level driver or userspace disable rewinds

...

I expected an answer in the form "XXX mW out of YYY, for such-and-such

buffer and period sizes".

The new parameter is to inform the application that the application can only use sequential write and not random write

Alexander E. Patrakov

6:51 a.m.

New subject: [alsa-devel] [RFC PATCH 1/4] ALSA: core: let low-level driver or userspace disable rewinds

15.07.2015 06:23, Raymond Yau wrote:

...

...
I expected an answer in the form "XXX mW out of YYY, for

such-and-such buffer and period sizes".

The new parameter is to inform the application that the application can only use sequential write and not random write

Please reread the description, it also works the other way:

"""Add new hw_params flag to explicitly tell driver that rewinds will never be used."""

-- Alexander E. Patrakov

Pierre-Louis Bossart

28 Jul 28 Jul

4:19 p.m.

New subject: [alsa-devel] [RFC PATCH 1/4] ALSA: core: let low-level driver or userspace disable rewinds

On 7/11/15 12:06 PM, Alexander E. Patrakov wrote:

...

08.07.2015 15:10, Pierre-Louis Bossart wrote:

...
Add new hw_params flag to explicitly tell driver that rewinds will never be used. This can be used by low-level driver to optimize DMA operations and reduce power consumption. Use this flag only when data written in ring buffer will never be invalidated, e.g. any update of appl_ptr is final.

Caveat: there is currently no way to query capabilities without opening a pcm stream, so applications might need to serially open all exposed devices, check what they support by looking at hw_params->info and close them (this is what PulseAudio does so might not be an issue)

Thanks for the patch. I was also about to add this flag, for the use in userspace. It's good to know that it is also useful in the kernel space.

I usually don't comment on kernel patches, but this one provokes some questions and thoughts.

What amount of power-saving are we talking about, for Intel chips?

(ideally, this should be in the commit message)

The power savings will depend on non-audio parameters, so it's impossible to state a value. And it's probably not public information anyway.

...

Shouldn't SND_PCM_APPEND (the mode used by dmix and friends) be also

made incompatible with this flag, because, when mixing, it essentially overwrites incompletely-mixed data, i.e. rewinds without saying the word "rewind"? Shouldn't all other kinds of freewheeling be made incompatible with this flag, because the card, essentially, is never told about the application pointer? Or is this a userspace-only concern?

Good point, I wasn't aware of this flag. willl look into it.

...

I have not seen any justification for the drastic measure of making a

DMA-based device completely unrewindable. Maybe a more polite "please make this a batch/blocktransfer card" request, thus disallowing only sub-period rewinds, would still be useful for powersaving, without killing dmix.

If this "no rewinds" mode is not made the default, then exactly

nobody will use it. Everyone except sound servers opens the default device with the default flags. I understand the potential to break existing userspace, especially PulseAudio, but we really need to think more here.

Not sure I understand the issue. when a new functionality is added it takes time to be adopted. If we can push it in sound servers first then it creates a wide pool of users from day1.

...

So, here is an alternative (or maybe additional - your flag does make sense, after all) hackish idea that should be compatible with the existing userspace.

Please make it possible to "allow", "put in auto mode" or "disallow" the device to DMA up to one period in advance. Please use the following logic when in "auto mode":

This is a separate discussion. the patches I wrote are only to make sure the hardware can safely fetch the data that is provided by userspace. I think the open is more how we configure the DMA burst, something I haven't looked at yet.

...

If the card is batch or blocktransfer "by itself", let it be. We can't do anything here, the DMA operations are already "optimized" whether we want it or not.

If the card does not support disabling the period wakeup, then FIXME - I don't know what to do. For compatibility, I think we can't optimize DMA operations here.

If the card supports disabling the period wakeup, and it is disabled, then don't allow it to optimize DMA operations.

If the card supports disabling the period wakeup, but it is not disabled, then allow it to optimize DMA operations.

As you have probably guessed, the idea here is to detect typical setup made by PulseAudio. I suspect that this also affects CRAS.

Also maybe this idea by David Henningsson is good:

""" E g, if the application sets a low period size but also sets the "disable period interrupts" flag, that could be an indicator that it wants lots of interrupts just to update the pointer, but nothing else. Maybe that's even the behaviour today (haven't checked). """

(taken from http://mailman.alsa-project.org/pipermail/alsa-devel/2014-September/080885.h... )

...
Signed-off-by: Pierre-Louis Bossart

pierre-louis.bossart@linux.intel.com

include/sound/pcm.h | 1 + include/uapi/sound/asound.h | 1 + sound/core/pcm_native.c | 8 ++++++++ 3 files changed, 10 insertions(+)

diff --git a/include/sound/pcm.h b/include/sound/pcm.h index 691e7ee..25310b7 100644 --- a/include/sound/pcm.h +++ b/include/sound/pcm.h @@ -370,6 +370,7 @@ struct snd_pcm_runtime { unsigned int rate_num; unsigned int rate_den; unsigned int no_period_wakeup: 1;

unsigned int no_rewinds:1;

/* -- SW params -- */ int tstamp_mode; /* mmap timestamp is updated */

diff --git a/include/uapi/sound/asound.h b/include/uapi/sound/asound.h index a45be6b..b62b162 100644 --- a/include/uapi/sound/asound.h +++ b/include/uapi/sound/asound.h @@ -356,6 +356,7 @@ typedef int snd_pcm_hw_param_t; #define SNDRV_PCM_HW_PARAMS_NORESAMPLE (1<<0) /* avoid rate resampling */ #define SNDRV_PCM_HW_PARAMS_EXPORT_BUFFER (1<<1) /* export buffer */ #define SNDRV_PCM_HW_PARAMS_NO_PERIOD_WAKEUP (1<<2) /* disable period wakeups */ +#define SNDRV_PCM_HW_PARAMS_NO_REWINDS (1<<3) /* disable rewinds */

struct snd_interval { unsigned int min, max; diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c index d126c03..a70e52d 100644 --- a/sound/core/pcm_native.c +++ b/sound/core/pcm_native.c @@ -543,6 +543,8 @@ static int snd_pcm_hw_params(struct snd_pcm_substream *substream, runtime->no_period_wakeup = (params->info & SNDRV_PCM_INFO_NO_PERIOD_WAKEUP) && (params->flags & SNDRV_PCM_HW_PARAMS_NO_PERIOD_WAKEUP);
runtime->no_rewinds =
       params->flags & SNDRV_PCM_HW_PARAMS_NO_REWINDS;

bits = snd_pcm_format_physical_width(runtime->format);
runtime->sample_bits = bits;
@@ -2428,6 +2430,9 @@ static snd_pcm_sframes_t snd_pcm_playback_rewind(struct snd_pcm_substream *subst if (frames == 0) return 0;
if (runtime->no_rewinds)
   return 0;
snd_pcm_stream_lock_irq(substream);
switch (runtime->status->state) {
case SNDRV_PCM_STATE_PREPARED:
@@ -2476,6 +2481,9 @@ static snd_pcm_sframes_t snd_pcm_capture_rewind(struct snd_pcm_substream *substr if (frames == 0) return 0;
if (runtime->no_rewinds)
   return 0;
snd_pcm_stream_lock_irq(substream);
switch (runtime->status->state) {
case SNDRV_PCM_STATE_PREPARED:

Alexander E. Patrakov

5:43 p.m.

New subject: [alsa-devel] [RFC PATCH 1/4] ALSA: core: let low-level driver or userspace disable rewinds

28.07.2015 19:19, Pierre-Louis Bossart wrote:

...

On 7/11/15 12:06 PM, Alexander E. Patrakov wrote:

...

...

I have not seen any justification for the drastic measure of making a

DMA-based device completely unrewindable. Maybe a more polite "please make this a batch/blocktransfer card" request, thus disallowing only sub-period rewinds, would still be useful for powersaving, without killing dmix.

If this "no rewinds" mode is not made the default, then exactly

nobody will use it. Everyone except sound servers opens the default device with the default flags. I understand the potential to break existing userspace, especially PulseAudio, but we really need to think more here.

Not sure I understand the issue. when a new functionality is added it takes time to be adopted. If we can push it in sound servers first then it creates a wide pool of users from day1.

The issue is that the proposed functionality, in the currently proposed "I promise to never rewind" form, is nearly useless by its very nature for any sound server that cares about power consumption significantly more than dmix does. It is indeed usable by JACK (by its design, it doesn't rewind, and uses low latency) and CRAS (which currently doesn't rewind, but I am not sure whether this is a bug or a deliberate decision based on non-public measurements of extra power savings that "rewinds + high latency" would allow).

As I have already explained, dmix, when mixing, writes to the hardware buffer multiple times, which is equivalent to rewinding. PulseAudio uses rewinds for a very specific purpose - to avoid CPU wakeups in the common "nothing unexpected happened" case, i.e. to allow very high average latency while keeping the latency of reaction to unexpected events low. So, by convincing PulseAudio to never rewind and to tell the driver about this fact, you can save some power in the card, but (if the proponents of rewinds are right - see my earlier request for non-public information) will waste way more power due to the need for much more frequent CPU wakeups ("cannot rewind but have to react to unexpected events within 20 ms" = "must wake up every 20 ms").

The only cases where this flag can be useful for sound servers are:

1. Sound servers that already, by design, always waste CPU power by running at low latency;

2. Sound cards like USB audio where the kernel already contains a low-latency thread that copies data to the card in the serialized way.

For PulseAudio on Intel hardware and HDA-compatible card, it would (if the situation on my Sony laptop is an exception and not the rule), as I have already mentioned, only trade some power consumption improvement in the card for bigger losses in the CPU.

And with (2), I am not sure whether this is a win at all: you are likely trading a kernel thread that wakes up, say, every 6 ms (with userspace waking up very rarely), to a userspace wakeup that happens every 6 or maybe 20 ms (as the sound server chooses), but incurs more context switches. It does shift the policy decision about the wakeup rate to userspace, though.

-- Alexander E. Patrakov

Pierre-Louis Bossart

29 Jul 29 Jul

7:46 p.m.

New subject: [alsa-devel] [RFC PATCH 1/4] ALSA: core: let low-level driver or userspace disable rewinds

On 7/28/15 10:43 AM, Alexander E. Patrakov wrote:

...

28.07.2015 19:19, Pierre-Louis Bossart wrote:

...
On 7/11/15 12:06 PM, Alexander E. Patrakov wrote:

...
...

I have not seen any justification for the drastic measure of making a

DMA-based device completely unrewindable. Maybe a more polite "please make this a batch/blocktransfer card" request, thus disallowing only sub-period rewinds, would still be useful for powersaving, without killing dmix.

If this "no rewinds" mode is not made the default, then exactly

nobody will use it. Everyone except sound servers opens the default device with the default flags. I understand the potential to break existing userspace, especially PulseAudio, but we really need to think more here.

Not sure I understand the issue. when a new functionality is added it takes time to be adopted. If we can push it in sound servers first then it creates a wide pool of users from day1.

The issue is that the proposed functionality, in the currently proposed "I promise to never rewind" form, is nearly useless by its very nature for any sound server that cares about power consumption significantly more than dmix does. It is indeed usable by JACK (by its design, it doesn't rewind, and uses low latency) and CRAS (which currently doesn't rewind, but I am not sure whether this is a bug or a deliberate decision based on non-public measurements of extra power savings that "rewinds + high latency" would allow).

As I have already explained, dmix, when mixing, writes to the hardware buffer multiple times, which is equivalent to rewinding. PulseAudio uses rewinds for a very specific purpose - to avoid CPU wakeups in the common "nothing unexpected happened" case, i.e. to allow very high average latency while keeping the latency of reaction to unexpected events low. So, by convincing PulseAudio to never rewind and to tell the driver about this fact, you can save some power in the card, but (if the proponents of rewinds are right - see my earlier request for non-public information) will waste way more power due to the need for much more frequent CPU wakeups ("cannot rewind but have to react to unexpected events within 20 ms" = "must wake up every 20 ms").

The only cases where this flag can be useful for sound servers are:

Sound servers that already, by design, always waste CPU power by

running at low latency;

Your classification is not exhaustive enough. You need to take into account sound servers that have two outputs per endpoint, one for low-latency and one for low-power/deep-buffer with a DSP/hardware mixer. Power is also no longer directly linked to wakeup rates only but also to DDR access patterns. When a larger on-chip buffer is available, disabling rewinds does let the hardware know it can safely fetch data in bigger chunks rather than use small data bursts. This sort of capabilities is becoming prevalent these days, and not just on Intel platforms - see e.g. Nexus5/9 devices -, and maybe PulseAudio and friends need to evolve to make use of these resources rather than stay the course with single output and rewind mechanisms that prevent power optimizations on newer platforms.

...

Sound cards like USB audio where the kernel already contains a

low-latency thread that copies data to the card in the serialized way.

For PulseAudio on Intel hardware and HDA-compatible card, it would (if the situation on my Sony laptop is an exception and not the rule), as I have already mentioned, only trade some power consumption improvement in the card for bigger losses in the CPU.

And with (2), I am not sure whether this is a win at all: you are likely trading a kernel thread that wakes up, say, every 6 ms (with userspace waking up very rarely), to a userspace wakeup that happens every 6 or maybe 20 ms (as the sound server chooses), but incurs more context switches. It does shift the policy decision about the wakeup rate to userspace, though.

Alexander E. Patrakov

30 Jul 30 Jul

8:11 a.m.

New subject: [alsa-devel] [RFC PATCH 1/4] ALSA: core: let low-level driver or userspace disable rewinds

29.07.2015 22:46, Pierre-Louis Bossart wrote:

...

On 7/28/15 10:43 AM, Alexander E. Patrakov wrote:

...
28.07.2015 19:19, Pierre-Louis Bossart wrote:

...
On 7/11/15 12:06 PM, Alexander E. Patrakov wrote:

...
...

I have not seen any justification for the drastic measure of

making a DMA-based device completely unrewindable. Maybe a more polite "please make this a batch/blocktransfer card" request, thus disallowing only sub-period rewinds, would still be useful for powersaving, without killing dmix.

If this "no rewinds" mode is not made the default, then exactly

nobody will use it. Everyone except sound servers opens the default device with the default flags. I understand the potential to break existing userspace, especially PulseAudio, but we really need to think more here.

Not sure I understand the issue. when a new functionality is added it takes time to be adopted. If we can push it in sound servers first then it creates a wide pool of users from day1.

The issue is that the proposed functionality, in the currently proposed "I promise to never rewind" form, is nearly useless by its very nature for any sound server that cares about power consumption significantly more than dmix does. It is indeed usable by JACK (by its design, it doesn't rewind, and uses low latency) and CRAS (which currently doesn't rewind, but I am not sure whether this is a bug or a deliberate decision based on non-public measurements of extra power savings that "rewinds + high latency" would allow).

As I have already explained, dmix, when mixing, writes to the hardware buffer multiple times, which is equivalent to rewinding. PulseAudio uses rewinds for a very specific purpose - to avoid CPU wakeups in the common "nothing unexpected happened" case, i.e. to allow very high average latency while keeping the latency of reaction to unexpected events low. So, by convincing PulseAudio to never rewind and to tell the driver about this fact, you can save some power in the card, but (if the proponents of rewinds are right - see my earlier request for non-public information) will waste way more power due to the need for much more frequent CPU wakeups ("cannot rewind but have to react to unexpected events within 20 ms" = "must wake up every 20 ms").

The only cases where this flag can be useful for sound servers are:

Sound servers that already, by design, always waste CPU power by

running at low latency;

Your classification is not exhaustive enough. You need to take into account sound servers that have two outputs per endpoint, one for low-latency and one for low-power/deep-buffer with a DSP/hardware mixer. Power is also no longer directly linked to wakeup rates only but also to DDR access patterns. When a larger on-chip buffer is available, disabling rewinds does let the hardware know it can safely fetch data in bigger chunks rather than use small data bursts. This sort of capabilities is becoming prevalent these days, and not just on Intel platforms - see e.g. Nexus5/9 devices -, and maybe PulseAudio and friends need to evolve to make use of these resources rather than stay the course with single output and rewind mechanisms that prevent power optimizations on newer platforms.

OK, I see some new points there, but still no full picture. My point still is: deep (>= 30-40 ms) buffer without rewinds or without any other means to seamlessly cancel and replace what's there (up to a certain unrewindable latency, i.e. the same 30-40 ms, that a human won't object to) is absolutely useless to any current or future sound server.

So, in your example, "I promise to never rewind" flag can definitely be set for a low-latency endpoint, but not for the other one. That other endpoint would benefit from a way to say "I promise to never rewind closer than X samples to the hardware pointer" API, which you have flagged as a separate discussion. I would also not object to that endpoint becoming a batch/blocktransfer PCM.

Well, there is a hackish way to say "I promise to never rewind" on both endpoints even with a deep buffer on a low-power endpoint. The way is: instead of rewinding, open the low-latency endpoint and play a correction signal (i.e. the difference between the actual and the wanted contents of the deep buffer) through it, with low latency, and let the hardware mix. But I don't like this, for purely subjective reasons and for the need for that endpoint to have twice as much output amplitude than one would normally need. What do you propose instead?

Still, I think that a more general API that says "I promise to never rewind closer than X samples to the hardware pointer" would be more useful than the black-and-white "I promise to never rewind" call - assuming that it can be expressed to the hardware.

-- Alexander E. Patrakov

Pierre-Louis Bossart

3:46 p.m.

New subject: [alsa-devel] [RFC PATCH 1/4] ALSA: core: let low-level driver or userspace disable rewinds

On 07/30/2015 01:11 AM, Alexander E. Patrakov wrote:

...

29.07.2015 22:46, Pierre-Louis Bossart wrote:

...
On 7/28/15 10:43 AM, Alexander E. Patrakov wrote:

...
28.07.2015 19:19, Pierre-Louis Bossart wrote:

...
On 7/11/15 12:06 PM, Alexander E. Patrakov wrote:

...
...

I have not seen any justification for the drastic measure

of making a DMA-based device completely unrewindable. Maybe a more polite "please make this a batch/blocktransfer card" request, thus disallowing only sub-period rewinds, would still be useful for powersaving, without killing dmix.

If this "no rewinds" mode is not made the default, then

exactly nobody will use it. Everyone except sound servers opens the default device with the default flags. I understand the potential to break existing userspace, especially PulseAudio, but we really need to think more here.

Not sure I understand the issue. when a new functionality is added it takes time to be adopted. If we can push it in sound servers first then it creates a wide pool of users from day1.

The issue is that the proposed functionality, in the currently proposed "I promise to never rewind" form, is nearly useless by its very nature for any sound server that cares about power consumption significantly more than dmix does. It is indeed usable by JACK (by its design, it doesn't rewind, and uses low latency) and CRAS (which currently doesn't rewind, but I am not sure whether this is a bug or a deliberate decision based on non-public measurements of extra power savings that "rewinds + high latency" would allow).

As I have already explained, dmix, when mixing, writes to the hardware buffer multiple times, which is equivalent to rewinding. PulseAudio uses rewinds for a very specific purpose - to avoid CPU wakeups in the common "nothing unexpected happened" case, i.e. to allow very high average latency while keeping the latency of reaction to unexpected events low. So, by convincing PulseAudio to never rewind and to tell the driver about this fact, you can save some power in the card, but (if the proponents of rewinds are right - see my earlier request for non-public information) will waste way more power due to the need for much more frequent CPU wakeups ("cannot rewind but have to react to unexpected events within 20 ms" = "must wake up every 20 ms").

The only cases where this flag can be useful for sound servers are:

Sound servers that already, by design, always waste CPU power

by running at low latency;

Your classification is not exhaustive enough. You need to take into account sound servers that have two outputs per endpoint, one for low-latency and one for low-power/deep-buffer with a DSP/hardware mixer. Power is also no longer directly linked to wakeup rates only but also to DDR access patterns. When a larger on-chip buffer is available, disabling rewinds does let the hardware know it can safely fetch data in bigger chunks rather than use small data bursts. This sort of capabilities is becoming prevalent these days, and not just on Intel platforms - see e.g. Nexus5/9 devices -, and maybe PulseAudio and friends need to evolve to make use of these resources rather than stay the course with single output and rewind mechanisms that prevent power optimizations on newer platforms.

OK, I see some new points there, but still no full picture. My point still is: deep (>= 30-40 ms) buffer without rewinds or without any other means to seamlessly cancel and replace what's there (up to at certain unrewindable latency, i.e. the same 30-40 ms, that a human won't object to) is absolutely useless to any current or future sound server.

So, in your example, "I promise to never rewind" flag can definitely be set for a low-latency endpoint, but not for the other one. That other endpoint would benefit from a way to say "I promise to never rewind closer than X samples to the hardware pointer" API, which youen have flagged as a separate discussion. I would also not object to that endpoint becoming a batch/blocktransfer PCM.

You are using low-latency with two different meanings : one is a fixed low-latency for music applications, where rewinds are indeed useless. But there is also a case for variable latencies when you are sharing the same output between apps having different needs, and that could also be used with very limited buffering.

Also for the deep-buffer solution, as long as you dedicate this output to a single media consumption app (playback, video) then you have no need for rewinds.

In other words, whenever an output is used in a exclusive manner without being shared between apps having different latency needs then you can optimize. For everything else keep using rewinds with the max_burst information to know by how much you can rewind (and accepting that some power optimizations linked to data transfers will not happen)

Also the notion of block transfer is too limiting, the hardware granularity may be smaller than a period and the hardware may still be able to provide precise data on the hw_ptr and delays, so the traditional batch/block transfer definition is a tad obsolete.

...

Well, there s a hackish way to say "I promisle to never rewind" on both endpoints even with a deep buffer on a low-power endpoint. The way is: instead of rewinding, open the low-latency endpoint and play a correction signal (i.e. the difference between the actual and the wanted contents of the deep buffer) through it, with low latency, and let the hardware mix. But I don't like this, for purely subjective reasons and for the need for that endpoint to have twice as much output amplitude than one would normally need. What do you propose instead?

Still, I think that a more general API that says "I promise to never rewind closer than X samples to the hardware pointer" would be morep useful than the black-and-white "I promise to never rewind" call assuming that it can be expressed to the hardware.

the patches provide information on the max burst so you can know by how much to rewind. Setting the max burst based on a negotiation between driver and app is an interesting concept that we looked at but there aren't too many devices that support this capability so this might not be worth the effort.

Pierre-Louis Bossart

8 Jul 8 Jul

12:10 p.m.

New subject: [alsa-devel] [RFC PATCH 2/4] ALSA: core: add .notify callback for pcm ops

When appl_ptr is updated let low-level driver know.

This is only enabled when the NO_REWIND hardware flag is used, so that the low-level driver/hardware to opportunistically pre-fetch data.

FIXME: should we rely on .ack for this? Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com --- include/sound/pcm.h | 2 ++ sound/core/pcm_native.c | 18 +++++++++++++++++- 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/include/sound/pcm.h b/include/sound/pcm.h index 25310b7..d5eff03 100644 --- a/include/sound/pcm.h +++ b/include/sound/pcm.h @@ -87,6 +87,8 @@ struct snd_pcm_ops { unsigned long offset); int (*mmap)(struct snd_pcm_substream *substream, struct vm_area_struct *vma); int (*ack)(struct snd_pcm_substream *substream); + int (*notify)(struct snd_pcm_substream *substream); + /* FIXME: what's the difference between ack and notify ? */ };

/* diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c index a70e52d..dd519b8 100644 --- a/sound/core/pcm_native.c +++ b/sound/core/pcm_native.c @@ -2565,6 +2565,10 @@ static snd_pcm_sframes_t snd_pcm_playback_forward(struct snd_pcm_substream *subs appl_ptr -= runtime->boundary; runtime->control->appl_ptr = appl_ptr; ret = frames; + + if (runtime->no_rewinds && substream->ops->notify) + substream->ops->notify(substream); + __end: snd_pcm_stream_unlock_irq(substream); return ret; @@ -2614,6 +2618,10 @@ static snd_pcm_sframes_t snd_pcm_capture_forward(struct snd_pcm_substream *subst appl_ptr -= runtime->boundary; runtime->control->appl_ptr = appl_ptr; ret = frames; + + if (runtime->no_rewinds && substream->ops->notify) + substream->ops->notify(substream); + __end: snd_pcm_stream_unlock_irq(substream); return ret; @@ -2713,8 +2721,16 @@ static int snd_pcm_sync_ptr(struct snd_pcm_substream *substream, return err; } snd_pcm_stream_lock_irq(substream); - if (!(sync_ptr.flags & SNDRV_PCM_SYNC_PTR_APPL)) + if (!(sync_ptr.flags & SNDRV_PCM_SYNC_PTR_APPL)) { + /* FIXME: this code is used by mmap_commit, should it handle boundary + * wrap-around as done for read/write in pcm_lib.c + */ + control->appl_ptr = sync_ptr.c.control.appl_ptr; + /* if supported, let low-level driver know about appl_ptr change */ + if (runtime->no_rewinds && substream->ops->notify) + substream->ops->notify(substream); + } else sync_ptr.c.control.appl_ptr = control->appl_ptr; if (!(sync_ptr.flags & SNDRV_PCM_SYNC_PTR_AVAIL_MIN))

-- 1.9.1

Takashi Iwai

4:27 p.m.

New subject: [alsa-devel] [RFC PATCH 2/4] ALSA: core: add .notify callback for pcm ops

On Wed, 08 Jul 2015 12:10:34 +0200, Pierre-Louis Bossart wrote:

...

When appl_ptr is updated let low-level driver know.

This is only enabled when the NO_REWIND hardware flag is used, so that the low-level driver/hardware to opportunistically pre-fetch data.

FIXME: should we rely on .ack for this? Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com

Hmm, OK, so the forward is allowed but with workarounds... But then why rewind won't work in a similar way? DSP might be able to cancel some of inflight data.

In other words, I see no reason to strict notify callback only for no_rewinds. This is an optional ops in anyway.

Also, I find the name "notify" a bit too ambiguous. In this case, it's notifying the applptr change. So, a name related with the function would be more understandable.

thanks,

Takashi

...

include/sound/pcm.h | 2 ++ sound/core/pcm_native.c | 18 +++++++++++++++++- 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/include/sound/pcm.h b/include/sound/pcm.h index 25310b7..d5eff03 100644 --- a/include/sound/pcm.h +++ b/include/sound/pcm.h @@ -87,6 +87,8 @@ struct snd_pcm_ops { unsigned long offset); int (*mmap)(struct snd_pcm_substream *substream, struct vm_area_struct *vma); int (*ack)(struct snd_pcm_substream *substream);

int (*notify)(struct snd_pcm_substream *substream);

/* FIXME: what's the difference between ack and notify ? */

};

/* diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c index a70e52d..dd519b8 100644 --- a/sound/core/pcm_native.c +++ b/sound/core/pcm_native.c @@ -2565,6 +2565,10 @@ static snd_pcm_sframes_t snd_pcm_playback_forward(struct snd_pcm_substream *subs appl_ptr -= runtime->boundary; runtime->control->appl_ptr = appl_ptr; ret = frames;
if (runtime->no_rewinds && substream->ops->notify)
substream->ops->notify(substream);
__end: snd_pcm_stream_unlock_irq(substream); return ret;
@@ -2614,6 +2618,10 @@ static snd_pcm_sframes_t snd_pcm_capture_forward(struct snd_pcm_substream *subst appl_ptr -= runtime->boundary; runtime->control->appl_ptr = appl_ptr; ret = frames;
if (runtime->no_rewinds && substream->ops->notify)
substream->ops->notify(substream);
__end: snd_pcm_stream_unlock_irq(substream); return ret;
@@ -2713,8 +2721,16 @@ static int snd_pcm_sync_ptr(struct snd_pcm_substream *substream, return err; } snd_pcm_stream_lock_irq(substream);

if (!(sync_ptr.flags & SNDRV_PCM_SYNC_PTR_APPL))
if (!(sync_ptr.flags & SNDRV_PCM_SYNC_PTR_APPL)) {
/* FIXME: this code is used by mmap_commit, should it handle boundary
 *   wrap-around as done for read/write in pcm_lib.c
 */
control->appl_ptr = sync_ptr.c.control.appl_ptr;
/* if supported, let low-level driver know about appl_ptr change */
if (runtime->no_rewinds && substream->ops->notify)
	substream->ops->notify(substream);
} else sync_ptr.c.control.appl_ptr = control->appl_ptr; if (!(sync_ptr.flags & SNDRV_PCM_SYNC_PTR_AVAIL_MIN))
-- 1.9.1

Alsa-devel mailing list Alsa-devel@alsa-project.org http://mailman.alsa-project.org/mailman/listinfo/alsa-devel

Pierre-Louis Bossart

7:10 p.m.

New subject: [alsa-devel] [RFC PATCH 2/4] ALSA: core: add .notify callback for pcm ops

...

...
This is only enabled when the NO_REWIND hardware flag is used, so that the low-level driver/hardware to opportunistically pre-fetch data.

FIXME: should we rely on .ack for this? Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com

Hmm, OK, so the forward is allowed but with workarounds... But then why rewind won't work in a similar way? DSP might be able to cancel some of inflight data.

Nope, this is explicitly not supported, so unfortunately if we want to optimize for power and let hardware fetch data when it's most appropriate rewinds need to be disabled.

...

In other words, I see no reason to strict notify callback only for no_rewinds. This is an optional ops in anyway.

It's fine to remove the check. I added this based on internal review comments but I agree with your point.

...

Also, I find the name "notify" a bit too ambiguous. In this case, it's notifying the applptr change. So, a name related with the function would be more understandable.

The first open I had was to know if we could use .ack for this? if a different callback is needed, we can use 'appl_ptr_update' instead of 'notify'

...

thanks,

Takashi

...
include/sound/pcm.h | 2 ++ sound/core/pcm_native.c | 18 +++++++++++++++++- 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/include/sound/pcm.h b/include/sound/pcm.h index 25310b7..d5eff03 100644 --- a/include/sound/pcm.h +++ b/include/sound/pcm.h @@ -87,6 +87,8 @@ struct snd_pcm_ops { unsigned long offset); int (*mmap)(struct snd_pcm_substream *substream, struct vm_area_struct *vma); int (*ack)(struct snd_pcm_substream *substream);

int (*notify)(struct snd_pcm_substream *substream);

/* FIXME: what's the difference between ack and notify ? */ };

/*

diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c index a70e52d..dd519b8 100644 --- a/sound/core/pcm_native.c +++ b/sound/core/pcm_native.c @@ -2565,6 +2565,10 @@ static snd_pcm_sframes_t snd_pcm_playback_forward(struct snd_pcm_substream *subs appl_ptr -= runtime->boundary; runtime->control->appl_ptr = appl_ptr; ret = frames;
if (runtime->no_rewinds && substream->ops->notify)
substream->ops->notify(substream);
__end: snd_pcm_stream_unlock_irq(substream); return ret;
@@ -2614,6 +2618,10 @@ static snd_pcm_sframes_t snd_pcm_capture_forward(struct snd_pcm_substream *subst appl_ptr -= runtime->boundary; runtime->control->appl_ptr = appl_ptr; ret = frames;
if (runtime->no_rewinds && substream->ops->notify)
substream->ops->notify(substream);
__end: snd_pcm_stream_unlock_irq(substream); return ret;
@@ -2713,8 +2721,16 @@ static int snd_pcm_sync_ptr(struct snd_pcm_substream *substream, return err; } snd_pcm_stream_lock_irq(substream);

if (!(sync_ptr.flags & SNDRV_PCM_SYNC_PTR_APPL))
if (!(sync_ptr.flags & SNDRV_PCM_SYNC_PTR_APPL)) {
/* FIXME: this code is used by mmap_commit, should it handle boundary
 *   wrap-around as done for read/write in pcm_lib.c
 */
control->appl_ptr = sync_ptr.c.control.appl_ptr;
/* if supported, let low-level driver know about appl_ptr change */
if (runtime->no_rewinds && substream->ops->notify)
	substream->ops->notify(substream);
} else sync_ptr.c.control.appl_ptr = control->appl_ptr; if (!(sync_ptr.flags & SNDRV_PCM_SYNC_PTR_AVAIL_MIN))
-- 1.9.1

Alsa-devel mailing list Alsa-devel@alsa-project.org http://mailman.alsa-project.org/mailman/listinfo/alsa-devel
Alsa-devel mailing list Alsa-devel@alsa-project.org http://mailman.alsa-project.org/mailman/listinfo/alsa-devel

Takashi Iwai

9 Jul 9 Jul

4:44 p.m.

New subject: [alsa-devel] [RFC PATCH 2/4] ALSA: core: add .notify callback for pcm ops

On Wed, 08 Jul 2015 19:10:32 +0200, Pierre-Louis Bossart wrote:

...

...
...
This is only enabled when the NO_REWIND hardware flag is used, so that the low-level driver/hardware to opportunistically pre-fetch data.

FIXME: should we rely on .ack for this? Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com

Hmm, OK, so the forward is allowed but with workarounds... But then why rewind won't work in a similar way? DSP might be able to cancel some of inflight data.

Nope, this is explicitly not supported, so unfortunately if we want to optimize for power and let hardware fetch data when it's most appropriate rewinds need to be disabled.

...
In other words, I see no reason to strict notify callback only for no_rewinds. This is an optional ops in anyway.

It's fine to remove the check. I added this based on internal review comments but I agree with your point.

OK, then let's treat the NO_REWIND and new ops individually.

...

...
Also, I find the name "notify" a bit too ambiguous. In this case, it's notifying the applptr change. So, a name related with the function would be more understandable.

The first open I had was to know if we could use .ack for this? if a different callback is needed, we can use 'appl_ptr_update' instead of 'notify'

As there are no many users of ack callback, I don't mind to reuse it. But then we need to extend the function to receive a new argument indicating the type of ack, right?

thanks,

Takashi

Raymond Yau

9:25 a.m.

New subject: [alsa-devel] [RFC PATCH 2/4] ALSA: core: add .notify callback for pcm ops

...

...
When appl_ptr is updated let low-level driver know.

This is only enabled when the NO_REWIND hardware flag is used, so that the low-level driver/hardware to opportunistically pre-fetch data.

FIXME: should we rely on .ack for this? Signed-off-by: Pierre-Louis Bossart <

pierre-louis.bossart@linux.intel.com>

...

Hmm, OK, so the forward is allowed but with workarounds... But then why rewind won't work in a similar way? DSP might be able to cancel some of inflight data.

In other words, I see no reason to strict notify callback only for no_rewinds. This is an optional ops in anyway.

Also, I find the name "notify" a bit too ambiguous. In this case, it's notifying the applptr change. So, a name related with the function would be more understandable.

If driver specify no rewind flag, should alsa lib

1) return error when application call snd_pcm_rewind() and snd_pcm_forward() ? 2) return zero when call snd_pcm_rewindable() and snd_pcm_forwardable()

How can the application recover when hw_ptr is behind appl_ptr when stop threshold is set to boundary ?

Do you mean compressed audio stream don't support rewind and forward ?

Pierre-Louis Bossart

9:35 a.m.

New subject: [alsa-devel] [RFC PATCH 2/4] ALSA: core: add .notify callback for pcm ops

On 7/9/15 9:25 AM, Raymond Yau wrote:

...

...
...
When appl_ptr is updated let low-level driver know.

This is only enabled when the NO_REWIND hardware flag is used, so that the low-level driver/hardware to opportunistically pre-fetch data.

FIXME: should we rely on .ack for this? Signed-off-by: Pierre-Louis Bossart <

pierre-louis.bossart@linux.intel.com>

...
Hmm, OK, so the forward is allowed but with workarounds... But then why rewind won't work in a similar way? DSP might be able to cancel some of inflight data.

In other words, I see no reason to strict notify callback only for no_rewinds. This is an optional ops in anyway.

Also, I find the name "notify" a bit too ambiguous. In this case, it's notifying the applptr change. So, a name related with the function would be more understandable.

If driver specify no rewind flag, should alsa lib

return error when application call snd_pcm_rewind() and

snd_pcm_forward() ?

no, it would return 0 on rewind and the number of frames on forward. In other words the value returned is consistent with the function prototype which has no scope for errors.

...

return zero when call snd_pcm_rewindable() and snd_pcm_forwardable()

again zero for rewind and the actual number for forward.

...

How can the application recover when hw_ptr is behind appl_ptr when stop threshold is set to boundary ?

Don't understand the question, there is no change here.

...

Do you mean compressed audio stream don't support rewind and forward ?

You can't rewind in a compressed bitstream in general without losing sync or missing state information when there are dependencies between frames (linear prediction or filter with history). Forward is the same. Some encoders allow for skips to well identified markers but random access is not possible or desired in general.

...

Alsa-devel mailing list Alsa-devel@alsa-project.org http://mailman.alsa-project.org/mailman/listinfo/alsa-devel

Pierre-Louis Bossart

8 Jul 8 Jul

12:10 p.m.

New subject: [alsa-devel] [RFC PATCH 3/4] ALSA: core: add report of max dma burst

Report how many bytes the DMA can burst before updating the hw_ptr. This can help fix two issues with stale data discussed many times over on the alsa-devel mailing list (refilling/reading ring buffer too late or rewinding too close to the hw_ptr)

FIXME: 1. this was copy/pasted/edited based on the fifo_size fields, not sure why we would need this IOCTL1 2. we also need the ability to set the actual fifo size, if suppported, by the hardware, to negociate the prefetch amount between application and driver. By default the default should be max buffering to allow for lower power, but for low-latency use cases we may want to reduce the amount of prefetching.

Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com --- include/sound/pcm.h | 2 ++ include/uapi/sound/asound.h | 5 +++-- sound/core/pcm_lib.c | 19 +++++++++++++++++++ sound/core/pcm_native.c | 7 +++++++ 4 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/include/sound/pcm.h b/include/sound/pcm.h index d5eff03..57c0571 100644 --- a/include/sound/pcm.h +++ b/include/sound/pcm.h @@ -56,6 +56,7 @@ struct snd_pcm_hardware { unsigned int periods_min; /* min # of periods */ unsigned int periods_max; /* max # of periods */ size_t fifo_size; /* fifo size in bytes */ + unsigned int max_dma_burst; /* max DMA in-flight bytes, linked to hw_ptr precision/fuzziness */ };

struct snd_pcm_substream; @@ -106,6 +107,7 @@ struct snd_pcm_ops { #define SNDRV_PCM_IOCTL1_CHANNEL_INFO 2 #define SNDRV_PCM_IOCTL1_GSTATE 3 #define SNDRV_PCM_IOCTL1_FIFO_SIZE 4 +#define SNDRV_PCM_IOCTL1_MAX_DMA_BURST 5

#define SNDRV_PCM_TRIGGER_STOP 0 #define SNDRV_PCM_TRIGGER_START 1 diff --git a/include/uapi/sound/asound.h b/include/uapi/sound/asound.h index b62b162..3dc049a 100644 --- a/include/uapi/sound/asound.h +++ b/include/uapi/sound/asound.h @@ -386,8 +386,9 @@ struct snd_pcm_hw_params { unsigned int msbits; /* R: used most significant bits */ unsigned int rate_num; /* R: rate numerator */ unsigned int rate_den; /* R: rate denominator */ - snd_pcm_uframes_t fifo_size; /* R: chip FIFO size in frames */ - unsigned char reserved[64]; /* reserved for future */ + snd_pcm_uframes_t fifo_size; /* R: chip FIFO size in frames, indicates buffering after hw_ptr */ + unsigned int max_dma_burst; /* R: max in-flight bytes, indicates buffering before hw_ptr */ + unsigned char reserved[60]; /* reserved for future */ };

enum { diff --git a/sound/core/pcm_lib.c b/sound/core/pcm_lib.c index 7d45645..dc89aa0 100644 --- a/sound/core/pcm_lib.c +++ b/sound/core/pcm_lib.c @@ -1826,6 +1826,22 @@ static int snd_pcm_lib_ioctl_fifo_size(struct snd_pcm_substream *substream, return 0; }

+static int snd_pcm_lib_ioctl_max_dma_burst(struct snd_pcm_substream *substream, + void *arg) +{ + struct snd_pcm_hw_params *params = arg; + + /* FIXME: add sanity checks: + * max burst < ring buffer + * max burst >= min_period_size + * not sure if we can check against period sizes? + * any others ? + */ + params->max_dma_burst = substream->runtime->hw.max_dma_burst; + + return 0; +} + /** * snd_pcm_lib_ioctl - a generic PCM ioctl callback * @substream: the pcm substream instance @@ -1849,6 +1865,9 @@ int snd_pcm_lib_ioctl(struct snd_pcm_substream *substream, return snd_pcm_lib_ioctl_channel_info(substream, arg); case SNDRV_PCM_IOCTL1_FIFO_SIZE: return snd_pcm_lib_ioctl_fifo_size(substream, arg); + case SNDRV_PCM_IOCTL1_MAX_DMA_BURST: + return snd_pcm_lib_ioctl_max_dma_burst(substream, arg); + } return -ENXIO; } diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c index dd519b8..3d379b8 100644 --- a/sound/core/pcm_native.c +++ b/sound/core/pcm_native.c @@ -280,6 +280,7 @@ int snd_pcm_hw_refine(struct snd_pcm_substream *substream,

params->info = 0; params->fifo_size = 0; + params->max_dma_burst = 0; if (params->rmask & (1 << SNDRV_PCM_HW_PARAM_SAMPLE_BITS)) params->msbits = 0; if (params->rmask & (1 << SNDRV_PCM_HW_PARAM_RATE)) { @@ -437,6 +438,12 @@ int snd_pcm_hw_refine(struct snd_pcm_substream *substream, return changed; } } + if (!params->max_dma_burst) { + changed = substream->ops->ioctl(substream, + SNDRV_PCM_IOCTL1_MAX_DMA_BURST, params); + if (changed < 0) + return changed; + } params->rmask = 0; return 0; }

-- 1.9.1

Takashi Iwai

4:37 p.m.

New subject: [alsa-devel] [RFC PATCH 3/4] ALSA: core: add report of max dma burst

On Wed, 08 Jul 2015 12:10:35 +0200, Pierre-Louis Bossart wrote:

...

Report how many bytes the DMA can burst before updating the hw_ptr. This can help fix two issues with stale data discussed many times over on the alsa-devel mailing list (refilling/reading ring buffer too late or rewinding too close to the hw_ptr)

FIXME:

this was copy/pasted/edited based on the fifo_size fields, not

sure why we would need this IOCTL1

fifo_size would fit, but it means that it also changes the current behavior. I don't believe that currently there are many programs relying on this value, but who knows.

...

we also need the ability to set the actual fifo size, if suppported,

by the hardware, to negociate the prefetch amount between application and driver. By default the default should be max buffering to allow for lower power, but for low-latency use cases we may want to reduce the amount of prefetching.

Right. So a hw_parmas field looks suitable for that purpose.

Takashi

...

Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com

include/sound/pcm.h | 2 ++ include/uapi/sound/asound.h | 5 +++-- sound/core/pcm_lib.c | 19 +++++++++++++++++++ sound/core/pcm_native.c | 7 +++++++ 4 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/include/sound/pcm.h b/include/sound/pcm.h index d5eff03..57c0571 100644 --- a/include/sound/pcm.h +++ b/include/sound/pcm.h @@ -56,6 +56,7 @@ struct snd_pcm_hardware { unsigned int periods_min; /* min # of periods */ unsigned int periods_max; /* max # of periods */ size_t fifo_size; /* fifo size in bytes */

unsigned int max_dma_burst; /* max DMA in-flight bytes, linked to hw_ptr precision/fuzziness */

};

struct snd_pcm_substream; @@ -106,6 +107,7 @@ struct snd_pcm_ops { #define SNDRV_PCM_IOCTL1_CHANNEL_INFO 2 #define SNDRV_PCM_IOCTL1_GSTATE 3 #define SNDRV_PCM_IOCTL1_FIFO_SIZE 4 +#define SNDRV_PCM_IOCTL1_MAX_DMA_BURST 5

#define SNDRV_PCM_TRIGGER_STOP 0 #define SNDRV_PCM_TRIGGER_START 1 diff --git a/include/uapi/sound/asound.h b/include/uapi/sound/asound.h index b62b162..3dc049a 100644 --- a/include/uapi/sound/asound.h +++ b/include/uapi/sound/asound.h @@ -386,8 +386,9 @@ struct snd_pcm_hw_params { unsigned int msbits; /* R: used most significant bits */ unsigned int rate_num; /* R: rate numerator */ unsigned int rate_den; /* R: rate denominator */

snd_pcm_uframes_t fifo_size; /* R: chip FIFO size in frames */

unsigned char reserved[64]; /* reserved for future */

snd_pcm_uframes_t fifo_size; /* R: chip FIFO size in frames, indicates buffering after hw_ptr */

unsigned int max_dma_burst; /* R: max in-flight bytes, indicates buffering before hw_ptr */

unsigned char reserved[60]; /* reserved for future */

};

enum { diff --git a/sound/core/pcm_lib.c b/sound/core/pcm_lib.c index 7d45645..dc89aa0 100644 --- a/sound/core/pcm_lib.c +++ b/sound/core/pcm_lib.c @@ -1826,6 +1826,22 @@ static int snd_pcm_lib_ioctl_fifo_size(struct snd_pcm_substream *substream, return 0; }

+static int snd_pcm_lib_ioctl_max_dma_burst(struct snd_pcm_substream *substream,
			      void *arg)
+{
struct snd_pcm_hw_params *params = arg;

/* FIXME: add sanity checks:
* max burst < ring buffer
* max burst >= min_period_size
* not sure if we can check against period sizes?
* any others ?
*/
params->max_dma_burst = substream->runtime->hw.max_dma_burst;

return 0;
+}

/**

snd_pcm_lib_ioctl - a generic PCM ioctl callback

@substream: the pcm substream instance

@@ -1849,6 +1865,9 @@ int snd_pcm_lib_ioctl(struct snd_pcm_substream *substream, return snd_pcm_lib_ioctl_channel_info(substream, arg); case SNDRV_PCM_IOCTL1_FIFO_SIZE: return snd_pcm_lib_ioctl_fifo_size(substream, arg);
case SNDRV_PCM_IOCTL1_MAX_DMA_BURST:
return snd_pcm_lib_ioctl_max_dma_burst(substream, arg);
} return -ENXIO;
} diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c index dd519b8..3d379b8 100644 --- a/sound/core/pcm_native.c +++ b/sound/core/pcm_native.c @@ -280,6 +280,7 @@ int snd_pcm_hw_refine(struct snd_pcm_substream *substream,

params->info = 0; params->fifo_size = 0;

params->max_dma_burst = 0; if (params->rmask & (1 << SNDRV_PCM_HW_PARAM_SAMPLE_BITS)) params->msbits = 0; if (params->rmask & (1 << SNDRV_PCM_HW_PARAM_RATE)) {

@@ -437,6 +438,12 @@ int snd_pcm_hw_refine(struct snd_pcm_substream *substream, return changed; } }
if (!params->max_dma_burst) {
changed = substream->ops->ioctl(substream,
				SNDRV_PCM_IOCTL1_MAX_DMA_BURST, params);
if (changed < 0)
	return changed;
} params->rmask = 0; return 0;
}

1.9.1

Alsa-devel mailing list Alsa-devel@alsa-project.org http://mailman.alsa-project.org/mailman/listinfo/alsa-devel

Pierre-Louis Bossart

7:46 p.m.

New subject: [alsa-devel] [RFC PATCH 3/4] ALSA: core: add report of max dma burst

...

...
FIXME:

this was copy/pasted/edited based on the fifo_size fields, not

sure why we would need this IOCTL1

fifo_size would fit, but it means that it also changes the current behavior. I don't believe that currently there are many programs relying on this value, but who knows.

I saw a mention of fifo_size in the VIA HDA controller, that's why I went ahead with a different field. The fifo_size could be useful as a max value for the internal hardware delay, when app use the 'delay' field in the status structure they get a very dynamic value that isn't necessarily straightforward to use.

...

...

we also need the ability to set the actual fifo size, if suppported,

by the hardware, to negociate the prefetch amount between application and driver. By default the default should be max buffering to allow for lower power, but for low-latency use cases we may want to reduce the amount of prefetching.

Right. So a hw_parmas field looks suitable for that purpose.

That 'set' capability is a lot more complicated, I am really having a hard time with all the constraints for hw_params and how to represent min,max and step values... If anyone is willing to help on that part I wouldn't mind, this is really a part of ALSA I never looked into...

Raymond Yau

10 Jul 10 Jul

4:35 a.m.

New subject: [alsa-devel] [RFC PATCH 3/4] ALSA: core: add report of max dma burst

...

...
...
FIXME:

this was copy/pasted/edited based on the fifo_size fields, not

sure why we would need this IOCTL1

fifo_size would fit, but it means that it also changes the current behavior. I don't believe that currently there are many programs relying on this value, but who knows.

I saw a mention of fifo_size in the VIA HDA controller, that's why I went

ahead with a different field. The fifo_size could be useful as a max value for the internal hardware delay, when app use the 'delay' field in the status structure they get a very dynamic value that isn't necessarily straightforward to use.

...

...
...

we also need the ability to set the actual fifo size, if suppported,

by the hardware, to negociate the prefetch amount between application and driver. By default the default should be max buffering to allow for lower power, but for low-latency use cases we may want to reduce the amount of prefetching.

Right. So a hw_parmas field looks suitable for that purpose.

That 'set' capability is a lot more complicated, I am really having a

hard time with all the constraints for hw_params and how to represent min,max and step values... If anyone is willing to help on that part I wouldn't mind, this is really a part of ALSA I never looked into...

What is the usage of runtime->dma_bytes in snd_pcm_hw_constraints_complete() in core/pcm_native.c ?

Was the usage similar to 128 bytes alignment ?

/* FIXME: remove */ if (runtime->dma_bytes) { err = snd_pcm_hw_constraint_minmax(runtime, SNDRV_PCM_HW_PARAM_BUFFER_BYTES, 0, runtime->dma_bytes); if (err < 0) return err; }

Lars-Peter Clausen

7:13 p.m.

New subject: [alsa-devel] [RFC PATCH 3/4] ALSA: core: add report of max dma burst

On 07/10/2015 04:35 AM, Raymond Yau wrote: [...]

...

What is the usage of runtime->dma_bytes in snd_pcm_hw_constraints_complete() in core/pcm_native.c ?

Was the usage similar to 128 bytes alignment ?

/* FIXME: remove */ if (runtime->dma_bytes) { err = snd_pcm_hw_constraint_minmax(runtime, SNDRV_PCM_HW_PARAM_BUFFER_BYTES, 0, runtime->dma_bytes); if (err < 0) return err; }

The FIXME is probably correct. dma_bytes should always be 0 at that point. This is a freshly allocated runtime struct and dma_bytes only gets set once a buffer is allocated, which is done later on.

- Lars

Alexander E. Patrakov

11 Jul 11 Jul

7:46 p.m.

New subject: [alsa-devel] [RFC PATCH 3/4] ALSA: core: add report of max dma burst

08.07.2015 19:37, Takashi Iwai wrote:

...

On Wed, 08 Jul 2015 12:10:35 +0200, Pierre-Louis Bossart wrote:

...
Report how many bytes the DMA can burst before updating the hw_ptr. This can help fix two issues with stale data discussed many times over on the alsa-devel mailing list (refilling/reading ring buffer too late or rewinding too close to the hw_ptr)

FIXME:

this was copy/pasted/edited based on the fifo_size fields, not

sure why we would need this IOCTL1

fifo_size would fit, but it means that it also changes the current behavior. I don't believe that currently there are many programs relying on this value, but who knows.

Debian Code Search knows, if that matters. Result: it is used by bindings, wrappers and reimplementations only (lush, oss4, alsa-lib itself, pyglet, libtritonus-java).

https://codesearch.debian.net/results/snd_pcm_hw_params_get_fifo_size/page_0

https://codesearch.debian.net/results/snd_pcm_hw_params_get_fifo_size/page_1

...

...

we also need the ability to set the actual fifo size, if suppported,

by the hardware, to negociate the prefetch amount between application and driver. By default the default should be max buffering to allow for lower power, but for low-latency use cases we may want to reduce the amount of prefetching.

Right. So a hw_parmas field looks suitable for that purpose.

Takashi

...
Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com

include/sound/pcm.h | 2 ++ include/uapi/sound/asound.h | 5 +++-- sound/core/pcm_lib.c | 19 +++++++++++++++++++ sound/core/pcm_native.c | 7 +++++++ 4 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/include/sound/pcm.h b/include/sound/pcm.h index d5eff03..57c0571 100644 --- a/include/sound/pcm.h +++ b/include/sound/pcm.h @@ -56,6 +56,7 @@ struct snd_pcm_hardware { unsigned int periods_min; /* min # of periods */ unsigned int periods_max; /* max # of periods */ size_t fifo_size; /* fifo size in bytes */

unsigned int max_dma_burst; /* max DMA in-flight bytes, linked to hw_ptr precision/fuzziness */ };

struct snd_pcm_substream;

@@ -106,6 +107,7 @@ struct snd_pcm_ops { #define SNDRV_PCM_IOCTL1_CHANNEL_INFO 2 #define SNDRV_PCM_IOCTL1_GSTATE 3 #define SNDRV_PCM_IOCTL1_FIFO_SIZE 4 +#define SNDRV_PCM_IOCTL1_MAX_DMA_BURST 5

#define SNDRV_PCM_TRIGGER_STOP 0 #define SNDRV_PCM_TRIGGER_START 1 diff --git a/include/uapi/sound/asound.h b/include/uapi/sound/asound.h index b62b162..3dc049a 100644 --- a/include/uapi/sound/asound.h +++ b/include/uapi/sound/asound.h @@ -386,8 +386,9 @@ struct snd_pcm_hw_params { unsigned int msbits; /* R: used most significant bits */ unsigned int rate_num; /* R: rate numerator */ unsigned int rate_den; /* R: rate denominator */

snd_pcm_uframes_t fifo_size; /* R: chip FIFO size in frames */

unsigned char reserved[64]; /* reserved for future */

snd_pcm_uframes_t fifo_size; /* R: chip FIFO size in frames, indicates buffering after hw_ptr */

unsigned int max_dma_burst; /* R: max in-flight bytes, indicates buffering before hw_ptr */

unsigned char reserved[60]; /* reserved for future */ };

enum {

diff --git a/sound/core/pcm_lib.c b/sound/core/pcm_lib.c index 7d45645..dc89aa0 100644 --- a/sound/core/pcm_lib.c +++ b/sound/core/pcm_lib.c @@ -1826,6 +1826,22 @@ static int snd_pcm_lib_ioctl_fifo_size(struct snd_pcm_substream *substream, return 0; }

+static int snd_pcm_lib_ioctl_max_dma_burst(struct snd_pcm_substream *substream,
			      void *arg)
+{
struct snd_pcm_hw_params *params = arg;

/* FIXME: add sanity checks:
* max burst < ring buffer
* max burst >= min_period_size
* not sure if we can check against period sizes?
* any others ?
*/
params->max_dma_burst = substream->runtime->hw.max_dma_burst;

return 0;
+}

/**

snd_pcm_lib_ioctl - a generic PCM ioctl callback

@substream: the pcm substream instance

@@ -1849,6 +1865,9 @@ int snd_pcm_lib_ioctl(struct snd_pcm_substream *substream, return snd_pcm_lib_ioctl_channel_info(substream, arg); case SNDRV_PCM_IOCTL1_FIFO_SIZE: return snd_pcm_lib_ioctl_fifo_size(substream, arg);
case SNDRV_PCM_IOCTL1_MAX_DMA_BURST:
return snd_pcm_lib_ioctl_max_dma_burst(substream, arg);
} return -ENXIO; }
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c index dd519b8..3d379b8 100644 --- a/sound/core/pcm_native.c +++ b/sound/core/pcm_native.c @@ -280,6 +280,7 @@ int snd_pcm_hw_refine(struct snd_pcm_substream *substream,

params->info = 0; params->fifo_size = 0;

params->max_dma_burst = 0; if (params->rmask & (1 << SNDRV_PCM_HW_PARAM_SAMPLE_BITS)) params->msbits = 0; if (params->rmask & (1 << SNDRV_PCM_HW_PARAM_RATE)) {

@@ -437,6 +438,12 @@ int snd_pcm_hw_refine(struct snd_pcm_substream *substream, return changed; } }
if (!params->max_dma_burst) {
changed = substream->ops->ioctl(substream,
				SNDRV_PCM_IOCTL1_MAX_DMA_BURST, params);
if (changed < 0)
	return changed;
} params->rmask = 0; return 0; }
-- 1.9.1

Alsa-devel mailing list Alsa-devel@alsa-project.org http://mailman.alsa-project.org/mailman/listinfo/alsa-devel
Alsa-devel mailing list Alsa-devel@alsa-project.org http://mailman.alsa-project.org/mailman/listinfo/alsa-devel

Jaroslav Kysela

8:28 p.m.

New subject: [alsa-devel] [RFC PATCH 3/4] ALSA: core: add report of max dma burst

...

...
...

unsigned int max_dma_burst; /* R: max in-flight bytes, indicates buffering before hw_ptr */

I'm not sure if we should name fields like dma in the abstract ioctl layer. The words in the comment seems more appropriate or just remove dma and keep only max_burst.

Thanks, Jaroslav

-- Jaroslav Kysela perex@perex.cz Linux Kernel Sound Maintainer ALSA Project; Red Hat, Inc.

Raymond Yau

16 Jul 16 Jul

4:11 a.m.

New subject: [alsa-devel] [RFC PATCH 3/4] ALSA: core: add report of max dma burst

...

...
...
...

unsigned int max_dma_burst; /* R: max in-flight bytes,

indicates buffering before hw_ptr */

...

I'm not sure if we should name fields like dma in the abstract ioctl layer. The words in the comment seems more appropriate or just remove dma and keep only max_burst.

Why not using hwptr_granularity if the usage is about the hw_ptr update interval ?

max_dma_brust is confusing for firewire and usb audio

Pierre-Louis Bossart

8 Jul 8 Jul

12:10 p.m.

New subject: [alsa-devel] [RFC PATCH 4/4] ALSA: hda: add default value for max_dma_burst

set value to 128 bytes for legacy HDAudio, can be overridden as needed (on a per-stream basis) for enhanced hardware with more buffering capabilities

Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com --- sound/pci/hda/hda_controller.c | 1 + 1 file changed, 1 insertion(+)

diff --git a/sound/pci/hda/hda_controller.c b/sound/pci/hda/hda_controller.c index 9444559..958c8f1 100644 --- a/sound/pci/hda/hda_controller.c +++ b/sound/pci/hda/hda_controller.c @@ -391,6 +391,7 @@ static struct snd_pcm_hardware azx_pcm_hw = { .periods_min = 2, .periods_max = AZX_MAX_FRAG, .fifo_size = 0, + .max_dma_burst = 128 /* default value for all legacy HDAudio controllers, override as needed */ };

static int azx_pcm_open(struct snd_pcm_substream *substream)

-- 1.9.1

Takashi Iwai

4:31 p.m.

New subject: [alsa-devel] [RFC PATCH 0/4] better support for bursty DMA usages

On Wed, 08 Jul 2015 12:10:32 +0200, Pierre-Louis Bossart wrote:

...

Set of patches to fix issues with hw_ptr fuzziness [1] and increased buffering w/ DSPs

disable rewinds to allow for new HDaudio SPIB DMA functionality (fetch up to

the application pointer, rewinds not supported) 2. report max in-flight bytes to avoid problems with stale data (late wake-ups, rewinds)

[1] http://mailman.alsa-project.org/pipermail/alsa-devel/2015-June/093646.html

TODO:

fixes and alsa-lib updates (compile-tested only for now)

get feedback

if supported, set DMA buffering based on negotiation between driver and app (capabilities vs.

latency needs)

Thanks for heading up. I wanted to start working on this by myself, too, but it was pending due to the horribly hot summer days here :)

The new flag and the callback are fairly similar as what I had in my mind. They look simple enough (although details need more discussion and evaulation).

For the DMA burst thing, I'm not quite sure whether it's the best form, including its naming. But I'd like to be neutral about this, and hopefully others will give opinion for this or give alternatives.

Takashi

...

Pierre-Louis Bossart (4): ALSA: core: let low-level driver or userspace disable rewinds ALSA: core: add .notify callback for pcm ops ALSA: core: add report of max dma burst ALSA: hda: add default value for max_dma_burst

include/sound/pcm.h | 5 +++++ include/uapi/sound/asound.h | 6 ++++-- sound/core/pcm_lib.c | 19 +++++++++++++++++++ sound/core/pcm_native.c | 33 ++++++++++++++++++++++++++++++++- sound/pci/hda/hda_controller.c | 1 + 5 files changed, 61 insertions(+), 3 deletions(-)

-- 1.9.1

Alsa-devel mailing list Alsa-devel@alsa-project.org http://mailman.alsa-project.org/mailman/listinfo/alsa-devel

Pierre-Louis Bossart

7:50 p.m.

New subject: [alsa-devel] [RFC PATCH 0/4] better support for bursty DMA usages

On 7/8/15 4:31 PM, Takashi Iwai wrote:

...

On Wed, 08 Jul 2015 12:10:32 +0200, Pierre-Louis Bossart wrote:

...
Set of patches to fix issues with hw_ptr fuzziness [1] and increased buffering w/ DSPs

disable rewinds to allow for new HDaudio SPIB DMA functionality (fetch up to

the application pointer, rewinds not supported) 2. report max in-flight bytes to avoid problems with stale data (late wake-ups, rewinds)

[1] http://mailman.alsa-project.org/pipermail/alsa-devel/2015-June/093646.html

TODO:

fixes and alsa-lib updates (compile-tested only for now)

get feedback

if supported, set DMA buffering based on negotiation between driver and app (capabilities vs.

latency needs)

Thanks for heading up. I wanted to start working on this by myself, too, but it was pending due to the horribly hot summer days here :)

Thanks for the quick review Takashi, much appreciated. The work was done before the heat wave on my side...

...

The new flag and the callback are fairly similar as what I had in my mind. They look simple enough (although details need more discussion and evaulation).

For the DMA burst thing, I'm not quite sure whether it's the best form, including its naming. But I'd like to be neutral about this, and hopefully others will give opinion for this or give alternatives.

Yes, this is really a first draft to try and solve the problems mentioned multiple times by Arun and Alexander and help use the latest and greatest hardware. If others have comments we are all ears.

Raymond Yau

15 Jul 15 Jul

12:14 p.m.

New subject: [alsa-devel] [RFC PATCH 0/4] better support for bursty DMA usages

...

Set of patches to fix issues with hw_ptr fuzziness [1] and increased

buffering

...

w/ DSPs

disable rewinds to allow for new HDaudio SPIB DMA functionality (fetch

up to

...

the application pointer, rewinds not supported) 2. report max in-flight bytes to avoid problems with stale data (late

wake-ups,

...

rewinds)

[1]

http://mailman.alsa-project.org/pipermail/alsa-devel/2015-June/093646.html

...

TODO:

fixes and alsa-lib updates (compile-tested only for now)

http://cgit.freedesktop.org/pulseaudio/pulseaudio/commit/src/modules/alsa/al...

Do you need a new api snd_pcm_hw_params_can_rewind() for the application to know pcm device does not support rewind ?

Alexander E. Patrakov

12:38 p.m.

New subject: [alsa-devel] [RFC PATCH 0/4] better support for bursty DMA usages

15.07.2015 15:14, Raymond Yau wrote:

...

...
TODO:

fixes and alsa-lib updates (compile-tested only for now)

http://cgit.freedesktop.org/pulseaudio/pulseaudio/commit/src/modules/alsa/al...

Do you need a new api snd_pcm_hw_params_can_rewind() for the application to know pcm device does not support rewind ?

Yes, we do need that.

Please see my LAC 2015 paper: http://lac.linuxaudio.org/2015/papers/10.pdf

-- Alexander E. Patrakov