[alsa-devel] [PATCH v2] staging: bcm2835-audio: interpolate audio delay

newer
[alsa-devel] [PATCH 1/4] conf/ucm:...

Mike Brady

22 Oct 2018 22 Oct '18

9:17 p.m.

When the BCM2835 audio output is used, userspace sees a jitter up to 10ms in the audio position, aka "delay" -- the number of frames that must be output before a new frame would be played. Make this a bit nicer for userspace by interpolating the position using the CPU clock. The overhead is small -- an extra ktime_get() every time a GPU message is sent -- and another call and a few calculations whenever the delay is sought from userland. At 48,000 frames per second, i.e. approximately 20 microseconds per frame, it would take a clock inaccuracy of 20 microseconds in 10 milliseconds -- 2,000 parts per million -- to result in an inaccurate estimate, whereas crystal- or resonator-based clocks typically have an inaccuracy of 10s to 100s of parts per million.

Signed-off-by: Mike Brady mikebrady@eircom.net --- Changes in v2 -- remove inappropriate addition of SNDRV_PCM_INFO_BATCH flag

.../vc04_services/bcm2835-audio/bcm2835-pcm.c | 20 +++++++++++++++++++ .../vc04_services/bcm2835-audio/bcm2835.h | 1 + 2 files changed, 21 insertions(+)

diff --git a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c index e66da11af5cf..9053b996cada 100644 --- a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c +++ b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c @@ -74,6 +74,7 @@ void bcm2835_playback_fifo(struct bcm2835_alsa_stream *alsa_stream, atomic_set(&alsa_stream->pos, pos);

alsa_stream->period_offset += bytes; + alsa_stream->interpolate_start = ktime_get(); if (alsa_stream->period_offset >= alsa_stream->period_size) { alsa_stream->period_offset %= alsa_stream->period_size; snd_pcm_period_elapsed(substream); @@ -243,6 +244,7 @@ static int snd_bcm2835_pcm_prepare(struct snd_pcm_substream *substream) atomic_set(&alsa_stream->pos, 0); alsa_stream->period_offset = 0; alsa_stream->draining = false; + alsa_stream->interpolate_start = ktime_get();

return 0; } @@ -292,6 +294,24 @@ snd_bcm2835_pcm_pointer(struct snd_pcm_substream *substream) { struct snd_pcm_runtime *runtime = substream->runtime; struct bcm2835_alsa_stream *alsa_stream = runtime->private_data; + ktime_t now = ktime_get(); + + /* Give userspace better delay reporting by interpolating between GPU + * notifications, assuming audio speed is close enough to the clock + * used for ktime + */ + + if ((ktime_to_ns(alsa_stream->interpolate_start)) && + (ktime_compare(alsa_stream->interpolate_start, now) < 0)) { + u64 interval = + (ktime_to_ns(ktime_sub(now, + alsa_stream->interpolate_start))); + u64 frames_output_in_interval = + div_u64((interval * runtime->rate), 1000000000); + snd_pcm_sframes_t frames_output_in_interval_sized = + -frames_output_in_interval; + runtime->delay = frames_output_in_interval_sized; + }

return snd_pcm_indirect_playback_pointer(substream, &alsa_stream->pcm_indirect, diff --git a/drivers/staging/vc04_services/bcm2835-audio/bcm2835.h b/drivers/staging/vc04_services/bcm2835-audio/bcm2835.h index e13435d1c205..595ad584243f 100644 --- a/drivers/staging/vc04_services/bcm2835-audio/bcm2835.h +++ b/drivers/staging/vc04_services/bcm2835-audio/bcm2835.h @@ -78,6 +78,7 @@ struct bcm2835_alsa_stream { unsigned int period_offset; unsigned int buffer_size; unsigned int period_size; + ktime_t interpolate_start;

struct bcm2835_audio_instance *instance; int idx;

-- 2.17.1

Show replies by date

Kirill Marinushkin

23 Oct 23 Oct

12:25 a.m.

New subject: [alsa-devel] [PATCH v2] staging: bcm2835-audio: interpolate audio delay

Hello Mike,

AFAIU, this patch is wrong. Please correct me, maybe I misunderstand something.

...

The problem that this patch seeks to resolve is that when userland asks for the delay

The userspace asks not for delay, but for the pointer. You modify the function, which is called `snd_bcm2835_pcm_pointer`. Here you are supposed to increase `alsa_stream->pos` with the proper offset. Instead, you imitate a delay, but in fact the delay is not increased.

So, the proper solution should be to fix the reported pointer. As a result, userspace will recieve the correct delay, instead of these crazy 10 ms.

...

FYI, there is a discussion of the effects of a downstream equivalent of this suggested patch at: https://github.com/raspberrypi/firmware/issues/1026#issuecomment-415746016.

Thank you for the link, it clarified for me what you try to achieve.

On 10/22/18 21:17, Mike Brady wrote:

...

When the BCM2835 audio output is used, userspace sees a jitter up to 10ms in the audio position, aka "delay" -- the number of frames that must be output before a new frame would be played. Make this a bit nicer for userspace by interpolating the position using the CPU clock. The overhead is small -- an extra ktime_get() every time a GPU message is sent -- and another call and a few calculations whenever the delay is sought from userland. At 48,000 frames per second, i.e. approximately 20 microseconds per frame, it would take a clock inaccuracy of 20 microseconds in 10 milliseconds -- 2,000 parts per million -- to result in an inaccurate estimate, whereas crystal- or resonator-based clocks typically have an inaccuracy of 10s to 100s of parts per million.

Signed-off-by: Mike Brady mikebrady@eircom.net

Changes in v2 -- remove inappropriate addition of SNDRV_PCM_INFO_BATCH flag

.../vc04_services/bcm2835-audio/bcm2835-pcm.c | 20 +++++++++++++++++++ .../vc04_services/bcm2835-audio/bcm2835.h | 1 + 2 files changed, 21 insertions(+)

diff --git a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c index e66da11af5cf..9053b996cada 100644 --- a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c +++ b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c @@ -74,6 +74,7 @@ void bcm2835_playback_fifo(struct bcm2835_alsa_stream *alsa_stream, atomic_set(&alsa_stream->pos, pos);

alsa_stream->period_offset += bytes;

alsa_stream->interpolate_start = ktime_get(); if (alsa_stream->period_offset >= alsa_stream->period_size) { alsa_stream->period_offset %= alsa_stream->period_size; snd_pcm_period_elapsed(substream);

@@ -243,6 +244,7 @@ static int snd_bcm2835_pcm_prepare(struct snd_pcm_substream *substream) atomic_set(&alsa_stream->pos, 0); alsa_stream->period_offset = 0; alsa_stream->draining = false;

alsa_stream->interpolate_start = ktime_get();

return 0;

} @@ -292,6 +294,24 @@ snd_bcm2835_pcm_pointer(struct snd_pcm_substream *substream) { struct snd_pcm_runtime *runtime = substream->runtime; struct bcm2835_alsa_stream *alsa_stream = runtime->private_data;
ktime_t now = ktime_get();

/* Give userspace better delay reporting by interpolating between GPU
* notifications, assuming audio speed is close enough to the clock
* used for ktime
*/
if ((ktime_to_ns(alsa_stream->interpolate_start)) &&
   (ktime_compare(alsa_stream->interpolate_start, now) < 0)) {
u64 interval =
	(ktime_to_ns(ktime_sub(now,
		alsa_stream->interpolate_start)));
u64 frames_output_in_interval =
	div_u64((interval * runtime->rate), 1000000000);
snd_pcm_sframes_t frames_output_in_interval_sized =
	-frames_output_in_interval;
runtime->delay = frames_output_in_interval_sized;
}

return snd_pcm_indirect_playback_pointer(substream, &alsa_stream->pcm_indirect,
diff --git a/drivers/staging/vc04_services/bcm2835-audio/bcm2835.h b/drivers/staging/vc04_services/bcm2835-audio/bcm2835.h index e13435d1c205..595ad584243f 100644 --- a/drivers/staging/vc04_services/bcm2835-audio/bcm2835.h +++ b/drivers/staging/vc04_services/bcm2835-audio/bcm2835.h @@ -78,6 +78,7 @@ struct bcm2835_alsa_stream { unsigned int period_offset; unsigned int buffer_size; unsigned int period_size;

ktime_t interpolate_start;

struct bcm2835_audio_instance *instance; int idx;

Mike Brady

24 Oct 24 Oct

10:20 a.m.

New subject: [alsa-devel] [PATCH v2] staging: bcm2835-audio: interpolate audio delay

Hi Kirill. Thanks for your comments.

...

On 22 Oct 2018, at 23:25, Kirill Marinushkin k.marinushkin@gmail.com wrote:

AFAIU, this patch is wrong. Please correct me, maybe I misunderstand something.

...
The problem that this patch seeks to resolve is that when userland asks for the delay

The userspace asks not for delay, but for the pointer.

The call in question is snd_pcm_delay. I presume this delay is calculated from knowledge of the stream position “pos", the period (buffer?) number (and period/buffer size) and the snd_pcm_runtime structure’s “delay" field (“runtime->delay”).

...

You modify the function, which is called `snd_bcm2835_pcm_pointer`. Here you are supposed to increase `alsa_stream->pos` with the proper offset. Instead, you imitate a delay, but in fact the delay is not increased.

So, the proper solution should be to fix the reported pointer.

I think there is a difficulty with this. The “pos” pointer looks to have to be modulo the buffer size. This causes a problem, as I see it, in that if the calculated (pos + interpolated delay in bytes) is longer than the buffer size, it must be wrapped, but AFAIK we are unable to increment a buffer index used in the snd_pcm_delay calculation. Hence the calculation of the actual position would be wrong. This is why the snd_pcm_runtime delay field is used. On reflection, BTW, the patch assumes that the field's original value was zero — that can be rectified.

...

As a result, userspace will recieve the correct delay, instead of these crazy 10 ms.

Just to point out that with the proposed patch, it appears that the correct delay is being reported, (apart, possibly, from any delay originally set in the snd_pcm_delay field, as mentioned above).

All the best, Mike

...

FYI, there is

...
a discussion of the effects of a downstream equivalent of this suggested patch at: https://github.com/raspberrypi/firmware/issues/1026#issuecomment-415746016.

Thank you for the link, it clarified for me what you try to achieve.

On 10/22/18 21:17, Mike Brady wrote:

...
When the BCM2835 audio output is used, userspace sees a jitter up to 10ms in the audio position, aka "delay" -- the number of frames that must be output before a new frame would be played. Make this a bit nicer for userspace by interpolating the position using the CPU clock. The overhead is small -- an extra ktime_get() every time a GPU message is sent -- and another call and a few calculations whenever the delay is sought from userland. At 48,000 frames per second, i.e. approximately 20 microseconds per frame, it would take a clock inaccuracy of 20 microseconds in 10 milliseconds -- 2,000 parts per million -- to result in an inaccurate estimate, whereas crystal- or resonator-based clocks typically have an inaccuracy of 10s to 100s of parts per million.

Signed-off-by: Mike Brady mikebrady@eircom.net

Changes in v2 -- remove inappropriate addition of SNDRV_PCM_INFO_BATCH flag

.../vc04_services/bcm2835-audio/bcm2835-pcm.c | 20 +++++++++++++++++++ .../vc04_services/bcm2835-audio/bcm2835.h | 1 + 2 files changed, 21 insertions(+)

diff --git a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c index e66da11af5cf..9053b996cada 100644 --- a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c +++ b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c @@ -74,6 +74,7 @@ void bcm2835_playback_fifo(struct bcm2835_alsa_stream *alsa_stream, atomic_set(&alsa_stream->pos, pos);

alsa_stream->period_offset += bytes;

alsa_stream->interpolate_start = ktime_get(); if (alsa_stream->period_offset >= alsa_stream->period_size) { alsa_stream->period_offset %= alsa_stream->period_size; snd_pcm_period_elapsed(substream);

@@ -243,6 +244,7 @@ static int snd_bcm2835_pcm_prepare(struct snd_pcm_substream *substream) atomic_set(&alsa_stream->pos, 0); alsa_stream->period_offset = 0; alsa_stream->draining = false;

alsa_stream->interpolate_start = ktime_get();

return 0;

} @@ -292,6 +294,24 @@ snd_bcm2835_pcm_pointer(struct snd_pcm_substream *substream) { struct snd_pcm_runtime *runtime = substream->runtime; struct bcm2835_alsa_stream *alsa_stream = runtime->private_data;
ktime_t now = ktime_get();

/* Give userspace better delay reporting by interpolating between GPU
* notifications, assuming audio speed is close enough to the clock
* used for ktime
*/
if ((ktime_to_ns(alsa_stream->interpolate_start)) &&
   (ktime_compare(alsa_stream->interpolate_start, now) < 0)) {
u64 interval =
	(ktime_to_ns(ktime_sub(now,
		alsa_stream->interpolate_start)));
u64 frames_output_in_interval =
	div_u64((interval * runtime->rate), 1000000000);
snd_pcm_sframes_t frames_output_in_interval_sized =
	-frames_output_in_interval;
runtime->delay = frames_output_in_interval_sized;
}

return snd_pcm_indirect_playback_pointer(substream, &alsa_stream->pcm_indirect,
diff --git a/drivers/staging/vc04_services/bcm2835-audio/bcm2835.h b/drivers/staging/vc04_services/bcm2835-audio/bcm2835.h index e13435d1c205..595ad584243f 100644 --- a/drivers/staging/vc04_services/bcm2835-audio/bcm2835.h +++ b/drivers/staging/vc04_services/bcm2835-audio/bcm2835.h @@ -78,6 +78,7 @@ struct bcm2835_alsa_stream { unsigned int period_offset; unsigned int buffer_size; unsigned int period_size;

ktime_t interpolate_start;

struct bcm2835_audio_instance *instance; int idx;
Alsa-devel mailing list Alsa-devel@alsa-project.org http://mailman.alsa-project.org/mailman/listinfo/alsa-devel

Kirill Marinushkin

8:06 p.m.

New subject: [alsa-devel] [PATCH v2] staging: bcm2835-audio: interpolate audio delay

Hello Mike,

On 10/24/18 10:20, Mike Brady wrote:

...

Hi Kirill. Thanks for your comments.

...
On 22 Oct 2018, at 23:25, Kirill Marinushkin k.marinushkin@gmail.com wrote:

AFAIU, this patch is wrong. Please correct me, maybe I misunderstand something.

...
The problem that this patch seeks to resolve is that when userland asks for the delay

The userspace asks not for delay, but for the pointer.

The call in question is snd_pcm_delay. I presume this delay is calculated from knowledge of the stream position “pos", the period (buffer?) number (and period/buffer size) and the snd_pcm_runtime structure’s “delay" field (“runtime->delay”).

In kernel, this delay is calculated in `snd_pcm_calc_delay()`, defined at `sound/core/pcm_native.c:889`. If you analyze how this calculation is done, you will see that the returned value is a summary of the following fields:

* runtime->status->hw_ptr * runtime->buffer_size * runtime->control->appl_ptr * runtime->boundary * runtime->delay

...

...
You modify the function, which is called `snd_bcm2835_pcm_pointer`. Here you are supposed to increase `alsa_stream->pos` with the proper offset. Instead, you imitate a delay, but in fact the delay is not increased.

So, the proper solution should be to fix the reported pointer.

I think there is a difficulty with this. The “pos” pointer looks to have to be modulo the buffer size. This causes a problem, as I see it, in that if the calculated (pos + interpolated delay in bytes) is longer than the buffer size

There is no "interpolated delay". The concept of "interpolated delay" is incorrect. When you play sound - the pointer increments. But in this commit you increment the delay, as if sound doesn't play.

...

...
As a result, userspace will recieve the correct delay, instead of these crazy 10 ms.

Just to point out that with the proposed patch, it appears that the correct delay is being reported, (apart, possibly, from any delay originally set in the snd_pcm_delay field, as mentioned above).

Then I would like to point out the alsa-lib function `snd_pcm_avail()` - it will return the wrong value.

...

All the best, Mike

...
FYI, there is

...
a discussion of the effects of a downstream equivalent of this suggested patch at: https://github.com/raspberrypi/firmware/issues/1026#issuecomment-415746016.

Thank you for the link, it clarified for me what you try to achieve.

On 10/22/18 21:17, Mike Brady wrote:

...
When the BCM2835 audio output is used, userspace sees a jitter up to 10ms in the audio position, aka "delay" -- the number of frames that must be output before a new frame would be played. Make this a bit nicer for userspace by interpolating the position using the CPU clock. The overhead is small -- an extra ktime_get() every time a GPU message is sent -- and another call and a few calculations whenever the delay is sought from userland. At 48,000 frames per second, i.e. approximately 20 microseconds per frame, it would take a clock inaccuracy of 20 microseconds in 10 milliseconds -- 2,000 parts per million -- to result in an inaccurate estimate, whereas crystal- or resonator-based clocks typically have an inaccuracy of 10s to 100s of parts per million.

Signed-off-by: Mike Brady mikebrady@eircom.net

Changes in v2 -- remove inappropriate addition of SNDRV_PCM_INFO_BATCH flag

.../vc04_services/bcm2835-audio/bcm2835-pcm.c | 20 +++++++++++++++++++ .../vc04_services/bcm2835-audio/bcm2835.h | 1 + 2 files changed, 21 insertions(+)

diff --git a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c index e66da11af5cf..9053b996cada 100644 --- a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c +++ b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c @@ -74,6 +74,7 @@ void bcm2835_playback_fifo(struct bcm2835_alsa_stream *alsa_stream, atomic_set(&alsa_stream->pos, pos);

alsa_stream->period_offset += bytes;

alsa_stream->interpolate_start = ktime_get(); if (alsa_stream->period_offset >= alsa_stream->period_size) { alsa_stream->period_offset %= alsa_stream->period_size; snd_pcm_period_elapsed(substream);

@@ -243,6 +244,7 @@ static int snd_bcm2835_pcm_prepare(struct snd_pcm_substream *substream) atomic_set(&alsa_stream->pos, 0); alsa_stream->period_offset = 0; alsa_stream->draining = false;

alsa_stream->interpolate_start = ktime_get();

return 0;

} @@ -292,6 +294,24 @@ snd_bcm2835_pcm_pointer(struct snd_pcm_substream *substream) { struct snd_pcm_runtime *runtime = substream->runtime; struct bcm2835_alsa_stream *alsa_stream = runtime->private_data;
ktime_t now = ktime_get();

/* Give userspace better delay reporting by interpolating between GPU
* notifications, assuming audio speed is close enough to the clock
* used for ktime
*/
if ((ktime_to_ns(alsa_stream->interpolate_start)) &&
   (ktime_compare(alsa_stream->interpolate_start, now) < 0)) {
u64 interval =
	(ktime_to_ns(ktime_sub(now,
		alsa_stream->interpolate_start)));
u64 frames_output_in_interval =
	div_u64((interval * runtime->rate), 1000000000);
snd_pcm_sframes_t frames_output_in_interval_sized =
	-frames_output_in_interval;
runtime->delay = frames_output_in_interval_sized;
}

return snd_pcm_indirect_playback_pointer(substream, &alsa_stream->pcm_indirect,
diff --git a/drivers/staging/vc04_services/bcm2835-audio/bcm2835.h b/drivers/staging/vc04_services/bcm2835-audio/bcm2835.h index e13435d1c205..595ad584243f 100644 --- a/drivers/staging/vc04_services/bcm2835-audio/bcm2835.h +++ b/drivers/staging/vc04_services/bcm2835-audio/bcm2835.h @@ -78,6 +78,7 @@ struct bcm2835_alsa_stream { unsigned int period_offset; unsigned int buffer_size; unsigned int period_size;

ktime_t interpolate_start;

struct bcm2835_audio_instance *instance; int idx;
Alsa-devel mailing list Alsa-devel@alsa-project.org http://mailman.alsa-project.org/mailman/listinfo/alsa-devel

Mike Brady

9:54 p.m.

New subject: [alsa-devel] [PATCH v2] staging: bcm2835-audio: interpolate audio delay

Hi Kirill. Thanks again for your comments.

...

On 24 Oct 2018, at 19:06, Kirill Marinushkin k.marinushkin@gmail.com wrote:

Hello Mike,

On 10/24/18 10:20, Mike Brady wrote:

...
Hi Kirill. Thanks for your comments.

...
On 22 Oct 2018, at 23:25, Kirill Marinushkin k.marinushkin@gmail.com wrote:

AFAIU, this patch is wrong. Please correct me, maybe I misunderstand something.

...
The problem that this patch seeks to resolve is that when userland asks for the delay

The userspace asks not for delay, but for the pointer.

The call in question is snd_pcm_delay. I presume this delay is calculated from knowledge of the stream position “pos", the period (buffer?) number (and period/buffer size) and the snd_pcm_runtime structure’s “delay" field (“runtime->delay”).

In kernel, this delay is calculated in `snd_pcm_calc_delay()`, defined at `sound/core/pcm_native.c:889`. If you analyze how this calculation is done, you will see that the returned value is a summary of the following fields:

runtime->status->hw_ptr

runtime->buffer_size

runtime->control->appl_ptr

runtime->boundary

runtime->delay

That’s very useful, thanks.

...

...
...
You modify the function, which is called `snd_bcm2835_pcm_pointer`. Here you are supposed to increase `alsa_stream->pos` with the proper offset. Instead, you imitate a delay, but in fact the delay is not increased.

So, the proper solution should be to fix the reported pointer.

I think there is a difficulty with this. The “pos” pointer looks to have to be modulo the buffer size. This causes a problem, as I see it, in that if the calculated (pos + interpolated delay in bytes) is longer than the buffer size

There is no "interpolated delay". The concept of "interpolated delay" is incorrect.

Yes, my language here is wrong. What I mean is the estimated number of frames output since the pointer was last updated — let’s call it the `interpolated frame count`.

...

When you play sound - the pointer increments.

Unfortunately, when you play sound, the pointer does not actually increment, for up to about 10 milliseconds. I know of no way to actually access the true “live” position of the frame that is being played at any instant; hence the desire to estimate it.

What actually seems to be happening is that when `bcm2835_playback_fifo` is called, the pointer is updated, but as frames are individually output to the DAC, this pointer does not increment. It is not updated until the next time `bcm2835_playback_fifo` is called.

...

But in this commit you increment the delay, as if sound doesn't play.

It is true that the patch does make use of the snd_pcm_runtime structure’s “delay" field (aka "runtime->delay” here). That field is defined for: “/* extra delay; typically FIFO size */”. Clearly it is not being used for that here — it is being used simply because it is part of the calculation done in snd_pcm_calc_delay(), as you point out. At present, it looks like that field isn’t being used –– it’s set to zero –– and not modified anywhere else in the driver, AFAICS. If it was necessary, it would be a simple matter to preserve whatever value it was given.

...

...
...
As a result, userspace will recieve the correct delay, instead of these crazy 10 ms.

Just to point out that with the proposed patch, it appears that the correct delay is being reported, (apart, possibly, from any delay originally set in the snd_pcm_delay field, as mentioned above).

Then I would like to point out the alsa-lib function `snd_pcm_avail()` - it will return the wrong value.

It is already the case that snd_pcm_avail() does not return the true delay. The ALSA documentation states: "The value returned by that call [i.e. the snd_pcm_avail*() functions] is not directly related to the delay…”

Kind regards Mike

...

...
...
FYI, there is

...
a discussion of the effects of a downstream equivalent of this suggested patch at: https://github.com/raspberrypi/firmware/issues/1026#issuecomment-415746016.

Thank you for the link, it clarified for me what you try to achieve.

On 10/22/18 21:17, Mike Brady wrote:

...
When the BCM2835 audio output is used, userspace sees a jitter up to 10ms in the audio position, aka "delay" -- the number of frames that must be output before a new frame would be played. Make this a bit nicer for userspace by interpolating the position using the CPU clock. The overhead is small -- an extra ktime_get() every time a GPU message is sent -- and another call and a few calculations whenever the delay is sought from userland. At 48,000 frames per second, i.e. approximately 20 microseconds per frame, it would take a clock inaccuracy of 20 microseconds in 10 milliseconds -- 2,000 parts per million -- to result in an inaccurate estimate, whereas crystal- or resonator-based clocks typically have an inaccuracy of 10s to 100s of parts per million.

Signed-off-by: Mike Brady mikebrady@eircom.net

Changes in v2 -- remove inappropriate addition of SNDRV_PCM_INFO_BATCH flag

.../vc04_services/bcm2835-audio/bcm2835-pcm.c | 20 +++++++++++++++++++ .../vc04_services/bcm2835-audio/bcm2835.h | 1 + 2 files changed, 21 insertions(+)

diff --git a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c index e66da11af5cf..9053b996cada 100644 --- a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c +++ b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c @@ -74,6 +74,7 @@ void bcm2835_playback_fifo(struct bcm2835_alsa_stream *alsa_stream, atomic_set(&alsa_stream->pos, pos);

alsa_stream->period_offset += bytes;

alsa_stream->interpolate_start = ktime_get(); if (alsa_stream->period_offset >= alsa_stream->period_size) { alsa_stream->period_offset %= alsa_stream->period_size; snd_pcm_period_elapsed(substream);

@@ -243,6 +244,7 @@ static int snd_bcm2835_pcm_prepare(struct snd_pcm_substream *substream) atomic_set(&alsa_stream->pos, 0); alsa_stream->period_offset = 0; alsa_stream->draining = false;

alsa_stream->interpolate_start = ktime_get();

return 0;

} @@ -292,6 +294,24 @@ snd_bcm2835_pcm_pointer(struct snd_pcm_substream *substream) { struct snd_pcm_runtime *runtime = substream->runtime; struct bcm2835_alsa_stream *alsa_stream = runtime->private_data;
ktime_t now = ktime_get();

/* Give userspace better delay reporting by interpolating between GPU
* notifications, assuming audio speed is close enough to the clock
* used for ktime
*/
if ((ktime_to_ns(alsa_stream->interpolate_start)) &&
   (ktime_compare(alsa_stream->interpolate_start, now) < 0)) {
u64 interval =
	(ktime_to_ns(ktime_sub(now,
		alsa_stream->interpolate_start)));
u64 frames_output_in_interval =
	div_u64((interval * runtime->rate), 1000000000);
snd_pcm_sframes_t frames_output_in_interval_sized =
	-frames_output_in_interval;
runtime->delay = frames_output_in_interval_sized;
}

return snd_pcm_indirect_playback_pointer(substream, &alsa_stream->pcm_indirect,
diff --git a/drivers/staging/vc04_services/bcm2835-audio/bcm2835.h b/drivers/staging/vc04_services/bcm2835-audio/bcm2835.h index e13435d1c205..595ad584243f 100644 --- a/drivers/staging/vc04_services/bcm2835-audio/bcm2835.h +++ b/drivers/staging/vc04_services/bcm2835-audio/bcm2835.h @@ -78,6 +78,7 @@ struct bcm2835_alsa_stream { unsigned int period_offset; unsigned int buffer_size; unsigned int period_size;

ktime_t interpolate_start;

struct bcm2835_audio_instance *instance; int idx;
Alsa-devel mailing list Alsa-devel@alsa-project.org http://mailman.alsa-project.org/mailman/listinfo/alsa-devel
Alsa-devel mailing list Alsa-devel@alsa-project.org http://mailman.alsa-project.org/mailman/listinfo/alsa-devel

Kirill Marinushkin

25 Oct 25 Oct

12:02 a.m.

New subject: [alsa-devel] [PATCH v2] staging: bcm2835-audio: interpolate audio delay

Hello Mike,

We are not on the same page. What you hear is not what I tell you. Either you don't understand what happens in your commit, or I don't understand what happens in the driver.

Hopefully somebody in the community can comment here.

On 10/24/18 21:54, Mike Brady wrote:

...

...
...
...
You modify the function, which is called `snd_bcm2835_pcm_pointer`. Here you are supposed to increase `alsa_stream->pos` with the proper offset. Instead, you imitate a delay, but in fact the delay is not increased.

So, the proper solution should be to fix the reported pointer.

I think there is a difficulty with this. The “pos” pointer looks to have to be modulo the buffer size. This causes a problem, as I see it, in that if the calculated (pos + interpolated delay in bytes) is longer than the buffer size

There is no "interpolated delay". The concept of "interpolated delay" is incorrect.

Yes, my language here is wrong. What I mean is the estimated number of frames output since the pointer was last updated — let’s call it the `interpolated frame count`.

That's not what I mean. From my perspective, the problem is not in language, but in the concept which you introduce here.

...

...
When you play sound - the pointer increments.

Unfortunately, when you play sound, the pointer does not actually increment, for up to about 10 milliseconds. I know of no way to actually access the true “live” position of the frame that is being played at any instant; hence the desire to estimate it.

Your vision of situation in the opposite from my vision. What you see as a symptom - I see as a root cause. As I see, you should fix the pointer-not-incrementing. Why do you think that it's okay that the pointer is not updating during sound play? Why do you insist that there is a delay? I don't understand why we are so stuck here.

...

What actually seems to be happening is that when `bcm2835_playback_fifo` is called, the pointer is updated, but as frames are individually output to the DAC, this pointer does not increment. It is not updated until the next time `bcm2835_playback_fifo` is called.

...
But in this commit you increment the delay, as if sound doesn't play.

It is true that the patch does make use of the snd_pcm_runtime structure’s “delay" field (aka "runtime->delay” here). That field is defined for: “/* extra delay; typically FIFO size */”. Clearly it is not being used for that here — it is being used simply because it is part of the calculation done in snd_pcm_calc_delay(), as you point out. At present, it looks like that field isn’t being used –– it’s set to zero –– and not modified anywhere else in the driver, AFAICS. If it was necessary, it would be a simple matter to preserve whatever value it was given.

That's not what I am talking about. Somehow we don't understand each other.

...

...
...
...
As a result, userspace will recieve the correct delay, instead of these crazy 10 ms.

Just to point out that with the proposed patch, it appears that the correct delay is being reported, (apart, possibly, from any delay originally set in the snd_pcm_delay field, as mentioned above).

Then I would like to point out the alsa-lib function `snd_pcm_avail()` - it will return the wrong value.

It is already the case that snd_pcm_avail() does not return the true delay. The ALSA documentation states: "The value returned by that call [i.e. the snd_pcm_avail*() functions] is not directly related to the delay…”

Do you mean, that you are submitting the patch into the upstream kernel without reading the code?

snd_pcm_avail() is calculated based on:

* hw_ptr * buffer_size * appl_ptr * boundary

If you fix hw_ptr - it will fix both snd_pcm_delay() and snd_pcm_avail(). Instead, you invent the "interpolated delay", which in fact only compensates the wrong hw_ptr instead of fixing it.

Best Regards, Kirill

Takashi Iwai

9:37 a.m.

New subject: [alsa-devel] [PATCH v2] staging: bcm2835-audio: interpolate audio delay

On Thu, 25 Oct 2018 00:02:34 +0200, Kirill Marinushkin wrote:

...

...
...
When you play sound - the pointer increments.

Unfortunately, when you play sound, the pointer does not actually increment, for up to about 10 milliseconds. I know of no way to actually access the true “live” position of the frame that is being played at any instant; hence the desire to estimate it.

Your vision of situation in the opposite from my vision. What you see as a symptom - I see as a root cause. As I see, you should fix the pointer-not-incrementing. Why do you think that it's okay that the pointer is not updating during sound play? Why do you insist that there is a delay? I don't understand why we are so stuck here.

Well, in the API POV, it's nothing wrong to keep hwptr sticking while updating only delay value. It implies that the hardware chip doesn't provide the hwptr update.

Though, usually the delay value is provided also from the hardware, e.g. reading the link position or such. It's a typical case like USB-audio, where the hwptr gets updated and the delay indicates the actual position *behind* hwptr. That works because hwptr shows the position in the ring buffer at which you can access the data. And it doesn't mean that hwptr is the actually playing position, but it can be ahead of the current position, when many samples are queued on FIFO. The delay is provided to correct the position back to the actual point.

But, this also doesn't mean that the delay shouldn't be used for the purpose like this patchset, either. OTOH, providing a finer hwptr value would be likely more apps-friendly; there must be many programs that don't evaluate the delay value properly.

So, I suppose that hwptr update might be a better option if the code won't become too complex. Let's see.

One another thing I'd like to point out is that the value given in the patch is nothing but an estimated position, optimistically calculated via the system timer. Mike and I had already discussion in another thread, and another possible option would be to provide the proper timestamp-vs-hwptr pair, instead of updating the timestamp always at the status read.

Maybe it's worth to have a module option to suppress this optimistic hwptr update behavior, in case something went wrong with clocks?

thanks,

Takashi

Kirill Marinushkin

7:20 p.m.

New subject: [alsa-devel] [PATCH v2] staging: bcm2835-audio: interpolate audio delay

Hello Takashi, Mike,

@Takashi

On 10/25/18 09:37, Takashi Iwai wrote:

...

Well, in the API POV, it's nothing wrong to keep hwptr sticking while updating only delay value. It implies that the hardware chip doesn't provide the hwptr update.

Thank you for the clarification. Modifying `runtime->delay` from the `pointer` function looked wrong for me. Now I understand the motivation and the use-case. I will be more careful when analyzing the code which doesn't fit my expectations.

@Mike

I was wrong. You can ignore my comments. Please don't take them personal: it's all about having high-quality code in kernel.

Best Regards, Kirill

Mike Brady

28 Oct 28 Oct

3:24 p.m.

New subject: [alsa-devel] [PATCH v2] staging: bcm2835-audio: interpolate audio delay

Hi Kirill. Thanks for the post. Mike

...

On 25 Oct 2018, at 18:20, Kirill Marinushkin k.marinushkin@gmail.com wrote:

Hello Takashi, Mike,

@Takashi

On 10/25/18 09:37, Takashi Iwai wrote:

...
Well, in the API POV, it's nothing wrong to keep hwptr sticking while updating only delay value. It implies that the hardware chip doesn't provide the hwptr update.

Thank you for the clarification. Modifying `runtime->delay` from the `pointer` function looked wrong for me. Now I understand the motivation and the use-case. I will be more careful when analyzing the code which doesn't fit my expectations.

@Mike

I was wrong. You can ignore my comments. Please don't take them personal: it's all about having high-quality code in kernel.

Best Regards, Kirill _______________________________________________ Alsa-devel mailing list Alsa-devel@alsa-project.org http://mailman.alsa-project.org/mailman/listinfo/alsa-devel

Mike Brady

3:26 p.m.

New subject: [alsa-devel] [PATCH v2] staging: bcm2835-audio: interpolate audio delay

...

On 25 Oct 2018, at 08:37, Takashi Iwai tiwai@suse.de wrote:

On Thu, 25 Oct 2018 00:02:34 +0200, Kirill Marinushkin wrote:

...
...
...
When you play sound - the pointer increments.

Unfortunately, when you play sound, the pointer does not actually increment, for up to about 10 milliseconds. I know of no way to actually access the true “live” position of the frame that is being played at any instant; hence the desire to estimate it.

Your vision of situation in the opposite from my vision. What you see as a symptom - I see as a root cause. As I see, you should fix the pointer-not-incrementing. Why do you think that it's okay that the pointer is not updating during sound play? Why do you insist that there is a delay? I don't understand why we are so stuck here.

Well, in the API POV, it's nothing wrong to keep hwptr sticking while updating only delay value. It implies that the hardware chip doesn't provide the hwptr update.

Though, usually the delay value is provided also from the hardware, e.g. reading the link position or such. It's a typical case like USB-audio, where the hwptr gets updated and the delay indicates the actual position *behind* hwptr. That works because hwptr shows the position in the ring buffer at which you can access the data. And it doesn't mean that hwptr is the actually playing position, but it can be ahead of the current position, when many samples are queued on FIFO. The delay is provided to correct the position back to the actual point.

But, this also doesn't mean that the delay shouldn't be used for the purpose like this patchset, either. OTOH, providing a finer hwptr value would be likely more apps-friendly; there must be many programs that don't evaluate the delay value properly.

So, I suppose that hwptr update might be a better option if the code won't become too complex. Let's see.

Indeed. It will take me a few days to look into this…

Regards Mike

...

One another thing I'd like to point out is that the value given in the patch is nothing but an estimated position, optimistically calculated via the system timer. Mike and I had already discussion in another thread, and another possible option would be to provide the proper timestamp-vs-hwptr pair, instead of updating the timestamp always at the status read.

Maybe it's worth to have a module option to suppress this optimistic hwptr update behavior, in case something went wrong with clocks?

thanks,

Takashi

Mike Brady

5 Nov 5 Nov

4:57 p.m.

New subject: [alsa-devel] [PATCH v2] staging: bcm2835-audio: interpolate audio delay

Thanks for the comments and suggestions.

...

On 25 Oct 2018, at 08:37, Takashi Iwai tiwai@suse.de wrote:

Well, in the API POV, it's nothing wrong to keep hwptr sticking while updating only delay value. It implies that the hardware chip doesn't provide the hwptr update.

As I understand it, this driver stages settings, data and status information for the true audio driver which is part of VideoCore (VC). The driver communicates with the VC by sending messages. Responses come back in asynchronous callbacks. There doesn’t seem to be any other source of data or status.

When parameters such as frame format, rate and period size have been set up, the VC executes periodic callbacks to retrieve period-sized buffers of data. At 44,100 frames per second and with standard 444-frame periods, callbacks occur approximately every 10.07 milliseconds.

...

Though, usually the delay value is provided also from the hardware, e.g. reading the link position or such. It's a typical case like USB-audio, where the hwptr gets updated and the delay indicates the actual position *behind* hwptr. That works because hwptr shows the position in the ring buffer at which you can access the data. And it doesn't mean that hwptr is the actually playing position, but it can be ahead of the current position, when many samples are queued on FIFO. The delay is provided to correct the position back to the actual point.

The information that the alsa snd_pcm_delay() function depends on is updated during these callbacks. Thus, a user program monitoring the snc_pcm_delay() value closely will see sudden jumps in the value every 10 milliseconds or so — a 10 millisecond jitter.

...

But, this also doesn't mean that the delay shouldn't be used for the purpose like this patchset, either. OTOH, providing a finer hwptr value would be likely more apps-friendly; there must be many programs that don't evaluate the delay value properly.

With interpolation, the number of frames that would have been output from the time of the last callback is subtracted from the delay to give a more accurate estimate of the actual delay at the time it is requested.

...

So, I suppose that hwptr update might be a better option if the code won't become too complex. Let's see.

Having looked at the code, there does not seem to be a good way to avoid interpolation. Later versions of the interface include a message type of VC_AUDIO_MSG_TYPE_LATENCY (see https://github.com/raspberrypi/firmware/blob/master/opt/vc/include/interface...) which seems to be a request to return the latency. However, the latency would be returned in an asynchronous callback (see function audio_vchi_callback in bcm2835-vchiq.c). One can wait for the result, but it seems that it could take up to 10 milliseconds (see function bcm2835_audio_send_msg_locked in bcm2835-vchiq.c). This is hardly tolerable, and to avoid it, one would have to store both the latency returned and the time the request was sent (or the time the reply was returned — it’s not clear which would be correct) and interpolate from that to the time the delay is requested. In other words, from the point of view of avoiding interpolation, this is likely to be no better than the present suggestion. There wold also be a need to make the latency request periodically, adding to the overhead.

Without getting a good deal more information about the VC, which may not be available, I’m afraid I can’t see a way of getting a better fix on the instantaneous values of pointers such as the hw_ptr. BTW, I have not been able to find a source for the file vc_vchi_audioserv_defs.h, which looks like a Broadcom file and which appears to have two versions. If anyone could point me to the source, I’d be grateful.

...

One another thing I'd like to point out is that the value given in the patch is nothing but an estimated position, optimistically calculated via the system timer. Mike and I had already discussion in another thread, and another possible option would be to provide the proper timestamp-vs-hwptr pair, instead of updating the timestamp always at the status read.

Agreed — that would give the caller the information needed to do the interpolation for themselves if desired.

...

Maybe it's worth to have a module option to suppress this optimistic hwptr update behavior, in case something went wrong with clocks?

Following this suggestion, I have updated the patch to include a module parameter ‘enable_delay_interpolation’, and I will post that later for consideration.

Regards Mike

...

thanks,

Takashi

...

Alsa-devel mailing list Alsa-devel@alsa-project.org http://mailman.alsa-project.org/mailman/listinfo/alsa-devel

Takashi Iwai

5:11 p.m.

New subject: [alsa-devel] [PATCH v2] staging: bcm2835-audio: interpolate audio delay

On Mon, 05 Nov 2018 16:57:07 +0100, Mike Brady wrote:

...

...
One another thing I'd like to point out is that the value given in the patch is nothing but an estimated position, optimistically calculated via the system timer. Mike and I had already discussion in another thread, and another possible option would be to provide the proper timestamp-vs-hwptr pair, instead of updating the timestamp always at the status read.

Agreed — that would give the caller the information needed to do the interpolation for themselves if desired.

And now I wonder whether the problem is still present with the latest code. There was a (kind of) regression in this regard when we introduced the fine-grained hardware timestamping, but it should have been addressed by the commit 20e3f985bb875fea4f86b04eba4b6cc29bfd6b71 ALSA: pcm: update tstamp only if audio_tstamp changed

Could you double-check whether the tstamp field gets still updated even if no hwptr (and delay) is changed?

thanks,

Takashi

Mike Brady

6 Nov 6 Nov

10:05 p.m.

New subject: [alsa-devel] [PATCH v2] staging: bcm2835-audio: interpolate audio delay

...

On 5 Nov 2018, at 16:11, Takashi Iwai tiwai@suse.de wrote:

On Mon, 05 Nov 2018 16:57:07 +0100, Mike Brady wrote:

...
...
One another thing I'd like to point out is that the value given in the patch is nothing but an estimated position, optimistically calculated via the system timer. Mike and I had already discussion in another thread, and another possible option would be to provide the proper timestamp-vs-hwptr pair, instead of updating the timestamp always at the status read.

Agreed — that would give the caller the information needed to do the interpolation for themselves if desired.

And now I wonder whether the problem is still present with the latest code. There was a (kind of) regression in this regard when we introduced the fine-grained hardware timestamping, but it should have been addressed by the commit 20e3f985bb875fea4f86b04eba4b6cc29bfd6b71 ALSA: pcm: update tstamp only if audio_tstamp changed

Could you double-check whether the tstamp field gets still updated even if no hwptr (and delay) is changed?

Yes, this could be a bit problematic. The function update_audio_tstamp in pcm_lib.c could include the interpolated delay in the calculation of audio_tstamp, and hence could trigger the update of tstamp.

Another issue, as I see it, is that the the audio_tstamp value would depend on whether, and when, a snd_pcm_delay() call (which recalculates the interpolation and puts it into the delay field) was made immediately prior to it. By zeroing the delay when a GPU interrupt occurs, you could be certain that the interpolated delay would be less than or equal to the true delay, but this doesn’t seem very satisfactory — you have neither the timestamp of the last update nor the correctly interpolated timestamp.

Sadly, therefore, I’m now of the view that this approach to interpolating the delay between GPU interrupts is not really viable. Would that be your view?

Regards Mike

...

thanks,

Takashi

Takashi Iwai

10:31 p.m.

New subject: [alsa-devel] [PATCH v2] staging: bcm2835-audio: interpolate audio delay

On Tue, 06 Nov 2018 22:05:11 +0100, Mike Brady wrote:

...

...
On 5 Nov 2018, at 16:11, Takashi Iwai tiwai@suse.de wrote:

On Mon, 05 Nov 2018 16:57:07 +0100, Mike Brady wrote:

...
...
One another thing I'd like to point out is that the value given in the patch is nothing but an estimated position, optimistically calculated via the system timer. Mike and I had already discussion in another thread, and another possible option would be to provide the proper timestamp-vs-hwptr pair, instead of updating the timestamp always at the status read.

Agreed — that would give the caller the information needed to do the interpolation for themselves if desired.

And now I wonder whether the problem is still present with the latest code. There was a (kind of) regression in this regard when we introduced the fine-grained hardware timestamping, but it should have been addressed by the commit 20e3f985bb875fea4f86b04eba4b6cc29bfd6b71 ALSA: pcm: update tstamp only if audio_tstamp changed

Could you double-check whether the tstamp field gets still updated even if no hwptr (and delay) is changed?

Yes, this could be a bit problematic. The function update_audio_tstamp in pcm_lib.c could include the interpolated delay in the calculation of audio_tstamp, and hence could trigger the update of tstamp.

Well, my question is about the current driver as-is. It has no runtime->delay, so far, hence audio_tstamp is calculated only from the hwptr position. As the corresponding tstamp gets updated only when the audio_tstamp (i.e. hwptr) is updated, the driver should provide the consistent pair of audio_tstamp (i.e. hwptr) vs tstamp.

...

Another issue, as I see it, is that the the audio_tstamp value would depend on whether, and when, a snd_pcm_delay() call (which recalculates the interpolation and puts it into the delay field) was made immediately prior to it. By zeroing the delay when a GPU interrupt occurs, you could be certain that the interpolated delay would be less than or equal to the true delay, but this doesn’t seem very satisfactory — you have neither the timestamp of the last update nor the correctly interpolated timestamp.

No, audio_stamp field is updated at snd_pcm_period_elapsed() call as well as tstamp field.

Basically the driver provides three things: hwptr, tstamp and audio_tstamp. For the default configuration (like bcm audio does), audio_tstamp is calculated from hwptr, so it can be seen as the hwptr represented in timespec. OTOH, tstamp is the actual system time that is updated only when audio_tstamp changes -- which means tstamp gets updated *only* at snd_pcm_period_elapsed() call on bcm audio.

And, my point is that you should be able to interpolate the actual position in user-space side based on these information; it doesn't have to be done in the kernel at all.

...

Sadly, therefore, I’m now of the view that this approach to interpolating the delay between GPU interrupts is not really viable. Would that be your view?

Actually there were some bugs in the past that the tstamp was updated at each snd_pcm_status(), but it should have been fixed in the recent kernels. That's why I asked to re-check the current status.

thanks,

Takashi

Mike Brady

11 Nov 11 Nov

7:08 p.m.

New subject: [alsa-devel] [PATCH v2] staging: bcm2835-audio: interpolate audio delay

...

On 6 Nov 2018, at 21:31, Takashi Iwai tiwai@suse.de wrote:

On Tue, 06 Nov 2018 22:05:11 +0100, Mike Brady wrote:

...
...
On 5 Nov 2018, at 16:11, Takashi Iwai tiwai@suse.de wrote:

On Mon, 05 Nov 2018 16:57:07 +0100, Mike Brady wrote:

...
...
One another thing I'd like to point out is that the value given in the patch is nothing but an estimated position, optimistically calculated via the system timer. Mike and I had already discussion in another thread, and another possible option would be to provide the proper timestamp-vs-hwptr pair, instead of updating the timestamp always at the status read.

Agreed — that would give the caller the information needed to do the interpolation for themselves if desired.

And now I wonder whether the problem is still present with the latest code. There was a (kind of) regression in this regard when we introduced the fine-grained hardware timestamping, but it should have been addressed by the commit 20e3f985bb875fea4f86b04eba4b6cc29bfd6b71 ALSA: pcm: update tstamp only if audio_tstamp changed

Could you double-check whether the tstamp field gets still updated even if no hwptr (and delay) is changed?

Yes, this could be a bit problematic. The function update_audio_tstamp in pcm_lib.c could include the interpolated delay in the calculation of audio_tstamp, and hence could trigger the update of tstamp.

Well, my question is about the current driver as-is. It has no runtime->delay, so far, hence audio_tstamp is calculated only from the hwptr position. As the corresponding tstamp gets updated only when the audio_tstamp (i.e. hwptr) is updated, the driver should provide the consistent pair of audio_tstamp (i.e. hwptr) vs tstamp.

Fair enough — I had been thinking about the situation with the patch in place.

...

...
Another issue, as I see it, is that the the audio_tstamp value would depend on whether, and when, a snd_pcm_delay() call (which recalculates the interpolation and puts it into the delay field) was made immediately prior to it. By zeroing the delay when a GPU interrupt occurs, you could be certain that the interpolated delay would be less than or equal to the true delay, but this doesn’t seem very satisfactory — you have neither the timestamp of the last update nor the correctly interpolated timestamp.

No, audio_stamp field is updated at snd_pcm_period_elapsed() call as well as tstamp field.

That's great.

...

Basically the driver provides three things: hwptr, tstamp and audio_tstamp. For the default configuration (like bcm audio does), audio_tstamp is calculated from hwptr, so it can be seen as the hwptr represented in timespec. OTOH, tstamp is the actual system time that is updated only when audio_tstamp changes -- which means tstamp gets updated *only* at snd_pcm_period_elapsed() call on bcm audio.

And, my point is that you should be able to interpolate the actual position in user-space side based on these information; it doesn't have to be done in the kernel at all.

That is true, of course. The problem is that the snd_pcm_delay() call is so inaccurate though.

...

...
Sadly, therefore, I’m now of the view that this approach to interpolating the delay between GPU interrupts is not really viable. Would that be your view?

...

Actually there were some bugs in the past that the tstamp was updated at each snd_pcm_status(), but it should have been fixed in the recent kernels. That's why I asked to re-check the current status.

Yes, as far as I can see, that's fixed.

In further testing, however, I noticed that the audio_frames calculation in update_audio_tstamp() in pcm_lib.c didn't include the delay, so now it does if the delay field is negative, which it is "naturally" in this case. With that change, the delay reported by snd_pcm_delay() and calculated as you referred to above are consistent.

So, overall, I am happier that this approach is at least viable. But two issues remain, in my view: First, is it "a good idea"? Second, the delay field is now being used as a delay if its positive and an interpolation if it's negative. It works, but would it be better to have an extra "interpolation" field?

I'll post the updated patch shortly.

Thanks, Mike

Stefan Wahren

9:18 p.m.

New subject: [alsa-devel] [PATCH v2] ARM: staging: bcm2835-audio: interpolate audio delay

Hi Mike,

...

Mike Brady mikebrady@eircom.net hat am 11. November 2018 um 19:21 geschrieben:

When the BCM2835 audio output is used, userspace sees a jitter up to 10ms in the audio position, aka "delay" -- the number of frames that must be output before a new frame would be played. Make this a bit nicer for userspace by interpolating the position using the CPU clock. The overhead is small -- an extra ktime_get() every time a GPU message is sent -- and another call and a few calculations whenever the delay is sought from userland. At 48,000 frames per second, i.e. approximately 20 microseconds per frame, it would take a clock inaccuracy of 20 microseconds in 10 milliseconds -- 2,000 parts per million -- to result in an inaccurate estimate, whereas crystal- or resonator-based clocks typically have an inaccuracy of 10s to 100s of parts per million.

Signed-off-by: Mike Brady mikebrady@eircom.net

the former version of your patch has already been applied to staging-next.

Didn't you received Greg's email?

So please rebase your changes.

Stefan

Takashi Iwai

13 Nov 13 Nov

5:50 p.m.

New subject: [alsa-devel] [PATCH v2] ARM: staging: bcm2835-audio: interpolate audio delay

On Sun, 11 Nov 2018 19:21:29 +0100, Mike Brady wrote:

...

/* hardware definition */ static const struct snd_pcm_hardware snd_bcm2835_playback_hw = { .info = (SNDRV_PCM_INFO_INTERLEAVED | SNDRV_PCM_INFO_BLOCK_TRANSFER | SNDRV_PCM_INFO_MMAP | SNDRV_PCM_INFO_MMAP_VALID |
 SNDRV_PCM_INFO_DRAIN_TRIGGER | SNDRV_PCM_INFO_SYNC_APPLPTR),
 SNDRV_PCM_INFO_BATCH | SNDRV_PCM_INFO_DRAIN_TRIGGER |
 SNDRV_PCM_INFO_SYNC_APPLPTR),

As already mentioned, the addition of SNDRV_PCM_INFO_BATCH should be irrelevant with this change. Please create another patch to add this and clarify it in the changelog.

...

diff --git a/sound/core/pcm_lib.c b/sound/core/pcm_lib.c index 4e6110d778bd..574df7d7a1fa 100644 --- a/sound/core/pcm_lib.c +++ b/sound/core/pcm_lib.c @@ -229,19 +229,38 @@ static void update_audio_tstamp(struct snd_pcm_substream *substream, (runtime->audio_tstamp_report.actual_type == SNDRV_PCM_AUDIO_TSTAMP_TYPE_DEFAULT)) {

```
/*
```

 * provide audio timestamp derived from pointer position

```
 * add delay only if requested
```
```
 */
```

```
// provide audio timestamp derived from pointer position
```
audio_frames = runtime->hw_ptr_wrap + runtime->status->hw_ptr;

if (runtime->audio_tstamp_config.report_delay) {

```
/*
```

 * If the runtime->delay is greater than zero, it's a

 * genuine delay, e.g. a delay due to a hardware fifo.

```
 *
```

 * But if the runtime->delay is less than zero, it's an

 * interpolated estimate of the number of frames output

 * since the hardware pointer was last updated.

```
 *
```

 * It would be calculated in the pointer callback.

 * For example, for the bcm_2835 driver, it is calculated in

```
 * snd_bcm2835_pcm_pointer().
```
```
 */
```
```
if (runtime->delay < 0) {
```

	// The delay is an interpolated estimate...
if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK)

```
		audio_frames -=  runtime->delay;
```
```
	else
```
```
		audio_frames +=  runtime->delay;
```

```
		audio_frames += runtime->delay;
```
```
} else {
```

	// The delay is a real delay. Add it if requested.

	if (runtime->audio_tstamp_config.report_delay) {

```
		if (substream->stream ==
```
```
		    SNDRV_PCM_STREAM_PLAYBACK)
```
```
			audio_frames -=  runtime->delay;
```
```
		else
```
```
			audio_frames +=  runtime->delay;
```
```
	}
```
}

Well, honestly speaking, I'm really not fond of changing the PCM core just for this.

Can we make bcm audio driver providing the finer pointer update instead? If we have a module option to disable that behavior, it's an enough excuse in case anyone really cares about the accuracy.

thanks,

Takashi

Mike Brady

1 Jan 1 Jan

11:02 a.m.

New subject: [alsa-devel] [PATCH v2] ARM: staging: bcm2835-audio: interpolate audio delay

Apologies for the delay coming back to this.

Having had a long look at the further implications of this proposed change, I would like to withdraw the suggested patch.

I agree that is would seem excessive to have to change the PCM core to accommodate it. Furthermore, the idea only works if is is legitimate to assume that frames are smoothly consumed in the interpolation interval. If that assumption were not to hold for some reason, then the delay reported would be wrong.

Thanks to everyone for their comments and inputs.

Regards Mike

...

On 13 Nov 2018, at 16:50, Takashi Iwai tiwai@suse.de wrote:

On Sun, 11 Nov 2018 19:21:29 +0100, Mike Brady wrote:

...
/* hardware definition */ static const struct snd_pcm_hardware snd_bcm2835_playback_hw = { .info = (SNDRV_PCM_INFO_INTERLEAVED | SNDRV_PCM_INFO_BLOCK_TRANSFER | SNDRV_PCM_INFO_MMAP | SNDRV_PCM_INFO_MMAP_VALID |
 SNDRV_PCM_INFO_DRAIN_TRIGGER | SNDRV_PCM_INFO_SYNC_APPLPTR),
 SNDRV_PCM_INFO_BATCH | SNDRV_PCM_INFO_DRAIN_TRIGGER |
 SNDRV_PCM_INFO_SYNC_APPLPTR),
As already mentioned, the addition of SNDRV_PCM_INFO_BATCH should be irrelevant with this change. Please create another patch to add this and clarify it in the changelog.

...
diff --git a/sound/core/pcm_lib.c b/sound/core/pcm_lib.c index 4e6110d778bd..574df7d7a1fa 100644 --- a/sound/core/pcm_lib.c +++ b/sound/core/pcm_lib.c @@ -229,19 +229,38 @@ static void update_audio_tstamp(struct snd_pcm_substream *substream, (runtime->audio_tstamp_report.actual_type == SNDRV_PCM_AUDIO_TSTAMP_TYPE_DEFAULT)) {
/*
 * provide audio timestamp derived from pointer position
 * add delay only if requested
 */
// provide audio timestamp derived from pointer position
audio_frames = runtime->hw_ptr_wrap + runtime->status->hw_ptr;
if (runtime->audio_tstamp_config.report_delay) {
/*
 * If the runtime->delay is greater than zero, it's a
 * genuine delay, e.g. a delay due to a hardware fifo.
 *
 * But if the runtime->delay is less than zero, it's an
 * interpolated estimate of the number of frames output
 * since the hardware pointer was last updated.
 *
 * It would be calculated in the pointer callback.
 * For example, for the bcm_2835 driver, it is calculated in
 * snd_bcm2835_pcm_pointer().
 */
if (runtime->delay < 0) {
	// The delay is an interpolated estimate...
if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK)
		audio_frames -=  runtime->delay;
	else
		audio_frames +=  runtime->delay;
		audio_frames += runtime->delay;
} else {
	// The delay is a real delay. Add it if requested.
	if (runtime->audio_tstamp_config.report_delay) {
		if (substream->stream ==
		    SNDRV_PCM_STREAM_PLAYBACK)
			audio_frames -=  runtime->delay;
		else
			audio_frames +=  runtime->delay;
	}
}
Well, honestly speaking, I'm really not fond of changing the PCM core just for this.

Can we make bcm audio driver providing the finer pointer update instead? If we have a module option to disable that behavior, it's an enough excuse in case anyone really cares about the accuracy.

thanks,

Takashi

2601

Age (days ago)

2672

Last active (days ago)

List overview

Download

17 comments

4 participants

tags (0)

participants (4)

Kirill Marinushkin
Mike Brady
Stefan Wahren
Takashi Iwai