[alsa-devel] [PATCH] ALSA: usb: improve delay reporting

Pierre-Louis Bossart pierre-louis.bossart at linux.intel.com
Thu Apr 5 16:13:11 CEST 2018


On 4/5/18 6:49 AM, Georg Chini wrote:
> On 04.04.2018 21:24, Pierre-Louis Bossart wrote:
>> On 04/04/2018 08:57 AM, Georg Chini wrote:
>>
>>> Currently, the kernel uses only the number of USB frames to 
>>> interpolate the delay
>>> between sending or receiving URB's. This limits the granuarity of the 
>>> delay reports
>>> to one millisecond because the frame counter is updated in a 
>>> millisecond interval.
>>>
>>> This patch additionally uses time stamps to achieve higher precision 
>>> of the reported
>>> delay. If the time difference computed using time stamps does not 
>>> deviate more than
>>> one millisecond from the difference computed using frames, it is 
>>> assumed that the
>>> system time difference delivers the more precise delay estimate.
>>>
>>> This will significantly increase the precision of the reported delay 
>>> in the majority
>>> of cases where sound card clock and system clock are almost in sync 
>>> while falling back
>>> to the old method for the rare cases where the two times differ too 
>>> much.
>> I am all for better delay estimates but I have mixed feelings about 
>> this - and some hardware-related parts are broken.
> 
> Well, I only stumbled into this because of the recent delay estimation 
> issues in pulseaudio,
> so forgive my ignorance about ALSA. If the general idea of the patch is 
> not right, just ignore
> my other comments and drop the patch.

It's complicated to change this part of the code. When I added the use 
of the frame counter some 5 years back, it created all sorts of issues 
on various platforms that I was not even aware of.

Your implementation raises obvious questions and would need quite a bit 
of revalidation. In its current state it's not ready for prime time.

> 
>>
>> The frame counter is already an approximation of the actual delay 
>> which doesn't really account for the type of USB interface (Isoc, 
>> Async) used on the device side. You'd be adding a lot of precision 
>> without really solving the fact that you don't know what the actual 
>> playback rate is. You'd really need a time smoother that models 
>> playback rate v. system time if you want to be fully accurate below 
>> the ms level for all types of USB audio devices.
> 
> The goal is not to be fully accurate, only to be somewhat better than 
> the current estimate.

Define 'better'. What is the goal and the expected benefit for applications?

> I'll copy here what I already said to Takashi in private communication:
> 
> The mismatch between sound card time and system time should not matter 
> on the
> timescale we are talking about. Let's say we are using 10 USB frames 
> (which is
> more than I have seen while debugging), so hw_ptr is updated every 10 
> ms. Assume
> that the deviation between sound card and system time is 1% (which again 
> should
> be more than you will ever see). Then using system time, the computed 
> delay can
> be off by 100 microseconds at maximum, while the delay computed using 
> USB frames
> can be off by 2ms. Therefore using system time should be acceptable in that
> situation because it delivers higher precision results as can be 
> achieved using
> USB frames.
> 
> For the rare cases where system time and sound card time are completely 
> different,
> the patch falls back to the original method.
> 
>>
>> In addition I've only seen drivers report the delay based on hardware 
>> counters. If we start doing interpolations based on system time, then 
>> why not do it for other devices as well with a fix at the core level
> 
> I could not do that, I don't have enough ALSA knowledge. A motivation to 
> improve
> the delay reports for USB devices only could be that they are probably 
> the most
> common devices apart from HDA.

Exactly, why not do this for HDAudio then...

> 
>>
>> And last, I think you have an issue for the pause case and you cannot 
>> remove the code that limits the frame counter to 8 bits, this will not 
>> work across multiple versions of [e|x]HCI controllers, see below.
>>
>>>
>>> As a downside, the possible worst case error increases from 2ms to 3ms.
>>> ---
>>>   sound/usb/card.h |  1 +
>>>   sound/usb/pcm.c  | 47 +++++++++++++++++++++++++++++++----------------
>>>   2 files changed, 32 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/sound/usb/card.h b/sound/usb/card.h
>>> index ed87cc83eb47..8835d9628cac 100644
>>> --- a/sound/usb/card.h
>>> +++ b/sound/usb/card.h
>>> @@ -149,6 +149,7 @@ struct snd_usb_substream {
>>>         int last_frame_number;          /* stored frame number */
>>>       int last_delay;                 /* stored delay */
>>> +    struct timespec last_update_time;    /* time when last_delay was 
>>> updated */
>>>         struct {
>>>           int marker;
>>> diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c
>>> index 3cbfae6604f9..6d40cb0b8777 100644
>>> --- a/sound/usb/pcm.c
>>> +++ b/sound/usb/pcm.c
>>> @@ -46,9 +46,8 @@ snd_pcm_uframes_t snd_usb_pcm_delay(struct 
>>> snd_usb_substream *subs,
>>>       int current_frame_number;
>>>       int frame_diff;
>>>       int est_delay;
>>> -
>>> -    if (!subs->last_delay)
>>> -        return 0; /* short path */
>>
>> NAK. This was added on purpose by 48779a0b8ffc ('ALSA: usb-audio: fix 
>> delay account during pause')
>> And since you kept the code that sets last_delay there is no reason to 
>> take this out.
> 
> OK, I did not know that this is needed. I took it out because otherwise 
> the delay
> reported for capture will always be 0. retire_capture_urb() sets 
> last_delay to 0.
> Does the problem that was fixed in the commit  only apply to playback 
> streams?
> Then I could change the if condition accordingly.

That's precisely the issue: the code was contributed by multiple people 
to work on multiple platforms. What parts do what exactly is complicated 
to reverse-engineer.

> 
> 
>>> +    int time_diff;
>>> +    struct timespec now;
>>>         current_frame_number = usb_get_current_frame_number(subs->dev);
>>>       /*
>>> @@ -58,9 +57,21 @@ snd_pcm_uframes_t snd_usb_pcm_delay(struct 
>>> snd_usb_substream *subs,
>>>        */
>>>       frame_diff = (current_frame_number - subs->last_frame_number) & 
>>> 0xff;
>>>   -    /* Approximation based on number of samples per USB frame (ms),
>>> -       some truncation for 44.1 but the estimate is good enough */
>>> -    est_delay =  frame_diff * rate / 1000;
>>> +    /* Estimate time since last update of last_delay */
>>> +    snd_pcm_gettime(subs->pcm_substream->runtime, &now);
>>> +    time_diff = (now.tv_sec - subs->last_update_time.tv_sec) * 
>>> 1000000000;
>>> +    time_diff += (int)now.tv_nsec - subs->last_update_time.tv_nsec;
>>> +    time_diff /= 1000;
>>> +
>>> +    if (abs(time_diff - frame_diff * 1000) < 1000) {
>> that's a 1ms error, this is quite high IMO to assess a divergence 
>> between system and frame deltas
> 
> Because time_diff is used to interpolate between frames and frames are 
> updated every millisecond,
> it is perfectly normal if time_diff is for example 900 microseconds 
> larger than frame_diff * 1000.
> Only when the difference is larger than 1 ms we can conclude that 
> something is wrong.
> That is the reason why the patch increases the worst case error from 2 
> ms to 3 ms.

Yeah, well, given that the actual playback rate is not aligned with the 
frame counter for async interfaces, this test is questionable.

> 
>>
>>> +        /* System time and sound card time do not differ
>>> +         * too much. Use system time to calculate delay */
>>> +        est_delay =  time_diff * rate / 1000000;
>>> +    } else {
>>> +        /* Approximation based on number of samples per USB frame (ms),
>>> +           some truncation for 44.1 but the estimate is good enough */
>>> +        est_delay =  frame_diff * rate / 1000;
>>> +    }
>>>       if (subs->direction == SNDRV_PCM_STREAM_PLAYBACK)
>>>           est_delay = subs->last_delay - est_delay;
>>>       else
>>> @@ -853,6 +864,8 @@ static int snd_usb_pcm_prepare(struct 
>>> snd_pcm_substream *substream)
>>>       subs->hwptr_done = 0;
>>>       subs->transfer_done = 0;
>>>       subs->last_delay = 0;
>>> +    subs->last_update_time.tv_sec = 0;
>>> +    subs->last_update_time.tv_nsec = 0;
>>>       subs->last_frame_number = 0;
>>>       runtime->delay = 0;
>>>   @@ -1334,7 +1347,9 @@ static void retire_capture_urb(struct 
>>> snd_usb_substream *subs,
>>>             /* realign last_frame_number */
>>>           subs->last_frame_number = current_frame_number;
>>> -        subs->last_frame_number &= 0xFF; /* keep 8 LSBs */
>>
>> You can't remove this, this models the fact that only 8 bits are valid 
>> in the frame counter.
> 
> If the high order bits are not arbitrary, this is not necessary because 
> the value is
> only used in snd_usb_pcm_delay() and (a - b) & 0xff = (a - b & 0xff) & 
> 0xff. You can
> prove this by writing a and b as a = 256 * a1 + a2 and b = 256 * b1 + b2 
> with a2
> and b2 smaller than 256.

the number of valid bits varies in different controllers so the goal was 
to align on the common denominator - 8 bits only.

> 
>>
>>> +
>>> +        /* Take time stamp for delay accounting */
>>> +        snd_pcm_gettime(runtime, &subs->last_update_time);
>>>             spin_unlock_irqrestore(&subs->lock, flags);
>>>           /* copy a data chunk */
>>> @@ -1550,7 +1565,9 @@ static void prepare_playback_urb(struct 
>>> snd_usb_substream *subs,
>>>         /* realign last_frame_number */
>>>       subs->last_frame_number = usb_get_current_frame_number(subs->dev);
>>> -    subs->last_frame_number &= 0xFF; /* keep 8 LSBs */
>> same here, can't do this
>>
>>> +
>>> +    /* Take time stamp for delay accounting */
>>> +    snd_pcm_gettime(runtime, &subs->last_update_time);
>>>         if (subs->trigger_tstamp_pending_update) {
>>>           /* this is the first actual URB submitted,
>>> @@ -1597,6 +1614,12 @@ static void retire_playback_urb(struct 
>>> snd_usb_substream *subs,
>>>           subs->last_delay -= processed;
>>>       runtime->delay = subs->last_delay;
>>>   +    /* realign last_frame_number */
>>> +    subs->last_frame_number = usb_get_current_frame_number(subs->dev);
>>
>> need to filter on 8 bits..
>>> +
>>> +    /* Take time stamp for delay accounting */
>>> +    snd_pcm_gettime(runtime, &subs->last_update_time);
>>> +
>>>       /*
>>>        * Report when delay estimate is off by more than 2ms.
>>>        * The error should be lower than 2ms since the estimate relies
>>> @@ -1607,14 +1630,6 @@ static void retire_playback_urb(struct 
>>> snd_usb_substream *subs,
>>>               "delay: estimated %d, actual %d\n",
>>>               est_delay, subs->last_delay);
>>>   -    if (!subs->running) {
>>> -        /* update last_frame_number for delay counting here since
>>> -         * prepare_playback_urb won't be called during pause
>>> -         */
>>> -        subs->last_frame_number =
>>> -            usb_get_current_frame_number(subs->dev) & 0xff;
>>> -    }
>>> -
>> and why is this removed.
> 
> This is removed because the last_frame_number is now always updated,
> so it is no longer necessary.

makes no sense to me.

> 
>>>    out:
>>>       spin_unlock_irqrestore(&subs->lock, flags);
>>>   }
> 
> 
> _______________________________________________
> Alsa-devel mailing list
> Alsa-devel at alsa-project.org
> http://mailman.alsa-project.org/mailman/listinfo/alsa-devel



More information about the Alsa-devel mailing list