[PATCH 3/3] echoaudio: Address bugs in the interrupt handling

Mark Hills mark at xwax.org
Thu Jun 18 14:29:20 CEST 2020


On Thu, 18 Jun 2020, Takashi Iwai wrote:

> On Thu, 18 Jun 2020 13:07:33 +0200,
> Mark Hills wrote:
> > 
> > On Thu, 18 Jun 2020, Takashi Iwai wrote:
> > 
> > > On Wed, 17 Jun 2020 12:51:05 +0200,
> > > Mark Hills wrote:
> > [...]
> > But could I please ask for help with the bigger picture? As it feels 
> > mismatched.
> > 
> > * Code should take every possible opportunity to update the stream 
> >   position; interrupts, or explicit pcm_pointer calls (whereas the docs 
> >   guide towards doing it in the interrupt handler)
> > 
> > * I critiqued (elsewhere in thread) the older interrupt handler for 
> >   checking the counter, unlocking, calling back into PCM core and checking 
> >   again a moment later. Whereas this is considered good behaviour.
> > 
> > * Seems like the overall aim is for userland to be able (if it wants to)  
> >   to poll the soundcard, even outside of periods.
> 
> Right, the user-space can query the current position at any time, and
> the driver should return the position as precisely as possible.
> 
> Some applications (like PulseAudio) sets the interrupt as minimum as
> possible while it does schedule the update by itself, judging the
> position via the ioctl.  In such operational mode, the accuracy of the
> current position query is vital.
> 
> > If all the above is true, I would expect interrupt handling to be very 
> > simple -- it would straight away call into PCM core, join existing if the 
> > codepaths (to take locks) and do a position update. PCM core would decide 
> > if a period really elapsed, not the driver. But this is not how it works.
> > 
> > This now relates strongly to a question of locking:
> > 
> > I ran the code (top of this message) all day, with a few instances in 
> > dmesg (at irregular intervals, not wrapping):
> > 
> >   [161644.076666] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64
> >   [163232.392501] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64
> >   [164976.098069] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64
> >   [165054.946225] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64
> >   [165312.141545] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64
> > 
> > A definite bug, of course.
> > 
> > However (and I am happy to be corrected) the function never finishes with 
> > position == buffer size. Only way is a race between interrupt handler and 
> > pcm_pointer.
> > 
> > Therefore one of these is needed:
> > 
> > * pcm_pointer locks chip->lock
> > 
> >   Even though the docs emphasise PCM core has exclusivity, it it not worth 
> >   much as it does not protect against the interrupt handler.
> > 
> >   But now interrupt handler becomes ugly in total: take chip->lock, check 
> >   the counter, release chip->lock, go to PCM core (which takes middle 
> >   layer locks), take chip->lock again, check counter again, release 
> >   chip->lock again.
> 
> Yes, that's the most robust way to go.  If the lock is really costing
> too much, you can consider a fast-path by some flag for the irq ->
> snd_pcm_period_elapsed() case.

I don't understand how a fast path could be made to work, as it can't pass 
data across snd_pcm_period_elapsed() and it still must syncronise access 
between reading dma_counter and writing pipe->position.

Hence questioning if a better design is simpler interrupt handlers that 
just enter PCM core.
 
> Basically you can do everything in the pointer callback.  The only 
> requirement in the interrupt handle is basically judge whether you need 
> the call of snd_pcm_period_elapsed() and call it.  The rest update could 
> be done in the other places.

Thanks, please just another clarification:

I presume that calls to pcm_pointer are completely independent of the 
period notifications?

ie. period notifications are at regular intervals, regardless of whether 
pcm_pointer is called inbetween. pcm_pointer must not reset any state used 
by the interrupt.

Which means we must handle when non-interrupt call to pcm_pointer causes a 
period to elapse. The next interrupt handler must notify.

I can see in the original code uses chip->last_period exclusively by the 
interrupt handler, and I removed it. Some comments around the intent would 
help. I'll cross reference the original code with my new understanding.

My instinct here is that to preserve

- regular period notifications

- handle period_size not aligning with 32-bit counter

- no races between interrupt and pcm_pointer

that the clearest and bug-free implementation may be to separate the 
interrupt (notifications) and pointer updates to separate state.

Then there's no lock and the only crossover is an atomic read of 
dma_counter.

And that's what I will try -- thanks.

-- 
Mark


More information about the Alsa-devel mailing list