[PATCH 3/3] echoaudio: Address bugs in the interrupt handling
Mark Hills
mark at xwax.org
Thu Jun 18 14:29:20 CEST 2020
On Thu, 18 Jun 2020, Takashi Iwai wrote:
> On Thu, 18 Jun 2020 13:07:33 +0200,
> Mark Hills wrote:
> >
> > On Thu, 18 Jun 2020, Takashi Iwai wrote:
> >
> > > On Wed, 17 Jun 2020 12:51:05 +0200,
> > > Mark Hills wrote:
> > [...]
> > But could I please ask for help with the bigger picture? As it feels
> > mismatched.
> >
> > * Code should take every possible opportunity to update the stream
> > position; interrupts, or explicit pcm_pointer calls (whereas the docs
> > guide towards doing it in the interrupt handler)
> >
> > * I critiqued (elsewhere in thread) the older interrupt handler for
> > checking the counter, unlocking, calling back into PCM core and checking
> > again a moment later. Whereas this is considered good behaviour.
> >
> > * Seems like the overall aim is for userland to be able (if it wants to)
> > to poll the soundcard, even outside of periods.
>
> Right, the user-space can query the current position at any time, and
> the driver should return the position as precisely as possible.
>
> Some applications (like PulseAudio) sets the interrupt as minimum as
> possible while it does schedule the update by itself, judging the
> position via the ioctl. In such operational mode, the accuracy of the
> current position query is vital.
>
> > If all the above is true, I would expect interrupt handling to be very
> > simple -- it would straight away call into PCM core, join existing if the
> > codepaths (to take locks) and do a position update. PCM core would decide
> > if a period really elapsed, not the driver. But this is not how it works.
> >
> > This now relates strongly to a question of locking:
> >
> > I ran the code (top of this message) all day, with a few instances in
> > dmesg (at irregular intervals, not wrapping):
> >
> > [161644.076666] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64
> > [163232.392501] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64
> > [164976.098069] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64
> > [165054.946225] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64
> > [165312.141545] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64
> >
> > A definite bug, of course.
> >
> > However (and I am happy to be corrected) the function never finishes with
> > position == buffer size. Only way is a race between interrupt handler and
> > pcm_pointer.
> >
> > Therefore one of these is needed:
> >
> > * pcm_pointer locks chip->lock
> >
> > Even though the docs emphasise PCM core has exclusivity, it it not worth
> > much as it does not protect against the interrupt handler.
> >
> > But now interrupt handler becomes ugly in total: take chip->lock, check
> > the counter, release chip->lock, go to PCM core (which takes middle
> > layer locks), take chip->lock again, check counter again, release
> > chip->lock again.
>
> Yes, that's the most robust way to go. If the lock is really costing
> too much, you can consider a fast-path by some flag for the irq ->
> snd_pcm_period_elapsed() case.
I don't understand how a fast path could be made to work, as it can't pass
data across snd_pcm_period_elapsed() and it still must syncronise access
between reading dma_counter and writing pipe->position.
Hence questioning if a better design is simpler interrupt handlers that
just enter PCM core.
> Basically you can do everything in the pointer callback. The only
> requirement in the interrupt handle is basically judge whether you need
> the call of snd_pcm_period_elapsed() and call it. The rest update could
> be done in the other places.
Thanks, please just another clarification:
I presume that calls to pcm_pointer are completely independent of the
period notifications?
ie. period notifications are at regular intervals, regardless of whether
pcm_pointer is called inbetween. pcm_pointer must not reset any state used
by the interrupt.
Which means we must handle when non-interrupt call to pcm_pointer causes a
period to elapse. The next interrupt handler must notify.
I can see in the original code uses chip->last_period exclusively by the
interrupt handler, and I removed it. Some comments around the intent would
help. I'll cross reference the original code with my new understanding.
My instinct here is that to preserve
- regular period notifications
- handle period_size not aligning with 32-bit counter
- no races between interrupt and pcm_pointer
that the clearest and bug-free implementation may be to separate the
interrupt (notifications) and pointer updates to separate state.
Then there's no lock and the only crossover is an atomic read of
dma_counter.
And that's what I will try -- thanks.
--
Mark
More information about the Alsa-devel
mailing list