[alsa-devel] snd_pcsp locking mess

Wed May 28 12:13:44 CEST 2008

At Tue, 27 May 2008 21:40:13 +0400,
Stas Sergeev wrote:
> 
> Hello.
> 
> Takashi Iwai wrote:
> > Well, let's take a look back how the callbacks are called.
> > The callbacks that are called during PCM running are the following:
> > 
> > - hrtimer callback gets called from hrtimer:
> >   This updates the pcsp port and updates the PCM pointer.
> >   
> > - PCM pointer callback:
> >   This returns the current PCM pointer
> > 
> > So, actually between these calls, we have only one thing to protect -
> > the PCM pointer.  The DMA-buffer readiness is necessary only for
> > hrtimer callback.
> OK.
> 
> > And, in addition, we need to call somewhere snd_pcm_period_elapsed()
> > when the period is finished.  This can be done in a tasklet.
> What is the difference between calling
> everything in softirq (current solution)
> and calling snd_pcm_period_elapsed() in
> a tasklet, with the rest in a hardirq
> context? What will this solve?

Think just like the old-good bottomhalf.
The register flip with hrtimer callback is definitely a fast path, so
it should be called inside the hard irq.

OTOH, the update of the whole PCM via snd_pcm_period_elapsed() is a
slow path.  Especially in the case of hrtimer, it brings the lock
mess, too.  Thus it's better to put this into a softirq.

Issuing softirq at each hrtimer callback isn't optimal from
performance perspective.  For RT-kernels, it doesn't matter, but for
normal kernels, it does.

> > Now, remember that hrtimer is started only via trigger start
> > callback.  That is, hrtimer callback isn called first after PCM is
> > actually started; hence the DMA buffer is ready when it's called.
> > 
> > Then, if we assure that the DMA buffer isn't changed and the PCM
> > substream still exists during hrtimer callback (and/or its execution
> > finishes), hrtimer callback can safely run without *any* lock.
> OK so how do you protect the PCM pointer
> after all? Even in your example you use
> the chip->substream_lock for that (why?),

Because the pointer callback and the hrtimer callback can race about
chip->playback_ptr.  The lock is simply to protect it.  Don't take it
as a PCM substream lock.

It can be implemened even without a lock, e.g. using a simple barrier,
though.  I was just too lazy.

> so it doesn't look like you run a callback
> without any lock at all.

You don't need a lock for the PCM callback itself.  The pointer
callback is called inside the PCM substream lock.

> > This is pretty easy.  We just need to add appropriate calls to
> > synchronize the finish of hrtimer (and tasklet) at each place that can
> > change the DMA buffers.
> I wonder if this is SMP-safe (comments
> below).
> 
> > +/*
> > + * Force to stop and sync the stream
> > + */
> > +void pcsp_sync_stop(struct snd_pcsp *chip)
> > +{
> > +	local_irq_disable();
> So what will happen if the timer callback
> is executing on another CPU at that point?

That doesn't matter, because this irqsave is needed just for
i8253_lock in pcsp_stop_playing() (as it doesn't save irq).

> > +	pcsp_stop_playing(chip);
> This will set chip->timer_active to 0,
> but who knows if the callback on another
> CPU have already passed the check or not?

If it passed, then we simply wait until the next hrtimer callback
finished.  That's what hrtimer_cancel() does.  So, after that, it is
guaranteed that neither hrtimer and tasklet remains.

> > +	local_irq_enable();
> > +	hrtimer_cancel(&chip->timer);
> > +	tasklet_kill(&pcsp_pcm_tasklet);
> Ouch, IIRC all three are discouraged to
> use...

Really?  Any pointer?

> I (hopefully) now understand the logic
> you are trying to explain, and I don't
> think it is completely trouble-free (I
> may be wrong of course, but certainly
> it won't be straightforward for every
> reader why is that code correct).
> So the question is still why not to
> simply use snd_pcm_stream_lock().

Because the substream lock isn't necessary!  It just makes things more
complicated -- if we take substream lock, AB/BA deadlocks occur again.

> And
> if right now it is barely usefull, then
> why it can't be improved to simply not
> require a second lock that guards a
> substream itself?

Because it won't be simpler, as you already tried.

Takashi