[alsa-devel] snd_pcsp locking mess

Thu May 29 08:03:26 CEST 2008

At Thu, 29 May 2008 00:08:32 +0400,
Stas Sergeev wrote:
> 
> Hello.
> 
> Takashi Iwai wrote:
> > OTOH, the update of the whole PCM via snd_pcm_period_elapsed() is a
> > slow path.  Especially in the case of hrtimer, it brings the lock
> > mess, too.
> Unless the entire callback is done in
> a softirq, as it currently is, no?
> 
> > Issuing softirq at each hrtimer callback isn't optimal from
> > performance perspective.  For RT-kernels, it doesn't matter, but for
> > normal kernels, it does.
> Oh. But in that case I guess it would
> really be better if just snd_pcm_period_elapsed()
> would raise a softirq. That's something
> you proposed in the past IIRC.
> Is this acceptable to have that change
> not in the driver, or do you think it
> is snd-pcsp-specific, and no other driver
> will benefit?

It would be good in general.
However, better to be the next step after changing pcsp.

> Btw, could you please point me to where
> to read about the softirq expense? I
> thought they are very cheap, almost zero
> overhead.

One big disadvantage of the softirq is its (possible) timing
inaccuracy.  (Note that this is about the normal non-preempt kernel.)
The thing like pcsp driver does, flipping the bits at a high rate,
isn't for softirq per se.

> And yes, if the sotfirqs are expensive,
> then indeed I see why you propose all
> these tricks you do... But if they are
> not and we can keep the entire callback
> in a softirq as it currently is, then
> there might be a simpler solutions.
>
> >> the chip->substream_lock for that (why?),
> > Because the pointer callback and the hrtimer callback can race about
> I asked only why have you chosen
> chip->substream_lock - IIRC it was
> intended not for this.
> 
> >>> +	pcsp_stop_playing(chip);
> >> This will set chip->timer_active to 0,
> >> but who knows if the callback on another
> >> CPU have already passed the check or not?
> > If it passed, then we simply wait until the next hrtimer callback
> > finished.  That's what hrtimer_cancel() does.  So, after that, it is
> > guaranteed that neither hrtimer and tasklet remains.
> Yes, but the state doesn't seem to be
> guaranteed. Running the handler after
> the pcsp_stop_playing(chip) called, looks
> nasty.

This is there because pcsp_stop_playing() doesn't sync.
Both hrtimer_cancel() and tasklet_kill() are basically just for sync,
not for killing the object.

> The timer will be in a different
> mode. Nothing bad will likely to happen,
> but... esp with the nforce workaround,
> where the callback is supposed to be
> called even amount of times. Doesn't
> look very good to me, even if nothing
> bad can happen...

Well, it can be done in another away if you are so nervous about it.

Simply add chip->timer_running atomic_t and reset it when the callback
finished with HRTIMER_NORESTART.  In another side, wait until
chip->timer_running becomes zero.
But, it's just an uneeded addition of the code.

> > Because the substream lock isn't necessary!  It just makes things more
> > complicated -- if we take substream lock, AB/BA deadlocks occur again.
> Even if the callback is called within
> the softirq?

The runtime callbacks, pointer and trigger, are called already inside they
substream lock held by the PCM core.  Thus it doesn't matter whether
it's in a hard or a softirq.

> >> And
> >> if right now it is barely usefull, then
> >> why it can't be improved to simply not
> >> require a second lock that guards a
> >> substream itself?
> > Because it won't be simpler, as you already tried.
> I haven't tried that. :)
> I have tried the snd_pcm_stream_lock(),
> and yes, its nasty. But to me its nastiness
> is only because it needs another lock to
> protect the substream itself.

No, you don't need to protect the substream at all.
Remember that hrtimer callback requires only the DMA-buffer pointer
during its operation.  No other thing is needed from the substream.
This can be guaranteed with a simpler logic.

> So it is
> barely usefull. But it can be improved
> (or an alternative provided), the way the
> second lock is not needed. Then I will
> need only that - a single lock, nothing
> more. This might not be even difficult to
> do at all.

Well, my version is almost i8259_lock only.
As mentioned, chip->substream_lock can be removed safely with a
barrier.

Takashi