[alsa-devel] Clean up of PCM spinlocks

Wed Mar 14 15:33:58 CET 2012

At Wed, 14 Mar 2012 15:20:07 +0100,
Jaroslav Kysela wrote:
> 
> Date 14.3.2012 14:41, Takashi Iwai wrote:
> > At Wed, 14 Mar 2012 12:59:07 +0100,
> > Jaroslav Kysela wrote:
> >>
> >> Date 14.3.2012 12:24, Takashi Iwai wrote:
> >>> At Wed, 14 Mar 2012 10:56:12 +0100,
> >>> Jaroslav Kysela wrote:
> >>>>
> >>>> Date 13.3.2012 21:24, Takashi Iwai wrote:
> >>>>> At Tue, 13 Mar 2012 20:50:00 +0100,
> >>>>> Jaroslav Kysela wrote:
> >>>>>>
> >>>>>> Date 13.3.2012 17:05, Takashi Iwai wrote:
> >>>>>>> At Tue, 13 Mar 2012 16:46:39 +0100,
> >>>>>>> Jaroslav Kysela wrote:
> >>>>>>>>
> >>>>>>>> Date 13.3.2012 16:23, Takashi Iwai wrote:
> >>>>>>>>> At Tue, 13 Mar 2012 16:16:36 +0100,
> >>>>>>>>> Jaroslav Kysela wrote:
> >>>>>>>>>>
> >>>>>>>>>> Date 13.3.2012 16:00, Takashi Iwai wrote:
> >>>>>>>>>>> At Tue, 13 Mar 2012 15:24:11 +0100,
> >>>>>>>>>>> Jaroslav Kysela wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Date 13.3.2012 11:57, Takashi Iwai wrote:
> >>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> below is a patch I posted to LKML as the follow up for the lockdep
> >>>>>>>>>>>>> report from Dave Jones.  It's a resurrected patch from what I worked
> >>>>>>>>>>>>> on quite ago.  Although this fix is suboptimal, I see no better way.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> If this looks OK, I'm going to queue it as a 3.4 material.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> thanks,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Takashi
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> ===
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> From: Takashi Iwai <tiwai at suse.de>
> >>>>>>>>>>>>> Subject: [PATCH] ALSA: pcm - Simplify linked PCM substream spinlocks
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> The spinlocks held for PCM substreams have been always messy.  For the
> >>>>>>>>>>>>> better concurrent accessibility, we took always the substream's lock
> >>>>>>>>>>>>> first.  When substreams are linked, we need a group-wide lock, but
> >>>>>>>>>>>>> this should be applied outside substream's lock.  So, we unlock the
> >>>>>>>>>>>>> substream's lock first, then do locks twice.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> This scheme is known to be problematic with lockdep.  Maybe because
> >>>>>>>>>>>>> the nested lock isn't marked properly, and partly because the lock and
> >>>>>>>>>>>>> unlock sequences are different (A/B -> A/B).
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> This patch tries to simplify the scheme.  Instead of holding the
> >>>>>>>>>>>>> substream's lock, we take the group lock always at first.  A drawback
> >>>>>>>>>>>>> of this is that the access to the individual substream in a same group
> >>>>>>>>>>>>> won't be allowed any longer.  But, the code looks much easier, and
> >>>>>>>>>>>>> likely less buggy.
> >>>>>>>>>>>>
> >>>>>>>>>>>> What do you mean by this? I noticed that you switched the behaviour for
> >>>>>>>>>>>> the stream_lock():
> >>>>>>>>>>>>
> >>>>>>>>>>>> 1) self_group spinlock is used for single (non-linked) stream
> >>>>>>>>>>>>    - same as before
> >>>>>>>>>>>> 2) group.lock spinlock is used for the linked streams
> >>>>>>>>>>>>    - the self_group spinlock is used only in the group action
> >>>>>>>>>>>>      - the question is why? it seems useless, the
> >>>>>>>>>>>>        stream_lock() functions use the allocated group.lock
> >>>>>>>>>>>
> >>>>>>>>>>> Oh yeah, right, this is utterly useless!
> >>>>>>>>>>> I just thought of the mutex case, but in this case, the lock is
> >>>>>>>>>>> avoided anyway.
> >>>>>>>>>>>
> >>>>>>>>>>>> I don't think that this patch will work in the way we expect.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I believe we should use only group.lock spinlock for the linked streams
> >>>>>>>>>>>> without using the self_group spinlock to make locking work like this:
> >>>>>>>>>>>>
> >>>>>>>>>>>> stream_lock(current)
> >>>>>>>>>>>> group_action(current) {
> >>>>>>>>>>>>   for s from all_group_substreams {
> >>>>>>>>>>>>     if (s != current)
> >>>>>>>>>>>>       stream_lock(s);
> >>>>>>>>>>>>   }
> >>>>>>>>>>>>   .... job ....
> >>>>>>>>>>>>   .... group substreams unlock ....
> >>>>>>>>>>>> }
> >>>>>>>>>>>> stream_unlock(current)
> >>>>>>>>>>>
> >>>>>>>>>>> Well, if we think of a single lock, the code would be much
> >>>>>>>>>>> simplified than above.
> >>>>>>>>>>>
> >>>>>>>>>>> The reviewsed patch is below.  I left the single-stream version as is
> >>>>>>>>>>> since it's used explicitly in the drain() -- the drain action is
> >>>>>>>>>>> applied to each substream although it's linked.
> >>>>>>>>>>
> >>>>>>>>>> I meant use group->lock in the group_action, see patch bellow.
> >>>>>>>>>
> >>>>>>>>> Hmm, I don't think this works.  From linked stream members,
> >>>>>>>>> group->lock points to the very same lock instance.  So,
> >>>>>>>>> snd_pcm_action_group() would dead-lock the same lock with your code.
> >>>>>>>>
> >>>>>>>> I see. The problem is that the concurrent but linked streams can take
> >>>>>>>> different first locks for the group locking:
> >>>>>>>>
> >>>>>>>> time 0		 time 1
> >>>>>>>> --------------------------------
> >>>>>>>> lock stream 0 -> group lock
> >>>>>>>> lock stream 1 -> group lock
> >>>>>>>>
> >>>>>>>> In this case the condition current != s is not enough. I think that we
> >>>>>>>> have to strictly separate situations when the group lock can be
> >>>>>>>> activated and use different locking scheme there - for example using a
> >>>>>>>> master lock per linked substreams.
> >>>>>>>
> >>>>>>> In my revised patch, group->lock is actually the master lock of the
> >>>>>>> group.  This is the only lock applied to the substream.
> >>>>>>>
> >>>>>>> When a stream is linked, it's done in the global write_lock to
> >>>>>>> guarantee that stream is unlocked.  Then group->lock is replaced with
> >>>>>>> a real group lock from self_group.lock.  So, there is no race among
> >>>>>>> this move.
> >>>>>>>
> >>>>>>>> It seems to me that your proposed patch just removes the spinlocks for
> >>>>>>>> linked substreams which can cause inconsistencies for these streams.
> >>>>>>>
> >>>>>>> Not really.  It just uses a single lock for linked streams.  There are
> >>>>>>> no longer stream locks once when the stream is linked.
> >>>>>>>
> >>>>>>> This is of course a drawback -- it disallows the concurrent access to
> >>>>>>> linked stream members in some operations.  How this is regarded as a
> >>>>>>> serious drawback, is a question of operation styles.  If any apps are
> >>>>>>> accessing many streams as a linked stream in realtime manner, and
> >>>>>>> access to each without mmap, then the performance may be worsen.  But,
> >>>>>>> this scenario is rather pretty rare, I guess.
> >>>>>>>
> >>>>>>>> Or just a quick idea - what about to track the the group action
> >>>>>>>> collisions? We can use a spinlock protected counter and serialize these
> >>>>>>>> requests in another way (if counter is above one - the CPU should wait
> >>>>>>>> until it becomes one).
> >>>>>>>
> >>>>>>> Well...  this sounds way too complex, I'm afraid.
> >>>>>>
> >>>>>> I tried to think about it more. I think that the spinlock reorder is at
> >>>>>> the wrong code position. What about the code bellow. I believe, it
> >>>>>> should guarantee the consistent locking scheme (although the first
> >>>>>> substream lock will stay outside of the group->lock, but it should not
> >>>>>> be an issue).
> >>>>>
> >>>>> Unfortunately it doesn't work neither.  It's the way of thinking I've
> >>>>> traced, too.
> >>>>>
> >>>>> snd_pcm_action() could be called in the context where the stream lock
> >>>>> has been already held, e.g. at stopping the stream from the interrupt
> >>>>> handler.  Thus you can't take the group lock over it.
> >>>>
> >>>> Agreed. We must use the "master" lock or "per-substream" lock. Mixing of
> >>>> these locks results in an ugly locking scheme.
> >>>>
> >>>>> The re-locking in the current code (unlock first then two locks) is a
> >>>>> compromise.  It actually works well in practice.  But the difference
> >>>>> of lock and unlock sequences is confusing, and the way of checking the
> >>>>> double-lock via spin_trylock() is too tricky.
> >>>>
> >>>> I think that spin_trylock() can help, but we should really use it in a
> >>>> different way than before. I think that I finally created a way to use
> >>>> the per-substream locks and serialize locking of all those locks for a
> >>>> group operation. Please, review the patch bellow.
> >>>
> >>> Sorry but it's way too complex.  Most of the additional code is a
> >>> specially open-coded spinlock.  Is really worth to add more tricks?
> >>>
> >>> I would accept such optimizations only if the concurrent access of a
> >>> huge amount of linked streams were a realistic scenario.  But it's
> >>> rather unrelaistic.  If the concurrency is no big issue from practical
> >>> POV, we should move on rather toward simplification, IMO.
> >>>
> >>> And, above all, I'm wondering whether allowing the concurrent access
> >>> to linked streams is really safe.  Imagine if both streams are trying
> >>> to change to different states at the same time.  For the consistency,
> >>> a single lock approach sounds much safer.
> >>
> >> This should be assured by the PCM code (it calls the group action when
> >> the state is changed, so all streams are locked at this moment).
> > 
> > But how?  Suppose streams A and B are being accessed concurrently
> > and trying to change the state at the same time but differently.
> > At beginning, the locks held by both are only the self_group.locks.
> > Then at snd_pcm_action_group(), it acquires the group lock.  If A
> > takes the group lock, B is spinning.  With your latest patch, A tries
> > the lock of B, and it can't take, thus it continues.  It will change
> > the state of all linked streams (including B) to state A'.  After
> > finishing snd_pcm_action() of stream A, now snd_pcm_action() of B
> > begins and changes it to state B' again, or get an error.
> > 
> > In this scenario, both A and B should have taken the group lock from
> > the beginning.
> > 
> >> I can see the improvements when the application polls for the stream
> >> positions (for example to synchronize them) and when it uses small
> >> period sizes (many interrupts) for multiple substreams. In this case,
> >> the position updates for substreams can be executed in a parallel way.
> > 
> > Right.  There can be some methods that can be done in parallel.
> > But the stuff including state changes are basically difficult to
> > handle in parallel.
> > 
> >> I hoped to solve the bottleneck you mentioned in your patch. Of course,
> >> if you take ~30 lines (without comments and next 10 lines are curly
> >> brackets) of code as too way complex, our discussion is finished. The
> >> nice thing on my patch is that the lock serialization code is only at
> >> one place, easy to verify and commented (comments may be improved) for
> >> reviewers.
> >>
> >> My next point is, if we allow a "global" lock, we will not take care
> >> about the possible concurrency in future, which is not so good.
> > 
> > Yeah, that's a drawback.  OTOH, the handling of the state of linked
> > streams must be done globally among all linked streams.  And the lock
> > of the each substream assumes that the state change doesn't happen
> > during it, but it's not true when we don't protect via a global lock.
> > 
> > So, maybe we should think things separately: the actions that need a
> > global lock obviously (i.e. involving with the state change) and the
> > actions that can be easily parallelized.  If we want to provide finer
> > locks, use another substream-wise lock for the latter action and use
> > the group lock for the former action (and eventually lock the
> > subststem one in group lock).
> 
> The question is, if it's not simpler to use my scheme and verify/add the
> recovery paths after the possible group call.

Well, the scenario above doesn't work in your case, too.
It's no regression, though.  (And the scenario above doesn't work even
in the current code, too, yes.)

> My code guarantees, that
> the state is not modified when the group action finishes (until the
> substream lock is released). So, we can just do new current state /
> error value checking after these calls. A quick look to the code shows
> that almost all group action calls ends with the lock release, so I
> think that a comment (notifying that the state can be changed) for the
> group action functions should be enough.

Well, remember that the current code actually works.
It's just messy and confusing for lockdep, more imporatntly, also for
human beings.  So, the primary goal of this action is to simplify and
clean up the code to be more understandable.  And, as the next point,
better parallization even with simplification.

> Another possibility is to not do any serialization, just use one atomic
> counter, and return an error to caller when a concurrent access occurs.
> The caller can decide what to do - it may even to release the lock and
> retry the whole operation.

Hmm, judging only from the description, I don't buy it.
All of the things should be able to be implemented with normal
spinlocks.  If not, the design is very likely wrong.

> I also think that the group lock might be acquired uselessly in many cases.

Yes.  We basically think of a single lock in all pcm_lib.c operations.
This is the cause of the whole problems.  If we think of only a single
lock, it should be really a single lock.  If we need more finer locks,
we need two kind of locks in PCM operations themselves.

thanks,

Takashi