[alsa-devel] Clean up of PCM spinlocks

Wed Mar 14 12:24:27 CET 2012

At Wed, 14 Mar 2012 10:56:12 +0100,
Jaroslav Kysela wrote:
> 
> Date 13.3.2012 21:24, Takashi Iwai wrote:
> > At Tue, 13 Mar 2012 20:50:00 +0100,
> > Jaroslav Kysela wrote:
> >>
> >> Date 13.3.2012 17:05, Takashi Iwai wrote:
> >>> At Tue, 13 Mar 2012 16:46:39 +0100,
> >>> Jaroslav Kysela wrote:
> >>>>
> >>>> Date 13.3.2012 16:23, Takashi Iwai wrote:
> >>>>> At Tue, 13 Mar 2012 16:16:36 +0100,
> >>>>> Jaroslav Kysela wrote:
> >>>>>>
> >>>>>> Date 13.3.2012 16:00, Takashi Iwai wrote:
> >>>>>>> At Tue, 13 Mar 2012 15:24:11 +0100,
> >>>>>>> Jaroslav Kysela wrote:
> >>>>>>>>
> >>>>>>>> Date 13.3.2012 11:57, Takashi Iwai wrote:
> >>>>>>>>> Hi,
> >>>>>>>>>
> >>>>>>>>> below is a patch I posted to LKML as the follow up for the lockdep
> >>>>>>>>> report from Dave Jones.  It's a resurrected patch from what I worked
> >>>>>>>>> on quite ago.  Although this fix is suboptimal, I see no better way.
> >>>>>>>>>
> >>>>>>>>> If this looks OK, I'm going to queue it as a 3.4 material.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> thanks,
> >>>>>>>>>
> >>>>>>>>> Takashi
> >>>>>>>>>
> >>>>>>>>> ===
> >>>>>>>>>
> >>>>>>>>> From: Takashi Iwai <tiwai at suse.de>
> >>>>>>>>> Subject: [PATCH] ALSA: pcm - Simplify linked PCM substream spinlocks
> >>>>>>>>>
> >>>>>>>>> The spinlocks held for PCM substreams have been always messy.  For the
> >>>>>>>>> better concurrent accessibility, we took always the substream's lock
> >>>>>>>>> first.  When substreams are linked, we need a group-wide lock, but
> >>>>>>>>> this should be applied outside substream's lock.  So, we unlock the
> >>>>>>>>> substream's lock first, then do locks twice.
> >>>>>>>>>
> >>>>>>>>> This scheme is known to be problematic with lockdep.  Maybe because
> >>>>>>>>> the nested lock isn't marked properly, and partly because the lock and
> >>>>>>>>> unlock sequences are different (A/B -> A/B).
> >>>>>>>>>
> >>>>>>>>> This patch tries to simplify the scheme.  Instead of holding the
> >>>>>>>>> substream's lock, we take the group lock always at first.  A drawback
> >>>>>>>>> of this is that the access to the individual substream in a same group
> >>>>>>>>> won't be allowed any longer.  But, the code looks much easier, and
> >>>>>>>>> likely less buggy.
> >>>>>>>>
> >>>>>>>> What do you mean by this? I noticed that you switched the behaviour for
> >>>>>>>> the stream_lock():
> >>>>>>>>
> >>>>>>>> 1) self_group spinlock is used for single (non-linked) stream
> >>>>>>>>    - same as before
> >>>>>>>> 2) group.lock spinlock is used for the linked streams
> >>>>>>>>    - the self_group spinlock is used only in the group action
> >>>>>>>>      - the question is why? it seems useless, the
> >>>>>>>>        stream_lock() functions use the allocated group.lock
> >>>>>>>
> >>>>>>> Oh yeah, right, this is utterly useless!
> >>>>>>> I just thought of the mutex case, but in this case, the lock is
> >>>>>>> avoided anyway.
> >>>>>>>
> >>>>>>>> I don't think that this patch will work in the way we expect.
> >>>>>>>>
> >>>>>>>> I believe we should use only group.lock spinlock for the linked streams
> >>>>>>>> without using the self_group spinlock to make locking work like this:
> >>>>>>>>
> >>>>>>>> stream_lock(current)
> >>>>>>>> group_action(current) {
> >>>>>>>>   for s from all_group_substreams {
> >>>>>>>>     if (s != current)
> >>>>>>>>       stream_lock(s);
> >>>>>>>>   }
> >>>>>>>>   .... job ....
> >>>>>>>>   .... group substreams unlock ....
> >>>>>>>> }
> >>>>>>>> stream_unlock(current)
> >>>>>>>
> >>>>>>> Well, if we think of a single lock, the code would be much
> >>>>>>> simplified than above.
> >>>>>>>
> >>>>>>> The reviewsed patch is below.  I left the single-stream version as is
> >>>>>>> since it's used explicitly in the drain() -- the drain action is
> >>>>>>> applied to each substream although it's linked.
> >>>>>>
> >>>>>> I meant use group->lock in the group_action, see patch bellow.
> >>>>>
> >>>>> Hmm, I don't think this works.  From linked stream members,
> >>>>> group->lock points to the very same lock instance.  So,
> >>>>> snd_pcm_action_group() would dead-lock the same lock with your code.
> >>>>
> >>>> I see. The problem is that the concurrent but linked streams can take
> >>>> different first locks for the group locking:
> >>>>
> >>>> time 0		 time 1
> >>>> --------------------------------
> >>>> lock stream 0 -> group lock
> >>>> lock stream 1 -> group lock
> >>>>
> >>>> In this case the condition current != s is not enough. I think that we
> >>>> have to strictly separate situations when the group lock can be
> >>>> activated and use different locking scheme there - for example using a
> >>>> master lock per linked substreams.
> >>>
> >>> In my revised patch, group->lock is actually the master lock of the
> >>> group.  This is the only lock applied to the substream.
> >>>
> >>> When a stream is linked, it's done in the global write_lock to
> >>> guarantee that stream is unlocked.  Then group->lock is replaced with
> >>> a real group lock from self_group.lock.  So, there is no race among
> >>> this move.
> >>>
> >>>> It seems to me that your proposed patch just removes the spinlocks for
> >>>> linked substreams which can cause inconsistencies for these streams.
> >>>
> >>> Not really.  It just uses a single lock for linked streams.  There are
> >>> no longer stream locks once when the stream is linked.
> >>>
> >>> This is of course a drawback -- it disallows the concurrent access to
> >>> linked stream members in some operations.  How this is regarded as a
> >>> serious drawback, is a question of operation styles.  If any apps are
> >>> accessing many streams as a linked stream in realtime manner, and
> >>> access to each without mmap, then the performance may be worsen.  But,
> >>> this scenario is rather pretty rare, I guess.
> >>>
> >>>> Or just a quick idea - what about to track the the group action
> >>>> collisions? We can use a spinlock protected counter and serialize these
> >>>> requests in another way (if counter is above one - the CPU should wait
> >>>> until it becomes one).
> >>>
> >>> Well...  this sounds way too complex, I'm afraid.
> >>
> >> I tried to think about it more. I think that the spinlock reorder is at
> >> the wrong code position. What about the code bellow. I believe, it
> >> should guarantee the consistent locking scheme (although the first
> >> substream lock will stay outside of the group->lock, but it should not
> >> be an issue).
> > 
> > Unfortunately it doesn't work neither.  It's the way of thinking I've
> > traced, too.
> > 
> > snd_pcm_action() could be called in the context where the stream lock
> > has been already held, e.g. at stopping the stream from the interrupt
> > handler.  Thus you can't take the group lock over it.
> 
> Agreed. We must use the "master" lock or "per-substream" lock. Mixing of
> these locks results in an ugly locking scheme.
> 
> > The re-locking in the current code (unlock first then two locks) is a
> > compromise.  It actually works well in practice.  But the difference
> > of lock and unlock sequences is confusing, and the way of checking the
> > double-lock via spin_trylock() is too tricky.
> 
> I think that spin_trylock() can help, but we should really use it in a
> different way than before. I think that I finally created a way to use
> the per-substream locks and serialize locking of all those locks for a
> group operation. Please, review the patch bellow.

Sorry but it's way too complex.  Most of the additional code is a
specially open-coded spinlock.  Is really worth to add more tricks?

I would accept such optimizations only if the concurrent access of a
huge amount of linked streams were a realistic scenario.  But it's
rather unrelaistic.  If the concurrency is no big issue from practical
POV, we should move on rather toward simplification, IMO.

And, above all, I'm wondering whether allowing the concurrent access
to linked streams is really safe.  Imagine if both streams are trying
to change to different states at the same time.  For the consistency,
a single lock approach sounds much safer.

thanks,

Takashi

> 
> The next change can be a switch of the spinlocks from snd_pcm_group to
> the snd_pcm_substream and change the group structure allocation to avoid
> GFP_ATOMIC as you suggested.
> 
> 					Jaroslav
> 
> diff --git a/include/sound/pcm.h b/include/sound/pcm.h
> index 1d58d79..9decfa2 100644
> --- a/include/sound/pcm.h
> +++ b/include/sound/pcm.h
> @@ -362,6 +362,8 @@ struct snd_pcm_group {		/* keep linked substreams */
>  	spinlock_t lock;
>  	struct list_head substreams;
>  	int count;
> +	atomic_t master_count;
> +	atomic_t lock_count;
>  };
> 
>  struct pid;
> diff --git a/sound/core/pcm.c b/sound/core/pcm.c
> index 6e4bfcc..c3ae1fc 100644
> --- a/sound/core/pcm.c
> +++ b/sound/core/pcm.c
> @@ -690,6 +690,8 @@ int snd_pcm_new_stream(struct snd_pcm *pcm, int
> stream, int substream_count)
>  		substream->group = &substream->self_group;
>  		spin_lock_init(&substream->self_group.lock);
>  		INIT_LIST_HEAD(&substream->self_group.substreams);
> +		atomic_set(&substream->self_group.master_count, 0);
> +		atomic_set(&substream->self_group.lock_count, 0);
>  		list_add_tail(&substream->link_list, &substream->self_group.substreams);
>  		atomic_set(&substream->mmap_count, 0);
>  		prev = substream;
> diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
> index 25ed9fe..e093f8b 100644
> --- a/sound/core/pcm_native.c
> +++ b/sound/core/pcm_native.c
> @@ -709,12 +709,64 @@ static int snd_pcm_action_group(struct action_ops
> *ops,
>  {
>  	struct snd_pcm_substream *s = NULL;
>  	struct snd_pcm_substream *s1;
> -	int res = 0;
> +	struct snd_pcm_group *g = substream->group;
> +	int res = 0, idx, lcnt;
> +	unsigned long locked = 0;
> +
> +	if (!do_lock)
> +		goto _pre;
> +
> +	/*
> +	 * ensure the stream locking serialization here
> +	 */
> +	atomic_inc(&g->master_count);
> + _retry:
> +	if (atomic_inc_return(&g->lock_count) == 1) {
> +		lcnt = 0;
> + _retry_lock:
> +		idx = 0;
> +		snd_pcm_group_for_each_entry(s, substream) {
> +			if (s != substream && !(locked & (1 << idx))) {
> +				if (spin_trylock(&s->self_group.lock)) {
> +					locked |= (1 << idx);
> +					lcnt++;
> +				}
> +			}
> +			idx++;
> +		}
> +		/*
> +		 * at this point, check if another group action
> +		 * owner tries to access this part of code
> +		 *
> +		 * another situation is that another substream lock is
> +		 * active without the group action request;
> +		 * wait, until this request is finished [ref1]
> +		 */
> +		if (g->count != lcnt + atomic_read(&g->master_count))
> +			goto _retry_lock;
> +	} else {
> +		/*
> +		 * another group owner tries to reach this code
> +		 * serialize: wait for the first owners(s)
> +		 */
> +		while (atomic_read(&g->lock_count) != 1) {
> +			/*
> +			 * multiple requests - try to release
> +			 * the atomic counter to avoid deadlock,
> +			 * use new read operation to increase
> +			 * time window for other checkers
> +			 */
> +			if (atomic_read(&g->lock_count) > 2) {
> +				atomic_dec(&g->lock_count);
> +				goto _retry;
> +			}
> +			/* do nothing here, wait for finish */
> +		}
> +		goto _retry_lock;
> +	}
> 
> + _pre:
>  	snd_pcm_group_for_each_entry(s, substream) {
> -		if (do_lock && s != substream)
> -			spin_lock_nested(&s->self_group.lock,
> -					 SINGLE_DEPTH_NESTING);
>  		res = ops->pre_action(s, state);
>  		if (res < 0)
>  			goto _unlock;
> @@ -729,7 +781,6 @@ static int snd_pcm_action_group(struct action_ops *ops,
>  					ops->undo_action(s1, state);
>  				}
>  			}
> -			s = NULL; /* unlock all */
>  			goto _unlock;
>  		}
>  	}
> @@ -738,13 +789,18 @@ static int snd_pcm_action_group(struct action_ops
> *ops,
>  	}
>   _unlock:
>  	if (do_lock) {
> -		/* unlock streams */
> +		/* unlock all streams */
> +		idx = 0;
>  		snd_pcm_group_for_each_entry(s1, substream) {
> -			if (s1 != substream)
> +			if (s != substream && (locked & (1 << idx)) != 0)
>  				spin_unlock(&s1->self_group.lock);
> -			if (s1 == s)	/* end */
> -				break;
>  		}
> +		/*
> +		 * keep decrement order reverse to avoid
> +		 * a bad [ref1] condition check
> +		 */
> +		atomic_dec(&g->master_count);
> +		atomic_dec(&g->lock_count);
>  	}
>  	return res;
>  }
> @@ -779,13 +835,7 @@ static int snd_pcm_action(struct action_ops *ops,
>  	int res;
> 
>  	if (snd_pcm_stream_linked(substream)) {
> -		if (!spin_trylock(&substream->group->lock)) {
> -			spin_unlock(&substream->self_group.lock);
> -			spin_lock(&substream->group->lock);
> -			spin_lock(&substream->self_group.lock);
> -		}
>  		res = snd_pcm_action_group(ops, substream, state, 1);
> -		spin_unlock(&substream->group->lock);
>  	} else {
>  		res = snd_pcm_action_single(ops, substream, state);
>  	}
> @@ -802,17 +852,7 @@ static int snd_pcm_action_lock_irq(struct
> action_ops *ops,
>  	int res;
> 
>  	read_lock_irq(&snd_pcm_link_rwlock);
> -	if (snd_pcm_stream_linked(substream)) {
> -		spin_lock(&substream->group->lock);
> -		spin_lock(&substream->self_group.lock);
> -		res = snd_pcm_action_group(ops, substream, state, 1);
> -		spin_unlock(&substream->self_group.lock);
> -		spin_unlock(&substream->group->lock);
> -	} else {
> -		spin_lock(&substream->self_group.lock);
> -		res = snd_pcm_action_single(ops, substream, state);
> -		spin_unlock(&substream->self_group.lock);
> -	}
> +	res = snd_pcm_action(ops, substream, state);
>  	read_unlock_irq(&snd_pcm_link_rwlock);
>  	return res;
>  }
> @@ -1611,6 +1651,8 @@ static int snd_pcm_link(struct snd_pcm_substream
> *substream, int fd)
>  		}
>  		spin_lock_init(&substream->group->lock);
>  		INIT_LIST_HEAD(&substream->group->substreams);
> +		atomic_set(&substream->group->master_count, 0);
> +		atomic_set(&substream->group->lock_count, 0);
>  		list_add_tail(&substream->link_list, &substream->group->substreams);
>  		substream->group->count = 1;
>  	}
> 
> 
> -- 
> Jaroslav Kysela <perex at perex.cz>
> Linux Kernel Sound Maintainer
> ALSA Project; Red Hat, Inc.
>