[alsa-devel] Clean up of PCM spinlocks

Jaroslav Kysela perex at perex.cz
Wed Mar 14 10:56:12 CET 2012


Date 13.3.2012 21:24, Takashi Iwai wrote:
> At Tue, 13 Mar 2012 20:50:00 +0100,
> Jaroslav Kysela wrote:
>>
>> Date 13.3.2012 17:05, Takashi Iwai wrote:
>>> At Tue, 13 Mar 2012 16:46:39 +0100,
>>> Jaroslav Kysela wrote:
>>>>
>>>> Date 13.3.2012 16:23, Takashi Iwai wrote:
>>>>> At Tue, 13 Mar 2012 16:16:36 +0100,
>>>>> Jaroslav Kysela wrote:
>>>>>>
>>>>>> Date 13.3.2012 16:00, Takashi Iwai wrote:
>>>>>>> At Tue, 13 Mar 2012 15:24:11 +0100,
>>>>>>> Jaroslav Kysela wrote:
>>>>>>>>
>>>>>>>> Date 13.3.2012 11:57, Takashi Iwai wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> below is a patch I posted to LKML as a follow-up to the lockdep
>>>>>>>>> report from Dave Jones.  It's a resurrected patch from what I
>>>>>>>>> worked on quite a while ago.  Although this fix is suboptimal,
>>>>>>>>> I see no better way.
>>>>>>>>>
>>>>>>>>> If this looks OK, I'm going to queue it as a 3.4 material.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> thanks,
>>>>>>>>>
>>>>>>>>> Takashi
>>>>>>>>>
>>>>>>>>> ===
>>>>>>>>>
>>>>>>>>> From: Takashi Iwai <tiwai at suse.de>
>>>>>>>>> Subject: [PATCH] ALSA: pcm - Simplify linked PCM substream spinlocks
>>>>>>>>>
>>>>>>>>> The spinlocks held for PCM substreams have always been messy.  For
>>>>>>>>> better concurrent accessibility, we always took the substream's lock
>>>>>>>>> first.  When substreams are linked, we need a group-wide lock, but
>>>>>>>>> this must be taken outside the substream's lock.  So, we unlock the
>>>>>>>>> substream's lock first, then take both locks in order.
>>>>>>>>>
>>>>>>>>> This scheme is known to be problematic with lockdep.  Maybe partly
>>>>>>>>> because the nested lock isn't marked properly, and partly because
>>>>>>>>> the lock and unlock sequences differ (lock A then B, but also unlock
>>>>>>>>> A then B instead of the reverse order).
>>>>>>>>>
>>>>>>>>> This patch tries to simplify the scheme.  Instead of holding the
>>>>>>>>> substream's lock, we always take the group lock first.  A drawback
>>>>>>>>> is that concurrent access to individual substreams in the same group
>>>>>>>>> is no longer allowed.  But the code looks much simpler, and is
>>>>>>>>> likely less buggy.
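
To make the before/after difference concrete, here is a minimal userspace
sketch of the two schemes; pthread spinlocks stand in for the kernel's
spinlock_t, and the struct layout and function names are illustrative
only, not taken from the patch:

#include <pthread.h>

struct group {
	pthread_spinlock_t lock;
};

struct substream {
	struct group self_group;	/* own lock, used while not linked */
	struct group *group;		/* == &self_group until linked */
};

/* old scheme: the stream's own lock is held on entry; to take the
 * group lock in the right order, drop it, take the group lock, then
 * retake the stream lock */
static void old_scheme_enter_group(struct substream *s)
{
	if (pthread_spin_trylock(&s->group->lock) != 0) {
		pthread_spin_unlock(&s->self_group.lock);
		pthread_spin_lock(&s->group->lock);
		pthread_spin_lock(&s->self_group.lock);
	}
}

/* new scheme: one lock only; for a non-linked stream, s->group->lock
 * is s->self_group.lock, so the same call covers both cases */
static void new_scheme_stream_lock(struct substream *s)
{
	pthread_spin_lock(&s->group->lock);
}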
>>>>>>>>
>>>>>>>> What do you mean by this? I noticed that you switched the behaviour for
>>>>>>>> the stream_lock():
>>>>>>>>
>>>>>>>> 1) self_group spinlock is used for single (non-linked) stream
>>>>>>>>    - same as before
>>>>>>>> 2) group.lock spinlock is used for the linked streams
>>>>>>>>    - the self_group spinlock is used only in the group action
>>>>>>>>      - the question is why? it seems useless, the
>>>>>>>>        stream_lock() functions use the allocated group.lock
>>>>>>>
>>>>>>> Oh yeah, right, this is utterly useless!
>>>>>>> I just thought of the mutex case, but in this case, the lock is
>>>>>>> avoided anyway.
>>>>>>>
>>>>>>>> I don't think that this patch will work in the way we expect.
>>>>>>>>
>>>>>>>> I believe we should use only the group.lock spinlock for the linked
>>>>>>>> streams, without the self_group spinlock, to make locking work like
>>>>>>>> this:
>>>>>>>>
>>>>>>>> stream_lock(current)
>>>>>>>> group_action(current) {
>>>>>>>>   for s from all_group_substreams {
>>>>>>>>     if (s != current)
>>>>>>>>       stream_lock(s);
>>>>>>>>   }
>>>>>>>>   .... job ....
>>>>>>>>   .... group substreams unlock ....
>>>>>>>> }
>>>>>>>> stream_unlock(current)
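
Spelled out as compilable userspace C (a fixed-size array stands in for
the kernel's substream link list, pthread spinlocks for spinlock_t; all
names are placeholders), the idea above would look roughly like this:

#include <pthread.h>
#include <stddef.h>

struct substream {
	pthread_spinlock_t self_lock;
};

/* sketch only: the caller already holds cur's lock; lock every other
 * group member, do the job, then unlock them again */
static void group_action(struct substream **grp, size_t n,
			 struct substream *cur)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (grp[i] != cur)
			pthread_spin_lock(&grp[i]->self_lock);

	/* .... job .... */

	for (i = 0; i < n; i++)
		if (grp[i] != cur)
			pthread_spin_unlock(&grp[i]->self_lock);
}

As the following replies show, the weak point is two CPUs entering this
concurrently from different members of the same group.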
>>>>>>>
>>>>>>> Well, if we think of a single lock, the code becomes much simpler
>>>>>>> than the above.
>>>>>>>
>>>>>>> The revised patch is below.  I left the single-stream version as is,
>>>>>>> since it's used explicitly in drain() -- the drain action is applied
>>>>>>> to each substream even though it's linked.
>>>>>>
>>>>>> I meant: use group->lock in the group_action; see the patch below.
>>>>>
>>>>> Hmm, I don't think this works.  For linked stream members,
>>>>> group->lock points to the very same lock instance.  So,
>>>>> snd_pcm_action_group() would deadlock on the same lock with your code.
>>>>
>>>> I see. The problem is that concurrent but linked streams can take
>>>> different first locks for the group locking:
>>>>
>>>> time 0: lock stream 0 -> group lock
>>>> time 1: lock stream 1 -> group lock
>>>>
>>>> In this case the condition current != s is not enough. I think we
>>>> have to strictly separate the situations in which the group lock can
>>>> be taken, and use a different locking scheme there - for example, a
>>>> master lock per set of linked substreams.
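
This ABBA scenario can be reproduced with a tiny userspace model - two
spinlocks taken in opposite order by two threads, each thread standing in
for a linked stream entering the group action (all names invented):

#include <pthread.h>
#include <stdio.h>

static pthread_spinlock_t lock0, lock1;	/* two streams' self locks */

/* thread A models stream 0: own lock first, then the peer's lock
 * (as the group action would take it) */
static void *cpu0(void *arg)
{
	(void)arg;
	pthread_spin_lock(&lock0);
	pthread_spin_lock(&lock1);	/* may spin forever: ABBA */
	pthread_spin_unlock(&lock1);
	pthread_spin_unlock(&lock0);
	return NULL;
}

/* thread B models stream 1: the mirror image */
static void *cpu1(void *arg)
{
	(void)arg;
	pthread_spin_lock(&lock1);
	pthread_spin_lock(&lock0);	/* may spin forever: ABBA */
	pthread_spin_unlock(&lock0);
	pthread_spin_unlock(&lock1);
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_spin_init(&lock0, PTHREAD_PROCESS_PRIVATE);
	pthread_spin_init(&lock1, PTHREAD_PROCESS_PRIVATE);
	pthread_create(&a, NULL, cpu0, NULL);
	pthread_create(&b, NULL, cpu1, NULL);
	pthread_join(a, NULL);		/* with bad timing, never returns */
	pthread_join(b, NULL);
	puts("no deadlock this run");
	return 0;
}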
>>>
>>> In my revised patch, group->lock is actually the master lock of the
>>> group.  This is the only lock applied to the substream.
>>>
>>> When a stream is linked, it's done under the global write_lock to
>>> guarantee that the stream is unlocked.  Then group->lock is switched
>>> from self_group.lock to the real group lock.  So, there is no race
>>> in this move.
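
In a userspace model, the race-free switch described above amounts to
publishing the new group pointer only under the global writer lock (a
pthread rwlock stands in for snd_pcm_link_rwlock; it is assumed, as
described above, that lockers hold the read side while linking takes the
write side):

#include <pthread.h>

struct group {
	pthread_spinlock_t lock;
};

struct substream {
	struct group self_group;
	struct group *group;		/* == &self_group until linked */
};

static pthread_rwlock_t link_rwlock = PTHREAD_RWLOCK_INITIALIZER;

/* linking: the write side excludes everyone who could be inside
 * stream_lock(), so the group pointer can move without racing lockers */
static void link_substream(struct substream *s, struct group *g)
{
	pthread_rwlock_wrlock(&link_rwlock);
	s->group = g;			/* self_group.lock -> shared lock */
	pthread_rwlock_unlock(&link_rwlock);
}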
>>>
>>>> It seems to me that your proposed patch just removes the spinlocks for
>>>> linked substreams which can cause inconsistencies for these streams.
>>>
>>> Not really.  It just uses a single lock for linked streams.  There
>>> are no longer per-stream locks once the stream is linked.
>>>
>>> This is of course a drawback -- it disallows concurrent access to
>>> linked stream members in some operations.  Whether this counts as a
>>> serious drawback is a question of operation styles.  If an app
>>> accesses many streams as one linked stream in a realtime manner, and
>>> accesses each of them without mmap, then performance may worsen.  But
>>> this scenario is pretty rare, I guess.
>>>
>>>> Or just a quick idea - what about tracking the group action
>>>> collisions?  We could use a spinlock-protected counter and serialize
>>>> these requests another way (if the counter is above one, the CPU
>>>> should wait until it drops back to one).
>>>
>>> Well...  this sounds way too complex, I'm afraid.
>>
>> I tried to think about it some more.  I think the spinlock reordering
>> is at the wrong code position.  What about the code below?  I believe
>> it should guarantee a consistent locking scheme (the first substream
>> lock will stay outside the group->lock, but that should not be an
>> issue).
> 
> Unfortunately it doesn't work either.  It's a line of thinking I've
> traced, too.
> 
> snd_pcm_action() can be called in a context where the stream lock is
> already held, e.g. when stopping the stream from the interrupt
> handler.  Thus you can't take the group lock around it.
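
A sketch of that constraint (userspace model, invented names): the
interrupt path enters with the stream lock already held, so a group lock
taken inside snd_pcm_action() would nest in the reverse order of any path
that takes the group lock first:

#include <pthread.h>

struct substream {
	pthread_spinlock_t self_lock;
	pthread_spinlock_t *group_lock;
};

/* model of the period-elapsed irq path: the stream lock is already
 * held when the stream must be stopped */
static void stop_from_irq(struct substream *s)
{
	pthread_spin_lock(&s->self_lock);
	/*
	 * snd_pcm_action() runs here; taking *s->group_lock now would
	 * nest the outer (group) lock inside the inner (stream) lock,
	 * the reverse of the normal path -> potential ABBA deadlock
	 */
	pthread_spin_unlock(&s->self_lock);
}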

Agreed. We must use either the "master" lock or the "per-substream"
locks. Mixing these locks results in an ugly locking scheme.

> The re-locking in the current code (unlock first, then take both locks)
> is a compromise.  It actually works well in practice.  But the
> difference between the lock and unlock sequences is confusing, and
> checking for the double-lock via spin_trylock() is too tricky.

I think spin_trylock() can help, but we should use it in a different
way than before. I think I have finally found a way to use the
per-substream locks and serialize the locking of all of them for a
group operation. Please review the patch below.

The next change could be to move the spinlocks from snd_pcm_group to
snd_pcm_substream, and to change the group structure allocation to
avoid GFP_ATOMIC, as you suggested.
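
For the allocation part, the usual pattern would be to do the blocking
allocation before taking any lock and only publish the result under the
lock; a hedged sketch in the same userspace model, with malloc standing
in for kmalloc(GFP_KERNEL) and the function name invented:

#include <pthread.h>
#include <stdlib.h>

struct group {
	pthread_spinlock_t lock;
	int count;
};

static pthread_rwlock_t link_rwlock = PTHREAD_RWLOCK_INITIALIZER;

/* allocate first, outside all locks (kernel analogue: GFP_KERNEL
 * instead of GFP_ATOMIC inside the write lock), then publish */
static struct group *alloc_and_install_group(struct group **slot)
{
	struct group *g = malloc(sizeof(*g));	/* may block; no lock held */

	if (g == NULL)
		return NULL;
	pthread_spin_init(&g->lock, PTHREAD_PROCESS_PRIVATE);
	g->count = 0;

	pthread_rwlock_wrlock(&link_rwlock);
	*slot = g;				/* publish under the lock */
	pthread_rwlock_unlock(&link_rwlock);
	return g;
}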

					Jaroslav

diff --git a/include/sound/pcm.h b/include/sound/pcm.h
index 1d58d79..9decfa2 100644
--- a/include/sound/pcm.h
+++ b/include/sound/pcm.h
@@ -362,6 +362,8 @@ struct snd_pcm_group {		/* keep linked substreams */
 	spinlock_t lock;
 	struct list_head substreams;
 	int count;
+	atomic_t master_count;
+	atomic_t lock_count;
 };

 struct pid;
diff --git a/sound/core/pcm.c b/sound/core/pcm.c
index 6e4bfcc..c3ae1fc 100644
--- a/sound/core/pcm.c
+++ b/sound/core/pcm.c
@@ -690,6 +690,8 @@ int snd_pcm_new_stream(struct snd_pcm *pcm, int stream, int substream_count)
 		substream->group = &substream->self_group;
 		spin_lock_init(&substream->self_group.lock);
 		INIT_LIST_HEAD(&substream->self_group.substreams);
+		atomic_set(&substream->self_group.master_count, 0);
+		atomic_set(&substream->self_group.lock_count, 0);
 		list_add_tail(&substream->link_list, &substream->self_group.substreams);
 		atomic_set(&substream->mmap_count, 0);
 		prev = substream;
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index 25ed9fe..e093f8b 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -709,12 +709,64 @@ static int snd_pcm_action_group(struct action_ops *ops,
 {
 	struct snd_pcm_substream *s = NULL;
 	struct snd_pcm_substream *s1;
-	int res = 0;
+	struct snd_pcm_group *g = substream->group;
+	int res = 0, idx, lcnt;
+	unsigned long locked = 0;
+
+	if (!do_lock)
+		goto _pre;
+
+	/*
+	 * ensure the stream locking serialization here
+	 */
+	atomic_inc(&g->master_count);
+ _retry:
+	if (atomic_inc_return(&g->lock_count) == 1) {
+		lcnt = 0;
+ _retry_lock:
+		idx = 0;
+		snd_pcm_group_for_each_entry(s, substream) {
+			if (s != substream && !(locked & (1UL << idx))) {
+				if (spin_trylock(&s->self_group.lock)) {
+					locked |= (1UL << idx);
+					lcnt++;
+				}
+			}
+			idx++;
+		}
+		/*
+		 * at this point, check whether another group action
+		 * owner is trying to reach this part of the code
+		 *
+		 * another situation: another substream lock is held
+		 * without a group action request; wait until that
+		 * request is finished [ref1]
+		 */
+		if (g->count != lcnt + atomic_read(&g->master_count))
+			goto _retry_lock;
+	} else {
+		/*
+		 * another group owner tries to reach this code
+		 * serialize: wait for the first owner(s)
+		 */
+		while (atomic_read(&g->lock_count) != 1) {
+			/*
+			 * multiple requests - release the atomic
+			 * counter to avoid deadlock; a fresh read
+			 * of the counter widens the time window
+			 * for the other checkers
+			 */
+			if (atomic_read(&g->lock_count) > 2) {
+				atomic_dec(&g->lock_count);
+				goto _retry;
+			}
+			/* do nothing here, wait for finish */
+		}
+		goto _retry_lock;
+	}

+ _pre:
 	snd_pcm_group_for_each_entry(s, substream) {
-		if (do_lock && s != substream)
-			spin_lock_nested(&s->self_group.lock,
-					 SINGLE_DEPTH_NESTING);
 		res = ops->pre_action(s, state);
 		if (res < 0)
 			goto _unlock;
@@ -729,7 +781,6 @@ static int snd_pcm_action_group(struct action_ops *ops,
 					ops->undo_action(s1, state);
 				}
 			}
-			s = NULL; /* unlock all */
 			goto _unlock;
 		}
 	}
@@ -738,13 +789,19 @@ static int snd_pcm_action_group(struct action_ops *ops,
 	}
  _unlock:
 	if (do_lock) {
-		/* unlock streams */
+		/* unlock all streams */
+		idx = 0;
 		snd_pcm_group_for_each_entry(s1, substream) {
-			if (s1 != substream)
+			if (s1 != substream && (locked & (1UL << idx)) != 0)
 				spin_unlock(&s1->self_group.lock);
-			if (s1 == s)	/* end */
-				break;
+			idx++;
 		}
+		/*
+		 * keep decrement order reverse to avoid
+		 * a bad [ref1] condition check
+		 */
+		atomic_dec(&g->master_count);
+		atomic_dec(&g->lock_count);
 	}
 	return res;
 }
@@ -779,13 +835,7 @@ static int snd_pcm_action(struct action_ops *ops,
 	int res;

 	if (snd_pcm_stream_linked(substream)) {
-		if (!spin_trylock(&substream->group->lock)) {
-			spin_unlock(&substream->self_group.lock);
-			spin_lock(&substream->group->lock);
-			spin_lock(&substream->self_group.lock);
-		}
 		res = snd_pcm_action_group(ops, substream, state, 1);
-		spin_unlock(&substream->group->lock);
 	} else {
 		res = snd_pcm_action_single(ops, substream, state);
 	}
@@ -802,17 +852,7 @@ static int snd_pcm_action_lock_irq(struct action_ops *ops,
 	int res;

 	read_lock_irq(&snd_pcm_link_rwlock);
-	if (snd_pcm_stream_linked(substream)) {
-		spin_lock(&substream->group->lock);
-		spin_lock(&substream->self_group.lock);
-		res = snd_pcm_action_group(ops, substream, state, 1);
-		spin_unlock(&substream->self_group.lock);
-		spin_unlock(&substream->group->lock);
-	} else {
-		spin_lock(&substream->self_group.lock);
-		res = snd_pcm_action_single(ops, substream, state);
-		spin_unlock(&substream->self_group.lock);
-	}
+	res = snd_pcm_action(ops, substream, state);
 	read_unlock_irq(&snd_pcm_link_rwlock);
 	return res;
 }
@@ -1611,6 +1651,8 @@ static int snd_pcm_link(struct snd_pcm_substream *substream, int fd)
 		}
 		spin_lock_init(&substream->group->lock);
 		INIT_LIST_HEAD(&substream->group->substreams);
+		atomic_set(&substream->group->master_count, 0);
+		atomic_set(&substream->group->lock_count, 0);
 		list_add_tail(&substream->link_list, &substream->group->substreams);
 		substream->group->count = 1;
 	}


-- 
Jaroslav Kysela <perex at perex.cz>
Linux Kernel Sound Maintainer
ALSA Project; Red Hat, Inc.

