[alsa-devel] Clean up of PCM spinlocks
Hi,
below is a patch I posted to LKML as a follow-up to the lockdep report from Dave Jones. It's a resurrection of a patch I worked on quite a while ago. Although this fix is suboptimal, I see no better way.
If this looks OK, I'm going to queue it as 3.4 material.
thanks,
Takashi
===
From: Takashi Iwai <tiwai@suse.de>
Subject: [PATCH] ALSA: pcm - Simplify linked PCM substream spinlocks
The spinlocks held for PCM substreams have always been messy. For better concurrent accessibility, we always took the substream's own lock first. When substreams are linked, we need a group-wide lock, but this has to be taken outside the substream's lock. So we unlock the substream's lock first, then take the group lock and the substream's lock again.
This scheme is known to be problematic with lockdep, partly because the nested lock isn't marked properly, and partly because the lock and unlock sequences differ (A/B -> A/B, i.e. the locks are not released in the reverse order they were taken).
This patch tries to simplify the scheme. Instead of taking the substream's lock first, we always take the group lock first. A drawback of this is that concurrent access to individual substreams in the same group is no longer allowed. But the code looks much easier, and is likely less buggy.
Note that the group lock is initially identical to the substream's own lock, before the substream is linked with others. So, for a non-linked PCM, the new code behaves exactly the same as before.
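For reference, this identity comes from the initial wiring of a substream in the PCM core, which looks roughly like this (an abridged sketch, not part of this patch):

	/* at substream creation: group points at the embedded self_group,
	 * so group->lock and self_group.lock are the same spinlock until
	 * snd_pcm_link() installs a shared group */
	substream->group = &substream->self_group;
	spin_lock_init(&substream->self_group.lock);
	INIT_LIST_HEAD(&substream->self_group.substreams);
	list_add_tail(&substream->link_list, &substream->self_group.substreams);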
Signed-off-by: Takashi Iwai <tiwai@suse.de>
---
 include/sound/pcm.h     |   12 ++++++------
 sound/core/pcm_native.c |   36 ++++++++++++++----------------------
 2 files changed, 20 insertions(+), 28 deletions(-)

diff --git a/include/sound/pcm.h b/include/sound/pcm.h
index 0cf91b2..d634dcc 100644
--- a/include/sound/pcm.h
+++ b/include/sound/pcm.h
@@ -531,36 +531,36 @@ static inline int snd_pcm_stream_linked(struct snd_pcm_substream *substream)
 static inline void snd_pcm_stream_lock(struct snd_pcm_substream *substream)
 {
 	read_lock(&snd_pcm_link_rwlock);
-	spin_lock(&substream->self_group.lock);
+	spin_lock(&substream->group->lock);
 }

 static inline void snd_pcm_stream_unlock(struct snd_pcm_substream *substream)
 {
-	spin_unlock(&substream->self_group.lock);
+	spin_unlock(&substream->group->lock);
 	read_unlock(&snd_pcm_link_rwlock);
 }

 static inline void snd_pcm_stream_lock_irq(struct snd_pcm_substream *substream)
 {
 	read_lock_irq(&snd_pcm_link_rwlock);
-	spin_lock(&substream->self_group.lock);
+	spin_lock(&substream->group->lock);
 }

 static inline void snd_pcm_stream_unlock_irq(struct snd_pcm_substream *substream)
 {
-	spin_unlock(&substream->self_group.lock);
+	spin_unlock(&substream->group->lock);
 	read_unlock_irq(&snd_pcm_link_rwlock);
 }

 #define snd_pcm_stream_lock_irqsave(substream, flags) \
 do { \
 	read_lock_irqsave(&snd_pcm_link_rwlock, (flags)); \
-	spin_lock(&substream->self_group.lock); \
+	spin_lock(&substream->group->lock); \
 } while (0)

 #define snd_pcm_stream_unlock_irqrestore(substream, flags) \
 do { \
-	spin_unlock(&substream->self_group.lock); \
+	spin_unlock(&substream->group->lock); \
 	read_unlock_irqrestore(&snd_pcm_link_rwlock, (flags)); \
 } while (0)

diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index 25ed9fe..c5f28ed 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -712,7 +712,7 @@ static int snd_pcm_action_group(struct action_ops *ops,
 	int res = 0;

 	snd_pcm_group_for_each_entry(s, substream) {
-		if (do_lock && s != substream)
+		if (do_lock)
 			spin_lock_nested(&s->self_group.lock,
 					 SINGLE_DEPTH_NESTING);
 		res = ops->pre_action(s, state);
@@ -740,8 +740,7 @@ static int snd_pcm_action_group(struct action_ops *ops,
 	if (do_lock) {
 		/* unlock streams */
 		snd_pcm_group_for_each_entry(s1, substream) {
-			if (s1 != substream)
-				spin_unlock(&s1->self_group.lock);
+			spin_unlock(&s1->self_group.lock);
 			if (s1 == s)	/* end */
 				break;
 		}
@@ -779,13 +778,7 @@ static int snd_pcm_action(struct action_ops *ops,
 	int res;

 	if (snd_pcm_stream_linked(substream)) {
-		if (!spin_trylock(&substream->group->lock)) {
-			spin_unlock(&substream->self_group.lock);
-			spin_lock(&substream->group->lock);
-			spin_lock(&substream->self_group.lock);
-		}
 		res = snd_pcm_action_group(ops, substream, state, 1);
-		spin_unlock(&substream->group->lock);
 	} else {
 		res = snd_pcm_action_single(ops, substream, state);
 	}
@@ -801,19 +794,13 @@ static int snd_pcm_action_lock_irq(struct action_ops *ops,
 	int res;

-	read_lock_irq(&snd_pcm_link_rwlock);
+	snd_pcm_stream_lock_irq(substream);
 	if (snd_pcm_stream_linked(substream)) {
-		spin_lock(&substream->group->lock);
-		spin_lock(&substream->self_group.lock);
 		res = snd_pcm_action_group(ops, substream, state, 1);
-		spin_unlock(&substream->self_group.lock);
-		spin_unlock(&substream->group->lock);
 	} else {
-		spin_lock(&substream->self_group.lock);
 		res = snd_pcm_action_single(ops, substream, state);
-		spin_unlock(&substream->self_group.lock);
 	}
-	read_unlock_irq(&snd_pcm_link_rwlock);
+	snd_pcm_stream_unlock_irq(substream);
 	return res;
 }

@@ -1586,12 +1573,18 @@ static int snd_pcm_link(struct snd_pcm_substream *substream, int fd)
 	struct file *file;
 	struct snd_pcm_file *pcm_file;
 	struct snd_pcm_substream *substream1;
+	struct snd_pcm_group *group;

 	file = snd_pcm_file_fd(fd);
 	if (!file)
 		return -EBADFD;
 	pcm_file = file->private_data;
 	substream1 = pcm_file->substream;
+	group = kmalloc(sizeof(*group), GFP_KERNEL);
+	if (!group) {
+		res = -ENOMEM;
+		goto _nolock;
+	}
 	down_write(&snd_pcm_link_rwsem);
 	write_lock_irq(&snd_pcm_link_rwlock);
 	if (substream->runtime->status->state == SNDRV_PCM_STATE_OPEN ||
@@ -1604,11 +1597,7 @@ static int snd_pcm_link(struct snd_pcm_substream *substream, int fd)
 		goto _end;
 	}
 	if (!snd_pcm_stream_linked(substream)) {
-		substream->group = kmalloc(sizeof(struct snd_pcm_group), GFP_ATOMIC);
-		if (substream->group == NULL) {
-			res = -ENOMEM;
-			goto _end;
-		}
+		substream->group = group;
 		spin_lock_init(&substream->group->lock);
 		INIT_LIST_HEAD(&substream->group->substreams);
 		list_add_tail(&substream->link_list, &substream->group->substreams);
@@ -1620,7 +1609,10 @@ static int snd_pcm_link(struct snd_pcm_substream *substream, int fd)
  _end:
 	write_unlock_irq(&snd_pcm_link_rwlock);
 	up_write(&snd_pcm_link_rwsem);
+ _nolock:
 	fput(file);
+	if (res < 0)
+		kfree(group);
 	return res;
 }
Date 13.3.2012 11:57, Takashi Iwai wrote:
> This patch tries to simplify the scheme. Instead of taking the
> substream's lock first, we always take the group lock first. A drawback
> of this is that concurrent access to individual substreams in the same
> group is no longer allowed. But the code looks much easier, and is
> likely less buggy.
What do you mean by this? I noticed that you switched the behaviour for the stream_lock():
1) self_group spinlock is used for single (non-linked) stream
   - same as before
2) group.lock spinlock is used for the linked streams
   - the self_group spinlock is used only in the group action
   - the question is why? it seems useless, the stream_lock()
     functions use the allocated group.lock
I don't think that this patch will work in the way we expect.
I believe we should use only the group.lock spinlock for the linked streams, without using the self_group spinlock, to make locking work like this:
	stream_lock(current)
	group_action(current) {
		for s from all_group_substreams {
			if (s != current)
				stream_lock(s);
		}
		.... job ....
		.... group substreams unlock ....
	}
	stream_unlock(current)
So we use only one lock per stream.
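In C terms, the scheme would look roughly like this (an illustrative sketch only, reusing the pcm_native.c names; not a tested patch):

	/* sketch: the caller already holds stream_lock(substream),
	 * i.e. its self_group.lock; the group action takes only the
	 * self_group locks of the *other* members */
	static void group_action_sketch(struct snd_pcm_substream *substream)
	{
		struct snd_pcm_substream *s;

		snd_pcm_group_for_each_entry(s, substream) {
			if (s != substream)
				spin_lock_nested(&s->self_group.lock,
						 SINGLE_DEPTH_NESTING);
		}
		/* .... job .... */
		snd_pcm_group_for_each_entry(s, substream) {
			if (s != substream)
				spin_unlock(&s->self_group.lock);
		}
	}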
Jaroslav
At Tue, 13 Mar 2012 15:24:11 +0100, Jaroslav Kysela wrote:
> What do you mean by this? I noticed that you switched the behaviour for
> the stream_lock():
>
> 1) self_group spinlock is used for single (non-linked) stream
>    - same as before
> 2) group.lock spinlock is used for the linked streams
>    - the self_group spinlock is used only in the group action
>    - the question is why? it seems useless, the stream_lock()
>      functions use the allocated group.lock
Oh yeah, right, this is utterly useless! I was thinking of the mutex case, but there the lock is avoided anyway.
> I don't think that this patch will work in the way we expect.
>
> I believe we should use only the group.lock spinlock for the linked
> streams, without using the self_group spinlock, to make locking work
> like this:
>
>	stream_lock(current)
>	group_action(current) {
>		for s from all_group_substreams {
>			if (s != current)
>				stream_lock(s);
>		}
>		.... job ....
>		.... group substreams unlock ....
>	}
>	stream_unlock(current)
Well, with a single lock the code becomes much simpler than the above.
The revised patch is below. I left the single-stream version as is, since it's used explicitly in drain() -- the drain action is applied to each substream individually even though it's linked.
Also, the change in snd_pcm_link() to use GFP_KERNEL has been split out of this patch.
thanks,
Takashi
---

From 4ad4cd8120958fa9ec037f21da6001ec7768577d Mon Sep 17 00:00:00 2001
From: Takashi Iwai <tiwai@suse.de>
Date: Mon, 12 Mar 2012 18:30:13 +0100
Subject: [PATCH v2] ALSA: pcm - Simplify linked PCM substream spinlocks
The spinlocks held for PCM substreams have always been messy. For better concurrent accessibility, we always took the substream's own lock first. When substreams are linked, we need a group-wide lock, but this has to be taken outside the substream's lock. So we unlock the substream's lock first, then take the group lock and the substream's lock again.
This scheme is known to be problematic with lockdep, partly because the nested lock isn't marked properly, and partly because the lock and unlock sequences differ (A/B -> A/B, i.e. the locks are not released in the reverse order they were taken).
This patch tries to simplify the scheme. Instead of taking nested spinlocks, we hold a single lock, substream->group->lock. This points to self_group.lock while the stream is unlinked, and to the group's common lock once the stream is linked.
A drawback of this is that concurrent access to individual substreams in the same group is no longer allowed. But the code looks much easier, and is likely less buggy.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
---
 include/sound/pcm.h     |   12 +++++------
 sound/core/pcm_native.c |   51 +++++++++--------------------------------------
 2 files changed, 15 insertions(+), 48 deletions(-)
diff --git a/include/sound/pcm.h b/include/sound/pcm.h
index 0cf91b2..d634dcc 100644
--- a/include/sound/pcm.h
+++ b/include/sound/pcm.h
@@ -531,36 +531,36 @@ static inline int snd_pcm_stream_linked(struct snd_pcm_substream *substream)
 static inline void snd_pcm_stream_lock(struct snd_pcm_substream *substream)
 {
 	read_lock(&snd_pcm_link_rwlock);
-	spin_lock(&substream->self_group.lock);
+	spin_lock(&substream->group->lock);
 }

 static inline void snd_pcm_stream_unlock(struct snd_pcm_substream *substream)
 {
-	spin_unlock(&substream->self_group.lock);
+	spin_unlock(&substream->group->lock);
 	read_unlock(&snd_pcm_link_rwlock);
 }

 static inline void snd_pcm_stream_lock_irq(struct snd_pcm_substream *substream)
 {
 	read_lock_irq(&snd_pcm_link_rwlock);
-	spin_lock(&substream->self_group.lock);
+	spin_lock(&substream->group->lock);
 }

 static inline void snd_pcm_stream_unlock_irq(struct snd_pcm_substream *substream)
 {
-	spin_unlock(&substream->self_group.lock);
+	spin_unlock(&substream->group->lock);
 	read_unlock_irq(&snd_pcm_link_rwlock);
 }

 #define snd_pcm_stream_lock_irqsave(substream, flags) \
 do { \
 	read_lock_irqsave(&snd_pcm_link_rwlock, (flags)); \
-	spin_lock(&substream->self_group.lock); \
+	spin_lock(&substream->group->lock); \
 } while (0)

 #define snd_pcm_stream_unlock_irqrestore(substream, flags) \
 do { \
-	spin_unlock(&substream->self_group.lock); \
+	spin_unlock(&substream->group->lock); \
 	read_unlock_irqrestore(&snd_pcm_link_rwlock, (flags)); \
 } while (0)
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index 25ed9fe..aff5187 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -701,23 +701,20 @@ struct action_ops {
 /*
  *  this functions is core for handling of linked stream
  *  Note: the stream state might be changed also on failure
- *  Note2: call with calling stream lock + link lock
+ *  Note2: call with calling stream lock
  */
 static int snd_pcm_action_group(struct action_ops *ops,
 				struct snd_pcm_substream *substream,
-				int state, int do_lock)
+				int state)
 {
 	struct snd_pcm_substream *s = NULL;
 	struct snd_pcm_substream *s1;
 	int res = 0;

 	snd_pcm_group_for_each_entry(s, substream) {
-		if (do_lock && s != substream)
-			spin_lock_nested(&s->self_group.lock,
-					 SINGLE_DEPTH_NESTING);
 		res = ops->pre_action(s, state);
 		if (res < 0)
-			goto _unlock;
+			return res;
 	}
 	snd_pcm_group_for_each_entry(s, substream) {
 		res = ops->do_action(s, state);
@@ -729,23 +726,12 @@ static int snd_pcm_action_group(struct action_ops *ops,
 					ops->undo_action(s1, state);
 				}
 			}
-			s = NULL; /* unlock all */
-			goto _unlock;
+			return res;
 		}
 	}
 	snd_pcm_group_for_each_entry(s, substream) {
 		ops->post_action(s, state);
 	}
- _unlock:
-	if (do_lock) {
-		/* unlock streams */
-		snd_pcm_group_for_each_entry(s1, substream) {
-			if (s1 != substream)
-				spin_unlock(&s1->self_group.lock);
-			if (s1 == s)	/* end */
-				break;
-		}
-	}
 	return res;
 }

@@ -779,13 +765,7 @@ static int snd_pcm_action(struct action_ops *ops,
 	int res;

 	if (snd_pcm_stream_linked(substream)) {
-		if (!spin_trylock(&substream->group->lock)) {
-			spin_unlock(&substream->self_group.lock);
-			spin_lock(&substream->group->lock);
-			spin_lock(&substream->self_group.lock);
-		}
-		res = snd_pcm_action_group(ops, substream, state, 1);
-		spin_unlock(&substream->group->lock);
+		res = snd_pcm_action_group(ops, substream, state);
 	} else {
 		res = snd_pcm_action_single(ops, substream, state);
 	}
@@ -801,19 +781,9 @@ static int snd_pcm_action_lock_irq(struct action_ops *ops,
 	int res;

-	read_lock_irq(&snd_pcm_link_rwlock);
-	if (snd_pcm_stream_linked(substream)) {
-		spin_lock(&substream->group->lock);
-		spin_lock(&substream->self_group.lock);
-		res = snd_pcm_action_group(ops, substream, state, 1);
-		spin_unlock(&substream->self_group.lock);
-		spin_unlock(&substream->group->lock);
-	} else {
-		spin_lock(&substream->self_group.lock);
-		res = snd_pcm_action_single(ops, substream, state);
-		spin_unlock(&substream->self_group.lock);
-	}
-	read_unlock_irq(&snd_pcm_link_rwlock);
+	snd_pcm_stream_lock_irq(substream);
+	res = snd_pcm_action(ops, substream, state);
+	snd_pcm_stream_unlock_irq(substream);
 	return res;
 }

@@ -826,10 +796,7 @@ static int snd_pcm_action_nonatomic(struct action_ops *ops,
 	int res;

 	down_read(&snd_pcm_link_rwsem);
-	if (snd_pcm_stream_linked(substream))
-		res = snd_pcm_action_group(ops, substream, state, 0);
-	else
-		res = snd_pcm_action_single(ops, substream, state);
+	res = snd_pcm_action(ops, substream, state);
 	up_read(&snd_pcm_link_rwsem);
 	return res;
 }
Date 13.3.2012 16:00, Takashi Iwai wrote:
> Well, with a single lock the code becomes much simpler than the above.
>
> The revised patch is below. I left the single-stream version as is,
> since it's used explicitly in drain() -- the drain action is applied
> to each substream individually even though it's linked.
I meant using group->lock in the group action -- see the patch below.
> Also, the change in snd_pcm_link() to use GFP_KERNEL has been split
> out of this patch.
Maybe we can reuse the self_group container for the linked streams, too? In that case, the dynamic allocation of a new group structure could be removed.
Jaroslav
diff --git a/include/sound/pcm.h b/include/sound/pcm.h
index 1d58d79..301ed22 100644
--- a/include/sound/pcm.h
+++ b/include/sound/pcm.h
@@ -535,36 +535,36 @@ static inline int snd_pcm_stream_linked(struct snd_pcm_substream *substream)
 static inline void snd_pcm_stream_lock(struct snd_pcm_substream *substream)
 {
 	read_lock(&snd_pcm_link_rwlock);
-	spin_lock(&substream->self_group.lock);
+	spin_lock(&substream->group->lock);
 }

 static inline void snd_pcm_stream_unlock(struct snd_pcm_substream *substream)
 {
-	spin_unlock(&substream->self_group.lock);
+	spin_unlock(&substream->group->lock);
 	read_unlock(&snd_pcm_link_rwlock);
 }

 static inline void snd_pcm_stream_lock_irq(struct snd_pcm_substream *substream)
 {
 	read_lock_irq(&snd_pcm_link_rwlock);
-	spin_lock(&substream->self_group.lock);
+	spin_lock(&substream->group->lock);
 }

 static inline void snd_pcm_stream_unlock_irq(struct snd_pcm_substream *substream)
 {
-	spin_unlock(&substream->self_group.lock);
+	spin_unlock(&substream->group->lock);
 	read_unlock_irq(&snd_pcm_link_rwlock);
 }

 #define snd_pcm_stream_lock_irqsave(substream, flags) \
 do { \
 	read_lock_irqsave(&snd_pcm_link_rwlock, (flags)); \
-	spin_lock(&substream->self_group.lock); \
+	spin_lock(&substream->group->lock); \
 } while (0)

 #define snd_pcm_stream_unlock_irqrestore(substream, flags) \
 do { \
-	spin_unlock(&substream->self_group.lock); \
+	spin_unlock(&substream->group->lock); \
 	read_unlock_irqrestore(&snd_pcm_link_rwlock, (flags)); \
 } while (0)
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index 25ed9fe..5096c83 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -713,7 +713,7 @@ static int snd_pcm_action_group(struct action_ops *ops,

 	snd_pcm_group_for_each_entry(s, substream) {
 		if (do_lock && s != substream)
-			spin_lock_nested(&s->self_group.lock,
+			spin_lock_nested(&s->group->lock,
 					 SINGLE_DEPTH_NESTING);
 		res = ops->pre_action(s, state);
 		if (res < 0)
@@ -741,7 +741,7 @@ static int snd_pcm_action_group(struct action_ops *ops,
 		/* unlock streams */
 		snd_pcm_group_for_each_entry(s1, substream) {
 			if (s1 != substream)
-				spin_unlock(&s1->self_group.lock);
+				spin_unlock(&s1->group->lock);
 			if (s1 == s)	/* end */
 				break;
 		}
@@ -779,13 +779,7 @@ static int snd_pcm_action(struct action_ops *ops,
 	int res;

 	if (snd_pcm_stream_linked(substream)) {
-		if (!spin_trylock(&substream->group->lock)) {
-			spin_unlock(&substream->self_group.lock);
-			spin_lock(&substream->group->lock);
-			spin_lock(&substream->self_group.lock);
-		}
-		res = snd_pcm_action_group(ops, substream, state, 1);
-		spin_unlock(&substream->group->lock);
+		res = snd_pcm_action_group(ops, substream, state);
 	} else {
 		res = snd_pcm_action_single(ops, substream, state);
 	}
@@ -801,19 +795,9 @@ static int snd_pcm_action_lock_irq(struct action_ops *ops,
 	int res;

-	read_lock_irq(&snd_pcm_link_rwlock);
-	if (snd_pcm_stream_linked(substream)) {
-		spin_lock(&substream->group->lock);
-		spin_lock(&substream->self_group.lock);
-		res = snd_pcm_action_group(ops, substream, state, 1);
-		spin_unlock(&substream->self_group.lock);
-		spin_unlock(&substream->group->lock);
-	} else {
-		spin_lock(&substream->self_group.lock);
-		res = snd_pcm_action_single(ops, substream, state);
-		spin_unlock(&substream->self_group.lock);
-	}
-	read_unlock_irq(&snd_pcm_link_rwlock);
+	snd_pcm_stream_lock_irq(substream);
+	res = snd_pcm_action(ops, substream, state);
+	snd_pcm_stream_unlock_irq(substream);
 	return res;
 }
At Tue, 13 Mar 2012 16:16:36 +0100, Jaroslav Kysela wrote:
> I meant using group->lock in the group action -- see the patch below.
Hmm, I don't think this works. For linked stream members, group->lock points to the very same lock instance, so with your code snd_pcm_action_group() would deadlock on the lock it already holds.
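To spell out the problem (a sketch of the failing path, not code to apply):

	/* stream_lock() has already taken substream->group->lock; for any
	 * linked member s, s->group == substream->group, so the nested
	 * lock below is the very same spinlock -> self-deadlock */
	spin_lock(&substream->group->lock);	/* done by stream_lock() */
	spin_lock_nested(&s->group->lock,	/* same lock taken again */
			 SINGLE_DEPTH_NESTING);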
> Maybe we can reuse the self_group container for the linked streams,
> too? In that case, the dynamic allocation of a new group structure
> could be removed.
A substream can be unlinked at any time, so the group can't belong to the substream instance. The whole point of substream->group is to detach the group instance from the substream; self_group is just an optimization for unlinked streams.
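That's also why unlinking can simply fall back to self_group; roughly like this (a sketch from memory of snd_pcm_unlink(), details hedged):

	write_lock_irq(&snd_pcm_link_rwlock);
	list_del(&substream->link_list);
	substream->group->count--;
	if (substream->group->count == 1) {	/* detach the last member, too */
		snd_pcm_group_for_each_entry(s, substream) {
			s->group = &s->self_group;
			break;
		}
		kfree(substream->group);	/* the shared group goes away */
	}
	substream->group = &substream->self_group;
	write_unlock_irq(&snd_pcm_link_rwlock);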
The GFP_KERNEL change is also a matter of optimization: whether or not kmalloc() gets called superfluously in the error paths. As the patch below shows, it's no big deal...
thanks,
Takashi
---

From: Takashi Iwai <tiwai@suse.de>
Subject: [PATCH] ALSA: pcm - Avoid GFP_ATOMIC in snd_pcm_link()
GFP_ATOMIC is used in snd_pcm_link() only because the kmalloc() is called inside a lock. Since this function isn't speed-critical and is rarely called in practice, it's better to allocate the chunk before taking the spinlock and free it in the error paths, so that GFP_KERNEL can be used.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
---
 sound/core/pcm_native.c |   15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index aff5187..afa4b83 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -1553,12 +1553,18 @@ static int snd_pcm_link(struct snd_pcm_substream *substream, int fd)
 	struct file *file;
 	struct snd_pcm_file *pcm_file;
 	struct snd_pcm_substream *substream1;
+	struct snd_pcm_group *group;

 	file = snd_pcm_file_fd(fd);
 	if (!file)
 		return -EBADFD;
 	pcm_file = file->private_data;
 	substream1 = pcm_file->substream;
+	group = kmalloc(sizeof(*group), GFP_KERNEL);
+	if (!group) {
+		res = -ENOMEM;
+		goto _nolock;
+	}
 	down_write(&snd_pcm_link_rwsem);
 	write_lock_irq(&snd_pcm_link_rwlock);
 	if (substream->runtime->status->state == SNDRV_PCM_STATE_OPEN ||
@@ -1571,11 +1577,7 @@ static int snd_pcm_link(struct snd_pcm_substream *substream, int fd)
 		goto _end;
 	}
 	if (!snd_pcm_stream_linked(substream)) {
-		substream->group = kmalloc(sizeof(struct snd_pcm_group), GFP_ATOMIC);
-		if (substream->group == NULL) {
-			res = -ENOMEM;
-			goto _end;
-		}
+		substream->group = group;
 		spin_lock_init(&substream->group->lock);
 		INIT_LIST_HEAD(&substream->group->substreams);
 		list_add_tail(&substream->link_list, &substream->group->substreams);
@@ -1587,7 +1589,10 @@ static int snd_pcm_link(struct snd_pcm_substream *substream, int fd)
  _end:
 	write_unlock_irq(&snd_pcm_link_rwlock);
 	up_write(&snd_pcm_link_rwsem);
+ _nolock:
 	fput(file);
+	if (res < 0)
+		kfree(group);
 	return res;
 }
Date 13.3.2012 16:23, Takashi Iwai wrote:
> Hmm, I don't think this works. For linked stream members, group->lock
> points to the very same lock instance, so with your code
> snd_pcm_action_group() would deadlock on the lock it already holds.
I see. The problem is that concurrently running but linked streams can take different first locks for the group locking:
	time 0                        time 1
	------------------------------------------------------------
	lock stream 0 -> group lock
	                              lock stream 1 -> group lock
In this case the s != current condition is not enough. I think we have to strictly separate the situations in which the group lock can be taken, and use a different locking scheme there -- for example, a master lock per set of linked substreams.
It seems to me that your proposed patch just removes the per-stream spinlocks for linked substreams, which can cause inconsistencies for these streams.
Or, just a quick idea -- what about tracking the group action collisions? We could use a spinlock-protected counter and serialize these requests another way (if the counter is above one, the CPU would wait until it drops back to one).
> A substream can be unlinked at any time, so the group can't belong to
> the substream instance. The whole point of substream->group is to
> detach the group instance from the substream; self_group is just an
> optimization for unlinked streams.
I see.
> The GFP_KERNEL change is also a matter of optimization: whether or not
> kmalloc() gets called superfluously in the error paths. As the patch
> below shows, it's no big deal...
I agree.
Jaroslav
At Tue, 13 Mar 2012 16:46:39 +0100, Jaroslav Kysela wrote:
> I see. The problem is that concurrently running but linked streams can
> take different first locks for the group locking:
>
>	time 0                        time 1
>	------------------------------------------------------------
>	lock stream 0 -> group lock
>	                              lock stream 1 -> group lock
> In this case the s != current condition is not enough. I think we have
> to strictly separate the situations in which the group lock can be
> taken, and use a different locking scheme there -- for example, a
> master lock per set of linked substreams.
In my revised patch, group->lock actually is the master lock of the group. It is the only lock applied to the substream.
Linking a stream is done under the global write lock, which guarantees that the stream isn't locked at that moment. Then group->lock is switched from self_group.lock to the real group lock, so there is no race in this transition.
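That is, the relevant part of snd_pcm_link() (abridged from the patch earlier in this thread):

	write_lock_irq(&snd_pcm_link_rwlock);
	/* ... state checks ... */
	if (!snd_pcm_stream_linked(substream)) {
		/* no reader can hold stream_lock() here, so switching
		 * from &substream->self_group to the shared group is
		 * race-free */
		substream->group = group;
		spin_lock_init(&substream->group->lock);
		INIT_LIST_HEAD(&substream->group->substreams);
		list_add_tail(&substream->link_list,
			      &substream->group->substreams);
	}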
> It seems to me that your proposed patch just removes the per-stream
> spinlocks for linked substreams, which can cause inconsistencies for
> these streams.
Not really. It just uses a single lock for linked streams. There are no individual stream locks any longer once the stream is linked.
This is of course a drawback -- it disallows concurrent access to linked stream members in some operations. How serious a drawback that is, is a question of usage styles. If an app accesses many streams as one linked stream in a realtime manner, and accesses each of them without mmap, performance may get worse. But such a scenario is pretty rare, I guess.
> Or, just a quick idea -- what about tracking the group action
> collisions? We could use a spinlock-protected counter and serialize
> these requests another way (if the counter is above one, the CPU would
> wait until it drops back to one).
Well... this sounds way too complex, I'm afraid.
thanks,
Takashi
Date 13.3.2012 17:05, Takashi Iwai wrote:
> Well... this sounds way too complex, I'm afraid.
I tried to think about it some more. I think the spinlock reordering is at the wrong code position. What about the code below? I believe it guarantees a consistent locking scheme (although the first substream lock stays outside of group->lock, that should not be an issue).
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index 25ed9fe..758a7c0 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -711,10 +711,12 @@ static int snd_pcm_action_group(struct action_ops *ops,
 	struct snd_pcm_substream *s1;
 	int res = 0;

+	spin_lock(&substream->group->lock);
 	snd_pcm_group_for_each_entry(s, substream) {
-		if (do_lock && s != substream)
-			spin_lock_nested(&s->self_group.lock,
-					 SINGLE_DEPTH_NESTING);
+		if (do_lock && s != substream) {
+			while (!spin_trylock(&s->self_group.lock))
+				/* nothing */ ;
+		}
 		res = ops->pre_action(s, state);
 		if (res < 0)
 			goto _unlock;
@@ -746,6 +748,7 @@ static int snd_pcm_action_group(struct action_ops *ops,
 			break;
 		}
 	}
+	spin_unlock(&substream->group->lock);
 	return res;
 }

@@ -779,13 +782,7 @@ static int snd_pcm_action(struct action_ops *ops,
 	int res;

 	if (snd_pcm_stream_linked(substream)) {
-		if (!spin_trylock(&substream->group->lock)) {
-			spin_unlock(&substream->self_group.lock);
-			spin_lock(&substream->group->lock);
-			spin_lock(&substream->self_group.lock);
-		}
-		res = snd_pcm_action_group(ops, substream, state, 1);
-		spin_unlock(&substream->group->lock);
+		res = snd_pcm_action_group(ops, substream, state);
 	} else {
 		res = snd_pcm_action_single(ops, substream, state);
 	}
@@ -802,17 +799,7 @@ static int snd_pcm_action_lock_irq(struct action_ops *ops,
 	int res;

 	read_lock_irq(&snd_pcm_link_rwlock);
-	if (snd_pcm_stream_linked(substream)) {
-		spin_lock(&substream->group->lock);
-		spin_lock(&substream->self_group.lock);
-		res = snd_pcm_action_group(ops, substream, state, 1);
-		spin_unlock(&substream->self_group.lock);
-		spin_unlock(&substream->group->lock);
-	} else {
-		spin_lock(&substream->self_group.lock);
-		res = snd_pcm_action_single(ops, substream, state);
-		spin_unlock(&substream->self_group.lock);
-	}
+	res = snd_pcm_action(ops, substream, state);
 	read_unlock_irq(&snd_pcm_link_rwlock);
 	return res;
 }
Jaroslav
At Tue, 13 Mar 2012 20:50:00 +0100, Jaroslav Kysela wrote:
> I tried to think about it some more. I think the spinlock reordering is
> at the wrong code position. What about the code below? I believe it
> guarantees a consistent locking scheme (although the first substream
> lock stays outside of group->lock, that should not be an issue).
Unfortunately this doesn't work either. It's a line of thinking I traced, too.
snd_pcm_action() can be called in a context where the stream lock is already held, e.g. when stopping the stream from the interrupt handler. Thus you can't take the group lock around it.
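For example, the usual driver pattern looks roughly like this (a sketch; xrun_detected() is a hypothetical stand-in for the driver's own check):

	static irqreturn_t my_irq_handler(int irq, void *dev_id)
	{
		struct snd_pcm_substream *substream = dev_id;

		snd_pcm_stream_lock(substream);
		/* snd_pcm_stop() goes through snd_pcm_action() with the
		 * stream lock -- i.e. group->lock -- already held */
		if (xrun_detected(substream))
			snd_pcm_stop(substream, SNDRV_PCM_STATE_XRUN);
		snd_pcm_stream_unlock(substream);
		return IRQ_HANDLED;
	}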
The re-locking in the current code (unlock first, then take the two locks) is a compromise. It actually works well in practice. But the difference between the lock and unlock sequences is confusing, and detecting the double lock via spin_trylock() is too tricky.
thanks,
Takashi
Date 13.3.2012 21:24, Takashi Iwai wrote:
At Tue, 13 Mar 2012 20:50:00 +0100, Jaroslav Kysela wrote:
Date 13.3.2012 17:05, Takashi Iwai wrote:
At Tue, 13 Mar 2012 16:46:39 +0100, Jaroslav Kysela wrote:
Date 13.3.2012 16:23, Takashi Iwai wrote:
At Tue, 13 Mar 2012 16:16:36 +0100, Jaroslav Kysela wrote:
Date 13.3.2012 16:00, Takashi Iwai wrote: > At Tue, 13 Mar 2012 15:24:11 +0100, > Jaroslav Kysela wrote: >> >> Date 13.3.2012 11:57, Takashi Iwai wrote: >>> Hi, >>> >>> below is a patch I posted to LKML as the follow up for the lockdep >>> report from Dave Jones. It's a resurrected patch from what I worked >>> on quite ago. Although this fix is suboptimal, I see no better way. >>> >>> If this looks OK, I'm going to queue it as a 3.4 material. >>> >>> >>> thanks, >>> >>> Takashi >>> >>> === >>> >>> From: Takashi Iwai tiwai@suse.de >>> Subject: [PATCH] ALSA: pcm - Simplify linked PCM substream spinlocks >>> >>> The spinlocks held for PCM substreams have been always messy. For the >>> better concurrent accessibility, we took always the substream's lock >>> first. When substreams are linked, we need a group-wide lock, but >>> this should be applied outside substream's lock. So, we unlock the >>> substream's lock first, then do locks twice. >>> >>> This scheme is known to be problematic with lockdep. Maybe because >>> the nested lock isn't marked properly, and partly because the lock and >>> unlock sequences are different (A/B -> A/B). >>> >>> This patch tries to simplify the scheme. Instead of holding the >>> substream's lock, we take the group lock always at first. A drawback >>> of this is that the access to the individual substream in a same group >>> won't be allowed any longer. But, the code looks much easier, and >>> likely less buggy. >> >> What do you mean by this? I noticed that you switched the behaviour for >> the stream_lock(): >> >> 1) self_group spinlock is used for single (non-linked) stream >> - same as before >> 2) group.lock spinlock is used for the linked streams >> - the self_group spinlock is used only in the group action >> - the question is why? it seems useless, the >> stream_lock() functions use the allocated group.lock > > Oh yeah, right, this is utterly useless! > I just thought of the mutex case, but in this case, the lock is > avoided anyway. > >> I don't think that this patch will work in the way we expect. >> >> I believe we should use only group.lock spinlock for the linked streams >> without using the self_group spinlock to make locking work like this: >> >> stream_lock(current) >> group_action(current) { >> for s from all_group_substreams { >> if (s != current) >> stream_lock(s); >> } >> .... job .... >> .... group substreams unlock .... >> } >> stream_unlock(current) > > Well, if we think of a single lock, the code would be much > simplified than above. > > The reviewsed patch is below. I left the single-stream version as is > since it's used explicitly in the drain() -- the drain action is > applied to each substream although it's linked.
I meant use group->lock in the group_action, see patch below.

Hmm, I don't think this works. From linked stream members, group->lock points to the very same lock instance. So, snd_pcm_action_group() would deadlock on the same lock with your code.

I see. The problem is that concurrent but linked streams can take different first locks for the group locking:

   time 0                        time 1
   ---------------------------------------------------------
   lock stream 0 -> group lock
                                 lock stream 1 -> group lock

In this case the condition current != s is not enough. I think that we have to strictly separate the situations when the group lock can be activated and use a different locking scheme there - for example using a master lock per linked substreams.
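As an illustration, a minimal userspace sketch of this interleaving, with pthread mutexes standing in for the kernel spinlocks; all names here are illustrative, not ALSA's. Each thread takes its own stream lock, then the shared group lock, and the group action then also wants the peer's stream lock -- the classic AB/BA pattern. The trylock only reports the collision; a blocking lock at that point would deadlock.

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t stream_lock[2] = {
	PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER
};
static pthread_mutex_t group_lock = PTHREAD_MUTEX_INITIALIZER;

static void *group_action(void *arg)
{
	int self = (int)(long)arg, peer = !self;

	pthread_mutex_lock(&stream_lock[self]);	/* stream_lock(current) */
	pthread_mutex_lock(&group_lock);	/* enter the group action */
	/*
	 * the group action also needs the peer's stream lock; if the
	 * peer is meanwhile blocked on group_lock while holding its own
	 * stream lock, a blocking lock here would never return
	 */
	if (pthread_mutex_trylock(&stream_lock[peer]) == 0)
		pthread_mutex_unlock(&stream_lock[peer]);
	else
		printf("thread %d: peer busy - blocking here = deadlock\n",
		       self);
	pthread_mutex_unlock(&group_lock);
	pthread_mutex_unlock(&stream_lock[self]);
	return NULL;
}

int main(void)
{
	pthread_t t[2];
	long i;

	for (i = 0; i < 2; i++)
		pthread_create(&t[i], NULL, group_action, (void *)i);
	for (i = 0; i < 2; i++)
		pthread_join(t[i], NULL);
	return 0;
}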
In my revised patch, group->lock is actually the master lock of the group. This is the only lock applied to the substream.
When a stream is linked, it's done under the global write_lock to guarantee that the stream is unlocked. Then group->lock is switched from self_group.lock to a real group lock. So, there is no race during this move.
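A condensed sketch of that move, following the shape of snd_pcm_link() in sound/core/pcm_native.c -- not the verbatim kernel code; allocation, error handling and the peer substream are omitted. Because the global write lock excludes every reader, and thus every stream lock, nothing can hold the old lock across the pointer switch.

#include <sound/pcm.h>

/* snd_pcm_link_rwlock is the global rwlock in sound/core/pcm_native.c */
static void link_into_group(struct snd_pcm_substream *substream,
			    struct snd_pcm_group *group)
{
	write_lock_irq(&snd_pcm_link_rwlock);
	substream->group = group;	/* was &substream->self_group */
	list_add_tail(&substream->link_list, &group->substreams);
	group->count++;
	write_unlock_irq(&snd_pcm_link_rwlock);
}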
It seems to me that your proposed patch just removes the spinlocks for linked substreams, which can cause inconsistencies for these streams.

Not really. It just uses a single lock for linked streams. There are no longer per-stream locks once the stream is linked.

This is of course a drawback -- it disallows concurrent access to linked stream members in some operations. Whether this is regarded as a serious drawback is a question of operation styles. If any apps access many streams as a linked stream in a realtime manner, and access each without mmap, then the performance may worsen. But this scenario is pretty rare, I guess.

Or just a quick idea - what about tracking the group action collisions? We can use a spinlock-protected counter and serialize these requests in another way (if the counter is above one - the CPU should wait until it becomes one).
Well... this sounds way too complex, I'm afraid.
I tried to think about it more. I think that the spinlock reorder is at the wrong code position. What about the code below? I believe it should guarantee a consistent locking scheme (although the first substream lock will stay outside of the group->lock, it should not be an issue).

Unfortunately it doesn't work either. It's a way of thinking I've traced, too.
snd_pcm_action() could be called in the context where the stream lock has been already held, e.g. at stopping the stream from the interrupt handler. Thus you can't take the group lock over it.
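For example, a driver interrupt handler of roughly this shape already holds the stream lock when it stops the stream, so the snd_pcm_stop() -> snd_pcm_action() path runs with that lock held and cannot take a group lock "outside" it. xrun_detected() is a made-up helper standing in for real hardware status checks; the locking pattern is the usual one.

#include <linux/interrupt.h>
#include <sound/pcm.h>

/* made-up helper; a real driver reads its hardware status here */
static bool xrun_detected(struct snd_pcm_substream *substream)
{
	return false;
}

static irqreturn_t my_pcm_interrupt(int irq, void *dev_id)
{
	struct snd_pcm_substream *substream = dev_id;

	snd_pcm_stream_lock(substream);
	if (xrun_detected(substream))
		/* -> snd_pcm_action() with the stream lock already held */
		snd_pcm_stop(substream, SNDRV_PCM_STATE_XRUN);
	snd_pcm_stream_unlock(substream);
	return IRQ_HANDLED;
}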
Agreed. We must use the "master" lock or "per-substream" lock. Mixing of these locks results in an ugly locking scheme.
The re-locking in the current code (unlock first then two locks) is a compromise. It actually works well in practice. But the difference of lock and unlock sequences is confusing, and the way of checking the double-lock via spin_trylock() is too tricky.
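For reference, the sequence in question as it stands in snd_pcm_action() before either patch:

	if (snd_pcm_stream_linked(substream)) {
		if (!spin_trylock(&substream->group->lock)) {
			/* group lock already taken elsewhere: drop the
			 * stream lock and re-take both in group-then-
			 * stream order */
			spin_unlock(&substream->self_group.lock);
			spin_lock(&substream->group->lock);
			spin_lock(&substream->self_group.lock);
		}
		res = snd_pcm_action_group(ops, substream, state, 1);
		spin_unlock(&substream->group->lock);
	} else {
		res = snd_pcm_action_single(ops, substream, state);
	}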
I think that spin_trylock() can help, but we should really use it in a different way than before. I think that I finally created a way to use the per-substream locks and serialize locking of all those locks for a group operation. Please review the patch below.
The next change could be a switch of the spinlocks from snd_pcm_group to snd_pcm_substream, and a change of the group structure allocation to avoid GFP_ATOMIC, as you suggested.
Jaroslav
diff --git a/include/sound/pcm.h b/include/sound/pcm.h
index 1d58d79..9decfa2 100644
--- a/include/sound/pcm.h
+++ b/include/sound/pcm.h
@@ -362,6 +362,8 @@ struct snd_pcm_group {		/* keep linked substreams */
 	spinlock_t lock;
 	struct list_head substreams;
 	int count;
+	atomic_t master_count;
+	atomic_t lock_count;
 };
 
 struct pid;
diff --git a/sound/core/pcm.c b/sound/core/pcm.c
index 6e4bfcc..c3ae1fc 100644
--- a/sound/core/pcm.c
+++ b/sound/core/pcm.c
@@ -690,6 +690,8 @@ int snd_pcm_new_stream(struct snd_pcm *pcm, int stream, int substream_count)
 		substream->group = &substream->self_group;
 		spin_lock_init(&substream->self_group.lock);
 		INIT_LIST_HEAD(&substream->self_group.substreams);
+		atomic_set(&substream->self_group.master_count, 0);
+		atomic_set(&substream->self_group.lock_count, 0);
 		list_add_tail(&substream->link_list, &substream->self_group.substreams);
 		atomic_set(&substream->mmap_count, 0);
 		prev = substream;
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index 25ed9fe..e093f8b 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -709,12 +709,64 @@ static int snd_pcm_action_group(struct action_ops *ops,
 {
 	struct snd_pcm_substream *s = NULL;
 	struct snd_pcm_substream *s1;
-	int res = 0;
+	struct snd_pcm_group *g = substream->group;
+	int res = 0, idx, lcnt;
+	unsigned long locked = 0;
+
+	if (!do_lock)
+		goto _pre;
+
+	/*
+	 * ensure the stream locking serialization here
+	 */
+	atomic_inc(&g->master_count);
+ _retry:
+	if (atomic_inc_return(&g->lock_count) == 1) {
+		lcnt = 0;
+ _retry_lock:
+		idx = 0;
+		snd_pcm_group_for_each_entry(s, substream) {
+			if (s != substream && !(locked & (1 << idx))) {
+				if (spin_trylock(&s->self_group.lock)) {
+					locked |= (1 << idx);
+					lcnt++;
+				}
+			}
+			idx++;
+		}
+		/*
+		 * at this point, check if another group action
+		 * owner tries to access this part of code
+		 *
+		 * another situation is that another substream lock is
+		 * active without the group action request;
+		 * wait, until this request is finished [ref1]
+		 */
+		if (g->count != lcnt + atomic_read(&g->master_count))
+			goto _retry_lock;
+	} else {
+		/*
+		 * another group owner tries to reach this code
+		 * serialize: wait for the first owner(s)
+		 */
+		while (atomic_read(&g->lock_count) != 1) {
+			/*
+			 * multiple requests - try to release
+			 * the atomic counter to avoid deadlock,
+			 * use new read operation to increase
+			 * time window for other checkers
+			 */
+			if (atomic_read(&g->lock_count) > 2) {
+				atomic_dec(&g->lock_count);
+				goto _retry;
+			}
+			/* do nothing here, wait for finish */
+		}
+		goto _retry_lock;
+	}
+
+ _pre:
 	snd_pcm_group_for_each_entry(s, substream) {
-		if (do_lock && s != substream)
-			spin_lock_nested(&s->self_group.lock,
-					 SINGLE_DEPTH_NESTING);
 		res = ops->pre_action(s, state);
 		if (res < 0)
 			goto _unlock;
@@ -729,7 +781,6 @@ static int snd_pcm_action_group(struct action_ops *ops,
 					ops->undo_action(s1, state);
 			}
 		}
-		s = NULL; /* unlock all */
 		goto _unlock;
 	}
 }
@@ -738,13 +789,18 @@ static int snd_pcm_action_group(struct action_ops *ops,
 	}
  _unlock:
 	if (do_lock) {
-		/* unlock streams */
+		/* unlock all streams */
+		idx = 0;
 		snd_pcm_group_for_each_entry(s1, substream) {
-			if (s1 != substream)
+			if (s != substream && (locked & (1 << idx)) != 0)
 				spin_unlock(&s1->self_group.lock);
-			if (s1 == s)	/* end */
-				break;
 		}
+		/*
+		 * keep decrement order reverse to avoid
+		 * a bad [ref1] condition check
+		 */
+		atomic_dec(&g->master_count);
+		atomic_dec(&g->lock_count);
 	}
 	return res;
 }
@@ -779,13 +835,7 @@ static int snd_pcm_action(struct action_ops *ops,
 	int res;
 
 	if (snd_pcm_stream_linked(substream)) {
-		if (!spin_trylock(&substream->group->lock)) {
-			spin_unlock(&substream->self_group.lock);
-			spin_lock(&substream->group->lock);
-			spin_lock(&substream->self_group.lock);
-		}
 		res = snd_pcm_action_group(ops, substream, state, 1);
-		spin_unlock(&substream->group->lock);
 	} else {
 		res = snd_pcm_action_single(ops, substream, state);
 	}
@@ -802,17 +852,7 @@ static int snd_pcm_action_lock_irq(struct action_ops *ops,
 	int res;
 
 	read_lock_irq(&snd_pcm_link_rwlock);
-	if (snd_pcm_stream_linked(substream)) {
-		spin_lock(&substream->group->lock);
-		spin_lock(&substream->self_group.lock);
-		res = snd_pcm_action_group(ops, substream, state, 1);
-		spin_unlock(&substream->self_group.lock);
-		spin_unlock(&substream->group->lock);
-	} else {
-		spin_lock(&substream->self_group.lock);
-		res = snd_pcm_action_single(ops, substream, state);
-		spin_unlock(&substream->self_group.lock);
-	}
+	res = snd_pcm_action(ops, substream, state);
 	read_unlock_irq(&snd_pcm_link_rwlock);
 	return res;
 }
@@ -1611,6 +1651,8 @@ static int snd_pcm_link(struct snd_pcm_substream *substream, int fd)
 		}
 		spin_lock_init(&substream->group->lock);
 		INIT_LIST_HEAD(&substream->group->substreams);
+		atomic_set(&substream->group->master_count, 0);
+		atomic_set(&substream->group->lock_count, 0);
 		list_add_tail(&substream->link_list, &substream->group->substreams);
 		substream->group->count = 1;
 	}
At Wed, 14 Mar 2012 10:56:12 +0100, Jaroslav Kysela wrote:
I think that spin_trylock() can help, but we should really use it in a different way than before. I think that I finally created a way to use the per-substream locks and serialize locking of all those locks for a group operation. Please review the patch below.

Sorry but it's way too complex. Most of the additional code is a specially open-coded spinlock. Is it really worth adding more tricks?

I would accept such optimizations only if the concurrent access of a huge amount of linked streams were a realistic scenario. But it's rather unrealistic. If the concurrency is no big issue from a practical POV, we should rather move toward simplification, IMO.

And, above all, I'm wondering whether allowing concurrent access to linked streams is really safe. Imagine if both streams are trying to change to different states at the same time. For consistency, a single-lock approach sounds much safer.
thanks,
Takashi
Date 14.3.2012 12:24, Takashi Iwai wrote:
Sorry but it's way too complex. Most of the additional code is a specially open-coded spinlock. Is it really worth adding more tricks?

I would accept such optimizations only if the concurrent access of a huge amount of linked streams were a realistic scenario. But it's rather unrealistic. If the concurrency is no big issue from a practical POV, we should rather move toward simplification, IMO.

And, above all, I'm wondering whether allowing concurrent access to linked streams is really safe. Imagine if both streams are trying to change to different states at the same time. For consistency, a single-lock approach sounds much safer.
This should be assured by the PCM code (it calls the group action when the state is changed, so all streams are locked at this moment).
I can see the improvements when the application polls for the stream positions (for example to synchronize them) and when it uses small period sizes (many interrupts) for multiple substreams. In this case, the position updates for substreams can be executed in a parallel way.
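For instance, such a polling loop might look roughly like this from userspace -- an illustrative alsa-lib sketch, not taken from any real application. Each snd_pcm_avail() call issues a hwsync, which ends up taking the stream lock in the kernel; with one group-wide lock the two queries serialize, with per-substream locks they could proceed in parallel.

#include <stdio.h>
#include <alsa/asoundlib.h>

/* print how far two linked streams have advanced */
static void poll_positions(snd_pcm_t *a, snd_pcm_t *b)
{
	snd_pcm_sframes_t avail_a = snd_pcm_avail(a);
	snd_pcm_sframes_t avail_b = snd_pcm_avail(b);

	if (avail_a >= 0 && avail_b >= 0)
		printf("avail: %ld / %ld\n", (long)avail_a, (long)avail_b);
}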
I hoped to solve the bottleneck you mentioned in your patch. Of course, if you take ~30 lines of code (without comments, and the next 10 lines are curly brackets) as way too complex, our discussion is finished. The nice thing about my patch is that the lock serialization code is in only one place, easy to verify, and commented (the comments may be improved) for reviewers.

My next point is, if we allow a "global" lock, we will not take care of the possible concurrency in the future, which is not so good.
Jaroslav
At Wed, 14 Mar 2012 12:59:07 +0100, Jaroslav Kysela wrote:
This should be assured by the PCM code (it calls the group action when the state is changed, so all streams are locked at this moment).
But how? Suppose streams A and B are being accessed concurrently and trying to change the state at the same time but differently. At the beginning, the locks held by both are only the self_group locks. Then at snd_pcm_action_group(), it acquires the group lock. If A takes the group lock, B is spinning. With your latest patch, A tries the lock of B, can't take it, and thus continues. It will change the state of all linked streams (including B) to state A'. After snd_pcm_action() of stream A finishes, snd_pcm_action() of B begins and changes it to state B' again, or gets an error.
In this scenario, both A and B should have taken the group lock from the beginning.
I can see the improvements when the application polls for the stream positions (for example to synchronize them) and when it uses small period sizes (many interrupts) for multiple substreams. In this case, the position updates for substreams can be executed in a parallel way.
Right. There can be some methods that can be done in parallel. But anything involving state changes is basically difficult to handle in parallel.
My next point is, if we allow a "global" lock, we will not take care of the possible concurrency in the future, which is not so good.

Yeah, that's a drawback. OTOH, the handling of the state of linked streams must be done globally among all linked streams. And the lock of each substream assumes that the state doesn't change during it, which is not true when we don't protect via a global lock.

So, maybe we should think of things separately: the actions that obviously need a global lock (i.e. involving a state change) and the actions that can be easily parallelized. If we want to provide finer locks, use another substream-wise lock for the latter actions and use the group lock for the former (and eventually take the substream one inside the group lock).
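A rough sketch of that split -- all names here are hypothetical, not the ALSA API: one group-wide lock for state transitions, one per-substream lock for work that may run in parallel, such as position updates.

#include <linux/spinlock.h>

struct xgroup {
	spinlock_t state_lock;		/* state changes: group-wide */
};

struct xsubstream {
	struct xgroup *group;
	spinlock_t pos_lock;		/* position updates: per substream */
	unsigned long hw_ptr;
	int state;
};

/* state-changing action: must hold the group-wide lock */
static void xchange_state(struct xsubstream *s, int new_state)
{
	spin_lock(&s->group->state_lock);
	s->state = new_state;
	spin_unlock(&s->group->state_lock);
}

/* position update: per-substream lock, may run in parallel for
 * different members of the same group */
static void xupdate_pos(struct xsubstream *s, unsigned long ptr)
{
	spin_lock(&s->pos_lock);
	s->hw_ptr = ptr;
	spin_unlock(&s->pos_lock);
}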
Takashi
Date 14.3.2012 14:41, Takashi Iwai wrote:
So, maybe we should think of things separately: the actions that obviously need a global lock (i.e. involving a state change) and the actions that can be easily parallelized. If we want to provide finer locks, use another substream-wise lock for the latter actions and use the group lock for the former (and eventually take the substream one inside the group lock).

The question is whether it's not simpler to use my scheme and verify/add the recovery paths after the possible group call. My code guarantees that the state is not modified when the group action finishes (until the substream lock is released). So we can just do new current-state / error-value checking after these calls. A quick look at the code shows that almost all group action calls end with the lock release, so I think that a comment (noting that the state can be changed) for the group action functions should be enough.

Another possibility is to not do any serialization, just use one atomic counter, and return an error to the caller when a concurrent access occurs. The caller can decide what to do - it may even release the lock and retry the whole operation.
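A minimal sketch of that idea -- hypothetical names, not the posted patch; do_one_group_action() is a made-up worker. A single atomic counter detects a concurrent group action and bails out with -EBUSY instead of serializing.

#include <linux/atomic.h>
#include <linux/errno.h>
#include <sound/pcm.h>

static atomic_t group_busy = ATOMIC_INIT(0);

static int try_group_action(struct snd_pcm_substream *substream, int state)
{
	int res;

	if (atomic_inc_return(&group_busy) != 1) {
		/* concurrent group action - let the caller decide,
		 * e.g. release its lock and retry the whole operation */
		atomic_dec(&group_busy);
		return -EBUSY;
	}
	res = do_one_group_action(substream, state);
	atomic_dec(&group_busy);
	return res;
}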
I also think that the group lock might be acquired uselessly in many cases.
Jaroslav
At Wed, 14 Mar 2012 15:20:07 +0100, Jaroslav Kysela wrote:
The question is whether it's not simpler to use my scheme and verify/add the recovery paths after the possible group call.

Well, the scenario above doesn't work in your case either. It's no regression, though. (And yes, the scenario above doesn't work in the current code either.)

My code guarantees that the state is not modified when the group action finishes (until the substream lock is released). So we can just do new current-state / error-value checking after these calls. A quick look at the code shows that almost all group action calls end with the lock release, so I think that a comment (noting that the state can be changed) for the group action functions should be enough.

Well, remember that the current code actually works. It's just messy and confusing for lockdep and, more importantly, also for human beings. So, the primary goal of this exercise is to simplify and clean up the code to be more understandable. And, as the next point, better parallelization even with the simplification.

Another possibility is to not do any serialization, just use one atomic counter, and return an error to the caller when a concurrent access occurs. The caller can decide what to do - it may even release the lock and retry the whole operation.

Hmm, judging only from the description, I don't buy it. All of this should be implementable with normal spinlocks. If not, the design is very likely wrong.
I also think that the group lock might be acquired uselessly in many cases.
Yes. We basically assume a single lock in all pcm_lib.c operations. This is the cause of the whole problem. If we think of only a single lock, it should really be a single lock. If we need finer locks, we need two kinds of locks in the PCM operations themselves.
thanks,
Takashi