[RFC PATCH v2 0/5] ASoC: soc-pcm: fix trigger race conditions with shared BE
We've been adding a 'deep buffer' PCM device to several SOF topologies in order to reduce power consumption. The typical use case would be music playback over a headset: this additional PCM device provides more buffering and longer latencies, letting the rest of the system sleep for longer periods. Notifications and 'regular' low-latency audio playback would still use the 'normal' PCM device and be mixed with the 'deep buffer' stream before rendering on the headphone endpoint. The tentative direction would be to expose this alternate device to PulseAudio/PipeWire/CRAS via the UCM SectionModifier definitions.
That seemed like a straightforward topology change, until our automated validation stress tests started reporting issues on SoundWire platforms, where e.g. two START triggers might be sent or, conversely, the STOP trigger is never sent. The SoundWire stream state management flagged inconsistent states when the 'normal' and 'deep buffer' devices are used concurrently with rapid play/stop/pause monkey testing.
Looking at the soc-pcm.c code, it seems that the BE state management needs a lot of love.
a) there is no consistent protection for the BE state. In some parts of the code, the state updates are protected by a spinlock, but in the trigger they are not. When we open/play/close the two PCM devices in stress tests, we end up testing a state that is being modified. That can't be good.
b) there is a conceptual deadlock: on stop, we check the FE states to see whether a shared BE can be stopped, but since we trigger the BE first, the FE states have not been modified yet, so the TRIGGER_STOP is never sent.
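To make the stop problem concrete, here is a simplified timeline (an illustration of the race, not actual kernel code) for two FEs sharing one BE and stopping at roughly the same time:

   FE1 stop                                  FE2 stop
   --------                                  --------
   dpcm_be_dai_trigger(STOP)
     snd_soc_dpcm_can_be_free_stop()
       -> FE2 still in STATE_START,
          skip the BE TRIGGER_STOP
                                             dpcm_be_dai_trigger(STOP)
                                               snd_soc_dpcm_can_be_free_stop()
                                                 -> FE1 still in STATE_START,
                                                    skip the BE TRIGGER_STOP
   FE1 state set to STATE_STOP
                                             FE2 state set to STATE_STOP

Both FEs end up stopped, but neither ever sent TRIGGER_STOP to the BE.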
This patchset suggests a transition from a spinlock to a mutex, an extended protection when walking through the BE list, and the use of a refcount to decide when to trigger the BE. With these patches I am able to run our entire validation suite without any issues with this new 'deep buffer' topology, and no regressions on existing solutions [1].
One might ask 'how come we didn't see this earlier?'. The answer is probably that the .trigger callbacks in most implementations seem to perform DAPM operations, and sending the triggers multiple times is not an issue. In the case of SoundWire, we do use the .trigger callback to reconfigure the bus using the 'bank switch' mechanism. It could be acceptable to tolerate a trigger being sent multiple times, but the deadlock on stop cannot be fixed at the SoundWire layer alone.
I chose to send this patchset as an RFC v2 to gather more feedback and make sure others know about DPCM issues. We're going to spend more time on this, but if others can provide feedback/test results it would be greatly appreciated.
Opens:
1) Is this the right solution? The DPCM code is far from simple and has notions, such as SND_SOC_DPCM_UPDATE_NO and 'trigger_pending', that I have no background on.
2) There are other reports of kernel oopses [2] that seem related to the lack of protection. It would be good to confirm whether this patchset solves these problems as well.
[1] https://github.com/thesofproject/linux/pull/3146
[2] https://lore.kernel.org/alsa-devel/002f01d7b4f5$c030f4a0$4092dde0$@samsung.c...
changes since RFC v1:
 - Removed unused function
 - Removed exported symbols only used in soc-pcm.c, used static instead
 - Use a mutex instead of a spinlock
 - Protect all for_each_dpcm_be() loops
 - Fix bugs introduced in the refcount
Pierre-Louis Bossart (5):
  ASoC: soc-pcm: remove snd_soc_dpcm_fe_can_update()
  ASoC: soc-pcm: don't export local functions, use static
  ASoC: soc-pcm: replace dpcm_lock with dpcm_mutex
  ASoC: soc-pcm: protect for_each_dpcm_be() loops with dpcm_mutex
  ASoC: soc-pcm: test refcount before triggering
 include/sound/soc-dpcm.h |  17 +----
 include/sound/soc.h      |   2 +-
 sound/soc/soc-core.c     |   2 +-
 sound/soc/soc-pcm.c      | 153 ++++++++++++++++++++++++++-------------
 4 files changed, 108 insertions(+), 66 deletions(-)
This function is not used anywhere, including in soc-pcm.c.
Remove dead code.
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 include/sound/soc-dpcm.h | 3 ---
 sound/soc/soc-pcm.c      | 9 ---------
 2 files changed, 12 deletions(-)
diff --git a/include/sound/soc-dpcm.h b/include/sound/soc-dpcm.h
index bc7af90099a8..72d45ad47ee3 100644
--- a/include/sound/soc-dpcm.h
+++ b/include/sound/soc-dpcm.h
@@ -121,9 +121,6 @@ int snd_soc_dpcm_can_be_free_stop(struct snd_soc_pcm_runtime *fe,
 int snd_soc_dpcm_can_be_params(struct snd_soc_pcm_runtime *fe,
 		struct snd_soc_pcm_runtime *be, int stream);
 
-/* is the current PCM operation for this FE ? */
-int snd_soc_dpcm_fe_can_update(struct snd_soc_pcm_runtime *fe, int stream);
-
 /* is the current PCM operation for this BE ? */
 int snd_soc_dpcm_be_can_update(struct snd_soc_pcm_runtime *fe,
 		struct snd_soc_pcm_runtime *be, int stream);
diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c
index fc1854e3e43f..17476e3219ea 100644
--- a/sound/soc/soc-pcm.c
+++ b/sound/soc/soc-pcm.c
@@ -2805,15 +2805,6 @@ int soc_new_pcm(struct snd_soc_pcm_runtime *rtd, int num)
 	return ret;
 }
 
-/* is the current PCM operation for this FE ? */
-int snd_soc_dpcm_fe_can_update(struct snd_soc_pcm_runtime *fe, int stream)
-{
-	if (fe->dpcm[stream].runtime_update == SND_SOC_DPCM_UPDATE_FE)
-		return 1;
-	return 0;
-}
-EXPORT_SYMBOL_GPL(snd_soc_dpcm_fe_can_update);
-
 /* is the current PCM operation for this BE ? */
 int snd_soc_dpcm_be_can_update(struct snd_soc_pcm_runtime *fe,
 		struct snd_soc_pcm_runtime *be, int stream)
No one uses the following functions outside of soc-pcm.c:

	snd_soc_dpcm_can_be_free_stop()
	snd_soc_dpcm_can_be_params()
	snd_soc_dpcm_be_can_update()
In preparation for locking changes, make them static functions and remove the EXPORT_SYMBOL_GPL() exports.
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 include/sound/soc-dpcm.h | 12 ------------
 sound/soc/soc-pcm.c      | 21 +++++++++++++++------
 2 files changed, 15 insertions(+), 18 deletions(-)
diff --git a/include/sound/soc-dpcm.h b/include/sound/soc-dpcm.h
index 72d45ad47ee3..9c00118603e7 100644
--- a/include/sound/soc-dpcm.h
+++ b/include/sound/soc-dpcm.h
@@ -113,18 +113,6 @@ struct snd_soc_dpcm_runtime {
 #define for_each_dpcm_be_rollback(fe, stream, _dpcm) \
 	list_for_each_entry_continue_reverse(_dpcm, &(fe)->dpcm[stream].be_clients, list_be)
 
-/* can this BE stop and free */
-int snd_soc_dpcm_can_be_free_stop(struct snd_soc_pcm_runtime *fe,
-		struct snd_soc_pcm_runtime *be, int stream);
-
-/* can this BE perform a hw_params() */
-int snd_soc_dpcm_can_be_params(struct snd_soc_pcm_runtime *fe,
-		struct snd_soc_pcm_runtime *be, int stream);
-
-/* is the current PCM operation for this BE ? */
-int snd_soc_dpcm_be_can_update(struct snd_soc_pcm_runtime *fe,
-		struct snd_soc_pcm_runtime *be, int stream);
-
 /* get the substream for this BE */
 struct snd_pcm_substream *
 	snd_soc_dpcm_get_substream(struct snd_soc_pcm_runtime *be, int stream);
diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c
index 17476e3219ea..360811f8a18c 100644
--- a/sound/soc/soc-pcm.c
+++ b/sound/soc/soc-pcm.c
@@ -29,6 +29,18 @@
 
 #define DPCM_MAX_BE_USERS	8
 
+/* can this BE stop and free */
+static int snd_soc_dpcm_can_be_free_stop(struct snd_soc_pcm_runtime *fe,
+					 struct snd_soc_pcm_runtime *be, int stream);
+
+/* can this BE perform a hw_params() */
+static int snd_soc_dpcm_can_be_params(struct snd_soc_pcm_runtime *fe,
+				      struct snd_soc_pcm_runtime *be, int stream);
+
+/* is the current PCM operation for this BE ? */
+static int snd_soc_dpcm_be_can_update(struct snd_soc_pcm_runtime *fe,
+				      struct snd_soc_pcm_runtime *be, int stream);
+
 static inline const char *soc_cpu_dai_name(struct snd_soc_pcm_runtime *rtd)
 {
 	return (rtd)->num_cpus == 1 ? asoc_rtd_to_cpu(rtd, 0)->name : "multicpu";
@@ -2806,7 +2818,7 @@ int soc_new_pcm(struct snd_soc_pcm_runtime *rtd, int num)
 }
 
 /* is the current PCM operation for this BE ? */
-int snd_soc_dpcm_be_can_update(struct snd_soc_pcm_runtime *fe,
+static int snd_soc_dpcm_be_can_update(struct snd_soc_pcm_runtime *fe,
 		struct snd_soc_pcm_runtime *be, int stream)
 {
 	if ((fe->dpcm[stream].runtime_update == SND_SOC_DPCM_UPDATE_FE) ||
@@ -2815,7 +2827,6 @@ snd_soc_dpcm_be_can_update(struct snd_soc_pcm_runtime *fe,
 		return 1;
 	return 0;
 }
-EXPORT_SYMBOL_GPL(snd_soc_dpcm_be_can_update);
 
 /* get the substream for this BE */
 struct snd_pcm_substream *
@@ -2861,7 +2872,7 @@ static int snd_soc_dpcm_check_state(struct snd_soc_pcm_runtime *fe,
  * We can only hw_free, stop, pause or suspend a BE DAI if any of it's FE
  * are not running, paused or suspended for the specified stream direction.
  */
-int snd_soc_dpcm_can_be_free_stop(struct snd_soc_pcm_runtime *fe,
+static int snd_soc_dpcm_can_be_free_stop(struct snd_soc_pcm_runtime *fe,
 		struct snd_soc_pcm_runtime *be, int stream)
 {
 	const enum snd_soc_dpcm_state state[] = {
@@ -2872,13 +2883,12 @@ static int snd_soc_dpcm_can_be_free_stop(struct snd_soc_pcm_runtime *fe,
 
 	return snd_soc_dpcm_check_state(fe, be, stream, state, ARRAY_SIZE(state));
 }
-EXPORT_SYMBOL_GPL(snd_soc_dpcm_can_be_free_stop);
 
 /*
  * We can only change hw params a BE DAI if any of it's FE are not prepared,
  * running, paused or suspended for the specified stream direction.
  */
-int snd_soc_dpcm_can_be_params(struct snd_soc_pcm_runtime *fe,
+static int snd_soc_dpcm_can_be_params(struct snd_soc_pcm_runtime *fe,
 		struct snd_soc_pcm_runtime *be, int stream)
 {
 	const enum snd_soc_dpcm_state state[] = {
@@ -2890,4 +2900,3 @@ static int snd_soc_dpcm_can_be_params(struct snd_soc_pcm_runtime *fe,
 
 	return snd_soc_dpcm_check_state(fe, be, stream, state, ARRAY_SIZE(state));
 }
-EXPORT_SYMBOL_GPL(snd_soc_dpcm_can_be_params);
It's not clear why a spinlock was used, and even less clear why 'dpcm_lock' is used with irqsave/irqrestore: DPCM functions are typically not called from interrupt context.
Move to a mutex which will allow us to sleep for longer periods of time and handle non-atomic triggers.
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 include/sound/soc.h  |  2 +-
 sound/soc/soc-core.c |  2 +-
 sound/soc/soc-pcm.c  | 30 ++++++++++++------------------
 3 files changed, 14 insertions(+), 20 deletions(-)
diff --git a/include/sound/soc.h b/include/sound/soc.h
index 8e6dd8a257c5..c13d8ec72d62 100644
--- a/include/sound/soc.h
+++ b/include/sound/soc.h
@@ -893,7 +893,7 @@ struct snd_soc_card {
 	struct mutex pcm_mutex;
 	enum snd_soc_pcm_subclass pcm_subclass;
 
-	spinlock_t dpcm_lock;
+	struct mutex dpcm_mutex;	/* protect all dpcm states and updates */
 
 	int (*probe)(struct snd_soc_card *card);
 	int (*late_probe)(struct snd_soc_card *card);
diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c
index c830e96afba2..e94d43c392e0 100644
--- a/sound/soc/soc-core.c
+++ b/sound/soc/soc-core.c
@@ -2339,7 +2339,7 @@ int snd_soc_register_card(struct snd_soc_card *card)
 	mutex_init(&card->mutex);
 	mutex_init(&card->dapm_mutex);
 	mutex_init(&card->pcm_mutex);
-	spin_lock_init(&card->dpcm_lock);
+	mutex_init(&card->dpcm_mutex);
 
 	return snd_soc_bind_card(card);
 }
diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c
index 360811f8a18c..af12593b033a 100644
--- a/sound/soc/soc-pcm.c
+++ b/sound/soc/soc-pcm.c
@@ -85,7 +85,6 @@ static ssize_t dpcm_show_state(struct snd_soc_pcm_runtime *fe,
 	struct snd_pcm_hw_params *params = &fe->dpcm[stream].hw_params;
 	struct snd_soc_dpcm *dpcm;
 	ssize_t offset = 0;
-	unsigned long flags;
 
 	/* FE state */
 	offset += scnprintf(buf + offset, size - offset,
@@ -113,7 +112,7 @@ static ssize_t dpcm_show_state(struct snd_soc_pcm_runtime *fe,
 		goto out;
 	}
 
-	spin_lock_irqsave(&fe->card->dpcm_lock, flags);
+	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_be(fe, stream, dpcm) {
 		struct snd_soc_pcm_runtime *be = dpcm->be;
 		params = &dpcm->hw_params;
@@ -134,7 +133,7 @@ static ssize_t dpcm_show_state(struct snd_soc_pcm_runtime *fe,
 				params_channels(params),
 				params_rate(params));
 	}
-	spin_unlock_irqrestore(&fe->card->dpcm_lock, flags);
+	mutex_unlock(&fe->card->dpcm_mutex);
 out:
 	return offset;
 }
@@ -1141,7 +1140,6 @@ static int dpcm_be_connect(struct snd_soc_pcm_runtime *fe,
 		struct snd_soc_pcm_runtime *be, int stream)
 {
 	struct snd_soc_dpcm *dpcm;
-	unsigned long flags;
 
 	/* only add new dpcms */
 	for_each_dpcm_be(fe, stream, dpcm) {
@@ -1157,10 +1155,10 @@ static int dpcm_be_connect(struct snd_soc_pcm_runtime *fe,
 	dpcm->fe = fe;
 	be->dpcm[stream].runtime = fe->dpcm[stream].runtime;
 	dpcm->state = SND_SOC_DPCM_LINK_STATE_NEW;
-	spin_lock_irqsave(&fe->card->dpcm_lock, flags);
+	mutex_lock(&fe->card->dpcm_mutex);
 	list_add(&dpcm->list_be, &fe->dpcm[stream].be_clients);
 	list_add(&dpcm->list_fe, &be->dpcm[stream].fe_clients);
-	spin_unlock_irqrestore(&fe->card->dpcm_lock, flags);
+	mutex_unlock(&fe->card->dpcm_mutex);
 
 	dev_dbg(fe->dev, "connected new DPCM %s path %s %s %s\n",
 			stream ? "capture" : "playback",  fe->dai_link->name,
@@ -1203,7 +1201,6 @@ static void dpcm_be_reparent(struct snd_soc_pcm_runtime *fe,
 void dpcm_be_disconnect(struct snd_soc_pcm_runtime *fe, int stream)
 {
 	struct snd_soc_dpcm *dpcm, *d;
-	unsigned long flags;
 
 	for_each_dpcm_be_safe(fe, stream, dpcm, d) {
 		dev_dbg(fe->dev, "ASoC: BE %s disconnect check for %s\n",
@@ -1222,10 +1219,10 @@ void dpcm_be_disconnect(struct snd_soc_pcm_runtime *fe, int stream)
 
 		dpcm_remove_debugfs_state(dpcm);
 
-		spin_lock_irqsave(&fe->card->dpcm_lock, flags);
+		mutex_lock(&fe->card->dpcm_mutex);
 		list_del(&dpcm->list_be);
 		list_del(&dpcm->list_fe);
-		spin_unlock_irqrestore(&fe->card->dpcm_lock, flags);
+		mutex_unlock(&fe->card->dpcm_mutex);
 		kfree(dpcm);
 	}
 }
@@ -1441,12 +1438,11 @@ int dpcm_process_paths(struct snd_soc_pcm_runtime *fe,
 void dpcm_clear_pending_state(struct snd_soc_pcm_runtime *fe, int stream)
 {
 	struct snd_soc_dpcm *dpcm;
-	unsigned long flags;
 
-	spin_lock_irqsave(&fe->card->dpcm_lock, flags);
+	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_be(fe, stream, dpcm)
 		dpcm_set_be_update_state(dpcm->be, stream,
 					 SND_SOC_DPCM_UPDATE_NO);
-	spin_unlock_irqrestore(&fe->card->dpcm_lock, flags);
+	mutex_unlock(&fe->card->dpcm_mutex);
 }
 
 void dpcm_be_dai_stop(struct snd_soc_pcm_runtime *fe, int stream,
@@ -2364,7 +2360,6 @@ static int dpcm_run_update_startup(struct snd_soc_pcm_runtime *fe, int stream)
 	struct snd_soc_dpcm *dpcm;
 	enum snd_soc_dpcm_trigger trigger = fe->dai_link->trigger[stream];
 	int ret = 0;
-	unsigned long flags;
 
 	dev_dbg(fe->dev, "ASoC: runtime %s open on FE %s\n",
 			stream ? "capture" : "playback", fe->dai_link->name);
@@ -2433,7 +2428,7 @@ static int dpcm_run_update_startup(struct snd_soc_pcm_runtime *fe, int stream)
 	dpcm_be_dai_shutdown(fe, stream);
 disconnect:
 	/* disconnect any pending BEs */
-	spin_lock_irqsave(&fe->card->dpcm_lock, flags);
+	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_be(fe, stream, dpcm) {
 		struct snd_soc_pcm_runtime *be = dpcm->be;
 
@@ -2445,7 +2440,7 @@ static int dpcm_run_update_startup(struct snd_soc_pcm_runtime *fe, int stream)
 		    be->dpcm[stream].state == SND_SOC_DPCM_STATE_NEW)
 			dpcm->state = SND_SOC_DPCM_LINK_STATE_FREE;
 	}
-	spin_unlock_irqrestore(&fe->card->dpcm_lock, flags);
+	mutex_unlock(&fe->card->dpcm_mutex);
 
 	if (ret < 0)
 		dev_err(fe->dev, "ASoC: %s() failed (%d)\n", __func__, ret);
@@ -2845,10 +2840,9 @@ static int snd_soc_dpcm_check_state(struct snd_soc_pcm_runtime *fe,
 	struct snd_soc_dpcm *dpcm;
 	int state;
 	int ret = 1;
-	unsigned long flags;
 	int i;
 
-	spin_lock_irqsave(&fe->card->dpcm_lock, flags);
+	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_fe(be, stream, dpcm) {
 
 		if (dpcm->fe == fe)
@@ -2862,7 +2856,7 @@ static int snd_soc_dpcm_check_state(struct snd_soc_pcm_runtime *fe,
 			}
 		}
 	}
-	spin_unlock_irqrestore(&fe->card->dpcm_lock, flags);
+	mutex_unlock(&fe->card->dpcm_mutex);
 
 	/* it's safe to do this BE DAI */
 	return ret;
When more than one FE is connected to a BE, e.g. in a mixing use case, the BE can be triggered multiple times when the FEs are opened/started concurrently. This race condition is problematic in the case of SoundWire BE dailinks, and it is not desirable in the general case. The code carefully checks when the BE can be stopped or hw_free'ed, but the trigger code does not use any mutual exclusion.
This can be fixed by using the dpcm_mutex.
As discussed on the alsa-devel mailing list [1], there is an additional issue: the BEs can be disconnected dynamically, leading to potential accesses to freed memory.
This patch suggests a general protection of the for_each_dpcm_be() loops with the dpcm_mutex, to solve both the problem of multiple triggers and that of dynamic BE addition/removal.
[1] https://lore.kernel.org/alsa-devel/002f01d7b4f5$c030f4a0$4092dde0$@samsung.c...
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 sound/soc/soc-pcm.c | 48 ++++++++++++++++++++++++++++++++++++---------
 1 file changed, 39 insertions(+), 9 deletions(-)
diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c
index af12593b033a..6089e0c1bf9f 100644
--- a/sound/soc/soc-pcm.c
+++ b/sound/soc/soc-pcm.c
@@ -320,6 +320,7 @@ int dpcm_dapm_stream_event(struct snd_soc_pcm_runtime *fe, int dir,
 {
 	struct snd_soc_dpcm *dpcm;
 
+	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_be(fe, dir, dpcm) {
 
 		struct snd_soc_pcm_runtime *be = dpcm->be;
@@ -333,6 +334,8 @@ int dpcm_dapm_stream_event(struct snd_soc_pcm_runtime *fe, int dir,
 
 		snd_soc_dapm_stream_event(be, dir, event);
 	}
+	mutex_unlock(&fe->card->dpcm_mutex);
+
 	snd_soc_dapm_stream_event(fe, dir, event);
 
@@ -1142,10 +1145,14 @@ static int dpcm_be_connect(struct snd_soc_pcm_runtime *fe,
 	struct snd_soc_dpcm *dpcm;
 
 	/* only add new dpcms */
+	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_be(fe, stream, dpcm) {
-		if (dpcm->be == be && dpcm->fe == fe)
+		if (dpcm->be == be && dpcm->fe == fe) {
+			mutex_unlock(&fe->card->dpcm_mutex);
 			return 0;
+		}
 	}
+	mutex_unlock(&fe->card->dpcm_mutex);
 
 	dpcm = kzalloc(sizeof(struct snd_soc_dpcm), GFP_KERNEL);
 	if (!dpcm)
@@ -1202,6 +1209,7 @@ void dpcm_be_disconnect(struct snd_soc_pcm_runtime *fe, int stream)
 {
 	struct snd_soc_dpcm *dpcm, *d;
 
+	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_be_safe(fe, stream, dpcm, d) {
 		dev_dbg(fe->dev, "ASoC: BE %s disconnect check for %s\n",
 				stream ? "capture" : "playback",
@@ -1219,12 +1227,11 @@ void dpcm_be_disconnect(struct snd_soc_pcm_runtime *fe, int stream)
 
 		dpcm_remove_debugfs_state(dpcm);
 
-		mutex_lock(&fe->card->dpcm_mutex);
 		list_del(&dpcm->list_be);
 		list_del(&dpcm->list_fe);
-		mutex_unlock(&fe->card->dpcm_mutex);
 		kfree(dpcm);
 	}
+	mutex_unlock(&fe->card->dpcm_mutex);
 }
 
 /* get BE for DAI widget and stream */
@@ -1351,6 +1358,7 @@ static int dpcm_prune_paths(struct snd_soc_pcm_runtime *fe, int stream,
 	int prune = 0;
 
 	/* Destroy any old FE <--> BE connections */
+	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_be(fe, stream, dpcm) {
 		if (dpcm_be_is_active(dpcm, stream, *list_))
 			continue;
@@ -1362,6 +1370,7 @@ static int dpcm_prune_paths(struct snd_soc_pcm_runtime *fe, int stream,
 		dpcm_set_be_update_state(dpcm->be, stream, SND_SOC_DPCM_UPDATE_BE);
 		prune++;
 	}
+	mutex_unlock(&fe->card->dpcm_mutex);
 
 	dev_dbg(fe->dev, "ASoC: found %d old BE paths for pruning\n", prune);
 	return prune;
@@ -1451,13 +1460,16 @@ void dpcm_be_dai_stop(struct snd_soc_pcm_runtime *fe, int stream,
 	struct snd_soc_dpcm *dpcm;
 
 	/* disable any enabled and non active backends */
+	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_be(fe, stream, dpcm) {
 		struct snd_soc_pcm_runtime *be = dpcm->be;
 		struct snd_pcm_substream *be_substream =
 			snd_soc_dpcm_get_substream(be, stream);
 
-		if (dpcm == last)
+		if (dpcm == last) {
+			mutex_unlock(&fe->card->dpcm_mutex);
 			return;
+		}
 
 		/* is this op for this BE ? */
 		if (!snd_soc_dpcm_be_can_update(fe, be, stream))
@@ -1487,6 +1499,7 @@ void dpcm_be_dai_stop(struct snd_soc_pcm_runtime *fe, int stream,
 		be_substream->runtime = NULL;
 		be->dpcm[stream].state = SND_SOC_DPCM_STATE_CLOSE;
 	}
+	mutex_unlock(&fe->card->dpcm_mutex);
 }
 
 int dpcm_be_dai_startup(struct snd_soc_pcm_runtime *fe, int stream)
@@ -1496,6 +1509,7 @@ int dpcm_be_dai_startup(struct snd_soc_pcm_runtime *fe, int stream)
 	int err, count = 0;
 
 	/* only startup BE DAIs that are either sinks or sources to this FE DAI */
+	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_be(fe, stream, dpcm) {
 		struct snd_pcm_substream *be_substream;
 
@@ -1546,11 +1560,13 @@ int dpcm_be_dai_startup(struct snd_soc_pcm_runtime *fe, int stream)
 		be->dpcm[stream].state = SND_SOC_DPCM_STATE_OPEN;
 		count++;
 	}
+	mutex_unlock(&fe->card->dpcm_mutex);
 
 	return count;
 
 unwind:
 	dpcm_be_dai_startup_rollback(fe, stream, dpcm);
+	mutex_unlock(&fe->card->dpcm_mutex);
 
 	dev_err(fe->dev, "ASoC: %s() failed at %s (%d)\n",
 		__func__, be->dai_link->name, err);
@@ -1605,6 +1621,7 @@ static void dpcm_runtime_setup_be_format(struct snd_pcm_substream *substream)
 	 * if FE want to use it (= dpcm_merged_format)
 	 */
 
+	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_be(fe, stream, dpcm) {
 		struct snd_soc_pcm_runtime *be = dpcm->be;
 		struct snd_soc_pcm_stream *codec_stream;
@@ -1623,6 +1640,7 @@ static void dpcm_runtime_setup_be_format(struct snd_pcm_substream *substream)
 			soc_pcm_hw_update_format(hw, codec_stream);
 		}
 	}
+	mutex_unlock(&fe->card->dpcm_mutex);
 }
 
 static void dpcm_runtime_setup_be_chan(struct snd_pcm_substream *substream)
@@ -1640,7 +1658,7 @@ static void dpcm_runtime_setup_be_chan(struct snd_pcm_substream *substream)
 	 * It returns merged BE codec channel;
 	 * if FE want to use it (= dpcm_merged_chan)
 	 */
-
+	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_be(fe, stream, dpcm) {
 		struct snd_soc_pcm_runtime *be = dpcm->be;
 		struct snd_soc_pcm_stream *cpu_stream;
@@ -1671,6 +1689,7 @@ static void dpcm_runtime_setup_be_chan(struct snd_pcm_substream *substream)
 			soc_pcm_hw_update_chan(hw, codec_stream);
 		}
 	}
+	mutex_unlock(&fe->card->dpcm_mutex);
 }
 
 static void dpcm_runtime_setup_be_rate(struct snd_pcm_substream *substream)
@@ -1688,7 +1707,7 @@ static void dpcm_runtime_setup_be_rate(struct snd_pcm_substream *substream)
 	 * It returns merged BE codec channel;
 	 * if FE want to use it (= dpcm_merged_chan)
 	 */
-
+	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_be(fe, stream, dpcm) {
 		struct snd_soc_pcm_runtime *be = dpcm->be;
 		struct snd_soc_pcm_stream *pcm;
@@ -1708,6 +1727,7 @@ static void dpcm_runtime_setup_be_rate(struct snd_pcm_substream *substream)
 			soc_pcm_hw_update_rate(hw, pcm);
 		}
 	}
+	mutex_unlock(&fe->card->dpcm_mutex);
 }
 
 static int dpcm_apply_symmetry(struct snd_pcm_substream *fe_substream,
@@ -1730,6 +1750,7 @@ static int dpcm_apply_symmetry(struct snd_pcm_substream *fe_substream,
 	}
 
 	/* apply symmetry for BE */
+	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_be(fe, stream, dpcm) {
 		struct snd_soc_pcm_runtime *be = dpcm->be;
 		struct snd_pcm_substream *be_substream =
@@ -1755,6 +1776,7 @@ static int dpcm_apply_symmetry(struct snd_pcm_substream *fe_substream,
 		}
 	}
 error:
+	mutex_unlock(&fe->card->dpcm_mutex);
 	if (err < 0)
 		dev_err(fe->dev, "ASoC: %s failed (%d)\n", __func__, err);
 
@@ -1830,6 +1852,7 @@ void dpcm_be_dai_hw_free(struct snd_soc_pcm_runtime *fe, int stream)
 
 	/* only hw_params backends that are either sinks or sources
 	 * to this frontend DAI */
+	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_be(fe, stream, dpcm) {
 
 		struct snd_soc_pcm_runtime *be = dpcm->be;
@@ -1842,7 +1865,7 @@ void dpcm_be_dai_hw_free(struct snd_soc_pcm_runtime *fe, int stream)
 
 		/* only free hw when no longer used - check all FEs */
 		if (!snd_soc_dpcm_can_be_free_stop(fe, be, stream))
-				continue;
+			continue;
 
 		/* do not free hw if this BE is used by other FE */
 		if (be->dpcm[stream].users > 1)
@@ -1863,6 +1886,7 @@ void dpcm_be_dai_hw_free(struct snd_soc_pcm_runtime *fe, int stream)
 
 		be->dpcm[stream].state = SND_SOC_DPCM_STATE_HW_FREE;
 	}
+	mutex_unlock(&fe->card->dpcm_mutex);
 }
 
 static int dpcm_fe_dai_hw_free(struct snd_pcm_substream *substream)
@@ -1896,6 +1920,7 @@ int dpcm_be_dai_hw_params(struct snd_soc_pcm_runtime *fe, int stream)
 	struct snd_soc_dpcm *dpcm;
 	int ret;
 
+	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_be(fe, stream, dpcm) {
 		be = dpcm->be;
 		be_substream = snd_soc_dpcm_get_substream(be, stream);
@@ -1935,6 +1960,7 @@ int dpcm_be_dai_hw_params(struct snd_soc_pcm_runtime *fe, int stream)
 
 		be->dpcm[stream].state = SND_SOC_DPCM_STATE_HW_PARAMS;
 	}
+	mutex_unlock(&fe->card->dpcm_mutex);
 	return 0;
 
 unwind:
@@ -1961,6 +1987,7 @@ int dpcm_be_dai_hw_params(struct snd_soc_pcm_runtime *fe, int stream)
 
 		soc_pcm_hw_free(be_substream);
 	}
+	mutex_unlock(&fe->card->dpcm_mutex);
 
 	return ret;
 }
@@ -2008,6 +2035,7 @@ int dpcm_be_dai_trigger(struct snd_soc_pcm_runtime *fe, int stream,
 	struct snd_soc_dpcm *dpcm;
 	int ret = 0;
 
+	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_be(fe, stream, dpcm) {
 		struct snd_pcm_substream *be_substream;
 
@@ -2097,6 +2125,7 @@ int dpcm_be_dai_trigger(struct snd_soc_pcm_runtime *fe, int stream,
 		}
 	}
end:
+	mutex_unlock(&fe->card->dpcm_mutex);
 	if (ret < 0)
 		dev_err(fe->dev, "ASoC: %s() failed at %s (%d)\n",
			__func__, be->dai_link->name, ret);
@@ -2831,6 +2860,7 @@ struct snd_pcm_substream *
 }
 EXPORT_SYMBOL_GPL(snd_soc_dpcm_get_substream);
 
+/* This must be called with fe->card->dpcm_mutex held */
 static int snd_soc_dpcm_check_state(struct snd_soc_pcm_runtime *fe,
 				    struct snd_soc_pcm_runtime *be,
 				    int stream,
@@ -2842,7 +2872,6 @@ static int snd_soc_dpcm_check_state(struct snd_soc_pcm_runtime *fe,
 	int ret = 1;
 	int i;
 
-	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_fe(be, stream, dpcm) {
 
 		if (dpcm->fe == fe)
@@ -2856,7 +2885,6 @@ static int snd_soc_dpcm_check_state(struct snd_soc_pcm_runtime *fe,
 			}
 		}
 	}
-	mutex_unlock(&fe->card->dpcm_mutex);
 
 	/* it's safe to do this BE DAI */
 	return ret;
@@ -2865,6 +2893,7 @@ static int snd_soc_dpcm_check_state(struct snd_soc_pcm_runtime *fe,
 /*
  * We can only hw_free, stop, pause or suspend a BE DAI if any of it's FE
  * are not running, paused or suspended for the specified stream direction.
+ * This must be called with fe->card->dpcm_mutex held.
 */
 static int snd_soc_dpcm_can_be_free_stop(struct snd_soc_pcm_runtime *fe,
					  struct snd_soc_pcm_runtime *be, int stream)
@@ -2881,6 +2910,7 @@ static int snd_soc_dpcm_can_be_free_stop(struct snd_soc_pcm_runtime *fe,
 /*
  * We can only change hw params a BE DAI if any of it's FE are not prepared,
  * running, paused or suspended for the specified stream direction.
+ * This must be called with fe->card->dpcm_mutex held.
 */
 static int snd_soc_dpcm_can_be_params(struct snd_soc_pcm_runtime *fe,
				       struct snd_soc_pcm_runtime *be, int stream)
On start/pause_release/resume, when more than one FE is connected to the same BE, it's possible that the trigger is sent more than once. This is not desirable; we only want to trigger a BE once, which is straightforward to implement with a refcount.

For stop/pause/suspend, the problem is more complicated: the check implemented in snd_soc_dpcm_can_be_free_stop() may fail due to a conceptual deadlock when we trigger the BE before the FE. In this case, the FE states have not yet changed, so there are corner cases where the TRIGGER_STOP is never sent - the dual of the start case, where multiple triggers might be sent.

This patch suggests an unconditional trigger in all cases, without checking the FE states, using a refcount protected by a mutex.
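As a concrete walk-through of the refcount (two FEs sharing one BE, following the logic of the diff below):

   FE1 START: be_start 0 -> 1, send TRIGGER_START to the BE
   FE2 START: be_start 1 -> 2, skip the trigger
   FE1 STOP:  be_start 2 -> 1, skip the trigger
   FE2 STOP:  be_start 1 -> 0, send TRIGGER_STOP to the BE

The last FE to stop always sends the STOP trigger, regardless of the order in which the FE states are updated.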
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 include/sound/soc-dpcm.h |  2 ++
 sound/soc/soc-pcm.c      | 53 +++++++++++++++++++++++++++++++---------
 2 files changed, 44 insertions(+), 11 deletions(-)
diff --git a/include/sound/soc-dpcm.h b/include/sound/soc-dpcm.h
index 9c00118603e7..6d84e2c60ca8 100644
--- a/include/sound/soc-dpcm.h
+++ b/include/sound/soc-dpcm.h
@@ -101,6 +101,8 @@ struct snd_soc_dpcm_runtime {
 	enum snd_soc_dpcm_state state;
 
 	int trigger_pending; /* trigger cmd + 1 if pending, 0 if not */
+
+	int be_start; /* refcount protected by card's dpcm_mutex */
 };
 
 #define for_each_dpcm_fe(be, stream, _dpcm) \
diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c
index 6089e0c1bf9f..ed898d83fe17 100644
--- a/sound/soc/soc-pcm.c
+++ b/sound/soc/soc-pcm.c
@@ -1556,7 +1556,7 @@ int dpcm_be_dai_startup(struct snd_soc_pcm_runtime *fe, int stream)
 			be->dpcm[stream].state = SND_SOC_DPCM_STATE_CLOSE;
 			goto unwind;
 		}
-
+		be->dpcm[stream].be_start = 0;
 		be->dpcm[stream].state = SND_SOC_DPCM_STATE_OPEN;
 		count++;
 	}
@@ -2051,14 +2051,21 @@ int dpcm_be_dai_trigger(struct snd_soc_pcm_runtime *fe, int stream,
 
 		switch (cmd) {
 		case SNDRV_PCM_TRIGGER_START:
-			if ((be->dpcm[stream].state != SND_SOC_DPCM_STATE_PREPARE) &&
+			if (!be->dpcm[stream].be_start &&
+			    (be->dpcm[stream].state != SND_SOC_DPCM_STATE_PREPARE) &&
 			    (be->dpcm[stream].state != SND_SOC_DPCM_STATE_STOP) &&
 			    (be->dpcm[stream].state != SND_SOC_DPCM_STATE_PAUSED))
 				continue;
 
+			be->dpcm[stream].be_start++;
+			if (be->dpcm[stream].be_start != 1)
+				continue;
+
 			ret = soc_pcm_trigger(be_substream, cmd);
-			if (ret)
+			if (ret) {
+				be->dpcm[stream].be_start--;
 				goto end;
+			}
 
 			be->dpcm[stream].state = SND_SOC_DPCM_STATE_START;
 			break;
@@ -2066,9 +2073,15 @@ int dpcm_be_dai_trigger(struct snd_soc_pcm_runtime *fe, int stream,
 			if ((be->dpcm[stream].state != SND_SOC_DPCM_STATE_SUSPEND))
 				continue;
 
+			be->dpcm[stream].be_start++;
+			if (be->dpcm[stream].be_start != 1)
+				continue;
+
 			ret = soc_pcm_trigger(be_substream, cmd);
-			if (ret)
+			if (ret) {
+				be->dpcm[stream].be_start--;
 				goto end;
+			}
 
 			be->dpcm[stream].state = SND_SOC_DPCM_STATE_START;
 			break;
@@ -2076,9 +2089,15 @@ int dpcm_be_dai_trigger(struct snd_soc_pcm_runtime *fe, int stream,
 			if ((be->dpcm[stream].state != SND_SOC_DPCM_STATE_PAUSED))
 				continue;
 
+			be->dpcm[stream].be_start++;
+			if (be->dpcm[stream].be_start != 1)
+				continue;
+
 			ret = soc_pcm_trigger(be_substream, cmd);
-			if (ret)
+			if (ret) {
+				be->dpcm[stream].be_start--;
 				goto end;
+			}
 
 			be->dpcm[stream].state = SND_SOC_DPCM_STATE_START;
 			break;
@@ -2087,12 +2106,18 @@ int dpcm_be_dai_trigger(struct snd_soc_pcm_runtime *fe, int stream,
 			    (be->dpcm[stream].state != SND_SOC_DPCM_STATE_PAUSED))
 				continue;
 
-			if (!snd_soc_dpcm_can_be_free_stop(fe, be, stream))
+			if (be->dpcm[stream].state == SND_SOC_DPCM_STATE_START)
+				be->dpcm[stream].be_start--;
+
+			if (be->dpcm[stream].be_start != 0)
 				continue;
 
 			ret = soc_pcm_trigger(be_substream, cmd);
-			if (ret)
+			if (ret) {
+				if (be->dpcm[stream].state == SND_SOC_DPCM_STATE_START)
+					be->dpcm[stream].be_start++;
 				goto end;
+			}
 
 			be->dpcm[stream].state = SND_SOC_DPCM_STATE_STOP;
 			break;
@@ -2100,12 +2125,15 @@ int dpcm_be_dai_trigger(struct snd_soc_pcm_runtime *fe, int stream,
 			if (be->dpcm[stream].state != SND_SOC_DPCM_STATE_START)
 				continue;
 
-			if (!snd_soc_dpcm_can_be_free_stop(fe, be, stream))
+			be->dpcm[stream].be_start--;
+			if (be->dpcm[stream].be_start != 0)
 				continue;
 
 			ret = soc_pcm_trigger(be_substream, cmd);
-			if (ret)
+			if (ret) {
+				be->dpcm[stream].be_start++;
 				goto end;
+			}
 
 			be->dpcm[stream].state = SND_SOC_DPCM_STATE_SUSPEND;
 			break;
@@ -2113,12 +2141,15 @@ int dpcm_be_dai_trigger(struct snd_soc_pcm_runtime *fe, int stream,
 			if (be->dpcm[stream].state != SND_SOC_DPCM_STATE_START)
 				continue;
 
-			if (!snd_soc_dpcm_can_be_free_stop(fe, be, stream))
+			be->dpcm[stream].be_start--;
+			if (be->dpcm[stream].be_start != 0)
 				continue;
 
 			ret = soc_pcm_trigger(be_substream, cmd);
-			if (ret)
+			if (ret) {
+				be->dpcm[stream].be_start++;
 				goto end;
+			}
 
 			be->dpcm[stream].state = SND_SOC_DPCM_STATE_PAUSED;
 			break;
On 10/5/2021 4:24 AM, Pierre-Louis Bossart wrote:
Thank you Pierre-Louis for adding me to this thread.
I did a quick test of your patches on my Tegra board and I am seeing issues with multiple streams. For instance, I ran a 2x1 mixer configuration and hit the following:
[  277.661886] BUG: scheduling while atomic: aplay/1306/0x00000100
[  287.713193] BUG: spinlock cpu recursion on CPU#0, aplay/1307
[  287.719138]  lock: 0xffff00008cc820f0, .magic: dead4ead, .owner: aplay/1306, .owner_cpu: 0
[  287.727319] CPU: 0 PID: 1307 Comm: aplay Tainted: G        W 5.15.0-rc3-next-20210927-00026-gffdabce987b1 #12
[  287.737783] Hardware name: NVIDIA Jetson AGX Xavier Developer Kit (DT)
[  287.744228] Call trace:
[  287.746656]  dump_backtrace+0x0/0x1c0
[  287.750300]  show_stack+0x18/0x28
[  287.753604]  dump_stack_lvl+0x7c/0xa8
[  287.757236]  dump_stack+0x18/0x34
[  287.760536]  spin_dump+0x70/0x90
[  287.763732]  do_raw_spin_lock+0xd8/0x120
[  287.767615]  _raw_spin_lock_irq+0x60/0x80
[  287.771581]  snd_pcm_stream_lock_irq+0x20/0x48 [snd_pcm]
[  287.776853]  snd_pcm_drain+0x1ec/0x348 [snd_pcm]
[  287.781421]  snd_pcm_common_ioctl+0xacc/0x1938 [snd_pcm]
[  287.786685]  snd_pcm_ioctl+0x2c/0x48 [snd_pcm]
[  287.791101]  __arm64_sys_ioctl+0xb0/0xf0
[  287.794982]  invoke_syscall+0x44/0x108
[  287.798683]  el0_svc_common.constprop.3+0x74/0x100
[  287.803416]  do_el0_svc+0x24/0x90
[  287.806687]  el0_svc+0x20/0x60
[  287.809705]  el0t_64_sync_handler+0x94/0xb8
[  287.813839]  el0t_64_sync+0x180/0x184
And in some cases just the following:
[ 1074.212276] BUG: scheduling while atomic: aplay/12327/0x00000100
[ 1095.227509] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 1095.233443] rcu:     0-...0: (1 GPs behind) idle=4af/1/0x4000000000000004 softirq=19902/19902 fqs=2626
[ 1095.242528] rcu:     2-...0: (1 GPs behind) idle=9d5/1/0x4000000000000000 softirq=22707/22707 fqs=262
Thanks Sameer for the overnight tests, much appreciated.
My patches don't change anything related to a spinlock or pcm stream management, so not sure what could cause this rather spectacular failure. That hints at a fundamental configuration difference, possibly caused by your component chaining?
Since in your case you have a 1:1 mapping between FE and BE, would you mind testing by backtracking, one patch at a time to see which one of the three last patches could cause a problem on your board?
Thanks again for your time! -Pierre
Hi Pierre,
On 10/5/2021 6:47 PM, Pierre-Louis Bossart wrote:
My patches don't change anything related to a spinlock or pcm stream management, so not sure what could cause this rather spectacular failure. That hints at a fundamental configuration difference, possibly caused by your component chaining?
Since in your case you have a 1:1 mapping between FE and BE, would you mind testing by backtracking, one patch at a time to see which one of the three last patches could cause a problem on your board?
I tested this further. It appears that things work fine up to your 'patch 3/5'. After I apply 'patch 4/5', the "BUG: scheduling while atomic: aplay" print starts showing up and I see a hang. This failure was seen with the 2x1 mixer test itself.
'Patch 4/5' introduces mutex_lock/unlock() in dpcm_be_dai_trigger(). This seems to be the problem, since trigger() runs in atomic context depending on the PCM 'nonatomic' flag. I am not sure if your design sets the 'nonatomic' flag by default and that is why the issue is not seen at your end?
With the below (just for testing purposes), tests ran well. I was able to run the 2x1, 3x1 ... 10x1 mixer tests.
diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c
index e5df898..2ce30d1 100644
--- a/sound/soc/soc-pcm.c
+++ b/sound/soc/soc-pcm.c
@@ -2045,7 +2045,7 @@ int dpcm_be_dai_trigger(struct snd_soc_pcm_runtime *fe, int stream,
 	struct snd_soc_dpcm *dpcm;
 	int ret = 0;
 
-	mutex_lock(&fe->card->dpcm_mutex);
+	//mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_be(fe, stream, dpcm) {
 		struct snd_pcm_substream *be_substream;
 
@@ -2166,7 +2166,7 @@ int dpcm_be_dai_trigger(struct snd_soc_pcm_runtime *fe, int stream,
 		}
 	}
 end:
-	mutex_unlock(&fe->card->dpcm_mutex);
+	//mutex_unlock(&fe->card->dpcm_mutex);
 	if (ret < 0)
 		dev_err(fe->dev, "ASoC: %s() failed at %s (%d)\n",
			__func__, be->dai_link->name, ret);
In fact I picked up all of your patches plus the above test patch, and it all worked fine.
Thanks, Sameer.
This is useful information, thanks a lot Sameer for your time.
Unfortunately, removing the lines below will not work; that's precisely what's needed to prevent multiple triggers from being sent to the same shared BE.
I don't have a clear idea why we see different results, and now I have even less understanding of the ALSA/ASoC/DPCM locking model. We use card->mutex, card->pcm_mutex, card->dpcm_lock, substream->self_group.mutex or lock, client_mutex, dapm_mutex....
I don't think any of the DPCM code is ever called from interrupt context, so the errors you reported look more like a card-configuration issue; specifically, the interaction with 'aplay'/userspace happens on FEs, while BEs are mostly hidden from userspace.
One possible difference is that all our DPCM solutions are based on non-atomic FEs (since they all involve an IPC with a DSP), so we would always use the top branch of these tests:
	if (nonatomic) \
		mutex_ ## mutex_action(&group->mutex); \
	else \
		spin_ ## action(&group->lock); \
I don't see this nonatomic flag set anywhere in the sound/soc/tegra code, so it's possible that in your case the spin_lock/spin_lock_irq is taken before triggering the FE, and using a mutex before the BE trigger throws errors. I think Takashi mentioned that the BEs inherit the properties of the FE to some extent.

Looking at the code, I see that only the Intel legacy, SOF and Qualcomm drivers set nonatomic=1, so maybe we can assume that the FEs for a card are either all atomic or all non-atomic; we could then use either a spinlock or a mutex. But if the FEs can be different, then I am not sure what locking model might work: we can't have a BE protected by a spinlock or a mutex depending on the property of the FE. We need one method only to guarantee mutual exclusion.
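For reference, this is roughly what setting the flag looks like on an FE dai_link (a sketch: the link itself is hypothetical, but 'dynamic', 'nonatomic' and 'trigger' are real fields of struct snd_soc_dai_link):

#include <sound/soc.h>
#include <sound/soc-dpcm.h>

/*
 * Hypothetical non-atomic DPCM front end. With .nonatomic = 1, the PCM
 * core takes the stream group mutex instead of the spinlock, so the
 * trigger path running through dpcm_be_dai_trigger() is allowed to sleep.
 */
static struct snd_soc_dai_link example_fe_link = {
	.name		= "Example FE",		/* hypothetical */
	.stream_name	= "Example FE stream",	/* hypothetical */
	.dynamic	= 1,			/* DPCM front end */
	.nonatomic	= 1,			/* trigger may sleep */
	.trigger	= { SND_SOC_DPCM_TRIGGER_POST,
			    SND_SOC_DPCM_TRIGGER_POST },
};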
Takashi, Mark, do you think that an all/none assumption on FE nonatomic properties would make sense?
On Wed, 06 Oct 2021 21:47:33 +0200, Pierre-Louis Bossart wrote:
Takashi, Mark, do you think that an all/none assumption on FE nonatomic properties would make sense?
As long as BEs are updated from the FE's PCM callback, a BE has to follow the atomicity of its FE, so we can't assume all-or-none globally.
What are the expected lock granularity and protection context? Do we need a card-global lock/mutex, or is it specific to each BE substream?
thanks,
Takashi
As long as BEs are updated from the FE's PCM callback, a BE has to follow the atomicity of its FE, so we can't assume all-or-none globally.
A BE may have more than one FE. That's precisely the point of mixers/demux, and if the BE has FEs with different 'atomicity' then I don't see how locking can work: the BE operations are run in the context of each of its FEs, typically using the following loop:
	for_each_dpcm_be(fe, stream, dpcm) {
		do_something();
	}
Applications will view multiple FEs as completely independent. They may be opened/prepared/started/stopped/paused as needed. When the BE is shared, then there is a need for consistency, such as starting the BE when the first FE becomes active and stopping it when the last FE stops.
What are the expected lock granularity and protection context? Do we need a card-global lock/mutex, or is it specific to each BE substream?
We already have a dpcm_lock at the card level, which protects the addition of BEs and BE states. That spin_lock is fine for most cases.
The only real problem is the trigger, which is currently completely unprotected: we have to serialize the BE triggers, otherwise you could STOP before you START due to scheduling, or other problems that I saw in my SoundWire tests with two START triggers, or the STOP never sent.
But how to do this serialization is unclear...
A lateral-thinking approach would be to decouple the BEs entirely and have the FEs 'signal' their changes of state. A BE 'thread', running in the BE context, would then serialize the requests and perform all the BE operations, and the same approach could be chained. I am afraid that would be a complete rewrite of DPCM, but maybe we have to do so anyway if we can't support the basic case of a mixer with 2 streams :-)
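Purely to illustrate the shape of that idea (a sketch with hypothetical names; nothing like this exists in soc-pcm.c today, and the context is assumed to be initialized with spin_lock_init(), INIT_LIST_HEAD() and INIT_WORK() at BE creation time):

#include <linux/list.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/workqueue.h>

struct be_trigger_req {
	struct list_head node;
	int cmd;			/* SNDRV_PCM_TRIGGER_* */
};

struct be_trigger_ctx {
	spinlock_t lock;		/* protects req_list */
	struct list_head req_list;
	struct work_struct work;
};

/* runs in process context; drains the queue so the BE handles one
 * trigger at a time, in the order the FEs posted them */
static void be_trigger_worker(struct work_struct *work)
{
	struct be_trigger_ctx *ctx =
		container_of(work, struct be_trigger_ctx, work);
	struct be_trigger_req *req;

	for (;;) {
		spin_lock_irq(&ctx->lock);
		req = list_first_entry_or_null(&ctx->req_list,
					       struct be_trigger_req, node);
		if (req)
			list_del(&req->node);
		spin_unlock_irq(&ctx->lock);

		if (!req)
			return;

		/* apply the requested state change to the BE here */
		kfree(req);
	}
}

/* called from the FE trigger path, atomic or not */
static int be_trigger_post(struct be_trigger_ctx *ctx, int cmd)
{
	struct be_trigger_req *req = kzalloc(sizeof(*req), GFP_ATOMIC);
	unsigned long flags;

	if (!req)
		return -ENOMEM;
	req->cmd = cmd;

	spin_lock_irqsave(&ctx->lock, flags);
	list_add_tail(&req->node, &ctx->req_list);
	spin_unlock_irqrestore(&ctx->lock, flags);

	schedule_work(&ctx->work);
	return 0;
}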
On Thu, 07 Oct 2021 15:31:49 +0200, Pierre-Louis Bossart wrote:
A BE may have more than one FE. That's precisely the point of mixers/demux, and if the BE has FEs with different 'atomicity' then I don't see how locking can work: the BE operations are run in the context of each of its FEs, typically using the following loop:
	for_each_dpcm_be(fe, stream, dpcm) {
		do_something();
	}
Do we really have cases where FEs have different atomicity? I don't think that's a valid configuration, and we should catch it via WARN_ON() or such.
We already have a dpcm_lock at the card level, which protects the addition of BEs and BE states. That spin_lock is fine for most cases.
The only real problem is the trigger, which is currently completely unprotected: we have to serialize the BE triggers, otherwise you could STOP before you START due to scheduling, or other problems that I saw in my SoundWire tests with two START triggers, or the STOP never sent.
So it's about calling triggers to the same BE stream concurrently? If that's the case, can't we simply protect the trigger handling of each BE like below?
A lateral-thinking approach would be to decouple the BEs entirely and have the FEs 'signal' their changes of state. A BE 'thread', running in the BE context, would then serialize the requests and perform all the BE operations, and the same approach could be chained. I am afraid that would be a complete rewrite of DPCM, but maybe we have to do so anyway if we can't support the basic case of a mixer with 2 streams :-)
Well, let's see whether we can get some improvements by a simple change at first.
Takashi
--- a/sound/soc/soc-pcm.c
+++ b/sound/soc/soc-pcm.c
@@ -1998,6 +1998,7 @@ int dpcm_be_dai_trigger(struct snd_soc_pcm_runtime *fe, int stream,
 {
 	struct snd_soc_pcm_runtime *be;
 	struct snd_soc_dpcm *dpcm;
+	unsigned long flags;
 	int ret = 0;
 
 	for_each_dpcm_be(fe, stream, dpcm) {
@@ -2006,9 +2007,11 @@ int dpcm_be_dai_trigger(struct snd_soc_pcm_runtime *fe, int stream,
 		be = dpcm->be;
 		be_substream = snd_soc_dpcm_get_substream(be, stream);
 
+		snd_pcm_stream_lock_irqsave(be_substream, flags);
+
 		/* is this op for this BE ? */
 		if (!snd_soc_dpcm_be_can_update(fe, be, stream))
-			continue;
+			goto unlock;
 
 		dev_dbg(be->dev, "ASoC: trigger BE %s cmd %d\n",
			be->dai_link->name, cmd);
@@ -2018,77 +2021,81 @@ int dpcm_be_dai_trigger(struct snd_soc_pcm_runtime *fe, int stream,
 			if ((be->dpcm[stream].state != SND_SOC_DPCM_STATE_PREPARE) &&
 			    (be->dpcm[stream].state != SND_SOC_DPCM_STATE_STOP) &&
 			    (be->dpcm[stream].state != SND_SOC_DPCM_STATE_PAUSED))
-				continue;
+				goto unlock;
 
 			ret = soc_pcm_trigger(be_substream, cmd);
 			if (ret)
-				goto end;
+				goto unlock;
 
 			be->dpcm[stream].state = SND_SOC_DPCM_STATE_START;
 			break;
 		case SNDRV_PCM_TRIGGER_RESUME:
 			if ((be->dpcm[stream].state != SND_SOC_DPCM_STATE_SUSPEND))
-				continue;
+				goto unlock;
 
 			ret = soc_pcm_trigger(be_substream, cmd);
 			if (ret)
-				goto end;
+				goto unlock;
 
 			be->dpcm[stream].state = SND_SOC_DPCM_STATE_START;
 			break;
 		case SNDRV_PCM_TRIGGER_PAUSE_RELEASE:
 			if ((be->dpcm[stream].state != SND_SOC_DPCM_STATE_PAUSED))
-				continue;
+				goto unlock;
 
 			ret = soc_pcm_trigger(be_substream, cmd);
 			if (ret)
-				goto end;
+				goto unlock;
 
 			be->dpcm[stream].state = SND_SOC_DPCM_STATE_START;
 			break;
 		case SNDRV_PCM_TRIGGER_STOP:
 			if ((be->dpcm[stream].state != SND_SOC_DPCM_STATE_START) &&
 			    (be->dpcm[stream].state != SND_SOC_DPCM_STATE_PAUSED))
-				continue;
+				goto unlock;
 
 			if (!snd_soc_dpcm_can_be_free_stop(fe, be, stream))
-				continue;
+				goto unlock;
 
 			ret = soc_pcm_trigger(be_substream, cmd);
 			if (ret)
-				goto end;
+				goto unlock;
 
 			be->dpcm[stream].state = SND_SOC_DPCM_STATE_STOP;
 			break;
 		case SNDRV_PCM_TRIGGER_SUSPEND:
 			if (be->dpcm[stream].state != SND_SOC_DPCM_STATE_START)
-				continue;
+				goto unlock;
 
 			if (!snd_soc_dpcm_can_be_free_stop(fe, be, stream))
-				continue;
+				goto unlock;
 
 			ret = soc_pcm_trigger(be_substream, cmd);
 			if (ret)
-				goto end;
+				goto unlock;
 
 			be->dpcm[stream].state = SND_SOC_DPCM_STATE_SUSPEND;
 			break;
 		case SNDRV_PCM_TRIGGER_PAUSE_PUSH:
 			if (be->dpcm[stream].state != SND_SOC_DPCM_STATE_START)
-				continue;
+				goto unlock;
 
 			if (!snd_soc_dpcm_can_be_free_stop(fe, be, stream))
-				continue;
+				goto unlock;
 
 			ret = soc_pcm_trigger(be_substream, cmd);
 			if (ret)
-				goto end;
+				goto unlock;
 
 			be->dpcm[stream].state = SND_SOC_DPCM_STATE_PAUSED;
 			break;
 		}
+ unlock:
+		snd_pcm_stream_unlock_irqrestore(be_substream, flags);
+		if (ret < 0)
+			break;
 	}
-end:
+
 	if (ret < 0)
 		dev_err(fe->dev, "ASoC: %s() failed at %s (%d)\n",
			__func__, be->dai_link->name, ret);
Do we really have cases where FEs have different atomicity? I don't think that's a valid configuration, and we should catch it via WARN_ON() or such.
I don't think we have this configuration today, that's why I suggested making the assumption it's an unsupported configuration.
That would allow us to use the relevant locking mechanism, as done for PCM streams.
So it's about calling triggers to the same BE stream concurrently? If that's the case, can't we simply protect the trigger handling of each BE like below?
Using snd_pcm_stream_lock_irqsave(be_substream, flags) will indeed prevent multiple triggers, but the state management is handled by dpcm_lock, so I think we have to use the dpcm_lock/mutex in all BE transitions.
	if ((be->dpcm[stream].state != SND_SOC_DPCM_STATE_PREPARE) &&
	    (be->dpcm[stream].state != SND_SOC_DPCM_STATE_STOP) &&
	    (be->dpcm[stream].state != SND_SOC_DPCM_STATE_PAUSED))
The position of the lock also matters: we may have to take the lock before walking through the list, since that list can be modified. That's what Gyeongtaek Lee reported with a separate patch; I was hoping that we could fix all BE state handling in a consistent manner.
On Thu, 07 Oct 2021 17:24:45 +0200, Pierre-Louis Bossart wrote:
Using snd_pcm_stream_lock_irqsave(be_substream, flags); will prevent multiple triggers indeed, but the state management is handled by dpcm_lock, so I think we have to use dpcm_lock/mutex in all BE transitions.
if ((be->dpcm[stream].state != SND_SOC_DPCM_STATE_PREPARE) && (be->dpcm[stream].state != SND_SOC_DPCM_STATE_STOP) && (be->dpcm[stream].state != SND_SOC_DPCM_STATE_PAUSED))
if ((be->dpcm[stream].state != SND_SOC_DPCM_STATE_PREPARE) && (be->dpcm[stream].state != SND_SOC_DPCM_STATE_STOP) && (be->dpcm[stream].state != SND_SOC_DPCM_STATE_PAUSED))
The stream lock can be put around those appropriate places (the state checks above), I suppose?
The list belongs to the FE, so possibly we can take the FE's stream lock while manipulating the list, to protect against the racy trigger access.
But beware that the extra stream lock would be needed only if nonatomic is false. In nonatomic PCM mode the stream lock is a mutex and it's used for all PCM ops. OTOH, in normal (atomic) PCM mode, the spinlock is used for trigger and a few others while the mutex is used for prepare and co, hence an extra stream lock is needed there.
Takashi
I double-checked the code a bit more, and all functions using be->dpcm[stream].state and be->dpcm[stream].users are protected by the card->mutex.
The exceptions are dpcm_be_dai_trigger() and dpcm_show_state(), so we probably don't need to worry too much about these fields.
I am more nervous about what dpcm_lock was supposed to protect. It was added in "ASoC: dpcm: prevent snd_soc_dpcm use after free" to solve, according to the commit log, a race condition between
    void dpcm_be_disconnect(...)
    {
            ...
            list_del(&dpcm->list_be);
            list_del(&dpcm->list_fe);
            kfree(dpcm);
            ...

and the for_each_dpcm_fe() / for_each_dpcm_be*() loops.
That would suggest that every one of those loops should be protected, but that's not the case at all: in some cases the spinlock is taken *inside* the loops.
I think this is what explains the problem reported by Gyeongtaek Lee in
https://lore.kernel.org/alsa-devel/002f01d7b4f5$c030f4a0$4092dde0$@samsung.c...
the for_each_dpcm_be() loop in dpcm_be_dai_trigger() is NOT protected.
But if we add a spinlock in there, the atomicity remains a problem.
I think the only solution is to follow the example of the PCM case, where the type of lock depends on the FE types, with the assumption that there are no mixed atomic/non-atomic FE configurations.
On Thu, 07 Oct 2021 20:13:22 +0200, Pierre-Louis Bossart wrote:
Yes, and I guess we can simply replace all those card->dpcm_lock uses with the FE's stream lock. It'll solve the atomicity problem too, and the FE stream lock can be applied gracefully outside the loop in dpcm_be_disconnect().
And this should solve the race with dpcm_be_dai_trigger() as well, because it's called from dpcm_fe_dai_trigger(), which is already called with the FE's stream lock held by the PCM core. A PoC is something like below. (I replaced the superfluous *_irqsave with *_irq there.)
The scenario above (using the FE stream lock) has one missing piece, though: the compress API seems to use DPCM, and this might not work there. It needs more verification.
BTW, looking through the current code, I wonder how the snd_pcm_stream_lock_irq() call in dpcm_set_fe_update_state() doesn't deadlock when nonatomic=true. The function may be called from dpcm_fe_dai_prepare(), which is the PCM prepare callback for the FE, and a PCM prepare callback is always called inside the stream mutex, which is *the* stream lock in the nonatomic case. Maybe I'm overlooking something obvious...
thanks,
Takashi
---
diff --git a/include/sound/soc.h b/include/sound/soc.h
index 8e6dd8a257c5..5872a8864f3b 100644
--- a/include/sound/soc.h
+++ b/include/sound/soc.h
@@ -893,8 +893,6 @@ struct snd_soc_card {
 	struct mutex pcm_mutex;
 	enum snd_soc_pcm_subclass pcm_subclass;
 
-	spinlock_t dpcm_lock;
-
 	int (*probe)(struct snd_soc_card *card);
 	int (*late_probe)(struct snd_soc_card *card);
 	int (*remove)(struct snd_soc_card *card);
diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c
index c830e96afba2..51ef9917ca98 100644
--- a/sound/soc/soc-core.c
+++ b/sound/soc/soc-core.c
@@ -2339,7 +2339,6 @@ int snd_soc_register_card(struct snd_soc_card *card)
 	mutex_init(&card->mutex);
 	mutex_init(&card->dapm_mutex);
 	mutex_init(&card->pcm_mutex);
-	spin_lock_init(&card->dpcm_lock);
 
 	return snd_soc_bind_card(card);
 }
diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c
index 48f71bb81a2f..c171e5f431b9 100644
--- a/sound/soc/soc-pcm.c
+++ b/sound/soc/soc-pcm.c
@@ -38,6 +38,15 @@ static inline const char *soc_codec_dai_name(struct snd_soc_pcm_runtime *rtd)
 	return (rtd)->num_codecs == 1 ? asoc_rtd_to_codec(rtd, 0)->name : "multicodec";
 }
 
+#define dpcm_fe_stream_lock_irq(fe, stream) \
+	snd_pcm_stream_lock_irq(snd_soc_dpcm_get_substream(fe, stream))
+#define dpcm_fe_stream_unlock_irq(fe, stream) \
+	snd_pcm_stream_unlock_irq(snd_soc_dpcm_get_substream(fe, stream))
+#define dpcm_fe_stream_lock_irqsave(fe, stream, flags) \
+	snd_pcm_stream_lock_irqsave(snd_soc_dpcm_get_substream(fe, stream), flags)
+#define dpcm_fe_stream_unlock_irqrestore(fe, stream, flags) \
+	snd_pcm_stream_unlock_irqrestore(snd_soc_dpcm_get_substream(fe, stream), flags)
+
 #ifdef CONFIG_DEBUG_FS
 static const char *dpcm_state_string(enum snd_soc_dpcm_state state)
 {
@@ -73,7 +82,6 @@ static ssize_t dpcm_show_state(struct snd_soc_pcm_runtime *fe,
 	struct snd_pcm_hw_params *params = &fe->dpcm[stream].hw_params;
 	struct snd_soc_dpcm *dpcm;
 	ssize_t offset = 0;
-	unsigned long flags;
 
 	/* FE state */
 	offset += scnprintf(buf + offset, size - offset,
@@ -101,7 +109,7 @@ static ssize_t dpcm_show_state(struct snd_soc_pcm_runtime *fe,
 		goto out;
 	}
 
-	spin_lock_irqsave(&fe->card->dpcm_lock, flags);
+	dpcm_fe_stream_lock_irq(fe, stream);
 	for_each_dpcm_be(fe, stream, dpcm) {
 		struct snd_soc_pcm_runtime *be = dpcm->be;
 		params = &dpcm->hw_params;
@@ -122,7 +130,7 @@ static ssize_t dpcm_show_state(struct snd_soc_pcm_runtime *fe,
 				params_channels(params),
 				params_rate(params));
 	}
-	spin_unlock_irqrestore(&fe->card->dpcm_lock, flags);
+	dpcm_fe_stream_unlock_irq(fe, stream);
 out:
 	return offset;
 }
@@ -1129,7 +1137,6 @@ static int dpcm_be_connect(struct snd_soc_pcm_runtime *fe,
 	struct snd_soc_pcm_runtime *be, int stream)
 {
 	struct snd_soc_dpcm *dpcm;
-	unsigned long flags;
 
 	/* only add new dpcms */
 	for_each_dpcm_be(fe, stream, dpcm) {
@@ -1141,14 +1148,14 @@ static int dpcm_be_connect(struct snd_soc_pcm_runtime *fe,
 	if (!dpcm)
 		return -ENOMEM;
 
+	dpcm_fe_stream_lock_irq(fe, stream);
 	dpcm->be = be;
 	dpcm->fe = fe;
 	be->dpcm[stream].runtime = fe->dpcm[stream].runtime;
 	dpcm->state = SND_SOC_DPCM_LINK_STATE_NEW;
-	spin_lock_irqsave(&fe->card->dpcm_lock, flags);
 	list_add(&dpcm->list_be, &fe->dpcm[stream].be_clients);
 	list_add(&dpcm->list_fe, &be->dpcm[stream].fe_clients);
-	spin_unlock_irqrestore(&fe->card->dpcm_lock, flags);
+	dpcm_fe_stream_unlock_irq(fe, stream);
 
 	dev_dbg(fe->dev, "connected new DPCM %s path %s %s %s\n",
 		stream ? "capture" : "playback",  fe->dai_link->name,
@@ -1191,8 +1198,8 @@ static void dpcm_be_reparent(struct snd_soc_pcm_runtime *fe,
 void dpcm_be_disconnect(struct snd_soc_pcm_runtime *fe, int stream)
 {
 	struct snd_soc_dpcm *dpcm, *d;
-	unsigned long flags;
 
+	dpcm_fe_stream_lock_irq(fe, stream);
 	for_each_dpcm_be_safe(fe, stream, dpcm, d) {
 		dev_dbg(fe->dev, "ASoC: BE %s disconnect check for %s\n",
 			stream ? "capture" : "playback",
@@ -1210,12 +1217,11 @@ void dpcm_be_disconnect(struct snd_soc_pcm_runtime *fe, int stream)
 
 		dpcm_remove_debugfs_state(dpcm);
 
-		spin_lock_irqsave(&fe->card->dpcm_lock, flags);
 		list_del(&dpcm->list_be);
 		list_del(&dpcm->list_fe);
-		spin_unlock_irqrestore(&fe->card->dpcm_lock, flags);
 		kfree(dpcm);
 	}
+	dpcm_fe_stream_unlock_irq(fe, stream);
 }
 
 /* get BE for DAI widget and stream */
@@ -1429,12 +1435,11 @@ int dpcm_process_paths(struct snd_soc_pcm_runtime *fe,
 void dpcm_clear_pending_state(struct snd_soc_pcm_runtime *fe, int stream)
 {
 	struct snd_soc_dpcm *dpcm;
-	unsigned long flags;
 
-	spin_lock_irqsave(&fe->card->dpcm_lock, flags);
+	dpcm_fe_stream_lock_irq(fe, stream);
 	for_each_dpcm_be(fe, stream, dpcm)
 		dpcm_set_be_update_state(dpcm->be, stream,
 					 SND_SOC_DPCM_UPDATE_NO);
-	spin_unlock_irqrestore(&fe->card->dpcm_lock, flags);
+	dpcm_fe_stream_unlock_irq(fe, stream);
 }
 
 void dpcm_be_dai_stop(struct snd_soc_pcm_runtime *fe, int stream,
@@ -2352,7 +2357,6 @@ static int dpcm_run_update_startup(struct snd_soc_pcm_runtime *fe, int stream)
 	struct snd_soc_dpcm *dpcm;
 	enum snd_soc_dpcm_trigger trigger = fe->dai_link->trigger[stream];
 	int ret = 0;
-	unsigned long flags;
 
 	dev_dbg(fe->dev, "ASoC: runtime %s open on FE %s\n",
 		stream ? "capture" : "playback", fe->dai_link->name);
@@ -2421,7 +2425,7 @@ static int dpcm_run_update_startup(struct snd_soc_pcm_runtime *fe, int stream)
 	dpcm_be_dai_shutdown(fe, stream);
 disconnect:
 	/* disconnect any pending BEs */
-	spin_lock_irqsave(&fe->card->dpcm_lock, flags);
+	dpcm_fe_stream_lock_irq(fe, stream);
 	for_each_dpcm_be(fe, stream, dpcm) {
 		struct snd_soc_pcm_runtime *be = dpcm->be;
@@ -2433,7 +2437,7 @@ static int dpcm_run_update_startup(struct snd_soc_pcm_runtime *fe, int stream)
 		    be->dpcm[stream].state == SND_SOC_DPCM_STATE_NEW)
 				dpcm->state = SND_SOC_DPCM_LINK_STATE_FREE;
 	}
-	spin_unlock_irqrestore(&fe->card->dpcm_lock, flags);
+	dpcm_fe_stream_unlock_irq(fe, stream);
 
 	if (ret < 0)
 		dev_err(fe->dev, "ASoC: %s() failed (%d)\n", __func__, ret);
@@ -2843,10 +2847,9 @@ static int snd_soc_dpcm_check_state(struct snd_soc_pcm_runtime *fe,
 	struct snd_soc_dpcm *dpcm;
 	int state;
 	int ret = 1;
-	unsigned long flags;
 	int i;
 
-	spin_lock_irqsave(&fe->card->dpcm_lock, flags);
+	dpcm_fe_stream_lock_irq(fe, stream);
 	for_each_dpcm_fe(be, stream, dpcm) {
 
 		if (dpcm->fe == fe)
@@ -2860,7 +2863,7 @@ static int snd_soc_dpcm_check_state(struct snd_soc_pcm_runtime *fe,
 			}
 		}
 	}
-	spin_unlock_irqrestore(&fe->card->dpcm_lock, flags);
+	dpcm_fe_stream_unlock_irq(fe, stream);
 
 	/* it's safe to do this BE DAI */
 	return ret;
No, I don't think so. The code starts from an FE and loops over all the BEs connected to that FE, but we want to serialize at the BE level! We really need a dpcm lock at the card level, not at the FE/stream level.
I am testing a quick set of changes at https://github.com/thesofproject/linux/pull/3201, where I also replaced irqsave with irq, btw.
On Thu, 07 Oct 2021 23:27:50 +0200, Pierre-Louis Bossart wrote:
The FE lock prevents the race between dpcm_be_dai_trigger() and dpcm_be_disconnect(), i.e. the problem Gyeongtaek showed.
The race among concurrent dpcm_be_dai_trigger() calls itself can be addressed by BE stream locks as suggested earlier.
Takashi
I am not following your line of thought, Takashi.
dpcm_be_disconnect() already uses a spinlock around

    list_del(&dpcm->list_be);
    list_del(&dpcm->list_fe);

and in some other places; are you suggesting we change those to the FE lock?
Otherwise, I understood your proposal as using three locks (the existing spinlock, the FE lock and the BE lock) to deal with DPCM. If the existing spinlock and the FE lock are combined, we'd still have two locks.
I was suggesting we use only one ;-)
On Fri, 08 Oct 2021 16:41:59 +0200, Pierre-Louis Bossart wrote:
Basically yes.
Stream locks are more fine-grained, hence more efficient :) The card-level spinlock is superfluous and it can go away.
Basically we need to protect two things:
- The BE links
- The concurrent accesses to BEs
The former belongs to each FE that holds the links, and the FE stream lock would cover it. The latter is rather a per-BE business.
An oft-seen risk of multiple locks is deadlocking, but in this case, as long as we keep the lock order FE->BE, nothing wrong can happen.
Takashi
Famous last words, "nothing wrong can happen" :-)
I already added a helper for this FE lock, so I can easily change the implementation to remove the spinlock and use the FE PCM lock. We might even take the lock in the definition of for_each_dpcm_be() to avoid misses.
Let me try this out today, thanks for the suggestions.
Well, it's not successful at all...
When I replace the existing dpcm_lock with the FE PCM lock as you suggested, without any additional changes, speaker-test produces valid audio on the endpoints; but if I try a Ctrl-C, or limit the number of loops with the '-l' option, I hear an endless loop on the same buffer, and I have to power-cycle my test device to stop the sound.
See the two patches attached: the first one, introducing the helper, works fine; the second one, using the FE PCM lock, doesn't. In hindsight I am glad I worked on minimal patches, one after the other, to identify problems.
And when I add the BE lock, nothing happens at all: the device is stuck and there's no audio...
There must be something we're missing related to the locking...
My work version is at https://github.com/plbossart/sound/tree/fix/dpcm-lock5 if anyone wants to take a look.
And indeed there's a deadlock!
snd_pcm_period_elapsed() takes the FE PCM stream lock and can end up calling the trigger. So if we also take the PCM stream lock in the BE trigger, there's a conceptual deadlock: we painted ourselves into a corner by taking the same lock twice.
Takashi, are you really sure we should protect these for_each_dpcm_be() loops with the same PCM lock? It seems like asking for trouble to revisit the ALSA core just to walk through a list of BEs. Would you object to changing dpcm_lock as I suggested, without interfering with the stream handling?
See the traces below based on the hack in https://github.com/plbossart/sound/tree/test/dpcm-lock-loop-on-stop
[ 67.892082] skl_hda_dsp_generic skl_hda_dsp_generic: snd_sof_pcm_period_elapsed_work: plb taking lock
[ 67.892088] skl_hda_dsp_generic skl_hda_dsp_generic: snd_sof_pcm_period_elapsed_work: plb taking lock - done
[ 67.892092] skl_hda_dsp_generic skl_hda_dsp_generic: snd_pcm_period_elapsed_under_stream_lock: start
[ 67.892096] skl_hda_dsp_generic skl_hda_dsp_generic: snd_pcm_period_elapsed_under_stream_lock: check running
[ 67.892099] skl_hda_dsp_generic skl_hda_dsp_generic: snd_pcm_period_elapsed_under_stream_lock: check update_hw_ptr0
[ 67.892103] skl_hda_dsp_generic skl_hda_dsp_generic: snd_pcm_update_hw_ptr0: start
[ 67.892110] skl_hda_dsp_generic skl_hda_dsp_generic: snd_pcm_update_hw_ptr0: delta
[ 67.892113] skl_hda_dsp_generic skl_hda_dsp_generic: snd_pcm_update_hw_ptr0: check1
[ 67.892116] skl_hda_dsp_generic skl_hda_dsp_generic: snd_pcm_update_hw_ptr0: no_delta_check
[ 67.892119] skl_hda_dsp_generic skl_hda_dsp_generic: snd_pcm_update_hw_ptr0: playback silence
[ 67.892123] skl_hda_dsp_generic skl_hda_dsp_generic: snd_pcm_update_hw_ptr0: checks 3
[ 67.892126] skl_hda_dsp_generic skl_hda_dsp_generic: snd_pcm_update_hw_ptr0: done
[ 67.892129] skl_hda_dsp_generic skl_hda_dsp_generic: snd_pcm_update_state: start
[ 67.892132] skl_hda_dsp_generic skl_hda_dsp_generic: snd_pcm_update_state: before draining
[ 67.892136] skl_hda_dsp_generic skl_hda_dsp_generic: snd_pcm_update_state: before draining2
[ 67.892139] skl_hda_dsp_generic skl_hda_dsp_generic: snd_pcm_update_state: before snd_pcm_drain_done
[ 67.892143] skl_hda_dsp_generic skl_hda_dsp_generic: snd_pcm_do_stop: start
[ 67.892147] skl_hda_dsp_generic skl_hda_dsp_generic: snd_pcm_do_stop: before TRIGGER_STOP
<<< we never reach the end of this TRIGGER_STOP
[ 67.892153] sof-audio-pci-intel-cnl 0000:00:1f.3: pcm: trigger stream 0 dir 0 cmd 0
[ 67.892166] sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx: 0x60050000: GLB_STREAM_MSG: TRIG_STOP
[ 67.892396] sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx succeeded: 0x60050000: GLB_STREAM_MSG: TRIG_STOP
[ 67.892408] sof-audio-pci-intel-cnl 0000:00:1f.3: FW Poll Status: reg[0x160]=0x140000 successful
[ 67.892418] sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx: 0x60030000: GLB_STREAM_MSG: PCM_FREE
[ 67.892564] sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx succeeded: 0x60030000: GLB_STREAM_MSG: PCM_FREE
[ 67.892571] HDA Analog: dpcm_be_dai_trigger: BE TRIGGER_STOP start
[ 67.892575] HDA Analog: dpcm_be_dai_trigger: BE TRIGGER_STOP check can_be_free_stop
[ 67.892579] HDA Analog: snd_soc_dpcm_check_state: plb: taking fe lock_irqsave from snd_soc_dpcm_check_state
<<< checkmate at this point, we're trying to take the same lock twice.
On Mon, 11 Oct 2021 22:06:51 +0200, Pierre-Louis Bossart wrote:
Indeed, this would deadlock.
The call within the FE lock is done only in dpcm_dai_trigger_fe_be(), and that one should call dpcm_be_dai_trigger() as-is. In the other places, the calls are made without the FE lock, hence they can take the lock themselves, e.g. via a variant dpcm_dai_trigger_fe_be_lock().
Changing dpcm_lock as you suggested, without interfering with the stream handling, would work too; it's just a pity to degrade the fine-grained locks that are already in place into global locks...
Takashi
On Tue, 12 Oct 2021 08:34:07 +0200, Takashi Iwai wrote:
Or rather, was it the code path of snd_soc_dpcm_check_state()? That function can be called from dpcm_be_dai_trigger(). Currently dpcm_lock seems to have been added casually, covering some of the for_each_dpcm_be() loops and whatnot... The whole lock scheme has to be revisited.
For the performance concern, at least, you can use rwlock and rw_semaphore types (depending on nonatomic) so that concurrent accesses can still proceed. The only exclusive locking is for adding and removing entries, which should be done under the write lock / down_write(), while the other link traversals can run under the read lock / down_read().
Takashi
Thanks Takashi for your feedback.
Let's first get the locking right. We can optimize performance later.
I will continue a bit more with the idea of using the existing fine-grained locks; I am with you that this dpcm_lock was not added in a consistent way, and reusing the concept is really building on sand.
We can remove the lock in snd_soc_dpcm_check_state(); I already did that change in other versions. But what I'll need to check further is whether dpcm_be_dai_trigger() is indeed called with the FE lock already taken. Adding a lockdep_assert_held() or something would help, I guess.
The part where I am still not clear is that snd_pcm_period_elapsed() uses the irqsave/irqrestore version, but in the initial code you suggested the vanilla irq version is fine; I am not sure I get the nuance here. And btw, in the case of SOF, snd_pcm_period_elapsed() is called from a workqueue, not an interrupt handler, as a work-around to avoid an IPC deadlock, so we probably don't need the irqsave/irqrestore version anyway.
On Tue, 12 Oct 2021 15:45:41 +0200, Pierre-Louis Bossart wrote:
In a nutshell:
* In the code paths that already hold the stream lock (i.e. the trigger, pointer and ack PCM callbacks), you need no extra lock for the stream itself. But if you want additional locks (e.g. for the BE), use either *_spin_lock() or *_spin_lock_irqsave(), but not *_spin_lock_irq().
* In the other code paths, use *_spin_lock_irq().
When in doubt, you can always use *_irqsave(), of course.
Takashi