[RFC PATCH v2 0/5] ASoC: soc-pcm: fix trigger race conditions with shared BE
We've been adding a 'deep buffer' PCM device to several SOF topologies in order to reduce power consumption. The typical use case would be music playback over a headset: this additional PCM device provides more buffering and longer latencies, letting the rest of the system sleep for longer periods. Notifications and 'regular' low-latency audio playback would still use the 'normal' PCM device and be mixed with the 'deep buffer' stream before rendering on the headphone endpoint. The tentative direction would be to expose this alternate device to PulseAudio/PipeWire/CRAS via the UCM SectionModifier definitions.
That seemed like a straightforward topology change, until our automated validation stress tests started reporting issues on SoundWire platforms, where e.g. two START triggers might be sent or, conversely, the STOP trigger is never sent. The SoundWire stream state management flagged inconsistent states when the 'normal' and 'deep buffer' devices are used concurrently with rapid play/stop/pause monkey testing.
Looking at the soc-pcm.c code, it seems that the BE state management needs a lot of love.
a) there is no consistent protection for the BE state. In some parts of the code, the state updates are protected by a spinlock, but in the trigger they are not. When we open/play/close the two PCM devices in stress tests, we end up testing a state that is being modified. That can't be good.
b) there is a conceptual deadlock: on stop, we check the FE states to see whether a shared BE can be stopped, but since we trigger the BE first, the FE states have not been modified yet, so the TRIGGER_STOP is never sent.
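To make the stop problem concrete, here is a simplified timeline (an illustration of the race, not actual kernel code) for two FEs sharing one BE and stopping at roughly the same time:

   FE1 stop                                  FE2 stop
   --------                                  --------
   dpcm_be_dai_trigger(STOP)
     snd_soc_dpcm_can_be_free_stop()
       -> FE2 still in STATE_START,
          skip the BE TRIGGER_STOP
                                             dpcm_be_dai_trigger(STOP)
                                               snd_soc_dpcm_can_be_free_stop()
                                                 -> FE1 still in STATE_START,
                                                    skip the BE TRIGGER_STOP
   FE1 state set to STATE_STOP
                                             FE2 state set to STATE_STOP

Both FEs end up stopped, but neither ever sent TRIGGER_STOP to the BE.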
This patchset suggests a transition from a spinlock to a mutex, an extended protection when walking through the BE list, and the use of a refcount to decide when to trigger the BE. With these patches I am able to run our entire validation suite without any issues with this new 'deep buffer' topology, and no regressions on existing solutions [1].
One might ask 'how come we didn't see this earlier?'. The answer is probably that the .trigger callbacks in most implementations seem to perform DAPM operations, and sending the triggers multiple times is not an issue. In the case of SoundWire, we do use the .trigger callback to reconfigure the bus using the 'bank switch' mechanism. It could be acceptable to tolerate a trigger being sent multiple times, but the deadlock on stop cannot be fixed at the SoundWire layer alone.
I chose to send this patchset as an RFC v2 to gather more feedback and make sure others know about DPCM issues. We're going to spend more time on this, but if others can provide feedback/test results it would be greatly appreciated.
Opens:
1) Is this the right solution? The DPCM code is far from simple and has notions, such as SND_SOC_DPCM_UPDATE_NO and 'trigger_pending', that I have no background on.
2) There are other reports of kernel oopses [2] that seem related to the lack of protection. It would be good to confirm whether this patchset solves these problems as well.
[1] https://github.com/thesofproject/linux/pull/3146
[2] https://lore.kernel.org/alsa-devel/002f01d7b4f5$c030f4a0$4092dde0$@samsung.c...
changes since RFC v1:
 - Removed unused function
 - Removed exported symbols only used in soc-pcm.c, used static instead
 - Use a mutex instead of a spinlock
 - Protect all for_each_dpcm_be() loops
 - Fix bugs introduced in the refcount
Pierre-Louis Bossart (5):
  ASoC: soc-pcm: remove snd_soc_dpcm_fe_can_update()
  ASoC: soc-pcm: don't export local functions, use static
  ASoC: soc-pcm: replace dpcm_lock with dpcm_mutex
  ASoC: soc-pcm: protect for_each_dpcm_be() loops with dpcm_mutex
  ASoC: soc-pcm: test refcount before triggering
 include/sound/soc-dpcm.h |  17 +----
 include/sound/soc.h      |   2 +-
 sound/soc/soc-core.c     |   2 +-
 sound/soc/soc-pcm.c      | 153 ++++++++++++++++++++++++++-------------
 4 files changed, 108 insertions(+), 66 deletions(-)
This function is not used anywhere, including in soc-pcm.c.
Remove dead code.
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 include/sound/soc-dpcm.h | 3 ---
 sound/soc/soc-pcm.c      | 9 ---------
 2 files changed, 12 deletions(-)
diff --git a/include/sound/soc-dpcm.h b/include/sound/soc-dpcm.h
index bc7af90099a8..72d45ad47ee3 100644
--- a/include/sound/soc-dpcm.h
+++ b/include/sound/soc-dpcm.h
@@ -121,9 +121,6 @@ int snd_soc_dpcm_can_be_free_stop(struct snd_soc_pcm_runtime *fe,
 int snd_soc_dpcm_can_be_params(struct snd_soc_pcm_runtime *fe,
 		struct snd_soc_pcm_runtime *be, int stream);
 
-/* is the current PCM operation for this FE ? */
-int snd_soc_dpcm_fe_can_update(struct snd_soc_pcm_runtime *fe, int stream);
-
 /* is the current PCM operation for this BE ? */
 int snd_soc_dpcm_be_can_update(struct snd_soc_pcm_runtime *fe,
 		struct snd_soc_pcm_runtime *be, int stream);
diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c
index fc1854e3e43f..17476e3219ea 100644
--- a/sound/soc/soc-pcm.c
+++ b/sound/soc/soc-pcm.c
@@ -2805,15 +2805,6 @@ int soc_new_pcm(struct snd_soc_pcm_runtime *rtd, int num)
 	return ret;
 }
 
-/* is the current PCM operation for this FE ? */
-int snd_soc_dpcm_fe_can_update(struct snd_soc_pcm_runtime *fe, int stream)
-{
-	if (fe->dpcm[stream].runtime_update == SND_SOC_DPCM_UPDATE_FE)
-		return 1;
-	return 0;
-}
-EXPORT_SYMBOL_GPL(snd_soc_dpcm_fe_can_update);
-
 /* is the current PCM operation for this BE ? */
 int snd_soc_dpcm_be_can_update(struct snd_soc_pcm_runtime *fe,
 		struct snd_soc_pcm_runtime *be, int stream)
No one uses the following functions outside of soc-pcm.c:

	snd_soc_dpcm_can_be_free_stop()
	snd_soc_dpcm_can_be_params()
	snd_soc_dpcm_be_can_update()
In preparation for locking changes, make them static functions and remove the EXPORT_SYMBOL_GPL() exports.
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 include/sound/soc-dpcm.h | 12 ------------
 sound/soc/soc-pcm.c      | 21 +++++++++++++++------
 2 files changed, 15 insertions(+), 18 deletions(-)
diff --git a/include/sound/soc-dpcm.h b/include/sound/soc-dpcm.h
index 72d45ad47ee3..9c00118603e7 100644
--- a/include/sound/soc-dpcm.h
+++ b/include/sound/soc-dpcm.h
@@ -113,18 +113,6 @@ struct snd_soc_dpcm_runtime {
 #define for_each_dpcm_be_rollback(fe, stream, _dpcm) \
 	list_for_each_entry_continue_reverse(_dpcm, &(fe)->dpcm[stream].be_clients, list_be)
 
-/* can this BE stop and free */
-int snd_soc_dpcm_can_be_free_stop(struct snd_soc_pcm_runtime *fe,
-		struct snd_soc_pcm_runtime *be, int stream);
-
-/* can this BE perform a hw_params() */
-int snd_soc_dpcm_can_be_params(struct snd_soc_pcm_runtime *fe,
-		struct snd_soc_pcm_runtime *be, int stream);
-
-/* is the current PCM operation for this BE ? */
-int snd_soc_dpcm_be_can_update(struct snd_soc_pcm_runtime *fe,
-		struct snd_soc_pcm_runtime *be, int stream);
-
 /* get the substream for this BE */
 struct snd_pcm_substream *
 	snd_soc_dpcm_get_substream(struct snd_soc_pcm_runtime *be, int stream);
diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c
index 17476e3219ea..360811f8a18c 100644
--- a/sound/soc/soc-pcm.c
+++ b/sound/soc/soc-pcm.c
@@ -29,6 +29,18 @@
 
 #define DPCM_MAX_BE_USERS	8
 
+/* can this BE stop and free */
+static int snd_soc_dpcm_can_be_free_stop(struct snd_soc_pcm_runtime *fe,
+					 struct snd_soc_pcm_runtime *be, int stream);
+
+/* can this BE perform a hw_params() */
+static int snd_soc_dpcm_can_be_params(struct snd_soc_pcm_runtime *fe,
+				      struct snd_soc_pcm_runtime *be, int stream);
+
+/* is the current PCM operation for this BE ? */
+static int snd_soc_dpcm_be_can_update(struct snd_soc_pcm_runtime *fe,
+				      struct snd_soc_pcm_runtime *be, int stream);
+
 static inline const char *soc_cpu_dai_name(struct snd_soc_pcm_runtime *rtd)
 {
 	return (rtd)->num_cpus == 1 ? asoc_rtd_to_cpu(rtd, 0)->name : "multicpu";
@@ -2806,7 +2818,7 @@ int soc_new_pcm(struct snd_soc_pcm_runtime *rtd, int num)
 }
 
 /* is the current PCM operation for this BE ? */
-int snd_soc_dpcm_be_can_update(struct snd_soc_pcm_runtime *fe,
+static int snd_soc_dpcm_be_can_update(struct snd_soc_pcm_runtime *fe,
 		struct snd_soc_pcm_runtime *be, int stream)
 {
 	if ((fe->dpcm[stream].runtime_update == SND_SOC_DPCM_UPDATE_FE) ||
@@ -2815,7 +2827,6 @@ snd_soc_dpcm_be_can_update(struct snd_soc_pcm_runtime *fe,
 		return 1;
 	return 0;
 }
-EXPORT_SYMBOL_GPL(snd_soc_dpcm_be_can_update);
 
 /* get the substream for this BE */
 struct snd_pcm_substream *
@@ -2861,7 +2872,7 @@ static int snd_soc_dpcm_check_state(struct snd_soc_pcm_runtime *fe,
  * We can only hw_free, stop, pause or suspend a BE DAI if any of it's FE
  * are not running, paused or suspended for the specified stream direction.
  */
-int snd_soc_dpcm_can_be_free_stop(struct snd_soc_pcm_runtime *fe,
+static int snd_soc_dpcm_can_be_free_stop(struct snd_soc_pcm_runtime *fe,
 		struct snd_soc_pcm_runtime *be, int stream)
 {
 	const enum snd_soc_dpcm_state state[] = {
@@ -2872,13 +2883,12 @@ static int snd_soc_dpcm_can_be_free_stop(struct snd_soc_pcm_runtime *fe,
 
 	return snd_soc_dpcm_check_state(fe, be, stream, state, ARRAY_SIZE(state));
 }
-EXPORT_SYMBOL_GPL(snd_soc_dpcm_can_be_free_stop);
 
 /*
  * We can only change hw params a BE DAI if any of it's FE are not prepared,
  * running, paused or suspended for the specified stream direction.
  */
-int snd_soc_dpcm_can_be_params(struct snd_soc_pcm_runtime *fe,
+static int snd_soc_dpcm_can_be_params(struct snd_soc_pcm_runtime *fe,
 		struct snd_soc_pcm_runtime *be, int stream)
 {
 	const enum snd_soc_dpcm_state state[] = {
@@ -2890,4 +2900,3 @@ static int snd_soc_dpcm_can_be_params(struct snd_soc_pcm_runtime *fe,
 
 	return snd_soc_dpcm_check_state(fe, be, stream, state, ARRAY_SIZE(state));
 }
-EXPORT_SYMBOL_GPL(snd_soc_dpcm_can_be_params);
It's not clear why a spinlock was used, and even less clear why 'dpcm_lock' is used with irqsave/irqrestore: DPCM functions are typically not called from interrupt context.
Move to a mutex which will allow us to sleep for longer periods of time and handle non-atomic triggers.
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 include/sound/soc.h  |  2 +-
 sound/soc/soc-core.c |  2 +-
 sound/soc/soc-pcm.c  | 30 ++++++++++++------------------
 3 files changed, 14 insertions(+), 20 deletions(-)
diff --git a/include/sound/soc.h b/include/sound/soc.h
index 8e6dd8a257c5..c13d8ec72d62 100644
--- a/include/sound/soc.h
+++ b/include/sound/soc.h
@@ -893,7 +893,7 @@ struct snd_soc_card {
 	struct mutex pcm_mutex;
 	enum snd_soc_pcm_subclass pcm_subclass;
 
-	spinlock_t dpcm_lock;
+	struct mutex dpcm_mutex;	/* protect all dpcm states and updates */
 
 	int (*probe)(struct snd_soc_card *card);
 	int (*late_probe)(struct snd_soc_card *card);
diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c
index c830e96afba2..e94d43c392e0 100644
--- a/sound/soc/soc-core.c
+++ b/sound/soc/soc-core.c
@@ -2339,7 +2339,7 @@ int snd_soc_register_card(struct snd_soc_card *card)
 	mutex_init(&card->mutex);
 	mutex_init(&card->dapm_mutex);
 	mutex_init(&card->pcm_mutex);
-	spin_lock_init(&card->dpcm_lock);
+	mutex_init(&card->dpcm_mutex);
 
 	return snd_soc_bind_card(card);
 }
diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c
index 360811f8a18c..af12593b033a 100644
--- a/sound/soc/soc-pcm.c
+++ b/sound/soc/soc-pcm.c
@@ -85,7 +85,6 @@ static ssize_t dpcm_show_state(struct snd_soc_pcm_runtime *fe,
 	struct snd_pcm_hw_params *params = &fe->dpcm[stream].hw_params;
 	struct snd_soc_dpcm *dpcm;
 	ssize_t offset = 0;
-	unsigned long flags;
 
 	/* FE state */
 	offset += scnprintf(buf + offset, size - offset,
@@ -113,7 +112,7 @@ static ssize_t dpcm_show_state(struct snd_soc_pcm_runtime *fe,
 		goto out;
 	}
 
-	spin_lock_irqsave(&fe->card->dpcm_lock, flags);
+	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_be(fe, stream, dpcm) {
 		struct snd_soc_pcm_runtime *be = dpcm->be;
 		params = &dpcm->hw_params;
@@ -134,7 +133,7 @@ static ssize_t dpcm_show_state(struct snd_soc_pcm_runtime *fe,
 				params_channels(params),
 				params_rate(params));
 	}
-	spin_unlock_irqrestore(&fe->card->dpcm_lock, flags);
+	mutex_unlock(&fe->card->dpcm_mutex);
 out:
 	return offset;
 }
@@ -1141,7 +1140,6 @@ static int dpcm_be_connect(struct snd_soc_pcm_runtime *fe,
 		struct snd_soc_pcm_runtime *be, int stream)
 {
 	struct snd_soc_dpcm *dpcm;
-	unsigned long flags;
 
 	/* only add new dpcms */
 	for_each_dpcm_be(fe, stream, dpcm) {
@@ -1157,10 +1155,10 @@ static int dpcm_be_connect(struct snd_soc_pcm_runtime *fe,
 	dpcm->fe = fe;
 	be->dpcm[stream].runtime = fe->dpcm[stream].runtime;
 	dpcm->state = SND_SOC_DPCM_LINK_STATE_NEW;
-	spin_lock_irqsave(&fe->card->dpcm_lock, flags);
+	mutex_lock(&fe->card->dpcm_mutex);
 	list_add(&dpcm->list_be, &fe->dpcm[stream].be_clients);
 	list_add(&dpcm->list_fe, &be->dpcm[stream].fe_clients);
-	spin_unlock_irqrestore(&fe->card->dpcm_lock, flags);
+	mutex_unlock(&fe->card->dpcm_mutex);
 
 	dev_dbg(fe->dev, "connected new DPCM %s path %s %s %s\n",
 			stream ? "capture" : "playback",  fe->dai_link->name,
@@ -1203,7 +1201,6 @@ static void dpcm_be_reparent(struct snd_soc_pcm_runtime *fe,
 void dpcm_be_disconnect(struct snd_soc_pcm_runtime *fe, int stream)
 {
 	struct snd_soc_dpcm *dpcm, *d;
-	unsigned long flags;
 
 	for_each_dpcm_be_safe(fe, stream, dpcm, d) {
 		dev_dbg(fe->dev, "ASoC: BE %s disconnect check for %s\n",
@@ -1222,10 +1219,10 @@ void dpcm_be_disconnect(struct snd_soc_pcm_runtime *fe, int stream)
 
 		dpcm_remove_debugfs_state(dpcm);
 
-		spin_lock_irqsave(&fe->card->dpcm_lock, flags);
+		mutex_lock(&fe->card->dpcm_mutex);
 		list_del(&dpcm->list_be);
 		list_del(&dpcm->list_fe);
-		spin_unlock_irqrestore(&fe->card->dpcm_lock, flags);
+		mutex_unlock(&fe->card->dpcm_mutex);
 		kfree(dpcm);
 	}
 }
@@ -1441,12 +1438,11 @@ int dpcm_process_paths(struct snd_soc_pcm_runtime *fe,
 void dpcm_clear_pending_state(struct snd_soc_pcm_runtime *fe, int stream)
 {
 	struct snd_soc_dpcm *dpcm;
-	unsigned long flags;
 
-	spin_lock_irqsave(&fe->card->dpcm_lock, flags);
+	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_be(fe, stream, dpcm)
 		dpcm_set_be_update_state(dpcm->be, stream,
 					 SND_SOC_DPCM_UPDATE_NO);
-	spin_unlock_irqrestore(&fe->card->dpcm_lock, flags);
+	mutex_unlock(&fe->card->dpcm_mutex);
 }
 
 void dpcm_be_dai_stop(struct snd_soc_pcm_runtime *fe, int stream,
@@ -2364,7 +2360,6 @@ static int dpcm_run_update_startup(struct snd_soc_pcm_runtime *fe, int stream)
 	struct snd_soc_dpcm *dpcm;
 	enum snd_soc_dpcm_trigger trigger = fe->dai_link->trigger[stream];
 	int ret = 0;
-	unsigned long flags;
 
 	dev_dbg(fe->dev, "ASoC: runtime %s open on FE %s\n",
 			stream ? "capture" : "playback", fe->dai_link->name);
@@ -2433,7 +2428,7 @@ static int dpcm_run_update_startup(struct snd_soc_pcm_runtime *fe, int stream)
 	dpcm_be_dai_shutdown(fe, stream);
 disconnect:
 	/* disconnect any pending BEs */
-	spin_lock_irqsave(&fe->card->dpcm_lock, flags);
+	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_be(fe, stream, dpcm) {
 		struct snd_soc_pcm_runtime *be = dpcm->be;
 
@@ -2445,7 +2440,7 @@ static int dpcm_run_update_startup(struct snd_soc_pcm_runtime *fe, int stream)
 		    be->dpcm[stream].state == SND_SOC_DPCM_STATE_NEW)
 			dpcm->state = SND_SOC_DPCM_LINK_STATE_FREE;
 	}
-	spin_unlock_irqrestore(&fe->card->dpcm_lock, flags);
+	mutex_unlock(&fe->card->dpcm_mutex);
 
 	if (ret < 0)
 		dev_err(fe->dev, "ASoC: %s() failed (%d)\n", __func__, ret);
@@ -2845,10 +2840,9 @@ static int snd_soc_dpcm_check_state(struct snd_soc_pcm_runtime *fe,
 	struct snd_soc_dpcm *dpcm;
 	int state;
 	int ret = 1;
-	unsigned long flags;
 	int i;
 
-	spin_lock_irqsave(&fe->card->dpcm_lock, flags);
+	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_fe(be, stream, dpcm) {
 
 		if (dpcm->fe == fe)
@@ -2862,7 +2856,7 @@ static int snd_soc_dpcm_check_state(struct snd_soc_pcm_runtime *fe,
 			}
 		}
 	}
-	spin_unlock_irqrestore(&fe->card->dpcm_lock, flags);
+	mutex_unlock(&fe->card->dpcm_mutex);
 
 	/* it's safe to do this BE DAI */
 	return ret;
When more than one FE is connected to a BE, e.g. in a mixing use case, the BE can be triggered multiple times when the FEs are opened/started concurrently. This race condition is problematic in the case of SoundWire BE dailinks, and it is not desirable in the general case. The code carefully checks when the BE can be stopped or hw_free'ed, but the trigger code does not use any mutual exclusion.
This can be fixed by using the dpcm_mutex.
As discussed on the alsa-devel mailing list [1], there is an additional issue: the BEs can be disconnected dynamically, leading to potential accesses to freed memory.
This patch suggests a general protection of the for_each_dpcm_be() loops with the dpcm_mutex, to solve both the problem of multiple triggers and that of dynamic BE addition/removal.
[1] https://lore.kernel.org/alsa-devel/002f01d7b4f5$c030f4a0$4092dde0$@samsung.c...
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 sound/soc/soc-pcm.c | 48 ++++++++++++++++++++++++++++++++++++---------
 1 file changed, 39 insertions(+), 9 deletions(-)
diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c
index af12593b033a..6089e0c1bf9f 100644
--- a/sound/soc/soc-pcm.c
+++ b/sound/soc/soc-pcm.c
@@ -320,6 +320,7 @@ int dpcm_dapm_stream_event(struct snd_soc_pcm_runtime *fe, int dir,
 {
 	struct snd_soc_dpcm *dpcm;
 
+	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_be(fe, dir, dpcm) {
 
 		struct snd_soc_pcm_runtime *be = dpcm->be;
@@ -333,6 +334,8 @@ int dpcm_dapm_stream_event(struct snd_soc_pcm_runtime *fe, int dir,
 
 		snd_soc_dapm_stream_event(be, dir, event);
 	}
+	mutex_unlock(&fe->card->dpcm_mutex);
+
 	snd_soc_dapm_stream_event(fe, dir, event);
 
@@ -1142,10 +1145,14 @@ static int dpcm_be_connect(struct snd_soc_pcm_runtime *fe,
 	struct snd_soc_dpcm *dpcm;
 
 	/* only add new dpcms */
+	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_be(fe, stream, dpcm) {
-		if (dpcm->be == be && dpcm->fe == fe)
+		if (dpcm->be == be && dpcm->fe == fe) {
+			mutex_unlock(&fe->card->dpcm_mutex);
 			return 0;
+		}
 	}
+	mutex_unlock(&fe->card->dpcm_mutex);
 
 	dpcm = kzalloc(sizeof(struct snd_soc_dpcm), GFP_KERNEL);
 	if (!dpcm)
@@ -1202,6 +1209,7 @@ void dpcm_be_disconnect(struct snd_soc_pcm_runtime *fe, int stream)
 {
 	struct snd_soc_dpcm *dpcm, *d;
 
+	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_be_safe(fe, stream, dpcm, d) {
 		dev_dbg(fe->dev, "ASoC: BE %s disconnect check for %s\n",
 				stream ? "capture" : "playback",
@@ -1219,12 +1227,11 @@ void dpcm_be_disconnect(struct snd_soc_pcm_runtime *fe, int stream)
 
 		dpcm_remove_debugfs_state(dpcm);
 
-		mutex_lock(&fe->card->dpcm_mutex);
 		list_del(&dpcm->list_be);
 		list_del(&dpcm->list_fe);
-		mutex_unlock(&fe->card->dpcm_mutex);
 		kfree(dpcm);
 	}
+	mutex_unlock(&fe->card->dpcm_mutex);
 }
 
 /* get BE for DAI widget and stream */
@@ -1351,6 +1358,7 @@ static int dpcm_prune_paths(struct snd_soc_pcm_runtime *fe, int stream,
 	int prune = 0;
 
 	/* Destroy any old FE <--> BE connections */
+	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_be(fe, stream, dpcm) {
 		if (dpcm_be_is_active(dpcm, stream, *list_))
 			continue;
@@ -1362,6 +1370,7 @@ static int dpcm_prune_paths(struct snd_soc_pcm_runtime *fe, int stream,
 		dpcm_set_be_update_state(dpcm->be, stream, SND_SOC_DPCM_UPDATE_BE);
 		prune++;
 	}
+	mutex_unlock(&fe->card->dpcm_mutex);
 
 	dev_dbg(fe->dev, "ASoC: found %d old BE paths for pruning\n", prune);
 	return prune;
@@ -1451,13 +1460,16 @@ void dpcm_be_dai_stop(struct snd_soc_pcm_runtime *fe, int stream,
 	struct snd_soc_dpcm *dpcm;
 
 	/* disable any enabled and non active backends */
+	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_be(fe, stream, dpcm) {
 		struct snd_soc_pcm_runtime *be = dpcm->be;
 		struct snd_pcm_substream *be_substream =
 			snd_soc_dpcm_get_substream(be, stream);
 
-		if (dpcm == last)
+		if (dpcm == last) {
+			mutex_unlock(&fe->card->dpcm_mutex);
 			return;
+		}
 
 		/* is this op for this BE ? */
 		if (!snd_soc_dpcm_be_can_update(fe, be, stream))
@@ -1487,6 +1499,7 @@ void dpcm_be_dai_stop(struct snd_soc_pcm_runtime *fe, int stream,
 		be_substream->runtime = NULL;
 		be->dpcm[stream].state = SND_SOC_DPCM_STATE_CLOSE;
 	}
+	mutex_unlock(&fe->card->dpcm_mutex);
 }
 
 int dpcm_be_dai_startup(struct snd_soc_pcm_runtime *fe, int stream)
@@ -1496,6 +1509,7 @@ int dpcm_be_dai_startup(struct snd_soc_pcm_runtime *fe, int stream)
 	int err, count = 0;
 
 	/* only startup BE DAIs that are either sinks or sources to this FE DAI */
+	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_be(fe, stream, dpcm) {
 		struct snd_pcm_substream *be_substream;
 
@@ -1546,11 +1560,13 @@ int dpcm_be_dai_startup(struct snd_soc_pcm_runtime *fe, int stream)
 		be->dpcm[stream].state = SND_SOC_DPCM_STATE_OPEN;
 		count++;
 	}
+	mutex_unlock(&fe->card->dpcm_mutex);
 
 	return count;
 
 unwind:
 	dpcm_be_dai_startup_rollback(fe, stream, dpcm);
+	mutex_unlock(&fe->card->dpcm_mutex);
 
 	dev_err(fe->dev, "ASoC: %s() failed at %s (%d)\n",
 		__func__, be->dai_link->name, err);
@@ -1605,6 +1621,7 @@ static void dpcm_runtime_setup_be_format(struct snd_pcm_substream *substream)
 	 * if FE want to use it (= dpcm_merged_format)
 	 */
 
+	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_be(fe, stream, dpcm) {
 		struct snd_soc_pcm_runtime *be = dpcm->be;
 		struct snd_soc_pcm_stream *codec_stream;
@@ -1623,6 +1640,7 @@ static void dpcm_runtime_setup_be_format(struct snd_pcm_substream *substream)
 			soc_pcm_hw_update_format(hw, codec_stream);
 		}
 	}
+	mutex_unlock(&fe->card->dpcm_mutex);
 }
 
 static void dpcm_runtime_setup_be_chan(struct snd_pcm_substream *substream)
@@ -1640,7 +1658,7 @@ static void dpcm_runtime_setup_be_chan(struct snd_pcm_substream *substream)
 	 * It returns merged BE codec channel;
 	 * if FE want to use it (= dpcm_merged_chan)
 	 */
-
+	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_be(fe, stream, dpcm) {
 		struct snd_soc_pcm_runtime *be = dpcm->be;
 		struct snd_soc_pcm_stream *cpu_stream;
@@ -1671,6 +1689,7 @@ static void dpcm_runtime_setup_be_chan(struct snd_pcm_substream *substream)
 			soc_pcm_hw_update_chan(hw, codec_stream);
 		}
 	}
+	mutex_unlock(&fe->card->dpcm_mutex);
 }
 
 static void dpcm_runtime_setup_be_rate(struct snd_pcm_substream *substream)
@@ -1688,7 +1707,7 @@ static void dpcm_runtime_setup_be_rate(struct snd_pcm_substream *substream)
 	 * It returns merged BE codec channel;
 	 * if FE want to use it (= dpcm_merged_chan)
 	 */
-
+	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_be(fe, stream, dpcm) {
 		struct snd_soc_pcm_runtime *be = dpcm->be;
 		struct snd_soc_pcm_stream *pcm;
@@ -1708,6 +1727,7 @@ static void dpcm_runtime_setup_be_rate(struct snd_pcm_substream *substream)
 			soc_pcm_hw_update_rate(hw, pcm);
 		}
 	}
+	mutex_unlock(&fe->card->dpcm_mutex);
 }
 
 static int dpcm_apply_symmetry(struct snd_pcm_substream *fe_substream,
@@ -1730,6 +1750,7 @@ static int dpcm_apply_symmetry(struct snd_pcm_substream *fe_substream,
 	}
 
 	/* apply symmetry for BE */
+	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_be(fe, stream, dpcm) {
 		struct snd_soc_pcm_runtime *be = dpcm->be;
 		struct snd_pcm_substream *be_substream =
@@ -1755,6 +1776,7 @@ static int dpcm_apply_symmetry(struct snd_pcm_substream *fe_substream,
 		}
 	}
 error:
+	mutex_unlock(&fe->card->dpcm_mutex);
 	if (err < 0)
 		dev_err(fe->dev, "ASoC: %s failed (%d)\n", __func__, err);
 
@@ -1830,6 +1852,7 @@ void dpcm_be_dai_hw_free(struct snd_soc_pcm_runtime *fe, int stream)
 
 	/* only hw_params backends that are either sinks or sources
 	 * to this frontend DAI */
+	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_be(fe, stream, dpcm) {
 
 		struct snd_soc_pcm_runtime *be = dpcm->be;
@@ -1842,7 +1865,7 @@ void dpcm_be_dai_hw_free(struct snd_soc_pcm_runtime *fe, int stream)
 
 		/* only free hw when no longer used - check all FEs */
 		if (!snd_soc_dpcm_can_be_free_stop(fe, be, stream))
-				continue;
+			continue;
 
 		/* do not free hw if this BE is used by other FE */
 		if (be->dpcm[stream].users > 1)
@@ -1863,6 +1886,7 @@ void dpcm_be_dai_hw_free(struct snd_soc_pcm_runtime *fe, int stream)
 
 		be->dpcm[stream].state = SND_SOC_DPCM_STATE_HW_FREE;
 	}
+	mutex_unlock(&fe->card->dpcm_mutex);
 }
 
 static int dpcm_fe_dai_hw_free(struct snd_pcm_substream *substream)
@@ -1896,6 +1920,7 @@ int dpcm_be_dai_hw_params(struct snd_soc_pcm_runtime *fe, int stream)
 	struct snd_soc_dpcm *dpcm;
 	int ret;
 
+	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_be(fe, stream, dpcm) {
 		be = dpcm->be;
 		be_substream = snd_soc_dpcm_get_substream(be, stream);
@@ -1935,6 +1960,7 @@ int dpcm_be_dai_hw_params(struct snd_soc_pcm_runtime *fe, int stream)
 
 		be->dpcm[stream].state = SND_SOC_DPCM_STATE_HW_PARAMS;
 	}
+	mutex_unlock(&fe->card->dpcm_mutex);
 	return 0;
 
 unwind:
@@ -1961,6 +1987,7 @@ int dpcm_be_dai_hw_params(struct snd_soc_pcm_runtime *fe, int stream)
 
 		soc_pcm_hw_free(be_substream);
 	}
+	mutex_unlock(&fe->card->dpcm_mutex);
 
 	return ret;
 }
@@ -2008,6 +2035,7 @@ int dpcm_be_dai_trigger(struct snd_soc_pcm_runtime *fe, int stream,
 	struct snd_soc_dpcm *dpcm;
 	int ret = 0;
 
+	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_be(fe, stream, dpcm) {
 		struct snd_pcm_substream *be_substream;
 
@@ -2097,6 +2125,7 @@ int dpcm_be_dai_trigger(struct snd_soc_pcm_runtime *fe, int stream,
 		}
 	}
end:
+	mutex_unlock(&fe->card->dpcm_mutex);
 	if (ret < 0)
 		dev_err(fe->dev, "ASoC: %s() failed at %s (%d)\n",
			__func__, be->dai_link->name, ret);
@@ -2831,6 +2860,7 @@ struct snd_pcm_substream *
 }
 EXPORT_SYMBOL_GPL(snd_soc_dpcm_get_substream);
 
+/* This must be called with fe->card->dpcm_mutex held */
 static int snd_soc_dpcm_check_state(struct snd_soc_pcm_runtime *fe,
 				    struct snd_soc_pcm_runtime *be,
 				    int stream,
@@ -2842,7 +2872,6 @@ static int snd_soc_dpcm_check_state(struct snd_soc_pcm_runtime *fe,
 	int ret = 1;
 	int i;
 
-	mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_fe(be, stream, dpcm) {
 
 		if (dpcm->fe == fe)
@@ -2856,7 +2885,6 @@ static int snd_soc_dpcm_check_state(struct snd_soc_pcm_runtime *fe,
 			}
 		}
 	}
-	mutex_unlock(&fe->card->dpcm_mutex);
 
 	/* it's safe to do this BE DAI */
 	return ret;
@@ -2865,6 +2893,7 @@ static int snd_soc_dpcm_check_state(struct snd_soc_pcm_runtime *fe,
 /*
  * We can only hw_free, stop, pause or suspend a BE DAI if any of it's FE
  * are not running, paused or suspended for the specified stream direction.
+ * This must be called with fe->card->dpcm_mutex held.
 */
 static int snd_soc_dpcm_can_be_free_stop(struct snd_soc_pcm_runtime *fe,
					  struct snd_soc_pcm_runtime *be, int stream)
@@ -2881,6 +2910,7 @@ static int snd_soc_dpcm_can_be_free_stop(struct snd_soc_pcm_runtime *fe,
 /*
  * We can only change hw params a BE DAI if any of it's FE are not prepared,
  * running, paused or suspended for the specified stream direction.
+ * This must be called with fe->card->dpcm_mutex held.
 */
 static int snd_soc_dpcm_can_be_params(struct snd_soc_pcm_runtime *fe,
				       struct snd_soc_pcm_runtime *be, int stream)
On start/pause_release/resume, when more than one FE is connected to the same BE, it's possible that the trigger is sent more than once. This is not desirable; we only want to trigger a BE once, which is straightforward to implement with a refcount.

For stop/pause/suspend, the problem is more complicated: the check implemented in snd_soc_dpcm_can_be_free_stop() may fail due to a conceptual deadlock when we trigger the BE before the FE. In this case, the FE states have not yet changed, so there are corner cases where the TRIGGER_STOP is never sent - the dual of the start case, where multiple triggers might be sent.

This patch suggests an unconditional trigger in all cases, without checking the FE states, using a refcount protected by a mutex.
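As a concrete walk-through of the refcount (two FEs sharing one BE, following the logic of the diff below):

   FE1 START: be_start 0 -> 1, send TRIGGER_START to the BE
   FE2 START: be_start 1 -> 2, skip the trigger
   FE1 STOP:  be_start 2 -> 1, skip the trigger
   FE2 STOP:  be_start 1 -> 0, send TRIGGER_STOP to the BE

The last FE to stop always sends the STOP trigger, regardless of the order in which the FE states are updated.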
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 include/sound/soc-dpcm.h |  2 ++
 sound/soc/soc-pcm.c      | 53 +++++++++++++++++++++++++++++++---------
 2 files changed, 44 insertions(+), 11 deletions(-)
diff --git a/include/sound/soc-dpcm.h b/include/sound/soc-dpcm.h
index 9c00118603e7..6d84e2c60ca8 100644
--- a/include/sound/soc-dpcm.h
+++ b/include/sound/soc-dpcm.h
@@ -101,6 +101,8 @@ struct snd_soc_dpcm_runtime {
 	enum snd_soc_dpcm_state state;
 
 	int trigger_pending; /* trigger cmd + 1 if pending, 0 if not */
+
+	int be_start; /* refcount protected by card's dpcm_mutex */
 };
 
 #define for_each_dpcm_fe(be, stream, _dpcm) \
diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c
index 6089e0c1bf9f..ed898d83fe17 100644
--- a/sound/soc/soc-pcm.c
+++ b/sound/soc/soc-pcm.c
@@ -1556,7 +1556,7 @@ int dpcm_be_dai_startup(struct snd_soc_pcm_runtime *fe, int stream)
 			be->dpcm[stream].state = SND_SOC_DPCM_STATE_CLOSE;
 			goto unwind;
 		}
-
+		be->dpcm[stream].be_start = 0;
 		be->dpcm[stream].state = SND_SOC_DPCM_STATE_OPEN;
 		count++;
 	}
@@ -2051,14 +2051,21 @@ int dpcm_be_dai_trigger(struct snd_soc_pcm_runtime *fe, int stream,
 
 		switch (cmd) {
 		case SNDRV_PCM_TRIGGER_START:
-			if ((be->dpcm[stream].state != SND_SOC_DPCM_STATE_PREPARE) &&
+			if (!be->dpcm[stream].be_start &&
+			    (be->dpcm[stream].state != SND_SOC_DPCM_STATE_PREPARE) &&
 			    (be->dpcm[stream].state != SND_SOC_DPCM_STATE_STOP) &&
 			    (be->dpcm[stream].state != SND_SOC_DPCM_STATE_PAUSED))
 				continue;
 
+			be->dpcm[stream].be_start++;
+			if (be->dpcm[stream].be_start != 1)
+				continue;
+
 			ret = soc_pcm_trigger(be_substream, cmd);
-			if (ret)
+			if (ret) {
+				be->dpcm[stream].be_start--;
 				goto end;
+			}
 
 			be->dpcm[stream].state = SND_SOC_DPCM_STATE_START;
 			break;
@@ -2066,9 +2073,15 @@ int dpcm_be_dai_trigger(struct snd_soc_pcm_runtime *fe, int stream,
 			if ((be->dpcm[stream].state != SND_SOC_DPCM_STATE_SUSPEND))
 				continue;
 
+			be->dpcm[stream].be_start++;
+			if (be->dpcm[stream].be_start != 1)
+				continue;
+
 			ret = soc_pcm_trigger(be_substream, cmd);
-			if (ret)
+			if (ret) {
+				be->dpcm[stream].be_start--;
 				goto end;
+			}
 
 			be->dpcm[stream].state = SND_SOC_DPCM_STATE_START;
 			break;
@@ -2076,9 +2089,15 @@ int dpcm_be_dai_trigger(struct snd_soc_pcm_runtime *fe, int stream,
 			if ((be->dpcm[stream].state != SND_SOC_DPCM_STATE_PAUSED))
 				continue;
 
+			be->dpcm[stream].be_start++;
+			if (be->dpcm[stream].be_start != 1)
+				continue;
+
 			ret = soc_pcm_trigger(be_substream, cmd);
-			if (ret)
+			if (ret) {
+				be->dpcm[stream].be_start--;
 				goto end;
+			}
 
 			be->dpcm[stream].state = SND_SOC_DPCM_STATE_START;
 			break;
@@ -2087,12 +2106,18 @@ int dpcm_be_dai_trigger(struct snd_soc_pcm_runtime *fe, int stream,
 			    (be->dpcm[stream].state != SND_SOC_DPCM_STATE_PAUSED))
 				continue;
 
-			if (!snd_soc_dpcm_can_be_free_stop(fe, be, stream))
+			if (be->dpcm[stream].state == SND_SOC_DPCM_STATE_START)
+				be->dpcm[stream].be_start--;
+
+			if (be->dpcm[stream].be_start != 0)
 				continue;
 
 			ret = soc_pcm_trigger(be_substream, cmd);
-			if (ret)
+			if (ret) {
+				if (be->dpcm[stream].state == SND_SOC_DPCM_STATE_START)
+					be->dpcm[stream].be_start++;
 				goto end;
+			}
 
 			be->dpcm[stream].state = SND_SOC_DPCM_STATE_STOP;
 			break;
@@ -2100,12 +2125,15 @@ int dpcm_be_dai_trigger(struct snd_soc_pcm_runtime *fe, int stream,
 			if (be->dpcm[stream].state != SND_SOC_DPCM_STATE_START)
 				continue;
 
-			if (!snd_soc_dpcm_can_be_free_stop(fe, be, stream))
+			be->dpcm[stream].be_start--;
+			if (be->dpcm[stream].be_start != 0)
 				continue;
 
 			ret = soc_pcm_trigger(be_substream, cmd);
-			if (ret)
+			if (ret) {
+				be->dpcm[stream].be_start++;
 				goto end;
+			}
 
 			be->dpcm[stream].state = SND_SOC_DPCM_STATE_SUSPEND;
 			break;
@@ -2113,12 +2141,15 @@ int dpcm_be_dai_trigger(struct snd_soc_pcm_runtime *fe, int stream,
 			if (be->dpcm[stream].state != SND_SOC_DPCM_STATE_START)
 				continue;
 
-			if (!snd_soc_dpcm_can_be_free_stop(fe, be, stream))
+			be->dpcm[stream].be_start--;
+			if (be->dpcm[stream].be_start != 0)
 				continue;
 
 			ret = soc_pcm_trigger(be_substream, cmd);
-			if (ret)
+			if (ret) {
+				be->dpcm[stream].be_start++;
 				goto end;
+			}
 
 			be->dpcm[stream].state = SND_SOC_DPCM_STATE_PAUSED;
 			break;
On 10/5/2021 4:24 AM, Pierre-Louis Bossart wrote:
Thank you Pierre-Louis for adding me to this thread.
I did a quick test of your patches on my Tegra board and I am seeing issues with multiple streams. For instance, I ran a 2x1 mixer configuration and hit the following:
[  277.661886] BUG: scheduling while atomic: aplay/1306/0x00000100
[  287.713193] BUG: spinlock cpu recursion on CPU#0, aplay/1307
[  287.719138]  lock: 0xffff00008cc820f0, .magic: dead4ead, .owner: aplay/1306, .owner_cpu: 0
[  287.727319] CPU: 0 PID: 1307 Comm: aplay Tainted: G        W 5.15.0-rc3-next-20210927-00026-gffdabce987b1 #12
[  287.737783] Hardware name: NVIDIA Jetson AGX Xavier Developer Kit (DT)
[  287.744228] Call trace:
[  287.746656]  dump_backtrace+0x0/0x1c0
[  287.750300]  show_stack+0x18/0x28
[  287.753604]  dump_stack_lvl+0x7c/0xa8
[  287.757236]  dump_stack+0x18/0x34
[  287.760536]  spin_dump+0x70/0x90
[  287.763732]  do_raw_spin_lock+0xd8/0x120
[  287.767615]  _raw_spin_lock_irq+0x60/0x80
[  287.771581]  snd_pcm_stream_lock_irq+0x20/0x48 [snd_pcm]
[  287.776853]  snd_pcm_drain+0x1ec/0x348 [snd_pcm]
[  287.781421]  snd_pcm_common_ioctl+0xacc/0x1938 [snd_pcm]
[  287.786685]  snd_pcm_ioctl+0x2c/0x48 [snd_pcm]
[  287.791101]  __arm64_sys_ioctl+0xb0/0xf0
[  287.794982]  invoke_syscall+0x44/0x108
[  287.798683]  el0_svc_common.constprop.3+0x74/0x100
[  287.803416]  do_el0_svc+0x24/0x90
[  287.806687]  el0_svc+0x20/0x60
[  287.809705]  el0t_64_sync_handler+0x94/0xb8
[  287.813839]  el0t_64_sync+0x180/0x184
And in some cases just the following:
[ 1074.212276] BUG: scheduling while atomic: aplay/12327/0x00000100
[ 1095.227509] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 1095.233443] rcu:     0-...0: (1 GPs behind) idle=4af/1/0x4000000000000004 softirq=19902/19902 fqs=2626
[ 1095.242528] rcu:     2-...0: (1 GPs behind) idle=9d5/1/0x4000000000000000 softirq=22707/22707 fqs=262
Thanks Sameer for the overnight tests, much appreciated.
My patches don't change anything related to a spinlock or pcm stream management, so not sure what could cause this rather spectacular failure. That hints at a fundamental configuration difference, possibly caused by your component chaining?
Since in your case you have a 1:1 mapping between FE and BE, would you mind testing by backtracking, one patch at a time to see which one of the three last patches could cause a problem on your board?
Thanks again for your time! -Pierre
Hi Pierre,
On 10/5/2021 6:47 PM, Pierre-Louis Bossart wrote:
My patches don't change anything related to a spinlock or pcm stream management, so not sure what could cause this rather spectacular failure. That hints at a fundamental configuration difference, possibly caused by your component chaining?
Since in your case you have a 1:1 mapping between FE and BE, would you mind testing by backtracking, one patch at a time to see which one of the three last patches could cause a problem on your board?
I tested this further. It appears that things work fine up to your 'patch 3/5'. After I apply 'patch 4/5', the "BUG: scheduling while atomic: aplay" print starts showing up and I see a hang. This failure was seen with the 2x1 mixer test itself.
'Patch 4/5' introduces mutex_lock/unlock() in dpcm_be_dai_trigger(). This seems to be the problem, since trigger() runs in atomic context depending on the PCM 'nonatomic' flag. I am not sure if your design sets the 'nonatomic' flag by default and that is why the issue is not seen at your end?
With the below (just for testing purposes), tests ran well. I was able to run the 2x1, 3x1 ... 10x1 mixer tests.
diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c
index e5df898..2ce30d1 100644
--- a/sound/soc/soc-pcm.c
+++ b/sound/soc/soc-pcm.c
@@ -2045,7 +2045,7 @@ int dpcm_be_dai_trigger(struct snd_soc_pcm_runtime *fe, int stream,
 	struct snd_soc_dpcm *dpcm;
 	int ret = 0;
 
-	mutex_lock(&fe->card->dpcm_mutex);
+	//mutex_lock(&fe->card->dpcm_mutex);
 	for_each_dpcm_be(fe, stream, dpcm) {
 		struct snd_pcm_substream *be_substream;
 
@@ -2166,7 +2166,7 @@ int dpcm_be_dai_trigger(struct snd_soc_pcm_runtime *fe, int stream,
 		}
 	}
 end:
-	mutex_unlock(&fe->card->dpcm_mutex);
+	//mutex_unlock(&fe->card->dpcm_mutex);
 	if (ret < 0)
 		dev_err(fe->dev, "ASoC: %s() failed at %s (%d)\n",
			__func__, be->dai_link->name, ret);
In fact I picked up all of your patches plus the above test patch, and it all worked fine.
Thanks, Sameer.
This is useful information, thanks a lot Sameer for your time.
Unfortunately, removing the lines below will not work; that's precisely what's needed to prevent multiple triggers from being sent to the same shared BE.
I don't have a clear idea why we see different results, and now I have even less understanding of the ALSA/ASoC/DPCM locking model. We use card->mutex, card->pcm_mutex, card->dpcm_lock, substream->self_group.mutex or lock, client_mutex, dapm_mutex....
I don't think any of the DPCM code is ever called from interrupt context, so the errors you reported look more like a card-configuration issue; specifically, the interaction with 'aplay'/userspace happens on FEs, while BEs are mostly hidden from userspace.
One possible difference is that all our DPCM solutions are based on non-atomic FEs (since they all involve an IPC with a DSP), so we would always use the top branch of these tests:
	if (nonatomic) \
		mutex_ ## mutex_action(&group->mutex); \
	else \
		spin_ ## action(&group->lock); \
I don't see this nonatomic flag set anywhere in the sound/soc/tegra code, so it's possible that in your case the spin_lock/spin_lock_irq is taken before triggering the FE, and using a mutex before the BE trigger throws errors. I think Takashi mentioned that the BEs inherit the properties of the FE to some extent.

Looking at the code, I see that only the Intel legacy, SOF and Qualcomm drivers set nonatomic=1, so maybe we can assume that the FEs for a card are either all atomic or all non-atomic; we could then use either a spinlock or a mutex. But if the FEs can be different, then I am not sure what locking model might work: we can't have a BE protected by a spinlock or a mutex depending on the property of the FE. We need one method only to guarantee mutual exclusion.
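For reference, this is roughly what setting the flag looks like on an FE dai_link (a sketch: the link itself is hypothetical, but 'dynamic', 'nonatomic' and 'trigger' are real fields of struct snd_soc_dai_link):

#include <sound/soc.h>
#include <sound/soc-dpcm.h>

/*
 * Hypothetical non-atomic DPCM front end. With .nonatomic = 1, the PCM
 * core takes the stream group mutex instead of the spinlock, so the
 * trigger path running through dpcm_be_dai_trigger() is allowed to sleep.
 */
static struct snd_soc_dai_link example_fe_link = {
	.name		= "Example FE",		/* hypothetical */
	.stream_name	= "Example FE stream",	/* hypothetical */
	.dynamic	= 1,			/* DPCM front end */
	.nonatomic	= 1,			/* trigger may sleep */
	.trigger	= { SND_SOC_DPCM_TRIGGER_POST,
			    SND_SOC_DPCM_TRIGGER_POST },
};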
Takashi, Mark, do you think that an all/none assumption on FE nonatomic properties would make sense?
On Wed, 06 Oct 2021 21:47:33 +0200, Pierre-Louis Bossart wrote:
Takashi, Mark, do you think that an all/none assumption on FE nonatomic properties would make sense?
As long as BEs are updated from the FE's PCM callback, a BE has to follow the atomicity of its FE, so we can't assume all-or-none globally.
What are the expected lock granularity and protection context? Do we need a card-global lock/mutex, or is it specific to each BE substream?
thanks,
Takashi
As long as BEs are updated from the FE's PCM callback, a BE has to follow the atomicity of its FE, so we can't assume all-or-none globally.
A BE may have more than one FE. That's precisely the point of mixers/demux, and if the BE has FEs with different 'atomicity' then I don't see how locking can work: the BE operations are run in the context of each of its FEs, typically using the following loop:
	for_each_dpcm_be(fe, stream, dpcm) {
		do_something();
	}
Applications will view multiple FEs as completely independent. They may be opened/prepared/started/stopped/paused as needed. When the BE is shared, then there is a need for consistency, such as starting the BE when the first FE becomes active and stopping it when the last FE stops.
What are the expected lock granularity and protection context? Do we need a card-global lock/mutex, or is it specific to each BE substream?
We already have a dpcm_lock at the card level, which protects the addition of BEs and BE states. That spin_lock is fine for most cases.
The only real problem is the trigger, which is currently completely unprotected: we have to serialize the BE triggers, otherwise you could STOP before you START due to scheduling, or other problems that I saw in my SoundWire tests with two START triggers, or the STOP never sent.
But how to do this serialization is unclear...
A lateral-thinking approach would be to decouple the BEs entirely and have the FEs 'signal' their changes of state. A BE 'thread', running in the BE context, would then serialize the requests and perform all the BE operations, and the same approach could be chained. I am afraid that would be a complete rewrite of DPCM, but maybe we have to do so anyway if we can't support the basic case of a mixer with 2 streams :-)
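Purely to illustrate the shape of that idea (a sketch with hypothetical names; nothing like this exists in soc-pcm.c today, and the context is assumed to be initialized with spin_lock_init(), INIT_LIST_HEAD() and INIT_WORK() at BE creation time):

#include <linux/list.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/workqueue.h>

struct be_trigger_req {
	struct list_head node;
	int cmd;			/* SNDRV_PCM_TRIGGER_* */
};

struct be_trigger_ctx {
	spinlock_t lock;		/* protects req_list */
	struct list_head req_list;
	struct work_struct work;
};

/* runs in process context; drains the queue so the BE handles one
 * trigger at a time, in the order the FEs posted them */
static void be_trigger_worker(struct work_struct *work)
{
	struct be_trigger_ctx *ctx =
		container_of(work, struct be_trigger_ctx, work);
	struct be_trigger_req *req;

	for (;;) {
		spin_lock_irq(&ctx->lock);
		req = list_first_entry_or_null(&ctx->req_list,
					       struct be_trigger_req, node);
		if (req)
			list_del(&req->node);
		spin_unlock_irq(&ctx->lock);

		if (!req)
			return;

		/* apply the requested state change to the BE here */
		kfree(req);
	}
}

/* called from the FE trigger path, atomic or not */
static int be_trigger_post(struct be_trigger_ctx *ctx, int cmd)
{
	struct be_trigger_req *req = kzalloc(sizeof(*req), GFP_ATOMIC);
	unsigned long flags;

	if (!req)
		return -ENOMEM;
	req->cmd = cmd;

	spin_lock_irqsave(&ctx->lock, flags);
	list_add_tail(&req->node, &ctx->req_list);
	spin_unlock_irqrestore(&ctx->lock, flags);

	schedule_work(&ctx->work);
	return 0;
}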
On Thu, 07 Oct 2021 15:31:49 +0200, Pierre-Louis Bossart wrote:
A BE may have more than one FE. That's precisely the point of mixers/demux, and if the BE has FEs with different 'atomicity' then I don't see how locking can work: the BE operations are run in the context of each of its FEs, typically using the following loop:
	for_each_dpcm_be(fe, stream, dpcm) {
		do_something();
	}
Do we really have cases where FEs have different atomicity? I don't think that's a valid configuration, and we should catch it via WARN_ON() or such.
We already have a dpcm_lock at the card level, which protects the addition of BEs and BE states. That spin_lock is fine for most cases.
The only real problem is the trigger, which is currently completely unprotected: we have to serialize the BE triggers, otherwise you could STOP before you START due to scheduling, or other problems that I saw in my SoundWire tests with two START triggers, or the STOP never sent.
So it's about calling triggers to the same BE stream concurrently? If that's the case, can't we simply protect the trigger handling of each BE like below?
A lateral-thinking approach would be to decouple the BEs entirely and have the FEs 'signal' their changes of state. A BE 'thread', running in the BE context, would then serialize the requests and perform all the BE operations, and the same approach could be chained. I am afraid that would be a complete rewrite of DPCM, but maybe we have to do so anyway if we can't support the basic case of a mixer with 2 streams :-)
Well, let's see whether we can get some improvements by a simple change at first.
Takashi
--- a/sound/soc/soc-pcm.c
+++ b/sound/soc/soc-pcm.c
@@ -1998,6 +1998,7 @@ int dpcm_be_dai_trigger(struct snd_soc_pcm_runtime *fe, int stream,
 {
 	struct snd_soc_pcm_runtime *be;
 	struct snd_soc_dpcm *dpcm;
+	unsigned long flags;
 	int ret = 0;
 
 	for_each_dpcm_be(fe, stream, dpcm) {
@@ -2006,9 +2007,11 @@ int dpcm_be_dai_trigger(struct snd_soc_pcm_runtime *fe, int stream,
 		be = dpcm->be;
 		be_substream = snd_soc_dpcm_get_substream(be, stream);
 
+		snd_pcm_stream_lock_irqsave(be_substream, flags);
+
 		/* is this op for this BE ? */
 		if (!snd_soc_dpcm_be_can_update(fe, be, stream))
-			continue;
+			goto unlock;
 
 		dev_dbg(be->dev, "ASoC: trigger BE %s cmd %d\n",
			be->dai_link->name, cmd);
@@ -2018,77 +2021,81 @@ int dpcm_be_dai_trigger(struct snd_soc_pcm_runtime *fe, int stream,
 			if ((be->dpcm[stream].state != SND_SOC_DPCM_STATE_PREPARE) &&
 			    (be->dpcm[stream].state != SND_SOC_DPCM_STATE_STOP) &&
 			    (be->dpcm[stream].state != SND_SOC_DPCM_STATE_PAUSED))
-				continue;
+				goto unlock;
 
 			ret = soc_pcm_trigger(be_substream, cmd);
 			if (ret)
-				goto end;
+				goto unlock;
 
 			be->dpcm[stream].state = SND_SOC_DPCM_STATE_START;
 			break;
 		case SNDRV_PCM_TRIGGER_RESUME:
 			if ((be->dpcm[stream].state != SND_SOC_DPCM_STATE_SUSPEND))
-				continue;
+				goto unlock;
 
 			ret = soc_pcm_trigger(be_substream, cmd);
 			if (ret)
-				goto end;
+				goto unlock;
 
 			be->dpcm[stream].state = SND_SOC_DPCM_STATE_START;
 			break;
 		case SNDRV_PCM_TRIGGER_PAUSE_RELEASE:
 			if ((be->dpcm[stream].state != SND_SOC_DPCM_STATE_PAUSED))
-				continue;
+				goto unlock;
 
 			ret = soc_pcm_trigger(be_substream, cmd);
 			if (ret)
-				goto end;
+				goto unlock;
 
 			be->dpcm[stream].state = SND_SOC_DPCM_STATE_START;
 			break;
 		case SNDRV_PCM_TRIGGER_STOP:
 			if ((be->dpcm[stream].state != SND_SOC_DPCM_STATE_START) &&
 			    (be->dpcm[stream].state != SND_SOC_DPCM_STATE_PAUSED))
-				continue;
+				goto unlock;
 
 			if (!snd_soc_dpcm_can_be_free_stop(fe, be, stream))
-				continue;
+				goto unlock;
 
 			ret = soc_pcm_trigger(be_substream, cmd);
 			if (ret)
-				goto end;
+				goto unlock;
 
 			be->dpcm[stream].state = SND_SOC_DPCM_STATE_STOP;
 			break;
 		case SNDRV_PCM_TRIGGER_SUSPEND:
 			if (be->dpcm[stream].state != SND_SOC_DPCM_STATE_START)
-				continue;
+				goto unlock;
 
 			if (!snd_soc_dpcm_can_be_free_stop(fe, be, stream))
-				continue;
+				goto unlock;
 
 			ret = soc_pcm_trigger(be_substream, cmd);
 			if (ret)
-				goto end;
+				goto unlock;
 
 			be->dpcm[stream].state = SND_SOC_DPCM_STATE_SUSPEND;
 			break;
 		case SNDRV_PCM_TRIGGER_PAUSE_PUSH:
 			if (be->dpcm[stream].state != SND_SOC_DPCM_STATE_START)
-				continue;
+				goto unlock;
 
 			if (!snd_soc_dpcm_can_be_free_stop(fe, be, stream))
-				continue;
+				goto unlock;
 
 			ret = soc_pcm_trigger(be_substream, cmd);
 			if (ret)
-				goto end;
+				goto unlock;
 
 			be->dpcm[stream].state = SND_SOC_DPCM_STATE_PAUSED;
 			break;
 		}
+ unlock:
+		snd_pcm_stream_unlock_irqrestore(be_substream, flags);
+		if (ret < 0)
+			break;
 	}
-end:
+
 	if (ret < 0)
 		dev_err(fe->dev, "ASoC: %s() failed at %s (%d)\n",
			__func__, be->dai_link->name, ret);
Do we really have cases where FEs have different atomicity? I don't think that's a valid configuration, and we should catch it via WARN_ON() or such.
I don't think we have this configuration today, that's why I suggested making the assumption it's an unsupported configuration.
That would allow us to use the relevant locking mechanism, as done for PCM streams.
So it's about calling triggers to the same BE stream concurrently? If that's the case, can't we simply protect the trigger handling of each BE like below?
Using snd_pcm_stream_lock_irqsave(be_substream, flags) will indeed prevent multiple triggers, but the state management is handled by dpcm_lock, so I think we have to use the dpcm_lock/mutex in all BE transitions.
	if ((be->dpcm[stream].state != SND_SOC_DPCM_STATE_PREPARE) &&
	    (be->dpcm[stream].state != SND_SOC_DPCM_STATE_STOP) &&
	    (be->dpcm[stream].state != SND_SOC_DPCM_STATE_PAUSED))
The position of the lock also matters: we may have to take the lock before walking through the list, since that list can be modified. That's what Gyeongtaek Lee reported with a separate patch; I was hoping that we could fix all BE state handling in a consistent manner.
On Thu, 07 Oct 2021 17:24:45 +0200, Pierre-Louis Bossart wrote:
Using snd_pcm_stream_lock_irqsave(be_substream, flags); will prevent multiple triggers indeed, but the state management is handled by dpcm_lock, so I think we have to use dpcm_lock/mutex in all BE transitions.
if ((be->dpcm[stream].state != SND_SOC_DPCM_STATE_PREPARE) && (be->dpcm[stream].state != SND_SOC_DPCM_STATE_STOP) && (be->dpcm[stream].state != SND_SOC_DPCM_STATE_PAUSED))
if ((be->dpcm[stream].state != SND_SOC_DPCM_STATE_PREPARE) && (be->dpcm[stream].state != SND_SOC_DPCM_STATE_STOP) && (be->dpcm[stream].state != SND_SOC_DPCM_STATE_PAUSED))
The stream lock can be put around those appropriate places (the state checks above), I suppose?
The list belongs to the FE, so possibly we can take the FE's stream lock while manipulating the list, to protect against the racy trigger access.
But beware that the extra stream lock would be needed only if nonatomic is false. In nonatomic PCM mode the stream lock is a mutex and it's used for all PCM ops. OTOH, in normal (atomic) PCM mode, the spinlock is used for trigger and a few others while the mutex is used for prepare and co, hence an extra stream lock is needed there.
Takashi
I double-checked the code a bit more, and all functions using be->dpcm[stream].state and be->dpcm[stream].users are protected by the card->mutex.
The exceptions are dpcm_be_dai_trigger() and dpcm_show_state(), so we probably don't need to worry too much about these fields.
I am more nervous about what dpcm_lock was supposed to protect. It was added in "ASoC: dpcm: prevent snd_soc_dpcm use after free" to solve, according to the commit log, a race condition between
    void dpcm_be_disconnect(...)
    {
            ...
            list_del(&dpcm->list_be);
            list_del(&dpcm->list_fe);
            kfree(dpcm);
            ...

and the for_each_dpcm_fe() / for_each_dpcm_be*() loops.
That would suggest that every one of those loops should be protected, but that's not the case at all: in some cases the spinlock is taken *inside* the loops.
I think this is what explains the problem reported by Gyeongtaek Lee in
https://lore.kernel.org/alsa-devel/002f01d7b4f5$c030f4a0$4092dde0$@samsung.c...
the for_each_dpcm_be() loop in dpcm_be_dai_trigger() is NOT protected.
But if we add a spinlock in there, the atomicity remains a problem.
I think the only solution is to follow the example of the PCM case, where the type of lock depends on the FE types, with the assumption that there are no mixed atomic/non-atomic FE configurations.
On Thu, 07 Oct 2021 20:13:22 +0200, Pierre-Louis Bossart wrote:
Yes, and I guess we can simply replace all those card->dpcm_lock uses with the FE's stream lock. It'll solve the atomicity problem too, and the FE stream lock can be applied gracefully outside the loop in dpcm_be_disconnect().
And this should solve the race with dpcm_be_dai_trigger() as well, because it's called from dpcm_fe_dai_trigger(), which is already called with the FE's stream lock held by the PCM core. A PoC is something like below. (I replaced the superfluous *_irqsave with *_irq there.)
The scenario above (using the FE stream lock) has one missing piece, though: the compress API seems to use DPCM, and this might not work there. It needs more verification.
BTW, looking through the current code, I wonder how the snd_pcm_stream_lock_irq() call in dpcm_set_fe_update_state() doesn't deadlock when nonatomic=true. The function may be called from dpcm_fe_dai_prepare(), which is the PCM prepare callback for the FE, and a PCM prepare callback is always called inside the stream mutex, which is *the* stream lock in the nonatomic case. Maybe I'm overlooking something obvious...
thanks,
Takashi
---
diff --git a/include/sound/soc.h b/include/sound/soc.h
index 8e6dd8a257c5..5872a8864f3b 100644
--- a/include/sound/soc.h
+++ b/include/sound/soc.h
@@ -893,8 +893,6 @@ struct snd_soc_card {
 	struct mutex pcm_mutex;
 	enum snd_soc_pcm_subclass pcm_subclass;
 
-	spinlock_t dpcm_lock;
-
 	int (*probe)(struct snd_soc_card *card);
 	int (*late_probe)(struct snd_soc_card *card);
 	int (*remove)(struct snd_soc_card *card);
diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c
index c830e96afba2..51ef9917ca98 100644
--- a/sound/soc/soc-core.c
+++ b/sound/soc/soc-core.c
@@ -2339,7 +2339,6 @@ int snd_soc_register_card(struct snd_soc_card *card)
 	mutex_init(&card->mutex);
 	mutex_init(&card->dapm_mutex);
 	mutex_init(&card->pcm_mutex);
-	spin_lock_init(&card->dpcm_lock);
 
 	return snd_soc_bind_card(card);
 }
diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c
index 48f71bb81a2f..c171e5f431b9 100644
--- a/sound/soc/soc-pcm.c
+++ b/sound/soc/soc-pcm.c
@@ -38,6 +38,15 @@ static inline const char *soc_codec_dai_name(struct snd_soc_pcm_runtime *rtd)
 	return (rtd)->num_codecs == 1 ? asoc_rtd_to_codec(rtd, 0)->name : "multicodec";
 }
 
+#define dpcm_fe_stream_lock_irq(fe, stream) \
+	snd_pcm_stream_lock_irq(snd_soc_dpcm_get_substream(fe, stream))
+#define dpcm_fe_stream_unlock_irq(fe, stream) \
+	snd_pcm_stream_unlock_irq(snd_soc_dpcm_get_substream(fe, stream))
+#define dpcm_fe_stream_lock_irqsave(fe, stream, flags) \
+	snd_pcm_stream_lock_irqsave(snd_soc_dpcm_get_substream(fe, stream), flags)
+#define dpcm_fe_stream_unlock_irqrestore(fe, stream, flags) \
+	snd_pcm_stream_unlock_irqrestore(snd_soc_dpcm_get_substream(fe, stream), flags)
+
 #ifdef CONFIG_DEBUG_FS
 static const char *dpcm_state_string(enum snd_soc_dpcm_state state)
 {
@@ -73,7 +82,6 @@ static ssize_t dpcm_show_state(struct snd_soc_pcm_runtime *fe,
 	struct snd_pcm_hw_params *params = &fe->dpcm[stream].hw_params;
 	struct snd_soc_dpcm *dpcm;
 	ssize_t offset = 0;
-	unsigned long flags;
 
 	/* FE state */
 	offset += scnprintf(buf + offset, size - offset,
@@ -101,7 +109,7 @@ static ssize_t dpcm_show_state(struct snd_soc_pcm_runtime *fe,
 		goto out;
 	}
 
-	spin_lock_irqsave(&fe->card->dpcm_lock, flags);
+	dpcm_fe_stream_lock_irq(fe, stream);
 	for_each_dpcm_be(fe, stream, dpcm) {
 		struct snd_soc_pcm_runtime *be = dpcm->be;
 		params = &dpcm->hw_params;
@@ -122,7 +130,7 @@ static ssize_t dpcm_show_state(struct snd_soc_pcm_runtime *fe,
 				params_channels(params),
 				params_rate(params));
 	}
-	spin_unlock_irqrestore(&fe->card->dpcm_lock, flags);
+	dpcm_fe_stream_unlock_irq(fe, stream);
 out:
 	return offset;
 }
@@ -1129,7 +1137,6 @@ static int dpcm_be_connect(struct snd_soc_pcm_runtime *fe,
 	struct snd_soc_pcm_runtime *be, int stream)
 {
 	struct snd_soc_dpcm *dpcm;
-	unsigned long flags;
 
 	/* only add new dpcms */
 	for_each_dpcm_be(fe, stream, dpcm) {
@@ -1141,14 +1148,14 @@ static int dpcm_be_connect(struct snd_soc_pcm_runtime *fe,
 	if (!dpcm)
 		return -ENOMEM;
 
+	dpcm_fe_stream_lock_irq(fe, stream);
 	dpcm->be = be;
 	dpcm->fe = fe;
 	be->dpcm[stream].runtime = fe->dpcm[stream].runtime;
 	dpcm->state = SND_SOC_DPCM_LINK_STATE_NEW;
-	spin_lock_irqsave(&fe->card->dpcm_lock, flags);
 	list_add(&dpcm->list_be, &fe->dpcm[stream].be_clients);
 	list_add(&dpcm->list_fe, &be->dpcm[stream].fe_clients);
-	spin_unlock_irqrestore(&fe->card->dpcm_lock, flags);
+	dpcm_fe_stream_unlock_irq(fe, stream);
 
 	dev_dbg(fe->dev, "connected new DPCM %s path %s %s %s\n",
 		stream ? "capture" : "playback",  fe->dai_link->name,
@@ -1191,8 +1198,8 @@ static void dpcm_be_reparent(struct snd_soc_pcm_runtime *fe,
 void dpcm_be_disconnect(struct snd_soc_pcm_runtime *fe, int stream)
 {
 	struct snd_soc_dpcm *dpcm, *d;
-	unsigned long flags;
 
+	dpcm_fe_stream_lock_irq(fe, stream);
 	for_each_dpcm_be_safe(fe, stream, dpcm, d) {
 		dev_dbg(fe->dev, "ASoC: BE %s disconnect check for %s\n",
 			stream ? "capture" : "playback",
@@ -1210,12 +1217,11 @@ void dpcm_be_disconnect(struct snd_soc_pcm_runtime *fe, int stream)
 
 		dpcm_remove_debugfs_state(dpcm);
 
-		spin_lock_irqsave(&fe->card->dpcm_lock, flags);
 		list_del(&dpcm->list_be);
 		list_del(&dpcm->list_fe);
-		spin_unlock_irqrestore(&fe->card->dpcm_lock, flags);
 		kfree(dpcm);
 	}
+	dpcm_fe_stream_unlock_irq(fe, stream);
 }
 
 /* get BE for DAI widget and stream */
@@ -1429,12 +1435,11 @@ int dpcm_process_paths(struct snd_soc_pcm_runtime *fe,
 void dpcm_clear_pending_state(struct snd_soc_pcm_runtime *fe, int stream)
 {
 	struct snd_soc_dpcm *dpcm;
-	unsigned long flags;
 
-	spin_lock_irqsave(&fe->card->dpcm_lock, flags);
+	dpcm_fe_stream_lock_irq(fe, stream);
 	for_each_dpcm_be(fe, stream, dpcm)
 		dpcm_set_be_update_state(dpcm->be, stream,
 					 SND_SOC_DPCM_UPDATE_NO);
-	spin_unlock_irqrestore(&fe->card->dpcm_lock, flags);
+	dpcm_fe_stream_unlock_irq(fe, stream);
 }
 
 void dpcm_be_dai_stop(struct snd_soc_pcm_runtime *fe, int stream,
@@ -2352,7 +2357,6 @@ static int dpcm_run_update_startup(struct snd_soc_pcm_runtime *fe, int stream)
 	struct snd_soc_dpcm *dpcm;
 	enum snd_soc_dpcm_trigger trigger = fe->dai_link->trigger[stream];
 	int ret = 0;
-	unsigned long flags;
 
 	dev_dbg(fe->dev, "ASoC: runtime %s open on FE %s\n",
 		stream ? "capture" : "playback", fe->dai_link->name);
@@ -2421,7 +2425,7 @@ static int dpcm_run_update_startup(struct snd_soc_pcm_runtime *fe, int stream)
 	dpcm_be_dai_shutdown(fe, stream);
 disconnect:
 	/* disconnect any pending BEs */
-	spin_lock_irqsave(&fe->card->dpcm_lock, flags);
+	dpcm_fe_stream_lock_irq(fe, stream);
 	for_each_dpcm_be(fe, stream, dpcm) {
 		struct snd_soc_pcm_runtime *be = dpcm->be;
@@ -2433,7 +2437,7 @@ static int dpcm_run_update_startup(struct snd_soc_pcm_runtime *fe, int stream)
 		    be->dpcm[stream].state == SND_SOC_DPCM_STATE_NEW)
 				dpcm->state = SND_SOC_DPCM_LINK_STATE_FREE;
 	}
-	spin_unlock_irqrestore(&fe->card->dpcm_lock, flags);
+	dpcm_fe_stream_unlock_irq(fe, stream);
 
 	if (ret < 0)
 		dev_err(fe->dev, "ASoC: %s() failed (%d)\n", __func__, ret);
@@ -2843,10 +2847,9 @@ static int snd_soc_dpcm_check_state(struct snd_soc_pcm_runtime *fe,
 	struct snd_soc_dpcm *dpcm;
 	int state;
 	int ret = 1;
-	unsigned long flags;
 	int i;
 
-	spin_lock_irqsave(&fe->card->dpcm_lock, flags);
+	dpcm_fe_stream_lock_irq(fe, stream);
 	for_each_dpcm_fe(be, stream, dpcm) {
 
 		if (dpcm->fe == fe)
@@ -2860,7 +2863,7 @@ static int snd_soc_dpcm_check_state(struct snd_soc_pcm_runtime *fe,
 			}
 		}
 	}
-	spin_unlock_irqrestore(&fe->card->dpcm_lock, flags);
+	dpcm_fe_stream_unlock_irq(fe, stream);
 
 	/* it's safe to do this BE DAI */
 	return ret;
No, I don't think so. The code starts from an FE and loops over all the BEs connected to that FE, but we want to serialize at the BE level! We really need a dpcm lock at the card level, not at the FE/stream level.
I am testing a quick set of changes at https://github.com/thesofproject/linux/pull/3201, where I also replaced irqsave with irq, btw.
On Thu, 07 Oct 2021 23:27:50 +0200, Pierre-Louis Bossart wrote:
The FE lock prevents the race between dpcm_be_dai_trigger() and dpcm_be_disconnect(), i.e. the problem Gyeongtaek showed.
The race among concurrent dpcm_be_dai_trigger() calls itself can be addressed by BE stream locks as suggested earlier.
Takashi
I am not following your line of thought, Takashi.
dpcm_be_disconnect() already uses a spinlock around

    list_del(&dpcm->list_be);
    list_del(&dpcm->list_fe);

and in some other places; are you suggesting we change those to the FE lock?
Otherwise, I understood your proposal as using three locks (the existing spinlock, the FE lock and the BE lock) to deal with DPCM. If the existing spinlock and the FE lock are combined, we'd still have two locks.
I was suggesting we use only one ;-)
On Fri, 08 Oct 2021 16:41:59 +0200, Pierre-Louis Bossart wrote:
Basically yes.
Stream locks are more fine-grained, hence more efficient :) The card-level spinlock is superfluous and it can go away.
Basically we need to protect two things:
- The BE links
- The concurrent accesses to BEs
The former belongs to each FE that holds the links, and the FE stream lock would cover it. The latter is rather a per-BE business.
An oft-seen risk of multiple locks is deadlocking, but in this case, as long as we keep the lock order FE->BE, nothing wrong can happen.
Takashi
Famous last words, "nothing wrong can happen" :-)
I already added a helper for this FE lock, so I can easily change the implementation to remove the spinlock and use the FE PCM lock. We might even take the lock in the definition of for_each_dpcm_be() to avoid misses.
Let me try this out today, thanks for the suggestions.
Well, it's not successful at all...
When I replace the existing dpcm_lock with the FE PCM lock as you suggested, without any additional changes, speaker-test produces valid audio on the endpoints; but if I try a Ctrl-C, or limit the number of loops with the '-l' option, I hear an endless loop on the same buffer, and I have to power-cycle my test device to stop the sound.
See the two patches attached: the first one, introducing the helper, works fine; the second one, using the FE PCM lock, doesn't. In hindsight I am glad I worked on minimal patches, one after the other, to identify problems.
And when I add the BE lock, nothing happens at all: the device is stuck and there's no audio...
There must be something we're missing related to the locking...
My work version is at https://github.com/plbossart/sound/tree/fix/dpcm-lock5 if anyone wants to take a look.
And indeed there's a deadlock!
snd_pcm_period_elapsed() takes the FE PCM stream lock and can end up calling the trigger. So if we also take the PCM stream lock in the BE trigger, there's a conceptual deadlock: we painted ourselves into a corner by taking the same lock twice.
Takashi, are you really sure we should protect these for_each_dpcm_be() loops with the same PCM lock? It seems like asking for trouble to revisit the ALSA core just to walk through a list of BEs. Would you object to changing dpcm_lock as I suggested, without interfering with the stream handling?
See the traces below based on the hack in https://github.com/plbossart/sound/tree/test/dpcm-lock-loop-on-stop
[ 67.892082] skl_hda_dsp_generic skl_hda_dsp_generic: snd_sof_pcm_period_elapsed_work: plb taking lock
[ 67.892088] skl_hda_dsp_generic skl_hda_dsp_generic: snd_sof_pcm_period_elapsed_work: plb taking lock - done
[ 67.892092] skl_hda_dsp_generic skl_hda_dsp_generic: snd_pcm_period_elapsed_under_stream_lock: start
[ 67.892096] skl_hda_dsp_generic skl_hda_dsp_generic: snd_pcm_period_elapsed_under_stream_lock: check running
[ 67.892099] skl_hda_dsp_generic skl_hda_dsp_generic: snd_pcm_period_elapsed_under_stream_lock: check update_hw_ptr0
[ 67.892103] skl_hda_dsp_generic skl_hda_dsp_generic: snd_pcm_update_hw_ptr0: start
[ 67.892110] skl_hda_dsp_generic skl_hda_dsp_generic: snd_pcm_update_hw_ptr0: delta
[ 67.892113] skl_hda_dsp_generic skl_hda_dsp_generic: snd_pcm_update_hw_ptr0: check1
[ 67.892116] skl_hda_dsp_generic skl_hda_dsp_generic: snd_pcm_update_hw_ptr0: no_delta_check
[ 67.892119] skl_hda_dsp_generic skl_hda_dsp_generic: snd_pcm_update_hw_ptr0: playback silence
[ 67.892123] skl_hda_dsp_generic skl_hda_dsp_generic: snd_pcm_update_hw_ptr0: checks 3
[ 67.892126] skl_hda_dsp_generic skl_hda_dsp_generic: snd_pcm_update_hw_ptr0: done
[ 67.892129] skl_hda_dsp_generic skl_hda_dsp_generic: snd_pcm_update_state: start
[ 67.892132] skl_hda_dsp_generic skl_hda_dsp_generic: snd_pcm_update_state: before draining
[ 67.892136] skl_hda_dsp_generic skl_hda_dsp_generic: snd_pcm_update_state: before draining2
[ 67.892139] skl_hda_dsp_generic skl_hda_dsp_generic: snd_pcm_update_state: before snd_pcm_drain_done
[ 67.892143] skl_hda_dsp_generic skl_hda_dsp_generic: snd_pcm_do_stop: start
[ 67.892147] skl_hda_dsp_generic skl_hda_dsp_generic: snd_pcm_do_stop: before TRIGGER_STOP
<<< we never reach the end of this TRIGGER_STOP
[ 67.892153] sof-audio-pci-intel-cnl 0000:00:1f.3: pcm: trigger stream 0 dir 0 cmd 0
[ 67.892166] sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx: 0x60050000: GLB_STREAM_MSG: TRIG_STOP
[ 67.892396] sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx succeeded: 0x60050000: GLB_STREAM_MSG: TRIG_STOP
[ 67.892408] sof-audio-pci-intel-cnl 0000:00:1f.3: FW Poll Status: reg[0x160]=0x140000 successful
[ 67.892418] sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx: 0x60030000: GLB_STREAM_MSG: PCM_FREE
[ 67.892564] sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx succeeded: 0x60030000: GLB_STREAM_MSG: PCM_FREE
[ 67.892571] HDA Analog: dpcm_be_dai_trigger: BE TRIGGER_STOP start
[ 67.892575] HDA Analog: dpcm_be_dai_trigger: BE TRIGGER_STOP check can_be_free_stop
[ 67.892579] HDA Analog: snd_soc_dpcm_check_state: plb: taking fe lock_irqsave from snd_soc_dpcm_check_state
<<< checkmate at this point, we're trying to take the same lock twice.
On Mon, 11 Oct 2021 22:06:51 +0200, Pierre-Louis Bossart wrote:
Indeed, this would deadlock.
The call within the FE lock is done only in dpcm_dai_trigger_fe_be(), and that one should call dpcm_be_dai_trigger() as-is. In the other places, the calls are made without the FE lock, hence they can take the lock themselves, e.g. via a variant dpcm_dai_trigger_fe_be_lock().
Changing dpcm_lock as you suggested, without interfering with the stream handling, would work too; it's just a pity to degrade the fine-grained locks that are already in place into global locks...
Takashi
On Tue, 12 Oct 2021 08:34:07 +0200, Takashi Iwai wrote:
Or rather, was it the code path of snd_soc_dpcm_check_state()? That function can be called from dpcm_be_dai_trigger(). Currently dpcm_lock seems to have been added casually, covering some of the for_each_dpcm_be() loops and whatnot... The whole lock scheme has to be revisited.
For the performance concern, at least, you can use rwlock and rw_semaphore types (depending on nonatomic) so that concurrent accesses can still proceed. The only exclusive locking is for adding and removing entries, which should be done under the write lock / down_write(), while the other link traversals can run under the read lock / down_read().
Takashi
Thanks Takashi for your feedback.
Let's first get the locking right. We can optimize performance later.
I will continue a bit more with the idea of using the existing fine-grained locks; I am with you that this dpcm_lock was not added in a consistent way, and reusing the concept is really building on sand.
We can remove the lock in snd_soc_dpcm_check_state(); I already did that change in other versions. But what I'll need to check further is whether dpcm_be_dai_trigger() is indeed called with the FE lock already taken. Adding a lockdep_assert_held() or something would help, I guess.
The part where I am still not clear is that snd_pcm_period_elapsed() uses the irqsave/irqrestore version, but in the initial code you suggested the vanilla irq version is fine; I am not sure I get the nuance here. And btw, in the case of SOF, snd_pcm_period_elapsed() is called from a workqueue, not an interrupt handler, as a work-around to avoid an IPC deadlock, so we probably don't need the irqsave/irqrestore version anyway.
On Tue, 12 Oct 2021 15:45:41 +0200, Pierre-Louis Bossart wrote:
In a nutshell:
* In the code paths that already hold the stream lock (i.e. the trigger, pointer and ack PCM callbacks), you need no extra lock for the stream itself. But if you want additional locks (e.g. for the BE), use either *_spin_lock() or *_spin_lock_irqsave(), but not *_spin_lock_irq().
* In the other code paths, use *_spin_lock_irq().
When in doubt, you can always use *_irqsave(), of course.
Takashi