Re: [PATCH] [RFC] ASoC: Conditional PCM support

22 Aug 2024

On 2024-08-21 2:43 PM, Pierre-Louis Bossart wrote:
...
On 8/21/24 12:18, Cezary Rojewski wrote:
...
Conditional PCM (condpcm) helps facilitate modern audio usecases such as
Echo Cancellation and Noise Reduction. These are not invoked by the
means of a userspace application opening an endpoint (FrontEnd) but are a
"side effect" of selected PCMs running simultaneously e.g.: if both
Speaker (source) and Microphone Array (sink) are running, reference
data from the Speaker and take it into account when processing capture
for better voice command detection ratio.
After reading the review, I want to highlight that the quality of the
response is high and that it clearly took considerable effort to write.
Thank you.
...
The point about dependencies between capture/playback usages is
certainly valid, and we've faced it multiple times for SOF - and even
before in the mobile phone days. I am not convinced however that the
graph management suggested here solves the well-known DPCM routing
problems? See notes in no specific order below.
While at it, do we (Mark perhaps?) have some kind of a list with major 
problems troubling ASoC? I keep seeing "DPCM is problematic" on the 
mailing-list. If DPCM is indeed in such bad state, perhaps we should 
address this sooner rather than later.
...
I am not following what the 'source' and 'sink' concepts refer to in
this context. It looks like you are referring to regular PCM devices,
i.e. Front Ends in soc-pcm parlance but examples and code make
references to Back Ends.
There are also complicated cases where the amplifiers can provide an
echo reference for AEC and I/V sensing for speaker protection. You would
want to capture both even if there's no capture happening at the
userspace level. This is a well-known DPCM routing issue where we have to
rely on a Front-End being opened and some tags in UCM to deal with loose
coupling.
It would help if you clarified your assumptions about where the
processing takes place. In some cases Echo cancellation is handled in
userspace, others in SOC firmware and others externally in a codec.
The notion of source/sink is also problematic when the same BE provides
two sources of information that will be split, again same problem with
amplifier feedback being used for two separate functions. What happens
if you have multiple sinks for one source?
Same for the cases where the mic input is split multiple ways with
different processing added on different PCM capture devices, e.g. for
WebRTC there's an ask for a raw input, an AEC-processed input and
AEC+NS-processed input. That's typically implemented with two splitters;
the echo reference would be used by an intermediate node inside a
firmware graph, not at the DAI/BE or host/FE levels, and such
intermediate nodes are usually not handled by soc-pcm. We really need
more than the notion of FE and BE, a two-layer solution is no longer
sufficient.
The other thing that looks weird is the dependency on both sink and
source sharing a common state. For a noise reduction there are cases
where you'd want the mic input to be stored in a history buffer so that
the noise parameters can be estimated as soon as the actual capture starts.
I'm used to an environment where most of the processing is done by the
SOC firmware, so that would be one of the design philosophies.
The reason I've opted out of using "FE/BE" is to avoid naming
confusion. FE/BEs are paired with dai_links and explicitly state the
value of the ->no_pcm flag. Condpcm does not care about that flag at all,
and a given snd_soc_pcm_runtime (rtd) instance can be utilized
simultaneously as data provider _and_ data consumer. The existing
approach allows for both BE -> BE and FE -> FE source -> sink models;
I believe this helps in the amplifier case.
You've shared many scenarios. I do not think we can cover all of them
here, and while I could agree that the current FE/BE (DPCM?) design did
not age well, we're entering the "rewrite how-to-pcm-in-linux" area.
If general opinion is:
    it's too much, we have to rewrite for the framework to scale
    into the next 20 years of audio in Linux
then my thoughts regarding current review are:
    if the avs-driver needs sideband interface, so be it, but do it
    locally rather than polluting entire framework. Switch to the
    framework-solution once it's rewritten.
...
...
Which PCMs are needed for given conditional PCM to be spawned is
determined by the driver when registering the condpcm.
Presumably such links should be described by a topology file? It would
be odd for a driver to have to guess when to connect processing elements.
Indeed, topology can help here. Of course if a driver is utilizing 
static connections, one could register condpcm using pre-defined values.
...
...
The functionality was initially proposed for the avs-driver [1] and,
depending on feedback and review may either go back into avs -or- become
an ASoC-core feature. Implementation present here is an example of how
such functionality could look and work on the ASoC side. Compared to
what was provided initially, the patch carries a simplified version of the
feature: no priority/overriding for already running conditional PCMs.
Whatever is spawned is treated as a non-conflicting entity.
Assumptions and design decisions:

existence and outcome of condpcm operations is entirely optional and
 shall not impact the runtime flow of PCMs that spawned given condpcm,
 e.g.: fail in cpcm->hw_params() shall not impact fe->hw_params() or
 be->hw_params() negatively. Think of it as of debugfs. Useful? Yes.
 Required for system to operate? No.

that's debatable, if the AEC setup isn't successful then is the
functionality implemented correctly? My take is no, don't fail silently
if the AEC doesn't work.
If this functionality is listed as a product requirement then it cannot
be treated as a debugfs optional thing.
Exhibit A for this is the countless cases where validation reported a
problem with a path remaining active or conversely not being setup, or a
voice quality issue. Those are not optional...
Well, as you mentioned, that's debatable. Perhaps an opt-in flag is
needed here - I'd rather not put all users in the same basket. Some
users may not be happy with AEC failing but would like their
speakers/mics to keep working nonetheless. Basic functionality is better
than no functionality.
Either a flag or a "return 0 if you do not care" methodology.
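
To make the opt-in idea concrete, here is a minimal userspace sketch. It is not ASoC code: struct mock_condpcm, its 'strict' field and mock_condpcm_hw_params_result() are hypothetical names made up for illustration, and the RFC itself defines no such flag. It only shows how a per-condpcm flag could decide whether a condpcm failure reaches the PCM that spawned it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a condpcm; not the struct from the RFC. */
struct mock_condpcm {
	bool strict;	/* assumed opt-in: propagate condpcm failures */
};

/*
 * Decide what the spawning PCM's hw_params() should see, given the
 * result of the condpcm's own hw_params() callback.
 */
int mock_condpcm_hw_params_result(const struct mock_condpcm *cpcm,
				  int cpcm_ret)
{
	assert(cpcm != NULL);

	if (cpcm_ret && !cpcm->strict)
		return 0;	/* best effort: speakers/mics keep working */
	return cpcm_ret;	/* strict: e.g. an AEC failure fails the stream */
}
```

With strict unset this is the "return 0 if you do not care" behaviour; with strict set, the error is handed back to the caller.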
...
...

a condpcm is a runtime entity that's audio format independent - since
 certain FE/BEs are its dependencies already, there's no need to do
 format ruling twice. Driver may still do custom checks thanks to
 ->match() operation.

a condpcm allows for additional processing of data that flows from
 data-source - a substream instance acting as data provider -
 to sink - a substream acting as data consumer. At the same time,
 regardless of substream->stream, given substream may act as data
 source for one condpcm and data sink for another, simultaneously.

while condpcm's behaviour mimics standard PCM one, there is no
 ->open() and ->close() - FE/BEs are treated as operational starting
 with successful ->hw_params(), when hw_ruling is done and hardware is
 configured.

cpcm->prepare() gets called only when both data source and sink are
 prepared

cpcm->trigger(START) gets called only when both data source and sink
 are running

cpcm->trigger(STOP) gets called when either data source or sink is
 stopped
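
The three trigger rules listed above can be condensed into one decision function. This is a userspace sketch only; enum mock_cpcm_action and mock_cpcm_trigger_action() are hypothetical names, and the actual patch may structure this differently.

```c
#include <stdbool.h>

/* Hypothetical action the core would forward to cpcm->trigger(). */
enum mock_cpcm_action {
	MOCK_CPCM_NONE,
	MOCK_CPCM_START,
	MOCK_CPCM_STOP,
};

/*
 * cpcm_running: whether the condpcm itself is currently running.
 * src_running / sink_running: endpoint states after the trigger that
 * is being serviced.
 */
enum mock_cpcm_action mock_cpcm_trigger_action(bool cpcm_running,
					       bool src_running,
					       bool sink_running)
{
	bool both = src_running && sink_running;

	if (!cpcm_running && both)
		return MOCK_CPCM_START;	/* the last missing endpoint started */
	if (cpcm_running && !both)
		return MOCK_CPCM_STOP;	/* either endpoint stopped */
	return MOCK_CPCM_NONE;		/* nothing changes for the condpcm */
}
```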


Simplified state machine:

                       |
    register_condpcm() |
                       v
                    +--+-------------+
                    |  DISCONNECTED  |<-+
                    +--+-------------+  |
                       |                |
   condpcm_hw_params() |                |
                       v                |
                    +--+-------------+  |
                    |     SETUP      |  | condpcm_hw_free()
                    +--+-------------+  |
                       |                |
     condpcm_prepare() |                |
                       v                |
                    +--+-------------+  |
                    |    PREPARED    |--+
                    +--+----------+--+
                       |          ^
       condpcm_start() |          | condpcm_stop()
                       v          |
                    +--+----------+--+
                    |    RUNNING     |
                    +----------------+
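
For readers who prefer code to ASCII art, the diagram can be written as a small transition function. This is a userspace sketch with hypothetical names (enum mock_state, enum mock_event, mock_cpcm_next()). One assumption is added on top of the diagram: hw_free is also accepted from SETUP, since a stream may be freed without ever being prepared.

```c
enum mock_state {
	MOCK_DISCONNECTED,
	MOCK_SETUP,
	MOCK_PREPARED,
	MOCK_RUNNING,
};

enum mock_event {
	MOCK_EV_HW_PARAMS,
	MOCK_EV_PREPARE,
	MOCK_EV_START,
	MOCK_EV_STOP,
	MOCK_EV_HW_FREE,
};

/* Returns the next state; an event that is not valid leaves it unchanged. */
enum mock_state mock_cpcm_next(enum mock_state s, enum mock_event ev)
{
	switch (ev) {
	case MOCK_EV_HW_PARAMS:
		return s == MOCK_DISCONNECTED ? MOCK_SETUP : s;
	case MOCK_EV_PREPARE:
		return s == MOCK_SETUP ? MOCK_PREPARED : s;
	case MOCK_EV_START:
		return s == MOCK_PREPARED ? MOCK_RUNNING : s;
	case MOCK_EV_STOP:
		return s == MOCK_RUNNING ? MOCK_PREPARED : s;
	case MOCK_EV_HW_FREE:
		/* SETUP -> DISCONNECTED is an assumption, see above. */
		if (s == MOCK_SETUP || s == MOCK_PREPARED)
			return MOCK_DISCONNECTED;
		return s;
	}
	return s;
}
```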
What's missing?
I've not covered the locking part yet. While some operations are covered
by default thanks to snd_soc_dpcm_mutex(), it is insufficient.
If the feature goes back to the avs-driver, then I'm set due to path->mutex.
The locking is one of the reasons I'm leaning towards leaving the
condpcm within the avs-driver. For soc_condpcm_find_match() to be
precise and do no harm, a lock must prepend the list_for_each_entry().
Entries (substreams) of that list may be part of number of different
components and the search may negatively impact runtime flow of
substreams that do not care about condpcms at all.
Has this been tested?
Unit-like only. Typical case below, with avs_condpcm_ops representing
a bunch of stubs with printfs.
static struct snd_soc_condpcm_pred pred1 = {
   .card_name = "ssp0-loopback",
   .link_name = "SSP0-Codec",	/* BE */
   .direction = SNDRV_PCM_STREAM_PLAYBACK,
};
static struct snd_soc_condpcm_pred pred2 = {
   .card_name = "hdaudioB0D2",
   .link_name = "HDMI1",		/* FE */
   .direction = SNDRV_PCM_STREAM_PLAYBACK,
};
It's not intuitive to follow what HDMI and SSP might have to do with
each other, nor why one is a BE and one is an FE?
If I follow the code below, the SSP loopback is a source that feeds into
an HDMI sink, and SSP is a BE and HDMI an FE? Confusing example...
The intention is to show that condpcm does not care what exactly 
source/sink represent. It just connects rtds which may lie on completely 
different sound cards. This very example will treat BE dai_link named 
"SSP0-Codec" as data source for FE dai_link named "HDMI1" which consumes 
the data (sink).
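
To illustrate how such a predicate could be tested against a substream, here is a userspace sketch. struct mock_pred mirrors the three fields of snd_soc_condpcm_pred; struct mock_substream and mock_pred_matches() are hypothetical, and the real soc_condpcm_test() (not quoted in this mail) would have to reach the card and dai_link through the rtd instead.

```c
#include <stdbool.h>
#include <string.h>

/* Same numbering as SNDRV_PCM_STREAM_PLAYBACK/CAPTURE. */
enum mock_dir { MOCK_PLAYBACK = 0, MOCK_CAPTURE = 1 };

/* Mirrors the fields of snd_soc_condpcm_pred from the patch. */
struct mock_pred {
	const char *card_name;
	const char *link_name;
	int direction;
};

/* Hypothetical flattened view of a substream and its owning card/link. */
struct mock_substream {
	const char *card_name;
	const char *link_name;
	int direction;
};

/* True if @ss satisfies every feature listed in @pred. */
bool mock_pred_matches(const struct mock_pred *pred,
		       const struct mock_substream *ss)
{
	return ss->direction == pred->direction &&
	       !strcmp(ss->card_name, pred->card_name) &&
	       !strcmp(ss->link_name, pred->link_name);
}
```

Nothing in the comparison depends on whether the link is a BE or an FE, which is the point of the example above.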
...
...
static void avs_condpcms_register(struct avs_dev *adev)
{
   (...)
   snd_soc_register_condpcm(&pred1, &pred2, &avs_condpcm_ops, adev);
}
Signed-off-by: Cezary Rojewski <cezary.rojewski@intel.com>
include/sound/pcm.h     |   1 +
  include/sound/soc.h     |  65 ++++++++
  sound/core/pcm.c        |   1 +
  sound/soc/Makefile      |   2 +-
  sound/soc/soc-condpcm.c | 348 ++++++++++++++++++++++++++++++++++++++++
  sound/soc/soc-condpcm.h |  17 ++
  sound/soc/soc-core.c    |   2 +
  sound/soc/soc-pcm.c     |  11 ++
  8 files changed, 446 insertions(+), 1 deletion(-)
  create mode 100644 sound/soc/soc-condpcm.c
  create mode 100644 sound/soc/soc-condpcm.h

diff --git a/include/sound/pcm.h b/include/sound/pcm.h
index ac8f3aef9205..7e635b3103a2 100644
--- a/include/sound/pcm.h
+++ b/include/sound/pcm.h
@@ -482,6 +482,7 @@ struct snd_pcm_substream {
   struct list_head link_list;	/* linked list member */
   struct snd_pcm_group self_group;	/* fake group for non linked substream (with substream lock inside) */
   struct snd_pcm_group *group;		/* pointer to current group */

+	struct list_head cpcm_candidate_node;

It wouldn't hurt to describe what 'candidate' might mean here?
Ack.
...
...
...
+/* Conditional PCM operations called by soc-pcm.c. */
+struct snd_soc_condpcm_ops {
+	int (*match)(struct snd_soc_condpcm *, struct snd_pcm_substream *,
+		     struct snd_pcm_substream *);
+	int (*hw_params)(struct snd_soc_condpcm *, struct snd_pcm_hw_params *);
+	int (*hw_free)(struct snd_soc_condpcm *);
+	int (*prepare)(struct snd_soc_condpcm *, struct snd_pcm_substream *);
+	int (*trigger)(struct snd_soc_condpcm *, struct snd_pcm_substream *, int);
+};



+/**
+ * struct snd_soc_condpcm_pred - Predicate, describes source or sink (substream)
+ *				 dependency for given conditional PCM.
+ *
+ * @card_name: Name of card owning substream to find.
+ * @link_name: Name of DAI LINK owning substream to find.
+ * @direction: Whether it's SNDRV_PCM_STREAM_PLAYBACK or CAPTURE.
+ */
+struct snd_soc_condpcm_pred {
+	const char *card_name;

Please tell me the runtimes and links are in the same card...
If not, there's all kinds of power management and probe/remove issues...
Well, this has been kind of mentioned by me in "What's missing?". I've
focused more on the locking part though. However, the register() and
unregister() functions are explicit; the condpcm-owning driver should be
responsible for handling the problematic pieces highlighted above.
...
...

+	const char *link_name;

dai link name?
Meh, .card_name and .link_name have the exact same length.
...
...

+	int direction;
+};



+/**
+ * struct snd_soc_condpcm - Conditional PCM descriptor.
+ *
+ * @ops: custom PCM operations.
+ * @preds: predicates for identifying source and sink for given conditional PCM.

predicate is a verb and a noun, not clear what you are trying to document.
In this context 'predicate' is a descriptor, selected set of features to 
test in order to answer the question: Does this substream match given 
condpcm?
...
...






+ * @source: substream acting as a data source, assigned at runtime.
+ * @sink: substream acting as a data sink, assigned at runtime.
+ * @state: current runtime state.

presumably this state is already defined by the state of sink/source?
The state keeps things sane - it avoids duplicate calls etc. It does not
represent the state of sink/source directly; one has to access dpcm to
obtain that information.
...
...
...
+static int soc_condpcm_hw_params(struct snd_soc_condpcm *cpcm,
+				 struct snd_pcm_hw_params *params)
+{
+	struct snd_soc_pcm_runtime *rtd = snd_soc_substream_to_rtd(cpcm->source);
+	struct snd_soc_pcm_runtime *rtd2 = snd_soc_substream_to_rtd(cpcm->sink);

how are the 'params' defined?
I read above
"
a condpcm is a runtime entity that's audio format independent - since
certain FE/BEs are its dependencies already, there's no need to do
format ruling twice.
"
That doesn't tell us how this 'params' is determined. This is important
for cases where the speaker output is e.g. 2ch 48kHz and the mic input
is 4ch 96kHz. If this condpcm is not managed by any userspace action,
then what is the logic for selecting the settings in 'params'?
soc-condpcm.c tries to be cohesive with the rest of soc-pcm.c code. I do
not see a reason to create a _new_ naming scheme, so here I just mimic
->hw_params() behaviour. The callee may choose to ignore 'params' or may 
take it into account. In regard to your question, the callee may check 
rtd->dpcm[stream].hw_params when servicing ->match(cpcm, source, sink) 
to do necessary examination.
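
As an illustration of that kind of examination, here is a userspace sketch of a ->match()-style check. struct mock_hw_params and mock_cpcm_match() are hypothetical; a real callback would read the parameters from rtd->dpcm[stream].hw_params. The policy (reject when sample rates differ) is only an assumption for the example, i.e. no sample-rate converter sits between source and sink. The return convention follows soc_condpcm_walk(): 0 accepts the pair, non-zero makes the core skip the condpcm.

```c
#include <errno.h>

/* Hypothetical reduced view of the negotiated hw_params of one endpoint. */
struct mock_hw_params {
	unsigned int rate;	/* in Hz */
	unsigned int channels;
};

/* Returns 0 to accept the source/sink pair, negative errno to reject it. */
int mock_cpcm_match(const struct mock_hw_params *source,
		    const struct mock_hw_params *sink)
{
	/* Assumed policy: no resampler available between the endpoints. */
	if (source->rate != sink->rate)
		return -EINVAL;

	/* Channel counts may differ; the processing is assumed to cope. */
	return 0;
}
```

With the 2ch 48kHz speaker and 4ch 96kHz mic from the question, this policy rejects the pair and the condpcm is simply not spawned.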
...
...

+	int ret;
+
+	ret = cpcm->ops->hw_params(cpcm, params);
+	if (ret)
+		return ret;
+
+	list_add_tail(&cpcm->source_node, &rtd->cpcm_source_list);
+	list_add_tail(&cpcm->sink_node, &rtd2->cpcm_sink_list);
+	cpcm->state = SNDRV_PCM_STATE_SETUP;
+	return 0;
+}
There's also the well-known problem that hw_params can be called
multiple times. Here this wouldn't work with the same source/sink added
multiple times in a list.
soc_condpcm_walk() invokes soc_condpcm_hw_params() only if cpcm->state 
equals DISCONNECTED. If I misunderstood, please elaborate.
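
A reduced userspace model of that guard, with hypothetical names (struct guard_cpcm, guard_walk_one()); the counter stands in for the hw_params() callback plus the two list_add_tail() calls:

```c
enum guard_state { GUARD_DISCONNECTED, GUARD_SETUP };

/* Hypothetical reduced condpcm: only what the guard needs. */
struct guard_cpcm {
	enum guard_state state;
	int connect_calls;	/* times hw_params + list_add_tail would run */
};

/* Models one iteration of the walk for a condpcm that already matched. */
void guard_walk_one(struct guard_cpcm *cpcm)
{
	if (cpcm->state != GUARD_DISCONNECTED)
		return;		/* already connected: no second list add */

	cpcm->connect_calls++;
	cpcm->state = GUARD_SETUP;
}
```

Calling guard_walk_one() twice in a row, as a repeated hw_params() would, connects only once.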
...
...



+static void soc_condpcm_hw_free(struct snd_soc_condpcm *cpcm)
+{
+	cpcm->ops->hw_free(cpcm);
+	list_del(&cpcm->source_node);
+	list_del(&cpcm->sink_node);
+	cpcm->state = SNDRV_PCM_STATE_DISCONNECTED;
+}
+
+static void soc_condpcm_prepare(struct snd_soc_condpcm *cpcm,
+				struct snd_pcm_substream *substream)
+{
+	int ret;
+
+	ret = cpcm->ops->prepare(cpcm, substream);
+	if (!ret)
+		cpcm->state = SNDRV_PCM_STATE_PREPARED;
+}
you probably need to look at the xruns and resume cases, where prepare()
is used for vastly different purposes.
My initial idea was to cut the prepare() step entirely. Perhaps that's
the way to go.
...
...
...
+static int soc_condpcm_walk(struct snd_soc_pcm_runtime *rtd,
+			    struct snd_pcm_substream *substream,
+			    struct snd_pcm_hw_params *params, int dir)
+{
+	/* Temporary source/sink cache. */
+	struct snd_pcm_substream *substreams[2];
+	struct snd_soc_condpcm *cpcm;
+	int ret;
+
+	substreams[dir] = substream;
+
+	list_for_each_entry(cpcm, &condpcm_list, node) {
+		if (cpcm->state != SNDRV_PCM_STATE_DISCONNECTED)
+			continue;
+
+		/* Does this cpcm match @substream? */
+		if (!soc_condpcm_test(cpcm, substream, dir))
+			continue;
+
+		/* Find pair for the @substream. */
+		substreams[!dir] = soc_condpcm_find_match(cpcm, substream, !dir);
+		if (!substreams[!dir])
+			continue;
+
+		/* Allow driver to have the final word. */
+		ret = cpcm->ops->match(cpcm, substreams[0], substreams[1]);
+		if (ret)
+			continue;
+		cpcm->source = substreams[0];
+		cpcm->sink = substreams[1];
+
+		ret = soc_condpcm_hw_params(cpcm, params);
+		if (ret) {
+			cpcm->source = NULL;
+			cpcm->sink = NULL;
+			return ret;
+		}
+	}
+
+	return 0;
+}



+/* Called by soc-pcm.c after each successful hw_params(). */
+int snd_soc_condpcms_walk_all(struct snd_soc_pcm_runtime *rtd,
+			      struct snd_pcm_substream *substream,
+			      struct snd_pcm_hw_params *params)
+{
+	int ret;
+
+	list_add_tail(&substream->cpcm_candidate_node, &condpcm_candidate_list);
+
+	/* Spawn all condpcms this substream is the missing source of. */
+	ret = soc_condpcm_walk(rtd, substream, params, SNDRV_PCM_STREAM_CAPTURE);
+	if (ret)
+		return ret;
+
+	/* Spawn all condpcms this substream is the missing sink of. */
+	return soc_condpcm_walk(rtd, substream, params, SNDRV_PCM_STREAM_PLAYBACK);
+}
Are loops supported?
Is the order between capture and playback intentional?
Is the notion of playback/capture even relevant when trying to add
loopbacks?
Lots of questions...
The order selected here is more of a habit of mine, not a must-be.
Loopback scenario implies a real capture endpoint which is sourced from 
certain playback stream. If that's the ask, yes, it's one of the usecases.