On 2024-08-21 2:43 PM, Pierre-Louis Bossart wrote:
On 8/21/24 12:18, Cezary Rojewski wrote:
Conditional PCM (condpcm) helps facilitate modern audio usecases such as Echo Cancellation and Noise Reduction. These are not invoked by means of a userspace application opening an endpoint (FrontEnd) but are a "side effect" of selected PCMs running simultaneously, e.g.: if both Speaker (source) and Microphone Array (sink) are running, take reference data from the Speaker into account when processing capture for a better voice command detection ratio.
After reading the review, I want to highlight that the quality of the response is high and that it clearly took considerable effort to write. Thank you.
The point about dependencies between capture/playback usages is certainly valid, and we've faced it multiple times for SOF - and even before in the mobile phone days. I am not convinced however that the graph management suggested here solves the well-known DPCM routing problems? See notes in no specific order below.
While at it, do we (Mark perhaps?) have some kind of a list with major problems troubling ASoC? I keep seeing "DPCM is problematic" on the mailing-list. If DPCM is indeed in such a bad state, perhaps we should address this sooner rather than later.
I am not following what the 'source' and 'sink' concepts refer to in this context. It looks like you are referring to regular PCM devices, i.e. Front Ends in soc-pcm parlance but examples and code make references to Back Ends.
There are also complicated cases where the amplifiers can provide an echo reference for AEC and I/V sensing for speaker protection. You would want to capture both even if there's no capture happening at the userspace level. This is a well-known DPCM routing issue where we have to rely on a Front-End being opened and some tags in UCM to deal with loose coupling.
It would help if you clarified your assumptions about where the processing takes place. In some cases Echo cancellation is handled in userspace, in others in SOC firmware and in others externally in a codec.
The notion of source/sink is also problematic when the same BE provides two sources of information that will be split, again same problem with amplifier feedback being used for two separate functions. What happens if you have multiple sinks for one source?
Same for the cases where the mic input is split multiple ways with different processing added on different PCM capture devices, e.g. for WebRTC there's an ask for a raw input, an AEC-processed input and AEC+NS-processed input. That's typically implemented with two splitters, the echo reference would be used by an intermediate node inside a firmware graph, not at the DAI/BE or host/FE levels, and such intermediate nodes are usually not handled by soc-pcm. We really need more than the notion of FE and BE, a two-layer solution is no longer sufficient.
The other thing that looks weird is the dependency on both sink and source sharing a common state. For a noise reduction there are cases where you'd want the mic input to be stored in a history buffer so that the noise parameters can be estimated as soon as the actual capture starts.
I'm used to an environment where most of the processing is done by the SOC firmware so that would be one of the design philosophies.
The reason I've opted out from using "FE/BE" is to avoid naming confusion. FE/BEs are paired with dai_links and explicitly state the value of the ->no_pcm flag. Condpcm does not care about that flag at all, and a given snd_soc_pcm_runtime (rtd) instance can be utilized simultaneously as data provider _and_ data consumer. The existing approach allows for both BE -> BE and FE -> FE source -> sink models; I believe this helps in the amplifier case.
You've shared many scenarios, I do not think we can cover all of them here and while I could agree that the current FE/BE (DPCM?) design did not age well, we're entering "rewrite how-to-pcm-in-linux" area. If the general opinion is "it's too much, we have to rewrite for the framework to scale into the next 20 years of audio in Linux", then my thoughts regarding the current review are: if the avs-driver needs a sideband interface, so be it, but do it locally rather than polluting the entire framework. Switch to the framework solution once it's rewritten.
Which PCMs are needed for a given conditional PCM to be spawned is determined by the driver when registering the condpcm.
Presumably such links should be described by a topology file? It would be odd for a driver to have to guess when to connect processing elements.
Indeed, topology can help here. Of course if a driver is utilizing static connections, one could register condpcm using pre-defined values.
The functionality was initially proposed for the avs-driver [1] and, depending on feedback and review, may either go back into avs -or- become an ASoC-core feature. The implementation present here is an example of how such functionality could look and work on the ASoC side. Compared to what was provided initially, the patch carries a simplified version of the feature: no priority/overriding for already running conditional PCMs. Whatever is spawned is treated as a non-conflicting entity.
Assumptions and design decisions:
- existence and outcome of condpcm operations is entirely optional and shall not impact the runtime flow of PCMs that spawned given condpcm, e.g.: fail in cpcm->hw_params() shall not impact fe->hw_params() or be->hw_params() negatively. Think of it as of debugfs. Useful? Yes. Required for system to operate? No.
that's debatable, if the AEC setup isn't successful then is the functionality implemented correctly? My take is no, don't fail silently if the AEC doesn't work.
If this functionality is listed as a product requirement then it cannot be treated as a debugfs optional thing.
Exhibit A for this is the countless cases where validation reported a problem with a path remaining active or conversely not being setup, or a voice quality issue. Those are not optional...
Well, as you mentioned, that's debatable. Perhaps an opt-in flag is needed here - I'd like not to put all the users into the same basket. Some users may not be happy with AEC failing but would like their speakers/mics to keep working nonetheless. Basic functionality is better than no functionality. Either a flag or a 'return 0 if you do not care' methodology.
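To make the opt-in idea concrete, here is a minimal userspace sketch. It is not part of the patch: struct cpcm_policy, its strict field and cpcm_filter_error() are hypothetical names used only for illustration. It shows one way a per-condpcm flag could decide whether a condpcm failure propagates to the PCM that spawned it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-condpcm policy; no such field exists in the patch. */
struct cpcm_policy {
	bool strict;	/* true: condpcm errors fail the spawning PCM */
};

/*
 * Decide what the spawning PCM sees when a condpcm operation returned @err.
 * Non-strict condpcms swallow the error so basic playback/capture keeps
 * working; strict ones pass it through unchanged.
 */
static int cpcm_filter_error(const struct cpcm_policy *policy, int err)
{
	assert(policy);

	if (!err)
		return 0;
	return policy->strict ? err : 0;
}
```

With strict left unset, this behaves like the 'return 0' approach; a product that lists AEC as a requirement would set it.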
- a condpcm is a runtime entity that's audio format independent - since certain FE/BEs are its dependencies already, there's no need to do format ruling twice. Driver may still do custom checks thanks to ->match() operation.
- a condpcm allows for additional processing of data that flows from data-source - a substream instance acting as data provider - to sink - a substream acting as data consumer. At the same time, regardless of substream->stream, given substream may act as data source for one condpcm and data sink for another, simultaneously.
- while condpcm's behaviour mimics standard PCM one, there is no ->open() and ->close() - FE/BEs are treated as operational starting with successful ->hw_params(), when hw_ruling is done and hardware is configured.
- cpcm->prepare() gets called only when both data source and sink are prepared
- cpcm->trigger(START) gets called only when both data source and sink are running
- cpcm->trigger(STOP) gets called when either data source or sink is stopped
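The three prepare/trigger rules above can be modelled in a few lines of plain C. This is a userspace sketch, not code from the patch: enum ep_state, struct cpcm_model and cpcm_update() are illustrative names. It assumes the endpoint states are reported one change at a time.

```c
#include <assert.h>
#include <stdbool.h>

/* Endpoint (source or sink) state; the ordering matters for the >= test. */
enum ep_state { EP_IDLE, EP_PREPARED, EP_RUNNING };

enum cpcm_action { CPCM_NONE, CPCM_DO_PREPARE, CPCM_DO_START, CPCM_DO_STOP };

struct cpcm_model {
	bool prepared;
	bool running;
};

/* Return the condpcm operation to invoke for the new endpoint states. */
static enum cpcm_action cpcm_update(struct cpcm_model *c,
				    enum ep_state source, enum ep_state sink)
{
	bool both_prepared = source >= EP_PREPARED && sink >= EP_PREPARED;
	bool both_running = source == EP_RUNNING && sink == EP_RUNNING;

	assert(c);

	if (!both_prepared)
		c->prepared = false;

	/* STOP as soon as either endpoint leaves the running state. */
	if (c->running && !both_running) {
		c->running = false;
		return CPCM_DO_STOP;
	}
	/* PREPARE only once both endpoints are at least prepared. */
	if (!c->prepared && both_prepared) {
		c->prepared = true;
		return CPCM_DO_PREPARE;
	}
	/* START only once both endpoints are running. */
	if (c->prepared && !c->running && both_running) {
		c->running = true;
		return CPCM_DO_START;
	}
	return CPCM_NONE;
}
```

Note the asymmetry: START needs both endpoints, STOP needs only one.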
Simplified state machine:
                       |
    register_condpcm() |
                       v
                    +--+-------------+
                    |  DISCONNECTED  |<-+
                    +--+-------------+  |
                       |                |
   condpcm_hw_params() |                |
                       v                |
                    +--+-------------+  |
                    |     SETUP      |  | condpcm_hw_free()
                    +--+-------------+  |
                       |                |
     condpcm_prepare() |                |
                       v                |
                    +--+----+--------+  |
                    |    PREPARED    |--+
                    +--+----------+--+
                       |          ^
       condpcm_start() |          | condpcm_stop()
                       v          |
                    +--+----------+--+
                    |    RUNNING     |
                    +----------------+
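The state machine can also be written down as a small transition function. The following is a userspace model only; the enum and function names are illustrative and do not appear in the patch. One assumption is labeled in the code: hw_free is accepted from both SETUP and PREPARED, which is one reading of the diagram.

```c
#include <assert.h>
#include <stdbool.h>

enum cpcm_state {
	CPCM_DISCONNECTED,
	CPCM_SETUP,
	CPCM_PREPARED,
	CPCM_RUNNING,
};

enum cpcm_event {
	CPCM_EV_HW_PARAMS,
	CPCM_EV_PREPARE,
	CPCM_EV_START,
	CPCM_EV_STOP,
	CPCM_EV_HW_FREE,
};

/* Apply @ev to @state; return true only if a transition took place. */
static bool cpcm_step(enum cpcm_state *state, enum cpcm_event ev)
{
	assert(state);

	switch (ev) {
	case CPCM_EV_HW_PARAMS:
		if (*state != CPCM_DISCONNECTED)
			return false;
		*state = CPCM_SETUP;
		return true;
	case CPCM_EV_PREPARE:
		if (*state != CPCM_SETUP)
			return false;
		*state = CPCM_PREPARED;
		return true;
	case CPCM_EV_START:
		if (*state != CPCM_PREPARED)
			return false;
		*state = CPCM_RUNNING;
		return true;
	case CPCM_EV_STOP:
		if (*state != CPCM_RUNNING)
			return false;
		*state = CPCM_PREPARED;
		return true;
	case CPCM_EV_HW_FREE:
		/* Assumption: allowed from SETUP and PREPARED alike. */
		if (*state != CPCM_SETUP && *state != CPCM_PREPARED)
			return false;
		*state = CPCM_DISCONNECTED;
		return true;
	}
	return false;
}
```

Rejecting an event that does not fit the current state is what keeps a repeated hw_params() from being serviced twice in this model.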
What's missing? I've not covered the locking part yet. While some operations are covered by default thanks to snd_soc_dpcm_mutex(), it is insufficient. If the feature goes back to the avs-driver, then I'm set due to path->mutex.
The locking is one of the reasons I'm leaning towards leaving the condpcm within the avs-driver. For soc_condpcm_find_match() to be precise and do no harm, a lock must be taken before the list_for_each_entry(). Entries (substreams) of that list may be part of a number of different components and the search may negatively impact the runtime flow of substreams that do not care about condpcms at all.
Has this been tested?
Unit-like only. Typical case below with avs_condpcm_ops representing a bunch of stubs with printfs.

static struct snd_soc_condpcm_pred pred1 = {
	.card_name = "ssp0-loopback",
	.link_name = "SSP0-Codec", /* BE */
	.direction = SNDRV_PCM_STREAM_PLAYBACK,
};

static struct snd_soc_condpcm_pred pred2 = {
	.card_name = "hdaudioB0D2",
	.link_name = "HDMI1", /* FE */
	.direction = SNDRV_PCM_STREAM_PLAYBACK,
};
It's not intuitive to follow what HDMI and SSP might have to do with each other, nor why one is a BE and one is an FE?
If I follow the code below, the SSP loopback is a source that feeds into an HDMI sink, and SSP is a BE and HDMI an FE? Confusing example...
The intention is to show that condpcm does not care what exactly source/sink represent. It just connects rtds which may lie on completely different sound cards. This very example will treat BE dai_link named "SSP0-Codec" as data source for FE dai_link named "HDMI1" which consumes the data (sink).
static void avs_condpcms_register(struct avs_dev *adev)
{
	(...)
	snd_soc_register_condpcm(&pred1, &pred2, &avs_condpcm_ops, adev);
}
Signed-off-by: Cezary Rojewski <cezary.rojewski@intel.com>
 include/sound/pcm.h     |   1 +
 include/sound/soc.h     |  65 ++++++++
 sound/core/pcm.c        |   1 +
 sound/soc/Makefile      |   2 +-
 sound/soc/soc-condpcm.c | 348 ++++++++++++++++++++++++++++++++++++++++
 sound/soc/soc-condpcm.h |  17 ++
 sound/soc/soc-core.c    |   2 +
 sound/soc/soc-pcm.c     |  11 ++
 8 files changed, 446 insertions(+), 1 deletion(-)
 create mode 100644 sound/soc/soc-condpcm.c
 create mode 100644 sound/soc/soc-condpcm.h
diff --git a/include/sound/pcm.h b/include/sound/pcm.h
index ac8f3aef9205..7e635b3103a2 100644
--- a/include/sound/pcm.h
+++ b/include/sound/pcm.h
@@ -482,6 +482,7 @@ struct snd_pcm_substream {
 	struct list_head link_list;	/* linked list member */
 	struct snd_pcm_group self_group;	/* fake group for non linked substream (with substream lock inside) */
 	struct snd_pcm_group *group;		/* pointer to current group */
+	struct list_head cpcm_candidate_node;
It wouldn't hurt to describe what 'candidate' might mean here?
Ack.
...
+/* Conditional PCM operations called by soc-pcm.c. */
+struct snd_soc_condpcm_ops {
+	int (*match)(struct snd_soc_condpcm *, struct snd_pcm_substream *,
+		     struct snd_pcm_substream *);
+	int (*hw_params)(struct snd_soc_condpcm *, struct snd_pcm_hw_params *);
+	int (*hw_free)(struct snd_soc_condpcm *);
+	int (*prepare)(struct snd_soc_condpcm *, struct snd_pcm_substream *);
+	int (*trigger)(struct snd_soc_condpcm *, struct snd_pcm_substream *, int);
+};
+/**
+ * struct snd_soc_condpcm_pred - Predicate, describes source or sink (substream)
+ *				 dependency for given conditional PCM.
+ * @card_name: Name of card owning substream to find.
+ * @link_name: Name of DAI LINK owning substream to find.
+ * @direction: Whether its SNDRV_PCM_STREAM_PLAYBACK or CAPTURE.
+ */
+struct snd_soc_condpcm_pred {
+	const char *card_name;
Please tell me the runtimes and links are in the same card... If not, there's all kinds of power management and probe/remove issues...
Well, this has been kind of mentioned by me in "What's missing?". I've focused more on the locking part though. However, register() and unregister() functions are explicit, the condpcm-owning driver should be responsible for handling the problematic pieces highlighted above.
+	const char *link_name;
dai link name?
Meh, .card_name and .link_name have the exact same length.
+	int direction;
+};
+/**
+ * struct snd_soc_condpcm - Conditional PCM descriptor.
+ * @ops: custom PCM operations.
+ * @preds: predicates for identifying source and sink for given conditional PCM.
predicate is a verb and a noun, not clear what you are trying to document.
In this context 'predicate' is a descriptor: a selected set of features to test in order to answer the question: Does this substream match a given condpcm?
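As an illustration of that question, here is a minimal userspace sketch of a predicate test. struct model_pred mirrors the three fields of snd_soc_condpcm_pred; struct model_substream and model_pred_test() are hypothetical stand-ins and not the patch's soc_condpcm_test().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Same numeric values as SNDRV_PCM_STREAM_PLAYBACK / _CAPTURE. */
enum { MODEL_STREAM_PLAYBACK = 0, MODEL_STREAM_CAPTURE = 1 };

/* Mirrors the fields of snd_soc_condpcm_pred. */
struct model_pred {
	const char *card_name;
	const char *link_name;
	int direction;
};

/* Hypothetical flattened view of what a substream belongs to. */
struct model_substream {
	const char *card_name;
	const char *link_name;
	int direction;
};

/* Does @sub satisfy every feature listed in @pred? */
static bool model_pred_test(const struct model_pred *pred,
			    const struct model_substream *sub)
{
	assert(pred && sub);

	return pred->direction == sub->direction &&
	       !strcmp(pred->card_name, sub->card_name) &&
	       !strcmp(pred->link_name, sub->link_name);
}
```

A condpcm carries two such predicates, one for its source and one for its sink.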
+ * @source: substreaming acting as a data source, assigned at runtime.
+ * @sink: substreaming acting as a data sink, assigned at runtime.
+ * @state: current runtime state.
presumably this state is already defined by the state of sink/source?
The state keeps things sane - avoids duplicate calls etc. It does not represent the state of sink/source directly, one has to access dpcm to obtain that information.
...
+static int soc_condpcm_hw_params(struct snd_soc_condpcm *cpcm,
+				 struct snd_pcm_hw_params *params)
+{
+	struct snd_soc_pcm_runtime *rtd = snd_soc_substream_to_rtd(cpcm->source);
+	struct snd_soc_pcm_runtime *rtd2 = snd_soc_substream_to_rtd(cpcm->sink);
how are the 'params' defined?
I read above
" a condpcm is a runtime entity that's audio format independent - since certain FE/BEs are its dependencies already, that's no need to do format ruling twice. "
That doesn't tell us how this 'params' is determined. This is important for cases where the speaker output is e.g. 2ch 48kHz and the mic input is 4ch 96kHz. If this condpcm is not managed by any userspace action, then what is the logic for selecting the settings in 'params'?
soc-condpcm.c tries to be cohesive with the rest of the soc-pcm.c code. I do not see a reason to create a _new_ naming scheme, so here I just mimic ->hw_params() behaviour. The callee may choose to ignore 'params' or may take it into account. In regard to your question, the callee may check rtd->dpcm[stream].hw_params when servicing ->match(cpcm, source, sink) to do the necessary examination.
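A sketch of what such an examination in ->match() might look like, in plain userspace C. struct model_hw_params and model_match() are hypothetical, and the policy (reject mismatched rates) is an arbitrary example rather than anything the patch mandates. The return convention follows soc_condpcm_walk(): zero accepts the pair, non-zero skips it.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical, reduced stand-in for the per-stream hardware parameters. */
struct model_hw_params {
	unsigned int rate;	/* in Hz */
	unsigned int channels;
};

/*
 * Example policy only: accept the source/sink pair when both run at the
 * same rate, so the condpcm needs no sample-rate conversion of its own.
 * Returns 0 to accept and a negative errno to reject.
 */
static int model_match(const struct model_hw_params *source,
		       const struct model_hw_params *sink)
{
	assert(source && sink);

	if (!source->channels || !sink->channels)
		return -EINVAL;
	if (source->rate != sink->rate)
		return -EINVAL;
	return 0;
}
```

Under this policy, the 2ch 48kHz speaker and 4ch 96kHz microphone pair from the question would be rejected, and the condpcm would simply not be spawned.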
+	int ret;
+	ret = cpcm->ops->hw_params(cpcm, params);
+	if (ret)
+		return ret;
+	list_add_tail(&cpcm->source_node, &rtd->cpcm_source_list);
+	list_add_tail(&cpcm->sink_node, &rtd2->cpcm_sink_list);
+	cpcm->state = SNDRV_PCM_STATE_SETUP;
+	return 0;
+}
There's also the well-known problem that hw_params can be called multiple times. Here this wouldn't work with the same source/sink added multiple times in a list.
soc_condpcm_walk() invokes soc_condpcm_hw_params() only if cpcm->state equals DISCONNECTED. If I misunderstood, please elaborate.
+static void soc_condpcm_hw_free(struct snd_soc_condpcm *cpcm)
+{
+	cpcm->ops->hw_free(cpcm);
+	list_del(&cpcm->source_node);
+	list_del(&cpcm->sink_node);
+	cpcm->state = SNDRV_PCM_STATE_DISCONNECTED;
+}
+static void soc_condpcm_prepare(struct snd_soc_condpcm *cpcm,
+				struct snd_pcm_substream *substream)
+{
+	int ret;
+	ret = cpcm->ops->prepare(cpcm, substream);
+	if (!ret)
+		cpcm->state = SNDRV_PCM_STATE_PREPARED;
+}
you probably need to look at the xruns and resume cases, where prepare() is used for vastly different purposes.
My initial idea was to cut the prepare() step entirely. Perhaps that's the way to go.
...
+static int soc_condpcm_walk(struct snd_soc_pcm_runtime *rtd,
+			    struct snd_pcm_substream *substream,
+			    struct snd_pcm_hw_params *params, int dir)
+{
+	/* Temporary source/sink cache. */
+	struct snd_pcm_substream *substreams[2];
+	struct snd_soc_condpcm *cpcm;
+	int ret;
+	substreams[dir] = substream;
+	list_for_each_entry(cpcm, &condpcm_list, node) {
+		if (cpcm->state != SNDRV_PCM_STATE_DISCONNECTED)
+			continue;
+		/* Does this cpcm match @substream? */
+		if (!soc_condpcm_test(cpcm, substream, dir))
+			continue;
+		/* Find pair for the @substream. */
+		substreams[!dir] = soc_condpcm_find_match(cpcm, substream, !dir);
+		if (!substreams[!dir])
+			continue;
+		/* Allow driver to have the final word. */
+		ret = cpcm->ops->match(cpcm, substreams[0], substreams[1]);
+		if (ret)
+			continue;
+		cpcm->source = substreams[0];
+		cpcm->sink = substreams[1];
+		ret = soc_condpcm_hw_params(cpcm, params);
+		if (ret) {
+			cpcm->source = NULL;
+			cpcm->sink = NULL;
+			return ret;
+		}
+	}
+	return 0;
+}
+/* Called by soc-pcm.c after each successful hw_params(). */
+int snd_soc_condpcms_walk_all(struct snd_soc_pcm_runtime *rtd,
+			      struct snd_pcm_substream *substream,
+			      struct snd_pcm_hw_params *params)
+{
+	int ret;
+	list_add_tail(&substream->cpcm_candidate_node, &condpcm_candidate_list);
+	/* Spawn all condpcms this substream is the missing source of. */
+	ret = soc_condpcm_walk(rtd, substream, params, SNDRV_PCM_STREAM_CAPTURE);
+	if (ret)
+		return ret;
+	/* Spawn all condpcms this substream is the missing sink of. */
+	return soc_condpcm_walk(rtd, substream, params, SNDRV_PCM_STREAM_PLAYBACK);
+}
Are loops supported? Is the order between capture and playback intentional? Is the notion of playback/capture even relevant when trying to add loopbacks?
Lots of questions...
The order selected here is more of a habit of mine, not a must-be. A loopback scenario implies a real capture endpoint which is sourced from a certain playback stream. If that's the ask, yes, it's one of the usecases.