[alsa-devel] [PATCH] ALSA: PCM: check if ops are defined before suspending PCM
From: Ranjani Sridharan ranjani.sridharan@linux.intel.com
BE dai links only have internal PCM's and their substream ops may not be set. Suspending these PCM's will result in their ops->trigger() being invoked and cause a kernel oops. So skip suspending PCM's if their ops are NULL.
Signed-off-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com --- sound/core/pcm_native.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c index 818dff1de545..b6e158ce6650 100644 --- a/sound/core/pcm_native.c +++ b/sound/core/pcm_native.c @@ -1506,6 +1506,14 @@ int snd_pcm_suspend_all(struct snd_pcm *pcm) /* FIXME: the open/close code should lock this as well */ if (substream->runtime == NULL) continue; + + /* + * Skip BE dai link PCM's that are internal and may + * not have their substream ops set. + */ + if (!substream->ops) + continue; + err = snd_pcm_suspend(substream); if (err < 0 && err != -EBUSY) return err;
On Sat, 09 Feb 2019 00:29:53 +0100, Pierre-Louis Bossart wrote:
From: Ranjani Sridharan ranjani.sridharan@linux.intel.com
BE dai links only have internal PCM's and their substream ops may not be set. Suspending these PCM's will result in their ops->trigger() being invoked and cause a kernel oops. So skip suspending PCM's if their ops are NULL.
Signed-off-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com
sound/core/pcm_native.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c index 818dff1de545..b6e158ce6650 100644 --- a/sound/core/pcm_native.c +++ b/sound/core/pcm_native.c @@ -1506,6 +1506,14 @@ int snd_pcm_suspend_all(struct snd_pcm *pcm) /* FIXME: the open/close code should lock this as well */ if (substream->runtime == NULL) continue;
/*
* Skip BE dai link PCM's that are internal and may
* not have their substream ops set.
*/
if (!substream->ops)
continue;
err = snd_pcm_suspend(substream); if (err < 0 && err != -EBUSY) return err;
Basically it's OK and safe to apply this check. We may need to add such sanity checks in more places if this really hits.
But I still wonder how this can go through. Is substream->runtime set even if substream->ops is NULL? The substream->runtime is assigned dynamically at opening a substream via snd_pcm_attach_substream(), so without opening it, it must be NULL.
thanks,
Takashi
On 2/9/19 3:27 AM, Takashi Iwai wrote:
On Sat, 09 Feb 2019 00:29:53 +0100, Pierre-Louis Bossart wrote:
From: Ranjani Sridharan ranjani.sridharan@linux.intel.com
BE dai links only have internal PCM's and their substream ops may not be set. Suspending these PCM's will result in their ops->trigger() being invoked and cause a kernel oops. So skip suspending PCM's if their ops are NULL.
Signed-off-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com
sound/core/pcm_native.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c index 818dff1de545..b6e158ce6650 100644 --- a/sound/core/pcm_native.c +++ b/sound/core/pcm_native.c @@ -1506,6 +1506,14 @@ int snd_pcm_suspend_all(struct snd_pcm *pcm) /* FIXME: the open/close code should lock this as well */ if (substream->runtime == NULL) continue;
/*
* Skip BE dai link PCM's that are internal and may
* not have their substream ops set.
*/
if (!substream->ops)
continue;
err = snd_pcm_suspend(substream); if (err < 0 && err != -EBUSY) return err;
Basically it's OK and safe to apply this check. We may need to add such sanity checks in more places if this really hits.
But I still wonder how this can go through. Is substream->runtime set even if substream->ops is NULL? The substream->runtime is assigned dynamically at opening a substream via snd_pcm_attach_substream(), so without opening it, it must be NULL.
This error case was exposed when we tried to get rid of snd_pcm_suspend() per your recommendation, and use snd_soc_suspend() instead to do the work for us.
In the case of back-ends, all initializations are bypassed in soc_new_pcm() - see below a code snippet - and the ops aren't set before suspend is called.
The complete thread where we discussed this is at https://github.com/thesofproject/linux/pull/582
if (rtd->dai_link->no_pcm) { if (playback) pcm->streams[SNDRV_PCM_STREAM_PLAYBACK].substream->private_data = rtd; if (capture) pcm->streams[SNDRV_PCM_STREAM_CAPTURE].substream->private_data = rtd; goto out; //<<<< this bypasses all the ops initializations!! }
/* ASoC PCM operations */ if (rtd->dai_link->dynamic) { rtd->ops.open = dpcm_fe_dai_open; rtd->ops.hw_params = dpcm_fe_dai_hw_params; rtd->ops.prepare = dpcm_fe_dai_prepare; rtd->ops.trigger = dpcm_fe_dai_trigger; rtd->ops.hw_free = dpcm_fe_dai_hw_free; rtd->ops.close = dpcm_fe_dai_close; rtd->ops.pointer = soc_pcm_pointer; rtd->ops.ioctl = soc_pcm_ioctl; } else { rtd->ops.open = soc_pcm_open; rtd->ops.hw_params = soc_pcm_hw_params; rtd->ops.prepare = soc_pcm_prepare; rtd->ops.trigger = soc_pcm_trigger; rtd->ops.hw_free = soc_pcm_hw_free; rtd->ops.close = soc_pcm_close; rtd->ops.pointer = soc_pcm_pointer; rtd->ops.ioctl = soc_pcm_ioctl; }
for_each_rtdcom(rtd, rtdcom) { const struct snd_pcm_ops *ops = rtdcom->component->driver->ops;
if (!ops) continue;
if (ops->ack) rtd->ops.ack = soc_rtdcom_ack; if (ops->copy_user) rtd->ops.copy_user = soc_rtdcom_copy_user; if (ops->copy_kernel) rtd->ops.copy_kernel = soc_rtdcom_copy_kernel; if (ops->fill_silence) rtd->ops.fill_silence = soc_rtdcom_fill_silence; if (ops->page) rtd->ops.page = soc_rtdcom_page; if (ops->mmap) rtd->ops.mmap = soc_rtdcom_mmap; }
if (playback) snd_pcm_set_ops(pcm, SNDRV_PCM_STREAM_PLAYBACK, &rtd->ops);
if (capture) snd_pcm_set_ops(pcm, SNDRV_PCM_STREAM_CAPTURE, &rtd->ops);
for_each_rtdcom(rtd, rtdcom) { component = rtdcom->component;
if (!component->driver->pcm_new) continue;
ret = component->driver->pcm_new(rtd); if (ret < 0) { dev_err(component->dev, "ASoC: pcm constructor failed: %d\n", ret); return ret; } }
pcm->private_free = soc_pcm_private_free; out: dev_info(rtd->card->dev, "%s <-> %s mapping ok\n", (rtd->num_codecs > 1) ? "multicodec" : rtd->codec_dai->name, cpu_dai->name); return ret;
On Mon, 11 Feb 2019 16:41:31 +0100, Pierre-Louis Bossart wrote:
On 2/9/19 3:27 AM, Takashi Iwai wrote:
On Sat, 09 Feb 2019 00:29:53 +0100, Pierre-Louis Bossart wrote:
From: Ranjani Sridharan ranjani.sridharan@linux.intel.com
BE dai links only have internal PCM's and their substream ops may not be set. Suspending these PCM's will result in their ops->trigger() being invoked and cause a kernel oops. So skip suspending PCM's if their ops are NULL.
Signed-off-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com
sound/core/pcm_native.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c index 818dff1de545..b6e158ce6650 100644 --- a/sound/core/pcm_native.c +++ b/sound/core/pcm_native.c @@ -1506,6 +1506,14 @@ int snd_pcm_suspend_all(struct snd_pcm *pcm) /* FIXME: the open/close code should lock this as well */ if (substream->runtime == NULL) continue;
/*
* Skip BE dai link PCM's that are internal and may
* not have their substream ops set.
*/
if (!substream->ops)
continue;
err = snd_pcm_suspend(substream); if (err < 0 && err != -EBUSY) return err;
Basically it's OK and safe to apply this check. We may need to add such sanity checks in more places if this really hits.
But I still wonder how this can go through. Is substream->runtime set even if substream->ops is NULL? The substream->runtime is assigned dynamically at opening a substream via snd_pcm_attach_substream(), so without opening it, it must be NULL.
This error case was exposed when we tried to get rid of snd_pcm_suspend() per your recommendation, and use snd_soc_suspend() instead to do the work for us.
In the case of back-ends, all initializations are bypassed in soc_new_pcm() - see below a code snippet - and the ops aren't set before suspend is called. The complete thread where we discussed this is at https://github.com/thesofproject/linux/pull/582
Thanks, now I took a look at the code. And, this surfaced that the another part of the problem is that DPCM does the substream open handling by itself in soc-pcm.c. Oh well. I'm afraid that we have some hidden bugs there that may lead to a crash easily. (Fortunately (or unfortunately) fuzzer isn't performed on ASoC because we have no virtual device driver :)
IMO, some of DPCM code should be raised to the upper level, to ALSA PCM core. The current code is still in a rough form of early plumbing.
In anyway, I merged the patch now with a bit more comments.
Thanks!
Takashi
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c index 818dff1de545..b6e158ce6650 100644 --- a/sound/core/pcm_native.c +++ b/sound/core/pcm_native.c @@ -1506,6 +1506,14 @@ int snd_pcm_suspend_all(struct snd_pcm *pcm) /* FIXME: the open/close code should lock this as well */ if (substream->runtime == NULL) continue;
/*
* Skip BE dai link PCM's that are internal and may
* not have their substream ops set.
*/
if (!substream->ops)
continue;
err = snd_pcm_suspend(substream); if (err < 0 && err != -EBUSY) return err;
Basically it's OK and safe to apply this check. We may need to add such sanity checks in more places if this really hits.
But I still wonder how this can go through. Is substream->runtime set even if substream->ops is NULL? The substream->runtime is assigned dynamically at opening a substream via snd_pcm_attach_substream(), so without opening it, it must be NULL.
This error case was exposed when we tried to get rid of snd_pcm_suspend() per your recommendation, and use snd_soc_suspend() instead to do the work for us.
In the case of back-ends, all initializations are bypassed in soc_new_pcm() - see below a code snippet - and the ops aren't set before suspend is called. The complete thread where we discussed this is at https://github.com/thesofproject/linux/pull/582
Thanks, now I took a look at the code. And, this surfaced that the another part of the problem is that DPCM does the substream open handling by itself in soc-pcm.c. Oh well. I'm afraid that we have some hidden bugs there that may lead to a crash easily. (Fortunately (or unfortunately) fuzzer isn't performed on ASoC because we have no virtual device driver :)
IMO, some of DPCM code should be raised to the upper level, to ALSA PCM core. The current code is still in a rough form of early plumbing.
Can't disagree, we were surprised to hit this issue knowing that the SOF code isn't the first to use DPCM at all (same with some topology issues). It's very likely that there are specific initialization flows that aren't quite right, hopefully we'll fix them one after the other :-)
In anyway, I merged the patch now with a bit more comments.
Thanks!
Takashi
Thanks for the review+additional comments, much appreciated.
On Mon, Feb 11, 2019 at 10:59:49AM -0600, Pierre-Louis Bossart wrote:
Can't disagree, we were surprised to hit this issue knowing that the SOF code isn't the first to use DPCM at all (same with some topology issues). It's very likely that there are specific initialization flows that aren't quite right, hopefully we'll fix them one after the other :-)
I'm fairly sure that DPCM has only been tested in very restricted use cases, and then mostly at the system level rather than directly. Obviously long term the component refactoring and digital domains should improve matters here but that's going to take a while :(
On Sat, 2019-02-09 at 10:27 +0100, Takashi Iwai wrote:
On Sat, 09 Feb 2019 00:29:53 +0100, Pierre-Louis Bossart wrote:
From: Ranjani Sridharan ranjani.sridharan@linux.intel.com
BE dai links only have internal PCM's and their substream ops may not be set. Suspending these PCM's will result in their ops->trigger() being invoked and cause a kernel oops. So skip suspending PCM's if their ops are NULL.
Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com
Signed-off-by: Pierre-Louis Bossart < pierre-louis.bossart@linux.intel.com>
sound/core/pcm_native.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c index 818dff1de545..b6e158ce6650 100644 --- a/sound/core/pcm_native.c +++ b/sound/core/pcm_native.c @@ -1506,6 +1506,14 @@ int snd_pcm_suspend_all(struct snd_pcm *pcm) /* FIXME: the open/close code should lock this as well */ if (substream->runtime == NULL) continue;
/*
* Skip BE dai link PCM's that are internal and
may
* not have their substream ops set.
*/
if (!substream->ops)
continue;
err = snd_pcm_suspend(substream); if (err < 0 && err != -EBUSY) return err;
Basically it's OK and safe to apply this check. We may need to add such sanity checks in more places if this really hits.
But I still wonder how this can go through. Is substream->runtime set even if substream->ops is NULL? The substream->runtime is assigned dynamically at opening a substream via snd_pcm_attach_substream(), so without opening it, it must be NULL.
Hi Takashi,
My guess is that this happens during dpcm_be_connect(fe, be, stream) in dpcm_add_paths().
The reason this wasnt exposed before was that the fe dai link pcm's were suspended first. So when it was BE dai links' turn, the pcm was already suspended. In the case of SOF, the order of dai links in the rtd_list is BE dai links first and then the FE dai links.
Thanks, Ranjani
thanks,
Takashi _______________________________________________ Alsa-devel mailing list Alsa-devel@alsa-project.org http://mailman.alsa-project.org/mailman/listinfo/alsa-devel
participants (4)
-
Mark Brown
-
Pierre-Louis Bossart
-
Ranjani Sridharan
-
Takashi Iwai