[PATCH v2 0/2] ASoC: Intel: avs: DSP recovery and resume fixes
Two fixes that are result of the recent discussions [1][2].
First adds missing locking around snd_pcm_stop() while the second fix sets substream state to DISCONNECTED if any suspend/resume related operation fails so that userspace has means to be aware that something went wrong during said operation.
Changes in v2: - __snd_pcm_set_state() replaced direct assignments of PCM state
[1]: https://lore.kernel.org/alsa-devel/73e6425f-8e51-e26f-ad83-ccc5df517a43@inte... [2]: https://lore.kernel.org/alsa-devel/20221104131244.3920179-1-cezary.rojewski@...
Cezary Rojewski (2): ASoC: Intel: avs: Lock substream before snd_pcm_stop() ASoC: Intel: avs: Disconnect substream if suspend or resume fails
sound/soc/intel/avs/ipc.c | 2 ++ sound/soc/intel/avs/pcm.c | 10 ++++++++-- 2 files changed, 10 insertions(+), 2 deletions(-)
snd_pcm_stop() shall be called with stream lock held to prevent any races between nonatomic streaming operations.
Fixes: 2f1f570cd730 ("ASoC: Intel: avs: Coredump and recovery flow") Signed-off-by: Cezary Rojewski cezary.rojewski@intel.com --- sound/soc/intel/avs/ipc.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/sound/soc/intel/avs/ipc.c b/sound/soc/intel/avs/ipc.c index 152f8d0bdf8e..07655298b6c7 100644 --- a/sound/soc/intel/avs/ipc.c +++ b/sound/soc/intel/avs/ipc.c @@ -123,7 +123,9 @@ static void avs_dsp_recovery(struct avs_dev *adev) if (!substream || !substream->runtime) continue;
+ snd_pcm_stream_lock(substream); snd_pcm_stop(substream, SNDRV_PCM_STATE_DISCONNECTED); + snd_pcm_stream_unlock(substream); } } }
On Mon, 14 Nov 2022 12:37:28 +0100, Cezary Rojewski wrote:
snd_pcm_stop() shall be called with stream lock held to prevent any races between nonatomic streaming operations.
Fixes: 2f1f570cd730 ("ASoC: Intel: avs: Coredump and recovery flow") Signed-off-by: Cezary Rojewski cezary.rojewski@intel.com
sound/soc/intel/avs/ipc.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/sound/soc/intel/avs/ipc.c b/sound/soc/intel/avs/ipc.c index 152f8d0bdf8e..07655298b6c7 100644 --- a/sound/soc/intel/avs/ipc.c +++ b/sound/soc/intel/avs/ipc.c @@ -123,7 +123,9 @@ static void avs_dsp_recovery(struct avs_dev *adev) if (!substream || !substream->runtime) continue;
snd_pcm_stream_lock(substream); snd_pcm_stop(substream, SNDRV_PCM_STATE_DISCONNECTED);
snd_pcm_stream_unlock(substream);
Isn't it in the interruptible context? If so, you need the _irq() variant.
Takashi
On Mon, 14 Nov 2022 14:00:12 +0100, Takashi Iwai wrote:
On Mon, 14 Nov 2022 12:37:28 +0100, Cezary Rojewski wrote:
snd_pcm_stop() shall be called with stream lock held to prevent any races between nonatomic streaming operations.
Fixes: 2f1f570cd730 ("ASoC: Intel: avs: Coredump and recovery flow") Signed-off-by: Cezary Rojewski cezary.rojewski@intel.com
sound/soc/intel/avs/ipc.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/sound/soc/intel/avs/ipc.c b/sound/soc/intel/avs/ipc.c index 152f8d0bdf8e..07655298b6c7 100644 --- a/sound/soc/intel/avs/ipc.c +++ b/sound/soc/intel/avs/ipc.c @@ -123,7 +123,9 @@ static void avs_dsp_recovery(struct avs_dev *adev) if (!substream || !substream->runtime) continue;
snd_pcm_stream_lock(substream); snd_pcm_stop(substream, SNDRV_PCM_STATE_DISCONNECTED);
snd_pcm_stream_unlock(substream);
Isn't it in the interruptible context? If so, you need the _irq() variant.
Correction: when it's a non-atomic stream, that doesn't matter, both are identical.
Takashi
On 2022-11-14 2:16 PM, Takashi Iwai wrote:
On Mon, 14 Nov 2022 14:00:12 +0100, Takashi Iwai wrote:
On Mon, 14 Nov 2022 12:37:28 +0100, Cezary Rojewski wrote:
...
@@ -123,7 +123,9 @@ static void avs_dsp_recovery(struct avs_dev *adev) if (!substream || !substream->runtime) continue;
snd_pcm_stream_lock(substream); snd_pcm_stop(substream, SNDRV_PCM_STATE_DISCONNECTED);
snd_pcm_stream_unlock(substream);
Isn't it in the interruptible context? If so, you need the _irq() variant.
Thanks for paying attention to detail, Takashi.
Correction: when it's a non-atomic stream, that doesn't matter, both are identical.
In regard to the point found above, that's precisely why I decided to use variant with shorten name. Matter of taste I guess. Can switch to _irq() if you believe it's the correct approach.
Regards, Czarek
On Mon, 14 Nov 2022 14:25:31 +0100, Cezary Rojewski wrote:
On 2022-11-14 2:16 PM, Takashi Iwai wrote:
On Mon, 14 Nov 2022 14:00:12 +0100, Takashi Iwai wrote:
On Mon, 14 Nov 2022 12:37:28 +0100, Cezary Rojewski wrote:
...
@@ -123,7 +123,9 @@ static void avs_dsp_recovery(struct avs_dev *adev) if (!substream || !substream->runtime) continue;
snd_pcm_stream_lock(substream); snd_pcm_stop(substream, SNDRV_PCM_STATE_DISCONNECTED);
snd_pcm_stream_unlock(substream);
Isn't it in the interruptible context? If so, you need the _irq() variant.
Thanks for paying attention to detail, Takashi.
Correction: when it's a non-atomic stream, that doesn't matter, both are identical.
In regard to the point found above, that's precisely why I decided to use variant with shorten name. Matter of taste I guess. Can switch to _irq() if you believe it's the correct approach.
It's fine in both ways. For the snd_pcm_stream_lock(), maybe worth to put a comment, though.
thanks,
Takashi
To improve performance and overall system stability, suspend/resume operations for ASoC cards always return success status and defer the actual work.
Because of that, if a substream fails to resume, userspace may still attempt to invoke commands on it as from their perspective the operation completed successfully. Set substream's state to DISCONNECTED to ensure no further commands are attempted.
Signed-off-by: Cezary Rojewski cezary.rojewski@intel.com ---
Changes in v2: - __snd_pcm_set_state() replaced direct assignments of PCM state
sound/soc/intel/avs/pcm.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/sound/soc/intel/avs/pcm.c b/sound/soc/intel/avs/pcm.c index ca624fbb5c0d..70442b5212c2 100644 --- a/sound/soc/intel/avs/pcm.c +++ b/sound/soc/intel/avs/pcm.c @@ -934,8 +934,11 @@ static int avs_component_pm_op(struct snd_soc_component *component, bool be, rtd = snd_pcm_substream_chip(data->substream); if (rtd->dai_link->no_pcm == be && !rtd->dai_link->ignore_suspend) { ret = op(dai, data); - if (ret < 0) + if (ret < 0) { + __snd_pcm_set_state(data->substream->runtime, + SNDRV_PCM_STATE_DISCONNECTED); return ret; + } } }
@@ -944,8 +947,11 @@ static int avs_component_pm_op(struct snd_soc_component *component, bool be, rtd = snd_pcm_substream_chip(data->substream); if (rtd->dai_link->no_pcm == be && !rtd->dai_link->ignore_suspend) { ret = op(dai, data); - if (ret < 0) + if (ret < 0) { + __snd_pcm_set_state(data->substream->runtime, + SNDRV_PCM_STATE_DISCONNECTED); return ret; + } } } }
On Mon, 14 Nov 2022 12:37:29 +0100, Cezary Rojewski wrote:
To improve performance and overall system stability, suspend/resume operations for ASoC cards always return success status and defer the actual work.
Because of that, if a substream fails to resume, userspace may still attempt to invoke commands on it as from their perspective the operation completed successfully. Set substream's state to DISCONNECTED to ensure no further commands are attempted.
Signed-off-by: Cezary Rojewski cezary.rojewski@intel.com
Hm, that might work, but note that, once when the stream is set with the disconnected state, it won't be taken back to the normal state any longer on most sound backends. That is, it'll be gone and won't revive unless you completely unload and reload the whole stuff. If that's the intended behavior, that's fine, of course. So just to make sure.
thanks,
Takashi
On 2022-11-14 2:02 PM, Takashi Iwai wrote:
On Mon, 14 Nov 2022 12:37:29 +0100, Cezary Rojewski wrote:
To improve performance and overall system stability, suspend/resume operations for ASoC cards always return success status and defer the actual work.
Because of that, if a substream fails to resume, userspace may still attempt to invoke commands on it as from their perspective the operation completed successfully. Set substream's state to DISCONNECTED to ensure no further commands are attempted.
...
Hm, that might work, but note that, once when the stream is set with the disconnected state, it won't be taken back to the normal state any longer on most sound backends. That is, it'll be gone and won't revive unless you completely unload and reload the whole stuff. If that's the intended behavior, that's fine, of course. So just to make sure.
Good point.
Our intention: if we fail e.g.: on resume, we would like the framework to invoke ->hw_free() and close us. Right now, if we pretend that everything is okay, invalid actions can be performed on our streams. This all comes down to userspace calling "just" snd_pcm_resume(). If we had an option to opt-in to a _hw_params() + _prepare() + _resume() path, then things would look differently.
On Mon, 14 Nov 2022 15:08:17 +0100, Cezary Rojewski wrote:
On 2022-11-14 2:02 PM, Takashi Iwai wrote:
On Mon, 14 Nov 2022 12:37:29 +0100, Cezary Rojewski wrote:
To improve performance and overall system stability, suspend/resume operations for ASoC cards always return success status and defer the actual work.
Because of that, if a substream fails to resume, userspace may still attempt to invoke commands on it as from their perspective the operation completed successfully. Set substream's state to DISCONNECTED to ensure no further commands are attempted.
...
Hm, that might work, but note that, once when the stream is set with the disconnected state, it won't be taken back to the normal state any longer on most sound backends. That is, it'll be gone and won't revive unless you completely unload and reload the whole stuff. If that's the intended behavior, that's fine, of course. So just to make sure.
Good point.
Our intention: if we fail e.g.: on resume, we would like the framework to invoke ->hw_free() and close us. Right now, if we pretend that everything is okay, invalid actions can be performed on our streams. This all comes down to userspace calling "just" snd_pcm_resume(). If we had an option to opt-in to a _hw_params() + _prepare() + _resume() path, then things would look differently.
So you'd rather like to make the stream appearing and working after re-opening the stream again? Then DISCONNECTED state might be too strong.
If the broken state could be recovered by the PCM PREPARE phase, then we may (ab-)use XRUN state, instead. Then application shall re-setup via PCM prepare call. But if hw_free/hw_params is required, it won't work.
Other than that, there is no such PCM state that forces to close and re-open, I'm afraid. You can have an internal error state and let the stream returning an error at each operation, instead, too.
And, creating a new PCM state is very difficult. It would influence way too much, IMO, as each application code has to be modified to handle the new PCM state.
Takashi
On 2022-11-14 3:26 PM, Takashi Iwai wrote:
On Mon, 14 Nov 2022 15:08:17 +0100, Cezary Rojewski wrote:
On 2022-11-14 2:02 PM, Takashi Iwai wrote:
Hm, that might work, but note that, once when the stream is set with the disconnected state, it won't be taken back to the normal state any longer on most sound backends. That is, it'll be gone and won't revive unless you completely unload and reload the whole stuff. If that's the intended behavior, that's fine, of course. So just to make sure.
Good point.
Our intention: if we fail e.g.: on resume, we would like the framework to invoke ->hw_free() and close us. Right now, if we pretend that everything is okay, invalid actions can be performed on our streams. This all comes down to userspace calling "just" snd_pcm_resume(). If we had an option to opt-in to a _hw_params() + _prepare() + _resume() path, then things would look differently.
So you'd rather like to make the stream appearing and working after re-opening the stream again? Then DISCONNECTED state might be too strong.
If the broken state could be recovered by the PCM PREPARE phase, then we may (ab-)use XRUN state, instead. Then application shall re-setup via PCM prepare call. But if hw_free/hw_params is required, it won't work.
To resume properly an AudioDSP stream, operations typically found in hw_params() and prepare() need to be re-executed _before_ actual _TRIGGER_RESUME can be called on a stream.
Since implementations found in userspace apps such as aplay and pulseaudio first invoke snd_pcm_resume() and then snd_pcm_prepare() if the former fails, one could abuse this flow by doing hw_params-related operations in _resume() and returning a specific error code e.g.: -ESTRPIPE when they succeed (if they fail, the internal error code would be returned instead of course). Prepare tasks are left to snd_pcm_prepare() just like in standard case.
I believe such code is a good example of _code smell_ though. First we abuse the error path, second we basically drop the _TRIGGER_RESUME entirely - after all, in such approach it mimics hw_params() and will always fail, either with an internal error code or hardcoded -ESTRPIPE. _TRIGGER_START/STOP would be reused once snd_pcm_prepare() succeeds causing another problem - no differentiation between start and resume and thus lack of information when to or when not to poll DRSM.
Thus, we ended up with 'state = DISCONNECTED' if we encounter a firmware issue during PCM-pm. Keeps things sane.
Other than that, there is no such PCM state that forces to close and re-open, I'm afraid. You can have an internal error state and let the stream returning an error at each operation, instead, too.
And, creating a new PCM state is very difficult. It would influence way too much, IMO, as each application code has to be modified to handle the new PCM state.
Agree, this basically screams not to follow that path.
Regards, Czarek
participants (2)
-
Cezary Rojewski
-
Takashi Iwai