[alsa-devel] [RFC] ASoC: pcm: Trigger all commands on error

If a start command is triggered and an error occures in the first called function, e.g. DMA, all other functions are not executed and the error value is returned immediately.
In case of a start command, the calling function will call undo_start, calling this trigger function with the stop command. It will call stop for each function. But only the first function was started previously. The other functions may fail in the assumption that a stop command always comes after a start command.
As the API does not specify the behaviour of trigger functionpointers, I think this should be fixed in the function calling the trigger functionpointers.
This patch changes the behaviour. The trigger function calls all functionpointers independent of their returncodes. The first error-code is returned.
Signed-off-by: Markus Pargmann mpa@pengutronix.de --- sound/soc/soc-pcm.c | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c index 330c9a6..23fc25b 100644 --- a/sound/soc/soc-pcm.c +++ b/sound/soc/soc-pcm.c @@ -616,26 +616,28 @@ static int soc_pcm_trigger(struct snd_pcm_substream *substream, int cmd) struct snd_soc_platform *platform = rtd->platform; struct snd_soc_dai *cpu_dai = rtd->cpu_dai; struct snd_soc_dai *codec_dai = rtd->codec_dai; - int ret; + int ret = 0; + int err = 0;
if (codec_dai->driver->ops->trigger) { ret = codec_dai->driver->ops->trigger(substream, cmd, codec_dai); - if (ret < 0) - return ret; + if (!err && ret) + err = ret; }
if (platform->driver->ops && platform->driver->ops->trigger) { ret = platform->driver->ops->trigger(substream, cmd); - if (ret < 0) - return ret; + if (!err && ret) + err = ret; }
if (cpu_dai->driver->ops->trigger) { ret = cpu_dai->driver->ops->trigger(substream, cmd, cpu_dai); - if (ret < 0) - return ret; + if (!err && ret) + err = ret; } - return 0; + + return err; }
static int soc_pcm_bespoke_trigger(struct snd_pcm_substream *substream,

On Fri, Oct 04, 2013 at 12:53:17PM +0200, Markus Pargmann wrote:
In case of a start command, the calling function will call undo_start, calling this trigger function with the stop command. It will call stop for each function. But only the first function was started previously. The other functions may fail in the assumption that a stop command always comes after a start command.
As the API does not specify the behaviour of trigger functionpointers, I think this should be fixed in the function calling the trigger functionpointers.
This patch changes the behaviour. The trigger function calls all functionpointers independent of their returncodes. The first error-code is returned.
I'm not sure if this will resolve the problem robustly - if something is going to get confused about this it seems just as likely that the trigger that failed will get upset because it gets a _STOP after returning an error from _START. I think what I'd expect here is that the unwind on error would unwind only the triggers that it successfully ran. Equally well I'd be a bit surprised if a trigger function actually had a problem with extra _STOPs since they mostly just do register writes... did you see this from code inspection or were you resolving a practical problem in your system?

On Mon, Oct 07, 2013 at 06:27:06PM +0100, Mark Brown wrote:
On Fri, Oct 04, 2013 at 12:53:17PM +0200, Markus Pargmann wrote:
In case of a start command, the calling function will call undo_start, calling this trigger function with the stop command. It will call stop for each function. But only the first function was started previously. The other functions may fail in the assumption that a stop command always comes after a start command.
As the API does not specify the behaviour of trigger functionpointers, I think this should be fixed in the function calling the trigger functionpointers.
This patch changes the behaviour. The trigger function calls all functionpointers independent of their returncodes. The first error-code is returned.
I'm not sure if this will resolve the problem robustly - if something is going to get confused about this it seems just as likely that the trigger that failed will get upset because it gets a _STOP after returning an error from _START. I think what I'd expect here is that the unwind on error would unwind only the triggers that it successfully ran. Equally well I'd be a bit surprised if a trigger function actually had a problem with extra _STOPs since they mostly just do register writes... did you see this from code inspection or were you resolving a practical problem in your system?
Yes this is a practical problem on mx28. mxs-saif enables/disables clocks when starting/stopping. I debugged a problem where the DMA engine returned an error on channel preparation and clk_disabled in mxs-saif was called twice.
To make proper error handling and only STOP functions that successfully started, we have to store the state of each function. The command error handling is done in pcm_native.c so we don't have any influence on that.
Another possibility is to explicitly allow multiple _STOPs. Then I could fix the driver to store the clock state. But I would prefer a generic solution.
Regards,
Markus

On Tue, Oct 08, 2013 at 10:29:33AM +0200, Markus Pargmann wrote:
To make proper error handling and only STOP functions that successfully started, we have to store the state of each function. The command error handling is done in pcm_native.c so we don't have any influence on that.
I meant just unwinding the things done in the trigger call.
Another possibility is to explicitly allow multiple _STOPs. Then I could fix the driver to store the clock state. But I would prefer a generic solution.
You're going to have to handle this anyway to be robust - what happens if it's the clock enable that fails?

On Tue, Oct 08, 2013 at 10:34:13AM +0100, Mark Brown wrote:
On Tue, Oct 08, 2013 at 10:29:33AM +0200, Markus Pargmann wrote:
To make proper error handling and only STOP functions that successfully started, we have to store the state of each function. The command error handling is done in pcm_native.c so we don't have any influence on that.
I meant just unwinding the things done in the trigger call.
You mean unwinding when we detect the error? If we return the error code from the trigger function, the pcm_native.c will still trigger a STOP command which is executed on all components.
Another possibility is to explicitly allow multiple _STOPs. Then I could fix the driver to store the clock state. But I would prefer a generic solution.
You're going to have to handle this anyway to be robust - what happens if it's the clock enable that fails?
If clk_enable fails, we can return an error code in the mxs-saif trigger function. Of course the error handling is missing here, but it is not necessary to store the state of the clock to handle errors properly.
Regards,
Markus

On Tue, Oct 08, 2013 at 11:48:39AM +0200, Markus Pargmann wrote:
On Tue, Oct 08, 2013 at 10:34:13AM +0100, Mark Brown wrote:
I meant just unwinding the things done in the trigger call.
You mean unwinding when we detect the error? If we return the error code from the trigger function, the pcm_native.c will still trigger a STOP command which is executed on all components.
Sure, but it's another way of getting consistency and does seem a bit more obvious.
Another possibility is to explicitly allow multiple _STOPs. Then I could fix the driver to store the clock state. But I would prefer a generic solution.
You're going to have to handle this anyway to be robust - what happens if it's the clock enable that fails?
If clk_enable fails, we can return an error code in the mxs-saif trigger function. Of course the error handling is missing here, but it is not necessary to store the state of the clock to handle errors properly.
Right, but then _STOP will be called without the clock having been successfully enabled so you'll have an unbalanced clk_disable().

On Tue, Oct 08, 2013 at 11:44:57AM +0100, Mark Brown wrote:
On Tue, Oct 08, 2013 at 11:48:39AM +0200, Markus Pargmann wrote:
On Tue, Oct 08, 2013 at 10:34:13AM +0100, Mark Brown wrote:
I meant just unwinding the things done in the trigger call.
You mean unwinding when we detect the error? If we return the error code from the trigger function, the pcm_native.c will still trigger a STOP command which is executed on all components.
Sure, but it's another way of getting consistency and does seem a bit more obvious.
It is still not consistent as we get an additional STOP after cleanly unwinding the START:
snd_pcm_do_start() soc_pcm_trigger() # START codec_dai->driver->ops->trigger() # START platform->driver->ops->trigger() # Assuming an error here
unwind: codec_dai->driver->ops->trigger() # STOP
A failed snd_pcm_do_start() is detected, so undo_start is called:
snd_pcm_undo_start() soc_pcm_trigger() # STOP codec_dai->driver->ops->trigger() # STOP platform->driver->ops->trigger() # STOP cpu_dai->driver->ops->trigger() # STOP
Now all component trigger functions are called without a matching START command.
Another possibility is to explicitly allow multiple _STOPs. Then I could fix the driver to store the clock state. But I would prefer a generic solution.
You're going to have to handle this anyway to be robust - what happens if it's the clock enable that fails?
If clk_enable fails, we can return an error code in the mxs-saif trigger function. Of course the error handling is missing here, but it is not necessary to store the state of the clock to handle errors properly.
Right, but then _STOP will be called without the clock having been successfully enabled so you'll have an unbalanced clk_disable().
_STOP is only called as long as the error handling is not working properly. Otherwise STOP should not be called for a failed START.
Regards,
Markus

On Wed, Oct 09, 2013 at 03:05:53PM +0200, Markus Pargmann wrote:
On Tue, Oct 08, 2013 at 11:44:57AM +0100, Mark Brown wrote:
Right, but then _STOP will be called without the clock having been successfully enabled so you'll have an unbalanced clk_disable().
_STOP is only called as long as the error handling is not working properly. Otherwise STOP should not be called for a failed START.
I'm not sure what you mean here. How would the error be handled without calling _STOP?

On Wed, Oct 09, 2013 at 02:20:04PM +0100, Mark Brown wrote:
On Wed, Oct 09, 2013 at 03:05:53PM +0200, Markus Pargmann wrote:
On Tue, Oct 08, 2013 at 11:44:57AM +0100, Mark Brown wrote:
Right, but then _STOP will be called without the clock having been successfully enabled so you'll have an unbalanced clk_disable().
_STOP is only called as long as the error handling is not working properly. Otherwise STOP should not be called for a failed START.
I'm not sure what you mean here. How would the error be handled without calling _STOP?
The START error should be handled in mxs_saif and there shouldn't be a _STOP call for mxs_saif_trigger afterwards to handle the failed START.
Regards,
Markus

Store the state of the components to detect not matching START/STOP command pairs. Otherwise a component's trigger function may be called twice with the same command which can lead to problems.
Signed-off-by: Markus Pargmann mpa@pengutronix.de ---
This is a patch which stores the state of platform, cpu_dai and codec_dai to call their trigger functions only if they are not already in the correct state.
Regards,
Markus
include/sound/soc.h | 12 +++++++++++ sound/soc/soc-pcm.c | 57 ++++++++++++++++++++++++++++++++++++++++++++++++++--- 2 files changed, 66 insertions(+), 3 deletions(-)
diff --git a/include/sound/soc.h b/include/sound/soc.h index d22cb0a..3b42b99 100644 --- a/include/sound/soc.h +++ b/include/sound/soc.h @@ -1053,6 +1053,13 @@ struct snd_soc_card { void *drvdata; };
+enum soc_pcm_cmd_state { + SOC_PCM_STOPPED, + SOC_PCM_RUNNING, + SOC_PCM_PAUSED, + SOC_PCM_SUSPENDED, +}; + /* SoC machine DAI configuration, glues a codec and cpu DAI together */ struct snd_soc_pcm_runtime { struct device *dev; @@ -1079,6 +1086,11 @@ struct snd_soc_pcm_runtime { struct snd_soc_dai *cpu_dai;
struct delayed_work delayed_work; + + enum soc_pcm_cmd_state cmd_platform_state; + enum soc_pcm_cmd_state cmd_cpu_dai_state; + enum soc_pcm_cmd_state cmd_codec_dai_state; + #ifdef CONFIG_DEBUG_FS struct dentry *debugfs_dpcm_root; struct dentry *debugfs_dpcm_state; diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c index 330c9a6..f10bd8d 100644 --- a/sound/soc/soc-pcm.c +++ b/sound/soc/soc-pcm.c @@ -610,6 +610,51 @@ static int soc_pcm_hw_free(struct snd_pcm_substream *substream) return 0; }
+static int soc_pcm_cmd_check(enum soc_pcm_cmd_state state, int cmd) +{ + switch (cmd) { + case SNDRV_PCM_TRIGGER_START: + case SNDRV_PCM_TRIGGER_RESUME: + case SNDRV_PCM_TRIGGER_PAUSE_RELEASE: + if (state == SOC_PCM_RUNNING) + return 0; + break; + case SNDRV_PCM_TRIGGER_STOP: + if (state == SOC_PCM_STOPPED) + return 0; + break; + case SNDRV_PCM_TRIGGER_SUSPEND: + if (state == SOC_PCM_SUSPENDED) + return 0; + break; + case SNDRV_PCM_TRIGGER_PAUSE_PUSH: + if (state == SOC_PCM_PAUSED) + return 0; + break; + } + return 1; +} + +static void soc_pcm_cmd_state_update(enum soc_pcm_cmd_state *state, int cmd) +{ + switch (cmd) { + case SNDRV_PCM_TRIGGER_START: + case SNDRV_PCM_TRIGGER_RESUME: + case SNDRV_PCM_TRIGGER_PAUSE_RELEASE: + *state = SOC_PCM_RUNNING; + break; + case SNDRV_PCM_TRIGGER_STOP: + *state = SOC_PCM_STOPPED; + break; + case SNDRV_PCM_TRIGGER_SUSPEND: + *state = SOC_PCM_SUSPENDED; + break; + case SNDRV_PCM_TRIGGER_PAUSE_PUSH: + *state = SOC_PCM_PAUSED; + break; + } +} + static int soc_pcm_trigger(struct snd_pcm_substream *substream, int cmd) { struct snd_soc_pcm_runtime *rtd = substream->private_data; @@ -618,22 +663,28 @@ static int soc_pcm_trigger(struct snd_pcm_substream *substream, int cmd) struct snd_soc_dai *codec_dai = rtd->codec_dai; int ret;
- if (codec_dai->driver->ops->trigger) { + if (codec_dai->driver->ops->trigger && + soc_pcm_cmd_check(rtd->cmd_codec_dai_state, cmd)) { ret = codec_dai->driver->ops->trigger(substream, cmd, codec_dai); if (ret < 0) return ret; + soc_pcm_cmd_state_update(&rtd->cmd_codec_dai_state, cmd); }
- if (platform->driver->ops && platform->driver->ops->trigger) { + if (platform->driver->ops && platform->driver->ops->trigger && + soc_pcm_cmd_check(rtd->cmd_platform_state, cmd)) { ret = platform->driver->ops->trigger(substream, cmd); if (ret < 0) return ret; + soc_pcm_cmd_state_update(&rtd->cmd_platform_state, cmd); }
- if (cpu_dai->driver->ops->trigger) { + if (cpu_dai->driver->ops->trigger && + soc_pcm_cmd_check(rtd->cmd_cpu_dai_state, cmd)) { ret = cpu_dai->driver->ops->trigger(substream, cmd, cpu_dai); if (ret < 0) return ret; + soc_pcm_cmd_state_update(&rtd->cmd_cpu_dai_state, cmd); } return 0; }

On Wed, Oct 09, 2013 at 04:31:16PM +0200, Markus Pargmann wrote:
Store the state of the components to detect not matching START/STOP command pairs. Otherwise a component's trigger function may be called twice with the same command which can lead to problems.
This is a patch which stores the state of platform, cpu_dai and codec_dai to call their trigger functions only if they are not already in the correct state.
No, this seems way too complex and fragile. The original idea was better though I'm still not convinced it actually fixes the problem you think it fixes.

On Wed, Oct 09, 2013 at 04:12:42PM +0200, Markus Pargmann wrote:
On Wed, Oct 09, 2013 at 02:20:04PM +0100, Mark Brown wrote:
I'm not sure what you mean here. How would the error be handled without calling _STOP?
The START error should be handled in mxs_saif and there shouldn't be a _STOP call for mxs_saif_trigger afterwards to handle the failed START.
How would the framework know that the error has been handled - what tells it that the error returned is for an error that has been handled as opposed to an error that has not been handled?

On Wed, Oct 09, 2013 at 03:59:17PM +0100, Mark Brown wrote:
On Wed, Oct 09, 2013 at 04:12:42PM +0200, Markus Pargmann wrote:
On Wed, Oct 09, 2013 at 02:20:04PM +0100, Mark Brown wrote:
I'm not sure what you mean here. How would the error be handled without calling _STOP?
The START error should be handled in mxs_saif and there shouldn't be a _STOP call for mxs_saif_trigger afterwards to handle the failed START.
How would the framework know that the error has been handled - what tells it that the error returned is for an error that has been handled as opposed to an error that has not been handled?
I finally understand what you mean. Yes, RFCv1 fixes the behavior for the components without _START failures. But it doesn't fix it for the failing component.
For mxs-saif, we still have to store if the last START failed to handle the following STOP correctly which is not very different from handling double STOPs correctly. I don't have an idea how to fix RFCv1 without using some state as in RFCv2. Perhaps we should simply fix it in mxs-saif and other drivers then?
Regards,
Markus

On Thu, Oct 10, 2013 at 10:40:40AM +0200, Markus Pargmann wrote:
For mxs-saif, we still have to store if the last START failed to handle the following STOP correctly which is not very different from handling double STOPs correctly. I don't have an idea how to fix RFCv1 without using some state as in RFCv2. Perhaps we should simply fix it in mxs-saif and other drivers then?
I think we have to - it seems like it's going to be more trouble than it's worth to try to keep track of this in the core, there seems to be a bit of an assumption that the triggers can just be called and for things that aren't refcounting it's probably more robust to run the trigger extra times rather than adding a state machine that we might get wrong.
participants (2)
-
Mark Brown
-
Markus Pargmann