[PATCH] sound: sof: ioc4-topology: avoid extra dai_params copy
From: Arnd Bergmann arnd@arndb.de
The snd_pcm_hw_params structure is really too large to fit on the stack. Because of the way that clang inlines functions, it ends up twice in one function, which exceeds the 1024 byte limit for 32-bit architecutes:
sound/soc/sof/ipc4-topology.c:1700:1: error: stack frame size (1304) exceeds limit (1024) in 'sof_ipc4_prepare_copier_module' [-Werror,-Wframe-larger-than] sof_ipc4_prepare_copier_module(struct snd_sof_widget *swidget,
From what I can tell, this was unintentional, as both sof_ipc4_prepare_dai_copier() and sof_ipc4_prepare_copier_module() make a copy for the same purpose, but copying it once has the exact same effect.
Remove the extra copy and change the direct struct assignment to an explicit memcpy() call to make it clearer to the reader that this is what happens. Note that gcc treats struct assignment as a memcpy() that may be inlined anyway, so the resulting object code is the same.
Fixes: f9209644ae76 ("ASoC: SOF: ipc4-topology: Correct DAI copier config and NHLT blob request") Signed-off-by: Arnd Bergmann arnd@arndb.de --- sound/soc/sof/ipc4-topology.c | 19 +++++++------------ 1 file changed, 7 insertions(+), 12 deletions(-)
diff --git a/sound/soc/sof/ipc4-topology.c b/sound/soc/sof/ipc4-topology.c index 87be7f16e8c2..77f6850010f6 100644 --- a/sound/soc/sof/ipc4-topology.c +++ b/sound/soc/sof/ipc4-topology.c @@ -1651,7 +1651,6 @@ sof_ipc4_prepare_dai_copier(struct snd_sof_dev *sdev, struct snd_sof_dai *dai, struct snd_pcm_hw_params *params, int dir) { struct sof_ipc4_available_audio_format *available_fmt; - struct snd_pcm_hw_params dai_params = *params; struct sof_ipc4_copier_data *copier_data; struct sof_ipc4_pin_format *pin_fmts; struct sof_ipc4_copier *ipc4_copier; @@ -1676,7 +1675,7 @@ sof_ipc4_prepare_dai_copier(struct snd_sof_dev *sdev, struct snd_sof_dai *dai, num_pin_fmts = available_fmt->num_input_formats; }
- ret = sof_ipc4_adjust_params_to_dai_format(sdev, &dai_params, pin_fmts, + ret = sof_ipc4_adjust_params_to_dai_format(sdev, params, pin_fmts, num_pin_fmts); if (ret) return ret; @@ -1684,15 +1683,11 @@ sof_ipc4_prepare_dai_copier(struct snd_sof_dev *sdev, struct snd_sof_dai *dai, single_bitdepth = sof_ipc4_copier_is_single_bitdepth(sdev, pin_fmts, num_pin_fmts); ret = snd_sof_get_nhlt_endpoint_data(sdev, dai, single_bitdepth, - &dai_params, + params, ipc4_copier->dai_index, ipc4_copier->dai_type, dir, &ipc4_copier->copier_config, &copier_data->gtw_cfg.config_length); - /* Update the params to reflect the changes made in this function */ - if (!ret) - *params = dai_params; - return ret; }
@@ -1784,9 +1779,9 @@ sof_ipc4_prepare_copier_module(struct snd_sof_widget *swidget, * for capture. */ if (dir == SNDRV_PCM_STREAM_PLAYBACK) - ref_params = *fe_params; + memcpy(&ref_params, fe_params, sizeof(ref_params)); else - ref_params = *pipeline_params; + memcpy(&ref_params, pipeline_params, sizeof(ref_params));
copier_data->gtw_cfg.node_id &= ~SOF_IPC4_NODE_INDEX_MASK; copier_data->gtw_cfg.node_id |= @@ -1819,7 +1814,7 @@ sof_ipc4_prepare_copier_module(struct snd_sof_widget *swidget, * In case of capture the ref_params returned will be used to * find the input configuration of the copier. */ - ref_params = *fe_params; + memcpy(&ref_params, fe_params, sizeof(ref_params)); ret = sof_ipc4_prepare_dai_copier(sdev, dai, &ref_params, dir); if (ret < 0) return ret; @@ -1829,7 +1824,7 @@ sof_ipc4_prepare_copier_module(struct snd_sof_widget *swidget, * input configuration of the copier. */ if (dir == SNDRV_PCM_STREAM_PLAYBACK) - ref_params = *pipeline_params; + memcpy(&ref_params, pipeline_params, sizeof(ref_params));
break; } @@ -1838,7 +1833,7 @@ sof_ipc4_prepare_copier_module(struct snd_sof_widget *swidget, ipc4_copier = (struct sof_ipc4_copier *)swidget->private; copier_data = &ipc4_copier->data; available_fmt = &ipc4_copier->available_fmt; - ref_params = *pipeline_params; + memcpy(&ref_params, pipeline_params, sizeof(ref_params));
break; }
On 8/7/24 10:02, Arnd Bergmann wrote:
From: Arnd Bergmann arnd@arndb.de
The snd_pcm_hw_params structure is really too large to fit on the stack. Because of the way that clang inlines functions, it ends up twice in one function, which exceeds the 1024 byte limit for 32-bit architecutes:
sound/soc/sof/ipc4-topology.c:1700:1: error: stack frame size (1304) exceeds limit (1024) in 'sof_ipc4_prepare_copier_module' [-Werror,-Wframe-larger-than] sof_ipc4_prepare_copier_module(struct snd_sof_widget *swidget,
From what I can tell, this was unintentional, as both
sof_ipc4_prepare_dai_copier() and sof_ipc4_prepare_copier_module() make a copy for the same purpose, but copying it once has the exact same effect.
Humm, not sure. I think the copy was intentional so that if one of the fixups fails, then the initial hw_params structure is not modified.
Also not sure why a compiler would think inlining such a large function is a good idea?
Probably need Ranjani, Peter and Bard to comment on the format handling.
Remove the extra copy and change the direct struct assignment to an explicit memcpy() call to make it clearer to the reader that this is what happens. Note that gcc treats struct assignment as a memcpy() that may be inlined anyway, so the resulting object code is the same.
Fixes: f9209644ae76 ("ASoC: SOF: ipc4-topology: Correct DAI copier config and NHLT blob request") Signed-off-by: Arnd Bergmann arnd@arndb.de
sound/soc/sof/ipc4-topology.c | 19 +++++++------------ 1 file changed, 7 insertions(+), 12 deletions(-)
diff --git a/sound/soc/sof/ipc4-topology.c b/sound/soc/sof/ipc4-topology.c index 87be7f16e8c2..77f6850010f6 100644 --- a/sound/soc/sof/ipc4-topology.c +++ b/sound/soc/sof/ipc4-topology.c @@ -1651,7 +1651,6 @@ sof_ipc4_prepare_dai_copier(struct snd_sof_dev *sdev, struct snd_sof_dai *dai, struct snd_pcm_hw_params *params, int dir) { struct sof_ipc4_available_audio_format *available_fmt;
- struct snd_pcm_hw_params dai_params = *params; struct sof_ipc4_copier_data *copier_data; struct sof_ipc4_pin_format *pin_fmts; struct sof_ipc4_copier *ipc4_copier;
@@ -1676,7 +1675,7 @@ sof_ipc4_prepare_dai_copier(struct snd_sof_dev *sdev, struct snd_sof_dai *dai, num_pin_fmts = available_fmt->num_input_formats; }
- ret = sof_ipc4_adjust_params_to_dai_format(sdev, &dai_params, pin_fmts,
- ret = sof_ipc4_adjust_params_to_dai_format(sdev, params, pin_fmts, num_pin_fmts); if (ret) return ret;
@@ -1684,15 +1683,11 @@ sof_ipc4_prepare_dai_copier(struct snd_sof_dev *sdev, struct snd_sof_dai *dai, single_bitdepth = sof_ipc4_copier_is_single_bitdepth(sdev, pin_fmts, num_pin_fmts); ret = snd_sof_get_nhlt_endpoint_data(sdev, dai, single_bitdepth,
&dai_params,
params, ipc4_copier->dai_index, ipc4_copier->dai_type, dir, &ipc4_copier->copier_config, &copier_data->gtw_cfg.config_length);
- /* Update the params to reflect the changes made in this function */
- if (!ret)
*params = dai_params;
- return ret;
}
@@ -1784,9 +1779,9 @@ sof_ipc4_prepare_copier_module(struct snd_sof_widget *swidget, * for capture. */ if (dir == SNDRV_PCM_STREAM_PLAYBACK)
ref_params = *fe_params;
elsememcpy(&ref_params, fe_params, sizeof(ref_params));
ref_params = *pipeline_params;
memcpy(&ref_params, pipeline_params, sizeof(ref_params));
copier_data->gtw_cfg.node_id &= ~SOF_IPC4_NODE_INDEX_MASK; copier_data->gtw_cfg.node_id |=
@@ -1819,7 +1814,7 @@ sof_ipc4_prepare_copier_module(struct snd_sof_widget *swidget, * In case of capture the ref_params returned will be used to * find the input configuration of the copier. */
ref_params = *fe_params;
ret = sof_ipc4_prepare_dai_copier(sdev, dai, &ref_params, dir); if (ret < 0) return ret;memcpy(&ref_params, fe_params, sizeof(ref_params));
@@ -1829,7 +1824,7 @@ sof_ipc4_prepare_copier_module(struct snd_sof_widget *swidget, * input configuration of the copier. */ if (dir == SNDRV_PCM_STREAM_PLAYBACK)
ref_params = *pipeline_params;
memcpy(&ref_params, pipeline_params, sizeof(ref_params));
break; }
@@ -1838,7 +1833,7 @@ sof_ipc4_prepare_copier_module(struct snd_sof_widget *swidget, ipc4_copier = (struct sof_ipc4_copier *)swidget->private; copier_data = &ipc4_copier->data; available_fmt = &ipc4_copier->available_fmt;
ref_params = *pipeline_params;
memcpy(&ref_params, pipeline_params, sizeof(ref_params));
break; }
On Wed, Aug 7, 2024, at 10:37, Pierre-Louis Bossart wrote:
On 8/7/24 10:02, Arnd Bergmann wrote:
From: Arnd Bergmann arnd@arndb.de
The snd_pcm_hw_params structure is really too large to fit on the stack. Because of the way that clang inlines functions, it ends up twice in one function, which exceeds the 1024 byte limit for 32-bit architecutes:
sound/soc/sof/ipc4-topology.c:1700:1: error: stack frame size (1304) exceeds limit (1024) in 'sof_ipc4_prepare_copier_module' [-Werror,-Wframe-larger-than] sof_ipc4_prepare_copier_module(struct snd_sof_widget *swidget,
From what I can tell, this was unintentional, as both
sof_ipc4_prepare_dai_copier() and sof_ipc4_prepare_copier_module() make a copy for the same purpose, but copying it once has the exact same effect.
Humm, not sure. I think the copy was intentional so that if one of the fixups fails, then the initial hw_params structure is not modified.
It's clear that one of the two copies was intentional, however the same logic exists in the caller, which passes the copied ref_params into sof_ipc4_prepare_dai_copier() instead of the live 'fe_params'. If sof_ipc4_prepare_dai_copier() fails, the copy is discarded by returning the error, and otherwise it gets passed on to sof_ipc4_init_input_audio_fmt().
Also not sure why a compiler would think inlining such a large function is a good idea?
I think clang prefers to inline large functions in order to better do larger scale optimizations. gcc starts by inlining small leaf functions and doesn't get to this one.
Note that the actual problem of having two giant structures on the stack does not change through inlining, the risk of overflowing the 8kb thread stack and the cost of maintaining these variables is the same. The only difference is that clang triggers the warning because it sees the total stack size where gcc doesn't see it.
Arnd
participants (3)
-
Arnd Bergmann
-
Arnd Bergmann
-
Pierre-Louis Bossart