[PATCH] ASoC: soc-pcm: Shrink stack frame for __soc_pcm_hw_params
Commit ac950278b087 ("ASoC: add N cpus to M codecs dai link support") added an additional local params variable in __soc_pcm_hw_params, for the CPU side of the DAI. The snd_pcm_hw_params struct is pretty large (604 bytes) and keeping two local copies of it makes the stack frame for __soc_pcm_hw_params really large. As the two copies are only used sequentially, combine them into a single local variable to shrink the stack frame.
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
---
 sound/soc/soc-pcm.c | 23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)
diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c
index 3aa6b988cb4b4..46917add10560 100644
--- a/sound/soc/soc-pcm.c
+++ b/sound/soc/soc-pcm.c
@@ -985,6 +985,7 @@ static int __soc_pcm_hw_params(struct snd_soc_pcm_runtime *rtd,
 {
 	struct snd_soc_dai *cpu_dai;
 	struct snd_soc_dai *codec_dai;
+	struct snd_pcm_hw_params tmp_params;
 	int i, ret = 0;
 
 	snd_soc_dpcm_mutex_assert_held(rtd);
@@ -998,7 +999,6 @@ static int __soc_pcm_hw_params(struct snd_soc_pcm_runtime *rtd,
 		goto out;
 
 	for_each_rtd_codec_dais(rtd, i, codec_dai) {
-		struct snd_pcm_hw_params codec_params;
 		unsigned int tdm_mask = snd_soc_dai_tdm_mask_get(codec_dai, substream->stream);
 
 		/*
@@ -1019,23 +1019,22 @@ static int __soc_pcm_hw_params(struct snd_soc_pcm_runtime *rtd,
 			continue;
 
 		/* copy params for each codec */
-		codec_params = *params;
+		tmp_params = *params;
 
 		/* fixup params based on TDM slot masks */
 		if (tdm_mask)
-			soc_pcm_codec_params_fixup(&codec_params, tdm_mask);
+			soc_pcm_codec_params_fixup(&tmp_params, tdm_mask);
 
 		ret = snd_soc_dai_hw_params(codec_dai, substream,
-					    &codec_params);
+					    &tmp_params);
 		if(ret < 0)
 			goto out;
 
-		soc_pcm_set_dai_params(codec_dai, &codec_params);
-		snd_soc_dapm_update_dai(substream, &codec_params, codec_dai);
+		soc_pcm_set_dai_params(codec_dai, &tmp_params);
+		snd_soc_dapm_update_dai(substream, &tmp_params, codec_dai);
 	}
 
 	for_each_rtd_cpu_dais(rtd, i, cpu_dai) {
-		struct snd_pcm_hw_params cpu_params;
 		unsigned int ch_mask = 0;
 		int j;
 
@@ -1047,7 +1046,7 @@ static int __soc_pcm_hw_params(struct snd_soc_pcm_runtime *rtd,
 			continue;
 
 		/* copy params for each cpu */
-		cpu_params = *params;
+		tmp_params = *params;
 
 		if (!rtd->dai_link->codec_ch_maps)
 			goto hw_params;
@@ -1062,16 +1061,16 @@ static int __soc_pcm_hw_params(struct snd_soc_pcm_runtime *rtd,
 
 		/* fixup cpu channel number */
 		if (ch_mask)
-			soc_pcm_codec_params_fixup(&cpu_params, ch_mask);
+			soc_pcm_codec_params_fixup(&tmp_params, ch_mask);
 
 hw_params:
-		ret = snd_soc_dai_hw_params(cpu_dai, substream, &cpu_params);
+		ret = snd_soc_dai_hw_params(cpu_dai, substream, &tmp_params);
 		if (ret < 0)
 			goto out;
 
 		/* store the parameters for each DAI */
-		soc_pcm_set_dai_params(cpu_dai, &cpu_params);
-		snd_soc_dapm_update_dai(substream, &cpu_params, cpu_dai);
+		soc_pcm_set_dai_params(cpu_dai, &tmp_params);
+		snd_soc_dapm_update_dai(substream, &tmp_params, cpu_dai);
 	}
 
 	ret = snd_soc_pcm_component_hw_params(substream, params);
On Wed, Aug 23, 2023 at 10:21:13AM +0100, Charles Keepax wrote:
> Commit ac950278b087 ("ASoC: add N cpus to M codecs dai link support") added an additional local params variable in __soc_pcm_hw_params, for the CPU side of the DAI. The snd_pcm_hw_params struct is pretty large (604 bytes) and keeping two local copies of it makes the stack frame for __soc_pcm_hw_params really large. As the two copies are only used sequentially, combine them into a single local variable to shrink the stack frame.
>
> Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Hmm... this might need a little more thought: it's not clear why this should change the frame size, and it only seems to change the frame size on the ARM cross compiler I am using, not on x86.
Thanks, Charles
On Wed, Aug 23, 2023 at 03:49:58PM +0000, Charles Keepax wrote:
> Hmm... this might need a little more thought: it's not clear why this should change the frame size, and it only seems to change the frame size on the ARM cross compiler I am using, not on x86.
Isn't that just going to be a function of the compiler being smart enough to work out that there aren't overlapping uses of the two variables and they can share stack space? There's no reason not to help it figure that out.
On 8/23/23 11:19, Mark Brown wrote:
> Isn't that just going to be a function of the compiler being smart enough to work out that there aren't overlapping uses of the two variables and they can share stack space? There's no reason not to help it figure that out.
One would think that compilers understand variable scope and free up the stack when leaving a for-loop scope?
On Wed, Aug 23, 2023 at 11:26:12AM -0500, Pierre-Louis Bossart wrote:
> On 8/23/23 11:19, Mark Brown wrote:
> > Isn't that just going to be a function of the compiler being smart enough to work out that there aren't overlapping uses of the two variables and they can share stack space? There's no reason not to help it figure that out.
>
> One would think that compilers understand variable scope and free up the stack when leaving a for-loop scope?
Clearly it's possible, but it's the sort of thing I can imagine is left to an optimisation pass and missed sometimes for whatever reason.
On Wed, Aug 23, 2023 at 05:19:31PM +0100, Mark Brown wrote:
> Isn't that just going to be a function of the compiler being smart enough to work out that there aren't overlapping uses of the two variables and they can share stack space? There's no reason not to help it figure that out.
Yeah, I think my only concern here was that I was no longer certain I understood what was happening. I don't think the patch can do any harm, well, except for the names being slightly less clear in the code. It is starting to look like this mostly comes down to the compiler being smart enough, although both were GCC in my case, so the difference is still a little surprising to me.
Thanks, Charles
On Wed, Aug 23, 2023 at 04:39:35PM +0000, Charles Keepax wrote:
> Yeah, I think my only concern here was that I was no longer certain I understood what was happening. I don't think the patch can do any harm, well, except for the names being slightly less clear in the code. It is starting to look like this mostly comes down to the compiler being smart enough, although both were GCC in my case, so the difference is still a little surprising to me.
Ah OK, I see what is going on here: it depends on whether you build with -Os or -O2. -O2 will merge the two variables and give a smaller stack frame; -Os does not.
I would be inclined to say merge the patch, since it does help if someone is trying to size-optimise their kernel, but I don't feel strongly. Also, I could respin to put this in the commit message if people prefer?
Thanks, Charles
On 8/24/23 04:33, Charles Keepax wrote:
> Ah OK, I see what is going on here: it depends on whether you build with -Os or -O2. -O2 will merge the two variables and give a smaller stack frame; -Os does not.
>
> I would be inclined to say merge the patch, since it does help if someone is trying to size-optimise their kernel, but I don't feel strongly. Also, I could respin to put this in the commit message if people prefer?
v2 with an updated commit message sounds good to me.
On Wed, 23 Aug 2023 10:21:13 +0100, Charles Keepax wrote:
> Commit ac950278b087 ("ASoC: add N cpus to M codecs dai link support") added an additional local params variable in __soc_pcm_hw_params, for the CPU side of the DAI. The snd_pcm_hw_params struct is pretty large (604 bytes) and keeping two local copies of it makes the stack frame for __soc_pcm_hw_params really large. As the two copies are only used sequentially, combine them into a single local variable to shrink the stack frame.
>
> [...]
Applied to
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next
Thanks!
[1/1] ASoC: soc-pcm: Shrink stack frame for __soc_pcm_hw_params
      commit: 396b907919e028d89bac912e49de014485deb8dc
All being well this means that it will be integrated into the linux-next tree (usually sometime in the next 24 hours) and sent to Linus during the next merge window (or sooner if it is a bug fix), however if problems are discovered then the patch may be dropped or reverted.
You may get further e-mails resulting from automated or manual testing and review of the tree, please engage with people reporting problems and send followup patches addressing any issues that are reported if needed.
If any updates are required or you are submitting further changes they should be sent as incremental updates against current git, existing patches will not be replaced.
Please add any relevant lists and maintainers to the CCs when replying to this mail.
Thanks, Mark
participants (3)
- Charles Keepax
- Mark Brown
- Pierre-Louis Bossart