On 21-03-23, 10:26, Bard Liao wrote:
From: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com
The existing code copies the hw_params pointer and reuses it later in .prepare, specifically to re-initialize the ALH DMA channel information that's lost in suspend-resume cycles.
This is not needed, we can directly access the information from the substream/rtd - as done for the HDAudio DAIs in sound/soc/sof/intel/hda-dai.c
In addition, using the saved pointer causes the suspend-resume test cases to fail on specific platforms, depending on which version of GCC is used. Péter Ujfalusi and I have spent long hours to root-cause this problem that was reported by the Intel CI first with 6.2-rc1 and again v6.3-rc1. In the latter case we were lucky that the problem was 100% reproducible on local test devices, and found out that adding a dev_dbg() or adding a call to usleep_range() just before accessing the saved pointer "fixed" the issue. With errors appearing just by changing the compiler version or minor changes in the code generated, clearly we have a memory management Heisenbug.
The root-cause seems to be that the hw_params pointer is not persistent. The soc-pcm code allocates the hw_params structure on the stack, and passes it to the BE dailink hw_params and DAIs hw_params. Saving such a pointer and reusing it later during the .prepare stage cannot possibly work reliably, it's broken-by-design since v5.10. It's astonishing that the problem was not seen earlier.
This simple fix will have to be back-ported to -stable, due to changes to avoid the use of the get/set_dmadata routines this patch will only apply on kernels older than v6.1.
Applied, thanks