[Sound-open-firmware] [PATCH v7 0/2] ASoC: SOF: Fix deadlock when shutdown a frozen userspace
Since: 83bfc7e793b5 ("ASoC: SOF: core: unregister clients and machine drivers in .shutdown") we wait for all the workloads to be completed during shutdown. This was done to avoid a stall once the device is started again. Unfortunately this has the side effect of stalling kexec(), if the userspace is frozen. Let's handle that case. To: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> To: Liam Girdwood <lgirdwood@gmail.com> To: Peter Ujfalusi <peter.ujfalusi@linux.intel.com> To: Bard Liao <yung-chuan.liao@linux.intel.com> To: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> To: Kai Vehmanen <kai.vehmanen@linux.intel.com> To: Daniel Baluta <daniel.baluta@nxp.com> To: Mark Brown <broonie@kernel.org> To: Jaroslav Kysela <perex@perex.cz> To: Takashi Iwai <tiwai@suse.com> To: Eric Biederman <ebiederm@xmission.com> To: Chromeos Kdump <chromeos-kdump@google.com> To: Steven Rostedt <rostedt@goodmis.org> Cc: stable@vger.kernel.org Cc: sound-open-firmware@alsa-project.org Cc: alsa-devel@alsa-project.org Cc: linux-kernel@vger.kernel.org Cc: kexec@lists.infradead.org Signed-off-by: Ricardo Ribalda <ribalda@chromium.org> --- Changes in v7: - Fix commit message (Thanks Pierre-Louis). - Link to v6: https://lore.kernel.org/r/20221127-snd-freeze-v6-0-3e90553f64a5@chromium.org Changes in v6: - Check if we are in kexec with the userspace frozen. - Link to v5: https://lore.kernel.org/r/20221127-snd-freeze-v5-0-4ededeb08ba0@chromium.org Changes in v5: - Edit subject prefix. - Link to v4: https://lore.kernel.org/r/20221127-snd-freeze-v4-0-51ca64b7f2ab@chromium.org Changes in v4: - Do not call snd_sof_machine_unregister from shutdown. - Link to v3: https://lore.kernel.org/r/20221127-snd-freeze-v3-0-a2eda731ca14@chromium.org Changes in v3: - Wrap pm_freezing in a function. - Link to v2: https://lore.kernel.org/r/20221127-snd-freeze-v2-0-d8a425ea9663@chromium.org Changes in v2: - Only use pm_freezing if CONFIG_FREEZER . - Link to v1: https://lore.kernel.org/r/20221127-snd-freeze-v1-0-57461a366ec2@chromium.org --- Ricardo Ribalda (2): kexec: Introduce kexec_with_frozen_processes ASoC: SOF: Fix deadlock when shutdown a frozen userspace include/linux/kexec.h | 3 +++ kernel/kexec_core.c | 5 +++++ sound/soc/sof/core.c | 4 +++- 3 files changed, 11 insertions(+), 1 deletion(-) --- base-commit: 4312098baf37ee17a8350725e6e0d0e8590252d4 change-id: 20221127-snd-freeze-1ee143228326 Best regards, -- Ricardo Ribalda <ribalda@chromium.org>
Drivers running .shutdown() might want to wait for userspace to complete before exiting. If userspace is frozen and we are running kexec they will stall the computer. Add a way for them to figure out if they should just skip waiting for userspace. Cc: stable@vger.kernel.org Fixes: 83bfc7e793b5 ("ASoC: SOF: core: unregister clients and machine drivers in .shutdown") Signed-off-by: Ricardo Ribalda <ribalda@chromium.org> --- include/linux/kexec.h | 3 +++ kernel/kexec_core.c | 5 +++++ 2 files changed, 8 insertions(+) diff --git a/include/linux/kexec.h b/include/linux/kexec.h index 41a686996aaa..c22711e0f7b5 100644 --- a/include/linux/kexec.h +++ b/include/linux/kexec.h @@ -426,6 +426,8 @@ extern int kexec_load_disabled; /* flag to track if kexec reboot is in progress */ extern bool kexec_in_progress; +bool kexec_with_frozen_processes(void); + int crash_shrink_memory(unsigned long new_size); ssize_t crash_get_memory_size(void); @@ -507,6 +509,7 @@ static inline void __crash_kexec(struct pt_regs *regs) { } static inline void crash_kexec(struct pt_regs *regs) { } static inline int kexec_should_crash(struct task_struct *p) { return 0; } static inline int kexec_crash_loaded(void) { return 0; } +static inline bool kexec_with_frozen_processes(void) { return false; } #define kexec_in_progress false #endif /* CONFIG_KEXEC_CORE */ diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c index ca2743f9c634..8bc8257ee7ca 100644 --- a/kernel/kexec_core.c +++ b/kernel/kexec_core.c @@ -54,6 +54,11 @@ note_buf_t __percpu *crash_notes; /* Flag to indicate we are going to kexec a new kernel */ bool kexec_in_progress = false; +bool kexec_with_frozen_processes(void) +{ + return kexec_in_progress && pm_freezing; +} +EXPORT_SYMBOL(kexec_with_frozen_processes); /* Location of the reserved area for the crash kernel */ struct resource crashk_res = { -- 2.38.1.584.g0f3c55d4c2-goog-b4-0.11.0-dev-696ae
Hi Ricardo, I love your patch! Yet something to improve: [auto build test ERROR on 4312098baf37ee17a8350725e6e0d0e8590252d4] url: https://github.com/intel-lab-lkp/linux/commits/Ricardo-Ribalda/ASoC-SOF-Fix-... base: 4312098baf37ee17a8350725e6e0d0e8590252d4 patch link: https://lore.kernel.org/r/20221127-snd-freeze-v7-1-127c582f1ca4%40chromium.o... patch subject: [PATCH v7 1/2] kexec: Introduce kexec_with_frozen_processes config: ia64-randconfig-r022-20221128 compiler: ia64-linux-gcc (GCC) 12.1.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/intel-lab-lkp/linux/commit/ddb7195380bb431ff9f4505d5fcfbe... git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Ricardo-Ribalda/ASoC-SOF-Fix-deadlock-when-shutdown-a-frozen-userspace/20221201-003214 git checkout ddb7195380bb431ff9f4505d5fcfbef756b80a3a # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=ia64 SHELL=/bin/bash If you fix the issue, kindly add following tag where applicable | Reported-by: kernel test robot <lkp@intel.com> All errors (new ones prefixed by >>): kernel/kexec_core.c: In function 'kexec_with_frozen_processes':
kernel/kexec_core.c:59:37: error: 'pm_freezing' undeclared (first use in this function); did you mean 'freezing'? 59 | return kexec_in_progress && pm_freezing; | ^~~~~~~~~~~ | freezing kernel/kexec_core.c:59:37: note: each undeclared identifier is reported only once for each function it appears in kernel/kexec_core.c:60:1: error: control reaches end of non-void function [-Werror=return-type] 60 | } | ^ cc1: some warnings being treated as errors
vim +59 kernel/kexec_core.c 56 57 bool kexec_with_frozen_processes(void) 58 {
59 return kexec_in_progress && pm_freezing; 60 } 61 EXPORT_SYMBOL(kexec_with_frozen_processes); 62
-- 0-DAY CI Kernel Test Service https://01.org/lkp
Hi Ricardo, I love your patch! Yet something to improve: [auto build test ERROR on 4312098baf37ee17a8350725e6e0d0e8590252d4] url: https://github.com/intel-lab-lkp/linux/commits/Ricardo-Ribalda/ASoC-SOF-Fix-... base: 4312098baf37ee17a8350725e6e0d0e8590252d4 patch link: https://lore.kernel.org/r/20221127-snd-freeze-v7-1-127c582f1ca4%40chromium.o... patch subject: [PATCH v7 1/2] kexec: Introduce kexec_with_frozen_processes config: s390-randconfig-r024-20221128 compiler: clang version 16.0.0 (https://github.com/llvm/llvm-project 6e4cea55f0d1104408b26ac574566a0e4de48036) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # install s390 cross compiling tool for clang build # apt-get install binutils-s390x-linux-gnu # https://github.com/intel-lab-lkp/linux/commit/ddb7195380bb431ff9f4505d5fcfbe... git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Ricardo-Ribalda/ASoC-SOF-Fix-deadlock-when-shutdown-a-frozen-userspace/20221201-003214 git checkout ddb7195380bb431ff9f4505d5fcfbef756b80a3a # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=s390 SHELL=/bin/bash If you fix the issue, kindly add following tag where applicable | Reported-by: kernel test robot <lkp@intel.com> All error/warnings (new ones prefixed by >>): In file included from kernel/kexec_core.c:14: In file included from include/linux/kexec.h:19: In file included from arch/s390/include/asm/io.h:75: include/asm-generic/io.h:547:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] val = __raw_readb(PCI_IOBASE + addr); ~~~~~~~~~~ ^ include/asm-generic/io.h:560:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr)); ~~~~~~~~~~ ^ include/uapi/linux/byteorder/big_endian.h:37:59: note: expanded from macro '__le16_to_cpu' #define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x)) ^ include/uapi/linux/swab.h:102:54: note: expanded from macro '__swab16' #define __swab16(x) (__u16)__builtin_bswap16((__u16)(x)) ^ In file included from kernel/kexec_core.c:14: In file included from include/linux/kexec.h:19: In file included from arch/s390/include/asm/io.h:75: include/asm-generic/io.h:573:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr)); ~~~~~~~~~~ ^ include/uapi/linux/byteorder/big_endian.h:35:59: note: expanded from macro '__le32_to_cpu' #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x)) ^ include/uapi/linux/swab.h:115:54: note: expanded from macro '__swab32' #define __swab32(x) (__u32)__builtin_bswap32((__u32)(x)) ^ In file included from kernel/kexec_core.c:14: In file included from include/linux/kexec.h:19: In file included from arch/s390/include/asm/io.h:75: include/asm-generic/io.h:584:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] __raw_writeb(value, PCI_IOBASE + addr); ~~~~~~~~~~ ^ include/asm-generic/io.h:594:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr); ~~~~~~~~~~ ^ include/asm-generic/io.h:604:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr); ~~~~~~~~~~ ^ include/asm-generic/io.h:692:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] readsb(PCI_IOBASE + addr, buffer, count); ~~~~~~~~~~ ^ include/asm-generic/io.h:700:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] readsw(PCI_IOBASE + addr, buffer, count); ~~~~~~~~~~ ^ include/asm-generic/io.h:708:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] readsl(PCI_IOBASE + addr, buffer, count); ~~~~~~~~~~ ^ include/asm-generic/io.h:717:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] writesb(PCI_IOBASE + addr, buffer, count); ~~~~~~~~~~ ^ include/asm-generic/io.h:726:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] writesw(PCI_IOBASE + addr, buffer, count); ~~~~~~~~~~ ^ include/asm-generic/io.h:735:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] writesl(PCI_IOBASE + addr, buffer, count); ~~~~~~~~~~ ^
kernel/kexec_core.c:59:30: error: use of undeclared identifier 'pm_freezing'; did you mean 'freezing'? return kexec_in_progress && pm_freezing; ^~~~~~~~~~~ freezing include/linux/freezer.h:75:20: note: 'freezing' declared here static inline bool freezing(struct task_struct *p) { return false; } ^ kernel/kexec_core.c:59:30: warning: address of function 'freezing' will always evaluate to 'true' [-Wpointer-bool-conversion] return kexec_in_progress && pm_freezing; ~~ ^~~~~~~~~~~ kernel/kexec_core.c:59:30: note: prefix with the address-of operator to silence this warning return kexec_in_progress && pm_freezing; ^ & 13 warnings and 1 error generated.
vim +59 kernel/kexec_core.c 56 57 bool kexec_with_frozen_processes(void) 58 {
59 return kexec_in_progress && pm_freezing; 60 } 61 EXPORT_SYMBOL(kexec_with_frozen_processes); 62
-- 0-DAY CI Kernel Test Service https://01.org/lkp
During kexec(), the userspace might frozen. Therefore we cannot wait for it to complete. During a kexec with frozen processe do not unregister the clients. This fixes: [ 84.943749] Freezing user space processes ... (elapsed 0.111 seconds) done. [ 246.784446] INFO: task kexec-lite:5123 blocked for more than 122 seconds. [ 246.819035] Call Trace: [ 246.821782] <TASK> [ 246.824186] __schedule+0x5f9/0x1263 [ 246.828231] schedule+0x87/0xc5 [ 246.831779] snd_card_disconnect_sync+0xb5/0x127 ... [ 246.889249] snd_sof_device_shutdown+0xb4/0x150 [ 246.899317] pci_device_shutdown+0x37/0x61 [ 246.903990] device_shutdown+0x14c/0x1d6 [ 246.908391] kernel_kexec+0x45/0xb9 And: [ 246.893222] INFO: task kexec-lite:4891 blocked for more than 122 seconds. [ 246.927709] Call Trace: [ 246.930461] <TASK> [ 246.932819] __schedule+0x5f9/0x1263 [ 246.936855] ? fsnotify_grab_connector+0x5c/0x70 [ 246.942045] schedule+0x87/0xc5 [ 246.945567] schedule_timeout+0x49/0xf3 [ 246.949877] wait_for_completion+0x86/0xe8 [ 246.954463] snd_card_free+0x68/0x89 ... [ 247.001080] platform_device_unregister+0x12/0x35 Cc: stable@vger.kernel.org Fixes: 83bfc7e793b5 ("ASoC: SOF: core: unregister clients and machine drivers in .shutdown") Signed-off-by: Ricardo Ribalda <ribalda@chromium.org> --- sound/soc/sof/core.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/sound/soc/sof/core.c b/sound/soc/sof/core.c index 3e6141d03770..4301f347bb90 100644 --- a/sound/soc/sof/core.c +++ b/sound/soc/sof/core.c @@ -9,6 +9,7 @@ // #include <linux/firmware.h> +#include <linux/kexec.h> #include <linux/module.h> #include <sound/soc.h> #include <sound/sof.h> @@ -484,7 +485,8 @@ int snd_sof_device_shutdown(struct device *dev) * make sure clients and machine driver(s) are unregistered to force * all userspace devices to be closed prior to the DSP shutdown sequence */ - sof_unregister_clients(sdev); + if (!kexec_with_frozen_processes()) + sof_unregister_clients(sdev); snd_sof_machine_unregister(sdev, pdata); -- 2.38.1.584.g0f3c55d4c2-goog-b4-0.11.0-dev-696ae
Hi, On Wed, 30 Nov 2022, Ricardo Ribalda wrote:
During kexec(), the userspace might frozen. Therefore we cannot wait for it to complete. [...] --- a/sound/soc/sof/core.c +++ b/sound/soc/sof/core.c @@ -9,6 +9,7 @@ //
#include <linux/firmware.h> +#include <linux/kexec.h> #include <linux/module.h> #include <sound/soc.h> #include <sound/sof.h> @@ -484,7 +485,8 @@ int snd_sof_device_shutdown(struct device *dev) * make sure clients and machine driver(s) are unregistered to force * all userspace devices to be closed prior to the DSP shutdown sequence */ - sof_unregister_clients(sdev); + if (!kexec_with_frozen_processes()) + sof_unregister_clients(sdev);
snd_sof_machine_unregister(sdev, pdata);
I think the case you hit was specifically snd_card_disconnect_sync() that gets called via snd_sof_machine_unregister(), right, so you'd have to skip both sof_unregister_clients() and the machine_unregister(). Skipping ok might be an ok solution here. There's clearly a problem and we cannot just drop these calls in the general case (when we are going to S5), but in the specific case of kexec, this is probably safe. And I agree one way or another this needs to be fixed. Pierre and others what do you think? Br, Kai
participants (3)
-
Kai Vehmanen -
kernel test robot -
Ricardo Ribalda