[alsa-devel] [PATCH v2 00/12] ASoC: SOF: stability fixes
No new functionality here but fixes to the IPC, memory allocation, workqueues, control, runtime_pm and HDaudio support. These issues were identified during the integration/productization of new platforms.
I added a 'Fixes' tag to make sure backports/stable can pick up these patches, in case it's too late for them to land in 5.2
The next batch for SOF will be a set of new capabilities for trace, IPC test, Hdaudio support, they will be submitted after additional tests and once these fixes are reviewed/merged.
Thanks!
Changes since v1: Feedback from Takashi: use spin_lock_irq instead of spin_lock_irq_save Added Takashi's Reviewed-by tag
Amadeusz Sławiński (1): ALSA: hdac: fix memory release for SST and SOF drivers
Guennadi Liakhovetski (1): ASoC: SOF: ipc: fix a race, leading to IPC timeouts
Keyon Jie (1): ASoC: SOF: control: correct the copy size for bytes kcontrol put
Libin Yang (2): ASoC: SOF: pcm: clear hw_params_upon_resume flag correctly ASoC: SOF: Intel: hda-codec: fix memory allocation
Pierre-Louis Bossart (2): ASoC: SOF: core: fix error handling with the probe workqueue ASoC: SOF: pcm: remove warning - initialize workqueue on open
Ranjani Sridharan (3): ASoC: SOF: core: remove DSP after unregistering machine driver ASoC: SOF: core: remove snd_soc_unregister_component in case of error ASoC: SOF: pcm: remove runtime PM calls during pcm open/close
Zhu Yingjiang (2): ASoC: SOF: Intel: hda: fix the hda init chip ASoC: SOF: Intel: hda: use the defined ppcap functions
sound/hda/ext/hdac_ext_bus.c | 1 - sound/soc/sof/control.c | 9 +-- sound/soc/sof/core.c | 29 +++++++-- sound/soc/sof/intel/bdw.c | 10 +-- sound/soc/sof/intel/byt.c | 11 ++-- sound/soc/sof/intel/cnl.c | 5 ++ sound/soc/sof/intel/hda-codec.c | 6 +- sound/soc/sof/intel/hda-ctrl.c | 102 +++++++++++++++++++++++++++--- sound/soc/sof/intel/hda-ipc.c | 18 +++++- sound/soc/sof/intel/hda.c | 109 +++++++------------------------- sound/soc/sof/ipc.c | 13 ---- sound/soc/sof/pcm.c | 37 ++--------- 12 files changed, 184 insertions(+), 166 deletions(-)
From: Ranjani Sridharan ranjani.sridharan@linux.intel.com
snd_sof_remove() disables the DSP and unmaps the DSP BAR. Removing topology after disabling the DSP results in a kernel panic while unloading the pipeline widget. This is because pipeline widget unload attempts to power down the core it is scheduled on by accessing the DSP registers.
So, the suggested fix here is to unregister the machine driver first to remove the topology and then disable the DSP to avoid the situation described above.
Note that the kernel panic only happens in cases where the HDaudio link is not managed by the hdac library, e.g. no codec or when HDMI is not supported. When the hdac library is used, snd_sof_remove() calls snd_hdac_ext_bus_device_remove() to remove the codec which unregisters the component driver thereby also removing the topology before the DSP is disabled.
Fixes: c16211d6226 ("ASoC: SOF: Add Sound Open Firmware driver core") Reviewed-by: Takashi Iwai tiwai@suse.de Signed-off-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com --- sound/soc/sof/core.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/sound/soc/sof/core.c b/sound/soc/sof/core.c index 32105e0fabe8..0bc4a8472c10 100644 --- a/sound/soc/sof/core.c +++ b/sound/soc/sof/core.c @@ -484,7 +484,6 @@ int snd_sof_device_remove(struct device *dev) snd_sof_ipc_free(sdev); snd_sof_free_debug(sdev); snd_sof_free_trace(sdev); - snd_sof_remove(sdev);
/* * Unregister machine driver. This will unbind the snd_card which @@ -494,6 +493,14 @@ int snd_sof_device_remove(struct device *dev) if (!IS_ERR_OR_NULL(pdata->pdev_mach)) platform_device_unregister(pdata->pdev_mach);
+ /* + * Unregistering the machine driver results in unloading the topology. + * Some widgets, ex: scheduler, attempt to power down the core they are + * scheduled on, when they are unloaded. Therefore, the DSP must be + * removed only after the topology has been unloaded. + */ + snd_sof_remove(sdev); + /* release firmware */ release_firmware(pdata->fw); pdata->fw = NULL;
From: Ranjani Sridharan ranjani.sridharan@linux.intel.com
No need to call snd_soc_unregister_component in case of error because the component device is resource-managed.
Fixes: c16211d6226 ("ASoC: SOF: Add Sound Open Firmware driver core") Reviewed-by: Takashi Iwai tiwai@suse.de Signed-off-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com --- sound/soc/sof/core.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/sound/soc/sof/core.c b/sound/soc/sof/core.c index 0bc4a8472c10..693ad83bffc9 100644 --- a/sound/soc/sof/core.c +++ b/sound/soc/sof/core.c @@ -382,7 +382,7 @@ static int sof_probe_continue(struct snd_sof_dev *sdev)
if (IS_ERR(plat_data->pdev_mach)) { ret = PTR_ERR(plat_data->pdev_mach); - goto comp_err; + goto fw_run_err; }
dev_dbg(sdev->dev, "created machine %s\n", @@ -393,8 +393,6 @@ static int sof_probe_continue(struct snd_sof_dev *sdev)
return 0;
-comp_err: - snd_soc_unregister_component(sdev->dev); fw_run_err: snd_sof_fw_unload(sdev); fw_load_err:
In some configurations, it's a requirement to split the probe in two, with a second part handled in a workqueue (e.g. for HDMI support which depends on the DRM modules).
SOF already handles these configurations but the error flow is incorrect. When an error occurs in the workqueue, the probe has technically already completed. If we release the resources on errors, this generates kernel oops/use-after-free when the resources are released a second time on module removal.
GitHub issue: https://github.com/thesofproject/linux/issues/945 Fixes: c16211d6226 ("ASoC: SOF: Add Sound Open Firmware driver core") Reviewed-by: Takashi Iwai tiwai@suse.de Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com --- sound/soc/sof/core.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+)
diff --git a/sound/soc/sof/core.c b/sound/soc/sof/core.c index 693ad83bffc9..5beda47cdf9f 100644 --- a/sound/soc/sof/core.c +++ b/sound/soc/sof/core.c @@ -393,6 +393,7 @@ static int sof_probe_continue(struct snd_sof_dev *sdev)
return 0;
+#if !IS_ENABLED(CONFIG_SND_SOC_SOF_PROBE_WORK_QUEUE) fw_run_err: snd_sof_fw_unload(sdev); fw_load_err: @@ -401,6 +402,21 @@ static int sof_probe_continue(struct snd_sof_dev *sdev) snd_sof_free_debug(sdev); dbg_err: snd_sof_remove(sdev); +#else + + /* + * when the probe_continue is handled in a work queue, the + * probe does not fail so we don't release resources here. + * They will be released with an explicit call to + * snd_sof_device_remove() when the PCI/ACPI device is removed + */ + +fw_run_err: +fw_load_err: +ipc_err: +dbg_err: + +#endif
return ret; }
From: Ranjani Sridharan ranjani.sridharan@linux.intel.com
pm_runtime_get_sync()/pm_runtime_put_autosuspend() calls are already invoked by the ASoC core in soc_pcm_open() and soc_pcm_close(). So the SOF component driver does not need to call them again.
Fixes: 868bd00f495 ("ASoC: SOF: Add PCM operations support") Reviewed-by: Takashi Iwai tiwai@suse.de Signed-off-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com --- sound/soc/sof/pcm.c | 29 ++--------------------------- 1 file changed, 2 insertions(+), 27 deletions(-)
diff --git a/sound/soc/sof/pcm.c b/sound/soc/sof/pcm.c index 649968841dad..4f536c0de0a5 100644 --- a/sound/soc/sof/pcm.c +++ b/sound/soc/sof/pcm.c @@ -416,7 +416,6 @@ static int sof_pcm_open(struct snd_pcm_substream *substream) struct snd_sof_pcm *spcm; struct snd_soc_tplg_stream_caps *caps; int ret; - int err;
/* nothing to do for BE */ if (rtd->dai_link->no_pcm) @@ -434,14 +433,6 @@ static int sof_pcm_open(struct snd_pcm_substream *substream)
caps = &spcm->pcm.caps[substream->stream];
- ret = pm_runtime_get_sync(sdev->dev); - if (ret < 0) { - dev_err(sdev->dev, "error: pcm open failed to resume %d\n", - ret); - pm_runtime_put_noidle(sdev->dev); - return ret; - } - /* set any runtime constraints based on topology */ snd_pcm_hw_constraint_step(substream->runtime, 0, SNDRV_PCM_HW_PARAM_BUFFER_BYTES, @@ -485,17 +476,8 @@ static int sof_pcm_open(struct snd_pcm_substream *substream) spcm->stream[substream->stream].substream = substream;
ret = snd_sof_pcm_platform_open(sdev, substream); - if (ret < 0) { - dev_err(sdev->dev, "error: pcm open failed %d\n", - ret); - - pm_runtime_mark_last_busy(sdev->dev); - - err = pm_runtime_put_autosuspend(sdev->dev); - if (err < 0) - dev_err(sdev->dev, "error: pcm close failed to idle %d\n", - err); - } + if (ret < 0) + dev_err(sdev->dev, "error: pcm open failed %d\n", ret);
return ret; } @@ -530,13 +512,6 @@ static int sof_pcm_close(struct snd_pcm_substream *substream) */ }
- pm_runtime_mark_last_busy(sdev->dev); - - err = pm_runtime_put_autosuspend(sdev->dev); - if (err < 0) - dev_err(sdev->dev, "error: pcm close failed to idle %d\n", - err); - return 0; }
From: Libin Yang libin.yang@intel.com
sof_pcm_hw_params() can only be called once to setup the FW hw_params. So after calling sof_pcm_hw_params(), hw_params_upon_resume flag must be cleared to avoid multiple invoking sof_pcm_hw_params() by prepare.
For example, after resume, there is an xrun happened, prepare() will be called. As the hw_params_upon_resume flag is not cleared, sof_pcm_hw_params() will be called and this will cause IPC timeout.
This patch fixes such issues.
Fixes: 868bd00f495 ("ASoC: SOF: Add PCM operations support") Reviewed-by: Takashi Iwai tiwai@suse.de Signed-off-by: Libin Yang libin.yang@intel.com Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com --- sound/soc/sof/pcm.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/sound/soc/sof/pcm.c b/sound/soc/sof/pcm.c index 4f536c0de0a5..e94892053690 100644 --- a/sound/soc/sof/pcm.c +++ b/sound/soc/sof/pcm.c @@ -214,6 +214,9 @@ static int sof_pcm_hw_params(struct snd_pcm_substream *substream, INIT_WORK(&spcm->stream[substream->stream].period_elapsed_work, sof_pcm_period_elapsed_work);
+ /* clear hw_params_upon_resume flag */ + spcm->hw_params_upon_resume[substream->stream] = 0; + return ret; }
@@ -428,9 +431,6 @@ static int sof_pcm_open(struct snd_pcm_substream *substream) dev_dbg(sdev->dev, "pcm: open stream %d dir %d\n", spcm->pcm.pcm_id, substream->stream);
- /* clear hw_params_upon_resume flag */ - spcm->hw_params_upon_resume[substream->stream] = 0; - caps = &spcm->pcm.caps[substream->stream];
/* set any runtime constraints based on topology */
If the SOF hw_params() fail, typically with an IPC error thrown by the firmware, the period_elapsed workqueue is not initialized, but we still cancel it in hw_free(), which results in a kernel warning.
Move the initialization to the .open callback. Tested on Broadwell (Samus) and IceLake.
Fixes: e2803e610ae ("ASoC: SOF: PCM: add period_elapsed work to fix race condition in interrupt context")
GitHub issue: https://github.com/thesofproject/linux/issues/932 Reviewed-by: Takashi Iwai tiwai@suse.de Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com --- sound/soc/sof/pcm.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/sound/soc/sof/pcm.c b/sound/soc/sof/pcm.c index e94892053690..6dc5f97be0bc 100644 --- a/sound/soc/sof/pcm.c +++ b/sound/soc/sof/pcm.c @@ -211,9 +211,6 @@ static int sof_pcm_hw_params(struct snd_pcm_substream *substream, /* save pcm hw_params */ memcpy(&spcm->params[substream->stream], params, sizeof(*params));
- INIT_WORK(&spcm->stream[substream->stream].period_elapsed_work, - sof_pcm_period_elapsed_work); - /* clear hw_params_upon_resume flag */ spcm->hw_params_upon_resume[substream->stream] = 0;
@@ -431,6 +428,9 @@ static int sof_pcm_open(struct snd_pcm_substream *substream) dev_dbg(sdev->dev, "pcm: open stream %d dir %d\n", spcm->pcm.pcm_id, substream->stream);
+ INIT_WORK(&spcm->stream[substream->stream].period_elapsed_work, + sof_pcm_period_elapsed_work); + caps = &spcm->pcm.caps[substream->stream];
/* set any runtime constraints based on topology */
From: Keyon Jie yang.jie@linux.intel.com
The size for the bytes kcontrol should include the abi header, that is, data->size + sizeof(*data), it is also aligned with get method after this change.
Fixes: c3078f53970 ("ASoC: SOF: Add Sound Open Firmware KControl support") Reviewed-by: Takashi Iwai tiwai@suse.de Signed-off-by: Keyon Jie yang.jie@linux.intel.com Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com --- sound/soc/sof/control.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/sound/soc/sof/control.c b/sound/soc/sof/control.c index 11762c4580f1..84e2cbfbbcbb 100644 --- a/sound/soc/sof/control.c +++ b/sound/soc/sof/control.c @@ -349,6 +349,7 @@ int snd_sof_bytes_put(struct snd_kcontrol *kcontrol, struct snd_sof_dev *sdev = scontrol->sdev; struct sof_ipc_ctrl_data *cdata = scontrol->control_data; struct sof_abi_hdr *data = cdata->data; + size_t size = data->size + sizeof(*data); int ret, err;
if (be->max > sizeof(ucontrol->value.bytes.data)) { @@ -358,10 +359,10 @@ int snd_sof_bytes_put(struct snd_kcontrol *kcontrol, return -EINVAL; }
- if (data->size > be->max) { + if (size > be->max) { dev_err_ratelimited(sdev->dev, - "error: size too big %d bytes max is %d\n", - data->size, be->max); + "error: size too big %zu bytes max is %d\n", + size, be->max); return -EINVAL; }
@@ -375,7 +376,7 @@ int snd_sof_bytes_put(struct snd_kcontrol *kcontrol, }
/* copy from kcontrol */ - memcpy(data, ucontrol->value.bytes.data, data->size); + memcpy(data, ucontrol->value.bytes.data, size);
/* notify DSP of byte control updates */ snd_sof_ipc_set_get_comp_data(sdev->ipc, scontrol,
From: Guennadi Liakhovetski guennadi.liakhovetski@linux.intel.com
Currently on all supported platforms the IPC IRQ thread first signals the sender when an IPC response is received from the DSP, then unmasks the IPC interrupt. Those actions are performed without holding any locks, so the thread can be interrupted between them. IPC timeouts have been observed in such scenarios: if the sender is woken up and it proceeds with sending the next message without unmasking the IPC interrupt, it can miss the next response. This patch takes a spin-lock to prevent the IRQ thread from being preempted at that point. It also makes sure, that the next IPC transmission by the host cannot take place before the IRQ thread has finished updating all the required IPC registers.
Fixes: 53e0c72d98b ("ASoC: SOF: Add support for IPC IO between DSP and Host") Reviewed-by: Takashi Iwai tiwai@suse.de Signed-off-by: Guennadi Liakhovetski guennadi.liakhovetski@linux.intel.com Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com --- sound/soc/sof/intel/bdw.c | 10 +++++----- sound/soc/sof/intel/byt.c | 11 ++++++----- sound/soc/sof/intel/cnl.c | 5 +++++ sound/soc/sof/intel/hda-ipc.c | 18 +++++++++++++++--- sound/soc/sof/ipc.c | 13 ------------- 5 files changed, 31 insertions(+), 26 deletions(-)
diff --git a/sound/soc/sof/intel/bdw.c b/sound/soc/sof/intel/bdw.c index 065cb868bdfa..61353e710738 100644 --- a/sound/soc/sof/intel/bdw.c +++ b/sound/soc/sof/intel/bdw.c @@ -278,11 +278,14 @@ static irqreturn_t bdw_irq_thread(int irq, void *context) /* reply message from DSP */ if (ipcx & SHIM_IPCX_DONE && !(imrx & SHIM_IMRX_DONE)) { + /* Mask Done interrupt before return */ snd_sof_dsp_update_bits_unlocked(sdev, BDW_DSP_BAR, SHIM_IMRX, SHIM_IMRX_DONE, SHIM_IMRX_DONE);
+ spin_lock_irq(&sdev->ipc_lock); + /* * handle immediate reply from DSP core. If the msg is * found, set done bit in cmd_done which is called at the @@ -294,6 +297,8 @@ static irqreturn_t bdw_irq_thread(int irq, void *context) snd_sof_ipc_reply(sdev, ipcx);
bdw_dsp_done(sdev); + + spin_unlock_irq(&sdev->ipc_lock); }
ipcd = snd_sof_dsp_read(sdev, BDW_DSP_BAR, SHIM_IPCD); @@ -485,7 +490,6 @@ static void bdw_get_reply(struct snd_sof_dev *sdev) { struct snd_sof_ipc_msg *msg = sdev->msg; struct sof_ipc_reply reply; - unsigned long flags; int ret = 0;
/* @@ -501,8 +505,6 @@ static void bdw_get_reply(struct snd_sof_dev *sdev) /* get reply */ sof_mailbox_read(sdev, sdev->host_box.offset, &reply, sizeof(reply));
- spin_lock_irqsave(&sdev->ipc_lock, flags); - if (reply.error < 0) { memcpy(msg->reply_data, &reply, sizeof(reply)); ret = reply.error; @@ -521,8 +523,6 @@ static void bdw_get_reply(struct snd_sof_dev *sdev) }
msg->reply_error = ret; - - spin_unlock_irqrestore(&sdev->ipc_lock, flags); }
static void bdw_host_done(struct snd_sof_dev *sdev) diff --git a/sound/soc/sof/intel/byt.c b/sound/soc/sof/intel/byt.c index 7bf9143d3106..de8893d837cd 100644 --- a/sound/soc/sof/intel/byt.c +++ b/sound/soc/sof/intel/byt.c @@ -324,11 +324,15 @@ static irqreturn_t byt_irq_thread(int irq, void *context) /* reply message from DSP */ if (ipcx & SHIM_BYT_IPCX_DONE && !(imrx & SHIM_IMRX_DONE)) { + /* Mask Done interrupt before first */ snd_sof_dsp_update_bits64_unlocked(sdev, BYT_DSP_BAR, SHIM_IMRX, SHIM_IMRX_DONE, SHIM_IMRX_DONE); + + spin_lock_irq(&sdev->ipc_lock); + /* * handle immediate reply from DSP core. If the msg is * found, set done bit in cmd_done which is called at the @@ -340,6 +344,8 @@ static irqreturn_t byt_irq_thread(int irq, void *context) snd_sof_ipc_reply(sdev, ipcx);
byt_dsp_done(sdev); + + spin_unlock_irq(&sdev->ipc_lock); }
/* new message from DSP */ @@ -383,7 +389,6 @@ static void byt_get_reply(struct snd_sof_dev *sdev) { struct snd_sof_ipc_msg *msg = sdev->msg; struct sof_ipc_reply reply; - unsigned long flags; int ret = 0;
/* @@ -399,8 +404,6 @@ static void byt_get_reply(struct snd_sof_dev *sdev) /* get reply */ sof_mailbox_read(sdev, sdev->host_box.offset, &reply, sizeof(reply));
- spin_lock_irqsave(&sdev->ipc_lock, flags); - if (reply.error < 0) { memcpy(msg->reply_data, &reply, sizeof(reply)); ret = reply.error; @@ -419,8 +422,6 @@ static void byt_get_reply(struct snd_sof_dev *sdev) }
msg->reply_error = ret; - - spin_unlock_irqrestore(&sdev->ipc_lock, flags); }
static void byt_host_done(struct snd_sof_dev *sdev) diff --git a/sound/soc/sof/intel/cnl.c b/sound/soc/sof/intel/cnl.c index 08a1a3d3c08d..3a83e145ab8e 100644 --- a/sound/soc/sof/intel/cnl.c +++ b/sound/soc/sof/intel/cnl.c @@ -50,6 +50,7 @@ static irqreturn_t cnl_ipc_irq_thread(int irq, void *context) /* reply message from DSP */ if (hipcida & CNL_DSP_REG_HIPCIDA_DONE && hipcctl & CNL_DSP_REG_HIPCCTL_DONE) { + hipci = snd_sof_dsp_read(sdev, HDA_DSP_BAR, CNL_DSP_REG_HIPCIDR); msg_ext = hipci & CNL_DSP_REG_HIPCIDR_MSG_MASK; @@ -64,6 +65,8 @@ static irqreturn_t cnl_ipc_irq_thread(int irq, void *context) CNL_DSP_REG_HIPCCTL, CNL_DSP_REG_HIPCCTL_DONE, 0);
+ spin_lock_irq(&sdev->ipc_lock); + /* handle immediate reply from DSP core */ hda_dsp_ipc_get_reply(sdev); snd_sof_ipc_reply(sdev, msg); @@ -75,6 +78,8 @@ static irqreturn_t cnl_ipc_irq_thread(int irq, void *context)
cnl_ipc_dsp_done(sdev);
+ spin_unlock_irq(&sdev->ipc_lock); + ret = IRQ_HANDLED; }
diff --git a/sound/soc/sof/intel/hda-ipc.c b/sound/soc/sof/intel/hda-ipc.c index 73ead7070cde..f841032e88da 100644 --- a/sound/soc/sof/intel/hda-ipc.c +++ b/sound/soc/sof/intel/hda-ipc.c @@ -72,7 +72,6 @@ void hda_dsp_ipc_get_reply(struct snd_sof_dev *sdev) struct snd_sof_ipc_msg *msg = sdev->msg; struct sof_ipc_reply reply; struct sof_ipc_cmd_hdr *hdr; - unsigned long flags; int ret = 0;
/* @@ -84,7 +83,6 @@ void hda_dsp_ipc_get_reply(struct snd_sof_dev *sdev) dev_warn(sdev->dev, "unexpected ipc interrupt raised!\n"); return; } - spin_lock_irqsave(&sdev->ipc_lock, flags);
hdr = msg->msg_data; if (hdr->cmd == (SOF_IPC_GLB_PM_MSG | SOF_IPC_PM_CTX_SAVE)) { @@ -123,7 +121,6 @@ void hda_dsp_ipc_get_reply(struct snd_sof_dev *sdev) out: msg->reply_error = ret;
- spin_unlock_irqrestore(&sdev->ipc_lock, flags); }
static bool hda_dsp_ipc_is_sof(uint32_t msg) @@ -158,6 +155,7 @@ irqreturn_t hda_dsp_ipc_irq_thread(int irq, void *context) /* is this a reply message from the DSP */ if (hipcie & HDA_DSP_REG_HIPCIE_DONE && hipcctl & HDA_DSP_REG_HIPCCTL_DONE) { + hipci = snd_sof_dsp_read(sdev, HDA_DSP_BAR, HDA_DSP_REG_HIPCI); msg = hipci & HDA_DSP_REG_HIPCI_MSG_MASK; @@ -172,6 +170,18 @@ irqreturn_t hda_dsp_ipc_irq_thread(int irq, void *context) HDA_DSP_REG_HIPCCTL, HDA_DSP_REG_HIPCCTL_DONE, 0);
+ /* + * Make sure the interrupt thread cannot be preempted between + * waking up the sender and re-enabling the interrupt. Also + * protect against a theoretical race with sof_ipc_tx_message(): + * if the DSP is fast enough to receive an IPC message, reply to + * it, and the host interrupt processing calls this function on + * a different core from the one, where the sending is taking + * place, the message might not yet be marked as expecting a + * reply. + */ + spin_lock_irq(&sdev->ipc_lock); + /* handle immediate reply from DSP core - ignore ROM messages */ if (hda_dsp_ipc_is_sof(msg)) { hda_dsp_ipc_get_reply(sdev); @@ -187,6 +197,8 @@ irqreturn_t hda_dsp_ipc_irq_thread(int irq, void *context) /* set the done bit */ hda_dsp_ipc_dsp_done(sdev);
+ spin_unlock_irq(&sdev->ipc_lock); + ret = IRQ_HANDLED; }
diff --git a/sound/soc/sof/ipc.c b/sound/soc/sof/ipc.c index 894e68cbd69d..10304a90cf25 100644 --- a/sound/soc/sof/ipc.c +++ b/sound/soc/sof/ipc.c @@ -308,19 +308,8 @@ EXPORT_SYMBOL(sof_ipc_tx_message); int snd_sof_ipc_reply(struct snd_sof_dev *sdev, u32 msg_id) { struct snd_sof_ipc_msg *msg = &sdev->ipc->msg; - unsigned long flags; - - /* - * Protect against a theoretical race with sof_ipc_tx_message(): if the - * DSP is fast enough to receive an IPC message, reply to it, and the - * host interrupt processing calls this function on a different core - * from the one, where the sending is taking place, the message might - * not yet be marked as expecting a reply. - */ - spin_lock_irqsave(&sdev->ipc_lock, flags);
if (msg->ipc_complete) { - spin_unlock_irqrestore(&sdev->ipc_lock, flags); dev_err(sdev->dev, "error: no reply expected, received 0x%x", msg_id); return -EINVAL; @@ -330,8 +319,6 @@ int snd_sof_ipc_reply(struct snd_sof_dev *sdev, u32 msg_id) msg->ipc_complete = true; wake_up(&msg->waitq);
- spin_unlock_irqrestore(&sdev->ipc_lock, flags); - return 0; } EXPORT_SYMBOL(snd_sof_ipc_reply);
Currently on all supported platforms the IPC IRQ thread first signals the sender when an IPC response is received from the DSP, then unmasks the IPC interrupt. Those actions are performed without holding any locks, so the thread can be interrupted between them. IPC timeouts have been observed in such scenarios: if the sender is woken up and it proceeds with sending the next message without unmasking the IPC interrupt, it can miss the next response. This patch takes a spin-lock to prevent the IRQ thread from being preempted at that point. It also makes sure, that the next IPC transmission by the host cannot take place before the IRQ thread has finished updating all the required IPC registers.
Signed-off-by: Guennadi Liakhovetski guennadi.liakhovetski@linux.intel.com ---
Hi Takashi,
Unfortunately there were 4 empty lines added in the version, that Pierre had sent to you, that weren't in the patch, that I have provided to him. Below the correct version - if not too late yet.
Thanks Guennadi
sound/soc/sof/intel/bdw.c | 9 ++++----- sound/soc/sof/intel/byt.c | 10 +++++----- sound/soc/sof/intel/cnl.c | 4 ++++ sound/soc/sof/intel/hda-ipc.c | 17 ++++++++++++++--- sound/soc/sof/ipc.c | 13 ------------- 5 files changed, 27 insertions(+), 26 deletions(-)
diff --git a/sound/soc/sof/intel/bdw.c b/sound/soc/sof/intel/bdw.c index 065cb86..8ff3ee5 100644 --- a/sound/soc/sof/intel/bdw.c +++ b/sound/soc/sof/intel/bdw.c @@ -283,6 +283,8 @@ static irqreturn_t bdw_irq_thread(int irq, void *context) SHIM_IMRX, SHIM_IMRX_DONE, SHIM_IMRX_DONE);
+ spin_lock_irq(&sdev->ipc_lock); + /* * handle immediate reply from DSP core. If the msg is * found, set done bit in cmd_done which is called at the @@ -294,6 +296,8 @@ static irqreturn_t bdw_irq_thread(int irq, void *context) snd_sof_ipc_reply(sdev, ipcx);
bdw_dsp_done(sdev); + + spin_unlock_irq(&sdev->ipc_lock); }
ipcd = snd_sof_dsp_read(sdev, BDW_DSP_BAR, SHIM_IPCD); @@ -485,7 +489,6 @@ static void bdw_get_reply(struct snd_sof_dev *sdev) { struct snd_sof_ipc_msg *msg = sdev->msg; struct sof_ipc_reply reply; - unsigned long flags; int ret = 0;
/* @@ -501,8 +504,6 @@ static void bdw_get_reply(struct snd_sof_dev *sdev) /* get reply */ sof_mailbox_read(sdev, sdev->host_box.offset, &reply, sizeof(reply));
- spin_lock_irqsave(&sdev->ipc_lock, flags); - if (reply.error < 0) { memcpy(msg->reply_data, &reply, sizeof(reply)); ret = reply.error; @@ -521,8 +522,6 @@ static void bdw_get_reply(struct snd_sof_dev *sdev) }
msg->reply_error = ret; - - spin_unlock_irqrestore(&sdev->ipc_lock, flags); }
static void bdw_host_done(struct snd_sof_dev *sdev) diff --git a/sound/soc/sof/intel/byt.c b/sound/soc/sof/intel/byt.c index 7bf9143..9e4c07e 100644 --- a/sound/soc/sof/intel/byt.c +++ b/sound/soc/sof/intel/byt.c @@ -329,6 +329,9 @@ static irqreturn_t byt_irq_thread(int irq, void *context) SHIM_IMRX, SHIM_IMRX_DONE, SHIM_IMRX_DONE); + + spin_lock_irq(&sdev->ipc_lock); + /* * handle immediate reply from DSP core. If the msg is * found, set done bit in cmd_done which is called at the @@ -340,6 +343,8 @@ static irqreturn_t byt_irq_thread(int irq, void *context) snd_sof_ipc_reply(sdev, ipcx);
byt_dsp_done(sdev); + + spin_unlock_irq(&sdev->ipc_lock); }
/* new message from DSP */ @@ -383,7 +388,6 @@ static void byt_get_reply(struct snd_sof_dev *sdev) { struct snd_sof_ipc_msg *msg = sdev->msg; struct sof_ipc_reply reply; - unsigned long flags; int ret = 0;
/* @@ -399,8 +403,6 @@ static void byt_get_reply(struct snd_sof_dev *sdev) /* get reply */ sof_mailbox_read(sdev, sdev->host_box.offset, &reply, sizeof(reply));
- spin_lock_irqsave(&sdev->ipc_lock, flags); - if (reply.error < 0) { memcpy(msg->reply_data, &reply, sizeof(reply)); ret = reply.error; @@ -419,8 +421,6 @@ static void byt_get_reply(struct snd_sof_dev *sdev) }
msg->reply_error = ret; - - spin_unlock_irqrestore(&sdev->ipc_lock, flags); }
static void byt_host_done(struct snd_sof_dev *sdev) diff --git a/sound/soc/sof/intel/cnl.c b/sound/soc/sof/intel/cnl.c index c059d11..e59d180 100644 --- a/sound/soc/sof/intel/cnl.c +++ b/sound/soc/sof/intel/cnl.c @@ -64,6 +64,8 @@ static irqreturn_t cnl_ipc_irq_thread(int irq, void *context) CNL_DSP_REG_HIPCCTL, CNL_DSP_REG_HIPCCTL_DONE, 0);
+ spin_lock_irq(&sdev->ipc_lock); + /* handle immediate reply from DSP core */ hda_dsp_ipc_get_reply(sdev); snd_sof_ipc_reply(sdev, msg); @@ -75,6 +77,8 @@ static irqreturn_t cnl_ipc_irq_thread(int irq, void *context)
cnl_ipc_dsp_done(sdev);
+ spin_unlock_irq(&sdev->ipc_lock); + ret = IRQ_HANDLED; }
diff --git a/sound/soc/sof/intel/hda-ipc.c b/sound/soc/sof/intel/hda-ipc.c index 73ead70..51b2851 100644 --- a/sound/soc/sof/intel/hda-ipc.c +++ b/sound/soc/sof/intel/hda-ipc.c @@ -72,7 +72,6 @@ void hda_dsp_ipc_get_reply(struct snd_sof_dev *sdev) struct snd_sof_ipc_msg *msg = sdev->msg; struct sof_ipc_reply reply; struct sof_ipc_cmd_hdr *hdr; - unsigned long flags; int ret = 0;
/* @@ -84,7 +83,6 @@ void hda_dsp_ipc_get_reply(struct snd_sof_dev *sdev) dev_warn(sdev->dev, "unexpected ipc interrupt raised!\n"); return; } - spin_lock_irqsave(&sdev->ipc_lock, flags);
hdr = msg->msg_data; if (hdr->cmd == (SOF_IPC_GLB_PM_MSG | SOF_IPC_PM_CTX_SAVE)) { @@ -123,7 +121,6 @@ void hda_dsp_ipc_get_reply(struct snd_sof_dev *sdev) out: msg->reply_error = ret;
- spin_unlock_irqrestore(&sdev->ipc_lock, flags); }
static bool hda_dsp_ipc_is_sof(uint32_t msg) @@ -172,6 +169,18 @@ irqreturn_t hda_dsp_ipc_irq_thread(int irq, void *context) HDA_DSP_REG_HIPCCTL, HDA_DSP_REG_HIPCCTL_DONE, 0);
+ /* + * Make sure the interrupt thread cannot be preempted between + * waking up the sender and re-enabling the interrupt. Also + * protect against a theoretical race with sof_ipc_tx_message(): + * if the DSP is fast enough to receive an IPC message, reply to + * it, and the host interrupt processing calls this function on + * a different core from the one, where the sending is taking + * place, the message might not yet be marked as expecting a + * reply. + */ + spin_lock_irq(&sdev->ipc_lock); + /* handle immediate reply from DSP core - ignore ROM messages */ if (hda_dsp_ipc_is_sof(msg)) { hda_dsp_ipc_get_reply(sdev); @@ -187,6 +196,8 @@ irqreturn_t hda_dsp_ipc_irq_thread(int irq, void *context) /* set the done bit */ hda_dsp_ipc_dsp_done(sdev);
+ spin_unlock_irq(&sdev->ipc_lock); + ret = IRQ_HANDLED; }
diff --git a/sound/soc/sof/ipc.c b/sound/soc/sof/ipc.c index 894e68c..10304a9 100644 --- a/sound/soc/sof/ipc.c +++ b/sound/soc/sof/ipc.c @@ -308,19 +308,8 @@ int sof_ipc_tx_message(struct snd_sof_ipc *ipc, u32 header, int snd_sof_ipc_reply(struct snd_sof_dev *sdev, u32 msg_id) { struct snd_sof_ipc_msg *msg = &sdev->ipc->msg; - unsigned long flags; - - /* - * Protect against a theoretical race with sof_ipc_tx_message(): if the - * DSP is fast enough to receive an IPC message, reply to it, and the - * host interrupt processing calls this function on a different core - * from the one, where the sending is taking place, the message might - * not yet be marked as expecting a reply. - */ - spin_lock_irqsave(&sdev->ipc_lock, flags);
if (msg->ipc_complete) { - spin_unlock_irqrestore(&sdev->ipc_lock, flags); dev_err(sdev->dev, "error: no reply expected, received 0x%x", msg_id); return -EINVAL; @@ -330,8 +319,6 @@ int snd_sof_ipc_reply(struct snd_sof_dev *sdev, u32 msg_id) msg->ipc_complete = true; wake_up(&msg->waitq);
- spin_unlock_irqrestore(&sdev->ipc_lock, flags); - return 0; } EXPORT_SYMBOL(snd_sof_ipc_reply);
On Thu, May 23, 2019 at 08:34:35AM +0200, Guennadi Liakhovetski wrote:
Currently on all supported platforms the IPC IRQ thread first signals the sender when an IPC response is received from the DSP, then unmasks the IPC interrupt. Those actions are performed without
Please don't send new versions of single patches in the middle of a series, it makes it really hard to follow what the current version of the series is. It's better to resend the whole series.
From: Zhu Yingjiang yingjiang.zhu@linux.intel.com
re-write hda_init_caps and remove the HDA reset, clean HDA streams and clear interrupt steps in hda_dsp_probe so the HDA init steps will not be called twice if the CONFIG_SND_SOC_SOF_HDA is true.
Fixes: 8a300c8fb17 ("ASoC: SOF: Intel: Add HDA controller for Intel DSP") Reviewed-by: Takashi Iwai tiwai@suse.de Signed-off-by: Zhu Yingjiang yingjiang.zhu@linux.intel.com Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com --- sound/soc/sof/intel/hda-ctrl.c | 102 ++++++++++++++++++++++++++++++--- sound/soc/sof/intel/hda.c | 99 ++++++-------------------------- 2 files changed, 112 insertions(+), 89 deletions(-)
diff --git a/sound/soc/sof/intel/hda-ctrl.c b/sound/soc/sof/intel/hda-ctrl.c index 2c3645736e1f..07bc123112c9 100644 --- a/sound/soc/sof/intel/hda-ctrl.c +++ b/sound/soc/sof/intel/hda-ctrl.c @@ -161,21 +161,105 @@ int hda_dsp_ctrl_clock_power_gating(struct snd_sof_dev *sdev, bool enable) return 0; }
-#if IS_ENABLED(CONFIG_SND_SOC_SOF_HDA) -/* - * While performing reset, controller may not come back properly and causing - * issues, so recommendation is to set CGCTL.MISCBDCGE to 0 then do reset - * (init chip) and then again set CGCTL.MISCBDCGE to 1 - */ int hda_dsp_ctrl_init_chip(struct snd_sof_dev *sdev, bool full_reset) { struct hdac_bus *bus = sof_to_bus(sdev); - int ret; + struct hdac_stream *stream; + int sd_offset, ret = 0; + + if (bus->chip_init) + return 0;
hda_dsp_ctrl_misc_clock_gating(sdev, false); - ret = snd_hdac_bus_init_chip(bus, full_reset); + + if (full_reset) { + /* clear WAKESTS */ + snd_sof_dsp_update_bits(sdev, HDA_DSP_HDA_BAR, SOF_HDA_WAKESTS, + SOF_HDA_WAKESTS_INT_MASK, + SOF_HDA_WAKESTS_INT_MASK); + + /* reset HDA controller */ + ret = hda_dsp_ctrl_link_reset(sdev, true); + if (ret < 0) { + dev_err(sdev->dev, "error: failed to reset HDA controller\n"); + return ret; + } + + usleep_range(500, 1000); + + /* exit HDA controller reset */ + ret = hda_dsp_ctrl_link_reset(sdev, false); + if (ret < 0) { + dev_err(sdev->dev, "error: failed to exit HDA controller reset\n"); + return ret; + } + + usleep_range(1000, 1200); + } + +#if IS_ENABLED(CONFIG_SND_SOC_SOF_HDA) + /* check to see if controller is ready */ + if (!snd_hdac_chip_readb(bus, GCTL)) { + dev_dbg(bus->dev, "controller not ready!\n"); + return -EBUSY; + } + + /* Accept unsolicited responses */ + snd_hdac_chip_updatel(bus, GCTL, AZX_GCTL_UNSOL, AZX_GCTL_UNSOL); + + /* detect codecs */ + if (!bus->codec_mask) { + bus->codec_mask = snd_hdac_chip_readw(bus, STATESTS); + dev_dbg(bus->dev, "codec_mask = 0x%lx\n", bus->codec_mask); + } +#endif + + /* clear stream status */ + list_for_each_entry(stream, &bus->stream_list, list) { + sd_offset = SOF_STREAM_SD_OFFSET(stream); + snd_sof_dsp_update_bits(sdev, HDA_DSP_HDA_BAR, + sd_offset + + SOF_HDA_ADSP_REG_CL_SD_STS, + SOF_HDA_CL_DMA_SD_INT_MASK, + SOF_HDA_CL_DMA_SD_INT_MASK); + } + + /* clear WAKESTS */ + snd_sof_dsp_update_bits(sdev, HDA_DSP_HDA_BAR, SOF_HDA_WAKESTS, + SOF_HDA_WAKESTS_INT_MASK, + SOF_HDA_WAKESTS_INT_MASK); + +#if IS_ENABLED(CONFIG_SND_SOC_SOF_HDA) + /* clear rirb status */ + snd_hdac_chip_writeb(bus, RIRBSTS, RIRB_INT_MASK); +#endif + + /* clear interrupt status register */ + snd_sof_dsp_write(sdev, HDA_DSP_HDA_BAR, SOF_HDA_INTSTS, + SOF_HDA_INT_CTRL_EN | SOF_HDA_INT_ALL_STREAM); + +#if IS_ENABLED(CONFIG_SND_SOC_SOF_HDA) + /* initialize the codec command I/O */ + snd_hdac_bus_init_cmd_io(bus); +#endif + + /* enable CIE and GIE interrupts */ + snd_sof_dsp_update_bits(sdev, HDA_DSP_HDA_BAR, SOF_HDA_INTCTL, + SOF_HDA_INT_CTRL_EN | SOF_HDA_INT_GLOBAL_EN, + SOF_HDA_INT_CTRL_EN | SOF_HDA_INT_GLOBAL_EN); + +#if IS_ENABLED(CONFIG_SND_SOC_SOF_HDA) + /* program the position buffer */ + if (bus->use_posbuf && bus->posbuf.addr) { + snd_hdac_chip_writel(bus, DPLBASE, (u32)bus->posbuf.addr); + snd_hdac_chip_writel(bus, DPUBASE, + upper_32_bits(bus->posbuf.addr)); + } +#endif + + bus->chip_init = true; + hda_dsp_ctrl_misc_clock_gating(sdev, true);
return ret; } -#endif diff --git a/sound/soc/sof/intel/hda.c b/sound/soc/sof/intel/hda.c index 7e3980a2f7ba..e47f03dc62f0 100644 --- a/sound/soc/sof/intel/hda.c +++ b/sound/soc/sof/intel/hda.c @@ -264,9 +264,12 @@ static const char *fixup_tplg_name(struct snd_sof_dev *sdev, return tplg_filename; }
+#endif + static int hda_init_caps(struct snd_sof_dev *sdev) { struct hdac_bus *bus = sof_to_bus(sdev); +#if IS_ENABLED(CONFIG_SND_SOC_SOF_HDA) struct hdac_ext_link *hlink; struct snd_soc_acpi_mach_params *mach_params; struct snd_soc_acpi_mach *hda_mach; @@ -274,8 +277,9 @@ static int hda_init_caps(struct snd_sof_dev *sdev) struct snd_soc_acpi_mach *mach; const char *tplg_filename; int codec_num = 0; - int ret = 0; int i; +#endif + int ret = 0;
device_disable_async_suspend(bus->dev);
@@ -283,6 +287,14 @@ static int hda_init_caps(struct snd_sof_dev *sdev) if (bus->ppcap) dev_dbg(sdev->dev, "PP capability, will probe DSP later.\n");
+ ret = hda_dsp_ctrl_init_chip(sdev, true); + if (ret < 0) { + dev_err(bus->dev, "error: init chip failed with ret: %d\n", + ret); + return ret; + } + +#if IS_ENABLED(CONFIG_SND_SOC_SOF_HDA) if (bus->mlcap) snd_hdac_ext_bus_get_ml_capabilities(bus);
@@ -293,12 +305,6 @@ static int hda_init_caps(struct snd_sof_dev *sdev) return ret; }
- ret = hda_dsp_ctrl_init_chip(sdev, true); - if (ret < 0) { - dev_err(bus->dev, "error: init chip failed with ret: %d\n", ret); - goto out; - } - /* codec detection */ if (!bus->codec_mask) { dev_info(bus->dev, "no hda codecs found!\n"); @@ -339,8 +345,10 @@ static int hda_init_caps(struct snd_sof_dev *sdev) /* use local variable for readability */ tplg_filename = pdata->tplg_filename; tplg_filename = fixup_tplg_name(sdev, tplg_filename); - if (!tplg_filename) - goto out; + if (!tplg_filename) { + hda_codec_i915_exit(sdev); + return ret; + } pdata->tplg_filename = tplg_filename; } } @@ -364,35 +372,10 @@ static int hda_init_caps(struct snd_sof_dev *sdev) */ list_for_each_entry(hlink, &bus->hlink_list, list) snd_hdac_ext_bus_link_put(bus, hlink); - - return 0; - -out: - hda_codec_i915_exit(sdev); - return ret; -} - -#else - -static int hda_init_caps(struct snd_sof_dev *sdev) -{ - /* - * set CGCTL.MISCBDCGE to 0 during reset and set back to 1 - * when reset finished. - * TODO: maybe no need for init_caps? - */ - hda_dsp_ctrl_misc_clock_gating(sdev, 0); - - /* clear WAKESTS */ - snd_sof_dsp_update_bits(sdev, HDA_DSP_HDA_BAR, SOF_HDA_WAKESTS, - SOF_HDA_WAKESTS_INT_MASK, - SOF_HDA_WAKESTS_INT_MASK); - +#endif return 0; }
-#endif - static const struct sof_intel_dsp_desc *get_chip_info(struct snd_sof_pdata *pdata) { @@ -409,9 +392,8 @@ int hda_dsp_probe(struct snd_sof_dev *sdev) struct pci_dev *pci = to_pci_dev(sdev->dev); struct sof_intel_hda_dev *hdev; struct hdac_bus *bus; - struct hdac_stream *stream; const struct sof_intel_dsp_desc *chip; - int sd_offset, ret = 0; + int ret = 0;
/* * detect DSP by checking class/subclass/prog-id information @@ -558,49 +540,6 @@ int hda_dsp_probe(struct snd_sof_dev *sdev) if (ret < 0) goto free_ipc_irq;
- /* reset HDA controller */ - ret = hda_dsp_ctrl_link_reset(sdev, true); - if (ret < 0) { - dev_err(sdev->dev, "error: failed to reset HDA controller\n"); - goto free_ipc_irq; - } - - /* exit HDA controller reset */ - ret = hda_dsp_ctrl_link_reset(sdev, false); - if (ret < 0) { - dev_err(sdev->dev, "error: failed to exit HDA controller reset\n"); - goto free_ipc_irq; - } - - /* clear stream status */ - list_for_each_entry(stream, &bus->stream_list, list) { - sd_offset = SOF_STREAM_SD_OFFSET(stream); - snd_sof_dsp_update_bits(sdev, HDA_DSP_HDA_BAR, - sd_offset + - SOF_HDA_ADSP_REG_CL_SD_STS, - SOF_HDA_CL_DMA_SD_INT_MASK, - SOF_HDA_CL_DMA_SD_INT_MASK); - } - - /* clear WAKESTS */ - snd_sof_dsp_update_bits(sdev, HDA_DSP_HDA_BAR, SOF_HDA_WAKESTS, - SOF_HDA_WAKESTS_INT_MASK, - SOF_HDA_WAKESTS_INT_MASK); - - /* clear interrupt status register */ - snd_sof_dsp_write(sdev, HDA_DSP_HDA_BAR, SOF_HDA_INTSTS, - SOF_HDA_INT_CTRL_EN | SOF_HDA_INT_ALL_STREAM); - - /* enable CIE and GIE interrupts */ - snd_sof_dsp_update_bits(sdev, HDA_DSP_HDA_BAR, SOF_HDA_INTCTL, - SOF_HDA_INT_CTRL_EN | SOF_HDA_INT_GLOBAL_EN, - SOF_HDA_INT_CTRL_EN | SOF_HDA_INT_GLOBAL_EN); - - /* re-enable CGCTL.MISCBDCGE after reset */ - hda_dsp_ctrl_misc_clock_gating(sdev, true); - - device_disable_async_suspend(&pci->dev); - /* enable DSP features */ snd_sof_dsp_update_bits(sdev, HDA_DSP_PP_BAR, SOF_HDA_REG_PP_PPCTL, SOF_HDA_PPCTL_GPROCEN, SOF_HDA_PPCTL_GPROCEN);
From: Zhu Yingjiang yingjiang.zhu@linux.intel.com
There are already defined ppcap and ppcap interrupt functions, use the already defined functions for easy code read.
Fixes: 8a300c8fb17 ("ASoC: SOF: Intel: Add HDA controller for Intel DSP") Reviewed-by: Takashi Iwai tiwai@suse.de Signed-off-by: Zhu Yingjiang yingjiang.zhu@linux.intel.com Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com --- sound/soc/sof/intel/hda.c | 10 +++------- 1 file changed, 3 insertions(+), 7 deletions(-)
diff --git a/sound/soc/sof/intel/hda.c b/sound/soc/sof/intel/hda.c index e47f03dc62f0..7edeee4fc74b 100644 --- a/sound/soc/sof/intel/hda.c +++ b/sound/soc/sof/intel/hda.c @@ -540,13 +540,9 @@ int hda_dsp_probe(struct snd_sof_dev *sdev) if (ret < 0) goto free_ipc_irq;
- /* enable DSP features */ - snd_sof_dsp_update_bits(sdev, HDA_DSP_PP_BAR, SOF_HDA_REG_PP_PPCTL, - SOF_HDA_PPCTL_GPROCEN, SOF_HDA_PPCTL_GPROCEN); - - /* enable DSP IRQ */ - snd_sof_dsp_update_bits(sdev, HDA_DSP_PP_BAR, SOF_HDA_REG_PP_PPCTL, - SOF_HDA_PPCTL_PIE, SOF_HDA_PPCTL_PIE); + /* enable ppcap interrupt */ + hda_dsp_ctrl_ppcap_enable(sdev, true); + hda_dsp_ctrl_ppcap_int_enable(sdev, true);
/* initialize waitq for code loading */ init_waitqueue_head(&sdev->waitq);
The patch
ASoC: SOF: Intel: hda: use the defined ppcap functions
has been applied to the asoc tree at
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-5.2
All being well this means that it will be integrated into the linux-next tree (usually sometime in the next 24 hours) and sent to Linus during the next merge window (or sooner if it is a bug fix), however if problems are discovered then the patch may be dropped or reverted.
You may get further e-mails resulting from automated or manual testing and review of the tree, please engage with people reporting problems and send followup patches addressing any issues that are reported if needed.
If any updates are required or you are submitting further changes they should be sent as incremental updates against current git, existing patches will not be replaced.
Please add any relevant lists and maintainers to the CCs when replying to this mail.
Thanks, Mark
From 1f5253b08e06bc798e5254ede22d4238e9a52af8 Mon Sep 17 00:00:00 2001
From: Zhu Yingjiang yingjiang.zhu@linux.intel.com Date: Wed, 22 May 2019 11:21:40 -0500 Subject: [PATCH] ASoC: SOF: Intel: hda: use the defined ppcap functions
There are already defined ppcap and ppcap interrupt functions, use the already defined functions for easy code read.
Fixes: 8a300c8fb17 ("ASoC: SOF: Intel: Add HDA controller for Intel DSP") Reviewed-by: Takashi Iwai tiwai@suse.de Signed-off-by: Zhu Yingjiang yingjiang.zhu@linux.intel.com Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown broonie@kernel.org --- sound/soc/sof/intel/hda.c | 10 +++------- 1 file changed, 3 insertions(+), 7 deletions(-)
diff --git a/sound/soc/sof/intel/hda.c b/sound/soc/sof/intel/hda.c index 9e2e0f21524e..faf1a8ada091 100644 --- a/sound/soc/sof/intel/hda.c +++ b/sound/soc/sof/intel/hda.c @@ -548,13 +548,9 @@ int hda_dsp_probe(struct snd_sof_dev *sdev) if (ret < 0) goto free_ipc_irq;
- /* enable DSP features */ - snd_sof_dsp_update_bits(sdev, HDA_DSP_PP_BAR, SOF_HDA_REG_PP_PPCTL, - SOF_HDA_PPCTL_GPROCEN, SOF_HDA_PPCTL_GPROCEN); - - /* enable DSP IRQ */ - snd_sof_dsp_update_bits(sdev, HDA_DSP_PP_BAR, SOF_HDA_REG_PP_PPCTL, - SOF_HDA_PPCTL_PIE, SOF_HDA_PPCTL_PIE); + /* enable ppcap interrupt */ + hda_dsp_ctrl_ppcap_enable(sdev, true); + hda_dsp_ctrl_ppcap_int_enable(sdev, true);
/* initialize waitq for code loading */ init_waitqueue_head(&sdev->waitq);
From: Amadeusz Sławiński amadeuszx.slawinski@linux.intel.com
During the integration of HDaudio support, we changed the way in which we get hdev in snd_hdac_ext_bus_device_init() to use one preallocated with devm_kzalloc(), however it still left kfree(hdev) in snd_hdac_ext_bus_device_exit(). It leads to oopses when trying to rmmod and modprobe. Fix it, by just removing kfree call.
SOF also uses some of the snd_hdac_ functions for HDAudio support but allocated the memory with kzalloc. A matching fix is provided separately to align all users of the snd_hdac_ library.
Fixes: 6298542fa33b ("ALSA: hdac: remove memory allocation from snd_hdac_ext_bus_device_init") Reviewed-by: Takashi Iwai tiwai@suse.de Signed-off-by: Amadeusz Sławiński amadeuszx.slawinski@linux.intel.com Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com --- sound/hda/ext/hdac_ext_bus.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/sound/hda/ext/hdac_ext_bus.c b/sound/hda/ext/hdac_ext_bus.c index ec7715c6b0c0..c147ebe542da 100644 --- a/sound/hda/ext/hdac_ext_bus.c +++ b/sound/hda/ext/hdac_ext_bus.c @@ -172,7 +172,6 @@ EXPORT_SYMBOL_GPL(snd_hdac_ext_bus_device_init); void snd_hdac_ext_bus_device_exit(struct hdac_device *hdev) { snd_hdac_device_exit(hdev); - kfree(hdev); } EXPORT_SYMBOL_GPL(snd_hdac_ext_bus_device_exit);
From: Libin Yang libin.yang@intel.com
Align all users of the hdac library to use devm_kzalloc.
Note for backports/stable: the patch ("ALSA: hdac: fix memory release for SST and SOF drivers") needs to be applied as well.
Fixes: 5507b8103e26 ("ASoC: SOF: Intel: Add support for HDAudio codecs") Reviewed-by: Takashi Iwai tiwai@suse.de Signed-off-by: Libin Yang libin.yang@intel.com Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com --- sound/soc/sof/intel/hda-codec.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/sound/soc/sof/intel/hda-codec.c b/sound/soc/sof/intel/hda-codec.c index b8b37f082309..0d8437b080bf 100644 --- a/sound/soc/sof/intel/hda-codec.c +++ b/sound/soc/sof/intel/hda-codec.c @@ -62,8 +62,7 @@ static int hda_codec_probe(struct snd_sof_dev *sdev, int address) address, resp);
#if IS_ENABLED(CONFIG_SND_SOC_SOF_HDA_AUDIO_CODEC) - /* snd_hdac_ext_bus_device_exit will use kfree to free hdev */ - hda_priv = kzalloc(sizeof(*hda_priv), GFP_KERNEL); + hda_priv = devm_kzalloc(sdev->dev, sizeof(*hda_priv), GFP_KERNEL); if (!hda_priv) return -ENOMEM;
@@ -82,8 +81,7 @@ static int hda_codec_probe(struct snd_sof_dev *sdev, int address)
return 0; #else - /* snd_hdac_ext_bus_device_exit will use kfree to free hdev */ - hdev = kzalloc(sizeof(*hdev), GFP_KERNEL); + hdev = devm_kzalloc(sdev->dev, sizeof(*hdev), GFP_KERNEL); if (!hdev) return -ENOMEM;
Please hold on this patch. It seems there is some corner case failed because of this patch.
Regards, Libin
-----Original Message----- From: Pierre-Louis Bossart [mailto:pierre-louis.bossart@linux.intel.com] Sent: Thursday, May 23, 2019 12:22 AM To: alsa-devel@alsa-project.org Cc: tiwai@suse.de; broonie@kernel.org; Yang, Libin libin.yang@intel.com; Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Subject: [PATCH v2 12/12] ASoC: SOF: Intel: hda-codec: fix memory allocation
From: Libin Yang libin.yang@intel.com
Align all users of the hdac library to use devm_kzalloc.
Note for backports/stable: the patch ("ALSA: hdac: fix memory release for SST and SOF drivers") needs to be applied as well.
Fixes: 5507b8103e26 ("ASoC: SOF: Intel: Add support for HDAudio codecs") Reviewed-by: Takashi Iwai tiwai@suse.de Signed-off-by: Libin Yang libin.yang@intel.com Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com
sound/soc/sof/intel/hda-codec.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/sound/soc/sof/intel/hda-codec.c b/sound/soc/sof/intel/hda- codec.c index b8b37f082309..0d8437b080bf 100644 --- a/sound/soc/sof/intel/hda-codec.c +++ b/sound/soc/sof/intel/hda-codec.c @@ -62,8 +62,7 @@ static int hda_codec_probe(struct snd_sof_dev *sdev, int address) address, resp);
#if IS_ENABLED(CONFIG_SND_SOC_SOF_HDA_AUDIO_CODEC)
- /* snd_hdac_ext_bus_device_exit will use kfree to free hdev */
- hda_priv = kzalloc(sizeof(*hda_priv), GFP_KERNEL);
- hda_priv = devm_kzalloc(sdev->dev, sizeof(*hda_priv), GFP_KERNEL); if (!hda_priv) return -ENOMEM;
@@ -82,8 +81,7 @@ static int hda_codec_probe(struct snd_sof_dev *sdev, int address)
return 0; #else
- /* snd_hdac_ext_bus_device_exit will use kfree to free hdev */
- hdev = kzalloc(sizeof(*hdev), GFP_KERNEL);
- hdev = devm_kzalloc(sdev->dev, sizeof(*hdev), GFP_KERNEL); if (!hdev) return -ENOMEM;
-- 2.20.1
Please let me describe the issue here.
The test case is: 1) Unload module with script "sudo ./sof_remove.sh" , 2) reload module with script "sudo ./sof_insert.sh"
After several rounds of removing and inserting kernel modules, system will complain like below: "BUG: unable to handle kernel paging request at 000000292a282031"
sof_remove.sh is used to remove the modules: remove_module sof_pci_dev remove_module snd_sof_intel_hda_common remove_module snd_soc_skl_hda_dsp remove_module snd_soc_hdac_hdmi remove_module snd_hda_codec_realtek remove_module snd_hda_codec_generic remove_module snd_soc_dmic remove_module snd_soc_hdac_hda remove_module snd_sof_intel_hda remove_module snd_hda_ext_core remove_module snd_hda_codec remove_module snd_hda_core
remove_module sof_acpi_dev remove_module snd_soc_acpi_intel_match remove_module snd_sof_intel_byt remove_module snd_sof_intel_hsw remove_module snd_sof_intel_bdw remove_module snd_sof_xtensa_dsp remove_module snd_sof_intel_ipc
remove_module snd_soc_sst_bytcr_rt5640 remove_module snd_soc_sst_bytcr_rt5651 remove_module snd_soc_sst_cht_bsw_rt5645 remove_module snd_soc_sst_cht_bsw_rt5670 remove_module snd_soc_sst_byt_cht_da7213 remove_module snd_soc_sst_bxt_pcm512x remove_module snd_soc_sst_bxt_tdf8532 remove_module snd_soc_cnl_rt274 remove_module snd_sof_nocodec remove_module snd_sof
remove_module snd_soc_acpi_intel_match
remove_module snd_soc_rt5670 remove_module snd_soc_rt5645 remove_module snd_soc_rt5651 remove_module snd_soc_rt5640 remove_module snd_soc_rl6231 remove_module snd_soc_da7213 remove_module snd_soc_pcm512x_i2c remove_module snd_soc_pcm512x remove_module snd_soc_tdf8532 remove_module snd_soc_rt274 remove_module snd_soc_acpi
remove_module snd_soc_core remove_module snd_pcm
And sof_insert.sh is to insert the modules: modprobe snd_soc_rt5670 modprobe snd_soc_rt5645 modprobe snd_soc_rt5651 modprobe snd_soc_rt5640 modprobe snd_soc_da7213 modprobe snd_soc_pcm512x_i2c #modprobe snd_soc_tdf8532 #modprobe snd_soc_rt274
modprobe sof_acpi_dev modprobe sof_pci_dev
Regards, Libin
-----Original Message----- From: Yang, Libin Sent: Thursday, May 23, 2019 12:10 PM To: 'Pierre-Louis Bossart' pierre-louis.bossart@linux.intel.com; alsa- devel@alsa-project.org Cc: tiwai@suse.de; broonie@kernel.org Subject: RE: [PATCH v2 12/12] ASoC: SOF: Intel: hda-codec: fix memory allocation
Please hold on this patch. It seems there is some corner case failed because of this patch.
Regards, Libin
-----Original Message----- From: Pierre-Louis Bossart [mailto:pierre-louis.bossart@linux.intel.com] Sent: Thursday, May 23, 2019 12:22 AM To: alsa-devel@alsa-project.org Cc: tiwai@suse.de; broonie@kernel.org; Yang, Libin libin.yang@intel.com; Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Subject: [PATCH v2 12/12] ASoC: SOF: Intel: hda-codec: fix memory allocation
From: Libin Yang libin.yang@intel.com
Align all users of the hdac library to use devm_kzalloc.
Note for backports/stable: the patch ("ALSA: hdac: fix memory release for SST and SOF drivers") needs to be applied as well.
Fixes: 5507b8103e26 ("ASoC: SOF: Intel: Add support for HDAudio codecs") Reviewed-by: Takashi Iwai tiwai@suse.de Signed-off-by: Libin Yang libin.yang@intel.com Signed-off-by: Pierre-Louis Bossart
pierre-louis.bossart@linux.intel.com
sound/soc/sof/intel/hda-codec.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/sound/soc/sof/intel/hda-codec.c b/sound/soc/sof/intel/hda- codec.c index b8b37f082309..0d8437b080bf 100644 --- a/sound/soc/sof/intel/hda-codec.c +++ b/sound/soc/sof/intel/hda-codec.c @@ -62,8 +62,7 @@ static int hda_codec_probe(struct snd_sof_dev *sdev, int address) address, resp);
#if IS_ENABLED(CONFIG_SND_SOC_SOF_HDA_AUDIO_CODEC)
- /* snd_hdac_ext_bus_device_exit will use kfree to free hdev */
- hda_priv = kzalloc(sizeof(*hda_priv), GFP_KERNEL);
- hda_priv = devm_kzalloc(sdev->dev, sizeof(*hda_priv), GFP_KERNEL); if (!hda_priv) return -ENOMEM;
@@ -82,8 +81,7 @@ static int hda_codec_probe(struct snd_sof_dev *sdev, int address)
return 0; #else
- /* snd_hdac_ext_bus_device_exit will use kfree to free hdev */
- hdev = kzalloc(sizeof(*hdev), GFP_KERNEL);
- hdev = devm_kzalloc(sdev->dev, sizeof(*hdev), GFP_KERNEL); if (!hdev) return -ENOMEM;
-- 2.20.1
On Thu, 23 May 2019 10:03:03 +0200, Yang, Libin wrote:
Please let me describe the issue here.
The test case is:
- Unload module with script "sudo ./sof_remove.sh" ,
- reload module with script "sudo ./sof_insert.sh"
After several rounds of removing and inserting kernel modules, system will complain like below: "BUG: unable to handle kernel paging request at 000000292a282031"
Did you try some kernel debug options? It might show what went wrong.
Takashi
Hi Takashi,
-----Original Message----- From: Takashi Iwai [mailto:tiwai@suse.de] Sent: Thursday, May 23, 2019 4:16 PM To: Yang, Libin libin.yang@intel.com Cc: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com; alsa- devel@alsa-project.org; broonie@kernel.org Subject: Re: [alsa-devel] [PATCH v2 12/12] ASoC: SOF: Intel: hda-codec: fix memory allocation
On Thu, 23 May 2019 10:03:03 +0200, Yang, Libin wrote:
Please let me describe the issue here.
The test case is:
- Unload module with script "sudo ./sof_remove.sh" ,
- reload module with script "sudo ./sof_insert.sh"
After several rounds of removing and inserting kernel modules, system will complain like below: "BUG: unable to handle kernel paging request at 000000292a282031"
Did you try some kernel debug options? It might show what went wrong.
No, I haven't. I'm not sure which options I can use for this case. Could you please give me some suggestions?
Regards, Libin
Takashi
On Thu, 23 May 2019 10:21:20 +0200, Yang, Libin wrote:
Hi Takashi,
-----Original Message----- From: Takashi Iwai [mailto:tiwai@suse.de] Sent: Thursday, May 23, 2019 4:16 PM To: Yang, Libin libin.yang@intel.com Cc: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com; alsa- devel@alsa-project.org; broonie@kernel.org Subject: Re: [alsa-devel] [PATCH v2 12/12] ASoC: SOF: Intel: hda-codec: fix memory allocation
On Thu, 23 May 2019 10:03:03 +0200, Yang, Libin wrote:
Please let me describe the issue here.
The test case is:
- Unload module with script "sudo ./sof_remove.sh" ,
- reload module with script "sudo ./sof_insert.sh"
After several rounds of removing and inserting kernel modules, system will complain like below: "BUG: unable to handle kernel paging request at 000000292a282031"
Did you try some kernel debug options? It might show what went wrong.
No, I haven't. I'm not sure which options I can use for this case. Could you please give me some suggestions?
You can enable CONFIG_DEBUG_DEVRES and adjust the devres.log option for showing each devres allocation and removal. And I'd try CONFIG_DEBUG_SLAB and CONFIG_DEBUG_KMEMLEAK or whatever interesting in CONFIG_DEBUG section, too.
Takashi
-----Original Message----- From: Takashi Iwai [mailto:tiwai@suse.de] Sent: Thursday, May 23, 2019 4:27 PM To: Yang, Libin libin.yang@intel.com Cc: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com; alsa- devel@alsa-project.org; broonie@kernel.org Subject: Re: [alsa-devel] [PATCH v2 12/12] ASoC: SOF: Intel: hda-codec: fix memory allocation
On Thu, 23 May 2019 10:21:20 +0200, Yang, Libin wrote:
Hi Takashi,
-----Original Message----- From: Takashi Iwai [mailto:tiwai@suse.de] Sent: Thursday, May 23, 2019 4:16 PM To: Yang, Libin libin.yang@intel.com Cc: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com; alsa- devel@alsa-project.org; broonie@kernel.org Subject: Re: [alsa-devel] [PATCH v2 12/12] ASoC: SOF: Intel: hda-codec: fix memory allocation
On Thu, 23 May 2019 10:03:03 +0200, Yang, Libin wrote:
Please let me describe the issue here.
The test case is:
- Unload module with script "sudo ./sof_remove.sh" ,
- reload module with script "sudo ./sof_insert.sh"
After several rounds of removing and inserting kernel modules, system will complain like below: "BUG: unable to handle kernel paging request at 000000292a282031"
Did you try some kernel debug options? It might show what went wrong.
No, I haven't. I'm not sure which options I can use for this case. Could you please give me some suggestions?
You can enable CONFIG_DEBUG_DEVRES and adjust the devres.log option for showing each devres allocation and removal. And I'd try CONFIG_DEBUG_SLAB and CONFIG_DEBUG_KMEMLEAK or whatever interesting in CONFIG_DEBUG section, too.
Thanks. I will have a try.
Regards, Libin
Takashi
Hi Takashi,
-----Original Message----- From: Takashi Iwai [mailto:tiwai@suse.de] Sent: Thursday, May 23, 2019 4:27 PM To: Yang, Libin libin.yang@intel.com Cc: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com; alsa- devel@alsa-project.org; broonie@kernel.org Subject: Re: [alsa-devel] [PATCH v2 12/12] ASoC: SOF: Intel: hda-codec: fix memory allocation
On Thu, 23 May 2019 10:21:20 +0200, Yang, Libin wrote:
Hi Takashi,
-----Original Message----- From: Takashi Iwai [mailto:tiwai@suse.de] Sent: Thursday, May 23, 2019 4:16 PM To: Yang, Libin libin.yang@intel.com Cc: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com; alsa- devel@alsa-project.org; broonie@kernel.org Subject: Re: [alsa-devel] [PATCH v2 12/12] ASoC: SOF: Intel: hda-codec: fix memory allocation
On Thu, 23 May 2019 10:03:03 +0200, Yang, Libin wrote:
Please let me describe the issue here.
The test case is:
- Unload module with script "sudo ./sof_remove.sh" ,
- reload module with script "sudo ./sof_insert.sh"
After several rounds of removing and inserting kernel modules, system will complain like below: "BUG: unable to handle kernel paging request at 000000292a282031"
Did you try some kernel debug options? It might show what went wrong.
No, I haven't. I'm not sure which options I can use for this case. Could you please give me some suggestions?
You can enable CONFIG_DEBUG_DEVRES and adjust the devres.log option for showing each devres allocation and removal. And I'd try CONFIG_DEBUG_SLAB and CONFIG_DEBUG_KMEMLEAK or whatever interesting in CONFIG_DEBUG section, too.
Thanks for your suggestion. After more than 1 week debug, I think maybe I have root caused this issue from the devres.log message.
Below is my finding. 1. When initialing the codecs, snd_hdac_ext_bus_device_init() will be called, and it will set hdev->dev.release = default_release. However, for analog codec (not hdac_hdmi codec), hdac_hda_codec_probe() will be called later. And it will call snd_hda_codec_device_new(), which will reset codec->code.dev.release = snd_hda_codec_dev_release; This means hdac_hdmi's hdev dev release is default_release() defined in hdac_ext_bus.c, and other hda codec's hdev dev release is snd_hda_codec_dev_release().
Both default_release() and snd_hda_codec_dev_release() will call kfree() to free the hdac_device (or its container) in the current code.
2. When we run rmmod sof_pci_dev, it will free the sdev. If we use Struct hdac_device *hdev = devm_kzalloc(sdev->dev...). This means hdev will also be freed automatically.
In the removal, snd_hdac_ext_bus_device_remove() will be called to remove the hdev (in this function it is struct hdac_device *codec. The name is not aligned in different places). However for hdac_hdmi, the hdev->dev is used by other code. So calling device.release() (the function default_release()) will be postponed. After after sdev is freed, the device.release() will be called. But for devm_xxx, hdev will also be freed when sdev is freed. This means hdev.dev after sdev is freed is invalid now as hdev has already freed. It will access invalid memory. This will cause the bug.
So I think we should not use devm_xxx, and let's free the hdev manually.
At the end of this topic, I still found 2 suspicious code in the current code. 1. in sound/soc/intel/skylate/skl.c it calls hdev = devm_kazlloc() or hda_codec = devm_kzalloc(). As we will call kfree() in the current code, should we replace it with kzalloc()? Maybe we need cavs drivers owner's help on it.
2. in snd_hdac_ext_bus_device_remove() It will call snd_hdac_device_unregister() to unregister the hdac_devices and put_device(&codec->dev) to release the dev. For analog codec, snd_hdac_device_unregister() will free the codec->dev's kobject. And snd_hda_codec_dev_release() will be called to free the hdac_device. So it is invalid to call put_device(&codec->dev). If you print refcound_read(&(codec->dev.kobj.kref.refcount)) for analog codec before put_device(), you will find the refcount has already been 0.
Regards, Libin
Takashi
Hi Libin,
Please let me describe the issue here.
The test case is:
- Unload module with script "sudo ./sof_remove.sh" ,
- reload module with script "sudo ./sof_insert.sh"
After several rounds of removing and inserting kernel modules, system will complain like below: "BUG: unable to handle kernel paging request at 000000292a282031"
Did you try some kernel debug options? It might show what went wrong.
No, I haven't. I'm not sure which options I can use for this case. Could you please give me some suggestions?
You can enable CONFIG_DEBUG_DEVRES and adjust the devres.log option for showing each devres allocation and removal. And I'd try CONFIG_DEBUG_SLAB and CONFIG_DEBUG_KMEMLEAK or whatever interesting in CONFIG_DEBUG section, too.
Thanks for your suggestion. After more than 1 week debug, I think maybe I have root caused this issue from the devres.log message.
Below is my finding.
- When initialing the codecs, snd_hdac_ext_bus_device_init() will be called,
and it will set hdev->dev.release = default_release. However, for analog codec (not hdac_hdmi codec), hdac_hda_codec_probe() will be called later. And it will call snd_hda_codec_device_new(), which will reset codec->code.dev.release = snd_hda_codec_dev_release; This means hdac_hdmi's hdev dev release is default_release() defined in hdac_ext_bus.c, and other hda codec's hdev dev release is snd_hda_codec_dev_release().
Both default_release() and snd_hda_codec_dev_release() will call kfree() to free the hdac_device (or its container) in the current code.
- When we run rmmod sof_pci_dev, it will free the sdev. If we use
Struct hdac_device *hdev = devm_kzalloc(sdev->dev...). This means hdev will also be freed automatically.
In the removal, snd_hdac_ext_bus_device_remove() will be called to remove the hdev (in this function it is struct hdac_device *codec. The name is not aligned in different places). However for hdac_hdmi, the hdev->dev is used by other code.
what other code? Can you elaborate on why the release is delayed?
So calling device.release() (the function default_release()) will be postponed. After after sdev is freed, the device.release() will be called. But for devm_xxx, hdev will also be freed when sdev is freed. This means hdev.dev after sdev is freed is invalid now as hdev has already freed. It will access invalid memory. This will cause the bug.
This is very hard to follow. 4 lines above you wrote the release is postponed but the way you describe is looks completely sequential.
So I think we should not use devm_xxx, and let's free the hdev manually.
At the end of this topic, I still found 2 suspicious code in the current code.
- in sound/soc/intel/skylate/skl.c
it calls hdev = devm_kazlloc() or hda_codec = devm_kzalloc(). As we will call kfree() in the current code, should we replace it with kzalloc()? Maybe we need cavs drivers owner's help on it.
maybe you should send a diff suggestion to help everyone understand the changes you are referring to?
- in snd_hdac_ext_bus_device_remove()
It will call snd_hdac_device_unregister() to unregister the hdac_devices and put_device(&codec->dev) to release the dev. For analog codec, snd_hdac_device_unregister() will free the codec->dev's kobject. And snd_hda_codec_dev_release() will be called to free the hdac_device. So it is invalid to call put_device(&codec->dev). If you print refcound_read(&(codec->dev.kobj.kref.refcount)) for analog codec before put_device(), you will find the refcount has already been 0.
Isn't it a different problem though? Does this cause a freeze or is this just a bad refcount?
Hi Pierre,
Hi Libin,
Please let me describe the issue here.
The test case is:
- Unload module with script "sudo ./sof_remove.sh" ,
- reload module with script "sudo ./sof_insert.sh"
After several rounds of removing and inserting kernel modules, system will complain like below: "BUG: unable to handle kernel paging request at 000000292a282031"
Did you try some kernel debug options? It might show what went wrong.
No, I haven't. I'm not sure which options I can use for this case. Could you please give me some suggestions?
You can enable CONFIG_DEBUG_DEVRES and adjust the devres.log option for showing each devres allocation and removal. And I'd try CONFIG_DEBUG_SLAB and CONFIG_DEBUG_KMEMLEAK or whatever
interesting
in CONFIG_DEBUG section, too.
Thanks for your suggestion. After more than 1 week debug, I think maybe I have root caused this issue from the devres.log message.
Below is my finding.
- When initialing the codecs, snd_hdac_ext_bus_device_init() will be
called, and it will set hdev->dev.release = default_release. However, for analog codec (not hdac_hdmi codec), hdac_hda_codec_probe() will be called later. And it will call snd_hda_codec_device_new(), which will reset codec->code.dev.release = snd_hda_codec_dev_release; This means hdac_hdmi's hdev dev release is default_release() defined in hdac_ext_bus.c, and other hda codec's hdev dev release is snd_hda_codec_dev_release().
Both default_release() and snd_hda_codec_dev_release() will call kfree() to free the hdac_device (or its container) in the current code.
- When we run rmmod sof_pci_dev, it will free the sdev. If we use
Struct hdac_device *hdev = devm_kzalloc(sdev->dev...). This means hdev will also be freed automatically.
In the removal, snd_hdac_ext_bus_device_remove() will be called to remove the hdev (in this function it is struct hdac_device *codec. The name is not aligned in different places). However for hdac_hdmi, the hdev->dev is used by other code.
what other code? Can you elaborate on why the release is delayed?
From the dmesg, it is the device_link used for hdac_hdmi that causes the release delay. Device_link will increase the refcount.
So calling device.release() (the function default_release()) will be postponed. After after sdev is freed, the device.release() will be called. But for devm_xxx, hdev will also be freed when sdev is freed. This means hdev.dev after sdev is freed is invalid now as hdev has already freed. It will access invalid memory. This will cause the bug.
This is very hard to follow. 4 lines above you wrote the release is postponed but the way you describe is looks completely sequential.
For hdac_hdmi, as it is used by other code (device_link), the device release will not be called immediately. After sdev and hdev is freed, the hdev device is put by device_link. Then hdev device can be released. But hdev has already be freed. This is wrong.
So I think we should not use devm_xxx, and let's free the hdev manually.
At the end of this topic, I still found 2 suspicious code in the current code.
- in sound/soc/intel/skylate/skl.c
it calls hdev = devm_kazlloc() or hda_codec = devm_kzalloc(). As we will call kfree() in the current code, should we replace it with kzalloc()? Maybe we need cavs drivers owner's help on it.
maybe you should send a diff suggestion to help everyone understand the changes you are referring to?
Please see my inline patch:
diff --git a/sound/soc/intel/skylake/skl.c b/sound/soc/intel/skylake/skl.c index f864f7b..12af99a 100644 --- a/sound/soc/intel/skylake/skl.c +++ b/sound/soc/intel/skylake/skl.c @@ -697,8 +697,7 @@ static int probe_codec(struct hdac_bus *bus, int addr) dev_dbg(bus->dev, "codec #%d probed OK: %x\n", addr, res);
#if IS_ENABLED(CONFIG_SND_SOC_INTEL_SKYLAKE_HDAUDIO_CODEC) - hda_codec = devm_kzalloc(&skl->pci->dev, sizeof(*hda_codec), - GFP_KERNEL); + hda_codec = kzalloc(sizeof(*hda_codec), GFP_KERNEL); if (!hda_codec) return -ENOMEM;
@@ -716,7 +715,7 @@ static int probe_codec(struct hdac_bus *bus, int addr) } return 0; #else - hdev = devm_kzalloc(&skl->pci->dev, sizeof(*hdev), GFP_KERNEL); + hdev = kzalloc(sizeof(*hdev), GFP_KERNEL); if (!hdev) return -ENOMEM;
- in snd_hdac_ext_bus_device_remove() It will call
snd_hdac_device_unregister() to unregister the hdac_devices and put_device(&codec->dev) to release the dev. For analog codec, snd_hdac_device_unregister() will free the codec->dev's kobject. And snd_hda_codec_dev_release() will be called to free the hdac_device. So it is invalid to call put_device(&codec->dev). If you print refcound_read(&(codec->dev.kobj.kref.refcount)) for analog codec before put_device(), you will find the refcount has already been 0.
Isn't it a different problem though? Does this cause a freeze or is this just a bad refcount?
Yes, it is a totally different problem. And today morning, I found Ranjani has a patch " ASoC: hda: increment codec device refcount when it is added to the card ", which should fix this bug.
Regards, Libin
On Thu, 23 May 2019 10:03:03 +0200, Yang, Libin wrote:
Please let me describe the issue here.
The test case is:
- Unload module with script "sudo ./sof_remove.sh" ,
- reload module with script "sudo ./sof_insert.sh"
After several rounds of removing and inserting kernel modules, system will complain like below: "BUG: unable to handle kernel paging request at 000000292a282031"
Did you try some kernel debug options? It might show what went wrong.
No, I haven't. I'm not sure which options I can use for this case. Could you please give me some suggestions?
You can enable CONFIG_DEBUG_DEVRES and adjust the devres.log option
for
showing each devres allocation and removal. And I'd try CONFIG_DEBUG_SLAB and CONFIG_DEBUG_KMEMLEAK or whatever
interesting in
CONFIG_DEBUG section, too.
Thanks for your suggestion. After more than 1 week debug, I think maybe I have root caused this issue from the devres.log message.
Below is my finding.
- When initialing the codecs, snd_hdac_ext_bus_device_init() will be called,
and it will set hdev->dev.release = default_release. However, for analog codec (not hdac_hdmi codec), hdac_hda_codec_probe() will be called later. And it will call snd_hda_codec_device_new(), which will reset codec->code.dev.release = snd_hda_codec_dev_release; This means hdac_hdmi's hdev dev release is default_release() defined in hdac_ext_bus.c, and other hda codec's hdev dev release is snd_hda_codec_dev_release().
Both default_release() and snd_hda_codec_dev_release() will call kfree() to free the hdac_device (or its container) in the current code.
- When we run rmmod sof_pci_dev, it will free the sdev. If we use Struct
hdac_device *hdev = devm_kzalloc(sdev->dev...). This means hdev will also be freed automatically.
In the removal, snd_hdac_ext_bus_device_remove() will be called to remove the hdev (in this function it is struct hdac_device *codec. The name is not aligned in different places). However for hdac_hdmi, the hdev->dev is used by other code. So calling device.release() (the function default_release()) will be postponed. After after sdev is freed, the device.release() will be called. But for devm_xxx, hdev will also be freed when sdev is freed. This means hdev.dev after sdev is freed is invalid now as hdev has already freed. It will access invalid memory. This will cause the bug.
So I think we should not use devm_xxx, and let's free the hdev manually.
At the end of this topic, I still found 2 suspicious code in the current code.
- in sound/soc/intel/skylate/skl.c
it calls hdev = devm_kazlloc() or hda_codec = devm_kzalloc(). As we will call kfree() in the current code, should we replace it with kzalloc()? Maybe we need cavs drivers owner's help on it.
- in snd_hdac_ext_bus_device_remove()
It will call snd_hdac_device_unregister() to unregister the hdac_devices and put_device(&codec->dev) to release the dev. For analog codec, snd_hdac_device_unregister() will free the codec->dev's kobject. And snd_hda_codec_dev_release() will be called to free the hdac_device. So it is invalid to call put_device(&codec->dev). If you print refcound_read(&(codec->dev.kobj.kref.refcount)) for analog codec before put_device(), you will find the refcount has already been 0.
It seems Ranjani's " ASoC: hda: increment codec device refcount when it is added to the card" patch can fix the second issue.
Regards, Libin
-----Original Message----- From: Yang, Libin Sent: Monday, June 3, 2019 5:10 PM To: Takashi Iwai tiwai@suse.de Cc: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com; alsa- devel@alsa-project.org; broonie@kernel.org Subject: RE: [alsa-devel] [PATCH v2 12/12] ASoC: SOF: Intel: hda-codec: fix memory allocation
Hi Takashi,
-----Original Message----- From: Takashi Iwai [mailto:tiwai@suse.de] Sent: Thursday, May 23, 2019 4:27 PM To: Yang, Libin libin.yang@intel.com Cc: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com; alsa- devel@alsa-project.org; broonie@kernel.org Subject: Re: [alsa-devel] [PATCH v2 12/12] ASoC: SOF: Intel: hda-codec: fix memory allocation
On Thu, 23 May 2019 10:21:20 +0200, Yang, Libin wrote:
Hi Takashi,
-----Original Message----- From: Takashi Iwai [mailto:tiwai@suse.de] Sent: Thursday, May 23, 2019 4:16 PM To: Yang, Libin libin.yang@intel.com Cc: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com; alsa- devel@alsa-project.org; broonie@kernel.org Subject: Re: [alsa-devel] [PATCH v2 12/12] ASoC: SOF: Intel: hda-codec: fix memory allocation
On Thu, 23 May 2019 10:03:03 +0200, Yang, Libin wrote:
Please let me describe the issue here.
The test case is:
- Unload module with script "sudo ./sof_remove.sh" ,
- reload module with script "sudo ./sof_insert.sh"
After several rounds of removing and inserting kernel modules, system will complain like below: "BUG: unable to handle kernel paging request at 000000292a282031"
Did you try some kernel debug options? It might show what went wrong.
No, I haven't. I'm not sure which options I can use for this case. Could you please give me some suggestions?
You can enable CONFIG_DEBUG_DEVRES and adjust the devres.log option
for
showing each devres allocation and removal. And I'd try CONFIG_DEBUG_SLAB and CONFIG_DEBUG_KMEMLEAK or whatever
interesting in
CONFIG_DEBUG section, too.
Thanks for your suggestion. After more than 1 week debug, I think maybe I have root caused this issue from the devres.log message.
Below is my finding.
- When initialing the codecs, snd_hdac_ext_bus_device_init() will be called,
and it will set hdev->dev.release = default_release. However, for analog codec (not hdac_hdmi codec), hdac_hda_codec_probe() will be called later. And it will call snd_hda_codec_device_new(), which will reset codec->code.dev.release = snd_hda_codec_dev_release; This means hdac_hdmi's hdev dev release is default_release() defined in hdac_ext_bus.c, and other hda codec's hdev dev release is snd_hda_codec_dev_release().
Both default_release() and snd_hda_codec_dev_release() will call kfree() to free the hdac_device (or its container) in the current code.
- When we run rmmod sof_pci_dev, it will free the sdev. If we use Struct
hdac_device *hdev = devm_kzalloc(sdev->dev...). This means hdev will also be freed automatically.
In the removal, snd_hdac_ext_bus_device_remove() will be called to remove the hdev (in this function it is struct hdac_device *codec. The name is not aligned in different places). However for hdac_hdmi, the hdev->dev is used by other code. So calling device.release() (the function default_release()) will be postponed. After after sdev is freed, the device.release() will be called. But for devm_xxx, hdev will also be freed when sdev is freed. This means hdev.dev after sdev is freed is invalid now as hdev has already freed. It will access invalid memory. This will cause the bug.
So I think we should not use devm_xxx, and let's free the hdev manually.
At the end of this topic, I still found 2 suspicious code in the current code.
- in sound/soc/intel/skylate/skl.c
it calls hdev = devm_kazlloc() or hda_codec = devm_kzalloc(). As we will call kfree() in the current code, should we replace it with kzalloc()? Maybe we need cavs drivers owner's help on it.
- in snd_hdac_ext_bus_device_remove()
It will call snd_hdac_device_unregister() to unregister the hdac_devices and put_device(&codec->dev) to release the dev. For analog codec, snd_hdac_device_unregister() will free the codec->dev's kobject. And snd_hda_codec_dev_release() will be called to free the hdac_device. So it is invalid to call put_device(&codec->dev). If you print refcound_read(&(codec->dev.kobj.kref.refcount)) for analog codec before put_device(), you will find the refcount has already been 0.
I tested on the latest code. The behavior on kernel 5.2.0-rc1 is different from it on kernel 5.0.0 of https://github.com/thesofproject/linux/. (thesofproject kernel has already been upgraded to 5.2.0-rc1 now)
On kernel 5.0.0 (https://github.com/thesofproject/linux/), when removing the sof kernel modules, the sequence is: 1. devm_kazlloc_release() to release hdac_device 2. snd_hdac_ext_bus_device_exit(): this function will use hdac_device
However, on kernel 5.2.0-rc1, the sequence is: 1. snd_hdac_ext_bus_device_exit() 2. devm_kazlloc_release() to release hdac_device
So I think it is safe now to use devm_xxx to allocate the hdac_device. Also, Ranjani found a duplicated release of hdac_device. She has a patch to fix it.
Regards, Libin
Regards, Libin
Takashi
Hi Takashi,
-----Original Message----- From: Takashi Iwai [mailto:tiwai@suse.de] Sent: Thursday, May 23, 2019 4:16 PM To: Yang, Libin libin.yang@intel.com Cc: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com; alsa- devel@alsa-project.org; broonie@kernel.org Subject: Re: [alsa-devel] [PATCH v2 12/12] ASoC: SOF: Intel: hda-codec: fix memory allocation
On Thu, 23 May 2019 10:03:03 +0200, Yang, Libin wrote:
Please let me describe the issue here.
The test case is:
- Unload module with script "sudo ./sof_remove.sh" ,
- reload module with script "sudo ./sof_insert.sh"
After several rounds of removing and inserting kernel modules, system will complain like below: "BUG: unable to handle kernel paging request at 000000292a282031"
Did you try some kernel debug options? It might show what went wrong.
No, I haven't. I'm not sure which options I can use for this case. Could you please give me some suggestions?
BTW: I have enabled DEBUG_DEVRES. Are there any other options I can try?
Regards, Libin
Regards, Libin
Takashi
On 5/23/19 3:24 AM, Yang, Libin wrote:
Hi Takashi,
-----Original Message----- From: Takashi Iwai [mailto:tiwai@suse.de] Sent: Thursday, May 23, 2019 4:16 PM To: Yang, Libin libin.yang@intel.com Cc: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com; alsa- devel@alsa-project.org; broonie@kernel.org Subject: Re: [alsa-devel] [PATCH v2 12/12] ASoC: SOF: Intel: hda-codec: fix memory allocation
On Thu, 23 May 2019 10:03:03 +0200, Yang, Libin wrote:
Please let me describe the issue here.
The test case is:
- Unload module with script "sudo ./sof_remove.sh" ,
- reload module with script "sudo ./sof_insert.sh"
After several rounds of removing and inserting kernel modules, system will complain like below: "BUG: unable to handle kernel paging request at 000000292a282031"
Did you try some kernel debug options? It might show what went wrong.
No, I haven't. I'm not sure which options I can use for this case. Could you please give me some suggestions?
BTW: I have enabled DEBUG_DEVRES. Are there any other options I can try?
There are already a set of kconfig fragments for debug, see https://github.com/thesofproject/kconfig and select memory-debug-defconfig.
I guess I will need to require this test in the SOF CI, I really don't get how this issue was not seen earlier. Gah.
Hi Peirre,
On Thu, 23 May 2019 10:03:03 +0200, Yang, Libin wrote:
Please let me describe the issue here.
The test case is:
- Unload module with script "sudo ./sof_remove.sh" ,
- reload module with script "sudo ./sof_insert.sh"
After several rounds of removing and inserting kernel modules, system will complain like below: "BUG: unable to handle kernel paging request at 000000292a282031"
Did you try some kernel debug options? It might show what went wrong.
No, I haven't. I'm not sure which options I can use for this case. Could you please give me some suggestions?
BTW: I have enabled DEBUG_DEVRES. Are there any other options I can try?
There are already a set of kconfig fragments for debug, see https://github.com/thesofproject/kconfig and select memory-debug-defconfig.
I guess I will need to require this test in the SOF CI, I really don't get how this issue was not seen earlier. Gah.
This bug can't be reproduced easily sometimes. Sometimes, the bug will be hit at the second or third round of the test. And sometimes, we need do more rounds to reproduce this bug. This may be the reason we didn't hit this bug before.
Regards, Libin
participants (5)
-
Guennadi Liakhovetski
-
Mark Brown
-
Pierre-Louis Bossart
-
Takashi Iwai
-
Yang, Libin