[PATCH v2 0/4] soundwire: qcom: stablity fixes
During x13s audio testing we hit few corner cases due to issues in codec drivers and some obvious code bugs.
Here are the fixes for those issues, mostly the issues are around devices loosing the sync in between runtime pm suspend resume path.
With codec fixes along with these fixes, audio on x13s is pretty stable.
Thanks, Srini
Changes since v1: - dropped runtime pm changes patch as unable to reproduced it anymore - fixed clk stop flag as suggested by Pierre - rebased on top of linux-next
Srinivas Kandagatla (4): soundwire: qcom: update status correctly with mask soundwire: qcom: wait for fifo to be empty before suspend soundwire: qcom: add software workaround for bus clash interrupt assertion soundwire: qcom: set clk stop need reset flag at runtime
drivers/soundwire/qcom.c | 100 ++++++++++++++++++++++++++------------- 1 file changed, 68 insertions(+), 32 deletions(-)
SoundWire device status can be incorrectly updated without proper mask, fix this by adding a mask before updating the status.
Fixes: c7d49c76d1d5 ("soundwire: qcom: add support to new interrupts") Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org --- drivers/soundwire/qcom.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/soundwire/qcom.c b/drivers/soundwire/qcom.c index aad5942e5980..9440787e924b 100644 --- a/drivers/soundwire/qcom.c +++ b/drivers/soundwire/qcom.c @@ -515,7 +515,7 @@ static int qcom_swrm_get_alert_slave_dev_num(struct qcom_swrm_ctrl *ctrl) status = (val >> (dev_num * SWRM_MCP_SLV_STATUS_SZ));
if ((status & SWRM_MCP_SLV_STATUS_MASK) == SDW_SLAVE_ALERT) { - ctrl->status[dev_num] = status; + ctrl->status[dev_num] = status & SWRM_MCP_SLV_STATUS_MASK; return dev_num; } }
On Thu, May 25, 2023 at 02:38:09PM +0100, Srinivas Kandagatla wrote:
SoundWire device status can be incorrectly updated without proper mask, fix this by adding a mask before updating the status.
Fixes: c7d49c76d1d5 ("soundwire: qcom: add support to new interrupts") Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org
When rebasing on 6.5-rc1, I noticed that this one was apparently never merged along with the rest of the series.
Any idea how this could have happened?
And can we get this one into 6.5 as well?
drivers/soundwire/qcom.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/soundwire/qcom.c b/drivers/soundwire/qcom.c index aad5942e5980..9440787e924b 100644 --- a/drivers/soundwire/qcom.c +++ b/drivers/soundwire/qcom.c @@ -515,7 +515,7 @@ static int qcom_swrm_get_alert_slave_dev_num(struct qcom_swrm_ctrl *ctrl) status = (val >> (dev_num * SWRM_MCP_SLV_STATUS_SZ));
if ((status & SWRM_MCP_SLV_STATUS_MASK) == SDW_SLAVE_ALERT) {
ctrl->status[dev_num] = status;
} }ctrl->status[dev_num] = status & SWRM_MCP_SLV_STATUS_MASK; return dev_num;
Johan
Wait for Fifo to be empty before going to suspend or before bank switch happens. Just to make sure that all the reads/writes are done.
Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org --- drivers/soundwire/qcom.c | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+)
diff --git a/drivers/soundwire/qcom.c b/drivers/soundwire/qcom.c index 9440787e924b..adf025194a31 100644 --- a/drivers/soundwire/qcom.c +++ b/drivers/soundwire/qcom.c @@ -404,6 +404,32 @@ static int swrm_wait_for_wr_fifo_avail(struct qcom_swrm_ctrl *ctrl) return 0; }
+static bool swrm_wait_for_wr_fifo_done(struct qcom_swrm_ctrl *ctrl) +{ + u32 fifo_outstanding_cmds, value; + int fifo_retry_count = SWR_OVERFLOW_RETRY_COUNT; + + /* Check for fifo overflow during write */ + ctrl->reg_read(ctrl, ctrl->reg_layout[SWRM_REG_CMD_FIFO_STATUS], &value); + fifo_outstanding_cmds = FIELD_GET(SWRM_WR_CMD_FIFO_CNT_MASK, value); + + if (fifo_outstanding_cmds) { + while (fifo_retry_count) { + usleep_range(500, 510); + ctrl->reg_read(ctrl, ctrl->reg_layout[SWRM_REG_CMD_FIFO_STATUS], &value); + fifo_outstanding_cmds = FIELD_GET(SWRM_WR_CMD_FIFO_CNT_MASK, value); + fifo_retry_count--; + if (fifo_outstanding_cmds == 0) + return true; + } + } else { + return true; + } + + + return false; +} + static int qcom_swrm_cmd_fifo_wr_cmd(struct qcom_swrm_ctrl *ctrl, u8 cmd_data, u8 dev_addr, u16 reg_addr) { @@ -434,6 +460,7 @@ static int qcom_swrm_cmd_fifo_wr_cmd(struct qcom_swrm_ctrl *ctrl, u8 cmd_data, usleep_range(150, 155);
if (cmd_id == SWR_BROADCAST_CMD_ID) { + swrm_wait_for_wr_fifo_done(ctrl); /* * sleep for 10ms for MSM soundwire variant to allow broadcast * command to complete. @@ -1230,6 +1257,7 @@ static void qcom_swrm_shutdown(struct snd_pcm_substream *substream, { struct qcom_swrm_ctrl *ctrl = dev_get_drvdata(dai->dev);
+ swrm_wait_for_wr_fifo_done(ctrl); sdw_release_stream(ctrl->sruntime[dai->id]); ctrl->sruntime[dai->id] = NULL; pm_runtime_mark_last_busy(ctrl->dev); @@ -1688,6 +1716,7 @@ static int __maybe_unused swrm_runtime_suspend(struct device *dev) struct qcom_swrm_ctrl *ctrl = dev_get_drvdata(dev); int ret;
+ swrm_wait_for_wr_fifo_done(ctrl); if (!ctrl->clock_stop_not_supported) { /* Mask bus clash interrupt */ ctrl->intr_mask &= ~SWRM_INTERRUPT_STATUS_MASTER_CLASH_DET;
Sometimes Hard reset does not clear some of the registers, this sometimes results in firing a bus clash interrupt. Add workaround for this during power up sequence, as suggested by hardware manual.
Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org --- drivers/soundwire/qcom.c | 56 ++++++++++++++++++++++++---------------- 1 file changed, 34 insertions(+), 22 deletions(-)
diff --git a/drivers/soundwire/qcom.c b/drivers/soundwire/qcom.c index adf025194a31..1d2a105cb77f 100644 --- a/drivers/soundwire/qcom.c +++ b/drivers/soundwire/qcom.c @@ -793,6 +793,26 @@ static irqreturn_t qcom_swrm_irq_handler(int irq, void *dev_id) return ret; }
+static bool swrm_wait_for_frame_gen_enabled(struct qcom_swrm_ctrl *ctrl) +{ + int retry = SWRM_LINK_STATUS_RETRY_CNT; + int comp_sts; + + do { + ctrl->reg_read(ctrl, SWRM_COMP_STATUS, &comp_sts); + + if (comp_sts & SWRM_FRM_GEN_ENABLED) + return true; + + usleep_range(500, 510); + } while (retry--); + + dev_err(ctrl->dev, "%s: link status not %s\n", __func__, + comp_sts & SWRM_FRM_GEN_ENABLED ? "connected" : "disconnected"); + + return false; +} + static int qcom_swrm_init(struct qcom_swrm_ctrl *ctrl) { u32 val; @@ -841,16 +861,28 @@ static int qcom_swrm_init(struct qcom_swrm_ctrl *ctrl) SWRM_RD_WR_CMD_RETRIES); }
+ /* COMP Enable */ + ctrl->reg_write(ctrl, SWRM_COMP_CFG_ADDR, SWRM_COMP_CFG_ENABLE_MSK); + /* Set IRQ to PULSE */ ctrl->reg_write(ctrl, SWRM_COMP_CFG_ADDR, - SWRM_COMP_CFG_IRQ_LEVEL_OR_PULSE_MSK | - SWRM_COMP_CFG_ENABLE_MSK); + SWRM_COMP_CFG_IRQ_LEVEL_OR_PULSE_MSK); + + ctrl->reg_write(ctrl, ctrl->reg_layout[SWRM_REG_INTERRUPT_CLEAR], + 0xFFFFFFFF);
/* enable CPU IRQs */ if (ctrl->mmio) { ctrl->reg_write(ctrl, ctrl->reg_layout[SWRM_REG_INTERRUPT_CPU_EN], SWRM_INTERRUPT_STATUS_RMSK); } + + /* Set IRQ to PULSE */ + ctrl->reg_write(ctrl, SWRM_COMP_CFG_ADDR, + SWRM_COMP_CFG_IRQ_LEVEL_OR_PULSE_MSK | + SWRM_COMP_CFG_ENABLE_MSK); + + swrm_wait_for_frame_gen_enabled(ctrl); ctrl->slave_status = 0; ctrl->reg_read(ctrl, SWRM_COMP_PARAMS, &val); ctrl->rd_fifo_depth = FIELD_GET(SWRM_COMP_PARAMS_RD_FIFO_DEPTH, val); @@ -1626,26 +1658,6 @@ static int qcom_swrm_remove(struct platform_device *pdev) return 0; }
-static bool swrm_wait_for_frame_gen_enabled(struct qcom_swrm_ctrl *ctrl) -{ - int retry = SWRM_LINK_STATUS_RETRY_CNT; - int comp_sts; - - do { - ctrl->reg_read(ctrl, SWRM_COMP_STATUS, &comp_sts); - - if (comp_sts & SWRM_FRM_GEN_ENABLED) - return true; - - usleep_range(500, 510); - } while (retry--); - - dev_err(ctrl->dev, "%s: link status not %s\n", __func__, - comp_sts & SWRM_FRM_GEN_ENABLED ? "connected" : "disconnected"); - - return false; -} - static int __maybe_unused swrm_runtime_resume(struct device *dev) { struct qcom_swrm_ctrl *ctrl = dev_get_drvdata(dev);
WSA Soundwire controller needs an full reset if clock stop support is not available in slave devices. WSA881x does not support clock stop however WSA883x supports clock stop.
Make setting this flag at runtime to address above issue.
Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org --- drivers/soundwire/qcom.c | 13 ++++--------- 1 file changed, 4 insertions(+), 9 deletions(-)
diff --git a/drivers/soundwire/qcom.c b/drivers/soundwire/qcom.c index 1d2a105cb77f..b6c3fadc9090 100644 --- a/drivers/soundwire/qcom.c +++ b/drivers/soundwire/qcom.c @@ -617,10 +617,14 @@ static int qcom_swrm_enumerate(struct sdw_bus *bus)
sdw_extract_slave_id(bus, addr, &id); found = false; + ctrl->clock_stop_not_supported = false; /* Now compare with entries */ list_for_each_entry_safe(slave, _s, &bus->slaves, node) { if (sdw_compare_devid(slave, id) == 0) { qcom_swrm_set_slave_dev_num(bus, slave, i); + if (slave->prop.clk_stop_mode1) + ctrl->clock_stop_not_supported = true; + found = true; break; } @@ -1623,15 +1627,6 @@ static int qcom_swrm_probe(struct platform_device *pdev) pm_runtime_set_active(dev); pm_runtime_enable(dev);
- /* Clk stop is not supported on WSA Soundwire masters */ - if (ctrl->version <= SWRM_VERSION_1_3_0) { - ctrl->clock_stop_not_supported = true; - } else { - ctrl->reg_read(ctrl, SWRM_COMP_MASTER_ID, &val); - if (val == MASTER_ID_WSA) - ctrl->clock_stop_not_supported = true; - } - #ifdef CONFIG_DEBUG_FS ctrl->debugfs = debugfs_create_dir("qualcomm-sdw", ctrl->bus.debugfs); debugfs_create_file("qualcomm-registers", 0400, ctrl->debugfs, ctrl,
On 25-05-23, 14:38, Srinivas Kandagatla wrote:
During x13s audio testing we hit few corner cases due to issues in codec drivers and some obvious code bugs.
Here are the fixes for those issues, mostly the issues are around devices loosing the sync in between runtime pm suspend resume path.
Applied, thanks
On Thu, May 25, 2023 at 02:38:08PM +0100, Srinivas Kandagatla wrote:
During x13s audio testing we hit few corner cases due to issues in codec drivers and some obvious code bugs.
Here are the fixes for those issues, mostly the issues are around devices loosing the sync in between runtime pm suspend resume path.
With codec fixes along with these fixes, audio on x13s is pretty stable.
Thanks, Srini
Changes since v1:
- dropped runtime pm changes patch as unable to reproduced it anymore
- fixed clk stop flag as suggested by Pierre
- rebased on top of linux-next
I tried to update to this series on my 6.4-rc5 branch for the X13s and the above changes appear to lead to breakages again.
Specifically, with the updated clk stop flag (simple_clk_stop_capable) I see:
[ 14.789533] wcd9380-codec sdw:0:0217:010d:00:3: Slave 1 state check1: UNATTACHED, status was 1 [ 14.789717] qcom-soundwire 3330000.soundwire-controller: qcom_swrm_irq_handler: SWR bus clsh detected [ 14.796164] wcd9380-codec sdw:0:0217:010d:00:3: Slave 1 state check1: UNATTACHED, status was 1
and without the runtime pm patch that you dropped in v2 I get intermittent (e.g. twice in five boots):
[ 11.527301] snd-sc8280xp sound: ASoC: adding FE link failed [ 11.527409] snd-sc8280xp sound: ASoC: topology: could not load header: -517 [ 11.527557] qcom-apm gprsvc:service:2:1: tplg component load failed-517 [ 11.527640] qcom-apm gprsvc:service:2:1: ASoC: error at snd_soc_component_probe on gprsvc:service:2:1: -22 [ 11.528079] snd-sc8280xp sound: ASoC: failed to instantiate card -22 [ 11.533388] snd-sc8280xp: probe of sound failed with error -22
Again, this was with 6.4-rc5, but these problems are likely still there also with linux-next.
Johan
On 05/06/2023 10:08, Johan Hovold wrote:
On Thu, May 25, 2023 at 02:38:08PM +0100, Srinivas Kandagatla wrote:
During x13s audio testing we hit few corner cases due to issues in codec drivers and some obvious code bugs.
Here are the fixes for those issues, mostly the issues are around devices loosing the sync in between runtime pm suspend resume path.
With codec fixes along with these fixes, audio on x13s is pretty stable.
Thanks, Srini
Changes since v1:
- dropped runtime pm changes patch as unable to reproduced it anymore
- fixed clk stop flag as suggested by Pierre
- rebased on top of linux-next
I tried to update to this series on my 6.4-rc5 branch for the X13s and the above changes appear to lead to breakages again.
These patches are based on linux next, we can not cleanly apply them on rc5 i guess without the depended patches.
I have tried these patches with your rc4 branch along with other depended patches my branch is at: https://git.linaro.org/people/srinivas.kandagatla/linux.git/log/?h=wip/sc828...
this works fine for me, not seeing any issues so far. tested both speakers and headset.
Specifically, with the updated clk stop flag (simple_clk_stop_capable) I see:
[ 14.789533] wcd9380-codec sdw:0:0217:010d:00:3: Slave 1 state check1: UNATTACHED, status was 1 [ 14.789717] qcom-soundwire 3330000.soundwire-controller: qcom_swrm_irq_handler: SWR bus clsh detected [ 14.796164] wcd9380-codec sdw:0:0217:010d:00:3: Slave 1 state check1: UNATTACHED, status was 1
and without the runtime pm patch that you dropped in v2 I get intermittent (e.g. twice in five boots):
[ 11.527301] snd-sc8280xp sound: ASoC: adding FE link failed [ 11.527409] snd-sc8280xp sound: ASoC: topology: could not load header: -517
Never seen this, looks like some corrupted tplg to me.. I might be wrong.
[ 11.527557] qcom-apm gprsvc:service:2:1: tplg component load failed-517 [ 11.527640] qcom-apm gprsvc:service:2:1: ASoC: error at snd_soc_component_probe on gprsvc:service:2:1: -22 [ 11.528079] snd-sc8280xp sound: ASoC: failed to instantiate card -22 [ 11.533388] snd-sc8280xp: probe of sound failed with error -22
Again, this was with 6.4-rc5, but these problems are likely still there also with linux-next.
Have you tried linux-next?
--srini
Johan
On Wed, Jun 07, 2023 at 10:36:40AM +0100, Srinivas Kandagatla wrote:
On 05/06/2023 10:08, Johan Hovold wrote:
On Thu, May 25, 2023 at 02:38:08PM +0100, Srinivas Kandagatla wrote:
During x13s audio testing we hit few corner cases due to issues in codec drivers and some obvious code bugs.
Here are the fixes for those issues, mostly the issues are around devices loosing the sync in between runtime pm suspend resume path.
With codec fixes along with these fixes, audio on x13s is pretty stable.
Changes since v1:
- dropped runtime pm changes patch as unable to reproduced it anymore
- fixed clk stop flag as suggested by Pierre
- rebased on top of linux-next
I tried to update to this series on my 6.4-rc5 branch for the X13s and the above changes appear to lead to breakages again.
These patches are based on linux next, we can not cleanly apply them on rc5 i guess without the depended patches.
Yeah, I did the corresponding changes you did in v2 to v1 applied to 6.4-rc5 instead of backporting the dependencies.
I have tried these patches with your rc4 branch along with other depended patches my branch is at: https://git.linaro.org/people/srinivas.kandagatla/linux.git/log/?h=wip/sc828...
this works fine for me, not seeing any issues so far. tested both speakers and headset.
Specifically, with the updated clk stop flag (simple_clk_stop_capable) I see:
[ 14.789533] wcd9380-codec sdw:0:0217:010d:00:3: Slave 1 state check1: UNATTACHED, status was 1 [ 14.789717] qcom-soundwire 3330000.soundwire-controller: qcom_swrm_irq_handler: SWR bus clsh detected [ 14.796164] wcd9380-codec sdw:0:0217:010d:00:3: Slave 1 state check1: UNATTACHED, status was 1
So as we discussed of list, this particular issue was due to a mistake I made when "backporting" your v2 where I forgot to invert the test when replacing simple_clk_stop_capable with clk_stop_mode1.
and without the runtime pm patch that you dropped in v2 I get intermittent (e.g. twice in five boots):
[ 11.527301] snd-sc8280xp sound: ASoC: adding FE link failed [ 11.527409] snd-sc8280xp sound: ASoC: topology: could not load header: -517
Never seen this, looks like some corrupted tplg to me.. I might be wrong.
[ 11.527557] qcom-apm gprsvc:service:2:1: tplg component load failed-517 [ 11.527640] qcom-apm gprsvc:service:2:1: ASoC: error at snd_soc_component_probe on gprsvc:service:2:1: -22 [ 11.528079] snd-sc8280xp sound: ASoC: failed to instantiate card -22 [ 11.533388] snd-sc8280xp: probe of sound failed with error -22
Again, this was with 6.4-rc5, but these problems are likely still there also with linux-next.
Have you tried linux-next?
No, not yet, but I just triggered the above once more after not having seen with my latest -rc5 branch for a while (e.g. 20 reboots?):
[ 11.430131] qcom-soundwire 3210000.soundwire-controller: Qualcomm Soundwire controller v1.6.0 Registered [ 11.431741] wcd938x_codec audio-codec: bound sdw:0:0217:010d:00:4 (ops wcd938x_sdw_component_ops [snd_soc_wcd938x_sdw]) [ 11.431933] wcd938x_codec audio-codec: bound sdw:0:0217:010d:00:3 (ops wcd938x_sdw_component_ops [snd_soc_wcd938x_sdw]) [ 11.435406] qcom-soundwire 3330000.soundwire-controller: Qualcomm Soundwire controller v1.6.0 Registered [ 11.449286] qcom-soundwire 3250000.soundwire-controller: Qualcomm Soundwire controller v1.6.0 Registered [ 11.450632] wsa883x-codec sdw:0:0217:0202:00:1: WSA883X Version 1_1, Variant: WSA8835_V2 [ 11.453155] wsa883x-codec sdw:0:0217:0202:00:1: WSA883X Version 1_1, Variant: WSA8835_V2 [ 11.456511] wsa883x-codec sdw:0:0217:0202:00:2: WSA883X Version 1_1, Variant: WSA8835_V2 [ 11.562623] q6apm-dai 3000000.remoteproc:glink-edge:gpr:service@1:dais: Adding to iommu group 23 [ 11.585766] snd-sc8280xp sound: ASoC: adding FE link failed [ 11.585872] snd-sc8280xp sound: ASoC: topology: could not load header: -517 [ 11.586021] qcom-apm gprsvc:service:2:1: tplg component load failed-517 [ 11.586100] qcom-apm gprsvc:service:2:1: ASoC: error at snd_soc_component_probe on gprsvc:service:2:1: -22 [ 11.586530] snd-sc8280xp sound: ASoC: failed to instantiate card -22 [ 11.591831] snd-sc8280xp: probe of sound failed with error -22
I don't think I've ever seen it before dropping the runtime PM patch as you did in v2, and I hit it twice fairly quickly after dropping it. And now again.
I'm not saying that the runtime PM patch is necessarily correct, and perhaps it is just changes in timing that lead to the above, but we definitely have a bug here.
My X13s is sitting here in the above state right now if there's some state you want me to inspect.
This is the branch I'm using:
https://github.com/jhovold/linux/tree/wip/sc8280xp-v6.4-rc5
with commit
c0ab29445663 ("soundwire: qcom: enable runtime pm before controller is registered")
reverted.
Johan
On Thu, Jun 08, 2023 at 12:11:45PM +0200, Johan Hovold wrote:
On Wed, Jun 07, 2023 at 10:36:40AM +0100, Srinivas Kandagatla wrote:
No, not yet, but I just triggered the above once more after not having seen with my latest -rc5 branch for a while (e.g. 20 reboots?):
[ 11.430131] qcom-soundwire 3210000.soundwire-controller: Qualcomm Soundwire controller v1.6.0 Registered [ 11.431741] wcd938x_codec audio-codec: bound sdw:0:0217:010d:00:4 (ops wcd938x_sdw_component_ops [snd_soc_wcd938x_sdw]) [ 11.431933] wcd938x_codec audio-codec: bound sdw:0:0217:010d:00:3 (ops wcd938x_sdw_component_ops [snd_soc_wcd938x_sdw]) [ 11.435406] qcom-soundwire 3330000.soundwire-controller: Qualcomm Soundwire controller v1.6.0 Registered [ 11.449286] qcom-soundwire 3250000.soundwire-controller: Qualcomm Soundwire controller v1.6.0 Registered [ 11.450632] wsa883x-codec sdw:0:0217:0202:00:1: WSA883X Version 1_1, Variant: WSA8835_V2 [ 11.453155] wsa883x-codec sdw:0:0217:0202:00:1: WSA883X Version 1_1, Variant: WSA8835_V2 [ 11.456511] wsa883x-codec sdw:0:0217:0202:00:2: WSA883X Version 1_1, Variant: WSA8835_V2 [ 11.562623] q6apm-dai 3000000.remoteproc:glink-edge:gpr:service@1:dais: Adding to iommu group 23 [ 11.585766] snd-sc8280xp sound: ASoC: adding FE link failed [ 11.585872] snd-sc8280xp sound: ASoC: topology: could not load header: -517 [ 11.586021] qcom-apm gprsvc:service:2:1: tplg component load failed-517 [ 11.586100] qcom-apm gprsvc:service:2:1: ASoC: error at snd_soc_component_probe on gprsvc:service:2:1: -22 [ 11.586530] snd-sc8280xp sound: ASoC: failed to instantiate card -22 [ 11.591831] snd-sc8280xp: probe of sound failed with error -22
I don't think I've ever seen it before dropping the runtime PM patch as you did in v2, and I hit it twice fairly quickly after dropping it. And now again.
I'm not saying that the runtime PM patch is necessarily correct, and perhaps it is just changes in timing that lead to the above, but we definitely have a bug here.
I searched my notes and realised that I have seen this once also with the runtime pm patch. So the fact that happened to see it more often after dropping it is likely due to changes in timing.
Looking at the above log it seems like we hit a probe deferral somewhere as some resource is not available yet, and this is eventually turned into a hard failure that breaks audio as the error is propagated up the stack.
Johan
On 08/06/2023 13:40, Johan Hovold wrote:
On Thu, Jun 08, 2023 at 12:11:45PM +0200, Johan Hovold wrote:
On Wed, Jun 07, 2023 at 10:36:40AM +0100, Srinivas Kandagatla wrote:
No, not yet, but I just triggered the above once more after not having seen with my latest -rc5 branch for a while (e.g. 20 reboots?):
[ 11.430131] qcom-soundwire 3210000.soundwire-controller: Qualcomm Soundwire controller v1.6.0 Registered [ 11.431741] wcd938x_codec audio-codec: bound sdw:0:0217:010d:00:4 (ops wcd938x_sdw_component_ops [snd_soc_wcd938x_sdw]) [ 11.431933] wcd938x_codec audio-codec: bound sdw:0:0217:010d:00:3 (ops wcd938x_sdw_component_ops [snd_soc_wcd938x_sdw]) [ 11.435406] qcom-soundwire 3330000.soundwire-controller: Qualcomm Soundwire controller v1.6.0 Registered [ 11.449286] qcom-soundwire 3250000.soundwire-controller: Qualcomm Soundwire controller v1.6.0 Registered [ 11.450632] wsa883x-codec sdw:0:0217:0202:00:1: WSA883X Version 1_1, Variant: WSA8835_V2 [ 11.453155] wsa883x-codec sdw:0:0217:0202:00:1: WSA883X Version 1_1, Variant: WSA8835_V2 [ 11.456511] wsa883x-codec sdw:0:0217:0202:00:2: WSA883X Version 1_1, Variant: WSA8835_V2 [ 11.562623] q6apm-dai 3000000.remoteproc:glink-edge:gpr:service@1:dais: Adding to iommu group 23 [ 11.585766] snd-sc8280xp sound: ASoC: adding FE link failed [ 11.585872] snd-sc8280xp sound: ASoC: topology: could not load header: -517 [ 11.586021] qcom-apm gprsvc:service:2:1: tplg component load failed-517 [ 11.586100] qcom-apm gprsvc:service:2:1: ASoC: error at snd_soc_component_probe on gprsvc:service:2:1: -22 [ 11.586530] snd-sc8280xp sound: ASoC: failed to instantiate card -22 [ 11.591831] snd-sc8280xp: probe of sound failed with error -22
I don't think I've ever seen it before dropping the runtime PM patch as you did in v2, and I hit it twice fairly quickly after dropping it. And now again.
I'm not saying that the runtime PM patch is necessarily correct, and perhaps it is just changes in timing that lead to the above, but we definitely have a bug here.
I searched my notes and realised that I have seen this once also with the runtime pm patch. So the fact that happened to see it more often after dropping it is likely due to changes in timing.
Looking at the above log it seems like we hit a probe deferral somewhere as some resource is not available yet, and this is eventually turned into a hard failure that breaks audio as the error is propagated up the stack.
I was looking at this too, And I think this change should help.. Not tried it though
----------------------------------->cut<---------------------------------- diff --git a/drivers/soundwire/qcom.c b/drivers/soundwire/qcom.c index cdabfe2feceb..b57532d6b163 100644 --- a/drivers/soundwire/qcom.c +++ b/drivers/soundwire/qcom.c @@ -543,7 +543,7 @@ static int qcom_swrm_enumerate(struct sdw_bus *bus) list_for_each_entry_safe(slave, _s, &bus->slaves, node) { if (sdw_compare_devid(slave, id) == 0) { qcom_swrm_set_slave_dev_num(bus, slave, i); - if (!slave->prop.simple_clk_stop_capable) + if (slave->prop.clk_stop_mode1) ctrl->clock_stop_not_supported = true;
found = true; diff --git a/sound/soc/qcom/qdsp6/topology.c b/sound/soc/qcom/qdsp6/topology.c index cccc59b570b9..3c50a560bc84 100644 --- a/sound/soc/qcom/qdsp6/topology.c +++ b/sound/soc/qcom/qdsp6/topology.c @@ -1276,10 +1276,8 @@ int audioreach_tplg_init(struct snd_soc_component *component) }
ret = snd_soc_tplg_component_load(component, &audioreach_tplg_ops, fw); - if (ret < 0) { + if (ret < 0) dev_err(dev, "tplg component load failed%d\n", ret); - ret = -EINVAL; - }
release_firmware(fw); err:
----------------------------------->cut<----------------------------------
Johan
On Thu, Jun 08, 2023 at 01:45:22PM +0100, Srinivas Kandagatla wrote:
On 08/06/2023 13:40, Johan Hovold wrote:
Looking at the above log it seems like we hit a probe deferral somewhere as some resource is not available yet, and this is eventually turned into a hard failure that breaks audio as the error is propagated up the stack.
I was looking at this too, And I think this change should help.. Not tried it though
ret = snd_soc_tplg_component_load(component, &audioreach_tplg_ops, fw);
- if (ret < 0) {
- if (ret < 0) dev_err(dev, "tplg component load failed%d\n", ret);
ret = -EINVAL;
- }
That in itself only seems to make the problem worse with new follow-on errors. Looks like a dereference of an error pointer in a codec driver:
[ 12.086999] snd-sc8280xp sound: ASoC: adding FE link failed [ 12.087231] snd-sc8280xp sound: ASoC: topology: could not load header: -517 [ 12.087547] qcom-apm gprsvc:service:2:1: tplg component load failed-517 [ 12.088865] snd-sc8280xp sound: ASoC: failed to instantiate card -517 [ 12.117093] genirq: Flags mismatch irq 289. 00002001 (HPHR PDM WD INT) vs. 00002001 (HPHR PDM WD INT) [ 12.117504] wcd938x_codec audio-codec: Failed to request HPHR WD interrupt (-16) [ 12.117664] genirq: Flags mismatch irq 290. 00002001 (HPHL PDM WD INT) vs. 00002001 (HPHL PDM WD INT) [ 12.117861] wcd938x_codec audio-codec: Failed to request HPHL WD interrupt (-16) [ 12.118010] genirq: Flags mismatch irq 291. 00002001 (AUX PDM WD INT) vs. 00002001 (AUX PDM WD INT) [ 12.118197] wcd938x_codec audio-codec: Failed to request Aux WD interrupt (-16) [ 12.118579] genirq: Flags mismatch irq 292. 00002001 (mbhc sw intr) vs. 00002001 (mbhc sw intr) [ 12.118763] wcd938x_codec audio-codec: Failed to request mbhc interrupts -16 [ 12.122415] snd-sc8280xp sound: ASoC: Parent card not yet available, widget card binding deferred [ 12.126740] Unable to handle kernel paging request at virtual address fffffffffffffff8 [ 12.126856] Mem abort info: [ 12.126903] ESR = 0x0000000096000004 [ 12.126961] EC = 0x25: DABT (current EL), IL = 32 bits [ 12.127036] SET = 0, FnV = 0 [ 12.127085] EA = 0, S1PTW = 0 [ 12.127135] FSC = 0x04: level 0 translation fault [ 12.127205] Data abort info: [ 12.127250] ISV = 0, ISS = 0x00000004 [ 12.127308] CM = 0, WnR = 0 [ 12.127356] swapper pgtable: 4k pages, 48-bit VAs, pgdp=000000028e41b000 [ 12.127445] [fffffffffffffff8] pgd=0000000000000000, p4d=0000000000000000 [ 12.127624] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP [ 12.127674] Modules linked in: q6apm_dai q6apm_lpass_dais snd_soc_wsa883x q6prm_clocks snd_q6dsp_common q6prm michael_mic cbc des_generic libdes ecb algif_skcipher md5 algif_hash af_alg ip6_tables xt_LOG nf_log_syslog ipt_REJECT nf_reject_ipv4 xt_tcpudp xt_conntrack snd_q6apm nf_conntrack libcrc32c nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter r8152 mii qrtr_mhi panel_edp snd_soc_hdmi_codec venus_dec venus_enc fastrpc gpio_sbu_mux rpmsg_ctrl apr videobuf2_dma_contig rpmsg_char qrtr_smd qcom_spmi_adc_tm5 videobuf2_memops snd_soc_lpass_va_macro snd_soc_lpass_tx_macro snd_soc_lpass_wsa_macro snd_soc_lpass_rx_macro snd_soc_lpass_macro_common ath11k_pci qcom_pm8008_regulator ath11k mac80211 venus_core pmic_glink_altmode qcom_battmgr msm libarc4 v4l2_mem2mem hci_uart videobuf2_v4l2 gpu_sched btqca videodev bluetooth drm_display_helper cfg80211 phy_qcom_qmp_combo videobuf2_common leds_qcom_lpg qcom_spmi_temp_alarm drm_dp_aux_bus ecdh_generic qcom_spmi_adc5 drm_kms_helper mc ecc led_class_multicolor rtc_pm8xxx mhi industrialio [ 12.127941] qcom_pon rfkill syscopyarea snd_soc_sc8280xp reboot_mode qcom_vadc_common sysfillrect nvmem_qcom_spmi_sdam qcom_pm8008 snd_soc_wcd938x sysimgblt snd_soc_qcom_common snd_soc_qcom_sdw videocc_sc8280xp typec phy_qcom_edp qcom_stats regmap_i2c snd_soc_wcd938x_sdw phy_qcom_qmp_usb pinctrl_sc8280xp_lpass_lpi phy_qcom_snps_femto_v2 qcom_q6v5_pas icc_bwmon soundwire_qcom regmap_sdw pinctrl_lpass_lpi snd_soc_wcd_mbhc lpasscc_sc8280xp qcom_pil_info snd_soc_core qcom_common snd_compress qcom_glink_smem qcom_q6v5 snd_pcm qcom_sysmon snd_timer icc_osm_l3 mdt_loader qrtr qcom_wdt qcom_rng snd pmic_glink pdr_interface soundcore socinfo qmi_helpers pwm_bl soundwire_bus drm dm_mod ip_tables x_tables ipv6 pcie_qcom crc8 phy_qcom_qmp_pcie nvme nvme_core hid_multitouch i2c_qcom_geni i2c_hid_of i2c_hid i2c_core [ 12.129101] CPU: 1 PID: 158 Comm: kworker/u16:7 Not tainted 6.4.0-rc5 #62 [ 12.129153] Hardware name: LENOVO 21BYZ9SRUS/21BYZ9SRUS, BIOS N3HET53W (1.25 ) 10/12/2022 [ 12.129209] Workqueue: events_unbound deferred_probe_work_func [ 12.129267] pstate: a0400005 (NzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 12.129319] pc : wcd_mbhc_start+0x28/0x380 [snd_soc_wcd_mbhc] [ 12.129381] lr : wcd938x_codec_set_jack+0x28/0x48 [snd_soc_wcd938x] [ 12.129445] sp : ffff80000a523950 [ 12.129471] x29: ffff80000a523950 x28: ffff408fd176f080 x27: ffff408fd176c880 [ 12.129530] x26: ffffac710f2c0b20 x25: 0002000000000000 x24: 00000000fffffdf4 [ 12.129588] x23: ffff408fd176c880 x22: ffff408fd176d518 x21: ffff408fd176f080 [ 12.129645] x20: ffff408fc5c30c80 x19: fffffffffffffff0 x18: 0000000000000020 [ 12.129701] x17: 0000000000000000 x16: ffffac710e565dac x15: ffffffffffffffff [ 12.129758] x14: ffff40905779741b x13: ffffffffffffffff x12: 0000000000000000 [ 12.129814] x11: ffff409b340586f0 x10: ffffac710fbcf1e0 x9 : 0000000000000000 [ 12.129870] x8 : ffff408fd9288700 x7 : 0000000000000000 x6 : ffff408fd77891a0 [ 12.129926] x5 : ffff408fd176f750 x4 : 0000000000000000 x3 : ffffac70c8662310 [ 12.132271] x2 : ffff408fd176d518 x1 : ffff408fc3480158 x0 : fffffffffffffff0 [ 12.133443] Call trace: [ 12.134003] wcd_mbhc_start+0x28/0x380 [snd_soc_wcd_mbhc] [ 12.134567] wcd938x_codec_set_jack+0x28/0x48 [snd_soc_wcd938x] [ 12.135131] snd_soc_component_set_jack+0x28/0x8c [snd_soc_core] [ 12.135709] qcom_snd_wcd_jack_setup+0x7c/0x19c [snd_soc_qcom_common] [ 12.136278] sc8280xp_snd_init+0x20/0x2c [snd_soc_sc8280xp] [ 12.137080] snd_soc_link_init+0x28/0x90 [snd_soc_core] [ 12.139427] snd_soc_bind_card+0x628/0xbfc [snd_soc_core] [ 12.141761] snd_soc_register_card+0xec/0x104 [snd_soc_core] [ 12.144082] devm_snd_soc_register_card+0x4c/0xa4 [snd_soc_core] [ 12.146402] sc8280xp_platform_probe+0xf0/0x108 [snd_soc_sc8280xp] [ 12.148694] platform_probe+0x68/0xd8 [ 12.150969] really_probe+0x184/0x3c8 [ 12.153257] __driver_probe_device+0x7c/0x16c [ 12.155532] driver_probe_device+0x3c/0x110 [ 12.157787] __device_attach_driver+0xbc/0x158 [ 12.160056] bus_for_each_drv+0x84/0xe0 [ 12.162304] __device_attach+0xa8/0x1d4 [ 12.164518] device_initial_probe+0x14/0x20 [ 12.166716] bus_probe_device+0xb0/0xb4 [ 12.168924] deferred_probe_work_func+0xa0/0xf4 [ 12.171100] process_one_work+0x288/0x5bc [ 12.173267] worker_thread+0x74/0x450 [ 12.175409] kthread+0x124/0x128 [ 12.177555] ret_from_fork+0x10/0x20 [ 12.179683] Code: fa401804 54001ae0 a9025bf5 aa0003f3 (f9400415) [ 12.181815] ---[ end trace 0000000000000000 ]---
Johan
On 08/06/2023 14:05, Johan Hovold wrote:
On Thu, Jun 08, 2023 at 01:45:22PM +0100, Srinivas Kandagatla wrote:
On 08/06/2023 13:40, Johan Hovold wrote:
Looking at the above log it seems like we hit a probe deferral somewhere as some resource is not available yet, and this is eventually turned into a hard failure that breaks audio as the error is propagated up the stack.
I was looking at this too, And I think this change should help.. Not tried it though
ret = snd_soc_tplg_component_load(component, &audioreach_tplg_ops, fw);
- if (ret < 0) {
- if (ret < 0) dev_err(dev, "tplg component load failed%d\n", ret);
ret = -EINVAL;
- }
That in itself only seems to make the problem worse with new follow-on errors. Looks like a dereference of an error pointer in a codec driver:
Thanks Johan,
I will try to see if we can fix this properly.
--srini
participants (3)
-
Johan Hovold
-
Srinivas Kandagatla
-
Vinod Koul