[PATCH v2 0/3] soundwire: intel: exit clock-stop mode before system suspend
Intel validation reported an issue where the HW_RST self-clearing bit is not cleared in hardware, which as a ripple effect creates issues with the clock stop mode.
This happens in a specific sequence where the Intel manager is pm_runtime suspended with the clock-stop mode enabled. During the system suspend, we currently do nothing, which can lead to potential issues on system resume and the following pm_runtime suspend, depending on the hardware state.
This patch suggests a full resume if the clock-stop mode is used. This may require extra time but will make the suspend/resume flows completely symmetric. This also removes a race condition where we could not access SHIM registers if the parent was suspended as well. Resuming the link also resumes the parent by construction.
BugLink: https://github.com/thesofproject/linux/issues/2606
v2:
- Better comments and commit messages.
- Modified the .prepare callback to only deal with the corner case that is NOT covered today instead of systematically doing a full resume.
Pierre-Louis Bossart (3):
  soundwire: intel: fix potential race condition during power down
  soundwire: intel: skip suspend/resume/wake when link was not started
  soundwire: intel: conditionally exit clock stop mode on system suspend

 drivers/soundwire/intel.c | 150 ++++++++++++++++++++++++++++++--------
 drivers/soundwire/intel.h |   1 +
 2 files changed, 119 insertions(+), 32 deletions(-)
[PATCH v2 1/3] soundwire: intel: fix potential race condition during power down
From: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
The power down sequence sets the link_up flag as false outside of the mutex_lock. This is potentially unsafe.
In addition, the flow in that sequence can be improved by first testing if the link was powered, then setting the link_up flag to false, and only then proceeding with the power down. If the CPA bits cannot be cleared, we only flag an error since we cannot deal with interrupts any longer.
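The ordering described above can be sketched in userspace C. This is only an illustration under stated assumptions: `struct fake_link`, `fake_link_power_down()` and the `hw_refuses` knob are hypothetical stand-ins, and a pthread mutex replaces the kernel `shim_lock`. It is not the driver code.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-link state; not the driver's real struct. */
struct fake_link {
	pthread_mutex_t lock; /* models sdw->link_res->shim_lock */
	bool powered;         /* models this link's bit in *shim_mask */
	bool link_up;         /* models sdw->cdns.link_up */
	bool hw_refuses;      /* test knob: pretend the CPA bit never clears */
};

/* Returns 0 on success, -1 if the simulated hardware did not power down. */
static int fake_link_power_down(struct fake_link *l)
{
	int ret = 0;

	pthread_mutex_lock(&l->lock);

	/* The flag is cleared while the lock is held, before the power down. */
	l->link_up = false;
	l->powered = false;

	/* A failure is only reported; link_up deliberately stays false. */
	if (l->hw_refuses)
		ret = -1;

	pthread_mutex_unlock(&l->lock);

	return ret;
}
```

Even when the simulated power down fails, a caller sees `link_up == false`, which mirrors the rationale given above: once the link is disabled, interrupts can no longer be handled.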
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
---
 drivers/soundwire/intel.c | 23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)
diff --git a/drivers/soundwire/intel.c b/drivers/soundwire/intel.c
index 9794bc222fb5..808dda1380c2 100644
--- a/drivers/soundwire/intel.c
+++ b/drivers/soundwire/intel.c
@@ -538,12 +538,14 @@ static int intel_link_power_down(struct sdw_intel *sdw)

 	mutex_lock(sdw->link_res->shim_lock);

-	intel_shim_master_ip_to_glue(sdw);
-
 	if (!(*shim_mask & BIT(link_id)))
 		dev_err(sdw->cdns.dev,
 			"%s: Unbalanced power-up/down calls\n", __func__);

+	sdw->cdns.link_up = false;
+
+	intel_shim_master_ip_to_glue(sdw);
+
 	*shim_mask &= ~BIT(link_id);

 	if (!*shim_mask) {
@@ -560,18 +562,19 @@ static int intel_link_power_down(struct sdw_intel *sdw)
 		link_control &= spa_mask;

 		ret = intel_clear_bit(shim, SDW_SHIM_LCTL, link_control, cpa_mask);
+		if (ret < 0) {
+			dev_err(sdw->cdns.dev, "%s: could not power down link\n", __func__);
+
+			/*
+			 * we leave the sdw->cdns.link_up flag as false since we've disabled
+			 * the link at this point and cannot handle interrupts any longer.
+			 */
+		}
 	}

 	mutex_unlock(sdw->link_res->shim_lock);

-	if (ret < 0) {
-		dev_err(sdw->cdns.dev, "%s: could not power down link\n", __func__);
-
-		return ret;
-	}
-
-	sdw->cdns.link_up = false;
-	return 0;
+	return ret;
 }

 static void intel_shim_sync_arm(struct sdw_intel *sdw)
[PATCH v2 2/3] soundwire: intel: skip suspend/resume/wake when link was not started
From: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
The SoundWire Linux devices are created purely based on information provided by platform firmware (e.g. ACPI DSDT table). When the kernel finds a matching driver for the device address (_ADR), the probe will initialize required data structures and initialize pm ops.
When the SoundWire link is started at a later point, the physical devices will synchronize on the SoundWire frames and report their attachment status, thereby triggering the enumeration and initialization of device registers.
This two-step solution was a conscious design decision to allow e.g. a driver to use sideband mechanisms to turn power rails on. This can also allow OEMs to describe multiple platforms with the same DSDT table, even when some of the described devices are not physically present in hardware.
The drawback of this approach is a bit of confusion, with more Linux devices than are actually present in hardware. This results in 'ghost' devices, for which the driver successfully probes, but that will not generate any traffic on the bus. Suspend-resume transitions are handled by drivers, and skipped when the devices are not physically present.
This patch provides a work-around for a second level of confusion in platform firmware: some platforms only use HDaudio links, but nevertheless expose SoundWire 'ghost' devices. This results in error messages in the Intel driver while trying to suspend/resume these links. The simplest solution is to add a boolean status flag to skip all suspend/resume/wake sequences if the link was never started.
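The guard described above can be sketched as a small userspace model. The struct and function names (`fake_link_state`, `fake_link_should_skip`, `fake_suspend`) are hypothetical and exist only for illustration; the two booleans model `bus->prop.hw_disabled` and the new `startup_done` flag.

```c
#include <stdbool.h>

/* Hypothetical model of the per-link state consulted by the PM callbacks. */
struct fake_link_state {
	bool hw_disabled;  /* models bus->prop.hw_disabled */
	bool startup_done; /* models sdw->startup_done, set at the end of startup */
	int suspend_calls; /* counts how often the real suspend work would run */
};

/* A link is skipped if it is disabled or was never started ('ghost' link). */
static bool fake_link_should_skip(const struct fake_link_state *s)
{
	return s->hw_disabled || !s->startup_done;
}

static int fake_suspend(struct fake_link_state *s)
{
	if (fake_link_should_skip(s))
		return 0; /* nothing to do; this is not reported as an error */

	s->suspend_calls++;
	return 0;
}
```

The same early return would apply to every suspend, resume and wake entry point, so a link that was probed but never started produces no error messages.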
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
---
 drivers/soundwire/intel.c | 22 ++++++++++++----------
 drivers/soundwire/intel.h |  1 +
 2 files changed, 13 insertions(+), 10 deletions(-)
diff --git a/drivers/soundwire/intel.c b/drivers/soundwire/intel.c
index 808dda1380c2..8b42053b171f 100644
--- a/drivers/soundwire/intel.c
+++ b/drivers/soundwire/intel.c
@@ -1525,6 +1525,7 @@ int intel_link_startup(struct auxiliary_device *auxdev)
 	if (!(link_flags & SDW_INTEL_MASTER_DISABLE_PM_RUNTIME_IDLE))
 		pm_runtime_idle(dev);

+	sdw->startup_done = true;
 	return 0;

 err_interrupt:
@@ -1564,8 +1565,9 @@ int intel_link_process_wakeen_event(struct auxiliary_device *auxdev)
 	sdw = dev_get_drvdata(dev);
 	bus = &sdw->cdns.bus;

-	if (bus->prop.hw_disabled) {
-		dev_dbg(dev, "SoundWire master %d is disabled, ignoring\n", bus->link_id);
+	if (bus->prop.hw_disabled || !sdw->startup_done) {
+		dev_dbg(dev, "SoundWire master %d is disabled or not-started, ignoring\n",
+			bus->link_id);
 		return 0;
 	}

@@ -1602,8 +1604,8 @@ static int __maybe_unused intel_suspend(struct device *dev)
 	u32 clock_stop_quirks;
 	int ret;

-	if (bus->prop.hw_disabled) {
-		dev_dbg(dev, "SoundWire master %d is disabled, ignoring\n",
+	if (bus->prop.hw_disabled || !sdw->startup_done) {
+		dev_dbg(dev, "SoundWire master %d is disabled or not-started, ignoring\n",
 			bus->link_id);
 		return 0;
 	}
@@ -1656,8 +1658,8 @@ static int __maybe_unused intel_suspend_runtime(struct device *dev)
 	u32 clock_stop_quirks;
 	int ret;

-	if (bus->prop.hw_disabled) {
-		dev_dbg(dev, "SoundWire master %d is disabled, ignoring\n",
+	if (bus->prop.hw_disabled || !sdw->startup_done) {
+		dev_dbg(dev, "SoundWire master %d is disabled or not-started, ignoring\n",
 			bus->link_id);
 		return 0;
 	}
@@ -1721,8 +1723,8 @@ static int __maybe_unused intel_resume(struct device *dev)
 	bool multi_link;
 	int ret;

-	if (bus->prop.hw_disabled) {
-		dev_dbg(dev, "SoundWire master %d is disabled, ignoring\n",
+	if (bus->prop.hw_disabled || !sdw->startup_done) {
+		dev_dbg(dev, "SoundWire master %d is disabled or not-started, ignoring\n",
 			bus->link_id);
 		return 0;
 	}
@@ -1819,8 +1821,8 @@ static int __maybe_unused intel_resume_runtime(struct device *dev)
 	int status;
 	int ret;

-	if (bus->prop.hw_disabled) {
-		dev_dbg(dev, "SoundWire master %d is disabled, ignoring\n",
+	if (bus->prop.hw_disabled || !sdw->startup_done) {
+		dev_dbg(dev, "SoundWire master %d is disabled or not-started, ignoring\n",
 			bus->link_id);
 		return 0;
 	}
diff --git a/drivers/soundwire/intel.h b/drivers/soundwire/intel.h
index 0b47b148da3f..cd93a44dba9a 100644
--- a/drivers/soundwire/intel.h
+++ b/drivers/soundwire/intel.h
@@ -41,6 +41,7 @@ struct sdw_intel {
 	struct sdw_cdns cdns;
 	int instance;
 	struct sdw_intel_link_res *link_res;
+	bool startup_done;
 #ifdef CONFIG_DEBUG_FS
 	struct dentry *debugfs;
 #endif
[PATCH v2 3/3] soundwire: intel: conditionally exit clock stop mode on system suspend
From: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Intel stress tests reported issues with the clock stop mode, specifically when trying to do a system suspend while the link is already pm_runtime suspended.
In this case, we need to disable the shim wake, but when the PCI parent device is also pm_runtime suspended the SHIM registers are not accessible.
Since this is an invalid corner case, this patch suggests a pm_runtime resume of the entire bus to full power (parent+child devices) before the system suspend so that the shim wake can be disabled.
Unlike the suspend operation, the .prepare callbacks are propagated from root device to leaf devices. By adding a .prepare callback at the SoundWire link level, we can double-check the pm_runtime status of the device as well as its parent PCI device. When the problematic configuration is detected, the device is pm_runtime resumed, which by construction also resumes its parent.
An additional loop is added to resume all child devices. In theory we only need to restart the link, but doing so will also cause the physical devices to synchronize and re-initialize, while their Linux devices remain pm_runtime suspended. It's simpler to make sure the codec devices are fully resumed so that we don't have to deal with zombie states.
This additional loop could have been avoided by adding a .prepare callback in SoundWire codec drivers. Functionally this would have been equivalent. The rationale for implementing a loop at the link level is only to reduce the amount of code required to deal at the codec level with an Intel corner case - in other words keep codec drivers independent from Intel platform-specific programming sequences.
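The decision taken in the .prepare step and the link-level loop over child devices can be modelled in plain C as follows. This is a sketch under stated assumptions: the `FAKE_*` quirk bits, `struct fake_dev`, `needs_full_resume()` and `resume_children()` are hypothetical names and values chosen for illustration, and the model ignores the asynchronous nature of runtime PM requests.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical quirk bits; the real values live in the driver headers. */
#define FAKE_CLK_STOP_TEARDOWN  (1u << 0)
#define FAKE_CLK_STOP_BUS_RESET (1u << 1)

/* Hypothetical model of a child (codec) device on the link. */
struct fake_dev {
	bool runtime_suspended;
	bool probed;      /* a driver is bound to the device */
	bool seen_on_bus; /* the device was detected on the bus at least once */
};

/*
 * Full resume is needed only when clock stop is in use and both the link
 * and its parent are runtime suspended (SHIM registers not accessible).
 */
static bool needs_full_resume(bool link_suspended, bool parent_suspended,
			      unsigned int quirks)
{
	bool clock_stop_used = (quirks & FAKE_CLK_STOP_BUS_RESET) || !quirks;

	return link_suspended && parent_suspended && clock_stop_used;
}

/* Resume every child that is probed and physically present; return the count. */
static int resume_children(struct fake_dev *children, size_t n)
{
	int resumed = 0;

	for (size_t i = 0; i < n; i++) {
		if (!children[i].probed || !children[i].seen_on_bus)
			continue; /* skip devices that never showed up */
		children[i].runtime_suspended = false;
		resumed++;
	}

	return resumed;
}
```

Keeping the loop at the link level means the model's child devices need no knowledge of the Intel-specific condition, which is the design rationale stated above.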
BugLink: https://github.com/thesofproject/linux/issues/2606
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
---
 drivers/soundwire/intel.c | 105 +++++++++++++++++++++++++++++++++-----
 1 file changed, 93 insertions(+), 12 deletions(-)
diff --git a/drivers/soundwire/intel.c b/drivers/soundwire/intel.c
index 8b42053b171f..f66fcbc33a2f 100644
--- a/drivers/soundwire/intel.c
+++ b/drivers/soundwire/intel.c
@@ -1596,6 +1596,87 @@ int intel_link_process_wakeen_event(struct auxiliary_device *auxdev)
  * PM calls
  */

+static int intel_resume_child_device(struct device *dev, void *data)
+{
+	int ret;
+	struct sdw_slave *slave = dev_to_sdw_dev(dev);
+
+	if (!slave->probed) {
+		dev_dbg(dev, "%s: skipping device, no probed driver\n", __func__);
+		return 0;
+	}
+	if (!slave->dev_num_sticky) {
+		dev_dbg(dev, "%s: skipping device, never detected on bus\n", __func__);
+		return 0;
+	}
+
+	ret = pm_request_resume(dev);
+	if (ret < 0)
+		dev_err(dev, "%s: pm_request_resume failed: %d\n", __func__, ret);
+
+	return ret;
+}
+
+static int __maybe_unused intel_pm_prepare(struct device *dev)
+{
+	struct sdw_cdns *cdns = dev_get_drvdata(dev);
+	struct sdw_intel *sdw = cdns_to_intel(cdns);
+	struct sdw_bus *bus = &cdns->bus;
+	u32 clock_stop_quirks;
+	int ret = 0;
+
+	if (bus->prop.hw_disabled || !sdw->startup_done) {
+		dev_dbg(dev, "SoundWire master %d is disabled or not-started, ignoring\n",
+			bus->link_id);
+		return 0;
+	}
+
+	clock_stop_quirks = sdw->link_res->clock_stop_quirks;
+
+	if (pm_runtime_suspended(dev) &&
+	    pm_runtime_suspended(dev->parent) &&
+	    ((clock_stop_quirks & SDW_INTEL_CLK_STOP_BUS_RESET) ||
+	     !clock_stop_quirks)) {
+		/*
+		 * if we've enabled clock stop, and the parent is suspended, the SHIM registers
+		 * are not accessible and the shim wake cannot be disabled.
+		 * The only solution is to resume the entire bus to full power
+		 */
+
+		/*
+		 * If any operation in this block fails, we keep going since we don't want
+		 * to prevent system suspend from happening and errors should be recoverable
+		 * on resume.
+		 */
+
+		/*
+		 * first resume the device for this link. This will also by construction
+		 * resume the PCI parent device.
+		 */
+		ret = pm_request_resume(dev);
+		if (ret < 0) {
+			dev_err(dev, "%s: pm_request_resume failed: %d\n", __func__, ret);
+			return 0;
+		}
+
+		/*
+		 * Continue resuming the entire bus (parent + child devices) to exit
+		 * the clock stop mode. If there are no devices connected on this link
+		 * this is a no-op.
+		 * The resume to full power could have been implemented with a .prepare
+		 * step in SoundWire codec drivers. This would however require a lot
+		 * of code to handle an Intel-specific corner case. It is simpler in
+		 * practice to add a loop at the link level.
+		 */
+		ret = device_for_each_child(bus->dev, NULL, intel_resume_child_device);
+
+		if (ret < 0)
+			dev_err(dev, "%s: intel_resume_child_device failed: %d\n", __func__, ret);
+	}
+
+	return 0;
+}
+
 static int __maybe_unused intel_suspend(struct device *dev)
 {
 	struct sdw_cdns *cdns = dev_get_drvdata(dev);
@@ -1615,19 +1696,18 @@ static int __maybe_unused intel_suspend(struct device *dev)

 	clock_stop_quirks = sdw->link_res->clock_stop_quirks;

-	if ((clock_stop_quirks & SDW_INTEL_CLK_STOP_BUS_RESET ||
-	     !clock_stop_quirks) &&
-	    !pm_runtime_suspended(dev->parent)) {
+	if ((clock_stop_quirks & SDW_INTEL_CLK_STOP_BUS_RESET) ||
+	    !clock_stop_quirks) {

-		/*
-		 * if we've enabled clock stop, and the parent
-		 * is still active, disable shim wake. The
-		 * SHIM registers are not accessible if the
-		 * parent is already pm_runtime suspended so
-		 * it's too late to change that configuration
-		 */
-
-		intel_shim_wake(sdw, false);
+		if (pm_runtime_suspended(dev->parent)) {
+			/*
+			 * paranoia check: this should not happen with the .prepare
+			 * resume to full power
+			 */
+			dev_err(dev, "%s: invalid config: parent is suspended\n", __func__);
+		} else {
+			intel_shim_wake(sdw, false);
+		}
 	}

 	return 0;
@@ -1992,6 +2072,7 @@ static int __maybe_unused intel_resume_runtime(struct device *dev)
 }

 static const struct dev_pm_ops intel_pm = {
+	.prepare = intel_pm_prepare,
 	SET_SYSTEM_SLEEP_PM_OPS(intel_suspend, intel_resume)
 	SET_RUNTIME_PM_OPS(intel_suspend_runtime, intel_resume_runtime, NULL)
 };
On 18-08-21, 10:49, Bard Liao wrote:
> Intel validation reported an issue where the HW_RST self-clearing bit is not cleared in hardware, which as a ripple effect creates issues with the clock stop mode.
> This happens in a specific sequence where the Intel manager is pm_runtime suspended with the clock-stop mode enabled. During the system suspend, we currently do nothing, which can lead to potential issues on system resume and the following pm_runtime suspend, depending on the hardware state.
> This patch suggests a full resume if the clock-stop mode is used. This may require extra time but will make the suspend/resume flows completely symmetric. This also removes a race condition where we could not access SHIM registers if the parent was suspended as well. Resuming the link also resumes the parent by construction.
Applied all, thanks
Participants (2): Bard Liao, Vinod Koul