[PATCH] ASoC: max98373: Mark cache dirty before entering sleep
Amp lose its register values in case amp power loss or 'ForceReset' over Soundwire SCP_ctrl register(0x0044) or HW_RESET pin control during the audio suspend and resume. Mark cache dirty before audio suspension to restore existing values when audio resume.
Signed-off-by: Ryan Lee ryans.lee@maximintegrated.com --- sound/soc/codecs/max98373-sdw.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/sound/soc/codecs/max98373-sdw.c b/sound/soc/codecs/max98373-sdw.c index dc520effc61c..a7e4a6e880b0 100644 --- a/sound/soc/codecs/max98373-sdw.c +++ b/sound/soc/codecs/max98373-sdw.c @@ -259,6 +259,7 @@ static __maybe_unused int max98373_suspend(struct device *dev) regmap_read(max98373->regmap, max98373->cache[i].reg, &max98373->cache[i].val);
regcache_cache_only(max98373->regmap, true); + regcache_mark_dirty(max98373->regmap);
return 0; }
On 9/24/21 5:13 PM, Ryan Lee wrote:
Amp lose its register values in case amp power loss or 'ForceReset' over Soundwire SCP_ctrl register(0x0044) or HW_RESET pin control during the audio suspend and resume. Mark cache dirty before audio suspension to restore existing values when audio resume.
Signed-off-by: Ryan Lee ryans.lee@maximintegrated.com
sound/soc/codecs/max98373-sdw.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/sound/soc/codecs/max98373-sdw.c b/sound/soc/codecs/max98373-sdw.c index dc520effc61c..a7e4a6e880b0 100644 --- a/sound/soc/codecs/max98373-sdw.c +++ b/sound/soc/codecs/max98373-sdw.c @@ -259,6 +259,7 @@ static __maybe_unused int max98373_suspend(struct device *dev) regmap_read(max98373->regmap, max98373->cache[i].reg, &max98373->cache[i].val);
regcache_cache_only(max98373->regmap, true);
- regcache_mark_dirty(max98373->regmap);
We already do the following sequence in max98373_io_init() when the amplifier re-attaches:
if (max98373->first_hw_init) { regcache_cache_bypass(max98373->regmap, false); regcache_mark_dirty(max98373->regmap); }
I don't see what marking the cache as dirty on suspend might do, we will do a sync only in the resume step.
IIRC this is a patch that we've seen before and removed since it wasn't aligned with any other codec driver.
Does this actually improve anything?
-----Original Message----- From: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Sent: Monday, September 27, 2021 7:55 AM To: Ryan Lee RyanS.Lee@maximintegrated.com; lgirdwood@gmail.com; broonie@kernel.org; perex@perex.cz; tiwai@suse.com; yung- chuan.liao@linux.intel.com; guennadi.liakhovetski@linux.intel.com; alsa- devel@alsa-project.org; linux-kernel@vger.kernel.org Cc: sathya.prakash.m.r@intel.com; ryan.lee.maxim@gmail.com Subject: [EXTERNAL] Re: [PATCH] ASoC: max98373: Mark cache dirty before entering sleep
EXTERNAL EMAIL
On 9/24/21 5:13 PM, Ryan Lee wrote:
Amp lose its register values in case amp power loss or 'ForceReset' over Soundwire SCP_ctrl register(0x0044) or HW_RESET pin control during the audio suspend and resume. Mark cache dirty before audio suspension to restore existing values when audio resume.
Signed-off-by: Ryan Lee ryans.lee@maximintegrated.com
sound/soc/codecs/max98373-sdw.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/sound/soc/codecs/max98373-sdw.c b/sound/soc/codecs/max98373-sdw.c index dc520effc61c..a7e4a6e880b0 100644 --- a/sound/soc/codecs/max98373-sdw.c +++ b/sound/soc/codecs/max98373-sdw.c @@ -259,6 +259,7 @@ static __maybe_unused int
max98373_suspend(struct device *dev)
regmap_read(max98373->regmap, max98373->cache[i].reg,
&max98373->cache[i].val);
regcache_cache_only(max98373->regmap, true);
regcache_mark_dirty(max98373->regmap);
We already do the following sequence in max98373_io_init() when the amplifier re-attaches:
if (max98373->first_hw_init) { regcache_cache_bypass(max98373->regmap, false); regcache_mark_dirty(max98373->regmap); }
I don't see what marking the cache as dirty on suspend might do, we will do a sync only in the resume step.
IIRC this is a patch that we've seen before and removed since it wasn't aligned with any other codec driver.
Does this actually improve anything?
Yes, it does. There was an mute problem report due to amp register reset during suspend/resume. and we confirmed that the modification is effective. (https://partnerissuetracker.corp.google.com/issues/194472331) The added code helps to re-write valid values in cache to the amp hardware when audio resume. Same code was there on i2c driver, but not on Soundwire driver.
On Mon, Sep 27, 2021 at 04:01:25PM +0000, Ryan Lee wrote:
regcache_cache_only(max98373->regmap, true);
regcache_mark_dirty(max98373->regmap);
We already do the following sequence in max98373_io_init() when the amplifier re-attaches:
if (max98373->first_hw_init) { regcache_cache_bypass(max98373->regmap, false); regcache_mark_dirty(max98373->regmap); }
I don't see what marking the cache as dirty on suspend might do, we will do a sync only in the resume step.
IIRC this is a patch that we've seen before and removed since it wasn't aligned with any other codec driver.
Yes, it does. There was an mute problem report due to amp register reset during suspend/resume. and we confirmed that the modification is effective. (https://partnerissuetracker.corp.google.com/issues/194472331) The added code helps to re-write valid values in cache to the amp hardware when audio resume. Same code was there on i2c driver, but not on Soundwire driver.
More specifically what it does is make the invalidation of the register cache unconditional. It doesn't really matter if the invalidation is done on suspend or resume, so long as it happens before we attempt to resync - this could also be done by deleting the first_hw_init check.
On 9/27/21 11:06 AM, Mark Brown wrote:
On Mon, Sep 27, 2021 at 04:01:25PM +0000, Ryan Lee wrote:
regcache_cache_only(max98373->regmap, true);
regcache_mark_dirty(max98373->regmap);
We already do the following sequence in max98373_io_init() when the amplifier re-attaches:
if (max98373->first_hw_init) { regcache_cache_bypass(max98373->regmap, false); regcache_mark_dirty(max98373->regmap); }
I don't see what marking the cache as dirty on suspend might do, we will do a sync only in the resume step.
IIRC this is a patch that we've seen before and removed since it wasn't aligned with any other codec driver.
Yes, it does. There was an mute problem report due to amp register reset during suspend/resume. and we confirmed that the modification is effective. (https://partnerissuetracker.corp.google.com/issues/194472331) The added code helps to re-write valid values in cache to the amp hardware when audio resume. Same code was there on i2c driver, but not on Soundwire driver.
Ryan, we removed this in f184892613dd ('ASoC: codecs: max98373-sdw: align regmap use with other codecs'), so even if this was needed you'd need a mention that this is a revert and why this sequence is better. You are suggesting a change based on an analogy with I2C which is questionable: when a SoundWire device regains sync on the bus, it will be re-initialized using a callback, and the resume waits for the initialization to complete.
More specifically what it does is make the invalidation of the register cache unconditional. It doesn't really matter if the invalidation is done on suspend or resume, so long as it happens before we attempt to resync - this could also be done by deleting the first_hw_init check.
Mark, that's exactly my point: if the amp rejoins the bus, we will *always* mark the cache as dirty, before the resync is done in the resume sequence.
I am really trying to figure out if we have a major flaw in the resume sequence and why things are different in the case of the Maxim amp.
Instead of changing the suspend sequence, can we please try to modify the max98373_io_init() routine to unconditionally flag the cache as dirty, maybe this points to a problem with the management of the max98373->first_hw_init flag.
On Mon, Sep 27, 2021 at 11:48:56AM -0500, Pierre-Louis Bossart wrote:
On 9/27/21 11:06 AM, Mark Brown wrote:
More specifically what it does is make the invalidation of the register cache unconditional. It doesn't really matter if the invalidation is done on suspend or resume, so long as it happens before we attempt to resync - this could also be done by deleting the first_hw_init check.
Mark, that's exactly my point: if the amp rejoins the bus, we will *always* mark the cache as dirty, before the resync is done in the resume sequence.
Ah, yes - I see.
I am really trying to figure out if we have a major flaw in the resume sequence and why things are different in the case of the Maxim amp.
Instead of changing the suspend sequence, can we please try to modify the max98373_io_init() routine to unconditionally flag the cache as dirty, maybe this points to a problem with the management of the max98373->first_hw_init flag.
A quick survey of other drivers suggests that this pattern should be factored out into some helpers as it looks like there's several ways of implementing it that look very similar but not quite the same...
On 9/27/21 12:10 PM, Mark Brown wrote:
On Mon, Sep 27, 2021 at 11:48:56AM -0500, Pierre-Louis Bossart wrote:
On 9/27/21 11:06 AM, Mark Brown wrote:
More specifically what it does is make the invalidation of the register cache unconditional. It doesn't really matter if the invalidation is done on suspend or resume, so long as it happens before we attempt to resync - this could also be done by deleting the first_hw_init check.
Mark, that's exactly my point: if the amp rejoins the bus, we will *always* mark the cache as dirty, before the resync is done in the resume sequence.
Ah, yes - I see.
I am really trying to figure out if we have a major flaw in the resume sequence and why things are different in the case of the Maxim amp.
Instead of changing the suspend sequence, can we please try to modify the max98373_io_init() routine to unconditionally flag the cache as dirty, maybe this points to a problem with the management of the max98373->first_hw_init flag.
A quick survey of other drivers suggests that this pattern should be factored out into some helpers as it looks like there's several ways of implementing it that look very similar but not quite the same...
No disagreement here, we tried really hard to enforce a common pattern for suspend-resume, but i just noticed that the maxim amp driver is different on suspend (resume is consistent with the rest).
static int __maybe_unused rt711_dev_suspend(struct device *dev) { struct rt711_priv *rt711 = dev_get_drvdata(dev);
if (!rt711->hw_init) return 0;
cancel_delayed_work_sync(&rt711->jack_detect_work); cancel_delayed_work_sync(&rt711->jack_btn_check_work); cancel_work_sync(&rt711->calibration_work);
regcache_cache_only(rt711->regmap, true);
return 0; }
static int __maybe_unused rt1308_dev_suspend(struct device *dev) { struct rt1308_sdw_priv *rt1308 = dev_get_drvdata(dev);
if (!rt1308->hw_init) return 0;
regcache_cache_only(rt1308->regmap, true);
return 0; }
static __maybe_unused int max98373_suspend(struct device *dev) { struct max98373_priv *max98373 = dev_get_drvdata(dev); int i;
<<<< missing test
/* cache feedback register values before suspend */ for (i = 0; i < max98373->cache_num; i++) regmap_read(max98373->regmap, max98373->cache[i].reg, &max98373->cache[i].val);
<<<< why is this needed???
regcache_cache_only(max98373->regmap, true);
return 0; }
On Mon, Sep 27, 2021 at 12:23:06PM -0500, Pierre-Louis Bossart wrote:
On 9/27/21 12:10 PM, Mark Brown wrote:
A quick survey of other drivers suggests that this pattern should be factored out into some helpers as it looks like there's several ways of implementing it that look very similar but not quite the same...
No disagreement here, we tried really hard to enforce a common pattern for suspend-resume, but i just noticed that the maxim amp driver is different on suspend (resume is consistent with the rest).
There seem to be several slightly different ways of writing what I think is supposed to be the same thing in _io_init() too.
-----Original Message----- From: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Sent: Monday, September 27, 2021 10:23 AM To: Mark Brown broonie@kernel.org Cc: guennadi.liakhovetski@linux.intel.com; alsa-devel@alsa-project.org; ryan.lee.maxim@gmail.com; Ryan Lee RyanS.Lee@maximintegrated.com; linux-kernel@vger.kernel.org; tiwai@suse.com; lgirdwood@gmail.com; sathya.prakash.m.r@intel.com; yung-chuan.liao@linux.intel.com Subject: Re: [EXTERNAL] Re: [PATCH] ASoC: max98373: Mark cache dirty before entering sleep
EXTERNAL EMAIL
On 9/27/21 12:10 PM, Mark Brown wrote:
On Mon, Sep 27, 2021 at 11:48:56AM -0500, Pierre-Louis Bossart wrote:
On 9/27/21 11:06 AM, Mark Brown wrote:
More specifically what it does is make the invalidation of the register cache unconditional. It doesn't really matter if the invalidation is done on suspend or resume, so long as it happens before we attempt to resync - this could also be done by deleting the
first_hw_init check.
Mark, that's exactly my point: if the amp rejoins the bus, we will *always* mark the cache as dirty, before the resync is done in the resume sequence.
Ah, yes - I see.
I am really trying to figure out if we have a major flaw in the resume sequence and why things are different in the case of the Maxim
amp.
Instead of changing the suspend sequence, can we please try to modify the max98373_io_init() routine to unconditionally flag the cache as dirty, maybe this points to a problem with the management of the max98373->first_hw_init flag.
A quick survey of other drivers suggests that this pattern should be factored out into some helpers as it looks like there's several ways of implementing it that look very similar but not quite the same...
No disagreement here, we tried really hard to enforce a common pattern for suspend-resume, but i just noticed that the maxim amp driver is different on suspend (resume is consistent with the rest).
OK. I believe it was similar before. But it looks like 'regcache_mark_dirty' is being disappeared on suspend function.
static int __maybe_unused rt5682_dev_suspend(struct device *dev) { struct rt5682_priv *rt5682 = dev_get_drvdata(dev);
if (!rt5682->hw_init) return 0;
cancel_delayed_work_sync(&rt5682->jack_detect_work);
regcache_cache_only(rt5682->regmap, true); regcache_mark_dirty(rt5682->regmap);
return 0; }
static int __maybe_unused rt711_dev_suspend(struct device *dev) { struct rt711_priv *rt711 = dev_get_drvdata(dev);
if (!rt711->hw_init) return 0; cancel_delayed_work_sync(&rt711->jack_detect_work); cancel_delayed_work_sync(&rt711->jack_btn_check_work); cancel_work_sync(&rt711->calibration_work); regcache_cache_only(rt711->regmap, true); return 0;
}
static int __maybe_unused rt1308_dev_suspend(struct device *dev) { struct rt1308_sdw_priv *rt1308 = dev_get_drvdata(dev);
if (!rt1308->hw_init) return 0; regcache_cache_only(rt1308->regmap, true); return 0;
}
static __maybe_unused int max98373_suspend(struct device *dev) { struct max98373_priv *max98373 = dev_get_drvdata(dev); int i;
<<<< missing test
/* cache feedback register values before suspend */ for (i = 0; i < max98373->cache_num; i++) regmap_read(max98373->regmap, max98373->cache[i].reg,
&max98373->cache[i].val);
<<<< why is this needed???
[] It looks like this was added to get a last ADC values when ADC value read is not available during suspension. https://www.spinics.net/lists/alsa-devel/msg119808.html
regcache_cache_only(max98373->regmap, true); return 0;
}
Instead of changing the suspend sequence, can we please try to modify the max98373_io_init() routine to unconditionally flag the cache as dirty, maybe this points to a problem with the management of the max98373->first_hw_init flag.
A quick survey of other drivers suggests that this pattern should be factored out into some helpers as it looks like there's several ways of implementing it that look very similar but not quite the same...
No disagreement here, we tried really hard to enforce a common pattern for suspend-resume, but i just noticed that the maxim amp driver is different on suspend (resume is consistent with the rest).
OK. I believe it was similar before. But it looks like 'regcache_mark_dirty' is being disappeared on suspend function.
Not sure what you are trying to say?
static int __maybe_unused rt5682_dev_suspend(struct device *dev) { struct rt5682_priv *rt5682 = dev_get_drvdata(dev);
if (!rt5682->hw_init) return 0;
cancel_delayed_work_sync(&rt5682->jack_detect_work);
regcache_cache_only(rt5682->regmap, true); regcache_mark_dirty(rt5682->regmap);
return 0; }
That last line is also not needed. If you look at rt5682-sdw.c, you'll see a regcache_mark_dirty() when the device is re-initialized.
But now I am starting to wonder if this is due to Chrome kernel differences and possibly a missing backport patch? I no longer believe in coincidences, these two devices are ONLY used in Chromebooks so far...
-----Original Message----- From: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Sent: Monday, September 27, 2021 9:49 AM To: Mark Brown broonie@kernel.org; Ryan Lee RyanS.Lee@maximintegrated.com Cc: lgirdwood@gmail.com; perex@perex.cz; tiwai@suse.com; yung- chuan.liao@linux.intel.com; guennadi.liakhovetski@linux.intel.com; alsa- devel@alsa-project.org; linux-kernel@vger.kernel.org; sathya.prakash.m.r@intel.com; ryan.lee.maxim@gmail.com Subject: Re: [EXTERNAL] Re: [PATCH] ASoC: max98373: Mark cache dirty before entering sleep
EXTERNAL EMAIL
On 9/27/21 11:06 AM, Mark Brown wrote:
On Mon, Sep 27, 2021 at 04:01:25PM +0000, Ryan Lee wrote:
regcache_cache_only(max98373->regmap, true);
regcache_mark_dirty(max98373->regmap);
We already do the following sequence in max98373_io_init() when the amplifier re-attaches:
if (max98373->first_hw_init) { regcache_cache_bypass(max98373->regmap, false); regcache_mark_dirty(max98373->regmap); }
I don't see what marking the cache as dirty on suspend might do, we will do a sync only in the resume step.
IIRC this is a patch that we've seen before and removed since it wasn't aligned with any other codec driver.
Yes, it does. There was an mute problem report due to amp register reset during suspend/resume. and we confirmed that the modification is effective.
(https://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpa
rtnerissuetracker.corp.google.com%2Fissues%2F194472331&data=04% 7C
01%7Cryans.lee%40maximintegrated.com%7C56f7bb3f05ae4c199dce08d98 1d6b8
94%7Cfbd909dfea694788a554f24b7854ad03%7C0%7C0%7C6376835814870 22873%7C
Unknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJB TiI6Ik1
haWwiLCJXVCI6Mn0%3D%7C1000&sdata=U1LZabH5MbrotS976TKK8Rh 9nqi0ueRO
%2FxR0obeQlrM%3D&reserved=0) The added code helps to re-write valid values in cache to the amp hardware when audio resume. Same code was there on i2c driver, but not on Soundwire driver.
Ryan, we removed this in f184892613dd ('ASoC: codecs: max98373-sdw: align regmap use with other codecs'), so even if this was needed you'd need a mention that this is a revert and why this sequence is better. You are suggesting a change based on an analogy with I2C which is questionable: when a SoundWire device regains sync on the bus, it will be re- initialized using a callback, and the resume waits for the initialization to complete.
I think there is always possibility that amp lose its power or is reset by hw reset pin control during audio suspension to minimize current consumption. Register restoration process is required for both i2c and Soundwire case when the amp was reset by some reason. max98373_update_status () is called when audio resume but ' sdw_slave_status' remains ' SDW_SLAVE_ATTACHED' and ' first_hw_init' is 1, so restoration is not happening. This is software variable and the value remains the same for the amp hardware reset not triggered by the software driver. If regcache_mark_dirty() is not called, regcache_sync() will assume that the hardware state still matches the cache state which causes the mute problem here.
More specifically what it does is make the invalidation of the register cache unconditional. It doesn't really matter if the invalidation is done on suspend or resume, so long as it happens before we attempt to resync - this could also be done by deleting the
first_hw_init check.
I tried to delete the first_hw_init, but it didn't work because other status variable is also related. Calling regcache_mark_dirty() is needed to solve a synchronization issue between cache and actual hardware value during suspend/resume, but it was not called.
Mark, that's exactly my point: if the amp rejoins the bus, we will *always* mark the cache as dirty, before the resync is done in the resume sequence.
I am really trying to figure out if we have a major flaw in the resume sequence and why things are different in the case of the Maxim amp.
Maybe other amps were not reset during audio suspend/resume. Even max98373 was okay with the previous gen. Intel board.
Instead of changing the suspend sequence, can we please try to modify the max98373_io_init() routine to unconditionally flag the cache as dirty, maybe this points to a problem with the management of the max98373->first_hw_init flag.
max98373_io_init() is not called because ' sdw_slave_status' remains ' SDW_SLAVE_ATTACHED' and 'max98373->hw_init' is already true. Removing 'if (max98373->hw_init || status != SDW_SLAVE_ATTACHED)' condition in max98373_update_status() function instead of adding regcache_mark_dirty() into max98373_suspend() can be an alternative way. I think it is all about where regcache_mark_dirty() is called from. The difference is that max98373_io_init() really do the software reset and do amp initialization again which could be an overhead.
Instead of changing the suspend sequence, can we please try to modify the max98373_io_init() routine to unconditionally flag the cache as dirty, maybe this points to a problem with the management of the max98373->first_hw_init flag.
max98373_io_init() is not called because ' sdw_slave_status' remains ' SDW_SLAVE_ATTACHED' and 'max98373->hw_init' is already true. Removing 'if (max98373->hw_init || status != SDW_SLAVE_ATTACHED)' condition in max98373_update_status() function instead of adding regcache_mark_dirty() into max98373_suspend() can be an alternative way. I think it is all about where regcache_mark_dirty() is called from. The difference is that max98373_io_init() really do the software reset and do amp initialization again which could be an overhead.
that description is aligned with my analysis that there's something very wrong happening here, it's not just a simple miss in the regmap handling but a major conceptual bug or misunderstanding in the way reset is handled.
First, there's the spec: on a reset initiated by the host or if the device loses sync for ANY reason, its status cannot remain ATTACHED. There's got to be a 16-frame period at least where the device has to monitor the sync pattern and cannot drive anything on the bus.
Then there's the hardware behavior on resume: on resume by default the Intel host will toggle the data pin for at least 4096 frames, which by spec means severe reset.
And last, there's the software init: we also force the status as UNATTACHED in drivers/soundwire/intel.c:
/* * make sure all Slaves are tagged as UNATTACHED and provide * reason for reinitialization */ sdw_clear_slave_status(bus, SDW_UNATTACH_REQUEST_MASTER_RESET);
But we've also seen the opposite effect of an amplifier reporting attached but losing sync immediately after the end of enumeration and never coming back on the bus, see issue https://github.com/thesofproject/linux/issues/3063
In other words, we need to check what really happens on resume and why the amplifier keeps reporting its status as ATTACHED despite the spec requirements and software init, or loses this status after enumeration....Something really does not add-up, again it's not just a regmap management issue.
-----Original Message----- From: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Sent: Monday, September 27, 2021 12:34 PM To: Ryan Lee RyanS.Lee@maximintegrated.com; Mark Brown broonie@kernel.org Cc: lgirdwood@gmail.com; perex@perex.cz; tiwai@suse.com; yung- chuan.liao@linux.intel.com; guennadi.liakhovetski@linux.intel.com; alsa- devel@alsa-project.org; linux-kernel@vger.kernel.org; sathya.prakash.m.r@intel.com; ryan.lee.maxim@gmail.com Subject: Re: [EXTERNAL] Re: [PATCH] ASoC: max98373: Mark cache dirty before entering sleep
EXTERNAL EMAIL
Instead of changing the suspend sequence, can we please try to modify the max98373_io_init() routine to unconditionally flag the cache as dirty, maybe this points to a problem with the management of the max98373->first_hw_init flag.
max98373_io_init() is not called because ' sdw_slave_status' remains ' SDW_SLAVE_ATTACHED' and 'max98373->hw_init' is already true. Removing 'if (max98373->hw_init || status != SDW_SLAVE_ATTACHED)' condition in max98373_update_status() function instead of adding regcache_mark_dirty() into max98373_suspend() can be an alternative way. I think it is all about where regcache_mark_dirty() is called from. The difference is that max98373_io_init() really do the software reset and do amp initialization again which could be an overhead.
that description is aligned with my analysis that there's something very wrong happening here, it's not just a simple miss in the regmap handling but a major conceptual bug or misunderstanding in the way reset is handled.
First, there's the spec: on a reset initiated by the host or if the device loses sync for ANY reason, its status cannot remain ATTACHED. There's got to be a 16-frame period at least where the device has to monitor the sync pattern and cannot drive anything on the bus.
Then there's the hardware behavior on resume: on resume by default the Intel host will toggle the data pin for at least 4096 frames, which by spec means severe reset.
And last, there's the software init: we also force the status as UNATTACHED in drivers/soundwire/intel.c:
/* * make sure all Slaves are tagged as UNATTACHED and provide * reason for reinitialization */ sdw_clear_slave_status(bus,
SDW_UNATTACH_REQUEST_MASTER_RESET);
But we've also seen the opposite effect of an amplifier reporting attached but losing sync immediately after the end of enumeration and never coming back on the bus, see issue https://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgith ub.com%2Fthesofproject%2Flinux%2Fissues%2F3063&data=04%7C01% 7Cryans.lee%40maximintegrated.com%7Cb9f84a1267ec4f50b7a008d981edc c46%7Cfbd909dfea694788a554f24b7854ad03%7C0%7C0%7C637683680607 026027%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2 luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=rARkwTSB3 DN%2BCYxGaehOhtCGEj1eLBl6Mk7QhynQSY8%3D&reserved=0
In other words, we need to check what really happens on resume and why the amplifier keeps reporting its status as ATTACHED despite the spec requirements and software init, or loses this status after enumeration....Something really does not add-up, again it's not just a regmap management issue.
I agree that the fix is not necessary if the reset issue was not occurred. Thanks for your input about the status check on Intel driver, too. I will continue to find the culprit who cause the amp reset, but this is not simple because it is not only related to Maxim driver but also other things on both hardware and software side. I think making the amp driver code more conservative by adding regcache_mark_dirty() can make the system robust from the glitches between suspend and resume. This is what I can do from the amp driver side and the code I added is not new but a proven way on existing drivers as we all know. I just wonder why the issue was not observed when the code is removed. This probably means there is an external reason which was not exist before.
Instead of changing the suspend sequence, can we please try to
modify
the max98373_io_init() routine to unconditionally flag the cache as dirty, maybe this points to a problem with the management of the max98373->first_hw_init flag.
max98373_io_init() is not called because ' sdw_slave_status' remains
'
SDW_SLAVE_ATTACHED' and 'max98373->hw_init' is already true. Removing 'if (max98373->hw_init || status !=
SDW_SLAVE_ATTACHED)'
condition in max98373_update_status() function instead of adding regcache_mark_dirty() into max98373_suspend() can be an
alternative way.
I think it is all about where regcache_mark_dirty() is called from. The difference is that max98373_io_init() really do the software
reset
and do amp initialization again which could be an overhead.
that description is aligned with my analysis that there's something very wrong happening here, it's not just a simple miss in the regmap handling but a major conceptual bug or misunderstanding in the way reset is handled.
First, there's the spec: on a reset initiated by the host or if the device loses sync for ANY reason, its status cannot remain ATTACHED. There's got to be a 16-frame period at least where the device has to monitor the sync pattern and cannot drive anything on the bus.
Then there's the hardware behavior on resume: on resume by default the Intel host will toggle the data pin for at least 4096 frames, which by spec means severe reset.
And last, there's the software init: we also force the status as UNATTACHED in drivers/soundwire/intel.c:
/* * make sure all Slaves are tagged as UNATTACHED and provide * reason for reinitialization */ sdw_clear_slave_status(bus,
SDW_UNATTACH_REQUEST_MASTER_RESET);
But we've also seen the opposite effect of an amplifier reporting attached but losing sync immediately after the end of enumeration and never coming back on the bus, see issue https://nam02.safelinks.protection.outlook.com/?url=https%3A%2F% 2Fgithub.com%2Fthesofproject%2Flinux%2Fissues%2F3063&data =04%7C01%7Cryans.lee%40maximintegrated.com%7Cb9f84a1267ec4 f50b7a008d981edcc46%7Cfbd909dfea694788a554f24b7854ad03%7C0 %7C0%7C637683680607026027%7CUnknown%7CTWFpbGZsb3d8eyJ WIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0 %3D%7C1000&sdata=rARkwTSB3DN%2BCYxGaehOhtCGEj1eLBl6 Mk7QhynQSY8%3D&reserved=0
In other words, we need to check what really happens on resume and why the amplifier keeps reporting its status as ATTACHED despite the spec requirements and software init, or loses this status after enumeration....Something really does not add-up, again it's not just a regmap management issue.
I do not see #3063 issue on my side. No initialization failure or time-out has occurred.
Now I'm trying to solve the issue with max98373_io_init() function as suggested instead of adding regmap_cache_dirty() in the suspend function. max98373_io_init() was not called from max98373_update_status() when audio resume because max98373->hw_init was 1 and Status was SDW_SLAVE_ATTACHED. max98373_update_status() do not get SDW_SLAVE_UNATTACHED. I confirmed that the issue could be resolved if SDW_SLAVE_UNATTACHED event arrives at max98373_update_status() before SDW_SLAVE_ATTACHED is triggered. Actually sdw_handle_slave_status() get SDW_SLAVE_UNATTACHED but this function exits at https://github.com/thesofproject/linux/blob/topic/sof-dev/drivers/soundwire/... before reaching to https://github.com/thesofproject/linux/blob/topic/sof-dev/drivers/soundwire/... I'm not sure how to solve this issue because this code is commonly used for other Soundwire drivers as well.
I share the debug messages for the resume event as your reference. [ 127.490644] [DEBUG3] intel_resume_runtime [ 127.490655] [DEBUG3] intel_resume_runtime SDW_INTEL_CLK_STOP_BUS_RESET [ 127.490658] [DEBUG3] intel_init [ 127.490660] [DEBUG3] intel_link_power_up [ 127.490977] [DEBUG3] intel_resume_runtime SDW_UNATTACH_REQUEST_MASTER_RESET .. [ 127.490980] [DEBUG4] sdw_clear_slave_status request: 1 [ 127.490983] [DEBUG4] sdw_modify_slave_status, ID:7, status: 0 [ 127.490986] [DEBUG4] sdw_modify_slave_status, ID:3, status: 0 [ 127.490994] [DEBUG3] intel_shim_wake wake_enable:0 [ 127.491060] [DEBUG3] intel_shim_wake wake_enable:0 [ 127.491191] [DEBUG] max98373_resume, first_hw_init: 1, unattach_request: 1 [ 127.491194] [DEBUG] max98373_resume, INF MODE: 0 [ 127.491953] [DEBUG4] sdw_handle_slave_status IN [ 127.491956] [DEBUG4] sdw_handle_slave_status, status[1] : 0, slave->status: 0, id:7 // UNATTACHED [ 127.491958] [DEBUG4] sdw_handle_slave_status, status[2] : 0, slave->status: 0, id:3 [ 127.491960] [DEBUG4] sdw_handle_slave_status IN2 status[0] = 1 [ 127.492808] [DEBUG4] sdw_handle_slave_status IN [ 127.492810] [DEBUG4] sdw_handle_slave_status, status[1] : 1, slave->status: 0, id:7 // ATTACHED [ 127.492812] [DEBUG4] sdw_handle_slave_status, status[2] : 1, slave->status: 0, id:3 [ 127.492814] [DEBUG4] sdw_handle_slave_status IN2 status[0] = 0 [ 127.492816] [DEBUG4] sdw_handle_slave_status IN3 [ 127.492818] [DEBUG4] sdw_handle_slave_status status[1] = SDW_SLAVE_ATTACHED, slave->status : 0, slave:7, prev_status:0 [ 127.492820] [DEBUG4] sdw_modify_slave_status, ID:7, status: 1 [ 127.493008] [DEBUG4] sdw_update_slave_status update_status(1) IN slave:7 [ 127.493010] [DEBUG4] sdw_update_slave_status update_status(1) OUT [ 127.493012] [DEBUG] max98373_update_status IN hw_init:1, status: 1, slave :7 [ 127.493015] [DEBUG] max98373_update_status IN2 hw_init:1, max98373->first_hw_init: 1, status: 1 [ 127.493017] [DEBUG4] sdw_handle_slave_status status[2] = SDW_SLAVE_ATTACHED, slave->status : 0, slave:3, prev_status:0 [ 127.493019] [DEBUG4] sdw_modify_slave_status, ID:3, status: 1 [ 127.493199] [DEBUG4] sdw_update_slave_status update_status(1) IN slave:3 [ 127.493201] [DEBUG4] sdw_update_slave_status update_status(1) OUT [ 127.493204] [DEBUG] max98373_update_status IN hw_init:1, status: 1, slave :3 [ 127.493207] [DEBUG] max98373_update_status IN2 hw_init:1, max98373->first_hw_init: 1, status: 1
I do not see #3063 issue on my side. No initialization failure or time-out has occurred.
It's rather random, we've only seen the error in long daily tests.
Now I'm trying to solve the issue with max98373_io_init() function as suggested instead of adding regmap_cache_dirty() in the suspend function. max98373_io_init() was not called from max98373_update_status() when audio resume because max98373->hw_init was 1 and Status was SDW_SLAVE_ATTACHED. max98373_update_status() do not get SDW_SLAVE_UNATTACHED. I confirmed that the issue could be resolved if SDW_SLAVE_UNATTACHED event arrives at max98373_update_status() before SDW_SLAVE_ATTACHED is triggered. Actually sdw_handle_slave_status() get SDW_SLAVE_UNATTACHED but this function exits at https://github.com/thesofproject/linux/blob/topic/sof-dev/drivers/soundwire/... before reaching to https://github.com/thesofproject/linux/blob/topic/sof-dev/drivers/soundwire/... I'm not sure how to solve this issue because this code is commonly
used for other Soundwire drivers as well.
There may be a confusion here.
The SoundWire spec says the device will show up as Device #0. That means the status[0] = ATTACHED.
The driver reads the devID registers and programs the device number N. The device will then report as device #N in PING frames. The controller hardware will detect that device and call the function to update the status a second time.
I share the debug messages for the resume event as your reference. [ 127.490644] [DEBUG3] intel_resume_runtime [ 127.490655] [DEBUG3] intel_resume_runtime SDW_INTEL_CLK_STOP_BUS_RESET [ 127.490658] [DEBUG3] intel_init [ 127.490660] [DEBUG3] intel_link_power_up [ 127.490977] [DEBUG3] intel_resume_runtime SDW_UNATTACH_REQUEST_MASTER_RESET .. [ 127.490980] [DEBUG4] sdw_clear_slave_status request: 1 [ 127.490983] [DEBUG4] sdw_modify_slave_status, ID:7, status: 0 [ 127.490986] [DEBUG4] sdw_modify_slave_status, ID:3, status: 0 [ 127.490994] [DEBUG3] intel_shim_wake wake_enable:0 [ 127.491060] [DEBUG3] intel_shim_wake wake_enable:0 [ 127.491191] [DEBUG] max98373_resume, first_hw_init: 1, unattach_request: 1 [ 127.491194] [DEBUG] max98373_resume, INF MODE: 0 [ 127.491953] [DEBUG4] sdw_handle_slave_status IN [ 127.491956] [DEBUG4] sdw_handle_slave_status, status[1] : 0, slave->status: 0, id:7 // UNATTACHED [ 127.491958] [DEBUG4] sdw_handle_slave_status, status[2] : 0, slave->status: 0, id:3 [ 127.491960] [DEBUG4] sdw_handle_slave_status IN2 status[0] = 1 [ 127.492808] [DEBUG4] sdw_handle_slave_status IN [ 127.492810] [DEBUG4] sdw_handle_slave_status, status[1] : 1, slave->status: 0, id:7 // ATTACHED [ 127.492812] [DEBUG4] sdw_handle_slave_status, status[2] : 1, slave->status: 0, id:3 [ 127.492814] [DEBUG4] sdw_handle_slave_status IN2 status[0] = 0 [ 127.492816] [DEBUG4] sdw_handle_slave_status IN3 [ 127.492818] [DEBUG4] sdw_handle_slave_status status[1] = SDW_SLAVE_ATTACHED, slave->status : 0, slave:7, prev_status:0 [ 127.492820] [DEBUG4] sdw_modify_slave_status, ID:7, status: 1 [ 127.493008] [DEBUG4] sdw_update_slave_status update_status(1) IN slave:7 [ 127.493010] [DEBUG4] sdw_update_slave_status update_status(1) OUT [ 127.493012] [DEBUG] max98373_update_status IN hw_init:1, status: 1, slave :7 [ 127.493015] [DEBUG] max98373_update_status IN2 hw_init:1, max98373->first_hw_init: 1, status: 1 [ 127.493017] [DEBUG4] sdw_handle_slave_status status[2] = SDW_SLAVE_ATTACHED, slave->status : 0, slave:3, prev_status:0 [ 127.493019] [DEBUG4] sdw_modify_slave_status, ID:3, status: 1 [ 127.493199] [DEBUG4] sdw_update_slave_status update_status(1) IN slave:3 [ 127.493201] [DEBUG4] sdw_update_slave_status update_status(1) OUT [ 127.493204] [DEBUG] max98373_update_status IN hw_init:1, status: 1, slave :3 [ 127.493207] [DEBUG] max98373_update_status IN2 hw_init:1, max98373->first_hw_init: 1, status: 1
I don't really see anything in this sequence that differs from my explanations?
The update_status() is only called when the device has a non-zero device number.
There may be a real problem with update_status() not being called but I just don't see it so far.
One way to improve the traces would be to use dev_dbg, that way we'd have a trace of which device is being handled. There are two devices managed by the same driver, a trace with pr_dbg doesn't tell us much.
Instead of changing the suspend sequence, can we please try to
modify
the max98373_io_init() routine to unconditionally flag the cache as dirty, maybe this points to a problem with the management of the max98373->first_hw_init flag.
max98373_io_init() is not called because ' sdw_slave_status'
remains
'
SDW_SLAVE_ATTACHED' and 'max98373->hw_init' is already true. Removing 'if (max98373->hw_init || status !=
SDW_SLAVE_ATTACHED)'
condition in max98373_update_status() function instead of adding regcache_mark_dirty() into max98373_suspend() can be an
alternative way.
I think it is all about where regcache_mark_dirty() is called from. The difference is that max98373_io_init() really do the software
reset
and do amp initialization again which could be an overhead.
that description is aligned with my analysis that there's something very wrong happening here, it's not just a simple miss in the regmap handling but a major conceptual bug or misunderstanding in the way reset is handled.
First, there's the spec: on a reset initiated by the host or if the device loses sync for ANY reason, its status cannot remain
ATTACHED.
There's got to be a 16-frame period at least where the device has to monitor the sync pattern and cannot drive anything on the bus.
Then there's the hardware behavior on resume: on resume by default
the
Intel host will toggle the data pin for at least 4096 frames, which by spec means severe reset.
And last, there's the software init: we also force the status as UNATTACHED in drivers/soundwire/intel.c:
/* * make sure all Slaves are tagged as UNATTACHED and provide * reason for reinitialization */ sdw_clear_slave_status(bus,
SDW_UNATTACH_REQUEST_MASTER_RESET);
But we've also seen the opposite effect of an amplifier reporting attached but losing sync immediately after the end of enumeration
and
never coming back on the bus, see issue https://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%
2Fgithub.com%2Fthesofproject%2Flinux%2Fissues%2F3063&data
=04%7C01%7Cryans.lee%40maximintegrated.com%7Cb9f84a1267ec4
f50b7a008d981edcc46%7Cfbd909dfea694788a554f24b7854ad03%7C0
%7C0%7C637683680607026027%7CUnknown%7CTWFpbGZsb3d8eyJ
WIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0
%3D%7C1000&sdata=rARkwTSB3DN%2BCYxGaehOhtCGEj1eLBl
6
Mk7QhynQSY8%3D&reserved=0
In other words, we need to check what really happens on resume
and why
the amplifier keeps reporting its status as ATTACHED despite the spec requirements and software init, or loses this status after enumeration....Something really does not add-up, again it's not just a regmap management issue.
I do not see #3063 issue on my side. No initialization failure or time- out has occurred.
Now I'm trying to solve the issue with max98373_io_init() function as suggested instead of adding regmap_cache_dirty() in the suspend function. max98373_io_init() was not called from max98373_update_status() when audio resume because max98373->hw_init was 1 and Status was SDW_SLAVE_ATTACHED. max98373_update_status() do not get SDW_SLAVE_UNATTACHED. I confirmed that the issue could be resolved if SDW_SLAVE_UNATTACHED event arrives at max98373_update_status() before SDW_SLAVE_ATTACHED is triggered. Actually sdw_handle_slave_status() get SDW_SLAVE_UNATTACHED but this function exits at https://github.com/thesofproject/linux/blob/topic/sof- dev/drivers/soundwire/bus.c#L1765 before reaching to https://github.com/thesofproject/linux/blob/topic/sof- dev/drivers/soundwire/bus.c#L1825 I'm not sure how to solve this issue because this code is commonly used for other Soundwire drivers as well.
I share the debug messages for the resume event as your reference. [ 127.490644] [DEBUG3] intel_resume_runtime [ 127.490655] [DEBUG3] intel_resume_runtime SDW_INTEL_CLK_STOP_BUS_RESET [ 127.490658] [DEBUG3] intel_init [ 127.490660] [DEBUG3] intel_link_power_up [ 127.490977] [DEBUG3] intel_resume_runtime SDW_UNATTACH_REQUEST_MASTER_RESET .. [ 127.490980] [DEBUG4] sdw_clear_slave_status request: 1 [ 127.490983] [DEBUG4] sdw_modify_slave_status, ID:7, status: 0 [ 127.490986] [DEBUG4] sdw_modify_slave_status, ID:3, status: 0 [ 127.490994] [DEBUG3] intel_shim_wake wake_enable:0 [ 127.491060] [DEBUG3] intel_shim_wake wake_enable:0 [ 127.491191] [DEBUG] max98373_resume, first_hw_init: 1, unattach_request: 1 [ 127.491194] [DEBUG] max98373_resume, INF MODE: 0 [ 127.491953] [DEBUG4] sdw_handle_slave_status IN [ 127.491956] [DEBUG4] sdw_handle_slave_status, status[1] : 0, slave->status: 0, id:7 // UNATTACHED [ 127.491958] [DEBUG4] sdw_handle_slave_status, status[2] : 0, slave->status: 0, id:3 [ 127.491960] [DEBUG4] sdw_handle_slave_status IN2 status[0] = 1 [ 127.492808] [DEBUG4] sdw_handle_slave_status IN [ 127.492810] [DEBUG4] sdw_handle_slave_status, status[1] : 1, slave->status: 0, id:7 // ATTACHED [ 127.492812] [DEBUG4] sdw_handle_slave_status, status[2] : 1, slave->status: 0, id:3 [ 127.492814] [DEBUG4] sdw_handle_slave_status IN2 status[0] = 0 [ 127.492816] [DEBUG4] sdw_handle_slave_status IN3 [ 127.492818] [DEBUG4] sdw_handle_slave_status status[1] = SDW_SLAVE_ATTACHED, slave-
status : 0, slave:7, prev_status:0 [ 127.492820] [DEBUG4]
sdw_modify_slave_status, ID:7, status: 1 [ 127.493008] [DEBUG4] sdw_update_slave_status update_status(1) IN slave:7 [ 127.493010] [DEBUG4] sdw_update_slave_status update_status(1) OUT [ 127.493012] [DEBUG] max98373_update_status IN hw_init:1, status: 1, slave :7 [ 127.493015] [DEBUG] max98373_update_status IN2 hw_init:1, max98373->first_hw_init: 1, status: 1 [ 127.493017] [DEBUG4] sdw_handle_slave_status status[2] = SDW_SLAVE_ATTACHED, slave->status : 0, slave:3, prev_status:0 [ 127.493019] [DEBUG4] sdw_modify_slave_status, ID:3, status: 1 [ 127.493199] [DEBUG4] sdw_update_slave_status update_status(1) IN slave:3 [ 127.493201] [DEBUG4] sdw_update_slave_status update_status(1) OUT [ 127.493204] [DEBUG] max98373_update_status IN hw_init:1, status: 1, slave :3 [ 127.493207] [DEBUG] max98373_update_status IN2 hw_init:1, max98373->first_hw_init: 1, status: 1
Thanks for the comments. I tried to find the reason why the amp was not detached from the bus properly and found information about CLOCK_STOP_NOW bit in 0x0044 SCP_Ctrl register. It seems like 0x2(ClockStopNow) needs to be configured before the host CLOCK STOP. I was able to get a good result if I add this command in the amp driver suspend function. The amp driver receives the detachment event and register restoration was done properly after the audio resume. I can modify the amp driver for this change but it looks like this needs to be done from the host side. May I have a comment on this? Thanks.
I tried to find the reason why the amp was not detached from the bus properly and found information about CLOCK_STOP_NOW bit in 0x0044 SCP_Ctrl register. It seems like 0x2(ClockStopNow) needs to be configured before the host CLOCK STOP. I was able to get a good result if I add this command in the amp driver suspend function. The amp driver receives the detachment event and register restoration was done properly after the audio resume. I can modify the amp driver for this change but it looks like this needs to be done from the host side. May I have a comment on this? Thanks.
This register is already taken care of in drivers/soundwire/intel.c and cadence_master.c
for pm_runtime suspend, the sequence uses sdw_cdns_clock_stop(), which will try and prepare devices for clock-stop with a callback, in case any imp-def registers is required, then it will call sdw_bus_clk_stop() which does a broadcast write:
sdw_bus_clk_stop(struct sdw_bus *bus) { int ret;
/* * broadcast clock stop now, attached Slaves will ACK this, * unattached will ignore */ ret = sdw_bwrite_no_pm(bus, SDW_BROADCAST_DEV_NUM, SDW_SCP_CTRL, SDW_SCP_CTRL_CLK_STP_NOW); if (ret < 0) { if (ret != -ENODATA) dev_err(bus->dev, "ClockStopNow Broadcast msg failed %d\n", ret); return ret; }
The codec driver is not supposed to set this bit on its own, what this indicates is that the clock will actually stop at the end of the frame. Only the master/controller driver can transmit this - there's a very strong reason why its a bus functionality.
The other point is that on pm_runtime resume, the Intel host will start a SEVERE_RESET sequence. That's a bit different from the 'traditional' description of the clock stop due to a power optimization on the Intel side (see more below), but doing a reset has precedence over any other configuration that might have happened before the clock stopped so the amplifier SHALL transition to UNATTACHED on a reset.
Somehow it looks like the amplifiers don't see the clock stopped and don't see the reset, that's rather surprising.
If this happens for system suspend/resume, then it's a different story: we don't use the clock stop mode at all, the bus will be completely reconfigured.
You could try to see if the results change by using the 'traditional' clock stop mode with a kernel module parameters
option snd-sof-intel-hda-common sdw_clock_stop_quirks=0
the default is SDW_INTEL_CLK_STOP_BUS_RESET
/* * Require a bus reset (and complete re-enumeration) when exiting * clock stop modes. This may be needed if the controller power was * turned off and all context lost. This quirk shall not be used if a * Slave device needs to remain enumerated and keep its context, * e.g. to provide the reasons for the wake, report acoustic events or * pass a history buffer. */ #define SDW_INTEL_CLK_STOP_BUS_RESET BIT(3)
In this case, the bus will not be reset, I wonder if this is the part that's problematic for the amplifier.
Hope this helps -Pierre
participants (3)
-
Mark Brown
-
Pierre-Louis Bossart
-
Ryan Lee