-----Original Message----- From: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Sent: Monday, September 27, 2021 12:34 PM To: Ryan Lee RyanS.Lee@maximintegrated.com; Mark Brown broonie@kernel.org Cc: lgirdwood@gmail.com; perex@perex.cz; tiwai@suse.com; yung- chuan.liao@linux.intel.com; guennadi.liakhovetski@linux.intel.com; alsa- devel@alsa-project.org; linux-kernel@vger.kernel.org; sathya.prakash.m.r@intel.com; ryan.lee.maxim@gmail.com Subject: Re: [EXTERNAL] Re: [PATCH] ASoC: max98373: Mark cache dirty before entering sleep
EXTERNAL EMAIL
Instead of changing the suspend sequence, can we please try to modify the max98373_io_init() routine to unconditionally flag the cache as dirty, maybe this points to a problem with the management of the max98373->first_hw_init flag.
max98373_io_init() is not called because ' sdw_slave_status' remains ' SDW_SLAVE_ATTACHED' and 'max98373->hw_init' is already true. Removing 'if (max98373->hw_init || status != SDW_SLAVE_ATTACHED)' condition in max98373_update_status() function instead of adding regcache_mark_dirty() into max98373_suspend() can be an alternative way. I think it is all about where regcache_mark_dirty() is called from. The difference is that max98373_io_init() really do the software reset and do amp initialization again which could be an overhead.
that description is aligned with my analysis that there's something very wrong happening here, it's not just a simple miss in the regmap handling but a major conceptual bug or misunderstanding in the way reset is handled.
First, there's the spec: on a reset initiated by the host or if the device loses sync for ANY reason, its status cannot remain ATTACHED. There's got to be a 16-frame period at least where the device has to monitor the sync pattern and cannot drive anything on the bus.
Then there's the hardware behavior on resume: on resume by default the Intel host will toggle the data pin for at least 4096 frames, which by spec means severe reset.
And last, there's the software init: we also force the status as UNATTACHED in drivers/soundwire/intel.c:
/* * make sure all Slaves are tagged as UNATTACHED and provide * reason for reinitialization */ sdw_clear_slave_status(bus,
SDW_UNATTACH_REQUEST_MASTER_RESET);
But we've also seen the opposite effect of an amplifier reporting attached but losing sync immediately after the end of enumeration and never coming back on the bus, see issue https://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgith ub.com%2Fthesofproject%2Flinux%2Fissues%2F3063&data=04%7C01% 7Cryans.lee%40maximintegrated.com%7Cb9f84a1267ec4f50b7a008d981edc c46%7Cfbd909dfea694788a554f24b7854ad03%7C0%7C0%7C637683680607 026027%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2 luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=rARkwTSB3 DN%2BCYxGaehOhtCGEj1eLBl6Mk7QhynQSY8%3D&reserved=0
In other words, we need to check what really happens on resume and why the amplifier keeps reporting its status as ATTACHED despite the spec requirements and software init, or loses this status after enumeration....Something really does not add-up, again it's not just a regmap management issue.
I agree that the fix is not necessary if the reset issue was not occurred. Thanks for your input about the status check on Intel driver, too. I will continue to find the culprit who cause the amp reset, but this is not simple because it is not only related to Maxim driver but also other things on both hardware and software side. I think making the amp driver code more conservative by adding regcache_mark_dirty() can make the system robust from the glitches between suspend and resume. This is what I can do from the amp driver side and the code I added is not new but a proven way on existing drivers as we all know. I just wonder why the issue was not observed when the code is removed. This probably means there is an external reason which was not exist before.