WM8962 crashing on suspend
I have an imx8m Mini with a wm8962 codec. If I run a speaker test and suspend the board while the speaker test is running, I get the following upon wake:
wm8962 3-001a: ASoC: error at soc_component_read_no_lock on wm8962.3-001a: -16
This message repeats itself over and over again. If I attempt to use any audio, it fails until I reboot the board.
If I run the audio test, then exit and suspend, the audio works upon resume, so it appears to be related to suspending while running.
I am hoping someone might have a suggestion as to what I might be able to do or try to allow this to successfully suspend and resume if the device is playing sound.
thank you,
adam
On Tue, Apr 26, 2022 at 11:36:26AM -0500, Adam Ford wrote:
I have an imx8m Mini with a wm8962 codec. If I run a speaker test and suspend the board while the speaker test is running, I get the following upon wake:
wm8962 3-001a: ASoC: error at soc_component_read_no_lock on wm8962.3-001a: -16
This message repeats itself over and over again. If I attempt to use any audio, it fails until I reboot the board.
If I run the audio test, then exit and suspend, the audio works upon resume, so it appears to be related to suspending while running.
I am hoping someone might have a suggestion as to what I might be able to do or try to allow this to successfully suspend and resume if the device is playing sound.
I do note that wm8962 doesn't have any system suspend code which is possibly an issue if power is lost during suspend, it only has runtime PM operations. If the device was runtime idle before suspend those will figure everything out, if the device was active they won't kick in.
On Tue, Apr 26, 2022 at 11:36:26AM -0500, Adam Ford wrote:
I have an imx8m Mini with a wm8962 codec. If I run a speaker test and suspend the board while the speaker test is running, I get the following upon wake:
wm8962 3-001a: ASoC: error at soc_component_read_no_lock on wm8962.3-001a: -16
This message repeats itself over and over again. If I attempt to use any audio, it fails until I reboot the board.
If I run the audio test, then exit and suspend, the audio works upon resume, so it appears to be related to suspending while running.
I am hoping someone might have a suggestion as to what I might be able to do or try to allow this to successfully suspend and resume if the device is playing sound.
Hmm... EBUSY is what regmap returns when a volatile register is read whilst the chip is still in cache only. The driver does appear to be missing the usually fairly important work around to avoid the IRQ and the PM runtime deadlocking on resume. Although not sure that would actually lead to the error message you are seeing.
Would be really handy to see a little more of the log leading up to the failure if that is possible? And would be really awesome if you had any idea which read was returning the error? You could shove a dump_stack in _soc_component_ret next to the error message.
Thanks, Charles
On Tue, Apr 26, 2022 at 12:41 PM Charles Keepax ckeepax@opensource.cirrus.com wrote:
On Tue, Apr 26, 2022 at 11:36:26AM -0500, Adam Ford wrote:
I have an imx8m Mini with a wm8962 codec. If I run a speaker test and suspend the board while the speaker test is running, I get the following upon wake:
wm8962 3-001a: ASoC: error at soc_component_read_no_lock on wm8962.3-001a: -16
This message repeats itself over and over again. If I attempt to use any audio, it fails until I reboot the board.
If I run the audio test, then exit and suspend, the audio works upon resume, so it appears to be related to suspending while running.
I am hoping someone might have a suggestion as to what I might be able to do or try to allow this to successfully suspend and resume if the device is playing sound.
Hmm... EBUSY is what regmap returns when a volatile register is read whilst the chip is still in cache only. The driver does appear to be missing the usually fairly important work around to avoid the IRQ and the PM runtime deadlocking on resume. Although not sure that would actually lead to the error message you are seeing.
Would be really handy to see a little more of the log leading up to the failure if that is possible? And would be really awesome if you had any idea which read was returning the error? You could shove a dump_stack in _soc_component_ret next to the error message.
Because NXP had a downstream kernel, and it didn't appear to happen when using their downstream kernel, I wanted to see the difference. I found this:
static const struct dev_pm_ops wm8962_pm = { + SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume) SET_RUNTIME_PM_OPS(wm8962_runtime_suspend, wm8962_runtime_resume, NULL) };
I applied this, and it appears to make the issue go away on a 5.15 kernel. I haven't tried it on a 5.18 yet. If this fixes the issue, would that be an acceptable solution to push upstream?
adam
Thanks, Charles
On Wed, Apr 27, 2022 at 08:12:56AM -0500, Adam Ford wrote:
I found this:
static const struct dev_pm_ops wm8962_pm = {
- SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)
SET_RUNTIME_PM_OPS(wm8962_runtime_suspend, wm8962_runtime_resume, NULL) };
I applied this, and it appears to make the issue go away on a 5.15 kernel. I haven't tried it on a 5.18 yet. If this fixes the issue, would that be an acceptable solution to push upstream?
Yes, that's fine - it's fixing the thing I was pointing out with only having runtime suspend but no system suspend.
On Wed, Apr 27, 2022 at 02:21:26PM +0100, Mark Brown wrote:
On Wed, Apr 27, 2022 at 08:12:56AM -0500, Adam Ford wrote:
I found this: static const struct dev_pm_ops wm8962_pm = {
- SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)
SET_RUNTIME_PM_OPS(wm8962_runtime_suspend, wm8962_runtime_resume, NULL) };
I applied this, and it appears to make the issue go away on a 5.15 kernel. I haven't tried it on a 5.18 yet. If this fixes the issue, would that be an acceptable solution to push upstream?
Yes, that's fine - it's fixing the thing I was pointing out with only having runtime suspend but no system suspend.
:-) Well in that case ignore my email, I don't mind if we want to go this way too.
Thanks, Charles
On Wed, Apr 27, 2022 at 08:12:56AM -0500, Adam Ford wrote:
On Tue, Apr 26, 2022 at 12:41 PM Charles Keepax ckeepax@opensource.cirrus.com wrote:
On Tue, Apr 26, 2022 at 11:36:26AM -0500, Adam Ford wrote:
I have an imx8m Mini with a wm8962 codec. If I run a speaker test and suspend the board while the speaker test is running, I get the following upon wake:
wm8962 3-001a: ASoC: error at soc_component_read_no_lock on wm8962.3-001a: -16
This message repeats itself over and over again. If I attempt to use any audio, it fails until I reboot the board.
If I run the audio test, then exit and suspend, the audio works upon resume, so it appears to be related to suspending while running.
I am hoping someone might have a suggestion as to what I might be able to do or try to allow this to successfully suspend and resume if the device is playing sound.
Hmm... EBUSY is what regmap returns when a volatile register is read whilst the chip is still in cache only. The driver does appear to be missing the usually fairly important work around to avoid the IRQ and the PM runtime deadlocking on resume. Although not sure that would actually lead to the error message you are seeing.
Would be really handy to see a little more of the log leading up to the failure if that is possible? And would be really awesome if you had any idea which read was returning the error? You could shove a dump_stack in _soc_component_ret next to the error message.
Because NXP had a downstream kernel, and it didn't appear to happen when using their downstream kernel, I wanted to see the difference. I found this:
static const struct dev_pm_ops wm8962_pm = {
- SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)
SET_RUNTIME_PM_OPS(wm8962_runtime_suspend, wm8962_runtime_resume, NULL) };
I applied this, and it appears to make the issue go away on a 5.15 kernel. I haven't tried it on a 5.18 yet. If this fixes the issue, would that be an acceptable solution to push upstream?
Feels like those operations should be runtime PM, like there is no reason to keep the CODEC in a high power state than necessary.
Let me add the necessary stuff to at least avoid the race with the IRQ and lets see if that has any effect, although I am not totally convinced your symptoms sound like that is the issue. But it is fixing an issue regardless so might as well start there. Any of the debug I requested previously would also be handy though.
Thanks, Charles
On Wed, Apr 27, 2022 at 02:57:30PM +0000, Charles Keepax wrote:
On Wed, Apr 27, 2022 at 08:12:56AM -0500, Adam Ford wrote:
static const struct dev_pm_ops wm8962_pm = {
- SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)
SET_RUNTIME_PM_OPS(wm8962_runtime_suspend, wm8962_runtime_resume, NULL) };
I applied this, and it appears to make the issue go away on a 5.15 kernel. I haven't tried it on a 5.18 yet. If this fixes the issue, would that be an acceptable solution to push upstream?
Feels like those operations should be runtime PM, like there is no reason to keep the CODEC in a high power state than necessary.
The issue Adam reported was suspending during playback - if you suspend during playback or capture the device is not idle at the point where we start trying to suspend so it shouldn't be in runtime suspend and won't by default be runtime suspended by the PM core.
On Wed, Apr 27, 2022 at 04:24:31PM +0100, Mark Brown wrote:
On Wed, Apr 27, 2022 at 02:57:30PM +0000, Charles Keepax wrote:
On Wed, Apr 27, 2022 at 08:12:56AM -0500, Adam Ford wrote:
static const struct dev_pm_ops wm8962_pm = {
- SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)
SET_RUNTIME_PM_OPS(wm8962_runtime_suspend, wm8962_runtime_resume, NULL) };
I applied this, and it appears to make the issue go away on a 5.15 kernel. I haven't tried it on a 5.18 yet. If this fixes the issue, would that be an acceptable solution to push upstream?
Feels like those operations should be runtime PM, like there is no reason to keep the CODEC in a high power state than necessary.
The issue Adam reported was suspending during playback - if you suspend during playback or capture the device is not idle at the point where we start trying to suspend so it shouldn't be in runtime suspend and won't by default be runtime suspended by the PM core.
Yeah in my head snd_soc_suspend would have been called which would (assuming the DAI doesn't have ignore_suspend set) shut down the DAPM graph for the audio route, causing the runtime references to all be released and the CODEC to be suspended through runtime_pm. Not sure if I missed something there, and that also allows for systems where the CODEC doesn't suspend during system suspend. That said guess there probably arn't any use-cases for that on wm8962 and I am more than happy to use the force_suspend ops if you are happy with it.
Thanks, Charles
On Wed, Apr 27, 2022 at 11:48 AM Charles Keepax ckeepax@opensource.cirrus.com wrote:
On Wed, Apr 27, 2022 at 04:24:31PM +0100, Mark Brown wrote:
On Wed, Apr 27, 2022 at 02:57:30PM +0000, Charles Keepax wrote:
On Wed, Apr 27, 2022 at 08:12:56AM -0500, Adam Ford wrote:
static const struct dev_pm_ops wm8962_pm = {
- SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)
SET_RUNTIME_PM_OPS(wm8962_runtime_suspend, wm8962_runtime_resume, NULL) };
I applied this, and it appears to make the issue go away on a 5.15 kernel. I haven't tried it on a 5.18 yet. If this fixes the issue, would that be an acceptable solution to push upstream?
Feels like those operations should be runtime PM, like there is no reason to keep the CODEC in a high power state than necessary.
The issue Adam reported was suspending during playback - if you suspend during playback or capture the device is not idle at the point where we start trying to suspend so it shouldn't be in runtime suspend and won't by default be runtime suspended by the PM core.
Yeah in my head snd_soc_suspend would have been called which would (assuming the DAI doesn't have ignore_suspend set) shut down the DAPM graph for the audio route, causing the runtime references to all be released and the CODEC to be suspended through runtime_pm. Not sure if I missed something there, and that also allows for systems where the CODEC doesn't suspend during system suspend. That said guess there probably arn't any use-cases for that on wm8962 and I am more than happy to use the force_suspend ops if you are happy with it.
I am not familiar with this driver or the force_suspend ops, so I am not sure if there are going to be side-effects. I don't mind collecting more data if it's helpful. I probably won't be able to add more debug info until this weekend at the earliest.
adam
Thanks, Charles
On Wed, Apr 27, 2022 at 11:54:40AM -0500, Adam Ford wrote:
On Wed, Apr 27, 2022 at 11:48 AM Charles Keepax ckeepax@opensource.cirrus.com wrote:
On Wed, Apr 27, 2022 at 04:24:31PM +0100, Mark Brown wrote:
On Wed, Apr 27, 2022 at 02:57:30PM +0000, Charles Keepax wrote:
On Wed, Apr 27, 2022 at 08:12:56AM -0500, Adam Ford wrote:
I applied this, and it appears to make the issue go away on a 5.15 kernel. I haven't tried it on a 5.18 yet. If this fixes the issue, would that be an acceptable solution to push upstream?
Feels like those operations should be runtime PM, like there is no reason to keep the CODEC in a high power state than necessary.
The issue Adam reported was suspending during playback - if you suspend during playback or capture the device is not idle at the point where we start trying to suspend so it shouldn't be in runtime suspend and won't by default be runtime suspended by the PM core.
Yeah in my head snd_soc_suspend would have been called which would (assuming the DAI doesn't have ignore_suspend set) shut down the DAPM graph for the audio route, causing the runtime references to all be released and the CODEC to be suspended through runtime_pm. Not sure if I missed something there, and that also allows for systems where the CODEC doesn't suspend during system suspend. That said guess there probably arn't any use-cases for that on wm8962 and I am more than happy to use the force_suspend ops if you are happy with it.
I am not familiar with this driver or the force_suspend ops, so I am not sure if there are going to be side-effects. I don't mind collecting more data if it's helpful. I probably won't be able to add more debug info until this weekend at the earliest.
Nah, its good your ok to upstream your out of tree patch, just making sure I fill in the holes in my knowledge with Mark :-)
Thanks, Charles
On Thu, Apr 28, 2022 at 3:23 AM Charles Keepax ckeepax@opensource.cirrus.com wrote:
On Wed, Apr 27, 2022 at 11:54:40AM -0500, Adam Ford wrote:
On Wed, Apr 27, 2022 at 11:48 AM Charles Keepax ckeepax@opensource.cirrus.com wrote:
On Wed, Apr 27, 2022 at 04:24:31PM +0100, Mark Brown wrote:
On Wed, Apr 27, 2022 at 02:57:30PM +0000, Charles Keepax wrote:
On Wed, Apr 27, 2022 at 08:12:56AM -0500, Adam Ford wrote:
I applied this, and it appears to make the issue go away on a 5.15 kernel. I haven't tried it on a 5.18 yet. If this fixes the issue, would that be an acceptable solution to push upstream?
Feels like those operations should be runtime PM, like there is no reason to keep the CODEC in a high power state than necessary.
The issue Adam reported was suspending during playback - if you suspend during playback or capture the device is not idle at the point where we start trying to suspend so it shouldn't be in runtime suspend and won't by default be runtime suspended by the PM core.
Yeah in my head snd_soc_suspend would have been called which would (assuming the DAI doesn't have ignore_suspend set) shut down the DAPM graph for the audio route, causing the runtime references to all be released and the CODEC to be suspended through runtime_pm. Not sure if I missed something there, and that also allows for systems where the CODEC doesn't suspend during system suspend. That said guess there probably arn't any use-cases for that on wm8962 and I am more than happy to use the force_suspend ops if you are happy with it.
I am not familiar with this driver or the force_suspend ops, so I am not sure if there are going to be side-effects. I don't mind collecting more data if it's helpful. I probably won't be able to add more debug info until this weekend at the earliest.
Nah, its good your ok to upstream your out of tree patch, just making sure I fill in the holes in my knowledge with Mark :-)
I'd like to push the patch with a Fixes tag, but I am not sure that we have a definitive hash to use. Ideally, it'd get backported, but I am not sure that I have the means to test it, because the hardware platform I have doesn't go back that far. Any thoughts? If not, I'll just push it without a fixes tag.
adam
Thanks, Charles
On Thu, Apr 28, 2022 at 07:21:39AM -0500, Adam Ford wrote:
On Thu, Apr 28, 2022 at 3:23 AM Charles Keepax
Nah, its good your ok to upstream your out of tree patch, just making sure I fill in the holes in my knowledge with Mark :-)
I'd like to push the patch with a Fixes tag, but I am not sure that we have a definitive hash to use. Ideally, it'd get backported, but I am not sure that I have the means to test it, because the hardware platform I have doesn't go back that far. Any thoughts? If not, I'll just push it without a fixes tag.
Cc stable does just as well, though TBH just having the word "fix" somewhere in the changelog will probably cause the bots to pick it up.
On Wed, Apr 27, 2022 at 04:48:25PM +0000, Charles Keepax wrote:
Yeah in my head snd_soc_suspend would have been called which would (assuming the DAI doesn't have ignore_suspend set) shut down the DAPM graph for the audio route, causing the runtime references to all be released and the CODEC to be suspended through runtime_pm. Not sure if I missed something there, and
Runtime suspend won't do anything beyond tracking the reference count when we're in the middle of system suspend IIRC, it won't actually call the operations.
that also allows for systems where the CODEC doesn't suspend during system suspend. That said guess there probably arn't any use-cases for that on wm8962 and I am more than happy to use the force_suspend ops if you are happy with it.
The other option would be to move the runtime PM stuff into the bias level configuration I guess.
participants (3)
-
Adam Ford
-
Charles Keepax
-
Mark Brown