[alsa-devel] [BUG] bdw-rt5650 DSP boot timeout
I've been working on upstreaming the bdw-rt5650 machine driver for the Acer Chromebase 24 (buddy). There seems to be an issue when first setting the hardware controls that appears to be crashing the DSP:
[ 51.424554] haswell-pcm-audio haswell-pcm-audio: FW loaded, mailbox readback FW info: type 01, - version: 00.00, build 77, source commit id: 876ac6906f31a43b6772b23c7c983ce9dcb18a19 ... [ 84.924666] haswell-pcm-audio haswell-pcm-audio: error: audio DSP boot timeout IPCD 0x0 IPCX 0x0 [ 85.260655] haswell-pcm-audio haswell-pcm-audio: ipc: --message timeout-- ipcx 0x83000000 isr 0x00000000 ipcd 0x00000000 imrx 0x7fff0000 [ 85.273609] haswell-pcm-audio haswell-pcm-audio: error: stream commit failed [ 85.279746] System PCM: error: failed to commit stream -110 [ 85.285388] haswell-pcm-audio haswell-pcm-audio: ASoC: haswell-pcm-audio hw params failed: -110 [ 85.293963] System PCM: ASoC: hw_params FE failed -110
This happens roughly 50% of the time when first setting hardware controls after a reboot. The other 50% of the time the DSP comes up just fine and audio works fine thereafter. Adding "#define DEBUG 1" to sound/soc/intel/haswell/sst-haswell-ipc.c makes the issue occur much less frequently in my testing. Seems like a subtle timing issue.
There were timing issues encountered during the bringup of the 2015 chromebook pixel (samus) which uses the bdw-rt5677 machine driver. Those were slightly different, and manifested during repeated arecords. Both devices use the same revision of the sst2 firmware.
Any ideas for how to debug this?
Thanks, Jon
On 7/29/19 4:53 PM, Jon Flatley wrote:
I've been working on upstreaming the bdw-rt5650 machine driver for the Acer Chromebase 24 (buddy). There seems to be an issue when first setting the hardware controls that appears to be crashing the DSP:
[ 51.424554] haswell-pcm-audio haswell-pcm-audio: FW loaded, mailbox readback FW info: type 01, - version: 00.00, build 77, source commit id: 876ac6906f31a43b6772b23c7c983ce9dcb18a19 ... [ 84.924666] haswell-pcm-audio haswell-pcm-audio: error: audio DSP boot timeout IPCD 0x0 IPCX 0x0 [ 85.260655] haswell-pcm-audio haswell-pcm-audio: ipc: --message timeout-- ipcx 0x83000000 isr 0x00000000 ipcd 0x00000000 imrx 0x7fff0000 [ 85.273609] haswell-pcm-audio haswell-pcm-audio: error: stream commit failed [ 85.279746] System PCM: error: failed to commit stream -110 [ 85.285388] haswell-pcm-audio haswell-pcm-audio: ASoC: haswell-pcm-audio hw params failed: -110 [ 85.293963] System PCM: ASoC: hw_params FE failed -110
This happens roughly 50% of the time when first setting hardware controls after a reboot. The other 50% of the time the DSP comes up just fine and audio works fine thereafter. Adding "#define DEBUG 1" to sound/soc/intel/haswell/sst-haswell-ipc.c makes the issue occur much less frequently in my testing. Seems like a subtle timing issue.
There were timing issues encountered during the bringup of the 2015 chromebook pixel (samus) which uses the bdw-rt5677 machine driver. Those were slightly different, and manifested during repeated arecords. Both devices use the same revision of the sst2 firmware.
Any ideas for how to debug this?
this could be trying to send an IPC while you are already waiting for one to complete. we've seen this before with SOF, if the IPCs are not strictly serialized then things go in the weeds and timeout.
This is roughly what I was thinking. Is there a good way to monitor the timing on the IPCs in cases like this shy of probing the hardware?
On Mon, Jul 29, 2019 at 4:02 PM Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com wrote:
On 7/29/19 4:53 PM, Jon Flatley wrote:
I've been working on upstreaming the bdw-rt5650 machine driver for the Acer Chromebase 24 (buddy). There seems to be an issue when first setting the hardware controls that appears to be crashing the DSP:
[ 51.424554] haswell-pcm-audio haswell-pcm-audio: FW loaded, mailbox readback FW info: type 01, - version: 00.00, build 77, source commit id: 876ac6906f31a43b6772b23c7c983ce9dcb18a19 ... [ 84.924666] haswell-pcm-audio haswell-pcm-audio: error: audio DSP boot timeout IPCD 0x0 IPCX 0x0 [ 85.260655] haswell-pcm-audio haswell-pcm-audio: ipc: --message timeout-- ipcx 0x83000000 isr 0x00000000 ipcd 0x00000000 imrx 0x7fff0000 [ 85.273609] haswell-pcm-audio haswell-pcm-audio: error: stream commit failed [ 85.279746] System PCM: error: failed to commit stream -110 [ 85.285388] haswell-pcm-audio haswell-pcm-audio: ASoC: haswell-pcm-audio hw params failed: -110 [ 85.293963] System PCM: ASoC: hw_params FE failed -110
This happens roughly 50% of the time when first setting hardware controls after a reboot. The other 50% of the time the DSP comes up just fine and audio works fine thereafter. Adding "#define DEBUG 1" to sound/soc/intel/haswell/sst-haswell-ipc.c makes the issue occur much less frequently in my testing. Seems like a subtle timing issue.
There were timing issues encountered during the bringup of the 2015 chromebook pixel (samus) which uses the bdw-rt5677 machine driver. Those were slightly different, and manifested during repeated arecords. Both devices use the same revision of the sst2 firmware.
Any ideas for how to debug this?
this could be trying to send an IPC while you are already waiting for one to complete. we've seen this before with SOF, if the IPCs are not strictly serialized then things go in the weeds and timeout.
I've been working on upstreaming the bdw-rt5650 machine driver for the Acer Chromebase 24 (buddy). There seems to be an issue when first setting the hardware controls that appears to be crashing the DSP:
[ 51.424554] haswell-pcm-audio haswell-pcm-audio: FW loaded, mailbox readback FW info: type 01, - version: 00.00, build 77, source commit id: 876ac6906f31a43b6772b23c7c983ce9dcb18a19 ... [ 84.924666] haswell-pcm-audio haswell-pcm-audio: error: audio DSP boot timeout IPCD 0x0 IPCX 0x0 [ 85.260655] haswell-pcm-audio haswell-pcm-audio: ipc: --message timeout-- ipcx 0x83000000 isr 0x00000000 ipcd 0x00000000 imrx 0x7fff0000 [ 85.273609] haswell-pcm-audio haswell-pcm-audio: error: stream commit failed [ 85.279746] System PCM: error: failed to commit stream -110 [ 85.285388] haswell-pcm-audio haswell-pcm-audio: ASoC: haswell-pcm-audio hw params failed: -110 [ 85.293963] System PCM: ASoC: hw_params FE failed -110
This happens roughly 50% of the time when first setting hardware controls after a reboot. The other 50% of the time the DSP comes up just fine and audio works fine thereafter. Adding "#define DEBUG 1" to sound/soc/intel/haswell/sst-haswell-ipc.c makes the issue occur much less frequently in my testing. Seems like a subtle timing issue.
There were timing issues encountered during the bringup of the 2015 chromebook pixel (samus) which uses the bdw-rt5677 machine driver. Those were slightly different, and manifested during repeated arecords. Both devices use the same revision of the sst2 firmware.
Any ideas for how to debug this?
this could be trying to send an IPC while you are already waiting for one to complete. we've seen this before with SOF, if the IPCs are not strictly serialized then things go in the weeds and timeout.
[removing top-posting] This is roughly what I was thinking. Is there a good way to monitor the timing on the IPCs in cases like this shy of probing the hardware?
I don't think we have any magic tools here. Tracing the start and completion of an IPC, and looking at the dmesg log, along with counting number of IPCs requested and number of Acks received are the usual solutions to figure out when such problems happen.
On Mon, 2019-07-29 at 18:02 -0500, Pierre-Louis Bossart wrote:
On 7/29/19 4:53 PM, Jon Flatley wrote:
I've been working on upstreaming the bdw-rt5650 machine driver for the Acer Chromebase 24 (buddy). There seems to be an issue when first setting the hardware controls that appears to be crashing the DSP:
[ 51.424554] haswell-pcm-audio haswell-pcm-audio: FW loaded, mailbox readback FW info: type 01, - version: 00.00, build 77, source commit id: 876ac6906f31a43b6772b23c7c983ce9dcb18a19 ... [ 84.924666] haswell-pcm-audio haswell-pcm-audio: error: audio DSP boot timeout IPCD 0x0 IPCX 0x0 [ 85.260655] haswell-pcm-audio haswell-pcm-audio: ipc: --message timeout-- ipcx 0x83000000 isr 0x00000000 ipcd 0x00000000 imrx 0x7fff0000 [ 85.273609] haswell-pcm-audio haswell-pcm-audio: error: stream commit failed [ 85.279746] System PCM: error: failed to commit stream -110 [ 85.285388] haswell-pcm-audio haswell-pcm-audio: ASoC: haswell-pcm-audio hw params failed: -110 [ 85.293963] System PCM: ASoC: hw_params FE failed -110
This happens roughly 50% of the time when first setting hardware controls after a reboot. The other 50% of the time the DSP comes up just fine and audio works fine thereafter. Adding "#define DEBUG 1" to sound/soc/intel/haswell/sst-haswell-ipc.c makes the issue occur much less frequently in my testing. Seems like a subtle timing issue.
There were timing issues encountered during the bringup of the 2015 chromebook pixel (samus) which uses the bdw-rt5677 machine driver. Those were slightly different, and manifested during repeated arecords. Both devices use the same revision of the sst2 firmware.
Any ideas for how to debug this?
this could be trying to send an IPC while you are already waiting for one to complete. we've seen this before with SOF, if the IPCs are not strictly serialized then things go in the weeds and timeout.
Pierre/Jon
In this case it looks like the DSP boot failed leading to the IPC timeout? WOndering if increasing the boot timeout would help?
Thanks, Ranjani
Alsa-devel mailing list Alsa-devel@alsa-project.org https://mailman.alsa-project.org/mailman/listinfo/alsa-devel
On 7/29/19 7:53 PM, Ranjani Sridharan wrote:
On Mon, 2019-07-29 at 18:02 -0500, Pierre-Louis Bossart wrote:
On 7/29/19 4:53 PM, Jon Flatley wrote:
I've been working on upstreaming the bdw-rt5650 machine driver for the Acer Chromebase 24 (buddy). There seems to be an issue when first setting the hardware controls that appears to be crashing the DSP:
[ 51.424554] haswell-pcm-audio haswell-pcm-audio: FW loaded, mailbox readback FW info: type 01, - version: 00.00, build 77, source commit id: 876ac6906f31a43b6772b23c7c983ce9dcb18a19 ... [ 84.924666] haswell-pcm-audio haswell-pcm-audio: error: audio DSP boot timeout IPCD 0x0 IPCX 0x0 [ 85.260655] haswell-pcm-audio haswell-pcm-audio: ipc: --message timeout-- ipcx 0x83000000 isr 0x00000000 ipcd 0x00000000 imrx 0x7fff0000 [ 85.273609] haswell-pcm-audio haswell-pcm-audio: error: stream commit failed [ 85.279746] System PCM: error: failed to commit stream -110 [ 85.285388] haswell-pcm-audio haswell-pcm-audio: ASoC: haswell-pcm-audio hw params failed: -110 [ 85.293963] System PCM: ASoC: hw_params FE failed -110
This happens roughly 50% of the time when first setting hardware controls after a reboot. The other 50% of the time the DSP comes up just fine and audio works fine thereafter. Adding "#define DEBUG 1" to sound/soc/intel/haswell/sst-haswell-ipc.c makes the issue occur much less frequently in my testing. Seems like a subtle timing issue.
There were timing issues encountered during the bringup of the 2015 chromebook pixel (samus) which uses the bdw-rt5677 machine driver. Those were slightly different, and manifested during repeated arecords. Both devices use the same revision of the sst2 firmware.
Any ideas for how to debug this?
this could be trying to send an IPC while you are already waiting for one to complete. we've seen this before with SOF, if the IPCs are not strictly serialized then things go in the weeds and timeout.
Pierre/Jon
In this case it looks like the DSP boot failed leading to the IPC timeout? WOndering if increasing the boot timeout would help?
Yes, that too. The boot timeout is typically experimentally defined, and never decreasing due to platform variations... I am still leaning more on the side of an side effect between two IPCs, the added DEBUG points to the printk which solves timing issues. The boot timeout would typically not be impacted by such changes.
On Mon, Jul 29, 2019 at 7:23 PM Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com wrote:
On 7/29/19 7:53 PM, Ranjani Sridharan wrote:
On Mon, 2019-07-29 at 18:02 -0500, Pierre-Louis Bossart wrote:
On 7/29/19 4:53 PM, Jon Flatley wrote:
I've been working on upstreaming the bdw-rt5650 machine driver for the Acer Chromebase 24 (buddy). There seems to be an issue when first setting the hardware controls that appears to be crashing the DSP:
[ 51.424554] haswell-pcm-audio haswell-pcm-audio: FW loaded, mailbox readback FW info: type 01, - version: 00.00, build 77, source commit id: 876ac6906f31a43b6772b23c7c983ce9dcb18a19 ... [ 84.924666] haswell-pcm-audio haswell-pcm-audio: error: audio DSP boot timeout IPCD 0x0 IPCX 0x0 [ 85.260655] haswell-pcm-audio haswell-pcm-audio: ipc: --message timeout-- ipcx 0x83000000 isr 0x00000000 ipcd 0x00000000 imrx 0x7fff0000 [ 85.273609] haswell-pcm-audio haswell-pcm-audio: error: stream commit failed [ 85.279746] System PCM: error: failed to commit stream -110 [ 85.285388] haswell-pcm-audio haswell-pcm-audio: ASoC: haswell-pcm-audio hw params failed: -110 [ 85.293963] System PCM: ASoC: hw_params FE failed -110
This happens roughly 50% of the time when first setting hardware controls after a reboot. The other 50% of the time the DSP comes up just fine and audio works fine thereafter. Adding "#define DEBUG 1" to sound/soc/intel/haswell/sst-haswell-ipc.c makes the issue occur much less frequently in my testing. Seems like a subtle timing issue.
There were timing issues encountered during the bringup of the 2015 chromebook pixel (samus) which uses the bdw-rt5677 machine driver. Those were slightly different, and manifested during repeated arecords. Both devices use the same revision of the sst2 firmware.
Any ideas for how to debug this?
this could be trying to send an IPC while you are already waiting for one to complete. we've seen this before with SOF, if the IPCs are not strictly serialized then things go in the weeds and timeout.
Pierre/Jon
In this case it looks like the DSP boot failed leading to the IPC timeout? WOndering if increasing the boot timeout would help?
I did actually try this without success.
Yes, that too. The boot timeout is typically experimentally defined, and never decreasing due to platform variations... I am still leaning more on the side of an side effect between two IPCs, the added DEBUG points to the printk which solves timing issues. The boot timeout would typically not be impacted by such changes.
I think the real struggle I'm having is finding a good debugging method that doesn't impact the timing of the IPCs significantly (as adding DEBUG seems to). This could maybe be overcome with using a stress test to reproduce. The crash only seems to occur when first booting the DSP, and so far I've been testing this by completely power cycling the machine on every test, which is very slow and tedious. So maybe the issue with DEBUG defined occurs 1 in 20 reboots rather than 1 in 2, I wouldn't know. If there's a way to reboot the DSP and reproduce this crash without rebooting the entire device that would be very helpful to me.
Thanks, Jon
On Tue, 2019-07-30 at 10:45 -0700, Jon Flatley wrote:
On Mon, Jul 29, 2019 at 7:23 PM Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com wrote:
On 7/29/19 7:53 PM, Ranjani Sridharan wrote:
On Mon, 2019-07-29 at 18:02 -0500, Pierre-Louis Bossart wrote:
On 7/29/19 4:53 PM, Jon Flatley wrote:
I've been working on upstreaming the bdw-rt5650 machine driver for the Acer Chromebase 24 (buddy). There seems to be an issue when first setting the hardware controls that appears to be crashing the DSP:
[ 51.424554] haswell-pcm-audio haswell-pcm-audio: FW loaded, mailbox readback FW info: type 01, - version: 00.00, build 77, source commit id: 876ac6906f31a43b6772b23c7c983ce9dcb18a19 ... [ 84.924666] haswell-pcm-audio haswell-pcm-audio: error: audio DSP boot timeout IPCD 0x0 IPCX 0x0 [ 85.260655] haswell-pcm-audio haswell-pcm-audio: ipc: -- message timeout-- ipcx 0x83000000 isr 0x00000000 ipcd 0x00000000 imrx 0x7fff0000 [ 85.273609] haswell-pcm-audio haswell-pcm-audio: error: stream commit failed [ 85.279746] System PCM: error: failed to commit stream -110 [ 85.285388] haswell-pcm-audio haswell-pcm-audio: ASoC: haswell-pcm-audio hw params failed: -110 [ 85.293963] System PCM: ASoC: hw_params FE failed -110
This happens roughly 50% of the time when first setting hardware controls after a reboot. The other 50% of the time the DSP comes up just fine and audio works fine thereafter. Adding "#define DEBUG 1" to sound/soc/intel/haswell/sst-haswell-ipc.c makes the issue occur much less frequently in my testing. Seems like a subtle timing issue.
There were timing issues encountered during the bringup of the 2015 chromebook pixel (samus) which uses the bdw-rt5677 machine driver. Those were slightly different, and manifested during repeated arecords. Both devices use the same revision of the sst2 firmware.
Any ideas for how to debug this?
this could be trying to send an IPC while you are already waiting for one to complete. we've seen this before with SOF, if the IPCs are not strictly serialized then things go in the weeds and timeout.
Pierre/Jon
In this case it looks like the DSP boot failed leading to the IPC timeout? WOndering if increasing the boot timeout would help?
I did actually try this without success.
Yes, that too. The boot timeout is typically experimentally defined, and never decreasing due to platform variations... I am still leaning more on the side of an side effect between two IPCs, the added DEBUG points to the printk which solves timing issues. The boot timeout would typically not be impacted by such changes.
I think the real struggle I'm having is finding a good debugging method that doesn't impact the timing of the IPCs significantly (as adding DEBUG seems to). This could maybe be overcome with using a stress test to reproduce. The crash only seems to occur when first booting the DSP, and so far I've been testing this by completely power cycling the machine on every test, which is very slow and tedious. So maybe the issue with DEBUG defined occurs 1 in 20 reboots rather than 1 in 2, I wouldn't know. If there's a way to reboot the DSP and reproduce this crash without rebooting the entire device that would be very helpful to me.
Maybe you've already tried this. But, how about blacklisting the audio driver and then trying a modprobe/rmmod to insert and remove themodule. This should attempt to boot the DSP upon every modprobe. But what I am not sure about is whether the rmmod would succeed if the IPC times out because the DSP has crashed.
Thanks, Ranjani
Thanks, Jon
On 7/30/19 1:47 PM, Ranjani Sridharan wrote:
On Tue, 2019-07-30 at 10:45 -0700, Jon Flatley wrote:
On Mon, Jul 29, 2019 at 7:23 PM Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com wrote:
On 7/29/19 7:53 PM, Ranjani Sridharan wrote:
On Mon, 2019-07-29 at 18:02 -0500, Pierre-Louis Bossart wrote:
On 7/29/19 4:53 PM, Jon Flatley wrote:
I've been working on upstreaming the bdw-rt5650 machine driver for the Acer Chromebase 24 (buddy). There seems to be an issue when first setting the hardware controls that appears to be crashing the DSP:
[ 51.424554] haswell-pcm-audio haswell-pcm-audio: FW loaded, mailbox readback FW info: type 01, - version: 00.00, build 77, source commit id: 876ac6906f31a43b6772b23c7c983ce9dcb18a19 ... [ 84.924666] haswell-pcm-audio haswell-pcm-audio: error: audio DSP boot timeout IPCD 0x0 IPCX 0x0 [ 85.260655] haswell-pcm-audio haswell-pcm-audio: ipc: -- message timeout-- ipcx 0x83000000 isr 0x00000000 ipcd 0x00000000 imrx 0x7fff0000 [ 85.273609] haswell-pcm-audio haswell-pcm-audio: error: stream commit failed [ 85.279746] System PCM: error: failed to commit stream -110 [ 85.285388] haswell-pcm-audio haswell-pcm-audio: ASoC: haswell-pcm-audio hw params failed: -110 [ 85.293963] System PCM: ASoC: hw_params FE failed -110
This happens roughly 50% of the time when first setting hardware controls after a reboot. The other 50% of the time the DSP comes up just fine and audio works fine thereafter. Adding "#define DEBUG 1" to sound/soc/intel/haswell/sst-haswell-ipc.c makes the issue occur much less frequently in my testing. Seems like a subtle timing issue.
There were timing issues encountered during the bringup of the 2015 chromebook pixel (samus) which uses the bdw-rt5677 machine driver. Those were slightly different, and manifested during repeated arecords. Both devices use the same revision of the sst2 firmware.
Any ideas for how to debug this?
this could be trying to send an IPC while you are already waiting for one to complete. we've seen this before with SOF, if the IPCs are not strictly serialized then things go in the weeds and timeout.
Pierre/Jon
In this case it looks like the DSP boot failed leading to the IPC timeout? WOndering if increasing the boot timeout would help?
I did actually try this without success.
Yes, that too. The boot timeout is typically experimentally defined, and never decreasing due to platform variations... I am still leaning more on the side of an side effect between two IPCs, the added DEBUG points to the printk which solves timing issues. The boot timeout would typically not be impacted by such changes.
I think the real struggle I'm having is finding a good debugging method that doesn't impact the timing of the IPCs significantly (as adding DEBUG seems to). This could maybe be overcome with using a stress test to reproduce. The crash only seems to occur when first booting the DSP, and so far I've been testing this by completely power cycling the machine on every test, which is very slow and tedious. So maybe the issue with DEBUG defined occurs 1 in 20 reboots rather than 1 in 2, I wouldn't know. If there's a way to reboot the DSP and reproduce this crash without rebooting the entire device that would be very helpful to me.
Maybe you've already tried this. But, how about blacklisting the audio driver and then trying a modprobe/rmmod to insert and remove themodule. This should attempt to boot the DSP upon every modprobe. But what I am not sure about is whether the rmmod would succeed if the IPC times out because the DSP has crashed.
I don't think we can really reduce the 'Heisenbug' nature of code instrumentations. But as Ranjani suggested it increasing the test frequency would make things more observable. I would go for suspend-resume tests, that would also force a DSP reboot without requiring a full reboot.
rtcwake -s 3 -m mem
I suspect modprobe/rmmod isn't likely to work, those legacy drivers were not exactly written with stress-test in mind. Suspend-resume is likely more reliable - been used in real products but tested with older kernels so your mileage may vary.
We should really have completed SOF support for Broadwell instead of supporting zombie drivers. Gah.
On Tue, Jul 30, 2019 at 12:04 PM Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com wrote:
On 7/30/19 1:47 PM, Ranjani Sridharan wrote:
On Tue, 2019-07-30 at 10:45 -0700, Jon Flatley wrote:
On Mon, Jul 29, 2019 at 7:23 PM Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com wrote:
On 7/29/19 7:53 PM, Ranjani Sridharan wrote:
On Mon, 2019-07-29 at 18:02 -0500, Pierre-Louis Bossart wrote:
On 7/29/19 4:53 PM, Jon Flatley wrote: > I've been working on upstreaming the bdw-rt5650 machine > driver for > the > Acer Chromebase 24 (buddy). There seems to be an issue when > first > setting the hardware controls that appears to be crashing the > DSP: > > [ 51.424554] haswell-pcm-audio haswell-pcm-audio: FW > loaded, > mailbox > readback FW info: type 01, - version: 00.00, build 77, source > commit > id: 876ac6906f31a43b6772b23c7c983ce9dcb18a19 > ... > [ 84.924666] haswell-pcm-audio haswell-pcm-audio: error: > audio > DSP > boot timeout IPCD 0x0 IPCX 0x0 > [ 85.260655] haswell-pcm-audio haswell-pcm-audio: ipc: -- > message > timeout-- ipcx 0x83000000 isr 0x00000000 ipcd 0x00000000 imrx > 0x7fff0000 > [ 85.273609] haswell-pcm-audio haswell-pcm-audio: error: > stream > commit failed > [ 85.279746] System PCM: error: failed to commit stream > -110 > [ 85.285388] haswell-pcm-audio haswell-pcm-audio: ASoC: > haswell-pcm-audio hw params failed: -110 > [ 85.293963] System PCM: ASoC: hw_params FE failed -110 > > This happens roughly 50% of the time when first setting > hardware > controls after a reboot. The other 50% of the time the DSP > comes up > just fine and audio works fine thereafter. Adding "#define > DEBUG 1" > to > sound/soc/intel/haswell/sst-haswell-ipc.c makes the issue > occur > much > less frequently in my testing. Seems like a subtle timing > issue. > > There were timing issues encountered during the bringup of > the 2015 > chromebook pixel (samus) which uses the bdw-rt5677 machine > driver. > Those were slightly different, and manifested during repeated > arecords. Both devices use the same revision of the sst2 > firmware. > > Any ideas for how to debug this?
this could be trying to send an IPC while you are already waiting for one to complete. we've seen this before with SOF, if the IPCs are not strictly serialized then things go in the weeds and timeout.
Pierre/Jon
In this case it looks like the DSP boot failed leading to the IPC timeout? WOndering if increasing the boot timeout would help?
I did actually try this without success.
Yes, that too. The boot timeout is typically experimentally defined, and never decreasing due to platform variations... I am still leaning more on the side of an side effect between two IPCs, the added DEBUG points to the printk which solves timing issues. The boot timeout would typically not be impacted by such changes.
I think the real struggle I'm having is finding a good debugging method that doesn't impact the timing of the IPCs significantly (as adding DEBUG seems to). This could maybe be overcome with using a stress test to reproduce. The crash only seems to occur when first booting the DSP, and so far I've been testing this by completely power cycling the machine on every test, which is very slow and tedious. So maybe the issue with DEBUG defined occurs 1 in 20 reboots rather than 1 in 2, I wouldn't know. If there's a way to reboot the DSP and reproduce this crash without rebooting the entire device that would be very helpful to me.
Maybe you've already tried this. But, how about blacklisting the audio driver and then trying a modprobe/rmmod to insert and remove themodule. This should attempt to boot the DSP upon every modprobe. But what I am not sure about is whether the rmmod would succeed if the IPC times out because the DSP has crashed.
I don't think we can really reduce the 'Heisenbug' nature of code instrumentations. But as Ranjani suggested it increasing the test frequency would make things more observable. I would go for suspend-resume tests, that would also force a DSP reboot without requiring a full reboot.
rtcwake -s 3 -m mem
I suspect modprobe/rmmod isn't likely to work, those legacy drivers were not exactly written with stress-test in mind. Suspend-resume is likely more reliable - been used in real products but tested with older kernels so your mileage may vary.
We should really have completed SOF support for Broadwell instead of supporting zombie drivers. Gah.
I've been off this issue for a couple of weeks but yesterday I made some progress.
There seems to be an issue when suspending the ALC5650. I think the nondeterministic behavior I was seeing just had to do with whether or not the DSP had yet suspended.
I reverted commit 0d2135ecadb0 ("ASoC: Intel: Work around to fix HW D3 potential crash issue") and things started working, including suspend/resume of the DSP. Any ideas for why this may be? I would like to resolve this so I can finish upstreaming the bdw-rt5650 machine driver.
Thanks, -Jon
There seems to be an issue when suspending the ALC5650. I think the nondeterministic behavior I was seeing just had to do with whether or not the DSP had yet suspended.
I reverted commit 0d2135ecadb0 ("ASoC: Intel: Work around to fix HW D3 potential crash issue") and things started working, including suspend/resume of the DSP. Any ideas for why this may be? I would like to resolve this so I can finish upstreaming the bdw-rt5650 machine driver.
Copying Keyon in case he remembers the context.
Reverting a 5yr-old commit with all sorts of clock/power-related fixes looks brave, and it's not clear why this would work with the rt5677 and not with 5650.
Are you using the latest upstream firmware btw? Or the one which shipped with the initial device (which could be an issue if the protocol changed).
On Wed, Aug 14, 2019 at 1:51 PM Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com wrote:
There seems to be an issue when suspending the ALC5650. I think the nondeterministic behavior I was seeing just had to do with whether or not the DSP had yet suspended.
I reverted commit 0d2135ecadb0 ("ASoC: Intel: Work around to fix HW D3 potential crash issue") and things started working, including suspend/resume of the DSP. Any ideas for why this may be? I would like to resolve this so I can finish upstreaming the bdw-rt5650 machine driver.
Copying Keyon in case he remembers the context.
Reverting a 5yr-old commit with all sorts of clock/power-related fixes looks brave, and it's not clear why this would work with the rt5677 and not with 5650.
No idea, I was just diffing the register writes looking for sources of discrepancy. The Chromium OS 3.14 kernel tree that Buddy uses doesn't have this patch, so I figured what's the worst that could happen?
Are you using the latest upstream firmware btw? Or the one which shipped with the initial device (which could be an issue if the protocol changed).
The firmware I'm loading is: `FW info: type 01, - version: 00.00, build 77, source commit id: 876ac6906f31a43b6772b23c7c983ce9dcb18a1`. Hashes the same as the upstream binary.
-----Original Message----- From: Jon Flatley [mailto:jflat@chromium.org] Sent: Thursday, August 15, 2019 5:25 AM To: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Cc: Jon Flatley jflat@chromium.org; Jie, Yang yang.jie@intel.com; benzh@chromium.org; alsa-devel@alsa-project.org; Ranjani Sridharan ranjani.sridharan@linux.intel.com; cujomalainey@chromium.org; Jie Yang yang.jie@linux.intel.com Subject: Re: [alsa-devel] [BUG] bdw-rt5650 DSP boot timeout
On Wed, Aug 14, 2019 at 1:51 PM Pierre-Louis Bossart <pierre- louis.bossart@linux.intel.com> wrote:
There seems to be an issue when suspending the ALC5650. I think the nondeterministic behavior I was seeing just had to do with whether or not the DSP had yet suspended.
I reverted commit 0d2135ecadb0 ("ASoC: Intel: Work around to fix HW D3 potential crash issue") and things started working, including suspend/resume of the DSP. Any ideas for why this may be? I would like to resolve this so I can finish upstreaming the bdw-rt5650 machine driver.
Copying Keyon in case he remembers the context.
Reverting a 5yr-old commit with all sorts of clock/power-related fixes looks brave, and it's not clear why this would work with the rt5677 and not with 5650.
No idea, I was just diffing the register writes looking for sources of discrepancy. The Chromium OS 3.14 kernel tree that Buddy uses doesn't have this patch, so I figured what's the worst that could happen?
Hi Jon, sorry about just noticing this thread. From the dmesg log, the issue happens at runtime suspend/resume but not in boot, am I right(you can disable runtime PM for the device to confirm that)?
My points here are: 1. the commit 0d2135ecadb0 was suggested by FW team to W/A D3 potential crash issue. 2. it was verified with rt286(Broadwell.c, e.g. Dell XPS) from our side only(and may have been checked with rt5677 by Chrome team). 3. please follow sequence in broadwell.c if issue happen at boot time. If happened at runtime PM from DSP side, we should see it with all kinds of machine driver. Could you performing more test and debugging to see what it real happen there? 4. we have no reason to remove the commit directly, except correcting if some lines are proved wrong. And, as Pierre mentioned, SOF driver is preferred, as there is no new development effort to support SST haswell/Broadwell driver here(no platform, no developer, :-( ).
Thanks, ~Keyon>
Are you using the latest upstream firmware btw? Or the one which shipped with the initial device (which could be an issue if the protocol
changed).
The firmware I'm loading is: `FW info: type 01, - version: 00.00, build 77, source commit id: 876ac6906f31a43b6772b23c7c983ce9dcb18a1`. Hashes the same as the upstream binary.
On 2019-08-19 04:33, Jie, Yang wrote:
-----Original Message----- From: Jon Flatley [mailto:jflat@chromium.org] Sent: Thursday, August 15, 2019 5:25 AM To: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Cc: Jon Flatley jflat@chromium.org; Jie, Yang yang.jie@intel.com; benzh@chromium.org; alsa-devel@alsa-project.org; Ranjani Sridharan ranjani.sridharan@linux.intel.com; cujomalainey@chromium.org; Jie Yang yang.jie@linux.intel.com Subject: Re: [alsa-devel] [BUG] bdw-rt5650 DSP boot timeout
On Wed, Aug 14, 2019 at 1:51 PM Pierre-Louis Bossart <pierre- louis.bossart@linux.intel.com> wrote:
There seems to be an issue when suspending the ALC5650. I think the nondeterministic behavior I was seeing just had to do with whether or not the DSP had yet suspended.
I reverted commit 0d2135ecadb0 ("ASoC: Intel: Work around to fix HW D3 potential crash issue") and things started working, including suspend/resume of the DSP. Any ideas for why this may be? I would like to resolve this so I can finish upstreaming the bdw-rt5650 machine driver.
Copying Keyon in case he remembers the context.
Reverting a 5yr-old commit with all sorts of clock/power-related fixes looks brave, and it's not clear why this would work with the rt5677 and not with 5650.
No idea, I was just diffing the register writes looking for sources of discrepancy. The Chromium OS 3.14 kernel tree that Buddy uses doesn't have this patch, so I figured what's the worst that could happen?
Hi Jon, sorry about just noticing this thread. From the dmesg log, the issue happens at runtime suspend/resume but not in boot, am I right(you can disable runtime PM for the device to confirm that)?
My points here are:
- the commit 0d2135ecadb0 was suggested by FW team to W/A D3 potential crash issue.
- it was verified with rt286(Broadwell.c, e.g. Dell XPS) from our side only(and may have been checked with rt5677 by Chrome team).
- please follow sequence in broadwell.c if issue happen at boot time.
If happened at runtime PM from DSP side, we should see it with all kinds of machine driver. Could you performing more test and debugging to see what it real happen there? 4. we have no reason to remove the commit directly, except correcting if some lines are proved wrong. And, as Pierre mentioned, SOF driver is preferred, as there is no new development effort to support SST haswell/Broadwell driver here(no platform, no developer, :-( ).
Thanks, ~Keyon>
Got to disagree with the last one - no platform, no developer. We are setting up some BDW/ HSW here to join our happy SKL+ family in CI. This is because of /common cleanups which will engulf aDSP project (hsw/byt) obviously.
These will be tested against the exact same BAT scope as other ADSP devices. Code here looks much better, at least compared to /skylake - ain't a high threshold though.. Given how outdated all SKL+ fw binaries are (on upstream repo) it might even come down simply to fw upgrade. Most of FW peps who took part in that project are already out. Although, found one or two who are willing to help : )
And yes, I'm setting them up with rt286 too. There are some rt56XX but I'm unsure if rt5650 is amount them. Still got some problems with ACPI, but soon two new faces should be greeting audio CI bonfire..
Czarek
Are you using the latest upstream firmware btw? Or the one which shipped with the initial device (which could be an issue if the protocol
changed).
The firmware I'm loading is: `FW info: type 01, - version: 00.00, build 77, source commit id: 876ac6906f31a43b6772b23c7c983ce9dcb18a1`. Hashes the same as the upstream binary.
Alsa-devel mailing list Alsa-devel@alsa-project.org https://mailman.alsa-project.org/mailman/listinfo/alsa-devel
On Mon, Aug 19, 2019 at 11:08 AM Cezary Rojewski cezary.rojewski@intel.com wrote:
On 2019-08-19 04:33, Jie, Yang wrote:
-----Original Message----- From: Jon Flatley [mailto:jflat@chromium.org] Sent: Thursday, August 15, 2019 5:25 AM To: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Cc: Jon Flatley jflat@chromium.org; Jie, Yang yang.jie@intel.com; benzh@chromium.org; alsa-devel@alsa-project.org; Ranjani Sridharan ranjani.sridharan@linux.intel.com; cujomalainey@chromium.org; Jie Yang yang.jie@linux.intel.com Subject: Re: [alsa-devel] [BUG] bdw-rt5650 DSP boot timeout
On Wed, Aug 14, 2019 at 1:51 PM Pierre-Louis Bossart <pierre- louis.bossart@linux.intel.com> wrote:
There seems to be an issue when suspending the ALC5650. I think the nondeterministic behavior I was seeing just had to do with whether or not the DSP had yet suspended.
I reverted commit 0d2135ecadb0 ("ASoC: Intel: Work around to fix HW D3 potential crash issue") and things started working, including suspend/resume of the DSP. Any ideas for why this may be? I would like to resolve this so I can finish upstreaming the bdw-rt5650 machine driver.
Copying Keyon in case he remembers the context.
Reverting a 5yr-old commit with all sorts of clock/power-related fixes looks brave, and it's not clear why this would work with the rt5677 and not with 5650.
No idea, I was just diffing the register writes looking for sources of discrepancy. The Chromium OS 3.14 kernel tree that Buddy uses doesn't have this patch, so I figured what's the worst that could happen?
Hi Jon, sorry about just noticing this thread. From the dmesg log, the issue happens at runtime suspend/resume but not in boot, am I right(you can disable runtime PM for the device to confirm that)?
From what I can tell that is correct. Disabling runtime PM seems to
stabilize things. I tested this over 10 reboots. I'll kick off my stress test script overnight just to see if this is 100% consistent.
My points here are:
- the commit 0d2135ecadb0 was suggested by FW team to W/A D3 potential crash issue.
- it was verified with rt286(Broadwell.c, e.g. Dell XPS) from our side only(and may have been checked with rt5677 by Chrome team).
- please follow sequence in broadwell.c if issue happen at boot time.
If happened at runtime PM from DSP side, we should see it with all kinds of machine driver.
I'm not really a sound guy; I've been picking this up as I go along.
From what I've gathered it doesn't make sense to me why this is an
issue on buddy, but not other bdw platforms, such as samus. If I understand correctly they both have the same DSP and use the same runtime suspend/resume code. What makes this fail with the 5650 and not the 5677 is the million dollar question.
Could you performing more test and debugging to see what it real happen there?
Yes, I'll continue poking at this. The debugging that got me this far basically just involved placing traces on the sst_shim32_write/read functions and looking at the diff from my best working reference, which is our cros-kernel-3.14 branch. This is what lead me to reverting 0d2135ecadb0, as it produced effectively identical traces as I was seeing in 3.14.
- we have no reason to remove the commit directly, except correcting if some lines are proved wrong. And, as Pierre mentioned, SOF driver is preferred, as there is no new development effort to support SST haswell/Broadwell driver here(no platform, no developer, :-( ).
I'm not suggesting removing the commit, merely observing that reverting it seems to fix the problem.
Thanks, ~Keyon>
Got to disagree with the last one - no platform, no developer. We are setting up some BDW/ HSW here to join our happy SKL+ family in CI. This is because of /common cleanups which will engulf aDSP project (hsw/byt) obviously.
These will be tested against the exact same BAT scope as other ADSP devices. Code here looks much better, at least compared to /skylake - ain't a high threshold though.. Given how outdated all SKL+ fw binaries are (on upstream repo) it might even come down simply to fw upgrade. Most of FW peps who took part in that project are already out. Although, found one or two who are willing to help : )
And yes, I'm setting them up with rt286 too. There are some rt56XX but I'm unsure if rt5650 is amount them. Still got some problems with ACPI, but soon two new faces should be greeting audio CI bonfire..
Czarek
I can continue to work at this to see if I can make any more headway. Unfortunately without a solid intuitive understanding of the system, or insight into the DSP, I'm limited to looking at traces and git history for the most part.
Curtis: Do you think it makes sense to poke at samus and see if there are any differences in the suspend/resume process, or are they pretty much guaranteed to be identical?
Thanks for all your help on this.
- Jon
Are you using the latest upstream firmware btw? Or the one which shipped with the initial device (which could be an issue if the protocol
changed).
The firmware I'm loading is: `FW info: type 01, - version: 00.00, build 77, source commit id: 876ac6906f31a43b6772b23c7c983ce9dcb18a1`. Hashes the same as the upstream binary.
Alsa-devel mailing list Alsa-devel@alsa-project.org https://mailman.alsa-project.org/mailman/listinfo/alsa-devel
On Mon, Aug 19, 2019 at 3:37 PM Jon Flatley jflat@chromium.org wrote:
On Mon, Aug 19, 2019 at 11:08 AM Cezary Rojewski cezary.rojewski@intel.com wrote:
On 2019-08-19 04:33, Jie, Yang wrote:
-----Original Message----- From: Jon Flatley [mailto:jflat@chromium.org] Sent: Thursday, August 15, 2019 5:25 AM To: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Cc: Jon Flatley jflat@chromium.org; Jie, Yang yang.jie@intel.com; benzh@chromium.org; alsa-devel@alsa-project.org; Ranjani Sridharan ranjani.sridharan@linux.intel.com; cujomalainey@chromium.org; Jie Yang yang.jie@linux.intel.com Subject: Re: [alsa-devel] [BUG] bdw-rt5650 DSP boot timeout
On Wed, Aug 14, 2019 at 1:51 PM Pierre-Louis Bossart <pierre- louis.bossart@linux.intel.com> wrote:
There seems to be an issue when suspending the ALC5650. I think the nondeterministic behavior I was seeing just had to do with whether or not the DSP had yet suspended.
I reverted commit 0d2135ecadb0 ("ASoC: Intel: Work around to fix HW D3 potential crash issue") and things started working, including suspend/resume of the DSP. Any ideas for why this may be? I would like to resolve this so I can finish upstreaming the bdw-rt5650 machine driver.
Copying Keyon in case he remembers the context.
Reverting a 5yr-old commit with all sorts of clock/power-related fixes looks brave, and it's not clear why this would work with the rt5677 and not with 5650.
No idea, I was just diffing the register writes looking for sources of discrepancy. The Chromium OS 3.14 kernel tree that Buddy uses doesn't have this patch, so I figured what's the worst that could happen?
Hi Jon, sorry about just noticing this thread. From the dmesg log, the issue happens at runtime suspend/resume but not in boot, am I right(you can disable runtime PM for the device to confirm that)?
From what I can tell that is correct. Disabling runtime PM seems to stabilize things. I tested this over 10 reboots. I'll kick off my stress test script overnight just to see if this is 100% consistent.
My points here are:
- the commit 0d2135ecadb0 was suggested by FW team to W/A D3 potential crash issue.
- it was verified with rt286(Broadwell.c, e.g. Dell XPS) from our side only(and may have been checked with rt5677 by Chrome team).
- please follow sequence in broadwell.c if issue happen at boot time.
If happened at runtime PM from DSP side, we should see it with all kinds of machine driver.
I'm not really a sound guy; I've been picking this up as I go along. From what I've gathered it doesn't make sense to me why this is an issue on buddy, but not other bdw platforms, such as samus. If I understand correctly they both have the same DSP and use the same runtime suspend/resume code. What makes this fail with the 5650 and not the 5677 is the million dollar question.
Could you performing more test and debugging to see what it real happen there?
Yes, I'll continue poking at this. The debugging that got me this far basically just involved placing traces on the sst_shim32_write/read functions and looking at the diff from my best working reference, which is our cros-kernel-3.14 branch. This is what lead me to reverting 0d2135ecadb0, as it produced effectively identical traces as I was seeing in 3.14.
- we have no reason to remove the commit directly, except correcting if some lines are proved wrong. And, as Pierre mentioned, SOF driver is preferred, as there is no new development effort to support SST haswell/Broadwell driver here(no platform, no developer, :-( ).
I'm not suggesting removing the commit, merely observing that reverting it seems to fix the problem.
Thanks, ~Keyon>
Got to disagree with the last one - no platform, no developer. We are setting up some BDW/ HSW here to join our happy SKL+ family in CI. This is because of /common cleanups which will engulf aDSP project (hsw/byt) obviously.
These will be tested against the exact same BAT scope as other ADSP devices. Code here looks much better, at least compared to /skylake - ain't a high threshold though.. Given how outdated all SKL+ fw binaries are (on upstream repo) it might even come down simply to fw upgrade. Most of FW peps who took part in that project are already out. Although, found one or two who are willing to help : )
And yes, I'm setting them up with rt286 too. There are some rt56XX but I'm unsure if rt5650 is amount them. Still got some problems with ACPI, but soon two new faces should be greeting audio CI bonfire..
Czarek
I can continue to work at this to see if I can make any more headway. Unfortunately without a solid intuitive understanding of the system, or insight into the DSP, I'm limited to looking at traces and git history for the most part.
Curtis: Do you think it makes sense to poke at samus and see if there are any differences in the suspend/resume process, or are they pretty much guaranteed to be identical?
My recommendation would be to look at the machine driver and see if its making additional calls to the DSP driver that is not made in other machine drivers such as the bdw-rt5677 (Samus.) That might indicate an additional code path that might be getting exercised in your context that isn't used in samus which is causing your problems. If you find something you can always copy it over to samus to see if it causes the same breakage. So yes definitely look. Usually the suspend/resume paths aren't that long, but I would search the whole machine driver for anything that can alter state.
Thanks for all your help on this.
- Jon
Are you using the latest upstream firmware btw? Or the one which shipped with the initial device (which could be an issue if the protocol
changed).
The firmware I'm loading is: `FW info: type 01, - version: 00.00, build 77, source commit id: 876ac6906f31a43b6772b23c7c983ce9dcb18a1`. Hashes the same as the upstream binary.
Alsa-devel mailing list Alsa-devel@alsa-project.org https://mailman.alsa-project.org/mailman/listinfo/alsa-devel
My recommendation would be to look at the machine driver and see if its making additional calls to the DSP driver that is not made in other machine drivers such as the bdw-rt5677 (Samus.) That might indicate an additional code path that might be getting exercised in your context that isn't used in samus which is causing your problems. If you find something you can always copy it over to samus to see if it causes the same breakage. So yes definitely look. Usually the suspend/resume paths aren't that long, but I would search the whole machine driver for anything that can alter state.
The only significant difference I see in the machine drivers is that the clock dividers are smaller in this bdw-rt5660 case, the bitclock is 4.8 MHz v. 2.4 MHz in the other bdw-rt5677. It's not very clear to me why there is a need to have different clock and SSP settings, but since the patch that was suggested to be reverted touches the clock generation maybe there's a connection?
-----Original Message----- From: Rojewski, Cezary Sent: Tuesday, August 20, 2019 2:09 AM To: Jie, Yang yang.jie@intel.com; Jon Flatley jflat@chromium.org; Pierre- Louis Bossart pierre-louis.bossart@linux.intel.com Cc: benzh@chromium.org; alsa-devel@alsa-project.org; Jie Yang yang.jie@linux.intel.com; Ranjani Sridharan ranjani.sridharan@linux.intel.com; cujomalainey@chromium.org Subject: Re: [alsa-devel] [BUG] bdw-rt5650 DSP boot timeout
On 2019-08-19 04:33, Jie, Yang wrote:
-----Original Message----- From: Jon Flatley [mailto:jflat@chromium.org] Sent: Thursday, August 15, 2019 5:25 AM To: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Cc: Jon Flatley jflat@chromium.org; Jie, Yang yang.jie@intel.com; benzh@chromium.org; alsa-devel@alsa-project.org; Ranjani Sridharan ranjani.sridharan@linux.intel.com; cujomalainey@chromium.org; Jie Yang yang.jie@linux.intel.com Subject: Re: [alsa-devel] [BUG] bdw-rt5650 DSP boot timeout
On Wed, Aug 14, 2019 at 1:51 PM Pierre-Louis Bossart <pierre- louis.bossart@linux.intel.com> wrote:
There seems to be an issue when suspending the ALC5650. I think the nondeterministic behavior I was seeing just had to do with whether or not the DSP had yet suspended.
I reverted commit 0d2135ecadb0 ("ASoC: Intel: Work around to fix HW D3 potential crash issue") and things started working, including suspend/resume of the DSP. Any ideas for why this may be? I would like to resolve this so I can finish upstreaming the bdw-rt5650 machine driver.
Copying Keyon in case he remembers the context.
Reverting a 5yr-old commit with all sorts of clock/power-related fixes looks brave, and it's not clear why this would work with the rt5677 and not with 5650.
No idea, I was just diffing the register writes looking for sources of
discrepancy.
The Chromium OS 3.14 kernel tree that Buddy uses doesn't have this patch, so I figured what's the worst that could happen?
Hi Jon, sorry about just noticing this thread. From the dmesg log, the issue happens at runtime suspend/resume but not
in boot, am I right(you can disable runtime PM for the device to confirm that)?
My points here are:
- the commit 0d2135ecadb0 was suggested by FW team to W/A D3
potential crash issue.
- it was verified with rt286(Broadwell.c, e.g. Dell XPS) from our side
only(and may have been checked with rt5677 by Chrome team).
- please follow sequence in broadwell.c if issue happen at boot time.
If happened at runtime PM from DSP side, we should see it with all kinds of
machine driver.
Could you performing more test and debugging to see what it real happen
there?
- we have no reason to remove the commit directly, except correcting if
some lines are proved wrong. And, as Pierre mentioned, SOF driver is preferred, as there is no new development effort to support SST haswell/Broadwell driver here(no platform, no developer, :-( ).
Thanks, ~Keyon>
Got to disagree with the last one - no platform, no developer. We are setting up some BDW/ HSW here to join our happy SKL+ family in CI. This is because of /common cleanups which will engulf aDSP project (hsw/byt) obviously.
Yes, that's true, good to hear that you will add it to CI.
These will be tested against the exact same BAT scope as other ADSP devices. Code here looks much better, at least compared to /skylake - ain't a high threshold though.. Given how outdated all SKL+ fw binaries are (on upstream repo) it might even come down simply to fw upgrade. Most of FW peps who took part in that project are already out. Although, found one or two who are willing to help : )
I remember Pawel Piskorski and Marcin Barlik helped me from the FW side(including explaining about the S0<->S3 sequence), please contact me offline if needed, I will try to drag for some mails which I got 5 years back.
Thanks, ~Keyon
And yes, I'm setting them up with rt286 too. There are some rt56XX but I'm unsure if rt5650 is amount them. Still got some problems with ACPI, but soon two new faces should be greeting audio CI bonfire..
Czarek
Are you using the latest upstream firmware btw? Or the one which shipped with the initial device (which could be an issue if the protocol
changed).
The firmware I'm loading is: `FW info: type 01, - version: 00.00, build 77, source commit id: 876ac6906f31a43b6772b23c7c983ce9dcb18a1`. Hashes the same as the upstream binary.
Alsa-devel mailing list Alsa-devel@alsa-project.org https://mailman.alsa-project.org/mailman/listinfo/alsa-devel
On 2019-08-20 04:11, Jie, Yang wrote:
-----Original Message----- From: Rojewski, Cezary Sent: Tuesday, August 20, 2019 2:09 AM To: Jie, Yang yang.jie@intel.com; Jon Flatley jflat@chromium.org; Pierre- Louis Bossart pierre-louis.bossart@linux.intel.com Cc: benzh@chromium.org; alsa-devel@alsa-project.org; Jie Yang yang.jie@linux.intel.com; Ranjani Sridharan ranjani.sridharan@linux.intel.com; cujomalainey@chromium.org Subject: Re: [alsa-devel] [BUG] bdw-rt5650 DSP boot timeout
On 2019-08-19 04:33, Jie, Yang wrote:
-----Original Message----- From: Jon Flatley [mailto:jflat@chromium.org] Sent: Thursday, August 15, 2019 5:25 AM To: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Cc: Jon Flatley jflat@chromium.org; Jie, Yang yang.jie@intel.com; benzh@chromium.org; alsa-devel@alsa-project.org; Ranjani Sridharan ranjani.sridharan@linux.intel.com; cujomalainey@chromium.org; Jie Yang yang.jie@linux.intel.com Subject: Re: [alsa-devel] [BUG] bdw-rt5650 DSP boot timeout
On Wed, Aug 14, 2019 at 1:51 PM Pierre-Louis Bossart <pierre- louis.bossart@linux.intel.com> wrote:
There seems to be an issue when suspending the ALC5650. I think the nondeterministic behavior I was seeing just had to do with whether or not the DSP had yet suspended.
I reverted commit 0d2135ecadb0 ("ASoC: Intel: Work around to fix HW D3 potential crash issue") and things started working, including suspend/resume of the DSP. Any ideas for why this may be? I would like to resolve this so I can finish upstreaming the bdw-rt5650 machine driver.
Copying Keyon in case he remembers the context.
Reverting a 5yr-old commit with all sorts of clock/power-related fixes looks brave, and it's not clear why this would work with the rt5677 and not with 5650.
No idea, I was just diffing the register writes looking for sources of
discrepancy.
The Chromium OS 3.14 kernel tree that Buddy uses doesn't have this patch, so I figured what's the worst that could happen?
Hi Jon, sorry about just noticing this thread. From the dmesg log, the issue happens at runtime suspend/resume but not
in boot, am I right(you can disable runtime PM for the device to confirm that)?
My points here are:
- the commit 0d2135ecadb0 was suggested by FW team to W/A D3
potential crash issue.
- it was verified with rt286(Broadwell.c, e.g. Dell XPS) from our side
only(and may have been checked with rt5677 by Chrome team).
- please follow sequence in broadwell.c if issue happen at boot time.
If happened at runtime PM from DSP side, we should see it with all kinds of
machine driver.
Could you performing more test and debugging to see what it real happen
there?
- we have no reason to remove the commit directly, except correcting if
some lines are proved wrong. And, as Pierre mentioned, SOF driver is preferred, as there is no new development effort to support SST haswell/Broadwell driver here(no platform, no developer, :-( ).
Thanks, ~Keyon>
Got to disagree with the last one - no platform, no developer. We are setting up some BDW/ HSW here to join our happy SKL+ family in CI. This is because of /common cleanups which will engulf aDSP project (hsw/byt) obviously.
Yes, that's true, good to hear that you will add it to CI.
These will be tested against the exact same BAT scope as other ADSP devices. Code here looks much better, at least compared to /skylake - ain't a high threshold though.. Given how outdated all SKL+ fw binaries are (on upstream repo) it might even come down simply to fw upgrade. Most of FW peps who took part in that project are already out. Although, found one or two who are willing to help : )
I remember Pawel Piskorski and Marcin Barlik helped me from the FW side(including explaining about the S0<->S3 sequence), please contact me offline if needed, I will try to drag for some mails which I got 5 years back.
Thanks, ~Keyon
Please do not name people on official list unless you are 100% sure about their engagement in linux solutions, which for both individuals you have listed, is no longer the case. Any recommendations? - you can provide internally.
Anyway, I've contacted Marcin and once he is available, we will review the patch together. Note, that I'm a IGK dweller too, so it's highly probable whomever you had in mind I've either already met or drank a beer with.
Czarek
And yes, I'm setting them up with rt286 too. There are some rt56XX but I'm unsure if rt5650 is amount them. Still got some problems with ACPI, but soon two new faces should be greeting audio CI bonfire..
Czarek
Are you using the latest upstream firmware btw? Or the one which shipped with the initial device (which could be an issue if the protocol
changed).
The firmware I'm loading is: `FW info: type 01, - version: 00.00, build 77, source commit id: 876ac6906f31a43b6772b23c7c983ce9dcb18a1`. Hashes the same as the upstream binary.
Alsa-devel mailing list Alsa-devel@alsa-project.org https://mailman.alsa-project.org/mailman/listinfo/alsa-devel
On 8/22/19 5:29 PM, Cezary Rojewski wrote:
On 2019-08-20 04:11, Jie, Yang wrote:
-----Original Message----- From: Rojewski, Cezary Sent: Tuesday, August 20, 2019 2:09 AM To: Jie, Yang yang.jie@intel.com; Jon Flatley jflat@chromium.org; Pierre- Louis Bossart pierre-louis.bossart@linux.intel.com Cc: benzh@chromium.org; alsa-devel@alsa-project.org; Jie Yang yang.jie@linux.intel.com; Ranjani Sridharan ranjani.sridharan@linux.intel.com; cujomalainey@chromium.org Subject: Re: [alsa-devel] [BUG] bdw-rt5650 DSP boot timeout
On 2019-08-19 04:33, Jie, Yang wrote:
-----Original Message----- From: Jon Flatley [mailto:jflat@chromium.org] Sent: Thursday, August 15, 2019 5:25 AM To: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Cc: Jon Flatley jflat@chromium.org; Jie, Yang yang.jie@intel.com; benzh@chromium.org; alsa-devel@alsa-project.org; Ranjani Sridharan ranjani.sridharan@linux.intel.com; cujomalainey@chromium.org; Jie Yang yang.jie@linux.intel.com Subject: Re: [alsa-devel] [BUG] bdw-rt5650 DSP boot timeout
On Wed, Aug 14, 2019 at 1:51 PM Pierre-Louis Bossart <pierre- louis.bossart@linux.intel.com> wrote:
> There seems to be an issue when suspending the ALC5650. I think the > nondeterministic behavior I was seeing just had to do with whether > or not the DSP had yet suspended. > > I reverted commit 0d2135ecadb0 ("ASoC: Intel: Work around to fix HW > D3 potential crash issue") and things started working, including > suspend/resume of the DSP. Any ideas for why this may be? I would > like to resolve this so I can finish upstreaming the bdw-rt5650 > machine driver.
Copying Keyon in case he remembers the context.
Reverting a 5yr-old commit with all sorts of clock/power-related fixes looks brave, and it's not clear why this would work with the rt5677 and not with 5650.
No idea, I was just diffing the register writes looking for sources of
discrepancy.
The Chromium OS 3.14 kernel tree that Buddy uses doesn't have this patch, so I figured what's the worst that could happen?
Hi Jon, sorry about just noticing this thread. From the dmesg log, the issue happens at runtime suspend/resume but not
in boot, am I right(you can disable runtime PM for the device to confirm that)?
My points here are:
- the commit 0d2135ecadb0 was suggested by FW team to W/A D3
potential crash issue.
- it was verified with rt286(Broadwell.c, e.g. Dell XPS) from our
side
only(and may have been checked with rt5677 by Chrome team).
- please follow sequence in broadwell.c if issue happen at boot time.
If happened at runtime PM from DSP side, we should see it with all kinds of
machine driver.
Could you performing more test and debugging to see what it real happen
there?
- we have no reason to remove the commit directly, except
correcting if
some lines are proved wrong. And, as Pierre mentioned, SOF driver is preferred, as there is no new development effort to support SST haswell/Broadwell driver here(no platform, no developer, :-( ).
Thanks, ~Keyon>
Got to disagree with the last one - no platform, no developer. We are setting up some BDW/ HSW here to join our happy SKL+ family in CI. This is because of /common cleanups which will engulf aDSP project (hsw/byt) obviously.
Yes, that's true, good to hear that you will add it to CI.
These will be tested against the exact same BAT scope as other ADSP devices. Code here looks much better, at least compared to /skylake - ain't a high threshold though.. Given how outdated all SKL+ fw binaries are (on upstream repo) it might even come down simply to fw upgrade. Most of FW peps who took part in that project are already out. Although, found one or two who are willing to help : )
I remember Pawel Piskorski and Marcin Barlik helped me from the FW side(including explaining about the S0<->S3 sequence), please contact me offline if needed, I will try to drag for some mails which I got 5 years back.
Thanks, ~Keyon
Please do not name people on official list unless you are 100% sure about their engagement in linux solutions, which for both individuals you have listed, is no longer the case. Any recommendations? - you can provide internally.
Anyway, I've contacted Marcin and once he is available, we will review the patch together. Note, that I'm a IGK dweller too, so it's highly probable whomever you had in mind I've either already met or drank a beer with.
Czarek
And yes, I'm setting them up with rt286 too. There are some rt56XX but I'm unsure if rt5650 is amount them. Still got some problems with ACPI, but soon two new faces should be greeting audio CI bonfire..
Czarek
Are you using the latest upstream firmware btw? Or the one which shipped with the initial device (which could be an issue if the protocol
changed).
The firmware I'm loading is: `FW info: type 01, - version: 00.00, build 77, source commit id: 876ac6906f31a43b6772b23c7c983ce9dcb18a1`. Hashes the same as the upstream binary.
I don't have a specified codec for testing so I tried with rt286. I was not able to reproduce this issue. Could you collect logs(dmesg) with enabled debug like below for S3 or enabled debug during build for resting reboot scenario? echo -n 'module snd* +p' | dd of=/sys/kernel/debug/dynamic_debug/control Since enabling debug decreases problem occurrence ratio please also check below change:
--- a/sound/soc/intel/haswell/sst-haswell-ipc.c +++ b/sound/soc/intel/haswell/sst-haswell-ipc.c @@ -81,7 +81,7 @@
/* IPC message timeout (msecs) */ #define IPC_TIMEOUT_MSECS 300 -#define IPC_BOOT_MSECS 200 +#define IPC_BOOT_MSECS 300
Gustaw
On Tue, Aug 27, 2019 at 5:53 AM Gustaw Lewandowski gustaw.lewandowski@linux.intel.com wrote:
On 8/22/19 5:29 PM, Cezary Rojewski wrote:
On 2019-08-20 04:11, Jie, Yang wrote:
-----Original Message----- From: Rojewski, Cezary Sent: Tuesday, August 20, 2019 2:09 AM To: Jie, Yang yang.jie@intel.com; Jon Flatley jflat@chromium.org; Pierre- Louis Bossart pierre-louis.bossart@linux.intel.com Cc: benzh@chromium.org; alsa-devel@alsa-project.org; Jie Yang yang.jie@linux.intel.com; Ranjani Sridharan ranjani.sridharan@linux.intel.com; cujomalainey@chromium.org Subject: Re: [alsa-devel] [BUG] bdw-rt5650 DSP boot timeout
On 2019-08-19 04:33, Jie, Yang wrote:
-----Original Message----- From: Jon Flatley [mailto:jflat@chromium.org] Sent: Thursday, August 15, 2019 5:25 AM To: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Cc: Jon Flatley jflat@chromium.org; Jie, Yang yang.jie@intel.com; benzh@chromium.org; alsa-devel@alsa-project.org; Ranjani Sridharan ranjani.sridharan@linux.intel.com; cujomalainey@chromium.org; Jie Yang yang.jie@linux.intel.com Subject: Re: [alsa-devel] [BUG] bdw-rt5650 DSP boot timeout
On Wed, Aug 14, 2019 at 1:51 PM Pierre-Louis Bossart <pierre- louis.bossart@linux.intel.com> wrote: > > >> There seems to be an issue when suspending the ALC5650. I think the >> nondeterministic behavior I was seeing just had to do with whether >> or not the DSP had yet suspended. >> >> I reverted commit 0d2135ecadb0 ("ASoC: Intel: Work around to fix HW >> D3 potential crash issue") and things started working, including >> suspend/resume of the DSP. Any ideas for why this may be? I would >> like to resolve this so I can finish upstreaming the bdw-rt5650 >> machine driver. > > Copying Keyon in case he remembers the context. > > Reverting a 5yr-old commit with all sorts of clock/power-related > fixes looks brave, and it's not clear why this would work with the > rt5677 and not with 5650.
No idea, I was just diffing the register writes looking for sources of
discrepancy.
The Chromium OS 3.14 kernel tree that Buddy uses doesn't have this patch, so I figured what's the worst that could happen?
Hi Jon, sorry about just noticing this thread. From the dmesg log, the issue happens at runtime suspend/resume but not
in boot, am I right(you can disable runtime PM for the device to confirm that)?
My points here are:
- the commit 0d2135ecadb0 was suggested by FW team to W/A D3
potential crash issue.
- it was verified with rt286(Broadwell.c, e.g. Dell XPS) from our
side
only(and may have been checked with rt5677 by Chrome team).
- please follow sequence in broadwell.c if issue happen at boot time.
If happened at runtime PM from DSP side, we should see it with all kinds of
machine driver.
Could you performing more test and debugging to see what it real happen
there?
- we have no reason to remove the commit directly, except
correcting if
some lines are proved wrong. And, as Pierre mentioned, SOF driver is preferred, as there is no new development effort to support SST haswell/Broadwell driver here(no platform, no developer, :-( ).
Thanks, ~Keyon>
Got to disagree with the last one - no platform, no developer. We are setting up some BDW/ HSW here to join our happy SKL+ family in CI. This is because of /common cleanups which will engulf aDSP project (hsw/byt) obviously.
Yes, that's true, good to hear that you will add it to CI.
These will be tested against the exact same BAT scope as other ADSP devices. Code here looks much better, at least compared to /skylake - ain't a high threshold though.. Given how outdated all SKL+ fw binaries are (on upstream repo) it might even come down simply to fw upgrade. Most of FW peps who took part in that project are already out. Although, found one or two who are willing to help : )
I remember Pawel Piskorski and Marcin Barlik helped me from the FW side(including explaining about the S0<->S3 sequence), please contact me offline if needed, I will try to drag for some mails which I got 5 years back.
Thanks, ~Keyon
Please do not name people on official list unless you are 100% sure about their engagement in linux solutions, which for both individuals you have listed, is no longer the case. Any recommendations? - you can provide internally.
Anyway, I've contacted Marcin and once he is available, we will review the patch together. Note, that I'm a IGK dweller too, so it's highly probable whomever you had in mind I've either already met or drank a beer with.
Czarek
And yes, I'm setting them up with rt286 too. There are some rt56XX but I'm unsure if rt5650 is amount them. Still got some problems with ACPI, but soon two new faces should be greeting audio CI bonfire..
Czarek
> > Are you using the latest upstream firmware btw? Or the one which > shipped with the initial device (which could be an issue if the > protocol changed).
The firmware I'm loading is: `FW info: type 01, - version: 00.00, build 77, source commit id: 876ac6906f31a43b6772b23c7c983ce9dcb18a1`. Hashes the same as the upstream binary.
I don't have a specified codec for testing so I tried with rt286. I was not able to reproduce this issue. Could you collect logs(dmesg) with enabled debug like below for S3 or enabled debug during build for resting reboot scenario?
Sure thing, thanks for taking a look. Here are the verbose logs for runtime suspend, followed by a failed resume invoked by a volume change in alsamixer:
[ 31.199071] haswell-pcm-audio haswell-pcm-audio: audio dsp runtime suspend [ 31.199074] System PCM: ASoC: pop wq checking: Playback status: inactive waiting: yes [ 31.213871] haswell-pcm-audio haswell-pcm-audio: Item[0] offset[48a890] - size[770e] - source[1] [ 31.221392] haswell-pcm-audio haswell-pcm-audio: Item[1] offset[491fa0] - size[3c00] - source[1] [ 31.231038] haswell-pcm-audio haswell-pcm-audio: Item[2] offset[0] - size[2821d] - source[0] [ 31.239561] haswell-pcm-audio haswell-pcm-audio: Item[3] offset[484000] - size[246c] - source[0] [ 31.248320] haswell-pcm-audio haswell-pcm-audio: Item[4] offset[486470] - size[750] - source[0] [ 31.256958] haswell-pcm-audio haswell-pcm-audio: Item[5] offset[486bc0] - size[3cc8] - source[1] [ 31.265866] haswell-pcm-audio haswell-pcm-audio: ipc: got 6 entry numbers for state 3 [ 31.273080] haswell-pcm-audio haswell-pcm-audio: DMA: src: 0xfff8a890 dest 0x7898a890 size 30480 [ 31.284532] haswell-pcm-audio haswell-pcm-audio: DMA: callback [ 31.289114] haswell-pcm-audio haswell-pcm-audio: DMA: src: 0xfff91fa0 dest 0x78991fa0 size 15360 [ 31.299841] haswell-pcm-audio haswell-pcm-audio: DMA: callback [ 31.304423] haswell-pcm-audio haswell-pcm-audio: DMA: src: 0xfff86bc0 dest 0x78986bc0 size 15560 [ 31.314922] haswell-pcm-audio haswell-pcm-audio: DMA: callback [ 31.319610] haswell-pcm-audio haswell-pcm-audio: disabled block 1:13 at offset 0x68000 [ 31.327990] haswell-pcm-audio haswell-pcm-audio: block freed 1:13 at offset 0x68000 [ 31.335168] haswell-pcm-audio haswell-pcm-audio: disabled block 1:12 at offset 0x60000 [ 31.343612] haswell-pcm-audio haswell-pcm-audio: disabled block 1:11 at offset 0x58000 [ 31.352013] haswell-pcm-audio haswell-pcm-audio: disabled block 1:10 at offset 0x50000 [ 31.359324] haswell-pcm-audio haswell-pcm-audio: block freed 1:12 at offset 0x60000 [ 31.367615] haswell-pcm-audio haswell-pcm-audio: block freed 1:11 at offset 0x58000 [ 31.374813] haswell-pcm-audio haswell-pcm-audio: block freed 1:10 at offset 0x50000 [ 31.383127] haswell-pcm-audio haswell-pcm-audio: disabled block 1:9 at offset 0x48000 [ 31.390545] haswell-pcm-audio haswell-pcm-audio: disabled block 1:8 at offset 0x40000 [ 31.398998] haswell-pcm-audio haswell-pcm-audio: disabled block 1:7 at offset 0x38000 [ 31.407451] haswell-pcm-audio haswell-pcm-audio: block freed 1:9 at offset 0x48000 [ 31.414662] haswell-pcm-audio haswell-pcm-audio: block freed 1:8 at offset 0x40000 [ 31.421798] haswell-pcm-audio haswell-pcm-audio: block freed 1:7 at offset 0x38000 [ 31.430126] haswell-pcm-audio haswell-pcm-audio: disabled block 1:6 at offset 0x30000 [ 31.437414] haswell-pcm-audio haswell-pcm-audio: block freed 1:6 at offset 0x30000 [ 31.445635] haswell-pcm-audio haswell-pcm-audio: disabled block 1:5 at offset 0x28000 [ 31.453023] haswell-pcm-audio haswell-pcm-audio: block freed 1:5 at offset 0x28000 [ 31.461213] haswell-pcm-audio haswell-pcm-audio: unloading firmware [ 31.467042] haswell-pcm-audio haswell-pcm-audio: disabled block 1:19 at offset 0x98000 [ 31.475473] haswell-pcm-audio haswell-pcm-audio: disabled block 1:18 at offset 0x90000 [ 31.483918] haswell-pcm-audio haswell-pcm-audio: disabled block 1:17 at offset 0x88000 [ 31.491349] haswell-pcm-audio haswell-pcm-audio: disabled block 1:16 at offset 0x80000 [ 31.499747] haswell-pcm-audio haswell-pcm-audio: disabled block 0:5 at offset 0xc8000 [ 31.507051] haswell-pcm-audio haswell-pcm-audio: disabled block 0:4 at offset 0xc0000 [ 31.515467] haswell-pcm-audio haswell-pcm-audio: disabled block 0:3 at offset 0xb8000 [ 31.523840] haswell-pcm-audio haswell-pcm-audio: disabled block 0:2 at offset 0xb0000 [ 31.531165] haswell-pcm-audio haswell-pcm-audio: disabled block 0:1 at offset 0xa8000 [ 31.539648] haswell-pcm-audio haswell-pcm-audio: disabled block 0:0 at offset 0xa0000 [ 31.547066] haswell-pcm-audio haswell-pcm-audio: block freed 1:19 at offset 0x98000 [ 31.555378] haswell-pcm-audio haswell-pcm-audio: block freed 1:18 at offset 0x90000 [ 31.562625] haswell-pcm-audio haswell-pcm-audio: block freed 1:17 at offset 0x88000 [ 31.570883] haswell-pcm-audio haswell-pcm-audio: block freed 1:16 at offset 0x80000 [ 31.578273] haswell-pcm-audio haswell-pcm-audio: block freed 0:5 at offset 0xc8000 [ 31.585505] haswell-pcm-audio haswell-pcm-audio: block freed 0:4 at offset 0xc0000 [ 31.593690] haswell-pcm-audio haswell-pcm-audio: block freed 0:3 at offset 0xb8000 [ 31.600950] haswell-pcm-audio haswell-pcm-audio: block freed 0:2 at offset 0xb0000 [ 31.608174] haswell-pcm-audio haswell-pcm-audio: block freed 0:1 at offset 0xa8000 [ 31.616392] haswell-pcm-audio haswell-pcm-audio: block freed 0:0 at offset 0xa0000 [ 31.623651] haswell-pcm-audio haswell-pcm-audio: disabled block 1:15 at offset 0x78000 [ 31.632020] haswell-pcm-audio haswell-pcm-audio: disabled block 1:14 at offset 0x70000 [ 31.640378] haswell-pcm-audio haswell-pcm-audio: block freed 1:15 at offset 0x78000 [ 31.647331] haswell-pcm-audio haswell-pcm-audio: block freed 1:14 at offset 0x70000 [ 31.655610] haswell-pcm-audio haswell-pcm-audio: HSW_PM dsp runtime suspend [ 31.662756] haswell-pcm-audio haswell-pcm-audio: HSW_PM dsp runtime suspend exit [ 46.620599] haswell-pcm-audio haswell-pcm-audio: loading audio DSP.... [ 46.626523] haswell-pcm-audio haswell-pcm-audio: HSW_PM dsp runtime resume [ 46.644108] haswell-pcm-audio haswell-pcm-audio: HSW_PM dsp runtime resume exit [ 46.650309] haswell-pcm-audio haswell-pcm-audio: reloading firmware [ 46.657168] haswell-pcm-audio haswell-pcm-audio: header size=0x3f8c0 modules=0x8 fmt=0xfe size=32 [ 46.665766] haswell-pcm-audio haswell-pcm-audio: new module sign 0x$SST\xe0\xf7\x03 size 0x3f7e0 blocks 0xf type 0x0 [ 46.675554] haswell-pcm-audio haswell-pcm-audio: entrypoint 0x0 [ 46.681254] haswell-pcm-audio haswell-pcm-audio: persistent 0x0 scratch 0x0 [ 46.688267] haswell-pcm-audio haswell-pcm-audio: module block 0 type 0x0 size 0x10c ==> ram ffff9e2e83000000 offset 0x0 [ 46.699535] haswell-pcm-audio haswell-pcm-audio: block request 0x10c bytes at offset 0xa0000 type 0 [ 46.708128] haswell-pcm-audio haswell-pcm-audio: block allocated 0:0 at offset 0xa0000 [ 46.716458] haswell-pcm-audio haswell-pcm-audio: enabled block 0:0 at offset 0xa0000 [ 46.724754] haswell-pcm-audio haswell-pcm-audio: DMA: src: 0x7884004c dest 0xfffa0000 size 268 [ 46.733332] haswell-pcm-audio haswell-pcm-audio: DMA: callback [ 46.739020] haswell-pcm-audio haswell-pcm-audio: module block 1 type 0x0 size 0x16c ==> ram ffff9e2e83000000 offset 0x400 [ 46.750188] haswell-pcm-audio haswell-pcm-audio: block request 0x16c bytes at offset 0xa0400 type 0 [ 46.758830] haswell-pcm-audio haswell-pcm-audio: DMA: src: 0x78840168 dest 0xfffa0400 size 364 [ 46.767417] haswell-pcm-audio haswell-pcm-audio: DMA: callback [ 46.773078] haswell-pcm-audio haswell-pcm-audio: module block 2 type 0x0 size 0x8 ==> ram ffff9e2e83000000 offset 0x584 [ 46.784201] haswell-pcm-audio haswell-pcm-audio: block request 0x8 bytes at offset 0xa0584 type 0 [ 46.793740] haswell-pcm-audio haswell-pcm-audio: DMA: src: 0x788402e4 dest 0xfffa0584 size 8 [ 46.802155] haswell-pcm-audio haswell-pcm-audio: DMA: callback [ 46.807807] haswell-pcm-audio haswell-pcm-audio: module block 3 type 0x0 size 0x4 ==> ram ffff9e2e83000000 offset 0x5bc [ 46.818967] haswell-pcm-audio haswell-pcm-audio: block request 0x4 bytes at offset 0xa05bc type 0 [ 46.827488] haswell-pcm-audio haswell-pcm-audio: DMA: src: 0x788402fc dest 0xfffa05bc size 4 [ 46.835933] haswell-pcm-audio haswell-pcm-audio: DMA: callback [ 46.841564] haswell-pcm-audio haswell-pcm-audio: module block 4 type 0x0 size 0x18 ==> ram ffff9e2e83000000 offset 0x5c0 [ 46.852731] haswell-pcm-audio haswell-pcm-audio: block request 0x18 bytes at offset 0xa05c0 type 0 [ 46.861296] haswell-pcm-audio haswell-pcm-audio: DMA: src: 0x78840310 dest 0xfffa05c0 size 24 [ 46.870764] haswell-pcm-audio haswell-pcm-audio: DMA: callback [ 46.876392] haswell-pcm-audio haswell-pcm-audio: module block 5 type 0x0 size 0x8 ==> ram ffff9e2e83000000 offset 0x5fc [ 46.887515] haswell-pcm-audio haswell-pcm-audio: block request 0x8 bytes at offset 0xa05fc type 0 [ 46.896036] haswell-pcm-audio haswell-pcm-audio: DMA: src: 0x78840338 dest 0xfffa05fc size 8 [ 46.904714] haswell-pcm-audio haswell-pcm-audio: DMA: callback [ 46.910414] haswell-pcm-audio haswell-pcm-audio: module block 6 type 0x0 size 0x8 ==> ram ffff9e2e83000000 offset 0x640 [ 46.921485] haswell-pcm-audio haswell-pcm-audio: block request 0x8 bytes at offset 0xa0640 type 0 [ 46.930032] haswell-pcm-audio haswell-pcm-audio: DMA: src: 0x78840350 dest 0xfffa0640 size 8 [ 46.938512] haswell-pcm-audio haswell-pcm-audio: DMA: callback [ 46.944220] haswell-pcm-audio haswell-pcm-audio: module block 7 type 0x0 size 0x8 ==> ram ffff9e2e83000000 offset 0x67c [ 46.955296] haswell-pcm-audio haswell-pcm-audio: block request 0x8 bytes at offset 0xa067c type 0 [ 46.963832] haswell-pcm-audio haswell-pcm-audio: DMA: src: 0x78840368 dest 0xfffa067c size 8 [ 46.972307] haswell-pcm-audio haswell-pcm-audio: DMA: callback [ 46.978997] haswell-pcm-audio haswell-pcm-audio: module block 8 type 0x0 size 0x8 ==> ram ffff9e2e83000000 offset 0x6b8 [ 46.989033] haswell-pcm-audio haswell-pcm-audio: block request 0x8 bytes at offset 0xa06b8 type 0 [ 46.998595] haswell-pcm-audio haswell-pcm-audio: DMA: src: 0x78840380 dest 0xfffa06b8 size 8 [ 47.007021] haswell-pcm-audio haswell-pcm-audio: DMA: callback [ 47.012683] haswell-pcm-audio haswell-pcm-audio: module block 9 type 0x0 size 0x8 ==> ram ffff9e2e83000000 offset 0x6f4 [ 47.023859] haswell-pcm-audio haswell-pcm-audio: block request 0x8 bytes at offset 0xa06f4 type 0 [ 47.032373] haswell-pcm-audio haswell-pcm-audio: DMA: src: 0x78840398 dest 0xfffa06f4 size 8 [ 47.040819] haswell-pcm-audio haswell-pcm-audio: DMA: callback [ 47.046459] haswell-pcm-audio haswell-pcm-audio: module block 10 type 0x0 size 0x8 ==> ram ffff9e2e83000000 offset 0x730 [ 47.057604] haswell-pcm-audio haswell-pcm-audio: block request 0x8 bytes at offset 0xa0730 type 0 [ 47.067175] haswell-pcm-audio haswell-pcm-audio: DMA: src: 0x788403b0 dest 0xfffa0730 size 8 [ 47.075556] haswell-pcm-audio haswell-pcm-audio: DMA: callback [ 47.081219] haswell-pcm-audio haswell-pcm-audio: module block 11 type 0x0 size 0x4 ==> ram ffff9e2e83000000 offset 0x76c [ 47.092354] haswell-pcm-audio haswell-pcm-audio: block request 0x4 bytes at offset 0xa076c type 0 [ 47.100916] haswell-pcm-audio haswell-pcm-audio: DMA: src: 0x788403c8 dest 0xfffa076c size 4 [ 47.109348] haswell-pcm-audio haswell-pcm-audio: DMA: callback [ 47.114985] haswell-pcm-audio haswell-pcm-audio: module block 12 type 0x0 size 0x27a78 ==> ram ffff9e2e83000000 offset 0x7a8 [ 47.126184] haswell-pcm-audio haswell-pcm-audio: block request 0x27a78 bytes at offset 0xa07a8 type 0 [ 47.135920] haswell-pcm-audio haswell-pcm-audio: block allocated 0:1 at offset 0xa8000 [ 47.144179] haswell-pcm-audio haswell-pcm-audio: block allocated 0:2 at offset 0xb0000 [ 47.151399] haswell-pcm-audio haswell-pcm-audio: block allocated 0:3 at offset 0xb8000 [ 47.159674] haswell-pcm-audio haswell-pcm-audio: block allocated 0:4 at offset 0xc0000 [ 47.168081] haswell-pcm-audio haswell-pcm-audio: block allocated 0:5 at offset 0xc8000 [ 47.175318] haswell-pcm-audio haswell-pcm-audio: enabled block 0:5 at offset 0xc8000 [ 47.183609] haswell-pcm-audio haswell-pcm-audio: enabled block 0:4 at offset 0xc0000 [ 47.191876] haswell-pcm-audio haswell-pcm-audio: enabled block 0:3 at offset 0xb8000 [ 47.199170] haswell-pcm-audio haswell-pcm-audio: enabled block 0:2 at offset 0xb0000 [ 47.207478] haswell-pcm-audio haswell-pcm-audio: enabled block 0:1 at offset 0xa8000 [ 47.214755] haswell-pcm-audio haswell-pcm-audio: DMA: src: 0x788403dc dest 0xfffa07a8 size 162424 [ 47.230081] haswell-pcm-audio haswell-pcm-audio: DMA: callback [ 47.234656] haswell-pcm-audio haswell-pcm-audio: module block 13 type 0x1 size 0x16ba8 ==> ram ffff9e2e83000000 offset 0x84000 [ 47.245946] haswell-pcm-audio haswell-pcm-audio: block request 0x16ba8 bytes at offset 0x84000 type 1 [ 47.255645] haswell-pcm-audio haswell-pcm-audio: block allocated 1:17 at offset 0x88000 [ 47.263956] haswell-pcm-audio haswell-pcm-audio: block allocated 1:18 at offset 0x90000 [ 47.272197] haswell-pcm-audio haswell-pcm-audio: block allocated 1:19 at offset 0x98000 [ 47.279482] haswell-pcm-audio haswell-pcm-audio: enabled block 1:19 at offset 0x98000 [ 47.287816] haswell-pcm-audio haswell-pcm-audio: enabled block 1:18 at offset 0x90000 [ 47.296113] haswell-pcm-audio haswell-pcm-audio: enabled block 1:17 at offset 0x88000 [ 47.303481] haswell-pcm-audio haswell-pcm-audio: enabled block 1:16 at offset 0x80000 [ 47.311817] haswell-pcm-audio haswell-pcm-audio: DMA: src: 0x78867e64 dest 0xfff84000 size 93096 [ 47.323545] haswell-pcm-audio haswell-pcm-audio: DMA: callback [ 47.328119] haswell-pcm-audio haswell-pcm-audio: module block 14 type 0x1 size 0xe00 ==> ram ffff9e2e83000000 offset 0x9abb0 [ 47.339397] haswell-pcm-audio haswell-pcm-audio: block request 0xe00 bytes at offset 0x9abb0 type 1 [ 47.349018] haswell-pcm-audio haswell-pcm-audio: DMA: src: 0x7887ea1c dest 0xfff9abb0 size 3584 [ 47.357597] haswell-pcm-audio haswell-pcm-audio: DMA: callback [ 47.363255] haswell-pcm-audio haswell-pcm-audio: new module sign 0x$SST size 0x0 blocks 0x0 type 0xd [ 47.372935] haswell-pcm-audio haswell-pcm-audio: entrypoint 0x0 [ 47.378643] haswell-pcm-audio haswell-pcm-audio: persistent 0x16800 scratch 0xdc00 [ 47.385780] haswell-pcm-audio haswell-pcm-audio: new module sign 0x$SST size 0x0 blocks 0x0 type 0xb [ 47.395420] haswell-pcm-audio haswell-pcm-audio: entrypoint 0x0 [ 47.401135] haswell-pcm-audio haswell-pcm-audio: persistent 0x3800 scratch 0x0 [ 47.409266] haswell-pcm-audio haswell-pcm-audio: new module sign 0x$SST size 0x0 blocks 0x0 type 0xa [ 47.417922] haswell-pcm-audio haswell-pcm-audio: entrypoint 0x0 [ 47.423617] haswell-pcm-audio haswell-pcm-audio: persistent 0x4000 scratch 0x0 [ 47.431742] haswell-pcm-audio haswell-pcm-audio: new module sign 0x$SST size 0x0 blocks 0x0 type 0xc [ 47.440366] haswell-pcm-audio haswell-pcm-audio: entrypoint 0x0 [ 47.447127] haswell-pcm-audio haswell-pcm-audio: persistent 0x3000 scratch 0x0 [ 47.454274] haswell-pcm-audio haswell-pcm-audio: module 12 scratch req 0x0 bytes [ 47.461376] haswell-pcm-audio haswell-pcm-audio: module 10 scratch req 0x0 bytes [ 47.468433] haswell-pcm-audio haswell-pcm-audio: module 11 scratch req 0x0 bytes [ 47.476576] haswell-pcm-audio haswell-pcm-audio: module 13 scratch req 0xdc00 bytes [ 47.483771] haswell-pcm-audio haswell-pcm-audio: module 0 scratch req 0x0 bytes [ 47.490854] haswell-pcm-audio haswell-pcm-audio: scratch buffer required is 0xdc00 bytes [ 47.499248] haswell-pcm-audio haswell-pcm-audio: allocating scratch blocks [ 47.506228] haswell-pcm-audio haswell-pcm-audio: block request 0xdc00 bytes type 1 at 0xffff0a10 [ 47.515784] haswell-pcm-audio haswell-pcm-audio: block allocated 1:15 at offset 0x78000 [ 47.523046] haswell-pcm-audio haswell-pcm-audio: enabled block 1:15 at offset 0x78000 [ 47.531470] haswell-pcm-audio haswell-pcm-audio: enabled block 1:14 at offset 0x70000 [ 47.539814] haswell-pcm-audio haswell-pcm-audio: persistent fixed block request 0x3800 bytes type 1 offset 0x68000 [ 47.549754] haswell-pcm-audio haswell-pcm-audio: block allocated 1:13 at offset 0x68000 [ 47.558081] haswell-pcm-audio haswell-pcm-audio: enabled block 1:13 at offset 0x68000 [ 47.565419] haswell-pcm-audio haswell-pcm-audio: runtime id 11 created for module 11 [ 47.573707] haswell-pcm-audio haswell-pcm-audio: persistent fixed block request 0x16800 bytes type 1 offset 0x50000 [ 47.583680] haswell-pcm-audio haswell-pcm-audio: block allocated 1:11 at offset 0x58000 [ 47.591932] haswell-pcm-audio haswell-pcm-audio: block allocated 1:12 at offset 0x60000 [ 47.600249] haswell-pcm-audio haswell-pcm-audio: enabled block 1:12 at offset 0x60000 [ 47.607598] haswell-pcm-audio haswell-pcm-audio: enabled block 1:11 at offset 0x58000 [ 47.615918] haswell-pcm-audio haswell-pcm-audio: enabled block 1:10 at offset 0x50000 [ 47.624385] haswell-pcm-audio haswell-pcm-audio: runtime id 13 created for module 13 [ 47.631604] haswell-pcm-audio haswell-pcm-audio: persistent fixed block request 0x16800 bytes type 1 offset 0x38000 [ 47.642587] haswell-pcm-audio haswell-pcm-audio: block allocated 1:8 at offset 0x40000 [ 47.649916] haswell-pcm-audio haswell-pcm-audio: block allocated 1:9 at offset 0x48000 [ 47.658208] haswell-pcm-audio haswell-pcm-audio: enabled block 1:9 at offset 0x48000 [ 47.666537] haswell-pcm-audio haswell-pcm-audio: enabled block 1:8 at offset 0x40000 [ 47.673771] haswell-pcm-audio haswell-pcm-audio: enabled block 1:7 at offset 0x38000 [ 47.682049] haswell-pcm-audio haswell-pcm-audio: runtime id 13 created for module 13 [ 47.689239] haswell-pcm-audio haswell-pcm-audio: persistent fixed block request 0x3000 bytes type 1 offset 0x30000 [ 47.700471] haswell-pcm-audio haswell-pcm-audio: block allocated 1:6 at offset 0x30000 [ 47.707726] haswell-pcm-audio haswell-pcm-audio: enabled block 1:6 at offset 0x30000 [ 47.716058] haswell-pcm-audio haswell-pcm-audio: runtime id 12 created for module 12 [ 47.723207] haswell-pcm-audio haswell-pcm-audio: persistent fixed block request 0x4000 bytes type 1 offset 0x28000 [ 47.734246] haswell-pcm-audio haswell-pcm-audio: block allocated 1:5 at offset 0x28000 [ 47.741490] haswell-pcm-audio haswell-pcm-audio: enabled block 1:5 at offset 0x28000 [ 47.749799] haswell-pcm-audio haswell-pcm-audio: runtime id 10 created for module 10 [ 47.758050] haswell-pcm-audio haswell-pcm-audio: audio dsp runtime resume [ 47.763959] haswell-pcm-audio haswell-pcm-audio: restoring audio DSP.... [ 47.770944] haswell-pcm-audio haswell-pcm-audio: DMA: src: 0x7898a890 dest 0xfff8a890 size 30480 [ 47.781581] haswell-pcm-audio haswell-pcm-audio: DMA: callback [ 47.786154] haswell-pcm-audio haswell-pcm-audio: DMA: src: 0x78991fa0 dest 0xfff91fa0 size 15360 [ 47.796266] haswell-pcm-audio haswell-pcm-audio: DMA: callback [ 47.800849] haswell-pcm-audio haswell-pcm-audio: DMA: src: 0x78986bc0 dest 0xfff86bc0 size 15560 [ 47.810971] haswell-pcm-audio haswell-pcm-audio: DMA: callback [ 48.119068] haswell-pcm-audio haswell-pcm-audio: error: audio DSP boot timeout IPCD 0x0 IPCX 0x0 [ 49.535060] haswell-pcm-audio haswell-pcm-audio: ipc: --message timeout-- ipcx 0x86371000 isr 0x00000000 ipcd 0x00000000 imrx 0x7fff0000 [ 49.546866] haswell-pcm-audio haswell-pcm-audio: error: set mixer volume failed [ 49.553979] haswell-pcm-audio haswell-pcm-audio: ipc_tx_msgs dsp busy [ 49.855054] haswell-pcm-audio haswell-pcm-audio: ipc: --message timeout-- ipcx 0x86371000 isr 0x00000000 ipcd 0x00000000 imrx 0x7fff0000 [ 49.868456] haswell-pcm-audio haswell-pcm-audio: error: set mixer volume failed [ 49.874794] haswell-pcm-audio haswell-pcm-audio: ipc_tx_msgs dsp busy [ 50.183057] haswell-pcm-audio haswell-pcm-audio: ipc: --message timeout-- ipcx 0x86371000 isr 0x00000000 ipcd 0x00000000 imrx 0x7fff0000 [ 50.194886] haswell-pcm-audio haswell-pcm-audio: error: set mixer volume failed
echo -n 'module snd* +p' | dd of=/sys/kernel/debug/dynamic_debug/control Since enabling debug decreases problem occurrence ratio please also check below change:
--- a/sound/soc/intel/haswell/sst-haswell-ipc.c +++ b/sound/soc/intel/haswell/sst-haswell-ipc.c @@ -81,7 +81,7 @@
/* IPC message timeout (msecs) */ #define IPC_TIMEOUT_MSECS 300 -#define IPC_BOOT_MSECS 200 +#define IPC_BOOT_MSECS 300
Gustaw
participants (7)
-
Cezary Rojewski
-
Curtis Malainey
-
Gustaw Lewandowski
-
Jie, Yang
-
Jon Flatley
-
Pierre-Louis Bossart
-
Ranjani Sridharan