On Mon, Aug 19, 2019 at 11:08 AM Cezary Rojewski cezary.rojewski@intel.com wrote:
On 2019-08-19 04:33, Jie, Yang wrote:
-----Original Message-----
From: Jon Flatley [mailto:jflat@chromium.org]
Sent: Thursday, August 15, 2019 5:25 AM
To: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com
Cc: Jon Flatley jflat@chromium.org; Jie, Yang yang.jie@intel.com; benzh@chromium.org; alsa-devel@alsa-project.org; Ranjani Sridharan ranjani.sridharan@linux.intel.com; cujomalainey@chromium.org; Jie Yang yang.jie@linux.intel.com
Subject: Re: [alsa-devel] [BUG] bdw-rt5650 DSP boot timeout
On Wed, Aug 14, 2019 at 1:51 PM Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> wrote:
There seems to be an issue when suspending the ALC5650. I think the nondeterministic behavior I was seeing just had to do with whether or not the DSP had yet suspended.
I reverted commit 0d2135ecadb0 ("ASoC: Intel: Work around to fix HW D3 potential crash issue") and things started working, including suspend/resume of the DSP. Any ideas for why this may be? I would like to resolve this so I can finish upstreaming the bdw-rt5650 machine driver.
Copying Keyon in case he remembers the context.
Reverting a 5yr-old commit with all sorts of clock/power-related fixes looks brave, and it's not clear why this would work with the rt5677 and not with 5650.
No idea, I was just diffing the register writes looking for sources of discrepancy. The Chromium OS 3.14 kernel tree that Buddy uses doesn't have this patch, so I figured what's the worst that could happen?
Hi Jon, sorry for only just noticing this thread. From the dmesg log, the issue happens at runtime suspend/resume but not at boot, am I right? (You can disable runtime PM for the device to confirm that.)
From what I can tell that is correct. Disabling runtime PM seems to stabilize things. I tested this over 10 reboots. I'll kick off my stress test script overnight just to see if this is 100% consistent.
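For anyone repeating the runtime PM check, here is a minimal sketch, assuming the standard sysfs `power/control` attribute is used (writing `on` keeps the device active, `auto` allows runtime suspend again). The helper name `set_runtime_pm` and the `device_dir` argument are hypothetical; the actual sysfs directory of the DSP device is not given in this thread and has to be looked up on the machine.

```python
from pathlib import Path


def set_runtime_pm(device_dir, enabled):
    """Allow or block runtime PM for one device via sysfs.

    device_dir is the device's sysfs directory (hypothetical here; it is
    not named in the thread). "on" keeps the device powered, "auto" lets
    the kernel runtime-suspend it.
    """
    control = Path(device_dir) / "power" / "control"
    value = "auto" if enabled else "on"
    control.write_text(value + "\n")
    # Read back so the caller can confirm what the kernel accepted.
    return control.read_text().strip()
```

Calling `set_runtime_pm(device_dir, enabled=False)` as root before the stress test, and `enabled=True` afterwards, would separate boot-time failures from runtime suspend/resume failures.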
My points here are:
- commit 0d2135ecadb0 was suggested by the FW team as a workaround (W/A) for a potential D3 crash issue.
- it was verified only with rt286 (broadwell.c, e.g. Dell XPS) on our side (and may have been checked with rt5677 by the Chrome team).
- please follow the sequence in broadwell.c if the issue happens at boot time.
If it happened at runtime PM on the DSP side, we should see it with all kinds of machine drivers.
I'm not really a sound guy; I've been picking this up as I go along.
From what I've gathered it doesn't make sense to me why this is an issue on buddy, but not other bdw platforms, such as samus. If I understand correctly they both have the same DSP and use the same runtime suspend/resume code. What makes this fail with the 5650 and not the 5677 is the million dollar question.
Could you perform more testing and debugging to see what is really happening there?
Yes, I'll continue poking at this. The debugging that got me this far basically just involved placing traces on the sst_shim32_write/read functions and looking at the diff from my best working reference, which is our cros-kernel-3.14 branch. This is what led me to reverting 0d2135ecadb0, as it produced traces effectively identical to those I was seeing in 3.14.
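The trace-diffing approach described above could be sketched roughly as follows. This is only an illustration: the log line format (`sst_shim32_write offset=0x.. value=0x..`) is an assumption, since the actual trace output is not shown in this thread, and the helper names `normalize` and `diff_traces` are hypothetical. The idea is to drop timestamps and other noise so that only the sequence of register accesses is compared.

```python
import difflib
import re

# Assumed (hypothetical) trace format, e.g.:
#   "[  12.345] sst_shim32_write offset=0x10 value=0x1"
ACCESS = re.compile(
    r"(sst_shim32_(?:read|write))\s+offset=(0x[0-9a-fA-F]+)\s+value=(0x[0-9a-fA-F]+)"
)


def normalize(lines):
    """Keep only register accesses, without timestamps or other log noise."""
    out = []
    for line in lines:
        m = ACCESS.search(line)
        if m:
            op, offset, value = m.groups()
            out.append(f"{op} {int(offset, 16):#x} {int(value, 16):#010x}")
    return out


def diff_traces(reference, candidate):
    """Return a unified diff of the two normalized access sequences."""
    return list(
        difflib.unified_diff(
            normalize(reference),
            normalize(candidate),
            fromfile="reference",
            tofile="candidate",
            lineterm="",
        )
    )
```

An empty result means both kernels issued the same reads and writes in the same order; any `-`/`+` pair points at a register access that differs between the reference and the kernel under test.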
- we have no reason to remove the commit outright, other than correcting any lines that are proved wrong. And, as Pierre mentioned, the SOF driver is preferred, as there is no new development effort to support the SST Haswell/Broadwell driver here (no platform, no developer, :-( ).
I'm not suggesting removing the commit, merely observing that reverting it seems to fix the problem.
Thanks, ~Keyon
Got to disagree with the last one - no platform, no developer. We are setting up some BDW/HSW here to join our happy SKL+ family in CI. This is because of the /common cleanups, which will obviously engulf the aDSP project (hsw/byt).
These will be tested against the exact same BAT scope as other ADSP devices. Code here looks much better, at least compared to /skylake - ain't a high threshold though.. Given how outdated all SKL+ fw binaries are (on upstream repo) it might even come down simply to a fw upgrade. Most of the FW people who took part in that project are already out. Although, I found one or two who are willing to help : )
And yes, I'm setting them up with rt286 too. There are some rt56XX but I'm unsure if rt5650 is among them. Still got some problems with ACPI, but soon two new faces should be greeting the audio CI bonfire..
Czarek
I can continue to work at this to see if I can make any more headway. Unfortunately without a solid intuitive understanding of the system, or insight into the DSP, I'm limited to looking at traces and git history for the most part.
Curtis: Do you think it makes sense to poke at samus and see if there are any differences in the suspend/resume process, or are they pretty much guaranteed to be identical?
Thanks for all your help on this.
- Jon
Are you using the latest upstream firmware btw? Or the one which shipped with the initial device (which could be an issue if the protocol changed).
The firmware I'm loading is: `FW info: type 01, - version: 00.00, build 77, source commit id: 876ac6906f31a43b6772b23c7c983ce9dcb18a1`. Hashes the same as the upstream binary.
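A check like "hashes the same as the upstream binary" can be reproduced with a short script. This is a sketch only: the thread does not say which hash was used, so SHA-256 is an assumption, and the function names and both file paths are hypothetical placeholders supplied by the caller.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_firmware(local_path, upstream_path):
    """True if the two firmware files are byte-for-byte identical by hash."""
    return sha256_of(local_path) == sha256_of(upstream_path)
```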