On 27. 02. 20 at 13:45, Kai Vehmanen wrote:
Hi,
On Fri, 21 Feb 2020, Jaroslav Kysela wrote:
On 21. 02. 20 at 20:23, Pierre-Louis Bossart wrote:
Ok, it's really weird that we cannot determine the firmware/driver combination which causes the DSP lock. I would propose to block the load of older firmware <1.4 (or 1.4.2, which has the correct firmware version!)
[...]
It makes sense. At least a hint that something may be wrong. I believe that it might help to identify issues.
I've continued testing today on multiple machines using the official (old) v1.3 binaries [1] we have, and I cannot reproduce the DSP error that you, Jaroslav, have seen. On all of my machines, the latest sound tree with the old v1.3 FW works just fine. This matches earlier reports on SOF issue #2102.
I also looked back at the history of the kernel trigger order change, and it's a kernel-only change, made to fix issues with certain pause-resume cases. It's not a change that was done in tandem with some specific FW-side change, so I can't find a solid reason why the DMA triggering order should be changed for old FW versions. One FW patch that was done at the time (and referred to in the discussions) is:
dai: prevent dai_config while in active state https://github.com/thesofproject/sof/commit/c623e9246325dbee615a5cad0c8e4b0c...
... but this does not change the logic; it just avoids a DSP crash by returning an error (the IPC and the use case will still fail).
So although I cannot explain why you, Jaroslav, see the crash with the old v1.3 firmware on the Lenovo device, I would still recommend leaving the current kernel code as is and not adding any warnings. To summarize my rationale:
- we have a known error in the SOF driver logic, which was fixed in 5.5 and is now backported to 5.4
- if the above driver error was hit, very old FW versions would end up with a DSP crash instead of returning a proper error
- for many systems, the new 5.5 kernel and the old 1.3 FW work ok with no notable issues
- we have at least one system where the new kernel and the old FW do not work -> on these machines, an upgrade to the v1.4.2 firmware helps
Unless we get more reports, I'd lean towards not adding any new warnings. If someone hits a case similar to the one you did, Jaroslav, we can see this from dmesg based on the FW version and the DSP oops dump (and/or the reported IPC error). And the recommended action is to upgrade the FW to 1.4.2.
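To make the triage rule above concrete, here is a minimal sketch of the version comparison, not anything that exists in the kernel or in SOF. It assumes the firmware version has already been read out by hand as a dotted string such as "1.3.2"; the names MIN_RECOMMENDED, parse_fw_version and needs_fw_upgrade are hypothetical, and the exact format of the version line in dmesg is not assumed here.

```python
import re

# Assumption: 1.4.2 is the firmware version recommended in this thread.
MIN_RECOMMENDED = (1, 4, 2)


def parse_fw_version(text):
    """Return (major, minor, micro) from a dotted version string, or None.

    A missing micro component (e.g. "v1.3") is treated as 0.
    """
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        return None
    major, minor, micro = match.groups()
    return (int(major), int(minor), int(micro or 0))


def needs_fw_upgrade(text):
    """True if the reported version is older than the recommended one."""
    version = parse_fw_version(text)
    # Unparseable input gives no basis for a recommendation, so return False.
    return version is not None and version < MIN_RECOMMENDED


if __name__ == "__main__":
    for reported in ("v1.3", "1.3.2", "1.4.2"):
        print(reported, needs_fw_upgrade(reported))
```

Tuple comparison keeps the check simple: (1, 3, 2) sorts before (1, 4, 2), so both v1.3 and 1.3.2 would be flagged for an upgrade while 1.4.2 would not.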
How about it?
Ok, it seems that it's really a combination of the driver code and the 1.3.2 firmware. I tested the 5.5 kernel with 1.3.2 again and it's fine on this platform.
Let's keep this without change.
Thank you for your tests.
Jaroslav
[1] https://github.com/thesofproject/sof/releases
Br, Kai