On 2019-08-23 18:26, Pierre-Louis Bossart wrote:
On 8/23/19 5:43 AM, Cezary Rojewski wrote:
On 2019-08-23 12:26, Mark Brown wrote:
On Fri, Aug 23, 2019 at 10:29:59AM +0200, Cezary Rojewski wrote:
On 2019-08-22 22:55, Pierre-Louis Bossart wrote:
On 8/22/19 2:03 PM, Cezary Rojewski wrote:
The code seen here is part of the new Skylake foundation, located at the very bottom of our internal mainline. Said mainline is tested constantly on at least a single platform from every cAVS bucket (description below). This week, BDW was added to the CI family and was essential in validating the legacy changes. A Baytrail platform is still missing; the changes for BYT directly mirror HSW/BDW but are untested due to the current lack of hardware. Boards engaged in testing: rt286, rt298, rt274.
This is not enough, sorry. These are RVPs, and you need to check with the commercial devices supported in sound/soc/intel/boards/.
What does the machine board have to do with the FW and host side? If it does, we had better notify the owner so he can fix the codec's code at once. All boards MUST follow the recommended protocol, whether HDA or I2S, when communicating with /skylake. This is hardware IP we are talking about. I could just as well test all platforms with an AudioPrecision and say: ship it.
The machine driver defines how many links are used, and in what mode for the older cases where the topology is not used. You have configurations with very complicated links, e.g. with amplifiers in TDM mode plus IV feedback that will stress the firmware in ways that regular RVPs don't. Same for the case where the SSP clock is turned on at the request of the machine drivers. That's another case that can't be tested on RVPs.
I am not saying you need to test with every single commercial device, but that testing on RVPs is not a representative sample of the configurations and actual workloads.
Each and every FW coming from the main branch gets tested on both RVPs and production devices, which is done in cooperation with integration teams, PAEs and such. The Windows teams alone ensure each binary gets smashed by tens of thousands of tests each week - this is true for any release candidate; the standards are very high. Moreover, an array of platforms is engaged per target (e.g. TGL), as a single platform alone does not cut it.
So, I'd not worry about the FW being vulnerable to any scenario as long as the recommended protocol is followed.
...
DSP "commercial devices", with 99% of home audio being routed through HD-Audio legacy? I do contact representatives of "commercial devices" daily; you of all people should be aware of the fact that in almost all cases they are fed neither upstream code nor upstream binaries. For the first time in eons, sound/soc/intel/skylake code is being tested before upstreaming, yet you still defend the mistakes of the past?
System vendors don't really matter here; end users with their desktops and laptops do. If a user has a system and they for whatever reason upgrade their kernel from one upstream version to another and don't touch any other aspect of their system, the expectation is that they'll still have everything working after the upgrade. This means that if there are bugs in how things were deployed in the past, the kernel ought to try to work with those bugs.
Noted, see below comments.
Some upstream FW binaries are not compatible with the existing /skylake driver, while the changes found here (HARDWARE_CONFIG/FIRMWARE_CONFIG) make use of the firmware's ability to offload hardware specifics away from the driver. These and more are a core part of any cAVS design and are to be implemented and used by the host. This too is missing upstream on Linux.
This sounds like it might be a problem.
The problem is, HARDWARE_CONFIG/FIRMWARE_CONFIG (and more upcoming) should be a core part of the cAVS driver, implemented before any PCM-related code is added.
The SKL FW binary existing upstream is a descendant of the old spt branch, obsolete for 4-5 years now. That FW is a stub, quickly replaced by kbl, which is to be used on all cAVS 1.5 platforms.
That's well within the lifespan people will expect from a PC these days, my personal systems are mostly older than that and do fine at most things except for big builds.
It's not about age itself. It's about the fact that FW binaries from unsupported or main FW branches ended up here, and given the date these were added, it had already been recommended to use the kbl or apl_auto branches.
If I could, I'd prefer "detect and notify", as it is impossible to repair all the mistakes made in /sound/soc/intel/skylake.
Do we have a sense of how many such systems exist?
My understanding is that the SST driver is used on Skylake for Chromebooks only. For platforms defined for Windows, the cases where the DSP is used are marginal. I'd view the risk of updating the firmware for Skylake as very limited, but that's my personal opinion only. For APL/KBL it's a lot harder to track; there are industrial/embedded cases, and that's where we'd really want to trap any incompatibilities.
APL/KBL - are there any examples I should be aware of? To my knowledge we are handling all of them internally, and they have not seen an obsolete binary for quite some time. For some, the FW was updated just this week..
However, in practice there isn't any reliable way to verify the actual usability of an old FW binary against the host side, as the interface is volatile and the numbers alone don't mean much. A patch with FW binaries would not remove the old ones; it would simply add new versions to the directory.
Can you do things the other way around and positively identify firmwares that meet whatever standards you're interested in here?
The only thing that comes to mind is the following:
- during the boot-up sequence, in response to any INVALID_REQUEST or similar
coming from the FW, collapse the sequence and dump an "upgrade firmware" message
- once the boot-up sequence is completed, disregard the INVALID_REQUEST check,
as it is also a common FW response in various valid scenarios
With the request_firmware() mechanism the kernel cannot parse the file ahead of time, but don't you have version information reported by the firmware post-boot that the kernel could use to track whether the firmware is likely to work?
I wasn't lying about the FW version being unreliable. Let's say a vendor receives a quick FW drop with a new RCR.. such an eng drop may carry invalid numbers such as 0.0.0.0.. In general, I try to avoid relying on the FW version whenever possible. It can be dumped for debug reasons, true, but relied upon? Not really.
- the user removes the existing symlink from /lib/firmware/intel and creates
a new one, pointing to the updated FW binary, which should also be present in /lib/firmware/intel
That's typically handled by distributions updating the linux-firmware package. Only advanced users and developers change these symlinks.
The other point that comes to my mind is whether we are going to see dependencies between firmware and topology files? Can you use an 'old' topology with a 'newer' firmware, or is this a 3-way interoperability issue?
Precisely! A three-way tie! It's best if the FW gets updated together with the topology, as old FW may enforce different constraints on pipeline modules.
Yay, between a rock and a hard place. On one side we've got old, buggy FWs which should (more like should NOT even be here..) be updated to improve the user experience, but updating these alone won't cut it, as the host side needs to be aligned too. On the other, we want to align upstream /skylake with an actual working example, which will quickly fail if it encounters an obsolete FW binary. And if that wasn't enough, the lovely topologies come into the picture, some of which were developed behind the FDK's back, completely bypassing the deployment process.
The first thing we will do now is prioritize the topology refactor so that all initialization/load-oriented bits are visible for upstream review. By doing so, we get all the elephants in one room and can discuss how to handle them in the best fashion: a seamless transition for end users.
There aren't many options available: notify the user -or- fall back to defaults (hardcodes) in case the encountered binaries do not meet the cAVS design criteria?
Personally, I'm against all hardcodes and would simply recommend all users redirect their symlinks when they switch kernels - along with dumping a warning/error message in dmesg. Hardcodes bring problems with forward compatibility, and that's why the host should offload them away to the FW.
Czarek