On 2019-08-27 19:18, Pierre-Louis Bossart wrote:
On 8/27/19 10:08 AM, Cezary Rojewski wrote:
On 2019-08-27 17:00, Pierre-Louis Bossart wrote:
>>>>> On the second thought what if instead of duplicating kernel
>>>>> code, binaries would be duplicated? I.e. rather than
>>>>> targeting /intel/dsp_fw_cnl.bin, _new_ /skylake would be
>>>>> expecting /intel/dsp_fw_cnl_release.bin? Same with topology
>>>>> binaries. In such case, we "only" need to figure out how to
>>>>> propagate new files to Linux distros so whenever someone
>>>>> updates their kernel, new binaries are already present in
>>>>> their /lib/firmware.
>>>>>
>>>>> If such option is valid, we can postpone /skylake upgrade
>>>>> till 5.4 merging window closes and the patches (rough
>>>>> estimation is 150) would descend upon alsa-devel in time
>>>>> between 5.4 and 5.5.
>>>>
>>>> If the driver and FW update will be within the same kernel
>>>> release then IMHO there should be no compatibility problem
>>>> between those two components, right? This way kernel users
>>>> willing to stick to old FW can stay on older kernel version
>>>> while others can update and receive all the latest FW
>>>> functionality that was developed and enabled.
>>>
>>> I am not comfortable with precluding a kernel update because of
>>> a single firmware file. There are all sort of reasons for
>>> updating a kernel, security, sideband attacks and Android CDD
>>> compatibility being the most obvious ones.
>>>
The single firmware file will not be a blocker, as the driver included in the updated kernel will support it. All it takes is a little effort to regenerate your custom topology for the new firmware target. The entire operation should not be a problem, as there are dedicated utilities like FDK to do that.
The issue is the same whether it's a topology file or a firmware file. The ideal situation is that when the kernel is updated it handles both in backwards compatible ways.
If to deal with a new firmware file you have to regenerate a new topology, you are in a different model altogether.
Your statement, Pierre, suggests that everyone should avoid any non-critical functional changes in the kernel, because they would be problematic for others who switch from an older kernel version.
All I said was that you cannot assume that people who are using an old firmware/driver will remain on an old kernel.
Mark made an initial proposal to essentially freeze the current solution, which would make it possible to update the kernel but keep the same skylake driver in legacy/maintenance mode only, and a 'new' option that would rely on an updated distribution of firmware/driver. I did not get the counter proposal from Cezary at all.
Ain't my previous message:
On the second thought what if instead of duplicating kernel code, binaries would be duplicated? I.e. rather than targeting /intel/dsp_fw_cnl.bin, _new_ /skylake would be expecting /intel/dsp_fw_cnl_release.bin? Same with topology binaries. In such case, we "only" need to figure out how to propagate new files to Linux distros so whenever someone updates their kernel, new binaries are already present in their /lib/firmware.
If such option is valid, we can postpone /skylake upgrade till 5.4 merging window closes and the patches (rough estimation is 150) would descend upon alsa-devel in time between 5.4 and 5.5.
a counter proposal?
You didn't explain how the 'duplicated binaries' would be selected. And 'instead of' means you suggested an alternative to Mark's proposal.
What I have in mind:
We leave the old stuff as is, e.g.:
/lib/firmware/intel/dsp_fw_cnl.bin -> points to _old_ FW binaries
/lib/firmware/<PCI-ID>-INTEL-<oem_data_from_NHLT> -> points to old topology
The existing /skylake, i.e. the one before our initialization refactor (kernels <5.5?), would still point to these, and since they are not being removed from linux-firmware, nothing gets broken.
And then we "duplicate" and simply append the new ones:
/lib/firmware/intel/dsp_fw_cnl_release.bin -> points to _new_ FW
/lib/firmware/dfw_cnl_rt274 -> points to _new_ topology
Updated /skylake would simply expect the _new_ files and totally ignore the old ones i.e.: descriptors would be pointing to dsp_fw_cnl_release and dfw_cnl_rt274.
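To make the naming scheme concrete, here is a minimal sketch of the two sets of file names side by side. It is illustration only, not driver code: the `Descriptor` type, the constant names and `requested_paths()` are hypothetical, and only the file names themselves come from this thread.

```python
from dataclasses import dataclass

FIRMWARE_ROOT = "/lib/firmware"


@dataclass(frozen=True)
class Descriptor:
    """Files one driver generation asks for, relative to FIRMWARE_ROOT."""
    firmware: str
    topology: str


# Existing /skylake: the topology name is derived from the PCI ID and the
# NHLT OEM data, so it is kept here as an unfilled template.
LEGACY_CNL = Descriptor(
    firmware="intel/dsp_fw_cnl.bin",
    topology="<PCI-ID>-INTEL-<oem_data_from_NHLT>",
)

# Updated /skylake: fixed names, the old files are never consulted.
UPDATED_CNL_RT274 = Descriptor(
    firmware="intel/dsp_fw_cnl_release.bin",
    topology="dfw_cnl_rt274",
)


def requested_paths(desc):
    """Return the absolute paths a driver using `desc` would request."""
    return [f"{FIRMWARE_ROOT}/{desc.firmware}",
            f"{FIRMWARE_ROOT}/{desc.topology}"]
```

Since the two descriptors share no file, shipping the new binaries next to the old ones does not change what an older kernel loads.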
What if those new files are not present on the filesystem?
That's the hard part - we need to propagate these to the Linux distros, much like older topologies are.
5.5+ (?) /skylake would rely on those new files as if the _old_ ones never existed.
Mark suggested: "We could have a wrapper which tries to load the newer firmware and uses the fixed driver code if that's there, otherwise tries the old driver with the existing firmware paths."
Maybe that's too complicated. I had in mind some sort of opt-in Kconfig where you only use the new firmware/topology when the user/distro gives a clear hint that it's fine to use newer stuff.
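The two selection ideas (a wrapper that probes for the newer files, and an explicit opt-in) can be combined in one decision function. The sketch below only models that decision: `choose_driver_path()` and its `allow_new` switch are hypothetical names, the opt-in stands in for whatever Kconfig option might be chosen, and the file names are the ones mentioned earlier in this thread.

```python
# Files the updated driver would need, relative to /lib/firmware.
NEW_FILES = ("intel/dsp_fw_cnl_release.bin", "dfw_cnl_rt274")


def choose_driver_path(present, allow_new=True):
    """Pick which driver code path to use.

    present   -- set of file names found under /lib/firmware
    allow_new -- hypothetical opt-in (e.g. a Kconfig choice); when False,
                 the new files are ignored even if they are installed
    Returns "new" only when the opt-in is given and every new file is
    there; otherwise falls back to "legacy" with the existing paths.
    """
    if allow_new and all(name in present for name in NEW_FILES):
        return "new"
    return "legacy"
```

With `allow_new=True` this behaves like the wrapper variant; with `allow_new=False` nothing changes for existing users until the distro or user asks for it.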
In one of the emails you mentioned resources - human resources. If /skylake were to be duplicated, I fear maintenance of both would require too many resources. In that case we cannot guarantee the same level of quality and coverage as in the _new_ /skylake-only case.
I also wonder how you are going to deal with all these topology files with a name derived from the OEM/NHLT. There are just so many of them... For upstream you probably want to provide ONE per platform variant, which limits you to the number of machine drivers supported.
Precisely! That's why we are dropping these and moving to a simpler format - dfw_cnl_rt274 or something of that sort. And no, we would provide as many topologies (e.g. dfw_cnl_my_wondeful_board123) as necessary. _Old_ topologies are not even propagated for every OEM/NHLT - there are sightings such as:
https://bugzilla.kernel.org/show_bug.cgi?id=200963
or:
https://github.com/GalliumOS/galliumos-distro/issues/379
See the -ENOENT (-2) in the logs dumped. The debug-only dfw_sst.bin fallback plays a role there too, when in fact it should not even be present upstream : )
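For readers not familiar with that failure, a rough model of the lookup is sketched below. It assumes the order is "OEM/NHLT-derived name first, then dfw_sst.bin"; `load_topology()` is a hypothetical helper and the real driver code may differ in detail.

```python
import errno

# Debug-only fallback name mentioned above.
FALLBACK_TOPOLOGY = "dfw_sst.bin"


def load_topology(primary_name, present):
    """Return the topology file name that would be loaded.

    primary_name -- name derived from OEM/NHLT data (assumed lookup order)
    present      -- set of file names found under /lib/firmware
    Returns -ENOENT, as in the logs, when neither the primary name nor
    the fallback is installed.
    """
    for name in (primary_name, FALLBACK_TOPOLOGY):
        if name in present:
            return name
    return -errno.ENOENT
```

When the distro ships neither file the result is -2; when only dfw_sst.bin happens to be around, a topology that was never meant for that board gets picked up silently.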
I even saw cases where people are copying binaries (FW) from Windows machines.
"We don't break userspace" - I'm all aboard, Pierre, but our ship has too many holes already. For a short while it was possible not to notice the water pouring in through them. But now the ship is literally sinking. Userspace is broken.
An improper process led to distributed topologies being missing or not even compatible with all upstreamed FWs. These FWs also carry some bugs, as they have been deprecated for quite a while. In order to update them, the host side (driver) needs to be aligned - there is no escaping that. And so the loop closes.
We want to - rather MUST - fix that and make Intel SST work as it should for the sake of all users.