On 4/30/2020 5:38 PM, Lu, Brent wrote:
Hi, yes that seems bit weird. It is bit better as it does not modify common code, but still... Maybe going back to your original idea of replacing memcpy, try replacing it with readq? It should generate one instruction read (although it is only for x64_64, for 32 bit kernel we would still need to do something else).
Thanks, Amadeusz
Hi,
I've compared the assembly to see if there is clue. Both kernels are using 64-bit mov to read register and the only difference is optimized or not. Both implementations are looking good to me. Currently I don't have answer why slower kernel hits the problem while optimized one survived.
- Old kernel. Code is optimized and not able to reproduce the issue on this kernel.
(gdb) disas sst_shim32_read64 Dump of assembler code for function sst_shim32_read64: 0x000000000000096c <+0>: call 0x971 <sst_shim32_read64+5> => call __fentry__ 0x0000000000000971 <+5>: push rbp 0x0000000000000972 <+6>: mov rbp,rsp 0x0000000000000975 <+9>: mov eax,esi 0x0000000000000977 <+11>: mov rax,QWORD PTR [rdi+rax*1] => perform 64-bit mov 0x000000000000097b <+15>: pop rbp 0x000000000000097c <+16>: ret End of assembler dump.
Hi,
That's why I would suggest trying with readq, it should also generate one instruction read x86_64 platforms, I looked a bit more and there is fallback to generate two 32 bit reads on 32bit platforms, so my previous concern about having to write separate handling for those platforms was unneeded. So I would recommend checking using it.
diff --git a/sound/soc/intel/common/sst-dsp.c b/sound/soc/intel/common/sst-dsp.c index ec66be269b69..e96f636387d9 100644 --- a/sound/soc/intel/common/sst-dsp.c +++ b/sound/soc/intel/common/sst-dsp.c @@ -34,16 +34,13 @@ EXPORT_SYMBOL_GPL(sst_shim32_read);
void sst_shim32_write64(void __iomem *addr, u32 offset, u64 value) { - memcpy_toio(addr + offset, &value, sizeof(value)); + writeq(value, addr + offset); } EXPORT_SYMBOL_GPL(sst_shim32_write64);
u64 sst_shim32_read64(void __iomem *addr, u32 offset) { - u64 val; - - memcpy_fromio(&val, addr + offset, sizeof(val)); - return val; + return readq(addr + offset); } EXPORT_SYMBOL_GPL(sst_shim32_read64);
Thanks, Amadeusz