Dear Takashi,
On 04/23/18 14:33, Takashi Iwai wrote:
On Mon, 23 Apr 2018 14:30:36 +0200, Paul Menzel wrote:
On 04/23/18 14:21, Takashi Iwai wrote:
On Mon, 23 Apr 2018 14:05:52 +0200, Paul Menzel wrote:
From: Paul Menzel pmenzel@molgen.mpg.de Date: Sat, 24 Mar 2018 09:28:43 +0100
On an ASRock E350M1, with Linux 4.17-rc1 according to `initcall_debug` calling `azx_driver_init` takes sometimes more than a few milliseconds, and up to 200 ms.
[ 2.892598] calling azx_driver_init+0x0/0xfe4 [snd_hda_intel] @ 218 [ 2.943002] initcall azx_driver_init+0x0/0xfe4 [snd_hda_intel] returned 0 after 49195 usecs
Trying to execute the Linux kernel in less than 500 ms, this is quite a hold-up, and therefore request the probe from an async task.
With this change, the test shows, that the function returns earlier.
[ 3.254800] calling azx_driver_init+0x0/0xfe4 [snd_hda_intel] @ 227 [ 3.254887] initcall azx_driver_init+0x0/0xfe4 [snd_hda_intel] returned 0 after 66 usecs
The same behavior is visible on a Dell OptiPlex 7010. The longer times seem to happen, when the module *e1000e* is probed during the same time.
Signed-off-by: Paul Menzel pmenzel@molgen.mpg.de
What actually took so long? Could you analyze further instead of blindly putting the flag?
Well, I am not sure. Could you please give me hints, how to debug this further? Is there some debug flag?
Usually perf would help, but even a simple printk() should suffice to see what's going on there :)
Please find the messages for a 23 ms run below, and the debug patch attached.
[ 2.996238] calling azx_driver_init+0x0/0xfe4 [snd_hda_intel] @ 214 [ 3.009838] calling rtl8169_pci_driver_init+0x0/0x1000 [r8169] @ 217 [ 3.009904] r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded [ 3.010721] random: get_random_bytes called from ipv6_regen_rndid+0x2c/0x100 with crng_init=1 [ 3.011283] r8169 0000:03:00.0 eth0: RTL8168e/8111e at 0x(ptrval), bc:5f:f4:c8:d3:98, XID 0c200000 IRQ 26 [ 3.011289] r8169 0000:03:00.0 eth0: jumbo features [frames: 9200 bytes, tx checksumming: ko] [ 3.013876] initcall rtl8169_pci_driver_init+0x0/0x1000 [r8169] returned 0 after 3917 usecs [ 3.031754] calling pcspkr_platform_driver_init+0x0/0x1000 [pcspkr] @ 221 [ 3.031904] input: PC Speaker as /devices/platform/pcspkr/input/input4 [ 3.032288] initcall pcspkr_platform_driver_init+0x0/0x1000 [pcspkr] returned 0 after 508 usecs [ 3.034795] calling psmouse_init+0x0/0x7c [psmouse] @ 220 [ 3.034903] initcall psmouse_init+0x0/0x7c [psmouse] returned 0 after 87 usecs [ 3.043051] random: get_random_u32 called from cache_random_seq_create+0xa3/0x1f0 with crng_init=1 [ 3.043154] random: get_random_u32 called from cache_alloc_refill+0x5bb/0x13d0 with crng_init=1 [ 3.043187] random: get_random_u32 called from cache_random_seq_create+0xa3/0x1f0 with crng_init=1 [ 3.043198] random: get_random_u32 called from cache_alloc_refill+0x5bb/0x13d0 with crng_init=1 [ 3.043229] random: get_random_u32 called from cache_random_seq_create+0xa3/0x1f0 with crng_init=1 [ 3.043265] random: get_random_u32 called from cache_alloc_refill+0x5bb/0x13d0 with crng_init=1 [ 3.043429] random: get_random_u32 called from cache_random_seq_create+0xa3/0x1f0 with crng_init=1 [ 3.043479] random: get_random_u32 called from cache_alloc_refill+0x5bb/0x13d0 with crng_init=1 [ 3.043749] random: get_random_bytes called from key_alloc+0x1fc/0x5e0 with crng_init=1 [ 3.043973] random: get_random_u32 called from cache_random_seq_create+0xa3/0x1f0 with crng_init=1 [ 3.045483] random: get_random_u32 called from arch_rnd.part.2+0x18/0x40 with crng_init=1 [ 3.045501] random: get_random_u32 called from load_elf_binary+0x76a/0x1d20 with crng_init=1 [ 3.045507] random: get_random_u32 called from arch_align_stack+0x45/0x70 with crng_init=1 [ 3.045528] random: get_random_u32 called from arch_rnd.part.2+0x18/0x40 with crng_init=1 [ 3.045558] random: get_random_u32 called from cache_alloc_refill+0x5bb/0x13d0 with crng_init=1 [ 3.045771] random: get_random_u32 called from cache_random_seq_create+0xa3/0x1f0 with crng_init=1 [ 3.046127] random: get_random_u32 called from cache_alloc_refill+0x5bb/0x13d0 with crng_init=1 [ 3.046197] calling pcsp_init+0x0/0x1000 [snd_pcsp] @ 221 [ 3.046206] Error: Driver 'pcspkr' is already registered, aborting... [ 3.046337] initcall pcsp_init+0x0/0x1000 [snd_pcsp] returned -16 after 128 usecs [ 3.046348] random: get_random_u32 called from arch_align_stack+0x45/0x70 with crng_init=1 [ 3.046363] random: get_random_bytes called from load_elf_binary+0xaf8/0x1d20 with crng_init=1 [ 3.046383] random: get_random_u32 called from randomize_page+0x42/0x57 with crng_init=1 [ 3.047312] random: get_random_u32 called from cache_alloc_refill+0x5bb/0x13d0 with crng_init=1 [ 3.048151] random: get_random_u32 called from cache_random_seq_create+0xa3/0x1f0 with crng_init=1 [ 3.048638] random: get_random_u32 called from cache_alloc_refill+0x5bb/0x13d0 with crng_init=1 [ 3.054325] random: get_random_u32 called from copy_process.part.33+0x186/0x2100 with crng_init=1 [ 3.054371] random: get_random_u32 called from cache_alloc_refill+0x5bb/0x13d0 with crng_init=1 [ 3.067344] usb usb3: New USB device found, idVendor=1d6b, idProduct=0001, bcdDevice= 4.17 [ 3.067350] usb usb3: New USB device strings: Mfr=3, Product=2, SerialNumber=1 [ 3.067353] usb usb3: Product: OHCI PCI host controller [ 3.067356] usb usb3: Manufacturer: Linux 4.17.0-rc2+ ohci_hcd [ 3.067359] usb usb3: SerialNumber: 0000:00:12.0 [ 3.067850] random: get_random_u32 called from cache_random_seq_create+0xa3/0x1f0 with crng_init=1 [ 3.067864] hub 3-0:1.0: USB hub found [ 3.067960] hub 3-0:1.0: 5 ports detected [ 3.068141] random: get_random_u32 called from cache_alloc_refill+0x5bb/0x13d0 with crng_init=1 [ 3.068618] 1 [ 3.068644] 2 [ 3.068953] 3 [ 3.068954] 4 [ 3.068955] 5 [ 3.068959] 6 [ 3.068960] 7 [ 3.068966] 8 [ 3.068967] 9 [ 3.069204] 1 [ 3.069217] 2 [ 3.069565] 3 [ 3.069566] 4 [ 3.069567] 5 [ 3.069568] 6 [ 3.069569] 7 [ 3.069581] 8 [ 3.069583] 9 [ 3.069774] random: get_random_u32 called from copy_process.part.33+0x186/0x2100 with crng_init=1 [ 3.069924] initcall azx_driver_init+0x0/0xfe4 [snd_hda_intel] returned 0 after 23153 usecs
Unfortunately, it looks like the time stamps of the Linux kernel and the initcall_debug time do not match. From the Linux time stamps it should be 73 ms.
I am only aware of the Ftrace framework, but in my experience it also skews the timings quite a bit, so might not be the best choice.
We know that there are some cases where the codec / controller communication stalls on the recent Coffee Lake or such platforms. But quite not sure how it happens.
Moving the stuff into async just moves something ugly, and it's no fix, per se, if such a long delay itself is unexpected.
Let’s hope, we can fix it.
Kind regards,
Paul