lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d1fdbf70-018e-1d35-ae21-a3c8a29e7aa2@molgen.mpg.de>
Date:   Tue, 24 Apr 2018 13:59:58 +0200
From:   Paul Menzel <pmenzel+alsa-devel@...gen.mpg.de>
To:     Takashi Iwai <tiwai@...e.de>
Cc:     Jaroslav Kysela <perex@...ex.cz>, alsa-devel@...a-project.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] ALSA: hda: Request driver probe from an async task

Dear Takashi,


On 04/23/18 14:33, Takashi Iwai wrote:
> On Mon, 23 Apr 2018 14:30:36 +0200, Paul Menzel wrote:

>> On 04/23/18 14:21, Takashi Iwai wrote:
>>> On Mon, 23 Apr 2018 14:05:52 +0200, Paul Menzel wrote:
>>>>
>>>> From: Paul Menzel <pmenzel@...gen.mpg.de>
>>>> Date: Sat, 24 Mar 2018 09:28:43 +0100
>>>>
>>>> On an ASRock E350M1, with Linux 4.17-rc1 according to `initcall_debug`
>>>> calling `azx_driver_init` takes sometimes more than a few milliseconds,
>>>> and up to 200 ms.
>>>>
>>>> ```
>>>> [    2.892598] calling  azx_driver_init+0x0/0xfe4 [snd_hda_intel] @ 218
>>>> [    2.943002] initcall azx_driver_init+0x0/0xfe4 [snd_hda_intel]
>>>> returned 0 after 49195 usecs
>>>> ```
>>>>
>>>> Trying to execute the Linux kernel in less than 500 ms, this is quite a
>>>> hold-up, and therefore request the probe from an async task.
>>>>
>>>> With this change, the test shows, that the function returns earlier.
>>>>
>>>> ```
>>>> [    3.254800] calling  azx_driver_init+0x0/0xfe4 [snd_hda_intel] @ 227
>>>> [    3.254887] initcall azx_driver_init+0x0/0xfe4 [snd_hda_intel]
>>>> returned 0 after 66 usecs
>>>> ```
>>>>
>>>> The same behavior is visible on a Dell OptiPlex 7010. The longer times
>>>> seem to happen, when the module *e1000e* is probed during the same time.
>>>>
>>>> Signed-off-by: Paul Menzel <pmenzel@...gen.mpg.de>
>>>
>>> What actually took so long?  Could you analyze further instead of
>>> blindly putting the flag?
>>
>> Well, I am not sure. Could you please give me hints, how to debug this
>> further? Is there some debug flag?
> 
> Usually perf would help, but even a simple printk() should suffice to
> see what's going on there :)

Please find the messages for a 23 ms run below, and the debug patch 
attached.

> [    2.996238] calling  azx_driver_init+0x0/0xfe4 [snd_hda_intel] @ 214
> [    3.009838] calling  rtl8169_pci_driver_init+0x0/0x1000 [r8169] @ 217
> [    3.009904] r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
> [    3.010721] random: get_random_bytes called from ipv6_regen_rndid+0x2c/0x100 with crng_init=1
> [    3.011283] r8169 0000:03:00.0 eth0: RTL8168e/8111e at 0x(ptrval), bc:5f:f4:c8:d3:98, XID 0c200000 IRQ 26
> [    3.011289] r8169 0000:03:00.0 eth0: jumbo features [frames: 9200 bytes, tx checksumming: ko]
> [    3.013876] initcall rtl8169_pci_driver_init+0x0/0x1000 [r8169] returned 0 after 3917 usecs
> [    3.031754] calling  pcspkr_platform_driver_init+0x0/0x1000 [pcspkr] @ 221
> [    3.031904] input: PC Speaker as /devices/platform/pcspkr/input/input4
> [    3.032288] initcall pcspkr_platform_driver_init+0x0/0x1000 [pcspkr] returned 0 after 508 usecs
> [    3.034795] calling  psmouse_init+0x0/0x7c [psmouse] @ 220
> [    3.034903] initcall psmouse_init+0x0/0x7c [psmouse] returned 0 after 87 usecs
> [    3.043051] random: get_random_u32 called from cache_random_seq_create+0xa3/0x1f0 with crng_init=1
> [    3.043154] random: get_random_u32 called from cache_alloc_refill+0x5bb/0x13d0 with crng_init=1
> [    3.043187] random: get_random_u32 called from cache_random_seq_create+0xa3/0x1f0 with crng_init=1
> [    3.043198] random: get_random_u32 called from cache_alloc_refill+0x5bb/0x13d0 with crng_init=1
> [    3.043229] random: get_random_u32 called from cache_random_seq_create+0xa3/0x1f0 with crng_init=1
> [    3.043265] random: get_random_u32 called from cache_alloc_refill+0x5bb/0x13d0 with crng_init=1
> [    3.043429] random: get_random_u32 called from cache_random_seq_create+0xa3/0x1f0 with crng_init=1
> [    3.043479] random: get_random_u32 called from cache_alloc_refill+0x5bb/0x13d0 with crng_init=1
> [    3.043749] random: get_random_bytes called from key_alloc+0x1fc/0x5e0 with crng_init=1
> [    3.043973] random: get_random_u32 called from cache_random_seq_create+0xa3/0x1f0 with crng_init=1
> [    3.045483] random: get_random_u32 called from arch_rnd.part.2+0x18/0x40 with crng_init=1
> [    3.045501] random: get_random_u32 called from load_elf_binary+0x76a/0x1d20 with crng_init=1
> [    3.045507] random: get_random_u32 called from arch_align_stack+0x45/0x70 with crng_init=1
> [    3.045528] random: get_random_u32 called from arch_rnd.part.2+0x18/0x40 with crng_init=1
> [    3.045558] random: get_random_u32 called from cache_alloc_refill+0x5bb/0x13d0 with crng_init=1
> [    3.045771] random: get_random_u32 called from cache_random_seq_create+0xa3/0x1f0 with crng_init=1
> [    3.046127] random: get_random_u32 called from cache_alloc_refill+0x5bb/0x13d0 with crng_init=1
> [    3.046197] calling  pcsp_init+0x0/0x1000 [snd_pcsp] @ 221
> [    3.046206] Error: Driver 'pcspkr' is already registered, aborting...
> [    3.046337] initcall pcsp_init+0x0/0x1000 [snd_pcsp] returned -16 after 128 usecs
> [    3.046348] random: get_random_u32 called from arch_align_stack+0x45/0x70 with crng_init=1
> [    3.046363] random: get_random_bytes called from load_elf_binary+0xaf8/0x1d20 with crng_init=1
> [    3.046383] random: get_random_u32 called from randomize_page+0x42/0x57 with crng_init=1
> [    3.047312] random: get_random_u32 called from cache_alloc_refill+0x5bb/0x13d0 with crng_init=1
> [    3.048151] random: get_random_u32 called from cache_random_seq_create+0xa3/0x1f0 with crng_init=1
> [    3.048638] random: get_random_u32 called from cache_alloc_refill+0x5bb/0x13d0 with crng_init=1
> [    3.054325] random: get_random_u32 called from copy_process.part.33+0x186/0x2100 with crng_init=1
> [    3.054371] random: get_random_u32 called from cache_alloc_refill+0x5bb/0x13d0 with crng_init=1
> [    3.067344] usb usb3: New USB device found, idVendor=1d6b, idProduct=0001, bcdDevice= 4.17
> [    3.067350] usb usb3: New USB device strings: Mfr=3, Product=2, SerialNumber=1
> [    3.067353] usb usb3: Product: OHCI PCI host controller
> [    3.067356] usb usb3: Manufacturer: Linux 4.17.0-rc2+ ohci_hcd
> [    3.067359] usb usb3: SerialNumber: 0000:00:12.0
> [    3.067850] random: get_random_u32 called from cache_random_seq_create+0xa3/0x1f0 with crng_init=1
> [    3.067864] hub 3-0:1.0: USB hub found
> [    3.067960] hub 3-0:1.0: 5 ports detected
> [    3.068141] random: get_random_u32 called from cache_alloc_refill+0x5bb/0x13d0 with crng_init=1
> [    3.068618] 1
> [    3.068644] 2
> [    3.068953] 3
> [    3.068954] 4
> [    3.068955] 5
> [    3.068959] 6
> [    3.068960] 7
> [    3.068966] 8
> [    3.068967] 9
> [    3.069204] 1
> [    3.069217] 2
> [    3.069565] 3
> [    3.069566] 4
> [    3.069567] 5
> [    3.069568] 6
> [    3.069569] 7
> [    3.069581] 8
> [    3.069583] 9
> [    3.069774] random: get_random_u32 called from copy_process.part.33+0x186/0x2100 with crng_init=1
> [    3.069924] initcall azx_driver_init+0x0/0xfe4 [snd_hda_intel] returned 0 after 23153 usecs

Unfortunately, it looks like the time stamps of the Linux kernel and the 
initcall_debug time do not match. From the Linux time stamps it should 
be 73 ms.

>> I am only aware of the Ftrace framework, but in my experience it also
>> skews the timings quite a bit, so might not be the best choice.
> 
> We know that there are some cases where the codec / controller
> communication stalls on the recent Coffee Lake or such platforms.
> But quite not sure how it happens.
> 
> Moving the stuff into async just moves something ugly, and it's no
> fix, per se, if such a long delay itself is unexpected.

Let’s hope, we can fix it.


Kind regards,

Paul

View attachment "0001-ALSA-Add-debug-statements.patch" of type "text/x-patch" (2084 bytes)

View attachment "20180423–linux_4.17-rc2–dmesg.txt" of type "text/plain" (104010 bytes)

Download attachment "smime.p7s" of type "application/pkcs7-signature" (5174 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ